Biowulf at the NIH
Impute on Biowulf

IMPUTE is a program for estimating ("imputing") unobserved genotypes in SNP association studies. The program is designed to work seamlessly with the output of the genotype calling program CHIAMO and the population genetic simulator HAPGEN, and it produces output that can be analyzed using the program SNPTEST. For details, see the IMPUTE website at Oxford.

Small numbers of Impute jobs (fewer than 3 running simultaneously) should be run on Helix. Running Impute on Biowulf is only worthwhile if you want to run large numbers of simultaneous Impute jobs. The easiest way to set up multiple Biowulf jobs is via the swarm program.

Note that impute2 is a 64-bit application, so you need to specify a 64-bit node when submitting jobs via qsub or swarm.

Setting up a swarm of Impute jobs

Set up a swarm command file with one line for each Impute run. Example:

# this file is impute_swarm
cd /data/user/dir1; impute2 -ref_samp_out -m chr16.map -h chr16.haps  -l chr16.legend -g gtypes -s refstrand1  -Ne 11418 -int 5000000 5500000 -buffer 250 -k 10 -iter 10 -burnin 3  -o out1  -i info1  -r summary1
cd /data/user/dir2; impute2 -ref_samp_out -m chr26.map -h chr26.haps  -l chr26.legend -g gtypes -s refstrand2  -Ne 22428 -int 5000000 5500000 -buffer 250 -k 20 -iter 20 -burnin 3  -o out2  -i info2  -r summary2
cd /data/user/dir3; impute2 -ref_samp_out -m chr36.map -h chr36.haps  -l chr36.legend -g gtypes -s refstrand3  -Ne 33438 -int 5000000 5500000 -buffer 250 -k 30 -iter 30 -burnin 3  -o out3  -i info3  -r summary3
[...]

Submit this swarm with

swarm -f impute_swarm -l nodes=1:x86-64
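
When the runs differ only in a few values, the command file can be generated with a short script instead of being typed by hand. The following is a minimal sketch, not a recommended analysis setup: it assumes all runs use the chr16 input files from the first example line and differ only in the -int interval and the output file names. The working directory, the interval range, the step size, and the file name impute_swarm_gen are illustrative assumptions.

```shell
#!/bin/bash
# Sketch: write one impute2 command per interval into a swarm command file.
# Assumptions (not from this page): the directory, interval range, step size,
# and the generated file name below are illustrative only.

workdir=/data/user/dir1      # hypothetical directory holding the input files
swarmfile=impute_swarm_gen   # hypothetical name for the generated command file
step=500000                  # interval width in base pairs
first=5000000                # start of the first interval
last=6500000                 # end of the last interval

# Start the file with a comment line, as in the hand-written example.
echo "# this file is $swarmfile" > "$swarmfile"

n=1
for (( start = first; start < last; start += step )); do
    end=$(( start + step ))
    # Same options as the first example line; only -int and the output names change.
    echo "cd $workdir; impute2 -ref_samp_out -m chr16.map -h chr16.haps -l chr16.legend -g gtypes -s refstrand1 -Ne 11418 -int $start $end -buffer 250 -k 10 -iter 10 -burnin 3 -o out$n -i info$n -r summary$n" >> "$swarmfile"
    n=$(( n + 1 ))
done

echo "wrote $(( n - 1 )) impute2 commands to $swarmfile"
```

The generated file can then be submitted in the same way as a hand-written one, by passing its name to swarm -f.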

Documentation

IMPUTE user manual

IMPUTE v2 documentation

GTOOL

SNPTEST