PLINK/SEQ is an open-source C/C++ library for working with human genetic variation data. The specific focus is to provide a platform for analytic tool development for variation data from large-scale resequencing projects, particularly whole-exome and whole-genome studies. However, the library could in principle be applied to other types of genetic studies, including whole-genome association studies of common SNPs.
PLINK/SEQ was developed at Harvard University.
/usr/local/plinkseq
1. Create a script file similar to the one below. The plinkseq executables are added to your PATH by the 'module load plinkseq' command in the script file.
#!/bin/bash
# This file is YourOwnFileName
#
#PBS -N plinkseq
#PBS -m be
#PBS -k oe
module load plinkseq
cd /data/user/somewhereWithInputFile
pseq ex1.vcf v-view --vmeta --gmeta
2. Submit the script using the 'qsub' command on Biowulf.
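The two steps above could also be scripted. The following is a minimal sketch, not part of the original instructions: `write_pseq_job` is a hypothetical helper name, and its arguments (script path, working directory, VCF file name) are illustrative. It writes a batch script with the same PBS directives as the sample above.

```shell
# write_pseq_job SCRIPT WORKDIR VCF
# Hypothetical helper: writes a PBS batch script that runs pseq v-view
# on one VCF file. The directives mirror the sample script above.
write_pseq_job() {
    local script="$1" workdir="$2" vcf="$3"
    # Unquoted heredoc: $workdir and $vcf are expanded when the file is written.
    cat > "$script" <<EOF
#!/bin/bash
#PBS -N plinkseq
#PBS -m be
#PBS -k oe
module load plinkseq
cd "$workdir"
pseq "$vcf" v-view --vmeta --gmeta
EOF
}
```

For example, `write_pseq_job myjob.sh /data/user/somewhereWithInputFile ex1.vcf` followed by `qsub myjob.sh` would create and submit a job equivalent to the sample script.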
Using the 'swarm' utility, one can submit many jobs to the cluster to run concurrently.
Set up a swarm command file (e.g. /data/username/cmdfile), with one command per line. Here is a sample file:
module load plinkseq; pseq ex1.vcf v-view --vmeta --gmeta
module load plinkseq; pseq ex2.vcf v-view --vmeta --gmeta
module load plinkseq; pseq ex3.vcf v-view --vmeta --gmeta
module load plinkseq; pseq ex4.vcf v-view --vmeta --gmeta
[... etc...]
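When there are many input files, a command file like the sample can be generated rather than typed by hand. This is a rough sketch under stated assumptions: `make_cmdfile` is a hypothetical helper name, and it assumes every input file gets the same `v-view --vmeta --gmeta` command shown in the sample.

```shell
# make_cmdfile OUTFILE VCF [VCF ...]
# Hypothetical helper: writes one swarm command line per VCF file,
# in the same form as the sample command file.
make_cmdfile() {
    local out="$1"; shift
    local vcf
    : > "$out"    # create or truncate the command file
    for vcf in "$@"; do
        printf 'module load plinkseq; pseq %s v-view --vmeta --gmeta\n' "$vcf" >> "$out"
    done
}
```

Run from the directory that holds the input files, `make_cmdfile cmdfile ex*.vcf` would write one line for each matching VCF file.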
Submit this swarm of jobs with:
swarm -f cmdfile
By default, each line of the command file is executed on one processor core of a node and is allotted 1 GB of memory. If any of your PLINK/SEQ commands requires more than 1 GB of memory, you must specify the required memory with the -g flag to swarm. For example, if each command requires 5 GB of memory, submit with:
swarm -g 5 -f cmdfile
For more information regarding running swarm, see swarm.html
Users may need to run jobs interactively sometimes. Such jobs should not be run on the Biowulf login node. Instead allocate an interactive node as described below, and run the interactive job there.
[user@biowulf]$ qsub -I -l nodes=1
qsub: waiting for job 2236960.biobos to start
qsub: job 2236960.biobos ready
[user@p4]$ module load plinkseq
[user@p4]$ cd /data/userID/plinkseq/run1
[user@p4]$ pseq ex4.vcf v-view --vmeta --gmeta
chr1:1001 rs1001 T/C . 1 PASS . VM=1;SM=100
P001 1 C/C [GM=1]
P002 1 T/T [GM=2]
P003 1 T/C [GM=3]
P004 1 C/C [GM=4]
[...etc...]
[user@p4]$ exit
qsub: job 2236960.biobos completed
[user@biowulf]$
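Output such as that in the session above can be summarized with standard tools. The sketch below is illustrative only: `count_genotypes` is a hypothetical helper name, and it assumes the layout shown in the session, where each genotype line has the genotype in the third field and a bracketed meta-field such as [GM=1] in the fourth.

```shell
# count_genotypes
# Hypothetical helper: reads pseq v-view output on stdin and prints
# "genotype count" pairs, one per line, sorted by genotype.
# Assumes genotype lines look like: P001 1 C/C [GM=1]
count_genotypes() {
    # Only genotype lines have a 4th field starting with "[";
    # the variant line (chr1:1001 ...) is skipped by this test.
    awk '$4 ~ /^\[/ { n[$3]++ } END { for (g in n) print g, n[g] }' | LC_ALL=C sort
}
```

It could be used as `pseq ex4.vcf v-view --vmeta --gmeta | count_genotypes`.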
Users may add a node property to the qsub command to request a specific type of interactive node. For example, if you need a node with 24 GB of memory to run a job interactively, do this:


