Biowulf at the NIH
VEGAS on Biowulf

VEGAS (Versatile Gene-based Association Study) is a program for performing gene-based tests for association using the results from genetic association studies. It annotates SNPs to corresponding genes, produces a gene-based test statistic, and then uses simulation to calculate an empirical gene-based p-value. VEGAS was developed by Jimmy Liu at the Queensland Institute of Medical Research, Australia.

The available version(s) of VEGAS can be seen by typing 'module avail vegas'. You can add the vegas executables to your path by typing 'module load vegas' as in the example below.

Submitting a single batch job

1. Create a script file containing lines similar to those below. Change the directory path and the input file name to match your own data before running.

# This file is YourOwnFileName
#PBS -N yourownfilename
#PBS -m be
#PBS -k oe

module load vegas

cd /data/user/somewhereWithInputFile
vegas infile1 -pop hapmapCEU -chr 10 -out default-test

2. Submit the script using the 'qsub' command on Biowulf.

[user@biowulf]$ qsub -l nodes=1 /data/username/theScriptFileAbove

Submitting a swarm of jobs

Using the 'swarm' utility, one can submit many jobs to the cluster to run concurrently.

Set up a swarm command file (e.g. /data/username/cmdfile). Here is a sample file:

module load vegas; cd /data/user/mydir; vegas infile1 -chr 1 -out chr1.out 
module load vegas; cd /data/user/mydir; vegas infile2 -chr 2 -out chr2.out

Submit this swarm job with:

swarm -f cmdfile

By default, each command line in the file is executed on one processor core of a node and is allotted 1 GB of memory. If each command requires more than 1 GB of memory, tell swarm so with the -g # flag, where # is the number of GB of memory required. For example, if each command requires 2 GB of memory, submit with:

swarm -g 2 -f cmdfile
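
When there is one input file per chromosome, typing the command file by hand gets tedious. The following is a minimal sketch of generating it with a shell loop, in the same form as the sample file above. The input file names (infile1 through infile22), the restriction to chromosomes 1 to 22, and the working directory are assumptions made for illustration; adjust them to match your own data.

```shell
#!/bin/sh
# Sketch: write a swarm command file with one VEGAS run per chromosome.
# Assumptions (not from this page): inputs are named infile1 ... infile22
# and live in the hypothetical directory /data/user/mydir.
workdir=/data/user/mydir
cmdfile=cmdfile

: > "$cmdfile"          # create the command file, or empty it if it exists

chr=1
while [ "$chr" -le 22 ]; do
    # One line per chromosome, mirroring the sample command file
    echo "module load vegas; cd ${workdir}; vegas infile${chr} -chr ${chr} -out chr${chr}.out" >> "$cmdfile"
    chr=$((chr + 1))
done
```

The resulting file can be submitted with 'swarm -f cmdfile' exactly as a hand-written one would be.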

For more information about running swarm, see swarm.html


Running an interactive job

Users may sometimes need to run jobs interactively. Such jobs should not be run on the Biowulf login node. Instead, allocate an interactive node as described below and run the interactive job there.

[user@biowulf]$ qsub -I -l nodes=1
qsub: waiting for job 2236960.biobos to start
qsub: job 2236960.biobos ready

[user@p4]$ module load vegas
[user@p4]$ cd /data/userID/vegas/run1
[user@p4]$ vegas example.txt -pop hapmapCEU -genelist example_genelist.txt -out genelist-test
[user@p4]$ exit
qsub: job 2236960.biobos completed

To request a particular type of interactive node (e.g. with 24 GB of memory), you can specify this on the qsub command line.

[user@biowulf]$ qsub -I -l nodes=1:g24