Biowulf at the NIH
Ray on Biowulf

Ray is parallel software that computes de novo genome assemblies from next-generation sequencing data using the Message Passing Interface (MPI).

Programs Location


Useful commands

freen: see

qstat: search for 'qstat' for its usage.

jobload: search for 'jobload' for its usage.

Submitting a single batch job

1. Create the MPD password file

biowulf% echo 'password=<password>' > ~/.mpd.conf
biowulf% chmod 600 ~/.mpd.conf
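
The permission step above can be checked before submitting a job. This is a minimal sketch: it uses a temporary file (conf_file, a stand-in for ~/.mpd.conf chosen only for illustration) so it does not touch a real configuration. MPD generally expects this file to be readable and writable by its owner only, which is what mode 600 gives.

```shell
# Sketch only: conf_file is a temporary stand-in for ~/.mpd.conf (assumption).
conf_file=$(mktemp)

# Write the placeholder entry exactly as in step 1, then restrict access.
echo 'password=<password>' > "$conf_file"
chmod 600 "$conf_file"

# Take the first 10 characters of the long listing: file type plus permission bits.
perms=$(ls -l "$conf_file" | cut -c1-10)
echo "permissions: $perms"

# Clean up the temporary file.
rm -f "$conf_file"
```

If the listing shows anything other than read and write for the owner alone, rerun the chmod command on the real file.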

2. Create a script file containing lines similar to those below.

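#!/bin/bash
# This file is rayscript
#PBS -N ray
#PBS -m be
#PBS -k oe

cd /data/user/ray
export PATH=/usr/local/mpich2-intel64/bin:$PATH
# Start one MPD daemon per line of the PBS node file
mpdboot -f $PBS_NODEFILE -n `cat $PBS_NODEFILE | wc -l`
# Run Ray on $np processes (np is passed in with 'qsub -v np=...')
mpiexec -n $np /usr/local/ray/bin/Ray -k 31 -p input1.fastq.gz input2.fastq.gz -o output
# Shut down the MPD daemons when the run finishes
mpdallexit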
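
The script passes the line count of $PBS_NODEFILE to 'mpdboot -n'. Whether the node file lists each host once or once per core depends on the scheduler configuration, which this page does not state. The sketch below is an illustration, not part of the Biowulf setup: it uses a made-up node file and a hypothetical helper, count_hosts, to show how the line count and the unique host count can differ.

```shell
# Hypothetical helper: count the distinct host names in a node file.
count_hosts() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Made-up node file for illustration: two hosts, each listed twice.
nodefile=$(mktemp)
printf 'p4\np4\np5\np5\n' > "$nodefile"

hosts=$(count_hosts "$nodefile")           # distinct hosts
lines=$(wc -l < "$nodefile" | tr -d ' ')   # raw line count, as the script uses
echo "unique hosts: $hosts, lines: $lines"

rm -f "$nodefile"
```

If the two numbers differ on your system, the unique host count is the safer value for the number of MPD daemons to start.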

3. Submit the script using the 'qsub' command on Biowulf, as in the example below. Note: users are advised to run benchmarks to determine which type of node is suitable for their jobs.

qsub -v np=16 -l nodes=2:g24 /data/username/rayscript

This job will run on 2 g24 nodes. Each g24 node has 8 cores, so each node can run 8 processes, and 2 nodes give 2 x 8 = 16 processes in total. That is why np=16.
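
The arithmetic behind np can be written as a small shell calculation. This is a sketch only: compute_np is a hypothetical helper, not a Biowulf command, and the core count per node must come from the node type you actually request.

```shell
# Hypothetical helper: total MPI processes = nodes * cores per node.
compute_np() {
    echo $(( $1 * $2 ))
}

nodes=2            # number of nodes requested
cores_per_node=8   # a g24 node has 8 cores, per the text above
np=$(compute_np "$nodes" "$cores_per_node")

# Print the matching qsub command line rather than submitting it.
echo "qsub -v np=$np -l nodes=$nodes:g24 /data/username/rayscript"
```

Changing nodes or cores_per_node keeps the np value and the node request consistent with each other.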

Running an interactive job

Users may sometimes need to run jobs interactively. Such jobs should not be run on the Biowulf login node. Instead, allocate an interactive node as described below and run the interactive job there.

biowulf% qsub -I -l nodes=2:g72
qsub: waiting for job 2236960.biobos to start
qsub: job 2236960.biobos ready

[user@p4]$ cd /data/user/myruns
[user@p4]$ /usr/local/ray/bin/Ray -k 31 -p input1.fastq.gz input2.fastq.gz -o output
[user@p4]$ other commands...........
qsub: job 2236960.biobos completed
[user@biowulf ~]$

You can add a node property to the qsub command to request a specific type of interactive node. For example, if you need a node with 24 GB of memory to run a job interactively, use:

biowulf% qsub -I -l nodes=1:g24