Biowulf at the NIH
SICER on Helix & Biowulf

A clustering approach for identification of enriched domains from histone modification ChIP-Seq data

SICER is freely available from its developers; see the link under Documentation below.

Running on Helix

Example input files can be copied from /usr/local/apps/sicer/1.1/SICER/ex
Sample session:

helix$ module load sicer
helix$ mkdir -p /data/$USER/sicer
helix$ cp /usr/local/apps/sicer/1.1/SICER/ex/* /data/$USER/sicer
helix$ cd /data/$USER/sicer
helix$ sh /usr/local/apps/sicer/1.1/SICER/SICER.sh /data/$USER/sicer test.bed control.bed . hg18 1 200 150 0.74 600 .01
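
For reference, the positional arguments to SICER.sh are, in order (as described in the SICER 1.1 documentation; check the README shipped with the package to confirm):

sh SICER.sh [InputDir] [bed file] [control file] [OutputDir] [species] \
            [redundancy threshold] [window size (bp)] [fragment size] \
            [effective genome fraction] [gap size (bp)] [FDR]

In the session above, the input directory is /data/$USER/sicer, the ChIP reads are in test.bed, the control reads are in control.bed, output goes to the current directory, the genome is hg18, the redundancy threshold is 1, the window size is 200 bp, the fragment size is 150 bp, the effective genome fraction is 0.74, the gap size is 600 bp, and the FDR cutoff is 0.01.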

Submitting a single batch job on Biowulf

1. Create a batch script file. Sample batch script:

#!/bin/bash
# This file is SicerScript
#
# job name
#PBS -N sicer
# send e-mail when the job begins and ends
#PBS -m be
# keep stdout and stderr files in your home directory
#PBS -k oe

module load sicer
cd /data/$USER/sicer
sh /usr/local/apps/sicer/1.1/SICER/SICER.sh /data/$USER/sicer test.bed control.bed . hg18 1 200 150 0.74 600 .01

2. Submit the script on Biowulf using the 'qsub' command, for example:

$ qsub -l nodes=1:g24:c16 ./SicerScript

This job will run on a g24 node (24 GB of memory). You may need to run a few test jobs to determine how much memory is required, and then choose a node type suitable for your job.
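
For example, one might check which node types are currently free and then resubmit to a node with just enough memory (the g8 property below is illustrative; freen shows which node types actually exist on the cluster):

biowulf$ freen                              # list node types and free nodes
biowulf$ qsub -l nodes=1:g8 ./SicerScript   # resubmit to an 8 GB node, if that is sufficient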

Running an interactive job on Biowulf

Users may sometimes need to run jobs interactively. Such jobs should not be run on the Biowulf login node.

Instead, allocate an interactive node as described below and run the interactive job there. Alternatively, run interactively on Helix.

biowulf% qsub -I -l nodes=1
qsub: waiting for job 2236960.biobos to start
      qsub: job 2236960.biobos ready

[user@pxxx]$ cd /data/$USER/sicer

[user@pxxx]$ module load sicer

[user@pxxx]$ sh /usr/local/apps/sicer/1.1/SICER/SICER.sh /data/$USER/sicer test.bed control.bed . hg18 1 200 150 0.74 600 .01
[user@pxxx]$ exit
qsub: job 2236960.biobos completed

[user@biowulf ~]$ 
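
If the SICER run needs more memory than a default interactive node provides, request a specific node type with the same property syntax used for batch jobs (here reusing the g24:c16 property from the batch example above):

biowulf% qsub -I -l nodes=1:g24:c16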

Submitting a swarm of jobs

Using the 'swarm' utility, one can submit many jobs to the cluster to run concurrently.

Set up a swarm command file (e.g. /data/$USER/cmdfile). Here is a sample file; a fully written-out example follows the listing. Note that each command must be on a single line (do not add line breaks within a command) and that each job runs in its own subdirectory.

cd /data/$USER/run1; module load sicer; sh /usr/local/apps/sicer/1.1/SICER/SICER.sh .....
cd /data/$USER/run2; module load sicer; sh /usr/local/apps/sicer/1.1/SICER/SICER.sh ..... 
cd /data/$USER/run3; module load sicer; sh /usr/local/apps/sicer/1.1/SICER/SICER.sh .....
....
....
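
For instance, a fully written-out command file might look like the following (the run directories and BED file names here are hypothetical; the SICER.sh arguments mirror the single-job example above):

cd /data/$USER/run1; module load sicer; sh /usr/local/apps/sicer/1.1/SICER/SICER.sh /data/$USER/run1 sample1.bed control1.bed . hg18 1 200 150 0.74 600 .01
cd /data/$USER/run2; module load sicer; sh /usr/local/apps/sicer/1.1/SICER/SICER.sh /data/$USER/run2 sample2.bed control2.bed . hg18 1 200 150 0.74 600 .01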

Swarm requires one flag, -f, and users will probably want to specify -t, -g, and --module as well:

-f: the swarm command file name (required)
-t: number of processors (cores) to use for each command line in the swarm file (optional)
-g: GB of memory needed for each command line in the swarm file (optional)
--module: environment module(s) to load for each command (optional)

By default, swarm allocates 1 core per command; use the -t switch to request more cores for each command. If each command also requires, say, 12 GB of memory, specify that with the -g 12 switch. This swarm command file can thus be submitted with:

biowulf$ swarm -g 12 -f cmdfile

Users may need to run a few test jobs to determine how much memory is used. Set up a single job, then submit it. The output from the job will list the memory used by that job.

For more information regarding running swarm, see swarm.html

 

Documentation

http://home.gwu.edu/~wpeng/Software.htm