bamUtil is a repository that contains several programs that perform operations on SAM/BAM files. All of these programs are built into a single executable, bam.
BamUtil was developed at the Abecasis lab at the University of Michigan.
The available version(s) of bamutil can be listed with 'module avail bamutil', and the bam executable can be added to your path with 'module load bamutil', as in the example below.
[user@biowulf modulefiles]$ module avail bamutil
------------------ /usr/local/Modules/3.2.9/modulefiles ------------------
bamutil/1.0.6
[user@biowulf modulefiles]$ module load bamutil
Create a script file along the following lines:
#!/bin/bash
# This file is runbamutil
#
#PBS -N RunBam
#PBS -m be
#PBS -k oe
module load bamutil
cd /data/user/somewhereWithInputfile
bam convert myfile.sam myfile.bam
bam splitChromosome --in myfile.bam --out myfile.bam
bam diff --mapQual --in1 file1.bam --in2 file2.bam
Submit the script using the 'qsub' command on Biowulf.
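A batch job that starts with a missing input file fails only after it has waited in the queue, so it can help to check the inputs at the top of the script. The following is a minimal sketch, not part of bamUtil or the batch system: the function name require_inputs is hypothetical, and the file names in the usage comment are the ones from the sample script.

```shell
#!/bin/bash
# Hypothetical helper (an assumption for illustration, not a bamUtil feature):
# return non-zero and report the first input file that is missing or unreadable.
require_inputs() {
    local f
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing input: $f" >&2
            return 1
        fi
    done
}

# Possible use in the batch script, after the cd and before the bam commands:
# require_inputs myfile.sam file1.bam file2.bam || exit 1
```

With a check like this, the job exits immediately with a message in its error output instead of running the later bam commands on incomplete data.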
Using the 'swarm' utility, one can submit many jobs to the cluster to run concurrently.
Set up a swarm command file (e.g. /data/username/cmdfile), with one command per line. Here is a sample file:
cd /data/user/mydir; bam convert file1.bam file1.sam
cd /data/user/mydir; bam convert file2.bam file2.sam
cd /data/user/mydir; bam convert file3.bam file3.sam
cd /data/user/mydir; bam convert file4.bam file4.sam
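When there are many input files, a command file like the sample above can be generated rather than typed by hand. The following is a minimal sketch, not part of bamUtil or swarm: the function name make_swarm_cmds is hypothetical, and it assumes that every .bam file in a single directory should be converted to a .sam file with the same base name.

```shell
#!/bin/bash
# Hypothetical helper (an assumption for illustration): print one swarm
# command line per .bam file found in the given directory.
make_swarm_cmds() {
    local dir="$1"
    local f base
    for f in "$dir"/*.bam; do
        # Skip the unexpanded pattern when the directory has no .bam files
        [ -e "$f" ] || continue
        base="$(basename "$f" .bam)"
        printf 'cd %s; bam convert %s.bam %s.sam\n' "$dir" "$base" "$base"
    done
}

# Possible use: write the command file for the directory in the sample
# make_swarm_cmds /data/user/mydir > /data/username/cmdfile
```

The sketch does not quote paths, so it is only suitable for directory and file names without spaces.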
Submit this swarm with
swarm -f cmdfile --module bamutil
This will submit a swarm of jobs so that each of the commands above runs on a single core using up to 1 GB of memory. If the commands will require more than 1 GB of memory each, you need to specify that to swarm using the -g # flag, where # is the number of GB of memory required. For example:
swarm -g 3 -f cmdfile --module bamutil
For more information regarding running swarm, see swarm.html
Users may sometimes need to run jobs interactively. Such jobs should not be run on the Biowulf login node. Instead, allocate an interactive node as shown below, and run the interactive job there.
[user@biowulf ~]$ qsub -I
qsub: waiting for job 2236960.biobos to start
qsub: job 2236960.biobos ready
[user@p4]$ cd /data/user/myruns
[user@p4]$ module load bamutil
[user@p4]$ cd /data/user/somewhereWithInputfile
[user@p4]$ bam splitBam -v -i myfile.bam -o outfile -L logfile
[user@p4]$ exit
qsub: job 2236960.biobos completed
[user@biowulf ~]$
A specific type of node (e.g. one with more memory) can be requested on the qsub command line. For example, if you need a node with 8 GB of memory to run a job interactively, do this:
biowulf% qsub -I -l nodes=1:g8
Documentation


