SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format.
Programs Location
/usr/local/samtools/
You can add the samtools, bcftools and associated misc tools to your path most easily by using the modules commands, as in the example below:
[user@biowulf]$ module avail samtools      (see what versions are available)
------------------- /usr/local/Modules/3.2.9/modulefiles -------------------
samtools/0.1.12a          samtools/0.1.15           samtools/0.1.18(default)
samtools/0.1.13           samtools/0.1.17
[user@biowulf]$ module load samtools      (load the default version)
[user@biowulf]$ module list      (see what version is loaded)
Currently Loaded Modulefiles:
  1) samtools/0.1.18
[user@biowulf]$ module unload samtools      (unload this version)
[user@biowulf]$ module load samtools/0.1.15      (load a specific version)
[user@biowulf]$ module list
Currently Loaded Modulefiles:
  1) samtools/0.1.15
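In a script, it can be worth confirming that loading the module actually put samtools on the PATH before any real work starts. A minimal sketch, in which require_tool is a hypothetical helper name and not part of the modules system:

```shell
#!/bin/bash
# Sketch: fail early if a required program is not on the PATH.
# require_tool is a hypothetical helper, not a modules or samtools command.
require_tool() {
    local tool="$1"
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool not found on PATH; try: module load samtools" >&2
        return 1
    fi
}
```

Calling 'require_tool samtools || exit 1' right after 'module load samtools' would stop a job early if the module did not load.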
Submitting a single SAMtools batch job
Samtools sample files can be copied from /usr/local/src/samtools/: ex1.fa; ex1.sam.gz
1. Copy the sample files to your own area.
3. Create a script file similar to the one below:
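#!/bin/bash
# This file is runSamtools
#
#PBS -N Samtools
#PBS -m be
#PBS -k oe

module load samtools
cd /home/user/samtools/run1
# Index the reference FASTA (writes ex1.fa.fai)
samtools faidx ex1.fa
# Convert the SAM alignments to BAM, using the .fai file as the reference list
samtools import ex1.fa.fai ex1.sam.gz ex1.bam
# Index the BAM file for fast random access
samtools index ex1.bam
# Text alignment viewer (interactive; needs a terminal)
samtools tview ex1.bam ex1.fa
# Per-position pileup with consensus calls (-c) against the reference (-f)
samtools pileup -cf ex1.fa ex1.bam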
4. Submit the script using the 'qsub' command on Biowulf. In this example, the job is submitted to a g8 node, which has 8 GB of memory. You can type 'freen' on the Biowulf head node to see the available node types and choose one that fits your needs:
qsub -l nodes=1:g8 /data/username/runSamtools
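The copy step at the start of this procedure can be sketched as a small shell function that stages both sample files into a run directory and stops if either is missing. This is only an illustration: stage_samples is a hypothetical helper name, and it assumes the sample files sit directly in the source directory named above.

```shell
#!/bin/bash
# Sketch: copy the two sample inputs into a run directory and verify them.
# stage_samples is a hypothetical helper, not a samtools or Biowulf command.
stage_samples() {
    local src_dir="$1" run_dir="$2" f
    # Create the run directory if it does not exist yet.
    mkdir -p "$run_dir" || return 1
    for f in ex1.fa ex1.sam.gz; do
        # Refuse to continue if a sample file is missing or unreadable.
        if [ ! -r "$src_dir/$f" ]; then
            echo "missing sample file: $src_dir/$f" >&2
            return 1
        fi
        cp "$src_dir/$f" "$run_dir/" || return 1
    done
}
```

For the example on this page, the call would be 'stage_samples /usr/local/src/samtools /home/user/samtools/run1', run before the job is submitted.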
Submitting a swarm of Samtools jobs
1. Using the 'swarm' utility, one can submit many jobs to the cluster to run concurrently.
Set up a swarm command file (e.g. /data/username/cmdfile). Here is a sample file:
module load samtools; cd /home/user/samtools/run1;\
samtools faidx ex1.fa;\
samtools import ex1.fa.fai ex1.sam.gz ex1.bam;\
samtools index ex1.bam;\
samtools tview ex1.bam ex1.fa;\
samtools pileup -cf ex1.fa ex1.bam
module load samtools; cd /home/user/samtools/run2;\
samtools faidx ex1.fa;\
samtools import ex1.fa.fai ex1.sam.gz ex1.bam;\
samtools index ex1.bam;\
samtools tview ex1.bam ex1.fa;\
samtools pileup -cf ex1.fa ex1.bam
module load samtools; cd /home/user/samtools/run3;\
samtools faidx ex1.fa;\
samtools import ex1.fa.fai ex1.sam.gz ex1.bam;\
samtools index ex1.bam;\
samtools tview ex1.bam ex1.fa;\
samtools pileup -cf ex1.fa ex1.bam
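With many run directories, a command file like this one is easier to generate than to type. The sketch below prints one swarm line per directory; make_swarm_line and make_swarm_file are hypothetical helper names, and the tview step is left out on the assumption that an interactive viewer is not wanted in a batch job.

```shell
#!/bin/bash
# Sketch: generate swarm command lines, one per run directory.
# make_swarm_line and make_swarm_file are hypothetical helpers,
# not part of swarm or samtools.
make_swarm_line() {
    local run_dir="$1"
    # Each printf adds one step; the whole job ends up on a single line.
    printf 'module load samtools; cd %s; ' "$run_dir"
    printf 'samtools faidx ex1.fa; '
    printf 'samtools import ex1.fa.fai ex1.sam.gz ex1.bam; '
    printf 'samtools index ex1.bam; '
    printf 'samtools pileup -cf ex1.fa ex1.bam\n'
}

# Emit one command line for every directory passed as an argument.
make_swarm_file() {
    local d
    for d in "$@"; do
        make_swarm_line "$d"
    done
}
```

For the three directories used here, the file could be written with 'make_swarm_file /home/user/samtools/run1 /home/user/samtools/run2 /home/user/samtools/run3 > /data/username/cmdfile'.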
Submit this swarm with:
swarm -f cmdfile
By default, each line of the command file above is executed on one processor core of a node and uses 1 GB of memory.
If each line of the command file needs more than 1 GB of memory, for example 4 GB, tell swarm by including the '-g 4' flag:
swarm -g 4 -f cmdfile
For more information regarding running swarm, see swarm.html


