SOAP3 is GPU-based software for aligning short reads to a reference sequence. It can find all alignments with up to k mismatches, where k is chosen from 0 to 3. Compared with its predecessor, SOAP2, SOAP3 can be up to tens of times faster. For example, when aligning length-100 reads to the human genome, SOAP3 is the first software that can find all 3-mismatch alignments in tens of seconds per one million reads.
The alignment program in this package is optimized to process several million short reads per run by using a multi-core CPU and the GPU concurrently.
To exploit the parallelism of the GPU effectively, SOAP3 uses an adapted version of the 2BWT index of SOAP2 (the new index is called the GPU-2BWT). The index and algorithms were developed by the algorithms research group of the University of Hong Kong (T.W. Lam, C.M. Liu, Thomas Wong, Edward Wu and S.M. Yiu). Please remember to cite their work:
- SOAP3 Publication
- To reference SOAPsnp, please cite this paper: Ruiqiang Li, Yingrui Li, Xiaodong Fang, et al. (2009) "SNP detection for massively parallel whole-genome resequencing". Genome Res., doi:10.1101/gr.088013.108
- To reference SOAP2, please cite this paper: Li et al. (2009) "SOAP2: an improved ultrafast tool for short read alignment". BIOINFORMATICS, doi:10.1093/bioinformatics/btp336
- To reference the SOAP v1 program, please cite this paper: Li et al. (2008) "SOAP: short oligonucleotide alignment program". BIOINFORMATICS, 24, no. 5, 713
GPU NODES ONLY - SOAP3 runs _ONLY_ on the GPU nodes of the Biowulf cluster.
INDEX FILES - We can create shared index files for users on request. Currently, the hg18, hg19 and mm9 index files for SOAP3 are located under /fdb/soap3.
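Before building your own index, it may be worth checking whether a shared one already exists. The following is a minimal sketch, not a site-provided tool: the function name find_shared_index is hypothetical, and it assumes only that entries under /fdb/soap3 have names beginning with the genome build (hg18, hg19 or mm9). The actual layout of that directory may differ, so check a plain directory listing first.

```shell
#!/bin/bash
# Hypothetical helper: print entries in an index directory whose names
# start with the given genome build, and return 0 if any were found.
# Assumption: shared index files are named after the build (e.g. hg19...).
find_shared_index() {
    local index_dir="$1"   # e.g. /fdb/soap3
    local build="$2"       # e.g. hg19
    local found=1
    local entry
    for entry in "$index_dir"/"$build"*; do
        # When nothing matches, the pattern is left as literal text,
        # so confirm that the entry really exists.
        if [ -e "$entry" ]; then
            echo "$entry"
            found=0
        fi
    done
    return "$found"
}
```

A possible use: find_shared_index /fdb/soap3 hg19 || echo "no shared hg19 index found". If nothing is found, request a shared index or build one in your job directory.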
The SOAP package includes many programs; see /usr/local/apps/soap.
Running a single SOAP3 job on Biowulf
1. See http://www.cs.hku.hk/2bwt-tools/soap3-dp/ for instructions.
2. Create a directory for your job, /data/user/soap/run1 for example.
3. Put your reference file (ref1.fa, for example) and the input file(s) with reads (infile1.fa and infile2.fa, for example) under this directory.
4. Copy the initialization file(s) /usr/local/soap/soap3/*.ini to this directory and modify the parameters as needed.
5. Set up a batch script along the following lines:
    #!/bin/bash
    #PBS -m be
    # this file is soap3_script
    module load soap3
    cd /data/user/soap/run1
    soap3-dp-builder ref1.fa
    BGS-Build ref1.fa.index
    soap3-dp pair ref1.fa.index infile1.fa infile2.fa -u 500 -v 200
    merge-succinct.sh infile1.fa
6. Submit the script with the qsub command:
    biowulf% qsub -l nodes=1:gpu2050 soap3_script
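A job that fails at start-up because a file is missing still waits in the queue first, so a quick check of the job directory before submitting can save time. The sketch below is one way to do that. The function name check_soap3_inputs is hypothetical, and the file names are the example names used in the steps above.

```shell
#!/bin/bash
# Hypothetical pre-flight check: confirm that the job directory holds the
# reference, the read files, and at least one .ini file (step 4).
# Usage: check_soap3_inputs JOB_DIR FILE [FILE...]
check_soap3_inputs() {
    local job_dir="$1"
    shift
    local missing=0
    local name ini have_ini=0
    # Every named input must exist and be non-empty.
    for name in "$@"; do
        if [ ! -s "$job_dir/$name" ]; then
            echo "missing or empty: $job_dir/$name" >&2
            missing=1
        fi
    done
    # At least one initialization file must have been copied in.
    for ini in "$job_dir"/*.ini; do
        if [ -e "$ini" ]; then
            have_ini=1
        fi
    done
    if [ "$have_ini" -eq 0 ]; then
        echo "no .ini file found in $job_dir" >&2
        missing=1
    fi
    return "$missing"
}
```

For example: check_soap3_inputs /data/user/soap/run1 ref1.fa infile1.fa infile2.fa && qsub -l nodes=1:gpu2050 soap3_script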
It may be useful for debugging purposes to run SOAP3 jobs interactively. Such jobs should not be run on the Biowulf login node. Instead, allocate an interactive node as shown below and run the interactive job there.
    biowulf% qsub -I -l nodes=1:gpu2050
    qsub: waiting for job 2236960.biobos to start
    qsub: job 2236960.biobos ready
    [user@p1234]$ module load soap3
    [user@p1234]$ cd /data/user/soap/run1
    [user@p1234]$ soap3-builder ref1.fa
    [user@p1234]$ BGS-Build ref1.fa.index
    [user@p1234]$ soap3-dp pair ref1.fa.index infile1.fa infile2.fa -u 500 -v 200
    [user@p1234]$ make-view-succinct.sh infile1.fa
    [user@p1234]$ exit
    qsub: job 2236960.biobos completed
    [user@biowulf ~]$
Make sure to exit the node once you have finished your run.
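If you keep the interactive commands in a script, a small guard can stop it from running on the login node by mistake. This is only a sketch: the function name on_login_node is hypothetical, and it assumes the login node's short host name is "biowulf", as suggested by the prompt in the session above.

```shell
#!/bin/bash
# Hypothetical guard: succeed only when the host is the login node.
# Assumption: the login node's short host name is "biowulf".
# An explicit host name may be passed as the first argument; otherwise
# the current host is used.
on_login_node() {
    local host="${1:-$(hostname -s)}"
    [ "$host" = "biowulf" ]
}

# Example use at the top of a script (left commented out here):
# if on_login_node; then
#     echo "Allocate an interactive node first (qsub -I ...)." >&2
#     exit 1
# fi
```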