BMTagger (Best Match Tagger) removes human reads from metagenomics datasets. Given FASTA or FASTQ files, or the SRA accession of a microbiome dataset, BMTagger produces a list of reads that are most probably human contaminants and should not be disclosed to the public.
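Once the list of suspected human reads exists, the matching records still have to be dropped from the read file. The following is a minimal sketch of that follow-up step, not part of BMTagger itself. It assumes the list holds one read ID per line, that the reads are in 4-line FASTQ records, and that the ID is the first token after the "@" on each header line. The function name remove_tagged_reads is hypothetical.

```shell
# remove_tagged_reads IDS FASTQ
# Prints every FASTQ record whose read ID is NOT listed in IDS.
# Assumptions (not taken from the BMTagger documentation):
#   - IDS has one read ID per line
#   - FASTQ uses strict 4-line records
remove_tagged_reads() {
    awk '
        FILENAME == ARGV[1] {            # first file: collect tagged IDs
            if ($1 != "") tagged[$1] = 1
            next
        }
        FNR % 4 == 1 {                   # header line of a FASTQ record
            id = substr($1, 2)           # drop the leading "@"
            keep = !(id in tagged)       # decide for the whole record
        }
        keep                             # print the line if the record is kept
    ' "$1" "$2"
}
```

A call such as `remove_tagged_reads tagged_ids.txt reads.fastq > clean.fastq` (hypothetical file names) would write the retained reads to a new file and leave the input untouched.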
BMTagger was developed by Richa Agarwala at NCBI, NIH. BMTagger website.
Sample session running the test scripts provided with BMTagger. Note that loading the bmtagger module also loads BLAST 2.2.28+.
helix% module load bmtagger
helix% bmtool -d <reference_fasta> -o reference.bitmask -A 0 -w 18
helix% srprism mkindex -i <reference_fasta> -o reference.srprism -M 7168
helix% makeblastdb -in <reference_fasta> -dbtype nucl
helix% bmtagger.sh -b reference.bitmask -x reference.srprism -T tmp -q0 -1 <reads_file> -o <output_file>
The following sample batch script runs the commands described in the BMTagger documentation.
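#!/bin/bash
#PBS -N BMTagger
#
# this file is called bmtagger.bat

cd /data/$USER/bmtagger
# build the bitmask and the srprism index from the reference FASTA
bmtool -d <reference_fasta> -o reference.bitmask -A 0 -w 18
srprism mkindex -i <reference_fasta> -o reference.srprism -M 7168
# build the BLAST nucleotide database for the same reference
makeblastdb -in <reference_fasta> -dbtype nucl
# tag the reads that match the reference
bmtagger.sh -b reference.bitmask -x reference.srprism -T tmp -q0 -1 <reads_file> -o <output_file>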
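The first three commands in the script index the same reference file, and the last one consumes the resulting bitmask and srprism index. The sketch below makes that dependency explicit by generating the four command lines from three arguments. It only prints the commands and runs nothing. The function name print_bmtagger_steps and its three arguments (reference FASTA, read file, output file) are hypothetical, and where each argument goes is an assumption about the file names missing from the listing above.

```shell
# print_bmtagger_steps REF READS OUT
# Prints (does not execute) the four commands, one per line.
#   $1 = reference FASTA, $2 = read file, $3 = output list (all hypothetical names)
print_bmtagger_steps() {
    # index the reference: bitmask, srprism index, BLAST database
    echo "bmtool -d $1 -o reference.bitmask -A 0 -w 18"
    echo "srprism mkindex -i $1 -o reference.srprism -M 7168"
    echo "makeblastdb -in $1 -dbtype nucl"
    # tag the reads against the indexes built above
    echo "bmtagger.sh -b reference.bitmask -x reference.srprism -T tmp -q0 -1 $2 -o $3"
}
```

Printing the commands first makes it easy to check the file names before putting them into a batch script.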
Submit this job with:
qsub -l nodes=1 bmtagger.bat
If this job will require more than 1 GB of memory, you should specify the memory required when submitting. For example,
qsub -l nodes=1:g24 bmtagger.bat
will allocate a node with 24 GB of memory.
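The choice between the two submit commands can be written as a small helper. This is only a sketch: the function name qsub_line is hypothetical, it takes the memory need as a whole number of GB, and it knows only the two cases named in the text (the default for jobs up to 1 GB and the g24 property for jobs up to 24 GB). Anything larger is reported as an error rather than guessed.

```shell
# qsub_line MEM_GB SCRIPT
# Prints the qsub command line for a job needing MEM_GB gigabytes.
# Only the two cases described in the text are covered.
qsub_line() {
    if [ "$1" -le 1 ]; then
        echo "qsub -l nodes=1 $2"          # default node is enough
    elif [ "$1" -le 24 ]; then
        echo "qsub -l nodes=1:g24 $2"      # request a 24 GB node
    else
        echo "no node property listed here for $1 GB" >&2
        return 1
    fi
}
```

For instance, `qsub_line 8 bmtagger.bat` prints the g24 form of the command, which can then be checked and run by hand.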