Biowulf at the NIH
BMTagger on Helix & Biowulf

BMTagger (Best Match Tagger) removes human reads from metagenomics datasets. Given FASTA or FASTQ files, or the SRA accession of a microbiome dataset, BMTagger produces a list of reads that are most likely human contaminants and should not be released publicly.

BMTagger was developed by Richa Agarwala at NCBI, NIH. BMTagger website.

Running BMTagger on Helix

Sample session running the test scripts provided with BMTagger. Note that loading the bmtagger module also loads BLAST 2.2.28+.

helix% module load bmtagger

helix% bmtool -d  -o  -A 0 -w 18

helix% srprism mkindex -i  -o  -M 7168

helix% makeblastdb -in  -dbtype nucl

helix% bmtagger.sh -b reference.bitmask -x reference.srprism -T tmp -q0 -1 -o
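
Before running the session above, it can be useful to confirm that the module actually placed the four commands on the PATH. The following is a minimal sketch, not part of BMTagger: the function name `check_tools` is hypothetical, and it only checks that each command can be found, not that it runs correctly.

```shell
# Hypothetical helper (not provided by BMTagger or the module system):
# report every named command that cannot be found on PATH.
# Returns 0 if all commands are present, 1 if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

After `module load bmtagger`, calling `check_tools bmtool srprism makeblastdb bmtagger.sh` should return successfully; any name printed as missing suggests the module did not load as expected.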

Running a BMTagger job on Biowulf

The following sample batch script runs the commands described in the BMTagger documentation.

#!/bin/bash
#PBS -N BMTagger
#
# this file is called bmtagger.bat

# load BMTagger (and BLAST) as in the Helix session above
module load bmtagger

cd /data/$USER/bmtagger

bmtool -d  -o  -A 0 -w 18
srprism mkindex -i  -o  -M 7168
makeblastdb -in  -dbtype nucl
bmtagger.sh -b reference.bitmask -x reference.srprism -T tmp -q0 -1 -o

Submit this job with:

qsub -l nodes=1 bmtagger.bat

If the job requires more than 1 GB of memory, specify a node with enough memory when submitting. For example,

qsub -l nodes=1:g24 bmtagger.bat

will allocate a node with 24 GB of memory.
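
The choice between the two submissions above can be sketched as a small shell helper. This is an illustration only: the function name `node_spec` is hypothetical, it covers just the two cases described here (the default node and the `g24` property), and it treats the full 24 GB as usable, which is an assumption.

```shell
# Hypothetical helper: print the qsub node specification for a job that
# needs the given amount of memory, in whole GB.
# Only the two cases described in this page are handled.
node_spec() {
    mem_gb=$1
    if [ "$mem_gb" -le 1 ]; then
        # Up to 1 GB: the default node is enough.
        echo "nodes=1"
    elif [ "$mem_gb" -le 24 ]; then
        # More than 1 GB: request a node with 24 GB of memory.
        echo "nodes=1:g24"
    else
        echo "no node property known here for ${mem_gb} GB" >&2
        return 1
    fi
}
```

With this helper defined, a job needing about 8 GB could be submitted as `qsub -l "$(node_spec 8)" bmtagger.bat`.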

Documentation

BMTagger documentation