Biowulf at the NIH
MuTect

Description

MuTect is a method developed at the Broad Institute for the reliable and accurate identification of somatic point mutations in next generation sequencing data of cancer genomes.

How to Use

MuTect uses environment modules. Type

module load muTect

at the prompt. Then type

muTect

Two extra options have been added to control memory allocation and the temporary file directory:

  • --memory memory allocated (default = 2gb)
  • --tmpdir tmpdir location (default = /scratch/$USER/muTect)

By default, muTect uses 2 GB of memory. To allocate 5 GB, include --memory 5g on the command line.

NOTE 1: muTect does NOT work with Java 1.7. Either rely on the Java version set by the muTect module, or load Java 1.6 explicitly:

module load java/1.6.0
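
As a quick sanity check before launching a job, the Java requirement in NOTE 1 can be tested from the shell. The following is a minimal sketch, not part of the muTect installation: the function name is_java_1_6 is hypothetical, and it assumes the first line of the `java -version` banner contains the quoted version string (for example "1.6.0_45").

```shell
#!/bin/bash
# Hypothetical helper: succeed if a `java -version` banner line reports a 1.6 runtime.
# $1 = first line of the banner, e.g.: java version "1.6.0_45"
is_java_1_6() {
    case "$1" in
        *'"1.6.'*) return 0 ;;
        *)         return 1 ;;
    esac
}

# `java -version` writes its banner to stderr, so redirect it before reading.
banner=$(java -version 2>&1 | head -n 1)

if is_java_1_6 "$banner"; then
    echo "Java 1.6 detected: $banner"
else
    # Non-fatal warning; the muTect module may still set a suitable Java itself.
    echo "Warning: Java 1.6 not detected (got: $banner)" >&2
fi
```

If the warning appears after `module load muTect`, loading java/1.6.0 as shown above should be the first thing to try.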

NOTE 2: muTect is built on the GATK code base and therefore shares many of its options. One of them, -nt (--num_threads), DOES NOT work properly. DO NOT use this option.

MuTect requires two BAM input files, one for normal tissue and one for tumor tissue. MuTect outputs a wiggle-format coverage file. An additional wiggle file can be generated to display observed depth.

MuTect takes database files as parameters. Which files to use depends on the genome build of your alignments and the dbSNP version you are using. These files are located in /fdb/muTect.

A typical qsub script for a 5gb memory, single-threaded job would be as follows:

Then submit to the appropriate node type:

qsub -l nodes=1:g8 muTect.run

The reference files in the above example are for alignments against the UCSC reference genome. For alignments against the Ensembl/NCBI/1000genomes reference genome, use:

--reference_sequence /fdb/muTect/human_g1k_v37.fasta \
--dbsnp              /fdb/muTect/dbsnp_137.b37.vcf \
--cosmic             /fdb/muTect/cosmic_v67.b37.vcf \
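
To show how these reference options fit into a full command line, here is a minimal sketch that assembles a 5 GB, single-threaded muTect call for a normal/tumor pair aligned to the b37 reference. It only prints the command rather than running it. The function name mutect_cmd, the BAM file names, and the output prefix are hypothetical. The options --analysis_type, --input_file:normal, --input_file:tumor, --out and --coverage_file are taken from the standard MuTect command line; whether the Biowulf muTect wrapper needs all of them is an assumption.

```shell
#!/bin/bash
# Reference files for b37 (Ensembl/NCBI/1000genomes) alignments, from /fdb/muTect.
REF=/fdb/muTect/human_g1k_v37.fasta
DBSNP=/fdb/muTect/dbsnp_137.b37.vcf
COSMIC=/fdb/muTect/cosmic_v67.b37.vcf

# Hypothetical helper: print a muTect command line on one line.
# $1 = normal BAM, $2 = tumor BAM, $3 = output file prefix
mutect_cmd() {
    printf '%s ' muTect --memory 5g \
        --analysis_type MuTect \
        --reference_sequence "$REF" \
        --dbsnp "$DBSNP" \
        --cosmic "$COSMIC" \
        --input_file:normal "$1" \
        --input_file:tumor "$2" \
        --out "$3.call_stats.txt" \
        --coverage_file "$3.coverage.wig.txt"
    echo
}

# Placeholder inputs; replace with real BAM paths and a real prefix.
mutect_cmd normal.bam tumor.bam sample1
```

Once the printed command looks right, it can be pasted into a job script after `module load muTect`. Note that -nt is deliberately absent, in line with NOTE 2.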

Documentation