Meme and Mast on Biowulf

Meme discovers motifs (highly conserved regions) in groups of related DNA or protein sequences, and Mast searches sequence databases using those motifs. Meme and Mast were developed at UCSD and Purdue. See the Meme/Mast website.

Meme is CPU-intensive for large numbers of sequences or for long sequences. Short jobs are most easily run on Helix; for larger datasets, a parallel run on Biowulf is appropriate.

How to run Meme on Biowulf

Your input database should be a single file containing sequences in FASTA format. In the example below, the file is 'mini-drosoph.seqs'. Determine the number of characters in the file with 'wc -c filename' and use that value for the 'maxsize' parameter. Set up a batch script along the lines of the one below:
------------ this file is meme.batch --------------------------------------
#!/bin/csh
#PBS -N Meme
#PBS -m be
#PBS -j oe

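# Put the MPICH launcher (mpirun) on the PATH
setenv PATH /usr/local/mpich/bin:$PATH

# Run from the directory that holds the input sequences
cd /data/user/mydir/
# Parallel Meme: $np is supplied at submission time via 'qsub -v np=...'
mpirun -machinefile $PBS_NODEFILE -np $np  /usr/local/meme/bin/meme_p \
      mini-drosoph.seqs -dir /usr/local/meme/ -maxsize 500000 -text > mini-drosoph.meme
# Run Mast on the motifs Meme wrote to mini-drosoph.meme
mast mini-drosoph.meme -text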
----------------------------------------------------------------------------
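Submit this script with qsub, passing the number of processes as the variable 'np' (the script uses it in 'mpirun -np $np'):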
qsub -v np=32 -l nodes=16 meme.batch
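
The 'maxsize' value in the batch script comes from the character count of the input file. A minimal sketch of that step is shown below. The function name 'meme_maxsize' is hypothetical and is not part of Meme; it only wraps the 'wc -c' call described above.

```shell
# Hypothetical helper (not part of Meme): print the number of characters
# in a sequence file, suitable for use as the -maxsize value.
meme_maxsize() {
    # Reading from stdin makes wc print only the count, without the file name;
    # tr strips the padding that some wc implementations add.
    wc -c < "$1" | tr -d ' '
}
```

For example, 'meme_maxsize mini-drosoph.seqs' prints the count for the example file, and that number can be substituted for 500000 in the batch script.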

Meme scales well, and large Meme jobs (maxsize ~500,000) can be submitted on up to 128 processors.
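
To scale the submission up, the 'np' and 'nodes' values have to change together. The sketch below builds the qsub line for a given processor count. It assumes two processes per node, which is only inferred from the np=32 / nodes=16 example above; adjust the ratio if your nodes differ. The function name 'meme_qsub_cmd' is hypothetical.

```shell
# Hypothetical helper: print the qsub command for a given number of processes.
# ASSUMPTION: two processes per node, matching the np=32 / nodes=16 example.
meme_qsub_cmd() {
    np="$1"
    # Round up so an odd process count still gets enough nodes.
    nodes=$(( (np + 1) / 2 ))
    echo "qsub -v np=$np -l nodes=$nodes meme.batch"
}
```

Under that assumption, 'meme_qsub_cmd 128' prints the command for the 128-processor upper limit mentioned above.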

Documentation

  1. Type 'meme' or 'mast' with no parameters on the command line to see a list of all available options and more information.
  2. Meme documentation at the SDSC website.
  3. Mast documentation at the SDSC website.

This document is available as http://biowulf.nih.gov/apps/meme.html