Meme is CPU-intensive when given many sequences or long sequences. Short jobs are most easily run on Helix; for larger datasets, a parallel run on Biowulf is appropriate.
Before running Meme, you will need to load the Meme/Mast environment with 'module load meme'. This command will always load the latest installed version of Meme. To see what versions are available, or to load a particular version, use the 'module' commands as shown below. (More about environment modules)
[user@biowulf ~]$ module avail meme

------------------------- /usr/local/Modules/3.2.9/modulefiles ------------------------
meme/4.6.1 meme/4.7.0 meme/4.8.1 meme/4.9.0

[user@biowulf ~]$ module load meme/4.7.0

[user@biowulf ~]$ module list
Currently Loaded Modulefiles:
  1) meme/4.7.0
The 'module load' command will set up the appropriate MPICH, MPICH2 or OpenMPI path that the Meme executable was built with.
Your input database should consist of a file containing sequences in fasta format. In the example below, the file is 'mini-drosoph.s'.
Maxsize parameter (-maxsize): the maximum dataset size in characters. To find the number of characters in your dataset, type 'wc -c filename'. For example:
[user@biowulf mydir]$ wc -c mini-drosoph.s
506016 mini-drosoph.s
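The character count can be turned directly into a value for -maxsize. The following is a minimal sketch: the function name suggest_maxsize and the 10% headroom are assumptions made for illustration and are not part of Meme. The example on this page simply uses 600000 for a 506016-character file.

```shell
#!/bin/bash
# Hypothetical helper (not part of Meme): print a candidate -maxsize value
# for a sequence file. The 10% headroom is an arbitrary assumption; the only
# requirement taken from this page is that -maxsize covers the character count.
suggest_maxsize() {
    local file="$1"
    local chars
    # 'wc -c < file' prints only the count, without the file name.
    chars=$(wc -c < "$file") || return 1
    # Integer arithmetic: character count plus 10%, rounded up.
    echo $(( (chars * 110 + 99) / 100 ))
}
```

It could be used as 'suggest_maxsize mini-drosoph.s', with the printed number then passed to -maxsize in the batch script.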
Important cautionary note: Please check your meme parameters and input file sizes before submitting jobs. Very large input file sizes are known to cause problems, and may crash the job and hang the allocated nodes. See forum discussion.
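One way to act on this note is to check the input size before calling qsub. The sketch below assumes a hypothetical function check_input_size and a limit chosen by the user; this page does not state a specific safe limit, so the value passed in is your own judgment.

```shell
#!/bin/bash
# Hypothetical pre-submission check (not part of Meme or PBS).
# Usage: check_input_size FILE LIMIT
# Returns 0 if FILE has at most LIMIT characters, 1 otherwise.
check_input_size() {
    local file="$1" limit="$2" chars
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    # Arithmetic expansion strips any padding that wc may print.
    chars=$(( $(wc -c < "$file") ))
    if [ "$chars" -gt "$limit" ]; then
        echo "$file has $chars characters, above the limit of $limit" >&2
        return 1
    fi
    echo "$file: $chars characters, within the limit of $limit"
}
```

For example, 'check_input_size mini-drosoph.s 600000 && qsub -v np=64 -l nodes=4:e2666 meme.batch' would submit only if the file fits within the -maxsize value used in the batch script.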
Create a batch script along the following lines:
---- this file is called meme.batch ---------
#!/bin/bash
#PBS -N Meme
#PBS -m be
#PBS -j oe

module load meme/4.9.0
cd /data/username/mydir
`which mpirun` -machinefile $PBS_NODEFILE -np $np `which meme_p` mini-drosoph.s \
    -oc meme_out -maxsize 600000 -p $np
Submit this job with a command along the lines of:
qsub -v np=64 -l nodes=4:e2666 scriptname
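In the command above, np (64) is the total number of processes spread over the 4 requested nodes, which works out to 16 per node. A minimal sketch for keeping the two numbers consistent follows; the function build_qsub_cmd is hypothetical, and the per-node figure is an assumption that should be checked against the node type you request.

```shell
#!/bin/bash
# Hypothetical helper: print (not run) a qsub command in which np equals
# nodes * processes-per-node.
# Arguments: NODES PER_NODE NODE_TYPE SCRIPT
build_qsub_cmd() {
    local nodes="$1" per_node="$2" node_type="$3" script="$4"
    local np=$(( nodes * per_node ))
    echo "qsub -v np=${np} -l nodes=${nodes}:${node_type} ${script}"
}
```

Running 'build_qsub_cmd 4 16 e2666 meme.batch' prints a command matching the example above, which can be reviewed before it is actually submitted.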
Meme scales well, and large meme jobs (maxsize ~500,000) can be submitted on up to 128 processors.
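The standard output and standard error from the job are written to files named Meme.oJobNum and Meme.eJobNum. Because the example script contains '#PBS -j oe', standard error is merged into the standard output file, so both streams appear in Meme.oJobNum. If the job does not appear to be running correctly, check the output for errors.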
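A quick way to scan the job output is a case-insensitive search. In this sketch the function scan_job_output is hypothetical, and the search words are assumptions rather than a list of messages Meme or MPI is known to print.

```shell
#!/bin/bash
# Hypothetical helper: show numbered lines that mention "error" or "fail"
# in one or more job output files. The search words are assumptions.
# Returns non-zero if nothing matches (standard grep behaviour).
scan_job_output() {
    grep -i -n -E 'error|fail' "$@"
}
```

For example, 'scan_job_output Meme.o*' searches every output file from jobs named Meme in the current directory.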