HMMER on Biowulf

Profile hidden Markov models for biological sequence analysis

Profile hidden Markov models (profile HMMs) can be used to do sensitive database searching using statistical descriptions of a sequence family's consensus. HMMER uses profile HMMs for this kind of search.

HMMER (pronounced 'hammer', as in a more precise mining tool than BLAST) was developed by Sean Eddy at Washington University in St. Louis. The HMMER website is hmmer.janelia.org.
HMMER User Guide (PDF)

HMMER is very CPU-intensive and is parallelized using threads, so each instance of hmmpfam or hmmsearch can use all the CPUs available on a node. HMMER on Biowulf is intended for those who need to run HMMER searches on large numbers of query sequences.
Create a swarm command file with one line for each of the query sequences. Sample swarm command file:

---------------- file swarm.cmd ----------------------------------------------------
hmmpfam /fdb/fastadb/pfam/Pfam_fs /data/user/seqs/myseq1 > /data/user/out/seq1.out
hmmpfam /fdb/fastadb/pfam/Pfam_fs /data/user/seqs/myseq2 > /data/user/out/seq2.out
hmmpfam /fdb/fastadb/pfam/Pfam_fs /data/user/seqs/myseq3 > /data/user/out/seq3.out
hmmpfam /fdb/fastadb/pfam/Pfam_fs /data/user/seqs/myseq4 > /data/user/out/seq4.out
hmmpfam /fdb/fastadb/pfam/Pfam_fs /data/user/seqs/myseq5 > /data/user/out/seq5.out
[....]
------------------------------------------------------------------------------------

The HMMER programs hmmcalibrate, hmmsearch, and hmmpfam are set up to use all available CPUs on a node. This swarm job should therefore be submitted so that only a single command runs on each node. Submit with:

swarm -f swarm.cmd -n 1
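For many query sequences, the command file can be generated rather than typed by hand. The following is a minimal sketch, not part of swarm or HMMER: make_swarm_file is a hypothetical helper name, and the sketch assumes each query sequence is a separate file in one directory, with no spaces in the file names. Unlike the sample above, it names each output file after the full input file name (myseq1.out rather than seq1.out).

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not a Biowulf tool):
# print one hmmpfam command line per query sequence file.
make_swarm_file() {
    db=$1       # HMM database, e.g. /fdb/fastadb/pfam/Pfam_fs
    seqdir=$2   # directory holding one query sequence per file
    outdir=$3   # directory that will receive the hmmpfam output
    for seq in "$seqdir"/*; do
        [ -f "$seq" ] || continue            # skip subdirectories and an empty glob
        name=$(basename "$seq")
        printf 'hmmpfam %s %s > %s\n' "$db" "$seq" "$outdir/$name.out"
    done
}

# Example call using the paths from the sample file (left commented out
# so that the sketch itself writes nothing):
# make_swarm_file /fdb/fastadb/pfam/Pfam_fs /data/user/seqs /data/user/out > swarm.cmd
```

The resulting swarm.cmd would then be submitted exactly as shown above, with -n 1 so that each node runs a single command.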
----------- file hmm_homolog -----------------------------------------
#!/bin/csh
#PBS -N Hmmer
#PBS -m be
#PBS -k oe
cd /data/user/mydir
# build a profile HMM from the alignment
hmmbuild -g globins.hmm globins.msf
# calibrate the model's search statistics
hmmcalibrate globins.hmm
# search the E. coli protein database with the model
hmmsearch globins.hmm /fdb/fastadb/ecoli.aa.fas
------------------------------------------------------------------------

This script starts with a multiple sequence alignment of a protein domain or protein family in the file globins.msf. This file can be created by aligning sequences with ClustalW. The hmmbuild command builds a profile HMM from the alignment, the hmmcalibrate command calibrates the model's E-value statistics, which increases the sensitivity of the search, and the hmmsearch command uses the globin model to search for globin domains in the E. coli database. See the HMMER documentation for more information. Submit this file with:

qsub -l nodes=1 hmm_homolog
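The same three steps can be wrapped so that they work for any alignment and stop as soon as one step fails. This is only a sketch in Bourne shell rather than csh: hmm_homolog_search is a hypothetical function name, the commands use the same forms as the script above, and the alignment file name is assumed to have an extension such as .msf.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration): build, calibrate,
# and search with one alignment, stopping at the first step that fails.
hmm_homolog_search() {
    aln=$1                 # multiple sequence alignment, e.g. globins.msf
    db=$2                  # sequence database, e.g. /fdb/fastadb/ecoli.aa.fas
    hmm="${aln%.*}.hmm"    # strip the extension: globins.msf -> globins.hmm
    hmmbuild -g "$hmm" "$aln" &&
    hmmcalibrate "$hmm" &&
    hmmsearch "$hmm" "$db"
}

# Example call matching the script above (left commented out):
# cd /data/user/mydir
# hmm_homolog_search globins.msf /fdb/fastadb/ecoli.aa.fas
```

Chaining the commands with && means a failed hmmbuild or hmmcalibrate is never followed by a search against a missing or uncalibrated model.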
More Info

The entire HMMER suite of programs is available in /usr/local/hmmer. Note that only hmmcalibrate, hmmsearch and hmmpfam are parallelized.

A large collection of protein sequence databases is in /fdb/fastadb/.
This document is available as http://biowulf.nih.gov/apps/hmmer/index.html