EMBOSS PACKAGE

EMBOSS stands for "The European Molecular Biology Open Software Suite". Within EMBOSS you will find hundreds of programs (applications) covering areas such as:
When to use EMBOSS on Biowulf

EMBOSS on Biowulf is intended for processing a large number of sequence files, such as hundreds or thousands of query sequences, with the programs in EMBOSS. If you have just a few query sequences, you should use the EMBOSS web interface or the command line on Helix. Please contact the Helix Systems staff (staff@helix.nih.gov, or 301-594-6248) if you have questions about your EMBOSS jobs.

Submit Multiple Jobs Using swarm Program

1. For csh or tcsh users, set the PLPLOT_LIB variable and add /usr/local/emboss/bin to your path. You can insert the following lines at the end of your .cshrc file:
Or for bash/ksh/sh users, insert the following at the end of your .bashrc file:

PLPLOT_LIB=/usr/local/emboss/lib
PATH=/usr/local/emboss/bin:$PATH
emboss_acdroot=/usr/local/emboss/share/EMBOSS/acd
export PLPLOT_LIB PATH emboss_acdroot

2. Set up a command file to run swarm. For example, to run the EMBOSS program 'seqret' for 2500 sequences, create a file called 'cmdfile' which contains the following lines:

seqret -sequence 'genbank:ab1681*' -outseq 'outseq1'
seqret -sequence 'swissprot:P16310' -outseq 'outseq2'
seqret -sequence 'genpept:M31661' -outseq 'outseq3'
...............
.............
...............
seqret -sequence 'refseqnt:nc_011*' -outseq 'outseq4'

Each command line in cmdfile should appear just as it would be entered on the command line.

3. If you have over 1000 commands, especially if each one runs for a short time, you should 'bundle' your jobs with the -b flag. Bundling greatly increases the speed of your jobs and avoids overloading the cluster. To bundle your jobs, first use the following formula to determine the value BN:

BN = 'command number' / (nodes no. x 2)

For example, if you have 5000 commands in your swarm file and the current maximum node number per user is 64, then BN = 5000 / (64 x 2) = 39.06 (round up to 40). Then submit the swarm job as below, where 40 is the BN value:

swarm -f cmdfile -b 40

4. Sometimes it is very time-consuming to put together a command file for a swarm job by hand (for example, 800 lines in a file). You will probably want to write a simple csh or perl script to build this swarm command file. If you are unfamiliar with csh, Basic scripting with csh may be useful. The following is an example using csh to build a command file:

helix% cd my_sequence_directory
helix% touch cmdfile
helix% foreach file (*)
foreach> echo "patmatmotifs $file $file.out" >> cmdfile
foreach> end
helix%
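
For bash users, steps 3 and 4 can be sketched together as below. This is a minimal sketch and not part of the original instructions: the function names bundle_size and build_cmdfile are hypothetical, and it assumes that file names contain no whitespace. It skips the command file itself in case it lives inside the sequence directory, since a plain * glob would otherwise list it as a sequence.

```shell
#!/bin/bash
# Illustrative helpers only; bundle_size and build_cmdfile are made-up names.

# BN from step 3: commands / (nodes x 2), rounded up to a whole number.
bundle_size() {
    local commands=$1 nodes=$2
    local slots=$(( nodes * 2 ))
    echo $(( (commands + slots - 1) / slots ))
}

# Write one patmatmotifs command per regular file found in a directory.
# The command file is emptied first, so rerunning does not append duplicates.
build_cmdfile() {
    local seqdir=$1 cmdfile=$2
    local file
    : > "$cmdfile"
    for file in "$seqdir"/*; do
        [ -f "$file" ] || continue               # skip subdirectories and an unmatched glob
        [ "$file" -ef "$cmdfile" ] && continue   # skip the command file itself
        printf 'patmatmotifs %s %s.out\n' "$file" "$file" >> "$cmdfile"
    done
}

# Example use (64 is the per-user node maximum quoted in step 3):
#   build_cmdfile my_sequence_directory cmdfile
#   swarm -f cmdfile -b "$(bundle_size "$(wc -l < cmdfile)" 64)"
```

With 5000 commands and 64 nodes, bundle_size gives 40, matching the worked example in step 3.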
5. More info regarding swarm program