Biowulf at the NIH
Clustal Omega on Helix & Biowulf
Clustal Omega is a newer member of the Clustal family. It scales far better than previous versions, allowing hundreds of thousands of sequences to be aligned in only a few hours, and it makes use of multiple processors where present. The quality of its alignments is also superior to that of previous versions, as measured by a range of popular benchmarks.

Clustal Omega was developed by the Higgins lab in Dublin and is described in the paper "Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega."

Running Clustal Omega on Helix

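Sample session (clustalo writes aligned FASTA by default, whatever the extension of the output file; add --outfmt=msf to get MSF output):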

helix% cd /home/user/mydir

helix% module load clustalo

helix% clustalo -i globins630.fa -o globins630.msf

helix% more globins630.msf
>BAHG_VITSP
-----------------MLDQQTINIIKATVPVLKEHG-----VTITTTFYKN--LFAKH
PEVRPLFDMGRQESL--E----QP---K--AL-----AMTVLAAAQNIENLPAIL--PAV
KKIAVKHCQA-GVAAAHYPIVGQE--------LLGAIKEVLG-DAATDDILDAWGKAYGV
I-ADVfiqveadLYAQAVE-------------------------
>GLB1_ANABR
-------PSVQGAAA--QLTADVKKDLRDSWKVI-G----SDKKGNGVALMTT--LFADN
QETIGYFKRLGNVSQG-M----AND--KLRGHSITLMYALQNFIDQLDNTDDL---VCVV
EKFAVNHITR-KISAAEFGKINGP--------IKKVLAS----KNFGDKYANAWAKLVAV
V-QAAL--------------------------------------
>GLB1_ARTSX
[... etc...]
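
One quick sanity check on aligned FASTA output like the sample above is that every record has the same length once gap characters are counted. The following is a minimal sketch, assuming FASTA-style records (a '>' header line followed by one or more sequence lines); check_alignment is a hypothetical helper name introduced here, not part of Clustal Omega.

```shell
#!/bin/bash
# check_alignment FILE
# Print the number of records and the alignment length of an aligned FASTA
# file. Returns non-zero if the file is empty or the records differ in length.
# (Hypothetical helper for illustration; not shipped with Clustal Omega.)
check_alignment() {
    awk '
        # Close the current record: remember the first width, flag any mismatch.
        function finish_record() {
            if (count == 0) return
            if (width == "") width = n
            else if (n != width) bad = 1
        }
        # A header line starts a new record.
        /^>/ { finish_record(); count++; n = 0; next }
        # Any other line is sequence data: strip whitespace and add its length.
             { gsub(/[ \t\r]/, ""); n += length($0) }
        END {
            finish_record()
            if (count == 0) { print "no sequences found"; exit 1 }
            if (bad) { print count " sequences, inconsistent lengths"; exit 1 }
            print count " sequences, alignment length " width
        }
    ' "$1"
}
```

After sourcing the function, it could be run on the file from the session above as: check_alignment globins630.msf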

Running a Clustal Omega job on Biowulf

Set up a batch script (called myjob.bat in the commands below) along the following lines:

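#!/bin/bash
# Job name
#PBS -N ClustalO
# Send mail when the job begins (b) and ends (e)
#PBS -m be

# Make clustalo available in the job environment
module load clustalo

# Work in the directory that holds the input sequences
cd /data/user/mydir
# --auto lets clustalo set its options automatically; with no -o, the alignment
# goes to standard output, which the batch system saves in the job's output file
clustalo -i myseqs.fasta --auto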

Submit this job with:

qsub -l nodes=1 myjob.bat

The above command submits the job to a default node with 2 CPUs. By default, Clustal Omega uses all the cores available on the allocated node. To submit to a node with 4 cores, use:

qsub -l nodes=1:c4 myjob.bat
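
If a job should use fewer threads than the node has cores, clustalo's --threads option sets the count explicitly. The batch script below is a sketch under some assumptions: the clustalo module is already loaded, the file names are placeholders, and align_with_threads is a hypothetical helper name used only for this example.

```shell
#!/bin/bash
#PBS -N ClustalO
#PBS -m be

# Sketch: run clustalo with an explicit thread count instead of the default
# (all cores on the node). Assumes 'module load clustalo' has already been run.
# align_with_threads is a hypothetical helper, not a Biowulf or clustalo command.
align_with_threads() {
    local input=$1 output=$2 threads=$3
    # -i: input sequences, -o: output file, --threads: number of processors to use
    clustalo -i "$input" -o "$output" --threads="$threads"
}

# Placeholder file names; the alignment runs only if the input exists here.
if [ -f myseqs.fasta ]; then
    align_with_threads myseqs.fasta myseqs.aln 2
fi
```

Writing the result with -o keeps the alignment in a named file and separate from the job's standard output.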

Type 'freen' to see the available node types and number of cores on each.

Benchmarks: An input file with 630 globin sequences in FASTA format took about 28 seconds to align on a 4-core node.

Documentation

Clustal Omega paper
README file which explains all the options