ClustalW on Biowulf
ClustalW is a multiple sequence alignment program for DNA or protein sequences. It greatly improves the sensitivity of the commonly used progressive multiple sequence alignment method for divergent protein sequences. ClustalW on Biowulf is intended for batch jobs. Short or occasional ClustalW jobs should be run on Helix.
Setting up a ClustalW job
Sample batch script:
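A minimal sketch of such a script, assumed here to be saved as clustalw.run, is given below. The input and output paths are hypothetical placeholders, the `#PBS` directives assume a PBS-style batch system (as suggested by the use of `qsub`), and the `run_clustalw` helper and `DRY_RUN` variable are illustrative names rather than part of ClustalW. The options used (`-INFILE`, `-ALIGN`, `-OUTFILE`) are taken from the ClustalW option list.

```shell
#!/bin/bash
#PBS -N clustalw
#PBS -m e
# The two directives above (job name, mail at job end) assume a PBS-style
# batch system; adjust or remove them as needed.

# Hypothetical input and output files; replace with your own paths.
INFILE=/data/user/seqs.fasta
OUTFILE=/data/user/seqs.aln

# Run clustalw with the given options. If DRY_RUN is set, or clustalw is not
# on the PATH, only print the command line so it can be checked first.
run_clustalw() {
    if [ -n "${DRY_RUN:-}" ] || ! command -v clustalw >/dev/null 2>&1; then
        echo "clustalw $*"
    else
        clustalw "$@"
    fi
}

# Full multiple alignment of the input sequences.
run_clustalw "-INFILE=$INFILE" -ALIGN "-OUTFILE=$OUTFILE"
```

Running the script by hand with `DRY_RUN=1` set prints the ClustalW command line without starting an alignment, which can be a convenient check before the job is submitted.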
Submit the job with:
qsub -l nodes=1 clustalw.run
The batch system will send an email message at the end of the run.
Nodes with more memory can be requested by adding a node property such as m2048 or m4096:
qsub -l nodes=1:m2048 clustalw.run
qsub -l nodes=1:m4096 clustalw.run
Submitting to Firebolt
If the job requires more than 4 GB of memory, it should be submitted to Firebolt. Please read the Firebolt User Guide before using Firebolt. The same batch script can be used. The job would be submitted with a command like:
aqsub -l nodes=1:altix,mem=4gb /data/user/clustalw.run
It is important to specify the required memory in the command above.
ClustalW Command-line options
These options can be listed interactively by typing 'clustalw -help' or 'clustalw -check' on the command line.
CLUSTAL W (1.83) Multiple Sequence Alignments
DATA (sequences)
-INFILE=file.ext :input sequences.
-PROFILE1=file.ext and -PROFILE2=file.ext :profiles (old alignment).
VERBS (do things)
-OPTIONS :list the command line parameters
-HELP or -CHECK :outline the command line params.
-ALIGN :do full multiple alignment
-TREE :calculate NJ tree.
-BOOTSTRAP(=n) :bootstrap a NJ tree (n= number of bootstraps; def. = 1000).
-CONVERT :output the input sequences in a different file format.
PARAMETERS (set things)
***General settings:***
-INTERACTIVE :read command line, then enter normal interactive menus
-QUICKTREE :use FAST algorithm for the alignment guide tree
-TYPE= :PROTEIN or DNA sequences
-NEGATIVE :protein alignment with negative values in matrix
-OUTFILE= :sequence alignment file name
-OUTPUT= :GCG, GDE, PHYLIP, PIR or NEXUS
-OUTORDER= :INPUT or ALIGNED
-CASE :LOWER or UPPER (for GDE output only)
-SEQNOS= :OFF or ON (for Clustal output only)
-SEQNO_RANGE= :OFF or ON (NEW: for all output formats)
-RANGE=m,n :sequence range to write starting m to m+n.
***Fast Pairwise Alignments:***
-KTUPLE=n :word size
-TOPDIAGS=n :number of best diags.
-WINDOW=n :window around best diags.
-PAIRGAP=n :gap penalty
-SCORE :PERCENT or ABSOLUTE
***Slow Pairwise Alignments:***
-PWMATRIX= :Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename
-PWDNAMATRIX= :DNA weight matrix=IUB, CLUSTALW or filename
-PWGAPOPEN=f :gap opening penalty
-PWGAPEXT=f :gap extension penalty
***Multiple Alignments:***
-NEWTREE= :file for new guide tree
-USETREE= :file for old guide tree
-MATRIX= :Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename
-DNAMATRIX= :DNA weight matrix=IUB, CLUSTALW or filename
-GAPOPEN=f :gap opening penalty
-GAPEXT=f :gap extension penalty
-ENDGAPS :no end gap separation pen.
-GAPDIST=n :gap separation pen. range
-NOPGAP :residue-specific gaps off
-NOHGAP :hydrophilic gaps off
-HGAPRESIDUES= :list hydrophilic res.
-MAXDIV=n :% ident. for delay
-TYPE= :PROTEIN or DNA
-TRANSWEIGHT=f :transitions weighting
***Profile Alignments:***
-PROFILE :Merge two alignments by profile alignment
-NEWTREE1= :file for new guide tree for profile1
-NEWTREE2= :file for new guide tree for profile2
-USETREE1= :file for old guide tree for profile1
-USETREE2= :file for old guide tree for profile2
***Sequence to Profile Alignments:***
-SEQUENCES :Sequentially add profile2 sequences to profile1 alignment
-NEWTREE= :file for new guide tree
-USETREE= :file for old guide tree
***Structure Alignments:***
-NOSECSTR1 :do not use secondary structure-gap penalty mask for profile 1
-NOSECSTR2 :do not use secondary structure-gap penalty mask for profile 2
-SECSTROUT=STRUCTURE or MASK or BOTH or NONE :output in alignment file
-HELIXGAP=n :gap penalty for helix core residues
-STRANDGAP=n :gap penalty for strand core residues
-LOOPGAP=n :gap penalty for loop regions
-TERMINALGAP=n :gap penalty for structure termini
-HELIXENDIN=n :number of residues inside helix to be treated as terminal
-HELIXENDOUT=n :number of residues outside helix to be treated as terminal
-STRANDENDIN=n :number of residues inside strand to be treated as terminal
-STRANDENDOUT=n :number of residues outside strand to be treated as terminal
***Trees:***
-OUTPUTTREE=nj OR phylip OR dist OR nexus
-SEED=n :seed number for bootstraps.
-KIMURA :use Kimura's correction.
-TOSSGAPS :ignore positions with gaps.
-BOOTLABELS=node OR branch :position of bootstrap values in tree display
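As an illustration of how the options listed above can be combined, the following sketch builds two command lines: a full multiple alignment of protein sequences written in PHYLIP format, and a bootstrapped NJ tree computed from that alignment. The file names, the number of bootstrap replicates and the seed are arbitrary assumptions chosen for the example. The script only prints the commands so they can be reviewed before being placed in a batch script; it does not run ClustalW.

```shell
#!/bin/bash
# Hypothetical file names; replace with your own.
SEQS=seqs.fasta
ALN=seqs.phy

# Full multiple alignment of protein sequences, written in PHYLIP format
# with the sequences kept in input order.
ALIGN_CMD="clustalw -INFILE=$SEQS -ALIGN -TYPE=PROTEIN -OUTPUT=PHYLIP -OUTORDER=INPUT -OUTFILE=$ALN"

# Bootstrapped NJ tree from that alignment: 500 replicates with a fixed seed,
# Kimura's correction, positions with gaps ignored, and bootstrap values
# placed on the nodes.
BOOT_CMD="clustalw -INFILE=$ALN -BOOTSTRAP=500 -SEED=42 -KIMURA -TOSSGAPS -BOOTLABELS=node"

# Print both commands for review.
echo "$ALIGN_CMD"
echo "$BOOT_CMD"
```

Fixing the seed makes a bootstrap run repeatable, which can help when comparing results across jobs.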
This document is available as http://biowulf.nih.gov/apps/dmol.html