Tree-Puzzle on Biowulf
TREE-PUZZLE is a computer program to reconstruct phylogenetic trees from molecular sequence data by maximum likelihood. It
implements a fast tree search algorithm, quartet puzzling, that allows analysis of large data sets and automatically assigns estimates
of support to each internal branch. TREE-PUZZLE also computes pairwise maximum likelihood distances as well as branch lengths
for user-specified trees. Branch lengths can also be calculated under the clock assumption. In addition, TREE-PUZZLE offers
likelihood mapping, a method to investigate the support of a hypothesized internal branch without computing an overall tree and to
visualize the phylogenetic content of a sequence alignment. TREE-PUZZLE also conducts a number of statistical tests on the data set
(chi-square test for homogeneity of base composition, likelihood ratio to test the clock hypothesis, Kishino-Hasegawa test). The
models of substitution provided by TREE-PUZZLE are TN, HKY, F84, SH for nucleotides, Dayhoff, JTT, mtREV24, BLOSUM 62,
VT, WAG for amino acids, and F81 for two-state data. Rate heterogeneity is modeled by a discrete Gamma distribution and by
allowing invariable sites. The corresponding parameters can be inferred from the data set.
Tree-Puzzle Documentation

Tree-Puzzle on Biowulf has been built with MPI for parallel runs. To submit a job on Biowulf, create a command file similar to the following:

-------------------Sample command file for Tree-Puzzle-----------------------
#!/bin/csh
#PBS -N Ppuzzle
#PBS -m be
#PBS -k oe
set path = (/usr/local/mpich/bin $path)
cd /data/username/tree/
mpirun -machinefile $PBS_NODEFILE -np $np /usr/local/bin/ppuzzle << EOF
primates.b
y
EOF
-----------------------------------------------------------------------------

where primates.b is the input file for puzzle. See the Tree-Puzzle documentation for a list of all available parameters.

Submit this job using the qsub command, e.g.:

qsub -v np=4 -l nodes=2 command-file

where 'command-file' is the file you created above.
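
The qsub example above pairs np=4 with nodes=2, i.e. two processes per node. As a minimal sketch, the node count for other values of np could be derived as follows. The function name `qsub_command` and the two-processes-per-node ratio are assumptions inferred from that single example, not documented Biowulf settings. The function only prints the command; it does not submit anything.

```shell
#!/bin/sh
# Hypothetical helper: print the qsub command for a given process count.
# Usage: qsub_command NP COMMAND_FILE
qsub_command() {
    np=$1
    cmdfile=$2
    procs_per_node=2   # assumption: inferred from np=4 / nodes=2 in the example
    # Round up so that an odd process count still gets enough nodes.
    nodes=$(( (np + procs_per_node - 1) / procs_per_node ))
    echo "qsub -v np=$np -l nodes=$nodes $cmdfile"
}

qsub_command 4 command-file
```

If the nodes in use offer a different number of processors, `procs_per_node` would need to be changed accordingly.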
Tree-Puzzle options

Tree-Puzzle has many options. A summary is below:

GENERAL OPTIONS
 b   Type of analysis?                   Tree reconstruction
 k   Tree search procedure?              Quartet puzzling
 v   Approximate quartet likelihood?     No
 u   List unresolved quartets?           No
 n   Number of puzzling steps?           1000
 j   List puzzling step trees?           No
 o   Display as outgroup?                Gibbon
 z   Compute clocklike branch lengths?   No
 e   Parameter estimates?                Approximate (faster)
 x   Parameter estimation uses?          Neighbor-joining tree
SUBSTITUTION PROCESS
 d   Type of sequence input data?        Nucleotides
 m   Model of substitution?              HKY (Hasegawa et al. 1985)
 t   Transition/transversion parameter?  Estimate from data set
 f   Nucleotide frequencies?             Estimate from data set
RATE HETEROGENEITY
 w   Model of rate heterogeneity?        Uniform rate

Details about all options are available in the Tree-Puzzle documentation. Options are specified in the command file by simply entering the interactive menu options and values as needed. For example, to change the number of puzzling steps in your run to 8000, the command file would look like:

--------------------------------------------------------
#!/bin/csh
#PBS -N Ppuzzle
#PBS -m be
#PBS -k oe
set path = (/usr/local/mpich/bin $path)
cd /data/username/tree/
mpirun -machinefile $PBS_NODEFILE -np $np /usr/local/bin/ppuzzle << EOF
primates.b
n
8000
y
EOF
--------------------------------------------------------

It is often simplest to determine the parameters by running puzzle (not ppuzzle, but puzzle, which is the non-parallel version) on the biobos command-line, selecting the parameters and noting the order in which they are needed, and then entering the same parameters into the command file.
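
The answers fed to ppuzzle follow a fixed pattern: the input file name, then each menu key and its value in the order the menu asks for them, then a final y. A minimal sketch of a helper that prints those lines is shown here. The function name `puzzle_answers` is hypothetical, and the sketch assumes that every key and value goes on its own line, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: print the lines that go between "<< EOF" and "EOF".
# Usage: puzzle_answers INPUT_FILE [MENU_KEY_OR_VALUE ...]
puzzle_answers() {
    printf '%s\n' "$1"        # the input file name comes first
    shift
    for answer in "$@"; do    # each menu key or value on its own line
        printf '%s\n' "$answer"
    done
    printf '%s\n' y           # closing "y", as in the sample command files
}

# Lines for a run with 8000 puzzling steps:
puzzle_answers primates.b n 8000
```

The printed lines can be pasted into the command file. The order of the arguments still has to match what an interactive puzzle session asks for, so checking the sequence with the non-parallel puzzle first remains the safest approach.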
This document is available as http://biowulf.nih.gov/puzzle/index.html