BEAST (Bayesian Evolutionary Analysis Sampling Trees) is a cross-platform program for Bayesian MCMC analysis of molecular sequences. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. BEAST uses MCMC to average over tree space, so that each tree is weighted in proportion to its posterior probability. It includes a simple-to-use user-interface program for setting up standard analyses and a suite of programs for analysing the results.
BEAST is a single-threaded program. It is only advantageous to run BEAST on Biowulf if you need to run a large number of BEAST jobs simultaneously.
Please set your environment using environment modules. When using swarm, this is easily done with the --module option.
Create a swarm command file with a single line for each run. Sample file:
----- this file is beast.swarm------------
beast file1.xml
beast file2.xml
beast file3.xml
beast file4.xml
[...]
Submit this swarm of jobs with the command:
swarm -f beast.swarm --module BEAST
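When there are many input files, the swarm command file can be generated rather than typed by hand. The following is a minimal sketch, assuming all the BEAST XML inputs sit in one directory; the function name `make_beast_swarm` and its arguments are hypothetical and not part of BEAST or swarm.

```shell
#!/bin/bash
# Hypothetical helper: write one "beast <file>" line per XML file in a directory.
#   $1 = directory holding the XML inputs (default: current directory)
#   $2 = swarm command file to write (default: beast.swarm)
make_beast_swarm() {
    local dir="${1:-.}"
    local out="${2:-beast.swarm}"
    local f
    : > "$out"                      # start with an empty swarm file
    for f in "$dir"/*.xml; do
        [ -e "$f" ] || continue     # skip the unexpanded pattern if nothing matches
        printf 'beast %s\n' "$f" >> "$out"
    done
}
```

Calling `make_beast_swarm . beast.swarm` in the directory holding the XML files would produce a file in the same one-command-per-line form as the sample above, which can then be submitted with the swarm command shown.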
BEAGLE is a high-performance library that can perform the core calculations at the heart of most Bayesian and Maximum Likelihood phylogenetics packages. It can make use of GPUs.
Note that BEAST+BEAGLE uses only one GPU. Each Biowulf GPU node has two GPUs, so the script below runs two separate instances of BEAST+BEAGLE to make full use of both.
Sample batch script using two of the BEAST benchmark sets as input:
#!/bin/bash
#PBS -N beast-beagle

module load BEAST/1.7.5-gpu
cd $PBS_O_WORKDIR
cp /usr/local/apps/BEAST/1.7.5/examples/Benchmarks/benchmark1.xml .
cp /usr/local/apps/BEAST/1.7.5/examples/Benchmarks/benchmark2.xml .
beast -beagle -seed 123456 -beagle_GPU benchmark1.xml > benchmark1.out 2>&1 &
beast -beagle -seed 123456 -beagle_GPU benchmark2.xml > benchmark2.out 2>&1 &
wait
Submit this job with:
qsub -l nodes=1:gpu2050 jobscript
You can determine whether the GPU is being used in two ways:
- Standard output from the job.
BEAST v1.7.5, 2002-2013
Bayesian Evolutionary Analysis Sampling Trees
[...etc...]
Using strict molecular clock model.
Creating state frequencies model 'frequencies': Initial frequencies = {0.25, 0.25, 0.25, 0.25}
Creating HKY substitution model. Initial kappa = 2.0
Creating site model.
Using BEAGLE TreeLikelihood
Branch rate model used: strictClockBranchRates
Using BEAGLE resource 1: Tesla M2050
    Global memory (MB): 2687
    Clock speed (Ghz): 1.15
    Number of cores: 448
  with instance flags: PRECISION_SINGLE COMPUTATION_SYNCH EIGEN_REAL SCALING_MANUAL SCALERS_RAW VECTOR_NONE THREADING_NONE PROCESSOR_GPU
[...etc...]
- Use the 'nvidia-smi' command on the allocated GPU node. This will report the actual compute processes on the GPU.
[susanc@biowulf ~]$ rsh p83 nvidia-smi
Fri Jun 14 14:06:48 2013
+------------------------------------------------------+
| NVIDIA-SMI 3.295.33   Driver Version: 295.33         |
|-------------------------------+----------------------+----------------------+
| Nb.  Name                     | Bus Id        Disp.  | Volatile ECC SB / DB |
| Fan   Temp   Power Usage /Cap | Memory Usage         | GPU Util. Compute M. |
|===============================+======================+======================|
| 0.  Tesla M2050               | 0000:02:00.0  Off    |         0          0 |
|  N/A   N/A  P0    N/A /  N/A  |   5%  122MB / 2687MB |   40%     Default    |
|-------------------------------+----------------------+----------------------|
| 1.  Tesla M2050               | 0000:03:00.0  Off    |         0          0 |
|  N/A   N/A  P1    N/A /  N/A  |   0%    6MB / 2687MB |    0%     Default    |
|-------------------------------+----------------------+----------------------|
| Compute processes:                                               GPU Memory |
|  GPU  PID     Process name                                       Usage      |
|=============================================================================|
|  0.  8204     java                                               114MB      |
+-----------------------------------------------------------------------------+
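The first check, reading the job's standard output, can also be scripted. The sketch below assumes the BEAST log reports the `PROCESSOR_GPU` instance flag in the same way as the sample output above; the function names `beast_log_uses_gpu` and `report_gpu_use` are hypothetical helpers, not BEAST or BEAGLE commands.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a BEAST log lists the PROCESSOR_GPU
# instance flag, as in the sample standard output shown earlier.
beast_log_uses_gpu() {
    grep -q 'PROCESSOR_GPU' "$1"
}

# Hypothetical helper: print a one-line verdict for each log file given.
report_gpu_use() {
    local log
    for log in "$@"; do
        if beast_log_uses_gpu "$log"; then
            echo "$log: BEAGLE is using a GPU"
        else
            echo "$log: no PROCESSOR_GPU flag found"
        fi
    done
}
```

With the batch script above, `report_gpu_use benchmark1.out benchmark2.out` would check both output files once the runs have written their startup messages. A missing flag only means the string was not found in the log, so it is worth confirming with nvidia-smi before concluding that the GPU is idle.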


