Biowulf at the NIH
BreakDancer on Biowulf

The BreakDancer package provides genome-wide detection of structural variants from next-generation paired-end sequencing reads.

The program breakdancer-max predicts five types of structural variants (insertions, deletions, inversions, and inter- and intra-chromosomal translocations) from next-generation short paired-end sequencing reads. It does so using read pairs that are mapped with unexpected separation distances or orientations.

Program locations

Before running any breakdancer commands, you must set the proper environment variable(s). The easiest way to do this is with the module commands, as in the example below.

biowulf$
biowulf$ module avail breakdancer
----------------------------- /usr/local/Modules/3.2.9/modulefiles --------------------------
breakdancer/1.4.4(default)
biowulf$
biowulf$ module load breakdancer
biowulf$
biowulf$ module list
Currently Loaded Modulefiles:
  1) breakdancer/1.4.4
biowulf$
biowulf$ module unload breakdancer
biowulf$
biowulf$ module load breakdancer/1.4.4
biowulf$
biowulf$ module show breakdancer
-------------------------------------------------------------------

module-whatis    Sets up breakdancer 1.4.4 
prepend-path     PATH /usr/local/apps/breakdancer/1.4.4/cpp
prepend-path     PATH /usr/local/apps/breakdancer/1.4.4/perl 
-------------------------------------------------------------------
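
After 'module load breakdancer', both directories shown above should be on PATH, so bam2cfg.pl and breakdancer-max should resolve without a full path. A minimal sketch for confirming this at the top of a job script follows. The helper name require_tools is hypothetical and is not provided by the module.

```shell
#!/bin/bash
# Hypothetical helper: report any named command that is not on PATH.
require_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Intended use after 'module load breakdancer' (commented out so the
# sketch runs on any machine):
#   require_tools bam2cfg.pl breakdancer-max || exit 1
```

The function returns non-zero if any tool is missing, so a job script can stop early instead of failing partway through.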

Sample Sessions On Biowulf

Submitting a single breakdancer batch job

1. Create a script file along the lines of the example below.

#!/bin/bash
#
## This file is runFile
#
#PBS -N breakdancer
#PBS -m be
#PBS -k oe
#
module load breakdancer
#
cd /data/$USER/breakdancer/run1
bam2cfg.pl %BreakdancerOptions% bam_file1 bam_file2 > config_file.cfg
breakdancer-max config_file.cfg > file.ctx
#

2. Submit the script using the 'qsub' command on biowulf.

biowulf$
biowulf$ qsub -l nodes=1:g8 /data/$USER/runFile
biowulf$
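
The script in step 1 does not stop on errors, so breakdancer-max would still run if an input BAM file were missing or if bam2cfg.pl left an empty config file behind. One possible guard is sketched below. The helper name check_inputs is hypothetical, and the file names are the same placeholders used in the script.

```shell
#!/bin/bash
# Hypothetical helper: fail if any listed file is missing or empty.
check_inputs() {
    local f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty file: $f" >&2
            return 1
        fi
    done
    return 0
}

# Possible use inside runFile (commented out; needs real data and the module):
#   check_inputs bam_file1 bam_file2 || exit 1
#   bam2cfg.pl bam_file1 bam_file2 > config_file.cfg
#   check_inputs config_file.cfg || exit 1
#   breakdancer-max config_file.cfg > file.ctx
```

Checking config_file.cfg between the two steps catches the case where bam2cfg.pl fails but the shell redirection has already created the file.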

Submitting a swarm of breakdancer jobs

Using the 'swarm' utility, one can submit many jobs to the cluster to run concurrently.

Set up a swarm command file (e.g. "/data/$USER/cmdfile"). Here is a sample file:

#
## swarm command file
#
cd /data/$USER/Dir1; bam2cfg.pl %BreakdancerOptions% bam_file1 bam_file2 > config_file.cfg; breakdancer-max config_file.cfg > file.ctx
#
cd /data/$USER/Dir2; bam2cfg.pl %BreakdancerOptions% bam_file1 bam_file2 > config_file.cfg; breakdancer-max config_file.cfg > file.ctx
#
[.....]
#
cd /data/$USER/Dir20; bam2cfg.pl %BreakdancerOptions% bam_file1 bam_file2 > config_file.cfg; breakdancer-max config_file.cfg > file.ctx
#
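
When the run directories follow a regular naming pattern, the command file can be generated rather than typed by hand. The sketch below writes one command line per directory. CMDFILE and NDIRS are hypothetical variables introduced here for illustration, the directory and file names are the placeholders from the sample, and any bam2cfg.pl options are left out.

```shell
#!/bin/bash
# Sketch: write one swarm command line per run directory.
# Assumptions: CMDFILE and NDIRS are hypothetical variables; the page
# itself uses /data/$USER/cmdfile and Dir1 ... Dir20.
cmdfile="${CMDFILE:-./cmdfile}"
ndirs="${NDIRS:-20}"

: > "$cmdfile"    # create or truncate the command file
i=1
while [ "$i" -le "$ndirs" ]; do
    # Single quotes keep $USER literal, so it is expanded when the job runs.
    printf 'cd /data/$USER/Dir%d; bam2cfg.pl bam_file1 bam_file2 > config_file.cfg; breakdancer-max config_file.cfg > file.ctx\n' "$i" >> "$cmdfile"
    i=$((i + 1))
done

echo "wrote $(wc -l < "$cmdfile") commands to $cmdfile"
```

Each generated line holds the full cd, bam2cfg.pl and breakdancer-max sequence, separated by semicolons.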

The '-f' and '--module' options for swarm are required: '-f' names the command file and '--module' loads breakdancer for each command.

By default, each line of the command file above is executed on 1 processor core of a node and uses 1 GB of memory.

If this is not what you want, specify the '-g' flag when you submit the job on biowulf. For example, if each line of the command file needs 10 GB of memory instead of the default 1 GB, include the '-g 10' flag:

biowulf$
biowulf$ swarm -g 10 -f /data/$USER/cmdfile --module breakdancer
biowulf$
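
Because the '-g' value applies to each command line, the total memory requested grows with the number of lines in the command file. A rough sketch of that arithmetic follows. It assumes that blank lines and lines starting with '#' are not counted as commands, and the helper names count_commands and estimate_total_gb are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count command lines in a swarm command file,
# skipping blank lines and lines whose first non-blank character is '#'.
count_commands() {
    grep -Ev '^[[:space:]]*(#|$)' "$1" | wc -l | tr -d ' '
}

# Hypothetical helper: commands x GB per command (the '-g' value).
# The result is only an estimate of the aggregate request.
estimate_total_gb() {
    local n
    n=$(count_commands "$1")
    echo $(( n * $2 ))
}

# Possible use before submitting:
#   estimate_total_gb /data/$USER/cmdfile 10
```

For 20 commands and '-g 10', the estimate is 200 GB in total across all commands.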

For more information regarding running swarm, see swarm.html

Submitting an interactive breakdancer job

DO NOT RUN BREAKDANCER JOBS ON THE BIOWULF LOGIN NODE

1. Allocate a node from the cluster, then run commands interactively on that node.

biowulf$
biowulf$ qsub -I -l nodes=1:g8
biowulf$

Or, if your job requires more memory:

biowulf$
biowulf$ qsub -I -l nodes=1:g24:c16
biowulf$

2. Once a cluster node is allocated and an interactive shell has started, run the commands interactively:

clusterNode$
clusterNode$ module load breakdancer
clusterNode$
clusterNode$ cd /data/$USER/breakdancer/run1
clusterNode$ bam2cfg.pl %BreakdancerOptions% bam_file1 bam_file2 > config_file.cfg
clusterNode$ breakdancer-max config_file.cfg > file.ctx 
clusterNode$
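
Once file.ctx exists, a quick tally of calls by variant type can serve as a sanity check. The sketch below assumes the usual BreakDancer output layout, in which header lines begin with '#' and the seventh whitespace-separated column holds the type code (for example DEL, INS, INV, ITX or CTX). Verify this against your own output before relying on it. The helper name count_sv_types is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: tally structural variant calls by type code.
# Assumes column 7 is the type and lines starting with '#' are headers.
count_sv_types() {
    awk '!/^#/ && NF >= 7 { n[$7]++ }
         END { for (t in n) print t, n[t] }' "$1" | sort
}

# Possible use on the node after breakdancer-max finishes:
#   count_sv_types file.ctx
```

The output is one line per type code followed by its count, sorted by type code.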

Documentation