Biowulf at the NIH
bedops on Biowulf

BEDOPS is a suite of tools that addresses common questions raised in genomic studies, mostly overlap and proximity relationships between data sets. It aims to be scalable and flexible, facilitating the efficient and accurate analysis and management of large-scale genomic data.
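As an illustration of the kind of operation BEDOPS performs, the sketch below mimics what `bedops --everything` computes: every element from every input, emitted in sorted order. This is a hypothetical Python rendering for explanation only; real bedops streams sorted BED files and is far more efficient.

```python
def everything(*bed_inputs):
    """Sketch of `bedops --everything`: merge all elements from all
    inputs into one sorted stream.

    Each input is a list of (chrom, start, end) tuples, standing in
    for the lines of a BED file.
    """
    merged = [rec for bed in bed_inputs for rec in bed]
    # Sort by chromosome lexicographically, then start and end numerically.
    merged.sort(key=lambda r: (r[0], r[1], r[2]))
    return merged

# Toy data in place of BEDFileA and BEDFileB:
a = [("chr1", 100, 200), ("chr2", 50, 150)]
b = [("chr1", 150, 300)]
print(everything(a, b))
# → [('chr1', 100, 200), ('chr1', 150, 300), ('chr2', 50, 150)]
```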

The environment variables need to be set properly first. The easiest way to do this is with the module commands, as in the example below.

[user@biowulf]$ module avail bedops
----------------------------- /usr/local/Modules/3.2.9/modulefiles --------------------------
bedops/2.0.0b

[user@biowulf]$ module load bedops

[user@biowulf]$ module list
Currently Loaded Modulefiles:
  1) bedops/2.0.0b

[user@biowulf]$ module unload bedops

[user@biowulf]$ module load bedops/2.0.0b

[user@biowulf]$ module list
Currently Loaded Modulefiles:
  1) bedops/2.0.0b

[user@biowulf]$ module show bedops
-------------------------------------------------------------------
/usr/local/Modules/3.2.9/modulefiles/bedops/2.0.0b:

module-whatis    Sets up bedops 2.0.0b 
prepend-path     PATH /usr/local/apps/bedops/2.0.0b 
-------------------------------------------------------------------

Sample Sessions On Biowulf

Submitting a single bedops batch job

1. Create a batch script file containing lines similar to those below. Modify the paths before running.

2. Example files can be copied from /usr/local/apps/bedops/examples/

#!/bin/bash
# This file is runbedops
#
#PBS -N bedops
#PBS -m be
#PBS -k oe
module load bedops
cd /data/$USER/bedops/run1
bedops --everything BEDFileA BEDFileB

3. Submit the script using the 'qsub' command on Biowulf.

$ qsub -l nodes=1:g8 /data/$USER/runbedops
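BEDOPS tools expect their BED inputs to be pre-sorted; the suite ships a sort-bed utility for this. As an illustration only, the snippet below shows the ordering sort-bed produces (chromosome lexicographically, then start and end numerically). In practice you would run sort-bed itself on BEDFileA and BEDFileB before the bedops call in the script above.

```python
def sort_bed_key(line):
    """Sort key matching sort-bed order: chromosome lexicographically,
    then start and end coordinates numerically (sketch for illustration)."""
    chrom, start, end = line.split()[:3]
    return (chrom, int(start), int(end))

# Toy BED lines in place of a real input file:
lines = ["chr2\t50\t150", "chr1\t150\t300", "chr1\t100\t200"]
for line in sorted(lines, key=sort_bed_key):
    print(line)
```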

 

Submitting a swarm of bedops jobs

Using the 'swarm' utility, one can submit many jobs to the cluster to run concurrently.

Set up a swarm command file (e.g. /data/$USER/cmdfile). Here is a sample file:

bedops --everything BEDFileA BEDFileB > /data/$USER/out1
bedops --everything BEDFileA BEDFileB > /data/$USER/out2
[.....] bedops --everything BEDFileA BEDFileB > /data/$USER/out30
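Writing thirty near-identical lines by hand is error-prone, so a small script can generate the command file instead. The helper below is a hypothetical sketch, not part of BEDOPS or swarm; the input file names and the /data path are placeholders to adapt to your own data.

```python
def write_cmdfile(path, n_jobs, user="user"):
    """Write a swarm command file with one bedops command per line.

    BEDFileA/BEDFileB and the output path are placeholders; substitute
    your own inputs and your Biowulf username.
    """
    with open(path, "w") as fh:
        for i in range(1, n_jobs + 1):
            fh.write(
                f"bedops --everything BEDFileA BEDFileB "
                f"> /data/{user}/out{i}\n"
            )

write_cmdfile("cmdfile", 30)
```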

The '-f' and '--module' options for swarm are required.

By default, each line of the command file above will be executed on one processor core of a node and will use 1 GB of memory. If this is not what you want, specify the '-g' flag when you submit the job on Biowulf.

For example, if each command needs 10 GB of memory instead of the default 1 GB, tell swarm by including the '-g 10' flag:

biowulf> $ swarm -g 10 -f cmdfile --module bedops

For more information regarding running swarm, see swarm.html

 

Submit an interactive bedops job

1. First allocate a node from the cluster, then run commands interactively on that node. DO NOT RUN ON THE BIOWULF LOGIN NODE:

$ qsub -I -l nodes=1:g8

or, if your job requires more memory,

$ qsub -I -l nodes=1:g24:c16

2. Once the job starts and a node is allocated, run commands interactively:

pXXX> $ cd /data/$USER/bedops
pXXX> $ module load bedops
pXXX> $ bedops --everything BEDFileA BEDFileB > /data/$USER/out

pXXX> $ exit

Documentation

https://bedops.readthedocs.org/