Biowulf at the NIH
miRDeep2 on Biowulf

miRDeep2 is a completely overhauled tool which discovers microRNA genes by analyzing sequenced RNAs. The tool reports known and hundreds of novel microRNAs with high accuracy in seven species representing the major animal clades. The low consumption of time and memory combined with user-friendly interactive graphic output makes miRDeep2 accessible for straightforward application in current research.

Programs location

/usr/local/apps/mirdeep2/current

 

The environment variables must be set before running miRDeep2. The easiest way to do this is with the module commands, as in the example below.

$ module avail mirdeep2
----------------------------- /usr/local/Modules/3.2.9/modulefiles --------------------------
mirdeep2/2.0.0.5

$ module load mirdeep2

$ module list
Currently Loaded Modulefiles:
  1) mirdeep2/2.0.0.5

$ module unload mirdeep2

$ module load mirdeep2/2.0.0.5

$ module show mirdeep2
-------------------------------------------------------------------
/usr/local/Modules/3.2.9/modulefiles/mirdeep2/2.0.0.5:

module-whatis    Sets up mirdeep 2.0.0.5
prepend-path     PATH /usr/local/apps/mirdeep2/2.0.0.5
prepend-path     PATH /usr/local/apps/bowtie/0.12.9
prepend-path     PATH /usr/local/apps/viennarna/current/bin
prepend-path     PATH /usr/local/randfold-2.0/bin
-------------------------------------------------------------------
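
Because the module works by prepending directories to PATH, one quick way to confirm that it loaded correctly is to check that the programs used later on this page can be found. The following is only a sketch: check_tools is a hypothetical helper, not part of miRDeep2 or the modules system, and it assumes the program names used in the examples on this page.

```shell
# Hypothetical helper (not part of miRDeep2 or the modules system):
# report which of the named programs cannot be found on the PATH.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    # Return non-zero if at least one program was not found.
    return "$missing"
}

# Example (run after 'module load mirdeep2'):
#   check_tools mapper.pl quantifier.pl miRDeep2.pl bowtie-build
```

A non-zero return status means at least one program was not found, which usually indicates that the module is not loaded in the current shell.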

Sample Sessions On Biowulf

The following example can be copied from /usr/local/apps/mirdeep2/2.0.0.5/TUTORIAL. Then follow the instructions in the 'TUTORIAL' file in this directory.

 

Submitting a single mirdeep2 batch job

1. Create a script file along the lines of the one below.

#!/bin/bash
# This file is runFile
#
#PBS -N mirdeep2
#PBS -m be
#PBS -k oe

module load mirdeep2
cd /data/$USER/mirdeep2/run1
bowtie-build cel_cluster.fa cel_cluster
mapper.pl reads.fa -c -j -k TCGTATGCCGTCTTCTGCTTGT \
  -l 18 -m -p cel_cluster -s reads_collapsed.fa \
  -t reads_collapsed_vs_genome.arf -v
quantifier.pl -p precursors_ref_this_species.fa \
  -m mature_ref_this_species.fa \
  -r reads_collapsed.fa -t cel -y 16_19
miRDeep2.pl reads_collapsed.fa cel_cluster.fa reads_collapsed_vs_genome.arf \
  mature_ref_this_species.fa mature_ref_other_species.fa \
  precursors_ref_this_species.fa -t C.elegans 2> report.log

2. Submit the script using the 'qsub' command on Biowulf.

$ qsub -l nodes=1:g8 /data/$USER/runFile
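
A batch job that fails because an input file is missing only reports the problem after it has waited in the queue, so it can help to check the inputs before submitting. The following is a minimal sketch under that assumption: check_inputs is a hypothetical helper, and the file names in the usage comment are simply those used in the script above.

```shell
# Hypothetical helper: confirm that every named input file exists and is
# non-empty in the given run directory before the job is submitted.
# $1 = run directory, remaining arguments = file names inside it.
check_inputs() {
    local dir=$1
    shift
    local status=0
    local f
    for f in "$@"; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            status=1
        fi
    done
    return "$status"
}

# Example, using the file names from the script above:
#   check_inputs "/data/$USER/mirdeep2/run1" reads.fa cel_cluster.fa \
#       mature_ref_this_species.fa mature_ref_other_species.fa \
#       precursors_ref_this_species.fa \
#     && qsub -l nodes=1:g8 "/data/$USER/runFile"
```

In the usage shown in the comment, qsub is only called when every input file is present.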

 

Submitting a swarm of mirdeep2 jobs

Using the 'swarm' utility, one can submit many jobs to the cluster to run concurrently.

Set up a swarm command file (e.g. /data/$USER/cmdfile). Here is a sample file:

cd /data/$USER/Dir1; mapper.pl .... ; quantifier.pl .....; miRDeep2.pl ...... 
cd /data/$USER/Dir2; mapper.pl .... ; quantifier.pl .....; miRDeep2.pl ...... 
[.....]
cd /data/$USER/Dir30; mapper.pl .... ; quantifier.pl .....; miRDeep2.pl ...... 
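
With many run directories, a command file like the sample above can be generated instead of typed by hand. The following is only a sketch: make_swarm_file is a hypothetical helper, and the command string passed to it is a placeholder that must be replaced with the real mapper.pl, quantifier.pl and miRDeep2.pl options for your data.

```shell
# Hypothetical helper: write one swarm command line per run directory.
# $1 = output command file, $2 = command string to run in each directory,
# remaining arguments = run directories.
make_swarm_file() {
    local outfile=$1
    local cmds=$2
    shift 2
    : > "$outfile"                      # start with an empty file
    local dir
    for dir in "$@"; do
        printf 'cd %s; %s\n' "$dir" "$cmds" >> "$outfile"
    done
}

# Example: the elided arguments ('....') are left as in the sample above
# and must be replaced with the real options for your data.
#   make_swarm_file "/data/$USER/cmdfile" \
#       'mapper.pl .... ; quantifier.pl ..... ; miRDeep2.pl ......' \
#       "/data/$USER/Dir1" "/data/$USER/Dir2" "/data/$USER/Dir30"
```

Each directory produces exactly one line, which matches the one-command-line-per-job layout of the sample file.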

The '-f' and '--module' options for swarm are required.

By default, each line of the command file is executed on one processor core of a node and uses 1 GB of memory. If this is not what you want, specify the '-g' flag when you submit the job on Biowulf. For example, if each command line above needs 10 GB of memory instead of the default 1 GB, tell swarm by including the '-g 10' flag:

$ swarm -g 10 -f /data/$USER/cmdfile --module mirdeep2

For more information regarding running swarm, see swarm.html

 

Submit an interactive mirdeep2 job

1. First allocate a node from the cluster, then run commands interactively on that node. DO NOT RUN ON THE BIOWULF LOGIN NODE:

$ qsub -I -l nodes=1:g8

or, if your job requires more memory,

$ qsub -I -l nodes=1:g24:c16

2. Once the job has started and a node is allocated, run the commands interactively.

$ module load mirdeep2
$ cd /data/$USER/mirdeep2/run1
$ bowtie-build cel_cluster.fa cel_cluster
$ mapper.pl reads.fa -c -j -k TCGTATGCCGTCTTCTGCTTGT \
  -l 18 -m -p cel_cluster -s reads_collapsed.fa -t reads_collapsed_vs_genome.arf -v
$ quantifier.pl -p precursors_ref_this_species.fa -m mature_ref_this_species.fa \
  -r reads_collapsed.fa -t cel -y 16_19
$ miRDeep2.pl reads_collapsed.fa cel_cluster.fa reads_collapsed_vs_genome.arf \
  mature_ref_this_species.fa mature_ref_other_species.fa \
  precursors_ref_this_species.fa -t C.elegans 2> report.log

 

Documentation

http://www.mdc-berlin.de/en/research/research_teams/systems_biology_of_gene_regulatory_elements/projects/miRDeep/documentation.html