Biowulf at the NIH
Magic on Biowulf

The Magic RNA-seq integrative analysis pipeline analyses next-generation sequencing data. It was developed by Danielle and Jean Thierry-Mieg at NCBI, NIH. Magic website.

The human reference genome index and annotations for Magic are in /fdb/magic/TARGET.human.2013_06_15. The paths and required environment variables for Magic can be set up by typing module load magic, as in the examples below.
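A batch script can fail in confusing ways if the module did not load. The sketch below is one way to guard against that. The helper name require_cmd is hypothetical and is not part of Magic or the module system, and it assumes that module load magic makes a command named MAGIC resolvable from the shell, as the examples on this page suggest.

```shell
#!/bin/bash
# Hypothetical helper (not part of Magic): succeed only if the named
# command can be resolved by the current shell.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: '$1' not found; did 'module load magic' succeed?" >&2
        return 1
    fi
}

# Intended use in a batch script, right after 'module load magic':
#   require_cmd MAGIC || exit 1
```

Stopping at this point costs nothing, whereas a later "command not found" in the middle of a job is harder to trace back to its cause.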

Submitting a single MAGIC job

At present, MAGIC is not set up to use multiple nodes. It will utilize all cores on a single allocated node by default.

Set up a batch script along the following lines:

#!/bin/bash

# cd to the desired directory
cd /data/$USER/magic

# set up the paths for MAGIC
module load magic                    

# set the environment variable MAGIC to be the name of the project
export MAGIC=TEST                    

# create a link to the human genome index
ln -s /fdb/magic/TARGET.human.2013_06_15/ ./TARGET    

# initialize MAGIC for DNA or RNA (DNA in this example)
MAGIC init DNA

# create a test dataset
MAGIC createGenomicTestSet

# run the alignment
MAGIC ALIGN

Submit this job with:

qsub -l nodes=1:c24 mybatchscript

This test job takes about 20 minutes and uses 7.5 GB of memory on a 24-core node.
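If the batch script is resubmitted from the same directory, the TARGET link from the previous run is still there, and a second ln -s will not replace it. A minimal sketch of a guard is shown here. The function name link_target is hypothetical, and the choice to keep an existing link rather than overwrite it is an assumption about what is wanted.

```shell
#!/bin/bash
# Hypothetical helper: create the symbolic link $2 pointing at $1,
# unless something named $2 already exists (including a dangling link).
link_target() {
    local src=$1 link=$2
    if [ -L "$link" ] || [ -e "$link" ]; then
        echo "keeping existing $link" >&2
        return 0
    fi
    ln -s "$src" "$link"
}

# Intended use in the batch script, in place of the plain ln -s line:
#   link_target /fdb/magic/TARGET.human.2013_06_15/ ./TARGET
```

Keeping the existing link is the conservative choice. To repoint a project at a different index, remove ./TARGET by hand first.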

Running a MAGIC job interactively

The following sample session runs a MAGIC job interactively on a single node; user input is the text after each shell prompt. The MAGIC processes will use all the cores on the node. The job runs in the local /scratch directory on the node.

[susanc@biowulf ~]$ qsub -I -l nodes=1:c24
qsub: waiting for job 5248384.biobos to start
qsub: job 5248384.biobos ready

[susanc@p2274 ~]$ cd /scratch

[susanc@p2274 scratch]$ clearscratch

[susanc@p2274 scratch]$ module load magic

[susanc@p2274 scratch]$ mkdir magic_test; cd magic_test

[susanc@p2274 magic_test]$ ln -s /fdb/magic/TARGET.human.2013_06_15/ ./TARGET

[susanc@p2274 magic_test]$ MAGIC init DNA
## Please edit scripts/submit and scripts/LIMITS for optional choices of configuration
/scratch/magic_test/MetaDB /scratch/magic_test 
// 2014-01-16_12:44:50 done: max memory 47 Mb
/scratch/magic_test 
/scratch/magic_test/VariantDB /scratch/magic_test 
// 2014-01-16_12:44:50 done: max memory 47 Mb
/scratch/magic_test 

[susanc@p2274 magic_test]$ export MAGIC=TEST

[susanc@p2274 magic_test]$ MAGIC createGenomicTestSet
MAGIC start Thu Jan 16 12:45:15 EST 2014
Project TEST
Standard configuration parameters were read in scripts/LIMITS and TARGET/LIMITS
User specified parameters were read in  ./LIMITS
Species=hs MOLECULE_TYPE=DNA  Strategy=Exome targets=DNASpikeIn SpikeIn mito genome gdecoy
phaseSet=createGenomicTestSet
Scan the meta-database // 2014-01-16_12:45:15 done: max memory 42 Mb
                   done  Project TEST
Construct the lists of Runs and Groups   Project TEST    done   0 runs, 0 groups of runs
phase createGenomicTestSet start Thu Jan 16 12:45:15 EST 2014
   done   0 runs, 0 groups of runs
Construct the lists of jobs               done   the runs are split in 0 jobs of at most 300 Mb or 5M reads
createTestSet:  6 times 800000 reads (forward, reverse, SOLiD) times (exact, SNP)
// start: 2014-01-16_12:45:16
Processed 1 fragments
Processed 1 sequences135524747 bp
Processed 1 reads135524747 bp
// done: 2014-01-16_12:46:31max memory 484 Mb
    createGenomicTestSet: Make a 2 test sets of x Million 50mers, covering 20 times a section of the genome, one exact one with with artificial SNPs at position 100,200,...
// start: 2014-01-16_12:46:32
[...etc...]
Processed 244736 reads12236800 bp
// done: 2014-01-16_12:46:53max memory 0 Mb

[susanc@p2274 magic_test]$ MAGIC ALIGN
MAGIC start Thu Jan 16 12:48:20 EST 2014
Project TEST
Standard configuration parameters were read in scripts/LIMITS and TARGET/LIMITS
User specified parameters were read in  ./LIMITS
Species=hs MOLECULE_TYPE=DNA  Strategy=Genome targets=SpikeIn mito genome gdecoy
phaseSet=  a0C wait a0D  a123 a0P wait c1 c2 wait c3 c4 c5 wait c6 
Scan the meta-database // 2014-01-16_12:48:20 done: max memory 42 Mb
                   done  Project TEST
Construct the lists of Runs and Groups   Project TEST    done   6 runs, 4 groups of runs
phase a0C start Thu Jan 16 12:48:20 EST 2014
   done   6 runs, 4 groups of runs
Construct the lists of jobs               done   the runs are split in 6 jobs of at most 300 Mb or 5M reads
phasea0C:count the sequences in each run Thu Jan 16 12:48:20 EST 2014
Using 24 processors
background submit: pgm=###bin/dna2dna -i Fastc/exact_ForwardStrand/f.1.fastc.gz -I fastc -gzi -O count -minEntropy 16  -minLength 24  -clipN 2   -o Fastc/exact_ForwardStrand/f.1###    stdout/err=###Fastc/exact_ForwardStrand/f.1.count.out/err###
[1] 24357
Using 24 processors
[...etc...]
phase c6 done : Thu Jan 16 13:05:36 EST 2014
Using 24 processors
MAGIC done 

[susanc@p2274 magic_test]$ exit
qsub: job 5248384.biobos completed
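The sample session above ends with the line "MAGIC done". When the same commands run unattended, a quick check for that line can tell a finished run from one that stopped early. The sketch below assumes the MAGIC output was captured to a file. The function name magic_finished and the file name magic.log are hypothetical, and the check relies only on the final line shown in the session above.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the last non-empty line of a captured
# MAGIC log starts with "MAGIC done", as in the sample session.
magic_finished() {
    local last
    # Drop blank lines, then keep only the final remaining line.
    last=$(grep -v '^[[:space:]]*$' "$1" | tail -n 1)
    case $last in
        "MAGIC done"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Intended use (magic.log is an assumed file name):
#   MAGIC ALIGN > magic.log 2>&1
#   magic_finished magic.log || echo "MAGIC did not report completion" >&2
```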

Documentation

Magic User Guide (Word doc).