STAR on Biowulf
STAR aligns RNA-seq reads to a reference genome.
Its advantages include:
- 'Ab initio' detection of splice junctions: un-annotated, non-canonical, distal exons, chimeric ...
- Unique and multiple mappers
- Any read length, any number of splice junctions per read
- Any (reasonable) number of mismatches and indels
- Alignment scoring utilizing Illumina quality scores
- "Auto" trimming of poor quality ends
- Poly-A tail detection
- Very fast: 60 million human 75-mer reads per hour
STAR was developed by Alex Dobin. STAR website
To add the STAR executable to your path, type 'module load STAR' at the prompt or include that command in a batch script.
Submitting a single batch job
1. Create a script file. Sample batch script file
#!/bin/bash
# This file is starScript
#
#PBS -N star
#PBS -m be
#PBS -k oe
module load STAR
cd /data/user/mydir
STAR --runMode genomeGenerate --genomeDir /path/to/GenomeDir \
--genomeFastaFiles /path/to/genome/fasta1 /path/to/genome/fasta2 --runThreadN <n> …
2. Submit the script using the 'qsub' command on Biowulf, for example:
$ qsub -l nodes=1:g24:c16 ./starScript
This job will run on a g24 node (24 GB of memory). You may need to run a few test jobs to determine how much memory your job requires, and then choose a suitable node type.
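Before running test jobs, the memory needed for genome generation can be roughly estimated from the genome size. The sketch below assumes the rule of thumb from the STAR manual of about 10 bytes of RAM per genome base, and it uses the FASTA file size as an approximation of genome length. The function names are hypothetical and not part of STAR or Biowulf.

```shell
#!/bin/bash
# Rough memory estimate for STAR genome generation.
# Assumption: about 10 bytes of RAM per genome base (rule of thumb);
# real usage depends on the genome and on STAR parameters.

# Convert a genome size in bytes to an estimated RAM need in whole GiB.
star_mem_gb_from_bytes() {
    local bytes="$1"
    local gib=$(( 1024 * 1024 * 1024 ))
    # Multiply by 10, then round up to the next whole GiB.
    echo $(( (bytes * 10 + gib - 1) / gib ))
}

# Sum the sizes of the given FASTA files and estimate the RAM need.
estimate_star_mem_gb() {
    local total=0 size f
    for f in "$@"; do
        size=$(wc -c < "$f") || return 1
        total=$(( total + size ))
    done
    star_mem_gb_from_bytes "$total"
}

# Example call (paths are the placeholders used in the sample script):
# estimate_star_mem_gb /path/to/genome/fasta1 /path/to/genome/fasta2
```

Under this assumption, a genome of about 3 billion bases comes out at roughly 28 GB. Treat the figure only as a first guess; test jobs remain the reliable way to choose a node type.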
Running an interactive job
Users may sometimes need to run jobs interactively. Such jobs should not be run on the Biowulf login node.
Allocate an interactive node as shown below, and run the interactive job there.
biowulf% qsub -I -l nodes=1:g24:c16
qsub: waiting for job 2236960.biobos to start
qsub: job 2236960.biobos ready
[user@pxxx]$ cd YourDir
[user@pxxx]$ module load STAR
[user@pxxx]$ STAR --runMode genomeGenerate --genomeDir /path/to/GenomeDir \
--genomeFastaFiles /path/to/genome/fasta1 /path/to/genome/fasta2 --runThreadN <n> …
[user@pxxx]$ exit
qsub: job 2236960.biobos completed
[user@biowulf ~]$
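The examples above only build the genome index. Once the index exists, an alignment run could be assembled along the lines of the following sketch. The wrapper function, the DRY_RUN variable, the read file names, the output prefix, and the thread count of 16 are all hypothetical placeholders. The STAR options used are --runMode alignReads, --genomeDir, --readFilesIn, --runThreadN, and --outFileNamePrefix.

```shell
#!/bin/bash
# Sketch: run (or just print) a STAR alignment against an existing index.
# Arguments: genome directory, thread count, output prefix, then the
# read files (one for single-end data, two for paired-end data).
run_star_align() {
    local genome_dir="$1" threads="$2" prefix="$3"
    shift 3
    # When DRY_RUN is set and non-empty, the command is printed, not run.
    ${DRY_RUN:+echo} STAR --runMode alignReads \
        --genomeDir "$genome_dir" \
        --readFilesIn "$@" \
        --runThreadN "$threads" \
        --outFileNamePrefix "$prefix"
}

# Print the command for review without running STAR.
DRY_RUN=1 run_star_align /path/to/GenomeDir 16 sample1_ reads_1.fastq reads_2.fastq
```

Printing the command first makes it easy to check the paths before submitting a job. Leaving DRY_RUN unset runs STAR itself, which requires 'module load STAR' beforehand.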
Documentation


