Biowulf at the NIH
cutadapt on Biowulf

cutadapt removes adapter sequences from high-throughput sequencing data. Trimming is usually necessary when the read length of the sequencing machine is longer than the molecule being sequenced, for example when sequencing microRNAs, because the read then runs past the end of the molecule and into the adapter.
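
As a rough illustration of what trimming a 3' adapter means, the sketch below cuts a read at the first exact occurrence of the adapter and drops everything after it. This is a simplification, not how cutadapt is implemented: the function name trim_3prime is made up for this example, and cutadapt itself also tolerates mismatches and partial adapter matches, which this sketch does not.

```shell
# Hypothetical helper, for illustration only (exact matching, no error tolerance).
# $1 = read sequence, $2 = adapter sequence
trim_3prime() {
    # Remove the longest suffix that starts with the adapter,
    # i.e. cut at the first occurrence of the adapter in the read.
    printf '%s\n' "${1%%"$2"*}"
}

# The read below is an 8-base insert, then the adapter, then 4 extra bases.
trim_3prime ACGTACGTAACCGGTTGGCA AACCGGTT
```

A read that does not contain the adapter is printed unchanged.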

cutadapt was developed by Marcel Martin. See the cutadapt website for details.

Running a cutadapt batch job

Set up a batch script along the following lines:

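#!/bin/bash
# this file is called job.bat

# change to the data directory; stop if it does not exist
cd /data/user/mydir || exit 1

# make the cutadapt executable available
module load cutadapt

# trim the 3' adapter AACCGGTT from the reads in input.fastq
cutadapt -a AACCGGTT input.fastq > output.fastq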

Submit this job with:

qsub -l nodes=1 job.bat

Running a swarm of cutadapt jobs

Set up a swarm command file along the following lines:

cd /data/user/mydir; cutadapt -a AACCGGTT input1.fastq > output1.fastq
cd /data/user/mydir; cutadapt -a AACCGGTT input2.fastq > output2.fastq
cd /data/user/mydir; cutadapt -a AACCGGTT input3.fastq > output3.fastq
[...]

Submit this swarm with:

swarm -f swarm.cmd --module cutadapt
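
When there are many input files, the command file does not have to be written by hand. The following is a minimal sketch, assuming the inputs are named input*.fastq as in the example above. The function name make_swarm is hypothetical and is not part of swarm or cutadapt.

```shell
# Hypothetical helper: write one cutadapt command per input file.
# $1 = data directory, $2 = adapter sequence, $3 = command file to write
make_swarm() (
    dir=$1
    adapter=$2
    cmdfile=$3

    : > "$cmdfile"                      # start with an empty command file
    for fq in "$dir"/input*.fastq; do
        [ -e "$fq" ] || continue        # glob matched nothing
        in=$(basename "$fq")
        out="output${in#input}"         # input1.fastq -> output1.fastq
        printf 'cd %s; cutadapt -a %s %s > %s\n' \
            "$dir" "$adapter" "$in" "$out" >> "$cmdfile"
    done
)
```

Calling make_swarm /data/user/mydir AACCGGTT swarm.cmd would write one line per matching file, in the same form as the command file shown above.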

Running an interactive cutadapt job

Most programs on the Biowulf cluster should be run as batch or swarm jobs, but users may occasionally want to run an interactive job for debugging purposes. Allocate an interactive node and run cutadapt there. Example:

biowulf% qsub -I -l nodes=1
qsub: waiting for job 2236960.biobos to start
qsub: job 2236960.biobos ready

p24% module load cutadapt
p24% cd /data/user/mydir
p24% cutadapt -a AACCGGTT input.fastq > output.fastq
p24% exit

biowulf%
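
While debugging interactively, a quick sanity check is to count the records in the input and output files. The sketch below assumes the common four-lines-per-record FASTQ layout (no wrapped sequence lines); count_reads is a made-up helper name. With only -a given and no filtering options, cutadapt would normally be expected to write every read it processes, so the two counts should match.

```shell
# Hypothetical helper: count FASTQ records, assuming 4 lines per record.
# $1 = FASTQ file
count_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}
```

For example, count_reads input.fastq and count_reads output.fastq can be run one after the other on the interactive node and the two numbers compared.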

Documentation

cutadapt documentation