Description
MOSAIK is a reference-guided assembler comprising four main modular programs:
- MosaikBuild
- MosaikAligner
- MosaikSort
- MosaikAssembler
MosaikBuild converts various sequence formats into the Mosaik native read format. MosaikAligner pairwise aligns each read to a specified series of reference sequences. MosaikSort resolves paired-end reads and sorts the alignments by the reference sequence coordinates. Finally, MosaikAssembler parses the sorted alignment archive and produces a multiple sequence alignment which is then saved into an assembly file format.
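As a rough sketch of how the four stages chain together, the function below prints (but does not run) one command per stage. The MosaikBuild and MosaikAligner options are a subset of those used in the batch example on this page. The -in, -out, and -ia options shown for MosaikSort and MosaikAssembler, the function name mosaik_plan, and all file names are assumptions for illustration and should be checked against the documentation of the installed version.

```shell
#!/bin/bash
# Dry-run sketch: print one command per MOSAIK stage for given file prefixes.
# mosaik_plan, the file names, and the MosaikSort/MosaikAssembler options
# are assumptions for illustration, not verified against a specific version.
mosaik_plan() {
    local reads="$1" ref="$2"
    # Stage 1: convert reads and reference into the Mosaik native format
    echo "MosaikBuild -fr ${reads}.fasta -fq ${reads}.fasta.qual -out ${reads}.dat"
    echo "MosaikBuild -fr ${ref}.fasta -oa ${ref}.dat"
    # Stage 2: align each read to the reference
    echo "MosaikAligner -in ${reads}.dat -out ${reads}_aligned.dat -ia ${ref}.dat -hs 15 -mm 4 -m all"
    # Stage 3: sort alignments by reference coordinates (assumed options)
    echo "MosaikSort -in ${reads}_aligned.dat -out ${reads}_sorted.dat"
    # Stage 4: build the multiple sequence alignment (assumed options)
    echo "MosaikAssembler -in ${reads}_sorted.dat -ia ${ref}.dat -out ${reads}_assembly"
}

mosaik_plan myreads myreference
```

Printing the commands first makes it easy to review file names before committing cluster time to the real run.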
The MOSAIK suite was written by Michael Strömberg of the Marth lab at Boston College.
How To Use
There are multiple versions of Mosaik available. An easy way of selecting the version is to use modules. To see the modules available, type
    module avail mosaik
To select a module, type
    module load mosaik/[ver]
where [ver] is the version of choice. This will set your $PATH variable.
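To confirm that loading the module really put the programs on $PATH, a quick check along the following lines may help. This is a minimal sketch: the helper name missing_tools is hypothetical, and it only tests that the four program names listed in the description resolve, not that they run correctly.

```shell
#!/bin/bash
# Print the name of every listed program that is not found on $PATH.
# missing_tools is a hypothetical helper, not part of the Mosaik suite.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools MosaikBuild MosaikAligner MosaikSort MosaikAssembler)
if [ -n "$missing" ]; then
    echo "Not on PATH:" $missing
else
    echo "All four Mosaik programs found"
fi
```

If any name is reported, the module was probably not loaded in the current shell.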
As an example, create a batch script to run the commands to align reads to a chromosome:
    #!/bin/bash
    #----- This file is Mosaik.bat -----#
    #PBS -mbe
    #PBS -N Mosaik
    #PBS -e Mosaik.err
    #PBS -o Mosaik.out

    cd $PBS_O_WORKDIR

    # Set the environment using mosaik module
    module load mosaik

    # Set the temporary directory
    export MOSAIK_TMP=/scratch

    # Build the Mosaik .dat file for reads
    MosaikBuild -fr myreads.fasta -fq myreads.fasta.qual -out myreads.dat

    # Build the Mosaik .dat file for the reference chromosome
    MosaikBuild -fr myreference.fasta -oa myreference.dat

    # Align the reads to the reference chromosome using 8 processors
    MosaikAligner -in myreads.dat -out myreads_aligned.dat -ia myreference.dat -hs 15 -mm 4 -m all -mhp 100 -act 20 -j myjumpdb -p 8

The -p option sets the number of CPUs to use during execution. This batch script uses -p 8, so it must be submitted to a node with at least 8 CPUs:
    qsub -l nodes=1:c16 Mosaik.bat
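One way to keep the -p value from exceeding what the node offers is to cap it inside the script. The following is a sketch under assumptions: cap_threads is a hypothetical helper, and it relies on nproc (GNU coreutils) to report the usable CPU count, falling back to 1 where nproc is unavailable.

```shell
#!/bin/bash
# Return the smaller of the requested and available CPU counts.
# cap_threads is a hypothetical helper for illustration.
cap_threads() {
    local wanted="$1" available="$2"
    if [ "$wanted" -gt "$available" ]; then
        echo "$available"
    else
        echo "$wanted"
    fi
}

# nproc reports the CPUs usable by this process; fall back to 1 if missing.
threads=$(cap_threads 8 "$(nproc 2>/dev/null || echo 1)")
echo "Would run MosaikAligner with -p $threads"
```

In the batch script, the resulting value could be passed as -p "$threads" in place of the fixed -p 8.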


