Biowulf at the NIH
Fastlink & FastSLINK & SUP on Biowulf

Genetic linkage analysis is a statistical technique used to map genes and find the approximate location of disease genes. The LINKAGE program suite was an earlier package for genetic linkage analysis. FASTLINK is a significantly modified and improved version of the main programs of LINKAGE that runs much faster sequentially, allows the user to recover gracefully from a crash, and provides new and improved documentation.

The SLINK package contains a simulation program, slink, and the backend programs unknown, msim, lsim and isim. slink generates random replicates, and msim, lsim and isim analyze the replicates.

SUP is an extension to SLINK that allows a larger number of marker loci to be simulated in pedigrees, conditional on trait values and in linkage equilibrium or disequilibrium with a trait locus. This version of SUP is meant to be used only with version 3.0 or above of FastSLINK (released in March 2010). With version 3.0 of FastSLINK, there is no upper limit to the number of founders a pedigree may have. Also, by using 'slink -t' instead of 'slink', the trait locus genotypes are now readily available in a new output file, which simplifies the whole simulation setup.

The FastSLINK program, by Alejandro Schaffer and Dan Weeks, merges code from FASTLINK version 2.x into the slink program.

FASTLINK homepage at NCBI

Fastlink, FastSLINK and SUP on Biowulf are not parallelized. The advantage of running them on Biowulf is the ability to run 'swarms' of single-threaded jobs, since each job is sent to a separate Biowulf node.

 

The environment variables must be set before running the programs. The easiest way to do this is with the module commands, as in the example below.

There are fastlink, fastslink and sup modules. Users can load them individually or in combination. The program 'unknown' exists in both fastlink and fastslink, and the two versions may differ. The most recently loaded module takes precedence over earlier ones. For example, the 'unknown' from fastslink will be used if the user first runs 'module load fastlink' and then runs 'module load fastslink'.

$ module avail fastlink
-------------------- /usr/local/Modules/3.2.9/modulefiles ----------------------
fastlink/current

$ module load fastlink

$ module list
Currently Loaded Modulefiles:
1) fastlink/current

$ module unload fastlink

$ module load fastlink/current

$ module show fastlink
-------------------------------------------------------------------
/usr/local/Modules/3.2.9/modulefiles/fastlink/current:

module-whatis    Sets up fastlink
prepend-path     PATH /usr/local/apps/fastlink/
-----------------------------------------------------------------
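The precedence rule for 'unknown' follows from each module prepending its directory to PATH. The sketch below imitates that with two temporary directories and stub scripts instead of the real modules. The directory names and stub programs are hypothetical stand-ins, and it assumes the fastslink module also uses prepend-path, which the output above only confirms for fastlink.

```shell
#!/bin/bash
# Sketch only: imitate two 'module load' calls by prepending to PATH.
# The directories and stub 'unknown' scripts are hypothetical stand-ins.
demo=$(mktemp -d)
mkdir -p "$demo/fastlink" "$demo/fastslink"
printf '#!/bin/sh\necho "unknown from fastlink"\n'  > "$demo/fastlink/unknown"
printf '#!/bin/sh\necho "unknown from fastslink"\n' > "$demo/fastslink/unknown"
chmod +x "$demo/fastlink/unknown" "$demo/fastslink/unknown"

PATH="$demo/fastlink:$PATH"     # stands in for 'module load fastlink'
PATH="$demo/fastslink:$PATH"    # stands in for 'module load fastslink'
hash -r                         # drop any cached command locations

# The directory prepended last is searched first.
resolved=$(command -v unknown)
echo "$resolved"
unknown
```

On Biowulf itself, running 'type -a unknown' in bash after loading both modules should list every copy found on PATH, with the one that will actually run shown first.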

Submitting a batch job

Set up a batch file along the following lines:

#!/bin/bash
# This file is fastlinkscript
#
#PBS -N fastlink
#PBS -m be
#PBS -k oe
module load fastlink
cd /data/$USER
mlink .......
Save the file. On Biowulf, submit it with:
$ qsub -l nodes=1 /data/$USER/fastlinkscript
This qsub command will run your job on the first available node. If your job requires a certain amount of memory, specify a node property like this:
$ qsub -l nodes=1:g4 fastlinkscript
or
$ qsub -l nodes=1:g8 fastlinkscript

Submitting a swarm of jobs

Set up a swarm command file along the following lines:

module load fastslink; cd /home/$USER/run1; slink; cp datafile.msi datafile.dat; unknown; msim
module load fastslink; cd /home/$USER/run2; slink; cp datafile.msi datafile.dat; unknown; msim       
module load fastslink; cd /home/$USER/run3; slink; cp datafile.msi datafile.dat; unknown; msim       
[.....]

-f: the name of the swarm command file above (required)
-g: GB of memory needed for each command line in the swarm file above (optional)

By default, each line of the command file is executed on one processor core of a node and uses 1 GB of memory. If this is not what you want, specify the '-g' flag when you submit the job on Biowulf.

For example, if each line of the command file needs 10 GB of memory instead of the default 1 GB, tell swarm by including the '-g 10' flag:

$ swarm -g 10 -f cmdfile

For more information regarding running swarm, see swarm.html
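When there are many run directories, the swarm command file can be generated with a loop rather than typed by hand. The following is a minimal sketch: the variable names NRUNS and CMDFILE and the run1..runN directory layout are illustrative assumptions, and the command line is copied from the example above.

```shell
#!/bin/bash
# Sketch only: write one swarm command line per run directory.
# NRUNS, CMDFILE and the run1..runN layout are illustrative assumptions.
NRUNS=${NRUNS:-3}
CMDFILE=${CMDFILE:-cmdfile}

: > "$CMDFILE"    # start from an empty command file
for i in $(seq 1 "$NRUNS"); do
    # Single quotes keep $USER literal, so it is expanded when the job runs.
    printf 'module load fastslink; cd /home/$USER/run%s; slink; cp datafile.msi datafile.dat; unknown; msim\n' "$i" >> "$CMDFILE"
done

# Report how many command lines were written.
wc -l < "$CMDFILE"
```

The resulting file can then be passed to swarm with the -f flag as described above.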

 

Documentation

http://www.ncbi.nlm.nih.gov/CBBresearch/Schaffer/fastlink.html

[Paper 1] [Paper 2] [Paper 3] [Paper 4] [Loops in Fastlink] [Pedigree traversal] [UNKNOWN docs] (PDF)

http://watson.hgen.pitt.edu/docs/SLink.html

/usr/local/apps/fastslink/sup-fastslink-tutorial/