FSL on Biowulf

FSL is a comprehensive library of image analysis and statistical tools for FMRI, MRI and DTI brain imaging data. FSL is written mainly by members of the Analysis Group, FMRIB, Oxford, UK. FSL website.

On the Biowulf cluster, FSL is installed in /usr/local/fsl. Regular FSL users should set up the environment variables in their startup files.
csh users should add the following to their .cshrc file:

setenv FSLDIR /usr/local/fsl
source $FSLDIR/etc/fslconf/fsl.csh
setenv PATH $FSLDIR/bin:$PATH
Bash users should add the following to their .bash_profile file:

export FSLDIR=/usr/local/fsl
. $FSLDIR/etc/fslconf/fsl.sh
export PATH=$FSLDIR/bin:$PATH
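
After editing the startup file and logging in again, it can be useful to confirm that the settings took effect. The following is a minimal sketch, not part of FSL: check_fsl_env is a hypothetical helper that only tests whether the configuration file named in the setup lines above exists under the given directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of FSL): report whether an FSL
# installation directory looks usable. It only checks for the
# fsl.sh configuration file referenced in the setup lines above.
check_fsl_env() {
    fsl_dir=$1
    if [ -z "$fsl_dir" ]; then
        echo "FSLDIR is not set"
        return 1
    fi
    if [ ! -f "$fsl_dir/etc/fslconf/fsl.sh" ]; then
        echo "no fsl.sh under $fsl_dir/etc/fslconf"
        return 1
    fi
    echo "FSLDIR looks OK: $fsl_dir"
}

# Check the current environment; never abort the login shell on failure.
check_fsl_env "${FSLDIR:-}" || true
```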

Setting up an FSL job

FSL is typically used on Biowulf to process many images. This is most easily done via the swarm utility. Below is a sample swarm command file which runs mcflirt and betfunc in succession on each image.

# this file is called myswarmfile
mcflirt -in /data/user/fmri1 -out mcf1 -mats -plots -refvol 90 -rmsrel -rmsabs; betfunc mcf1 bet1
mcflirt -in /data/user/fmri2 -out mcf2 -mats -plots -refvol 90 -rmsrel -rmsabs; betfunc mcf2 bet2
mcflirt -in /data/user/fmri3 -out mcf3 -mats -plots -refvol 90 -rmsrel -rmsabs; betfunc mcf3 bet3
...
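
For a large number of images, the command file can be generated rather than typed by hand. The sketch below assumes the inputs are named /data/user/fmri1, /data/user/fmri2, and so on, as in the sample above; make_swarm_lines is a hypothetical helper written for this illustration, not part of swarm or FSL.

```shell
#!/bin/sh
# Hypothetical helper: print one swarm command line per image number,
# from the first number ($1) through the last number ($2).
# Assumes the input naming scheme used in the sample command file.
make_swarm_lines() {
    sw_i=$1
    sw_last=$2
    while [ "$sw_i" -le "$sw_last" ]; do
        echo "mcflirt -in /data/user/fmri${sw_i} -out mcf${sw_i} -mats -plots -refvol 90 -rmsrel -rmsabs; betfunc mcf${sw_i} bet${sw_i}"
        sw_i=$((sw_i + 1))
    done
}

# Write the command file for images 1 through 3.
make_swarm_lines 1 3 > myswarmfile
```

Changing the two arguments adjusts the range of image numbers covered.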

This file would be submitted as follows:

swarm -f myswarmfile
Note that by default, two or more lines in the file above will be processed by each node. The Biowulf nodes have a minimum of 1GB RAM per processor. If the individual FSL programs in the swarm command file require more than 1GB memory, it may be necessary to specify a node type or to ensure that only one line is processed by each node. If you need help with optimizing your FSL swarm jobs, contact the Biowulf staff (staff@biowulf.nih.gov).

Parallel Bedpost

Important - 30 Aug 2007: Note that in v4.0.0 (the current default version on the Biowulf cluster), bedpost parallelization has changed. To run bedpost in parallel and use the new crossing fibre tractography, you should specify bedpostX in your command file below.

The FSL bedpost program can run for many hours, so it is often worthwhile to run in parallel.

To run bedpost in parallel, the file fsl.sh must be copied into the user's home directory. This is a one-time operation.

mkdir ~/.fslconf
cp /usr/local/fsl/fsl.sh ~/.fslconf/

Other required environment variables (e.g. FSLREMOTECALL) have been set in the FSL system startup files, so users do not need to set any other variables.

Below is a sample script to run bedpost on multiple nodes, ensuring that both processors of each node will be utilized.

#!/bin/bash
#PBS -m be
#PBS -N bedpost

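# set environment variables
# the node file is listed twice so that each node appears twice in the
# machine list, once for each of its two processors
export FSLMACHINELIST=`cat $PBS_NODEFILE $PBS_NODEFILE`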

#run parallel bedpost
/usr/local/fsl/bin/bedpostX /data/user/myfsldir/subject22

This batch script would be submitted with, for example,

qsub -l nodes=4:x86_64 myscript
Since each node has 2 processors, the job would run on 8 processors.
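
To see what the FSLMACHINELIST line in the batch script produces, the same construction can be tried outside the batch system on a mock node file. This is an illustrative sketch only: the host names p1 and p2 are made up, and it assumes the node file lists each allocated node once, as the doubling in the script implies.

```shell
#!/bin/sh
# Mock of the node file that the batch system provides via $PBS_NODEFILE.
# The host names p1 and p2 are invented for this illustration.
nodefile=$(mktemp)
printf 'p1\np2\n' > "$nodefile"

# Same construction as in the batch script: reading the file twice
# yields two entries per node, one for each processor.
machinelist=`cat "$nodefile" "$nodefile"`

# Unquoted expansion joins the entries onto one line.
echo $machinelist
rm -f "$nodefile"
```

With two nodes in the mock file, the list contains four entries, which matches the two-processors-per-node arithmetic above.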

Benchmarks

The FSL example suite was run on several types of nodes in the Biowulf cluster.

Node type   Processor         RAM    User CPU time   Wallclock time
o2800       2.8 GHz Opteron   4 GB   2703 s          2835 s
o2200       2.2 GHz Opteron   2 GB   3397 s          3509 s
o2000       2.0 GHz Opteron   2 GB   3837 s          3984 s
p2800       2.8 GHz Xeon      4 GB   4287 s          4440 s

Bedpost parallelizes by distributing the independent data slices to multiple nodes. Thus, it is "embarrassingly parallel", and a parallel bedpostX job submitted to 16 processors will run approximately 16 times faster than on a single processor.

