Biowulf at the NIH
PBAT on Biowulf

PBAT: Tools for the statistical analysis of family-based association studies (FBAT), developed by Christoph Lange.

PBAT is not a parallel program. Single PBAT jobs should be run interactively on the Biowulf interactive nodes or Helix. If you have multiple PBAT jobs to run, the swarm utility is recommended.

Below are sample sessions for the swarm and batch submission methods.


Swarm Sample Session:

The swarm program is a convenient way to submit large numbers of jobs all at once instead of manually submitting them one by one.

1. First, create a separate directory for each PBAT run, and put all the required input files for that run in its directory.

2. In each directory, create a script file containing the PBAT commands. Each line holds exactly what you would type at the prompt when running interactively. For example, in the /data/user/pbat/run1 directory, a file called 'script1' is created with the content shown below. Note that the blank lines in this file stand for presses of the 'Enter' key and are required, because PBAT prompts 'any key to clear screen' when run interactively:

cd /data/user/pbat/run1; /usr/local/bin/pbat <<EOF


1

1
1
1
0
213
1
1
2
1
175
0
1
1
3
1
175
0
1
1
2
2
220
0
1
1
3
2
220
0
1
3
test1
0
2
0
2
1
2
.1
1
.1
.05
0
3
0.01
0.1


4
1

4
2

4
3


4
3

-1
EOF
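
The role of the blank lines can be checked without running PBAT at all. The following is a minimal sketch, assuming a stand-in function named count_answers (hypothetical, not part of PBAT) that reads standard input the way a menu-driven program would and reports how many answers were blank and how many were not:

```shell
#!/bin/bash
# count_answers is a hypothetical stand-in for an interactive program.
# It reads answers from stdin one line at a time and counts blank lines
# (a bare 'Enter') separately from lines that carry a value.
count_answers() {
    local blank=0 filled=0 line
    while IFS= read -r line; do
        if [ -z "$line" ]; then
            blank=$((blank + 1))
        else
            filled=$((filled + 1))
        fi
    done
    echo "blank=$blank filled=$filled"
}

# The here-document delivers every line, including the empty ones,
# exactly as if it had been typed at the prompt.
result=$(count_answers <<EOF


1

1
-1
EOF
)
echo "$result"
```

Every empty line in the here-document arrives as one empty answer, which is why removing a blank line from the script file shifts all later answers to the wrong prompt. Writing the delimiter as 'EOF' (in quotes) stops the shell from expanding any $ characters in the answers.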

3. Now prepare the swarm command file. For example, a swarm command file called 'cmdfile' contains the following lines:

/data/user/pbat/run1/script1
/data/user/pbat/run2/script2
/data/user/pbat/run3/script3
/data/user/pbat/run4/script4
.....
....
/data/user/pbat/runN/scriptN
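
For many runs, the command file can be generated instead of typed by hand. The following is a minimal sketch, assuming the runN/scriptN naming shown above and a run count of 4 (the variable names base, n and cmdfile are illustrative):

```shell
#!/bin/bash
# Assumed layout: $base/run1/script1 ... $base/runN/scriptN
base=/data/user/pbat
n=4                # number of PBAT runs (assumption for this sketch)
cmdfile=cmdfile    # swarm command file to write

: > "$cmdfile"     # start with an empty file
i=1
while [ "$i" -le "$n" ]; do
    # One line per run: the full path of that run's script file.
    echo "$base/run$i/script$i" >> "$cmdfile"
    i=$((i + 1))
done
```

Because each line of the command file is run as a command, this sketch assumes that every script file has execute permission (for example, set with chmod +x).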

4. Submit the swarm job. One swarm flag, '-f', is required; two others, '-t' and '-g', are the ones you will most likely need to specify:

-f: the name of the swarm command file created above (required)
-t: number of processors per node to use for each command line in the swarm file (optional)
-g: GB of memory needed for each command line in the swarm file (optional)

By default, each command line is executed on one processor core of a node and uses 1 GB of memory. If this is not what you want, specify the '-t' and '-g' flags when you submit the job on Biowulf.

For example, if each command line needs 10 GB of memory instead of the default 1 GB, tell swarm by including the '-g 10' flag:

biowulf> swarm -g 10 -f cmdfile

For more information regarding running swarm, see swarm.html
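
Before submitting, it can help to confirm that every entry in the command file points at a script that exists and is executable. The following is a minimal sketch, assuming each line of the command file is a bare script path as in the example above. The function name check_cmdfile is hypothetical, and the swarm command is only printed, not run:

```shell
#!/bin/bash
# check_cmdfile FILE: hypothetical helper that reports every non-empty
# line of FILE that is not an executable regular file.
# Returns 0 only when all entries are usable.
check_cmdfile() {
    local file="$1" bad=0 line
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue      # skip blank lines
        if [ ! -f "$line" ] || [ ! -x "$line" ]; then
            echo "missing or not executable: $line" >&2
            bad=$((bad + 1))
        fi
    done < "$file"
    [ "$bad" -eq 0 ]
}

# Dry run: print the submit command only when the check passes.
if [ -f cmdfile ] && check_cmdfile cmdfile; then
    echo "swarm -g 10 -f cmdfile"
fi
```

A command file whose lines carry arguments or several commands would need a different check, since this one treats each whole line as a file path.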

Batch Sample Session:

Single PBAT jobs would typically be submitted only for debugging purposes.

1. Create a script file containing the PBAT commands, as above:

--- file pbat.script ----------
#!/bin/bash
#PBS -m be
#
cd /data/user/pbat/run1
/usr/local/bin/pbat <<EOF


1

1
1
1
0
213
1
1
2
1
175
0
1
1
3
1
175
0
1
1
2
2
220
0
1
1
3
2
220
0
1
3
test1
0
2
0
2
1
2
.1
1
.1
.05
0
3
0.01
0.1


4
1

4
2

4
3


4
3

-1
EOF

2. Now submit the script using the 'qsub' command, for example:

qsub -l nodes=1 /data/user/pbat/run1/pbat.script