Biowulf at the NIH
Swarm on Biowulf

Swarm is a script designed to simplify submitting a group of commands to the Biowulf cluster. Some programs do not scale well or can't use distributed memory. Other programs may be 'embarrassingly parallel', in that many independent jobs need to be run. These programs are well suited to running 'swarms of jobs'. The swarm script simplifies these computational problems.

Swarm reads a list of commands from a swarm command file (termed the "swarmfile"), then automatically submits those commands to the PBS batch system for execution. Each command runs as a separate process on a node. Swarm runs one process per core, making optimum use of each node (thus a node with 8 cores will run eight processes in parallel).

Commands in the swarmfile should appear just as they would be entered on a command line. STDOUT (or STDERR) output that isn't explicitly directed elsewhere will be sent to a file named swarm#nPID.o (or .e) in your current working directory. A line where the first non-whitespace character is "#" is considered a comment and is ignored, unless either the --no-comment or --comment-char options are given.

For example, create a file that looks something like this:

# My first swarmfile -- this file is file.swarm
uptime > file1 ; echo 'file1 done'
uptime > file2 ; echo 'file2 done'
uptime > file3 ; echo 'file3 done'
uptime > file4 ; echo 'file4 done'

Then submit to the batch system:

[biowulf]$ swarm -f file.swarm
987654321.biobos

This will result in the commands being run independently on 4 cpus as a single job (987654321.biobos) in the same directory from which the swarm was submitted. When finished, the STDOUT and STDERR files (in this example sw1n837.o and sw1n837.e) will be written, along with any other files created by the jobs:

[biowulf]$ ls
file1  file2  file3  file4  file.swarm  sw1n837.e  sw1n837.o

Swarmfiles can be extremely long, with hundreds of thousands of lines. In that case, bundling the swarm is recommended (and may be required). Bundling means that multiple commands are run serially, one after the other, on the same cpu. Bundling is enabled with the -b option. In the following example, up to 10 commands will be run sequentially per cpu:

[biowulf]$ swarm -f very_large.swarm -b 10

A swarmfile may contain commands that are extremely long and complex, spanning many lines. To keep such swarmfiles readable, swarm honors line continuation markers (a space, followed by a backslash '\', followed immediately by a newline character) and automatically joins lines written in this fashion. Users may want to include line continuation markers in their swarmfiles for ease of editing. For example, both of the swarmfiles below produce the same result. However, this one is easy to edit:

[biowulf]$ cat swarmfile_clean
# align read pair 001
tophat \
 -p 32 \
 -N 2 \
 -a 8 \
 -m 0 \
 -i 50 \
 -I 500000 \
 -g 20 \
 -r 50 \
 --no-coverage-search \
 --read-gap-length 2 \
 --read-edit-dist 2 \
 --max-insertion-length 3 \
 --max-deletion-length 3 \
 --min-coverage-intron 50 \
 --max-coverage-intron 20000 \
 --mate-std-dev 20 \
 --segment-mismatches 2 \
 --segment-length 25 \
 --min-segment-intron 50 \
 --max-segment-intron 500000 \
 --b2-N 0 \
 --b2-L 20 \
 --b2-D 15 \
 --b2-R 2 \
 --b2-i S,1,1.25 \
 --b2-n-ceil L,0,0.15 \
 --b2-gbar 4 \
 --b2-mp 6,2 \
 --b2-np 1 \
 --b2-rdg 5,3 \
 --b2-rfg 5,3 \
 --b2-score-min L,-0.6,-0.6 \
 --fusion-anchor-length 20 \
 --fusion-min-dist 10000000 \
 --fusion-read-mismatches 2 \
 --fusion-multireads 2 \
 --fusion-multipairs 2 \
 --rg-id 1 \
 --rg-sample SM:BOGUS_R1_001 \
 -o /data/user/BOGUS_001 \
 /fdb/bowtie2mm10 BOGUS_READ_R1_001.fastq BOGUS_READ_R2_001.fastq

while this is not:

[biowulf]$ cat swarmfile_messy
tophat --no-coverage-search -p 32 -N 2 --read-gap-length 2 --read-edit-dist 2 -a 8 -m 0 -i 50 -I 500000 -g 20 --max-insertion-length 3 --max-deletion-length 3 --min-coverage-intron 50 --max-coverage-intron 20000 -r 50 --mate-std-dev 20 --segment-mismatches 2 --segment-length 25 --min-segment-intron 50 --max-segment-intron 500000 --b2-N 0 --b2-L 20 --b2-i S,1,1.25 --b2-n-ceil L,0,0.15 --b2-gbar 4 --b2-mp 6,2 --b2-np 1 --b2-rdg 5,3 --b2-rfg 5,3 --b2-score-min L,-0.6,-0.6 --b2-D 15 --b2-R 2 --fusion-anchor-length 20 --fusion-min-dist 10000000 --fusion-read-mismatches 2 --fusion-multireads 2 --fusion-multipairs 2 --rg-id 1 --rg-sample SM:BOGUS_R1_001 -o /data/user/BOGUS_001 /fdb/bowtie2mm10 BOGUS_READ_R1_001.fastq BOGUS_READ_R2_001.fastq

Swarm creates PBS batch scripts written in bash, so all commands in the user's swarmfile should be bash-compatible. For backward compatibility, the --usecsh option will allow commands to be interpreted under csh/tcsh.

Swarm creates a .swarm directory in your current working directory, and creates an executable script for every PBS job created. Each PBS job will contain a subset of commands from your swarmfile, depending upon how many cores are available on the allocated nodes. These scripts are automatically deleted as the final step when they are executed.

Swarm determines what nodes are appropriate for the run based on the amount of memory required (by default 1 gb per process) and the number of threads per process (by default 1 thread per process). The user should not need to worry about the hardware, but instead what is required by the commands within the swarmfile.
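
For example, if each command in a swarmfile runs a program that uses 4 threads and needs roughly 6 gb of memory, those requirements can be declared directly (the values here are purely illustrative):

[biowulf]$ swarm -f swarmfile -g 6 -t 4

Swarm will then select nodes with enough memory and schedule fewer processes per node so that neither memory nor cores are oversubscribed.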

If the -d or --debug option is specified, the scripts will not be deleted, and the job will NOT be submitted.

swarm -f cmdfile [ -b # ] [ -g # ] [ -t # | auto ] [ --disk # ] [ --autobundle ] [ --noht ] [ -R resource ] [ --help ] [ -d,--debug ] [ --usecsh ] [ --module ] [ --no-comment ] [ --comment-char char ] [ --jobarray ] [ --singleout ] [ --prologue command ] [ --epilogue command ] [ --quiet ] [ --no-scripts ] [ --keep-scripts ] [ qsub-options ]

The -f cmdfile option is mandatory; all others are optional.

-f cmdfile
--file
specify the file containing a list of commands, one command per line. You may use ";" to separate several commands on a line, and these will be executed sequentially. Any text following a "#" will be considered a comment and not executed. Lines ending with " \" (a space and backslash, the so-called line continuation marker) are joined to the next line.
-d
--debug
debug mode. The swarmfile is read, command scripts are generated and saved in the .swarm directory, and debugging information is printed. The scripts are NOT submitted to the batch system.
-b #
--bundle
bundle mode. Swarm runs one process per core by default. Use the bundle option to run "#" processes per core sequentially, one after the other. The advantages of bundling include fewer swarm jobs and output/error files, lower overhead due to scheduling and job startup, and disk file cache benefits under certain circumstances.
-g #
--gb-per-process
gigabytes per process. By default swarm assumes each process will require 1 gb of memory. Some applications require more than 1 gb, so setting -g will restrict both the nodes allocated to the swarm and the number of processes per node to accommodate the memory requirement.
-t # | auto
--threads-per-process
threads per process. By default swarm assumes each process will run a single thread, meaning one process will run on each node core. If a process is multi-threaded, setting -t will restrict the number of processes run per node to accommodate the number of threads and not overload the node. If -t auto is used, the number of processes per node is set to 1, allowing each process to create as many threads as there are cores on the node.
--disk #
--disk-per-process
/scratch disk per process. Each node has its own local /scratch disk (distinct from /scratch on the Biowulf login node). This option can be used to require a certain amount of local /scratch area be available per process.
--autobundle set a bundle value for large swarms automatically. This limits the number of jobs in a swarm to the maximum number of nodes available.
--noht assume hyperthreading is turned off on all nodes. This will cause swarm to use half as many threads on hyperthreading nodes.
-R resource
--resource-list
set any application-dependent resources required by the jobs. For example, in order to run Matlab jobs on Biowulf, you need to include the matlab resource: -R matlab=1. Please see the application pages for more information about the individual resources required.
-h
--help
prints help message
--quiet limit the messages when --debug mode is run
--no-scripts don't create command scripts when --debug mode is run
--keep-scripts don't delete command scripts when swarm is completed
--usecsh /bin/tcsh mode, instead of /bin/bash
--module load a list of environment modules prior to execution. Module names are separated by spaces only. See the environment modules documentation to learn more.
--no-comment don't ignore text following a comment character ('#' by default).
--comment-char char use a character other than '#' as the comment character.
--jobarray run swarm as a jobarray. This creates a single PBS jobid, rather than a set, allowing simple job dependencies.
--singleout concatenate STDOUT and STDERR into single files, rather than a set of .o and .e files.
--prologue command run a command or script once on a node at the start of each swarm job.
--epilogue command run a command or script once on a node upon completion of each swarm job.
[qsub-options] Allowance of qsub options is as follows:
option                     allowance
[-a date_time] yes
[-A account_string] yes
[-c interval] yes
[-C directive_prefix] yes
[-e path] no, overridden
[-h] yes
[-I] no
[-j join] yes
[-J range] no, overridden with --jobarray
[-k keep] yes
[-l resource_list] no, overridden
[-m mail_events] yes
[-M user_list] yes
[-N name] yes
[-o path] no, overridden
[-p priority] yes
[-q destination] yes
[-r c] yes
[-S path_list] yes
[-u user_list] yes
[-v variable_list] yes
[-V] yes
[-W additional_attributes] yes
[-z] yes

STDOUT and STDERR output from processes executed under swarm will be directed to a file named swarm#nPID.o (or .e), for instance swarm2587n1.o (or swarm2587n1.e). Since this can be confusing (with multiple processes writing to the same file) it is a good idea to explicitly redirect output on the command line using ">".

Note that input/output redirects (and everything else in the swarmfile) should be bash compatible. csh-style redirects like 'program >& output' will not work correctly unless the --usecsh option is included.
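
For example, a minimal swarmfile in which every command redirects its own output and errors (the program name myprog and the file names are hypothetical):

[biowulf]$ cat redirect.swarm
myprog -in sample1.txt > sample1.out 2> sample1.err
myprog -in sample2.txt > sample2.out 2> sample2.err
myprog -in sample3.txt > sample3.out 2> sample3.err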

Be aware of programs that write directly to a file with a fixed filename. Such a file will be overwritten or garbled if multiple processes write to it at once. If you run multiple instances of such a program, for each instance you will need to either a) change the name of the output file in each command, or b) alter the path to the file. See the EXAMPLES section for some ideas.
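
As a sketch of approach (a), assuming the program accepts an option for naming its output file (the program fixedapp and its flags are hypothetical):

fixedapp -in set1.dat -name results_set1.txt
fixedapp -in set2.dat -name results_set2.txt

Approach (b), running each instance in its own directory, is shown in Example 2 below.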

Please pay attention to the memory requirements of your swarm jobs! At the bottom of the swarm output files, there is a section that displays the elapsed time and maximum memory usage of the job:

------------------------- PBS time and memory report ------------------------
3875575.biobos elapsed time: 425 seconds
3875575.biobos maximum memory: 3.5312 GB
-----------------------------------------------------------------------------

When a swarm job runs out of memory, the node stalls and the job is eventually killed or dies. If a job dies before it is finished, this output may not be available. Contact staff@helix.nih.gov when you have a question about why a swarm stopped prematurely.

To see how swarm works, first create a file containing a few simple commands, then use swarm to submit them to the batch queue:

[biowulf]$ cat > cmdfile
date
hostname
ls -l
^D
[biowulf]$ swarm -f cmdfile

Use qstat -u your-user-id to monitor the status of your request; an "R" in the "S"tatus column indicates your job is running (see qstat(1) for more details). This particular example will probably run to completion before you can give the qstat command. To see the output from the commands, see the files named "swarm#nPID.o".


Example 1: A program that reads from STDIN and writes to STDOUT

For each invocation of the program the names for the input and output files vary:

[biowulf]$ cat > runbix
./bix < testin1 > testout1
./bix < testin2 > testout2
./bix < testin3 > testout3
./bix < testin4 > testout4
^D

Example 2: A program that writes to a fixed filename

If a program writes to a fixed filename, then you may need to run the program in different directories. First create the necessary directories (for instance run1, run2), and in the swarmfile cd to the unique output directory before running the program (cd using either an absolute path beginning with "/" or a relative path from your home directory). Lines with a leading "#" are considered comments and ignored.

[biowulf]$ cat > batchcmds
# Run ped program using different directory
# for each run
cd pedsystem/run1; ../ped
cd pedsystem/run2; ../ped
cd pedsystem/run3; ../ped
cd pedsystem/run4; ../ped
...

[biowulf]$ swarm -f batchcmds

Example 3: Bundling large numbers of commands

If you have over 1000 commands, especially if each one runs for a short time, you should 'bundle' your jobs with the -b flag. For example, if the swarmfile contains 2500 commands, the following swarm command groups them into bundles of up to 40 commands each. Since swarm runs one bundle per core, a two-core node handles two bundles (up to 80 commands) per job, so the 2500 commands are submitted as 32 swarm jobs.

[biowulf]$ swarm -f cmdfile -b 40

Note that commands in a bundle will run sequentially on the assigned node.


Example 4: Using qsub flags

Swarm submits clusters of processes using PBS (Portable Batch System) via the qsub command; any valid qsub commandline option is also valid for swarm. In this example the "-v" option is given to pass a variable to the swarm.

[biowulf]$ swarm -f testfile -v appVar=4

In this case, the environment variable $appVar is set to the value 4. Some applications require environment variables be set in order for them to run properly.
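
For instance, a command in the swarmfile could then reference the variable at run time (the program myapp and its option are hypothetical):

myapp --iterations $appVar -in input1.dat
myapp --iterations $appVar -in input2.dat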

Note that swarm overrides the qsub option -l, and chooses the node type automatically.


Example 5: Using --debug option

Before submitting a large, complex swarm to the batch system, it is wise to preview what will happen. The --debug option displays a good deal of information without submitting anything. This example shows a file of 1000 commands bundled to run 10 processes serially per core.

[biowulf]$ swarm -f swarmfile -b 10 --debug --quiet
[main] swarmfile is 'swarmfile'
[main] 1 gb per process
[main] 1 thread per process
[main] -b processes per core is '10'
[test] mini-freen output

    NT  Free /  All   C   M  ppn
-----------------------------------
 c2:g1     0 /   80   2   1    1
 c2:m1     8 /  662   2   2    2
 c2:m2     8 /  530   2   4    2
 c2:m4     0 /   78   2   8    2
    c4     0 / 1828   4   8    4
    c8    16 / 2536   8  24    8
-----------------------------------
 total    32 / 5714

[main] using c8 nodes
[main] loptions is '-l nodes=1:c8'
[main] 1000 commands on 13 c8 nodes using 8 processes per node

The mini-freen output displays 6 columns:

  • NT: nodetype as designated on Biowulf
  • Free: the number of cores (not nodes!) not currently allocated for the given nodetype
  • All: the total number of cores available for the given nodetype
  • C: the number of cores available per node
  • M: the memory available per node, in gb
  • ppn: the number of processes per node required for the swarm

Example 6: Using -g option

If the processes require significant amounts of memory (> 1 GB), a swarm can run fewer processes per node than the number of cores available on a node. For example, if the commands in a swarmfile need up to 6 GB of memory each, running swarm with --debug shows what might happen:

[biowulf]$ swarm -f swarmfile -g 6 --debug --quiet
[main] swarmfile is 'swarmfile'
[main] 6 gb per process
[main] 1 thread per process
[test] mini-freen output

    NT  Free /  All   C   M  ppn
-----------------------------------
 c2:m4     0 /   78   2   8    1
    c4     0 / 1828   4   8    1
    c8    16 / 2536   8  24    4
-----------------------------------
 total    16 / 4442

[main] using c4 nodes
[main] loptions is '-l nodes=1:c4'
[main] 1000 commands on 1000 c4 nodes using 1 process per node

Note that only 1 process would have been run per c4 node. This is because the c4 node has only 8 gb of memory, and can only accommodate one 6 gb process.

Example 7: Using --module option

It is sometimes difficult to set the environment properly before running commands. The easiest way to do this on Biowulf is with environment modules. Running commands via swarm complicates the issue, because the modules must be loaded prior to every line in the swarmfile. Instead, you can use the --module option to load a list of modules:

[biowulf]$ swarm -f testfile --module ucsc matlab python/2.7.1

Here, the environment is set to use the UCSC executables, Matlab, and an older, non-default version of Python.

Example 8: Handling a swarm as a single job using --jobarray

The option --jobarray causes swarm to be run as a PBS job array. This has a few effects:

  • swarm runs as a single job, rather than a set of independent jobs
  • The jobid of the single job has a new format
  • The output of swarm --jobarray has a new format

The --jobarray option allows a single swarm to be handled like a single PBS job. Because of this, subsequent jobs can easily run with dependencies on the swarm.

For example, suppose a first script (first.sh) is run to generate some initial data files. Once this job is finished, a swarm of commands (swarmfile.txt) is run to process the output of the first script. Then, a last script (last.sh) is run to consolidate the output of the swarm and further process it into its final form.

Below, the swarm is run as a job array with a dependency on the first script. Then the last script is run with a dependency on the swarm. The swarm will sit in a hold state until the first job (1001.biobos) is completed, and the last job will sit in a hold state until the entire swarm (1002[].biobos) is completed. Note the different format of the job array jobid.

[biowulf]$ qsub -l nodes=1 first.sh
1001.biobos
[biowulf]$ swarm --jobarray -f swarmfile.txt -W depend=afterany:1001.biobos
1002[].biobos
[biowulf]$ qsub -l nodes=1 -W depend=afterany:1002[].biobos last.sh
1003.biobos

The dependency key 'afterany' is used instead of 'afterok' because the exit status of a PBS job is unpredictable. It is best to encode validation of output into the scripts, rather than relying on the exit status of a PBS job.
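
A minimal sketch of such a check inside last.sh, assuming the swarm writes a set of output files matching out.* in the working directory (the file names are hypothetical):

[biowulf]$ cat last.sh
#!/bin/bash
# refuse to consolidate if the swarm produced no output
if ! ls out.* > /dev/null 2>&1; then
    echo "no swarm output found; aborting" >&2
    exit 1
fi
cat out.* > final_result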

Please note that the script must be the last argument to the qsub command.

The jobid of a job can be captured from the qsub command and passed to subsequent submissions in a script (master.sh). For example, here is a bash script which automates the above procedure, passing the variable $id to the first script. In this way, the master script can be reused for different inputs:

[biowulf]$ cat master.sh
#!/bin/bash
jobid1=`qsub -l nodes=1 -v id=$1 first.sh`
echo $jobid1
jobid2=`swarm --jobarray -f swarmfile.txt -W depend=afterany:$jobid1`
echo $jobid2
jobid3=`qsub -l nodes=1 -W depend=afterany:$jobid2 last.sh`
echo $jobid3

Now master.sh can be submitted with a single argument:

[biowulf]$ ./master.sh mydata123
1001.biobos
1002[].biobos
1003.biobos
[biowulf]$

You can check on the job status using qstat:

[biowulf]$ qstat -u user

biobos:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
1001.biobos     user     norm     first        6170   1   1    --    --  R 00:00
1002[].biobos   user     norm     sw0n7222      --    1   1    --    --  H   --
1003.biobos     user     norm     last          --    1   1    --    --  H   --

swarm, whether in normal or jobarray mode, will generate a stdout and stderr file for each job. In jobarray mode, the output files are named sw[b]#n{#1}.[o/e]{#2}.{#3}, where # is the submission number, {#1} is the swarm id (the PID of the submitting process), {#2} is the PBS job id, and {#3} is the jobarray subjob id. For example, the above script might give these files after completion:

[biowulf]$ ls
first.err               last.sh                 sw1n6848.e1002.4        sw1n6848.o1002.4
first.out               master.sh               sw1n6848.e1002.5        sw1n6848.o1002.5
first.sh                sw1n6848.e1002.1        sw1n6848.o1002.1        swarmfile.txt
last.err                sw1n6848.e1002.2        sw1n6848.o1002.2
last.out                sw1n6848.e1002.3        sw1n6848.o1002.3
[biowulf]$

Example 9: Using --prologue and --epilogue

swarm splits a list of commands into a set of command files, and then runs each command file as a single job on a single node. The --prologue option allows a single script or command to be run on a node before that job's set of commands. The --epilogue option works the same way, except that it runs after all the commands on the node have completed.

For example, suppose the swarm commands write large amounts of output to the local scratch directory on each node. 'clearscratch' needs to be run before the job to remove all existing files from /scratch. Since the output is mostly temporary, and only a summary is required, an epilogue perl script called 'cleanup.pl' is called afterward to pull out the essential parts of the output and write them to a single file.

swarm --prologue /usr/local/bin/clearscratch --epilogue /data/user/cleanup.pl \
 -f swarmfile.txt

Note that the full paths to the clearscratch command and cleanup.pl scripts need to be supplied as arguments.
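
The epilogue could equally be a short bash script rather than perl. This sketch (the paths and the SUMMARY pattern are hypothetical) pulls only the summary lines out of the temporary /scratch output and appends them to a per-node file in the user's data area:

[biowulf]$ cat /data/user/cleanup.sh
#!/bin/bash
# keep only the summary lines from this node's temporary output
grep -h '^SUMMARY' /scratch/*.out >> /data/user/summary.$(hostname).txt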

Example 10: Handling comment characters with --no-comment and --comment-char

By default, if a commandline in a swarmfile contains a '#' character, swarm assumes that the remainder of the line is a comment. The '#' character and the remainder of the line are not parsed as part of the command. However, there are occasions when the '#' character is an essential part of a command, for example as a unique identifier or an application option.

In such a circumstance, the --no-comment option can be used to avoid removal of the command remainder.

Alternatively, the --comment-char option can be included to allow a different character to be used as the comment character. Note that only the first character of the argument passed with --comment-char is used as the comment character. For example, to use the '!' character to delineate comments:

$ cat swarmfile
myApp -option1 -id abc#123 ! last weeks results
myApp -option1 -id abc#456 ! this weeks results
$ swarm -f swarmfile --comment-char !

Example 11: Running mixed asynchronous and serial commands in a swarm

There are occasions when a single swarm command can contain a mixture of asynchronous and serial commands. For example, collating the results of several commands into a single output and then running another command on the pooled results. If run interactively, it would look like this:

$ cmdA < inp.1 > out.1
$ cmdA < inp.2 > out.2
$ cmdA < inp.3 > out.3
$ cmdA < inp.4 > out.4
$ cmdB -i out.1 -i out.2 -i out.3 -i out.4 > final_result

It would be more efficient if the four cmdA commands could run asynchronously (in parallel), and then the last cmdB command would wait until they were all done before running, all on the same node and in the same swarm command. This can be achieved by backgrounding the cmdA commands in a subshell and using wait, as in this one-liner in a swarmfile:

( cmdA < inp.1 > out.1 & cmdA < inp.2 > out.2 & \
  cmdA < inp.3 > out.3 & cmdA < inp.4 > out.4 & wait ) ; \
  cmdB -i out.1 -i out.2 -i out.3 -i out.4 > final_result

Here, the cmdA commands are all run asynchronously in four background processes, and the wait command is given to prevent cmdB from running until all the background processes are finished. Note that line continuation markers were used for easier editing.

A subset of nodes in the Biowulf cluster mount filesystems via GPFS, rather than standard NFS. Some users have /home, /data or shared /data directories available via GPFS, and so their swarm submissions can only use those nodes with GPFS mounted. To submit to only those nodes, include the gpfs resource property:

[biowulf]$ swarm -f swarmfile -R gpfs

The script swarminfo provides a list of swarm jobs and accompanying information, selected by user, swarm id, and/or the time at which the swarm was run.

usage: swarminfo [ options ]

Spill out information about swarms.

options:
  -h, --help     print this menu
  -i,--id        specify id number (pid)
  --since        give information since this time (default 1 week)
  --until        give information until this time (default NOW)
  --pwd          show submission directory
  --jobids       show jobids
  --swarmfile    show swarmfile
  --command      show command
  --queue        show chosen queue, if set
  --noheader     don't show the header line(s)
  --full         combine --pwd, --jobids, --swarmfile, --command, and --queue

By default, swarminfo provides the submission time, the swarmid, the number of commands in the swarm, the number of jobs that the swarm required, and the nodetype on which the swarm was run for the last week:

$ swarminfo | head
         submit time     id  ncmds  njobs    B        nodetype
================================================================
 2014-01-30 12:39:15  30790      1      1    -    c16:g24:gpfs
 2014-01-30 12:39:50  19252    240    120    2           c2:m1
 2014-01-30 12:40:42  30856      1      1    -    c16:g24:gpfs
 2014-01-30 12:42:05  20846      1      1    -    c16:g24:gpfs

When run in --full mode, the directory from which the swarm was submitted, the path to the swarmfile, the jobid(s) of the swarm, and the full command used to submit the swarm are displayed:

$ swarminfo --full --id 18031
         submit time     id  ncmds  njobs    B        nodetype  
================================================================
 2014-01-28 09:34:57  18031     12      6    -         c16:g24          
   PWD: /data/bogus/biowulf-class/dependencies
   SWARMFILE: /data/bogus/biowulf-class/dependencies/swarm.cmd
   JOBIDS: 5322144,5322145,5322146,5322147,5322148,5322149
   COMMAND: /usr/local/bin/swarm -g 10 -f swarm.cmd -W depend=afterany:5322129

swarmdel Job_ID|Job_Name [ -niqf ] [ -t # ] [ -EHQRSTW ]

The utility script swarmdel will delete a set of swarm jobs automatically:

[biowulf]$ swarmdel 123456.biobos
[biowulf]$ swarmdel swb23n12345

By default, swarm jobs are given a name that matches the pattern sw[b](submission number)n(PID number).

The submission number corresponds to the order in which the job was submitted to the queue, and the PID number corresponds to the process id that originated the submission. Bundled swarm jobs are given an additional 'b'. Thus, only the submission number is variable in the swarm job name set.

swarmdel will delete only those jobs that match the default swarm name pattern and which are owned by the user giving the swarmdel command.

To find all jobs that exactly match a Job_Name, use qselect -N [Job_Name].
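
For example, to list your jobs whose name exactly matches a given swarm job name (the name shown is hypothetical):

[biowulf]$ qselect -N sw2n12345 -u user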

swarmdel has the following options:

-n
test run: don't actually delete anything
-i
interactive mode: the user is prompted to allow the deletion
-q
quiet mode: don't give any output
-f
forceful: keep deleting until every job is gone from the queue
-t #
number of seconds to wait for additional jobs to appear (forceful mode only)
-EHQRSTW
job state selection. Only delete jobs in the selected state: -E (ending), -H (held), -Q (queued), -R (running), -S (suspended), -T (being moved), -W (waiting). State options can be combined (e.g., -W -H deletes only jobs in either W or H state).
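
For example, to preview which queued or held jobs of a swarm would be deleted, without actually removing anything (the job name is hypothetical):

[biowulf]$ swarmdel -n -Q -H sw2n12345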

Users will typically want to write a script to create a large swarm file. This script can be written in any scripting language, such as bash, perl, or the language of your choice. Some examples are given below to get you started.

Example 1: processing all files in a directory
Suppose you have 800 image files in a directory. You want to set up a swarm job to run an FSL command (e.g. 'mcflirt') on each one of these files.

# this file is make-swarmfile

cd /data/user/mydir
> swarm.cmd                            # start with an empty swarmfile
for file in *                          # loop over every file in the directory
do
    [ "$file" = swarm.cmd ] && continue     # skip the swarmfile itself
    echo "mcflirt -in $file -out $file.mcf1 -mats -plots -refvol 90 -rmsrel -rmsabs" >> swarm.cmd
done

Execute this file with

bash make-swarmfile

You should get a file called swarm.cmd which is suitable for submission to the swarm command.

Example 2: Use swarm to pull sequences out of the NCBI nt blast database.
Suppose you have a file containing 1,000,000 GI numbers of sequences that you want to pull out of the Helix/Biowulf NCBI nt Blast database. You can divide the GI file into chunks and run a swarm of jobs, each one extracting the sequences for one chunk of GIs from the database.
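
One way to generate such a swarm is sketched below; the chunk size, file names, and the use of blastdbcmd against the nt database are assumptions (see the Biowulf BLAST application pages for the actual database setup):

# split the GI list into chunks of 10,000 ids each
# (split's default output prefix is 'x', giving files xaa, xab, ...)
split -l 10000 gilist.txt
# write one retrieval command per chunk into blast.swarm
for chunk in x??
do
    echo "blastdbcmd -db nt -entry_batch $chunk -out $chunk.fas" >> blast.swarm
done

[biowulf]$ swarm -f blast.swarm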

Once the swarm jobs are complete, you can, if desired, combine all the sequences into a single file with:

biowulf% cat x*.fas > myseqs.fas

Swarm is available for download here. Keep in mind that swarm was written for our own systems; it will need to be adapted to work properly with other batch systems.