GPU Nodes on the Biowulf Cluster

Graphics processing units (GPUs) are specialized microprocessors originally designed for rendering video and graphics. More recently, compute-intensive programs in the life sciences have been ported to GPUs to explore the potential performance benefits of their massive parallel compute power.

As part of its role in integrating new technologies into the production environment, the Biowulf staff has installed 16 GPU nodes in the cluster as a pilot project. The purpose of the pilot is to:

Note that running on the GPU nodes does not guarantee improved performance; it is vital to run your own benchmarks to determine whether your application benefits from the GPUs. The Biowulf staff is very interested in any GPU benchmark results; please share them with us at staff@biowulf.nih.gov.

Hardware

Software

Allocating GPU nodes with the batch system

GPU nodes can be allocated using the "gpu2050" property:

% qsub -l nodes=1:gpu2050 gpujob.bat
% qsub -l nodes=4:gpu2050 pjob.bat

Initial testing or compiling can be done with an interactive session:

% qsub -l nodes=1:gpu2050 -I

Interactive sessions on GPU nodes have a maximum walltime of 24 hours, but please log out of interactive sessions as soon as you are finished with them.
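
For reference, a minimal batch script for a single-node GPU job might look like the sketch below. The script name matches the qsub examples above, but the CUDA module version, application name, and input file are placeholders rather than a supported recipe; see the application sections below for real examples.

#!/bin/bash
# gpujob.bat -- sketch of a GPU batch script; replace the placeholders
#PBS -N gpujob
#PBS -j oe

cd $PBS_O_WORKDIR

# load the CUDA runtime the application was built against (example version)
module load cuda/4.1

# run the GPU-enabled application (placeholder command and input)
./my_gpu_app input.dat > output.log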

Applications on GPU Nodes

NAMD

AMBER

Bioinformatics programs

GROMACS

MATLAB

Strategies for Benchmarking

There are 32 GPUs in the pilot cluster, two per node. Since it is possible (depending on the application) to share the GPUs among processes running on the Intel Nehalem CPU cores, additional performance may be gained by running with CPU:GPU ratios of 2:1, 3:1 or more. (See the Biowulf NAMD GPU page for an example.) Distributed-memory codes may also benefit from running over the InfiniBand network instead of gigabit Ethernet.
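
As a rough sketch of such a ratio scan (the benchmark binary, its input file, and the use of mpirun are placeholders; the actual launch command depends on the application):

# run from a batch script or interactive session on a GPU node; with
# 2 GPUs per node, np=2, 4 and 8 correspond to CPU:GPU ratios of
# 1:1, 2:1 and 4:1
for np in 2 4 8; do
    mpirun -np $np ./my_benchmark input.dat > bench_${np}cpu.log
done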

Monitoring your jobs

CPU usage on the GPU nodes can be monitored using the jobload utility, as with other Biowulf batch jobs. There is currently no simple way to monitor the GPU usage.
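
If you need a rough picture of GPU activity, one ad hoc workaround (assuming you can log into the node running your job; the node name below is hypothetical) is to run NVIDIA's nvidia-smi utility on that node:

% ssh p1234          # hypothetical node name; use a node assigned to your job
% nvidia-smi         # reports per-GPU utilization and memory use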

Building Applications for GPU Nodes

Software packages that include GPU support via CUDA will likely have configuration or Makefile options for specifying the location of the CUDA SDK. On Biowulf, the default CUDA SDK (which includes the compilers, headers and run-time libraries) is located in /usr/local/CUDA/cuda-4.1/.

Several other versions of CUDA are available in /usr/local/CUDA. The easiest way to see the available versions, or to build code against a specific version, is to use the module commands:
[user@biowulf ~]$ module avail cuda

--------------------- /usr/local/Modules/3.2.9/modulefiles -----------------------
cuda/2.3    cuda/3.0    cuda/3.1    cuda/4.0.17 cuda/4.1    cuda/5.0    cuda/6.5

[user@biowulf ~]$ module load cuda/5.0

[user@biowulf ~]$ module list
Currently Loaded Modulefiles:
  1) cuda/5.0

Alternatively, you can see the paths set by the module with, for example, 'module display cuda/5.0', and then set them as desired.
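
As an illustrative sketch of compiling a small CUDA source file against a specific toolkit version (the source file name is a placeholder; -arch=sm_20 targets the Fermi generation of GPUs):

% module load cuda/5.0
% nvcc --version                     # confirm which compiler is on your PATH
% nvcc -arch=sm_20 -o my_kernel my_kernel.cu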

Individuals who wish to write their own GPU applications, or to add GPU support to existing applications, will need to learn how GPU-assisted processing works and will likely want to become familiar with NVIDIA's developer resources portal.

The current CUDA programming guide and the architecture-specific (Fermi) GPU tuning guide are both available from NVIDIA's developer documentation.

NVIDIA Performance Primitives

NVIDIA distributes the NVIDIA Performance Primitives (NPP), a set of library functions for accelerating the processing of image and video data. NPP is installed in /usr/local/nvidia/NPP_SDK; for documentation and downloads, see the NVIDIA NPP page.
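
As an illustration only of linking against NPP (the include and library sub-directories under NPP_SDK and the single -lnpp library name are assumptions; check the installed SDK for the actual layout and library names):

% module load cuda/4.1
% nvcc -I/usr/local/nvidia/NPP_SDK/include \
       -L/usr/local/nvidia/NPP_SDK/lib -lnpp \
       -o npp_example npp_example.cu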