Gaussian on Biowulf

Gaussian09 is the latest in the Gaussian series of electronic structure programs. Designed to model a broad range of molecular systems under a variety of conditions, it performs its computations starting from the basic laws of quantum mechanics. Gaussian can predict energies, molecular structures, and vibrational frequencies, along with the numerous molecular properties derived from these three basic computation types, for systems in the gas phase and in solution, and it can model them in both their ground and excited states. Chemists apply these fundamental results to their own investigations, using Gaussian to explore chemical phenomena such as substituent effects, reaction mechanisms, and electronic transitions.

Gaussian is a connected system of programs for performing semiempirical and ab initio molecular orbital calculations including:

  • Calculation of one- and two-electron integrals over s, p, d, and f contracted gaussian functions
  • Evaluation of various one-electron properties of the Hartree-Fock wavefunction, including Mulliken population analysis, multipole moments, and electrostatic fields
  • Correlation energy calculations using configuration interaction, with either all double excitations or all single and double excitations

The Gaussian executables are limited by the node hardware (see below). The main advantage of using Gaussian on Biowulf is the ability to run many Gaussian jobs simultaneously as a swarm of single- or dual-threaded jobs.

Gaussian input and output can be graphically displayed using GaussView.


NOTE: Due to licensing restrictions, users must belong to the group 'gaussian' on the Helix systems in order to run Gaussian. Please contact Helix Systems (staff@helix.nih.gov) to be added to the 'gaussian' group on Biowulf.
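
You can check your current group memberships with the standard Linux command 'groups':

[biowulf]% groups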

Gaussian Documentation
Notes regarding hyperthreading

Many of the nodes in the Biowulf cluster have hyperthreading enabled. This can speed up many multithreaded applications. However, Gaussian already manages data flow through the CPUs very efficiently, and running more threads than there are physical cores is very detrimental. For example, on a node with eight physical cores and hyperthreading enabled (16 logical CPUs), a Gaussian job with more than %NProcShared=8 will run considerably slower than one that stays within the physical core count. Do not exceed the number of physical cores on a node.
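
To see how many physical cores a node has, as opposed to hyperthreaded logical CPUs, you can run the standard lscpu utility on the allocated node, for example:

[p139]% lscpu | grep -iE 'socket|core|thread'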

Notes regarding scratch files

Gaussian uses several scratch files in the course of its computation. These include the checkpoint file (*.chk), the read-write file (*.rwf), the two-electron integral file (*.int), and the two-electron integral derivative file (*.d2e). These files can become extremely large, and because the program is accessing them constantly, I/O speed is a factor in performance.

Choosing a scratch directory can be critical. The scratch directory is selected by setting the environment variable $GAUSS_SCRDIR, either immediately prior to execution or in the shell that launches the program.

The default directory for scratch files on the Biowulf cluster is /scratch (local to each node). Disk space depends on the node and varies from 33 GB to 448 GB. Because /scratch is local to the node, it provides the fastest I/O for Gaussian execution.
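
For example, to point Gaussian at the node-local scratch area before launching a run (a bash sketch; the per-user subdirectory is only a suggestion for keeping your files separate):

export GAUSS_SCRDIR=/scratch/$USER
mkdir -p $GAUSS_SCRDIR
g09 < test000.com > test000.log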

Scratch files remain after the Gaussian run completes. There is no automatic mechanism for removing files from /scratch on the nodes, so unless the files are required for future runs, users are encouraged to include the Link 0 command %NoSave at the end of the Link 0 section of their input files.
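
Here is a minimal sketch of a Link 0 section using %NoSave (file names are illustrative). With the usual %NoSave behavior, scratch files named on lines above %NoSave are deleted when the run ends, while files named below it, such as the checkpoint file here, are kept:

%RWF=/scratch/test000.rwf
%NoSave
%Chk=test000.chk
%Mem=3800MB
%NProcShared=4
#p opt freq=noraman scf=tight...
...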

Notes regarding memory

The memory requirements for a Gaussian job depend on the job type, the number of basis functions, and the algorithms used for the integrals. In general, it is best to use the minimum memory necessary. Too little will cause the job to fail, while too much will slow down the calculation (only with g03; see below). Allocating more memory with the %Mem command than is physically available will cause the node to swap data back and forth between disk and memory, badly degrading CPU performance.

Because the memory requested with %Mem is shared among all the CPUs in a multiprocessor job, each processor has access to only a fraction of the amount it would have in a single-processor run. The amount of memory allocated therefore needs to be increased by up to N-fold, where N is the number of processors used.
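
As a sketch (the numbers are illustrative): if a job runs comfortably in 1 GB on one processor, the same job run with %NProcShared=4 should request roughly four times as much:

%Mem=4GB
%NProcShared=4
#p opt freq=noraman scf=tight...
...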

Memory effect

This is a plot of memory allocated versus time to run a test job (g03), using different numbers of CPUs. The minimum amount of memory necessary to run on a single processor is 32 MB, but 8-fold more is needed to fully utilize 8 CPUs.

Memory Use with g09

Rerunning the same test with g09, however, shows no such memory effect. What is clear is that this job scales well to 4 processors, but not much beyond.

For more information on how to calculate the amount of needed memory, see here.

Run as a batch job

Create a batch script, e.g. 'gaussian_run', which uses the input file 'test000.com'. For example:

#!/bin/bash
#PBS -N gaussian
#PBS -e gaussian.err
#PBS -o gaussian.log
cd $PBS_O_WORKDIR
/usr/local/bin/g09 < test000.com > test000.log

Submit this job using the PBS 'qsub' command. Example:

qsub -l nodes=1 gaussian_run

See here for more information about PBS.

Running a swarm of Gaussian jobs

The swarm program is designed to submit a group of commands to the Biowulf cluster as batch jobs. Each command is represented by a single line in the swarm command file that you create, and runs as a separate batch job.

Create a swarm command file with one Gaussian command per line. For example, the file 'cmdfile':

g09 < test000.com > test000.log
g09 < test001.com > test001.log
g09 < test002.com > test002.log
g09 < test003.com > test003.log
g09 < test004.com > test004.log
g09 < test005.com > test005.log
...

Submit this swarm command file to the batch system with the command:

[biowulf]% swarm -f cmdfile

NOTE: Swarm will attempt to run one command per processor. If the Gaussian jobs are to run on multiple processors using the %NProcShared command, this will overload the node and cause the jobs to run much slower than necessary. In that case you need to include the -t option to swarm. For example, with %NProcShared=2, include -t 2:

[biowulf]% swarm -t 2 -f cmdfile

In this case, swarm will run one command per two processors. See the Swarm documentation for more information.

Run Gaussian on the command line

If you log in to Biowulf and type a command, your command will run on the main Biowulf login node, which is shared by all users and is not intended for computation. If you want to run interactively, either use one of the interactive nodes (see the user guide for more information) or allocate a node for interactive use. Once the node is allocated, you can type commands directly on the command line. Example:

[biowulf]% qsub -I -l nodes=1
qsub: waiting for job 2011.biobos to start
qsub: job 2011.biobos ready

[p139]% g09 < test000.com > test000.log
[p139]% exit
logout

qsub: job 2011.biobos completed
[biowulf]%

To run an older version of Gaussian (e.g. D.01), include the option -D01 or -C02 on the command line.
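
For example (a sketch; this assumes the revision flag is given directly on the g09 command line before the input redirection):

[p139]% g09 -D01 < test000.com > test000.log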

Utility Programs

There are a number of utility programs available. These programs can be run both on the login node and on the cluster.

NOTE: The utilities are by default the Gaussian09 version. To run the Gaussian03 version of the utility, add the prefix 'g03' to the utility name. For example, to run ghelp for g03, type g03ghelp.

Program Function
c8609 Converts checkpoint files from previous program versions to Gaussian09 format.
chkchk Displays the route and title sections from a checkpoint file.
cubegen Standalone cube generation utility.
cubman Manipulates Gaussian-produced cubes of electron density and electrostatic potential (allowing them to be added, subtracted, and so on).
formchk Converts a binary checkpoint file into an ASCII form suitable for use with visualization programs and for moving checkpoint files between different types of computer systems.
freqchk Prints frequency and thermochemistry data from a checkpoint file. Alternate isotopes, temperature, pressure and scale factor can be specified for the thermochemistry analysis.
freqmem Determines memory requirements for frequency calculations.
gauopt Performs optimizations of variables other than molecular coordinates.
ghelp On-line help for Gaussian.
mm Standalone molecular mechanics program.
newzmat Conversion between a variety of molecular geometry specification formats.
rwfdump Dumps the file index and data from a read-write or checkpoint file.
testrt Route section syntax checker and non-standard route generation.
unfchk Converts a formatted checkpoint file back to its binary form (e.g., after moving it from a different type of computer system).
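
For example, a common task is converting a binary checkpoint file to its formatted form for use with visualization programs (file names illustrative):

[biowulf]% formchk test000.chk test000.fchk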

For more information, type ghelp at the prompt.

GaussView

GaussView is an X11-dependent GUI for reading, writing, and submitting Gaussian jobs. See here for more information about X11 on the Helix systems.

GaussView can be started by typing the command gaussview at the prompt. Please see here for documentation. Please note that GaussView is not allowed on the Biowulf login node, but can be run on Helix or interactively on any of the nodes.

Optimizing Gaussian (and cluster) usage

A job submitted to PBS on the cluster will be sent to the highest-performance node available. There is no automatic accounting for the job's memory, disk space, or processor-speed requirements. A job can be directed to specific nodes using the expanded PBS options. For example, to send the job 'gaussian_run' to a 2.8 GHz node with 4 GB of memory per core, use the command

[biowulf]% qsub -l nodes=1:o2800:m4 gaussian_run

The nodes available can be seen by typing the command 'freen'. Click here for more information about directing and monitoring jobs on the cluster.

While disk space and memory are not restricted by Gaussian itself, they are still limited by the cluster hardware. Using the Link 0 command %Mem and the MaxDisk keyword in the Gaussian input file may be required to prevent memory swapping or running out of disk space. Click here for more information about memory and disk space requirements for Gaussian.
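
For example, here is a sketch of an input header that caps both memory and scratch disk usage (the values and route line are illustrative; MaxDisk is given as a route-section keyword):

%Mem=3800MB
%NProcShared=4
#p mp2/6-31g* opt maxdisk=30GB
...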

Running the command g03/g09 will print, in addition to the Gaussian output, three pieces of information about the Gaussian job:

[biowulf]% head -4 test000.log
host = p1070
Running revision A02
Current disk usage on p1070:
/dev/hda1 72G 2.1G 66G 4% /

The first line tells which node the Gaussian process is running on. The second line shows the revision that is being run. The third and fourth lines show how much scratch space is available on the node (in this case 66 GB). This information is important when deciphering any problems the job may have encountered.

Choosing A Number Of Processors

Gaussian is a natively multithreaded application and in general scales to 2-4 CPUs. A job is made multithreaded by including the Link 0 command %NProcShared. For example, to run a Gaussian job on all 4 cores of a dual-core node, the header must include %NProcShared=4:

%Mem=3800MB
%NProcShared=4
#p opt freq=noraman scf=tight...
...

Then make sure to submit it to a dual-core node:

qsub -l nodes=1:c4 gaussian_run

The default version of Gaussian is compiled with TCP Linda, which allows Gaussian to run across multiple nodes. A small number of Links (L502, L703, L914, L1002, L1110) will scale up to 32 CPUs using TCP Linda.

However, not all calculation types parallelize well or at all. In fact, most run best as single-threaded processes.

The following table shows the best use of Gaussian with respect to the number of processors:

Method     Energy   Gradient/Opt   Freq/Hessian
HF         4        4              4
HDFT       4        4              4
Pure DFT   4        4              4
MP2        4        3              1-2
MP3        1        1              -
MP4        2-4      -              -
MP5        1        -              -
CCD        1        1              -
CCSD       1        1              -
CCSD(T)    2-4      -              -
CIS        4        3              -
CISD       1        1              -
AM1        1        1              -

Only HF, DFT, CCSD(T), CIS, and MP2/MP4 jobs will benefit from running with %NProcShared=4 on dual-core nodes. All others should be run with %NProcShared=2 (or 1).

Interpreting Gaussian Errors

Gaussian errors are not always straightforward to interpret. Something as simple as a "file not found" can seem baffling and cryptic. Here is a collection of errors and their translations:

Gaussian error:
    Error termination in NtrErr:
    ntran open failure returned to fopen.
    Segmentation fault
Translation:
    Can't open a file.

Gaussian error:
    Internal consistency error detected in FileIO for unit 1 I= 4 J=0 IFail= 1.
Translation:
    Gaussian is limited to 16 GB of scratch space on the 32-bit nodes.

Gaussian error:
    Out-of-memory error in routine UFChkP (IEnd= 12292175 MxCore= 6291456)
    Use %Mem=12MW to provide the minimum amount of memory required to complete this step.
    Error termination via Lnk1e at Thu Feb 2 13:05:32 2006.
Translation:
    Default memory (6 MW, set in $GAUSS_MEMDEF) is too small for unfchk.

Gaussian error:
    galloc: could not allocate memory.: Resource temporarily unavailable
Translation:
    Not enough memory.

Gaussian error:
    Out-of-memory error in routine...
Translation:
    Not enough memory.

Gaussian error:
    End of file in GetChg.
    Error termination via Lnk1e ...
Translation:
    Not enough memory.

Gaussian error:
    IMax=3 JMax=2 DiffMx= 0.00D+00
    Unable to allocate space to process matrices in G2DrvN:
    NAtomX= 58 NBasis= 762 NBas6D= 762 MDV1= 6291106 MinMem= 105955841.
Translation:
    Gaussian has 6 MW of free memory (MDV1) but requires at least 106 MW (MinMem).

Gaussian error:
    Estimate disk for full transformation -677255533 words.
    Semi-Direct transformation.
    Bad length for file.
Translation:
    MaxDisk has been set too low.

Gaussian error:
    Error termination in NtrErr:
    NtrErr Called from FileIO.
Translation:
    The calculation has exceeded the maximum limit of maxcyc.

Gaussian error:
    Erroneous read. Read 0 instead of 6258688.
    fd = 4
    g_read
Translation:
    Disk quota or disk size exceeded. Could also be disk failure or NFS timeout.

Gaussian error:
    Erroneous write. Write 8192 instead of 12288.
    fd = 4
    orig len = 12288 left = 12288
    g_write
Translation:
    Disk quota or disk size exceeded. Could also be disk failure or NFS timeout.

Gaussian error:
    PGFIO/stdio: Permission denied
    PGFIO-F-/OPEN/unit=11/error code returned by host stdio - 13.
    File name = /scratch/Gau-#####.inp
    In source file ml0.f, at line number 177
Translation:
    The user does not have write permission for $GAUSS_SCRDIR.

Using Linda

Gaussian03 version E.01 (64-bit only) on Biowulf is compiled with TCP Linda. This allows a small subset of job types to be distributed across multiple nodes on the cluster.

To use TCP Linda, include the Link 0 command %NProcLinda=#, where # is the number of nodes across which to distribute the job. You must also include the %NProcShared=# command, where # is the number of CPUs per node (%NProc and %NProcShared are synonyms). Finally, specify the number of nodes needed when submitting the Gaussian job. For example, to distribute a Gaussian run onto 16 CPUs across 8 o2800 (single-core, 2 CPUs/node) nodes, the Gaussian input file would look like

%NProcShared=2
%NProcLinda=8
#p b3lyp 6-31G* td(nstates=10) test

Gaussian Test Job 438:
...

and would be submitted like

qsub -l nodes=8:o2800 gaussian_run

See above for details about submitting to the batch system.

Keep in mind that only the processor load is distributed. The master (mother superior) node in a multi-node Linda job must have the required amount of RAM, while the worker nodes can have less. Thus, in the case of the Gaussian input

%Mem=8GB
%NProcShared=2
%NProcLinda=8
#p opt freq=noraman scf=tight...
...

the master node must have at least 8GB of RAM. See here for a discussion of memory issues.