PROSPECT on Biowulf

PROSPECT (PROtein Structure Prediction and Evaluation Computer Toolkit) is a threading-based protein structure prediction system. It is designed particularly for recognizing fold templates whose sequences have no significant homology to the target sequence. The system runs efficiently, mainly because it discovers and exploits the "topological complexity" of a protein fold. The threading templates contain both protein chains (defined by the FSSP non-redundant set) and compact domains (defined by the SCOP and CATH databases). The template lists are updated when new versions become available. The threading output provides an evaluation of compactness and an SVM assessment of threading reliability.
The web interface

PROSPECT can be run most simply through the web interface. The job is run as a prospect_swarm job (see below) and typically finishes in a few minutes. With Z-scoring enabled, a job typically finishes in less than an hour. A single protein sequence is threaded against one or all template databases with a minimum of options, and the threading results are returned to the user by email. The sequence must be in either FASTA or raw format, and the user must supply an email address.

prospect_swarm jobs

PROSPECT can be run as a swarm job using the command prospect_swarm. This breaks the job for a single sequence into multiple runs and finishes in a fraction of the time taken by the single-threaded version. The command has the following input options:

Input sequence prefix (filename must end with a .seq suffix):
Threading template library:
Secondary structure options:
Threading method options:
Z-scoring:
Number of solutions displayed in final html file:
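Because the input file name must end in .seq and the sequence may start out in either FASTA or raw format, it can help to normalize the input first. The following is a minimal Python sketch, not part of PROSPECT. The helper names `read_sequence` and `write_seq_file` are hypothetical, and the assumption that a .seq file may hold the bare residues on one line is not confirmed by this page.

```python
from pathlib import Path


def read_sequence(text: str) -> str:
    """Return the residues from FASTA or raw text as one uppercase string."""
    lines = [ln.strip() for ln in text.splitlines()]
    # FASTA header lines start with '>'; raw input simply has none.
    residues = "".join(ln for ln in lines if ln and not ln.startswith(">"))
    # Drop any whitespace left inside the sequence lines.
    residues = "".join(residues.split()).upper()
    if not residues.isalpha():
        raise ValueError("no residues found, or non-letter characters present")
    return residues


def write_seq_file(text: str, prefix: str, directory: str = ".") -> Path:
    """Write the cleaned sequence to <prefix>.seq (assumed layout) and return the path."""
    path = Path(directory) / f"{prefix}.seq"
    path.write_text(read_sequence(text) + "\n")
    return path
```

Calling `write_seq_file(fasta_text, "input")` would produce input.seq, whose prefix (input) is what the first option above refers to.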
Here is an example running an input sequence (input.seq) against all template libraries, generating secondary structure from a PSI-BLAST sequence profile, using global alignment and enabling Z-scoring:
Batch jobs

Threading jobs can be run as batch jobs on Biowulf using PBS (see the User Guide for more details). Here is a simple script (script.sh) that uses PSI-BLAST to generate a sequence profile for the input sequence, threads the profile against all databases, sorts the output, and then converts the output to an HTML file:
The script (script.sh) would then be submitted to the batch system using the command
swarm jobs

Multiple threading jobs can be launched using the swarm command on Biobos (see the swarm user guide for more details). Here are two simple scripts (ind.sh and command.sh) for threading multiple sequences (seq01.seq, seq02.seq, seq03.seq, seq04.seq, etc.) against the FSSP database. The results are then sorted by raw score, and the scores for the top 5 threading alignments are written as a table to the output. The script ind.sh sets up the environment and executes the threading job for each seq name:
The script command.sh runs ind.sh for each seq name:
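When there are many sequences, the command file can be generated rather than typed by hand. The following Python sketch assumes that each line of command.sh has the form `ind.sh <seq name>` and that the seq name is the file name without the .seq suffix. Both are assumptions made for illustration, and the helper names are hypothetical.

```python
from pathlib import Path
from typing import List


def swarm_lines(directory: str, script: str = "ind.sh") -> List[str]:
    """Build one command line per *.seq file, passing the name without its suffix."""
    names = sorted(p.stem for p in Path(directory).glob("*.seq"))
    # Assumed line format: "<script> <seq name>", one independent job per line.
    return [f"{script} {name}" for name in names]


def write_command_file(directory: str, out: str = "command.sh") -> Path:
    """Write the generated lines to the command file and return its path."""
    path = Path(directory) / out
    path.write_text("\n".join(swarm_lines(directory)) + "\n")
    return path
```

Sorting the names keeps the command file in a predictable order, which makes it easier to match each job to its output.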
command.sh would then be submitted to swarm using the command
Interactive jobs

PROSPECT can also be run interactively, but the job must be started from one of the nodes. To do this, first allocate a single node for running jobs:
Then set the environment variables and path on the node:
Now commands can be given directly from the prompt. A simple job would include generating a secondary structure prediction, followed by a threading run against the SCOP library:
Last, sort the results by raw score and replace the original output with the sorted output:
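The sort-and-replace step can be illustrated generically. The Python sketch below is a stand-in, not PROSPECT's own sorting tool. It assumes a hypothetical result file with no header line, one whitespace-separated record per line, and the raw score in a known column. The real PROSPECT output layout, and whether lower or higher raw scores rank first, should be checked against the PROSPECT documentation.

```python
import os
import tempfile


def sort_file_by_score(path, column=1, descending=False):
    """Sort the lines of `path` by the float in `column`, then replace the file."""
    with open(path) as fh:
        rows = [ln for ln in fh.read().splitlines() if ln.strip()]
    # Assumed layout: whitespace-separated fields, score in `column` (0-based).
    rows.sort(key=lambda ln: float(ln.split()[column]), reverse=descending)
    # Write to a temporary file in the same directory, then swap it in,
    # so an interrupted run never leaves a half-written result file.
    folder = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=folder)
    with os.fdopen(fd, "w") as fh:
        fh.write("\n".join(rows) + "\n")
    os.replace(tmp, path)
```

The first five lines of the sorted file then correspond to the top 5 alignments under whichever ordering was chosen.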
PROSPECT Template Databases

Important Notes:
Available options for PROSPECT

Please see the PROSPECT web site for all available options.
This page was last updated August 30, 2004.