Biowulf at the NIH
VEP

Description

VEP (Variant Effect Predictor) determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.

How to Use

Multiple versions of VEP are installed. The easiest way to select one is with environment modules. To see the available versions, type

module avail VEP

To select a version, type

module load VEP/[ver]

where [ver] is the version of choice.

Please note that by default VEP requires internet connectivity to the Ensembl databases. THIS IS NOT POSSIBLE ON THE BIOWULF CLUSTER! Instead, the databases have been cached locally in a version-specific directory ($VEPCACHEDIR, set by the VEP module), which allows offline analysis.

To use the local cache, include these options in every command:

--offline --cache --dir_cache $VEPCACHEDIR
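
Because these options depend on $VEPCACHEDIR, which is only set once a VEP module is loaded, it can help to check the variable before building a command. The sketch below is one possible way to do that; the function name vep_offline_opts is hypothetical and not part of VEP or the Biowulf module.

```shell
# Hypothetical helper (not part of VEP): print the offline options
# described above, or fail if no VEP module has set VEPCACHEDIR.
vep_offline_opts() {
    if [ -z "${VEPCACHEDIR:-}" ]; then
        echo "VEPCACHEDIR is not set; load a VEP module first" >&2
        return 1
    fi
    # Assumes the cache path contains no spaces, since callers
    # typically use the output unquoted.
    printf '%s\n' "--offline --cache --dir_cache $VEPCACHEDIR"
}
```

After module load VEP/[ver], a command could then be written as variant_effect_predictor.pl -i input.vcf $(vep_offline_opts) --output input.out, where input.vcf and input.out are placeholder file names.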

Commands

  • convert_cache.pl
  • filter_vep.pl: filter the output of VEP
  • gtf2vep.pl: create a VEP cache from a GTF file
  • variant_effect_predictor.pl: predict effect of variants

Sample Swarm

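NOTE: By default, variant_effect_predictor.pl writes to the same output file ("variant_effect_output.txt") on every run unless directed otherwise with the --output option. For swarms of multiple runs, be sure to include this option so that the runs do not overwrite each other's results.

The example below shows the contents of a swarm file (here named swarmfile), one command per line, followed by the swarm command that submits it.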

variant_effect_predictor.pl -i trial1.vcf --offline --cache --dir_cache $VEPCACHEDIR --fasta $VEPCACHEDIR/human.fa --output trial1.out
variant_effect_predictor.pl -i trial2.vcf --offline --cache --dir_cache $VEPCACHEDIR --fasta $VEPCACHEDIR/human.fa --output trial2.out
variant_effect_predictor.pl -i trial3.vcf --offline --cache --dir_cache $VEPCACHEDIR --fasta $VEPCACHEDIR/human.fa --output trial3.out
variant_effect_predictor.pl -i trial4.vcf --offline --cache --dir_cache $VEPCACHEDIR --fasta $VEPCACHEDIR/human.fa --output trial4.out

swarm --module VEP --file swarmfile
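
For more than a handful of inputs, the swarm file can be generated rather than typed by hand. The following is a minimal sketch, assuming the input files end in .vcf and have names without spaces; the function name write_vep_swarm is hypothetical, and the command line it writes simply mirrors the sample above.

```shell
# Hypothetical helper: write one VEP command per VCF file into a swarm file.
# Usage: write_vep_swarm SWARMFILE file1.vcf [file2.vcf ...]
write_vep_swarm() {
    vs_swarmfile=$1
    shift
    : > "$vs_swarmfile"                 # start with an empty swarm file
    for vs_vcf in "$@"; do
        vs_out="${vs_vcf%.vcf}.out"     # trial1.vcf -> trial1.out
        # \$VEPCACHEDIR is written literally, so it is expanded when the
        # job runs, after swarm has loaded the VEP module.
        printf '%s\n' "variant_effect_predictor.pl -i $vs_vcf --offline --cache --dir_cache \$VEPCACHEDIR --fasta \$VEPCACHEDIR/human.fa --output $vs_out" >> "$vs_swarmfile"
    done
}
```

Running write_vep_swarm swarmfile trial*.vcf in the directory holding the inputs would produce a file equivalent to the sample, which can then be submitted with the swarm command shown above. Giving each line its own --output name is what keeps the runs from overwriting one another.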

Documentation