The Variant Call Format (VCF) has become the standard portable format for variation data. Ensembl provides several tools for handling VCF files, such as the ability to upload and view VCF data in the genome browser, and the Variant Effect Predictor (VEP) for predicting the functional consequences of variants.
We now provide an easy-to-use Perl script that allows you to create and populate an Ensembl Variation database from a VCF file.
Using the script, you can create a database that is ready for use on your own Ensembl website mirror, as well as with Ensembl’s Perl API. Through these, you have access to our powerful tools, such as:
- visualising your data as a track
- resequencing displays
- linkage disequilibrium calculations
- functional consequence predictions, including SIFT and PolyPhen
You can either build a new database from scratch, or add data on top of an existing database (for example, a copy of one of our databases downloaded from our FTP servers).
All you need are the Ensembl API modules and a MySQL server that you can write to!
The script is located in the ensembl-variation API module, in the scripts/import/ sub-directory.
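If you have not worked with VCF before, the sketch below shows what a single data line contains before any import takes place. It is written in Python purely for illustration and is not the logic of the Perl import script. The names `VcfRecord` and `parse_vcf_line` are hypothetical, and the sample line follows the layout of the eight fixed VCF columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO).

```python
from typing import Dict, List, NamedTuple


class VcfRecord(NamedTuple):
    """Hypothetical container for the fixed columns of one VCF data line."""
    chrom: str
    pos: int
    ids: List[str]
    ref: str
    alts: List[str]
    info: Dict[str, str]


def parse_vcf_line(line: str) -> VcfRecord:
    """Split one tab-delimited VCF data line into its fixed columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF data line needs at least 8 tab-separated columns")
    chrom, pos, id_field, ref, alt, _qual, _filter, info_field = fields[:8]

    # "." is the VCF placeholder for a missing value.
    ids = [] if id_field == "." else id_field.split(";")
    alts = [] if alt == "." else alt.split(",")

    # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags.
    info: Dict[str, str] = {}
    if info_field != ".":
        for item in info_field.split(";"):
            key, _, value = item.partition("=")
            info[key] = value  # bare flags get an empty string as value

    return VcfRecord(chrom, int(pos), ids, ref, alts, info)


if __name__ == "__main__":
    example = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB"
    print(parse_vcf_line(example))
```

Header lines (those starting with `#`) and per-sample genotype columns are deliberately left out of this sketch.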