We’re pleased to announce the release of Ensembl 101, and the corresponding release of Ensembl Genomes 48. The highlights of this release include an update of the human gene annotation and new population frequency data along with 39 new genomes including a new sheep reference and crop cultivars.
Major data updates for human
Ensembl release 101 brings us up to GENCODE 35 for human. We also introduce new population frequency data from the Gambian Genome Variation Project (GGVP), which can be found in the “Population genetics” view on the variant tab.
New genomes
We present a sizable addition of 39 new genomes including new sheep and goat breeds, plus barley and five wheat cultivars as we continue to improve our resources on crops. Among the new additions, you’ll also find a millimeter-sized crustacean model organism alongside the largest animal known to have ever existed! Can you guess which one?
Six of the new genomes were generated by the Vertebrate Genomes Project (VGP), which aims to sequence the genomes of every vertebrate on Earth. Our job within this long-term global collaboration is to lead the gene annotation effort. The new species from the VGP are (from top left to bottom right): Eurasian red squirrel (a collaboration between the Sanger 25 Genomes Project and the VGP), vaquita, lumpfish, California sea lion, Swainson’s thrush and blue whale.
A few honourable mentions among this release’s quirky additions:
Here’s the list of our remaining brand new species in Ensembl 101/Ensembl Genomes 48:
New agricultural animal breeds and crop cultivars
- New sheep breed: Ovis aries (Rambouillet sheep) – new sheep reference!
- New goat breed: Capra hircus (Black Bengal goat)
- New barley cultivar: Hordeum vulgare Golden Promise
- Five new wheat cultivars:
New sheep reference genome
The sheep reference has changed to the new Rambouillet breed, but we still support Texel as an alternative breed. Rambouillet will have a new set of Ensembl identifiers, while Texel will keep its existing identifiers. Variation data will only be available on the Texel assembly in the upcoming release, so make sure you’ve chosen the right breed if you’d like to explore sheep variation data. The Texel breed can be accessed by choosing “sheep breeds” in the species selector drop-down list on the homepage or, alternatively, by filtering the full list of all species. The default “sheep” will now refer to the Rambouillet breed, so keep that in mind when running your Ensembl VEP jobs! The Texel Oar_v3.1 assembly will remain accessible in BioMart.
New data and display for plants
We’ve got more updates in the Plant Kingdom! We added a set of almost 12,000 sunflower SNPs called in three pre-breeding collections that capture global genetic diversity, and renamed the almond genes, whose names now start with the ‘Prudul’ prefix, as advised by CNAG. The phenotype data displayed for Arabidopsis thaliana has been restricted to associations with significant p-values to avoid display issues. Last but not least, we added chromosome-specific KASP markers for wheat from the Nottingham BBSRC Wheat Research Centre (WRC).
Variation and genotype frequency data in three sunflower pre-breeding collections belonging to INTA (Argentina), INRA (France) and USDA–UBC (United States of America & Canada).
Other updates and changes
- The EPO low coverage alignment has been renamed EPO extended to better reflect our alignment strategy
- The Wellcome Sanger Institute Genome Editing (WGE) CRISPR Cas9 track has been updated to include the percentage of in-frame mutations and the favoured mutations, as predicted by FORECasT. The track can now be found at the top level of “Configure this page”, under “Genome targeting”.
- The may2015.archive.ensembl.org and july2015.archive.ensembl.org archive sites (Ensembl 80 and 81) have been retired