What’s coming in Ensembl 94 / Ensembl Genomes 41

We’re planning to release Ensembl 94 and Ensembl Genomes 41 in September.

We’ve got some exciting new genomes, including Emmer wheat, lots of fish and fungi. We’ve also got GENCODE updates for human and mouse, and new transcription factor binding motifs.

New assemblies and gene annotation

  • Updated genes: human, incorporating Ensembl automatic and Havana manual gene annotation, to GENCODE 29.
  • Updated genes: mouse, to GENCODE M19.
  • New genome: Emmer wheat (Triticum dicoccoides). Emmer wheat is the AABB precursor to bread wheat (Triticum aestivum) and is an important crop plant in its own right, due to its usage in pasta.
  • New genomes: 38 new fish genomes, including Japanese and Indian medaka (Oryzias latipes Hd-rR, HNI and HSOK and Oryzias melastigma HK-1), Eastern happy (Astatotilapia calliptera) and climbing perch (Anabas testudineus).
  • Updated genomes: new genome assemblies for fugu (Takifugu rubripes), cave fish (Astyanax mexicanus) and southern platyfish (Xiphophorus maculatus).
  • New genomes: import of over 200 new fungal genomes.
  • Updated genes: bakers’/brewers’ yeast (Saccharomyces cerevisiae).
  • New genome: Mung bean (Vigna radiata), used extensively in many Asian cuisines.

Other highlights and updates

  • Updated data: transcription factor binding motifs in human and mouse will be updated.
  • New page: a new interface for viewing transcription factor binding motifs in human and mouse.
  • Updated data: pan-taxonomic compara, comparing genes between representative species in all the divisions of Ensembl, will be updated to include new gene annotation from Leishmania major, Physcomitrella patens, Chlamydomonas reinhardtii, Aspergillus nidulans, Saccharomyces cerevisiae and Danio rerio.
  • Updated page: we will add further predictors of missense variant pathogenicity: CADD, REVEL and MutationAssessor. These are already available through the VEP and will be added to the variant web pages.
  • Updated pipelines: our Compara gene tree and homology pipeline has shifted over to using hidden Markov models (HMMs) in vertebrates and plants, with other species sets soon to follow. This will significantly change the trees and orthologues for this release, but will allow us to add new genes to the trees in future releases without changes to the existing trees.
  • New data: variant genotypes from six diverse horse breeds.
  • Updated data: host-pathogen interaction data for microbial genes.

There is no guarantee that any of these features will make it into the release.

Addendum

In previous releases, we have assigned the consequence types ‘TF_binding_site_variant’, ‘TFBS_ablation’, ‘TFBS_fusion’, ‘TFBS_amplification’ and ‘TFBS_translocation’ to human variants which overlapped motif features from the Ensembl Regulatory Build. These descriptions have appeared in the ‘Variant’ table in the Location view and the ‘Motif feature consequences’ on individual variant pages. Variants have been coloured according to this status on the tracks in the Region in detail view. These annotations will not be available in Ensembl release 94.

Motif feature information will also be absent from the VEP cache. A new plugin will be provided to identify variants which overlap motif features.
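
If you have scripts that filter on the consequence types listed above, it may be worth checking where they are used before release 94. The following is a minimal sketch, not an Ensembl tool: the function name, the example records and the variant names are hypothetical, and it assumes that multiple consequence terms in a field are separated by "," or "&".

```python
# Consequence terms named in this post as unavailable in release 94.
RETIRED_TERMS = {
    "TF_binding_site_variant",
    "TFBS_ablation",
    "TFBS_fusion",
    "TFBS_amplification",
    "TFBS_translocation",
}


def retired_terms_used(consequence_field):
    """Return the retired terms found in a consequence string.

    Assumption: terms are separated by ',' or '&'.
    """
    terms = {t.strip() for t in consequence_field.replace("&", ",").split(",")}
    return sorted(terms & RETIRED_TERMS)


# Hypothetical records: (variant name, consequence field).
records = [
    ("var_a", "missense_variant"),
    ("var_b", "TF_binding_site_variant&intron_variant"),
    ("var_c", "TFBS_ablation,TF_binding_site_variant"),
]

# Keep only the variants that carry at least one retired term.
affected = {}
for name, consequence in records:
    found = retired_terms_used(consequence)
    if found:
        affected[name] = found

print(affected)
```

Any variant reported here relies on an annotation that will no longer be produced, so the corresponding filter would need to switch to the new plugin once it is available.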