Ensembl 113

Getting the wrong species alias for Ovis aries (Sheep) Texel

Affects: Live site
Expected fix: Ensembl 114
Description: In ensembl-vep, the species Latin name is expected to be resolved from its aliases using the Bio::EnsEMBL::Registry::get_alias() function:

https://github.com/Ensembl/ensembl-vep/blob/eb55c2293637c34a0f80dfca88c439e77dd52def/modules/Bio/EnsEMBL/VEP/BaseRunner.pm#L143

This is old code, but the intent is to be able to run VEP using “--species sheep”. The scientific name returned, which should be ovis_aries, is then used to locate the correct database or cache files. If “--species ovis_aries” is used (as is the case for web VEP), get_alias() should return ovis_aries. In e113, using “--species ovis_aries” returns “ovis_aries_rambouillet”.

Affected VEP areas:
1. VEP cache: the sheep Texel cache contains data dumped from the Rambouillet database.
2. Web VEP: retrieves the wrong cache and assembly version.
3. REST VEP: REST does not accept ovis_aries as a species name but does accept ovis_aries_rambouillet. Before e113, results for sheep Texel could be obtained by giving “sheep” as the species name, but “sheep” now returns results for Rambouillet, so results for sheep Texel cannot be obtained in e113.

Workaround: Please use data from the Ensembl 112 archive.
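
The expected alias behaviour described above can be modelled as a small self-check. This is a hedged sketch, not the Registry code: the alias table is a hypothetical plain dictionary standing in for the registry lookup, and `resolve_alias` and `self_alias_mismatches` are illustrative names. Only the species names and the result reported for e113 are taken from the description.

```python
def resolve_alias(alias_table, name):
    """Return the species name an alias maps to, or None if unknown.

    alias_table is a hypothetical dict standing in for the registry's
    alias lookup; it is not the real Bio::EnsEMBL::Registry API.
    """
    return alias_table.get(name.lower())


def self_alias_mismatches(alias_table, species_names):
    """List (species, resolved) pairs where a species name does not map to itself."""
    mismatches = []
    for species in species_names:
        resolved = resolve_alias(alias_table, species)
        if resolved != species:
            mismatches.append((species, resolved))
    return mismatches


# Expected behaviour: the common name and the scientific name both
# resolve to ovis_aries.
expected = {"sheep": "ovis_aries", "ovis_aries": "ovis_aries"}

# Behaviour reported for e113: ovis_aries resolves to the Rambouillet name.
observed_e113 = {"ovis_aries": "ovis_aries_rambouillet"}

print(self_alias_mismatches(expected, ["ovis_aries"]))
print(self_alias_mismatches(observed_e113, ["ovis_aries"]))
```

A check of this kind, run over every species name, would flag any species whose own name resolves to a different species.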

Incorrect source project label for regulation data in cow

Affects: Live site
Expected fix: Ensembl 114
Description: The source project label for cow regulation data (e.g. https://rc.ensembl.org/Bos_taurus/Regulation/Evidence?db=funcgen;fdb=funcgen;r=9:96621727-96624008;rf=ENSBTAR9_93FCH2) incorrectly lists ENCODE instead of FAANG.
Workaround: There is currently no workaround. However, this bug is not known to affect downstream analyses.

“Genotype: frequency (count)” does not match “Allele: frequency (count)”

Affects: Live site
Expected fix: Ensembl 114
Description: On the variant genotype page, the allele and genotype frequencies shown for a population do not agree. The allele frequency appears to be correct, but the genotype frequency is not consistent with it.
Workaround: The sample genotypes table also appears to be correct.
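
To illustrate what "matching" means here, the sketch below derives allele counts from genotype counts and compares them with reported allele counts. It is a minimal illustration under stated assumptions: genotypes are written as alleles separated by "|" or "/", the function names are hypothetical, and the numbers in the usage lines are invented for the example rather than taken from any Ensembl page.

```python
from collections import Counter


def allele_counts_from_genotypes(genotype_counts):
    """Derive allele counts from genotype counts.

    genotype_counts maps a genotype string such as "A|G" or "A/G"
    to the number of samples carrying it.
    """
    alleles = Counter()
    for genotype, n_samples in genotype_counts.items():
        # Each allele in the genotype is counted once per sample.
        for allele in genotype.replace("/", "|").split("|"):
            alleles[allele] += n_samples
    return alleles


def frequencies(counts):
    """Convert a mapping of counts to a mapping of frequencies."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}


def is_consistent(genotype_counts, allele_counts):
    """True if the allele counts equal those implied by the genotype counts."""
    return allele_counts_from_genotypes(genotype_counts) == Counter(allele_counts)


# Invented example: one A|A sample and one A|G sample.
genotypes = {"A|A": 1, "A|G": 1}
print(frequencies(allele_counts_from_genotypes(genotypes)))
print(is_consistent(genotypes, {"A": 3, "G": 1}))
```

If the displayed genotype counts imply different allele counts from those displayed in the allele row, the two rows disagree in the sense described above.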

Motif widget only shows human data

Affects: Live site
Expected fix: Ensembl 114
Description: The Motif widget only shows human data, even if it is accessed from other species’ pages.
Workaround: There is currently no workaround.

Alignment artefact in some ncRNA homology and gene-tree alignments

Affects: Live site, Ensembl 100, Ensembl 101, Ensembl 102, Ensembl 103, Ensembl 104, Ensembl 105, Ensembl 106, Ensembl 107, Ensembl 108, Ensembl 109, Ensembl 110, Ensembl 111, Ensembl 112
Expected fix: Ensembl 114
Description: An alignment artefact has been detected in ncRNA homology and gene-tree alignments (e.g. ENSBTAG00000045380: http://may2024.archive.ensembl.org/Bos_taurus/Gene/Compara_Tree?collapse=none;db=core;g=ENSBTAG00000045380). In affected cases, the alignment sequence is generated from an unflanked ncRNA sequence and the CIGAR line of a flanked genomic alignment. This may result in the original sequence being followed by a run of ‘N’ ambiguity symbols or, in some cases, in an alignment composed entirely of ‘N’ ambiguity symbols. As of Ensembl 112, this issue affected 193,110 ncRNA genes and 31,987 trees. The first mitigation steps were taken in Ensembl 113, in which the issue affects 75,360 genes and 8,805 trees.
Workaround: There is currently no workaround.
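
The mismatch between an unflanked sequence and a flanked CIGAR line can be illustrated with a toy expansion routine. This is only a sketch of the symptom, not the Compara implementation: it assumes a CIGAR line made solely of M (consume residues) and D (emit gaps) operations with an optional count, and padding with ‘N’ when the sequence runs out is an assumption made here to reproduce the trailing ‘N’ symbols. The function name and the example sequences are hypothetical, and the all-‘N’ case is not modelled.

```python
import re

# One cigar token: an optional count followed by M or D.
_CIGAR_TOKEN = re.compile(r"(\d*)([MD])")


def expand_cigar(sequence, cigar_line, pad="N"):
    """Build an aligned string from an unaligned sequence and a cigar line.

    Assumes a cigar made only of M (consume residues) and D (emit gaps),
    with an optional count that defaults to 1. Padding with `pad` when the
    sequence runs out is an assumption used to reproduce the symptom.
    """
    aligned = []
    pos = 0
    for count, op in _CIGAR_TOKEN.findall(cigar_line):
        length = int(count) if count else 1
        if op == "M":
            chunk = sequence[pos:pos + length]
            # If the cigar expects more residues than the sequence has,
            # fill the shortfall with the pad symbol.
            aligned.append(chunk + pad * (length - len(chunk)))
            pos += length
        else:
            aligned.append("-" * length)
    return "".join(aligned)


# A cigar that matches the unflanked sequence length.
print(expand_cigar("ACGT", "4M"))
# A cigar computed for a longer, flanked sequence.
print(expand_cigar("ACGT", "2M1D6M"))
```

In the second call the CIGAR line expects eight residues but only four are available, so the original sequence is followed by padding symbols.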

Age-of-Base file lacks data for chromosomes 1 to 7

Affects: Live site, Ensembl 111, Ensembl 112
Expected fix: Ensembl 114
Description: Due to a pipeline synchronisation issue, the Age-of-Base data file generated in Ensembl release 111 lacks data for chromosomes 1 through 7.
Workaround: For chromosomes 1-7, we recommend accessing the Age-of-Base resource via the Ensembl 110 archive site.

C-terminal ‘X’ missing from some Compara protein member sequences

Affects: Live site, Ensembl 112
Expected fix: Ensembl 114
Description: For some protein sequences whose CDS ends with a partial codon (e.g. ENSAPOG00000012172), the C-terminal ‘X’ which can be found in the translation in the core database was found to be absent from the sequence of the corresponding member in the Compara database. As of release 113, this issue affects 1,928 proteins in Vertebrates, 160 in Plants, 60 in Pan Compara, and 3 in Metazoa.
We aim to correct 95% of affected protein sequences in Ensembl release 114. The remaining affected protein sequences (in Aegilops tauschii, Oryza brachyantha, Oryza rufipogon and Triticum dicoccoides) will be fixed when we next update the protein trees for these species.
Workaround: There is currently no workaround.
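
For users who want to check whether a particular protein is affected, the comparison described above can be sketched as follows. This is an illustrative check under stated assumptions: the two sequences are assumed to have already been retrieved as plain strings from the core and Compara databases, the function names are hypothetical, and the example sequences are invented.

```python
def ends_with_partial_codon(cds):
    """True if the CDS length is not a multiple of three."""
    return len(cds) % 3 != 0


def is_missing_terminal_x(core_translation, compara_sequence):
    """True if the Compara sequence equals the core translation minus a final 'X'."""
    return (core_translation.endswith("X")
            and compara_sequence == core_translation[:-1])


# Invented example: a CDS of ten bases ends with a partial codon.
print(ends_with_partial_codon("ATGAAAACTG"))
# Invented example: the Compara member lacks the final 'X'.
print(is_missing_terminal_x("MKTX", "MKT"))
```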

Large number of ancient paralogies for Triticum aestivum Kariega in a Wheat cultivar supertree

Affects: Live site
Expected fix: Ensembl 114
Description: During preparation of Wheat cultivar protein trees for Ensembl release 113, it became apparent that supertree PTHR11439_SF127 has approximately 40,000 genes in Triticum aestivum cultivar Kariega. Generation of the 261 million ancient paralogies between these genes took two weeks, and it would not have been possible to include them in Ensembl Plants 113 without delaying the release.
Workaround: A cross-section of the relevant ancient paralogies was loaded during production, involving genes with stable IDs TraesKAR6A01G0279700, TraesKAR7B01G0056390, TraesKAR7B01G0463280 and TraesKAR1D01G0304830. Due to the large number of paralogies involved, these may fail to load correctly on the Ensembl Plants website, though it should be possible to access them via the Ensembl REST API and Compara Perl API.

A file containing data on these 40,000 genes is available for download at the following location on the Ensembl Genomes FTP site:
http://ftp.ensemblgenomes.org/pub/release-60/plants/emf/ensembl-compara/homologies/Compara.113.protein_wheat_cultivars.other_paralogies.cds.fasta.gz
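
To put the figures above in perspective, the number of unordered gene pairs grows quadratically with the number of genes. The short calculation below is a back-of-the-envelope sketch: it treats the approximate figure of 40,000 genes as exact and assumes that any two of those genes could in principle form a pair, which gives an upper bound rather than a prediction.

```python
import math

n_genes = 40_000                    # approximate figure from the description
reported_paralogies = 261_000_000   # "261 million" from the description

# Number of unordered pairs among n genes: n * (n - 1) / 2.
max_pairs = math.comb(n_genes, 2)

# Share of all possible pairs represented by the reported paralogies.
fraction = reported_paralogies / max_pairs

print(max_pairs)
print(round(fraction, 3))
```

Under these assumptions the reported paralogies amount to roughly a third of all possible pairs between the Kariega genes in this supertree.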

Issues affecting Cactus alignments

Affects: Live site
Expected fix: Timeline currently unknown
Description: Long loading times were observed during testing of several Cactus alignments, so the following measures were adopted to mitigate them:
1. Cactus alignments accessed via the API are filtered to remove shorter alignment blocks. This may cause Cactus alignments to appear sparse when viewed at larger scales, so please view Cactus alignments at the smallest scale that allows for viewing the feature(s) of interest.
2. In the image alignment view, gene tracks are switched off by default for all but the current genome. In addition, alternative gene model tracks are switched off by default in the 16-Wheat Cactus alignment. Please switch on gene tracks as needed using the configuration bar at the top of the alignment view.
3. The 16-Wheat Cactus alignment was loaded separately for each component (A, B and D), and non-polyploid genomes were included in the alignments for all three components. As a result, particularly long loading times were observed when viewing this alignment from non-polyploid genomes, so the Wheat Cactus alignment has been removed from the alignment selection dropdown menu for non-polyploid genomes on the Ensembl Plants website. These alignments can still be viewed from the perspective of polyploid genomes.
In an issue unrelated to loading time, the 16-Wheat Cactus alignment has in effect three species trees (one per genome component), so it has been given a single dummy species tree, hidden from view on the Ensembl Plants website, to facilitate access to the alignment. The full Wheat Cactus species tree can be accessed via the Compara API or from the Plants species-tree location on the Ensembl Genomes FTP site: http://ftp.ensemblgenomes.org/pub/release-60/plants/compara/species_trees/16_wheat_Cactus_default.nh
We aim to address these issues more comprehensively in Ensembl 114 and subsequent releases.
Workaround: There is currently no workaround.
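
The effect of removing shorter alignment blocks can be pictured with a minimal length filter. This sketch is not the Ensembl implementation: `AlignmentBlock`, `filter_short_blocks` and the threshold value are hypothetical, and the actual filtering criteria are not stated in the description.

```python
from dataclasses import dataclass


@dataclass
class AlignmentBlock:
    """Hypothetical minimal alignment block: inclusive start and end on the reference."""
    start: int
    end: int

    @property
    def length(self):
        return self.end - self.start + 1


def filter_short_blocks(blocks, min_length):
    """Keep only blocks at least min_length long (the threshold is hypothetical)."""
    return [block for block in blocks if block.length >= min_length]


# Invented example: one 50 bp block and one 1,000 bp block.
blocks = [AlignmentBlock(1, 50), AlignmentBlock(100, 1099)]
print(filter_short_blocks(blocks, min_length=100))
```

With many short blocks dropped, a wide region can look sparsely aligned even though longer blocks remain, which is why viewing at a smaller scale is recommended.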

Incorrect allele and genotype frequency in Watkins Collection

Affects: Ensembl Plants 60
Expected fix: Ensembl Plants 61
Description: Approximately 1% of the wheat variants loaded from the Watkins Collection have incorrect allele and genotype frequencies. This results from sample overlap between the Axiom and the Watkins Collection.
Workaround: There is currently no workaround.