We have made a number of changes to the Ensembl Compara Perl API and the comparative genomics REST API endpoints. These changes allow Ensembl to handle non-unique member stable IDs when retrieving comparative genomics data via the APIs. They will come into effect in Ensembl release 110, and all old endpoints will be retired in release 112. Please continue reading if you regularly retrieve comparative genomics data via the Ensembl APIs.
We have made some changes to the Ensembl and Compara code to make it possible to deal with non-unique stable IDs of genes and other features. These changes were prompted by case-insensitive stable ID clashes between genes in Lingula anatina and Crassostrea gigas (e.g. L. anatina gene ‘g30265’ versus C. gigas gene ‘G30265’), which were seen during comparative genomics processing for Ensembl Metazoa release 107. While the Ensembl stable IDs of each genome annotated in-house are allocated a distinct prefix with the aim of ensuring that they are unique across Ensembl resources, this convention is not necessarily followed by identifiers in imported community annotations such as those of L. anatina and C. gigas. Gene identifiers in imported annotations are preserved where possible, while requiring that each identifier is unique within the gene set, but it is possible that the same ID may be used in different community annotations. A key implication of this is that, although we can reasonably assume that gene stable IDs are unique within a genome, we can no longer assume that they are unique within an Ensembl division.
Because a stable ID cannot uniquely identify a gene in an Ensembl Compara database, the Compara Perl API method MemberAdaptor::fetch_by_stable_id (which fetches a gene or sequence member by its stable ID) has been deprecated and replaced by MemberAdaptor::fetch_by_stable_id_GenomeDB (which requires both a stable ID and a GenomeDB). The deprecated method will be removed from the Compara Perl API in Ensembl 110, so it is strongly recommended to start using its replacement method if you have not done so already. Please check out the Compara Perl API tutorial for examples on how to use the new method.
A stable ID may also be shared by genes in different Ensembl divisions (e.g. plant species Arabidopsis halleri has a gene with stable ID ‘g30265’). There is potential for such stable IDs to clash in the Ensembl REST API, where multiple Ensembl divisions may be accessed through the same interface, so we have made changes there too.
Three Ensembl comparative genomics REST endpoints are to be deprecated in Ensembl 110, with each being replaced by a new endpoint requiring the name of the species in addition to the stable ID:
| Old endpoint | New endpoint |
| --- | --- |
| GET cafe/genetree/member/id/:id | GET cafe/genetree/member/id/:species/:id |
| GET genetree/member/id/:id | GET genetree/member/id/:species/:id |
| GET homology/id/:id | GET homology/id/:species/:id |
It is advisable to switch to the replacement endpoints as soon as practicable because the old endpoints are scheduled to be removed in Ensembl release 112.
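As a rough illustration of the migration, the following Python sketch rewrites an old-style request path into its replacement by inserting the species name before the stable ID. The helper name new_endpoint_path is hypothetical, and the species name format (lowercase with underscores, e.g. crassostrea_gigas) is an assumption; the endpoint prefixes and the example stable ID are taken from this post.

```python
from urllib.parse import quote

# Prefixes of the three deprecated endpoints; each replacement endpoint
# inserts the species name between the prefix and the stable ID.
ENDPOINT_PREFIXES = (
    "cafe/genetree/member/id",
    "genetree/member/id",
    "homology/id",
)


def new_endpoint_path(old_path: str, species: str) -> str:
    """Rewrite 'prefix/:id' as 'prefix/:species/:id' (hypothetical helper)."""
    path = old_path.strip("/")
    for prefix in ENDPOINT_PREFIXES:
        if path.startswith(prefix + "/"):
            stable_id = path[len(prefix) + 1:]
            # Reject empty IDs and paths that already carry an extra segment.
            if not stable_id or "/" in stable_id:
                raise ValueError(f"unexpected path shape: {old_path!r}")
            return f"{prefix}/{quote(species, safe='')}/{stable_id}"
    raise ValueError(f"not a deprecated comparative genomics endpoint: {old_path!r}")


if __name__ == "__main__":
    # Species name format is an assumption; the stable ID is from the example above.
    print(new_endpoint_path("homology/id/G30265", "crassostrea_gigas"))
```

A path that does not match one of the three deprecated prefixes, or that already contains a species segment, raises a ValueError rather than being rewritten silently.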
If you use the deprecated endpoints during the transition period, you can reduce the chances of a stable ID clash by setting the species parameter for any endpoint supporting it. For queries of all comparative genomics REST endpoints, new and old, it is recommended to specify an appropriate compara parameter (e.g. metazoa or plants). If you have any questions about these endpoint changes, please contact us on the Ensembl Helpdesk.
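The recommendation to set the species and compara parameters can be sketched in Python as below. This only builds request URLs and does not contact the server. The helper name compara_url is hypothetical, the species name format (lingula_anatina) is an assumption, and the content-type query parameter is included on the assumption that JSON output is wanted; rest.ensembl.org is the public Ensembl REST server.

```python
from urllib.parse import quote, urlencode

REST_SERVER = "https://rest.ensembl.org"  # public Ensembl REST server


def compara_url(path_segments, params):
    """Join path segments and query parameters into a request URL (hypothetical helper)."""
    path = "/".join(quote(segment, safe="") for segment in path_segments)
    return f"{REST_SERVER}/{path}?{urlencode(params)}"


# New-style endpoint: the species is part of the path, and compara selects the division.
new_style = compara_url(
    ["homology", "id", "lingula_anatina", "g30265"],
    {"compara": "metazoa", "content-type": "application/json"},
)

# Deprecated endpoint during the transition period: pass species as a
# query parameter (where the endpoint supports it) alongside compara.
old_style = compara_url(
    ["homology", "id", "g30265"],
    {"species": "lingula_anatina", "compara": "metazoa", "content-type": "application/json"},
)

if __name__ == "__main__":
    print(new_style)
    print(old_style)
```

Either URL could then be fetched with any HTTP client; only the first form will remain valid once the old endpoints are removed in release 112.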
Author: Thomas Walsh
Editors: Fergal Martin, Jorge Alvarez Jarreta, Louisse Paola Mirabueno