In its latest release, Ensembl has completely revised its reporting of potential Transcription Factor (TF) binding sites. TF proteins are key players in the regulation of gene expression. They bind to specific DNA regions characterised by approximate sequence patterns, known as transcription factor binding motifs (TFBMs). These motifs are generally represented as a Position Specific Frequency Matrix, or Binding Matrix. Ensembl scans genomes for occurrences of these motifs and reports a Motif Feature at each possible location.
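To make the idea of scanning a sequence with a Binding Matrix concrete, here is a minimal sketch in Python. It is not the Ensembl pipeline: the matrix `PFM`, the pseudocount, the uniform background and the score threshold are all illustrative assumptions, and only the forward strand is scanned.

```python
import math

# Hypothetical frequency matrix: observed base counts at each of 4 positions.
PFM = {
    "A": [8, 0, 0, 1],
    "C": [0, 9, 0, 1],
    "G": [1, 0, 10, 0],
    "T": [1, 1, 0, 8],
}


def log_odds(pfm, pseudocount=0.5, background=0.25):
    """Convert counts into per-position log2-odds scores against a uniform background."""
    width = len(pfm["A"])
    matrix = []
    for i in range(width):
        total = sum(pfm[base][i] for base in "ACGT") + 4 * pseudocount
        matrix.append({
            base: math.log2(((pfm[base][i] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, window, score) for every forward-strand window scoring >= threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


hits = scan("TTACGTAA", log_odds(PFM), threshold=4.0)
```

Each hit corresponds loosely to what the post calls a Motif Feature: a location where the sequence matches the motif well enough. A production scanner would also handle the reverse strand and choose thresholds more carefully than the fixed cut-off assumed here.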
New Motifs
We have extended our characterisation of TF binding by importing 632 human and 85 mouse TFBMs derived through SELEX. This new collection greatly expands our repertoire of known motifs and covers a significant fraction of all known transcription factors.
The new motifs have been mapped onto putative regulatory elements of the Ensembl Regulatory Build using MOODS. In a given epigenome, for each Transcription Factor, if a ChIP-seq data set is also available, we annotate each Motif Feature as either experimentally verified or unverified, depending on whether it fully overlaps a ChIP-seq peak. If there is no ChIP-seq data available, all Motif Features are considered unverified. Since ChIP-seq experiments are epigenome specific, the Motif Feature annotation varies across the different epigenomes. In mouse, our resources do not contain ChIP-seq data sets for any of the new Transcription Factor Binding Matrices; therefore, all mouse Motif Features are considered unverified.
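The verification rule described above can be sketched as follows. This is an illustration only: the `Interval` type, the function names and the coordinates are hypothetical, and "fully overlap" is interpreted here as the motif lying entirely inside a peak for the same Transcription Factor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int  # 1-based, inclusive (an assumption for this sketch)
    end: int    # inclusive


def fully_within(motif, peak):
    """True if the motif lies entirely inside the peak on the same chromosome."""
    return (
        motif.chrom == peak.chrom
        and peak.start <= motif.start
        and motif.end <= peak.end
    )


def annotate(motifs, peaks_by_epigenome):
    """Label each motif per epigenome; an epigenome with no peaks yields only 'unverified'."""
    result = {}
    for epigenome, peaks in peaks_by_epigenome.items():
        result[epigenome] = {
            motif: "verified" if any(fully_within(motif, p) for p in peaks) else "unverified"
            for motif in motifs
        }
    return result


inside = Interval("1", 100, 110)
partial = Interval("1", 195, 205)
labels = annotate(
    [inside, partial],
    {"epigenome_A": [Interval("1", 90, 200)], "epigenome_B": []},
)
```

Here `inside` is verified in `epigenome_A` only, while `partial` straddles the peak boundary and stays unverified everywhere, which mirrors how the same Motif Feature can carry different annotations in different epigenomes.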
New Binding Matrix visualisation
We developed a new visualisation for the sequence logos of regulatory motifs, which is simpler and more accurate. Rather than the commonly-used stretched base visualisation with coloured letters, this uses solid blocks of colour to represent the information content at each base. We chose this new display because it scales well, both horizontally and vertically, without losing legibility. Data can be downloaded from the image and the image itself can be exported in SVG format to enable reuse and integration into publications and presentations.
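For readers unfamiliar with information content, the sketch below shows one common way to compute it for a single matrix column: two bits minus the Shannon entropy of the base frequencies, assuming a uniform background and no small-sample correction. The post does not state the exact formula used for the display, so treat this as an assumption rather than a description of the Ensembl implementation.

```python
import math


def information_content(column):
    """Information content in bits for one position, given base counts for A, C, G and T."""
    total = sum(column.values())
    entropy = 0.0
    for count in column.values():
        if count > 0:  # a zero count contributes nothing to the entropy
            p = count / total
            entropy -= p * math.log2(p)
    return math.log2(4) - entropy


conserved = information_content({"A": 10, "C": 0, "G": 0, "T": 0})  # fully conserved position
two_way = information_content({"A": 5, "C": 5, "G": 0, "T": 0})     # two equally likely bases
uniform = information_content({"A": 5, "C": 5, "G": 5, "T": 5})     # no preference
```

Under these assumptions a fully conserved position carries 2 bits, a position split evenly between two bases carries 1 bit, and a position with no base preference carries 0 bits.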
Where can I find this data?
On the “Location View”, click on “Configure this page” and then select the “Configure Region Image” tab. Under “Regulation”, select “Other regulatory regions” and enable the “Motif Features” track. A new track containing all Motif Features in the region will be displayed, highlighting verified (black) and unverified (grey) motifs.
In human, experimentally verified Motif Features are also displayed in the Epigenome activity tracks.
Stable IDs
Binding Matrices and Motif Features have been given stable identifiers. In human, they look like ENSPFMXXXX (Position Frequency Matrix) and ENSMXXXXXXXXXXX (Motif) respectively. Mouse follows a similar pattern (ENSMUSPFMXXXX and ENSMUSMXXXXXXXXXXX).
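As a rough illustration, the identifier patterns above can be recognised with a regular expression. This sketch assumes that each X stands for exactly one digit and that the number of X characters shown is the exact length; neither is confirmed by the text, and the function name `classify` is hypothetical.

```python
import re

# Assumption: "X" is a single digit, with 4 digits for matrices and 11 for motifs.
STABLE_ID = re.compile(r"^ENS(?P<mouse>MUS)?(?:PFM(?P<matrix>\d{4})|M(?P<motif>\d{11}))$")


def classify(stable_id):
    """Return (species, kind) for a recognised stable ID, or None otherwise."""
    match = STABLE_ID.match(stable_id)
    if match is None:
        return None
    species = "mouse" if match.group("mouse") else "human"
    kind = "binding_matrix" if match.group("matrix") else "motif_feature"
    return species, kind


human_matrix = classify("ENSPFM0001")
mouse_motif = classify("ENSMUSM" + "0" * 10 + "1")
not_a_motif = classify("ENSG00000139618")
```

A check like this can help route identifiers in a script, but it should be relaxed if the real identifiers turn out to use a different number of digits.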