After running our brand new Ensembl Regulatory Build on the human GRCh38 and GRCh37 assemblies, we spent some time revamping our Regulation mart to make it faster and easier to use, and to pack it with brand new features. We completely redesigned the mart behind the scenes so that it performs better and can cope with the growing volume of data. The new Regulation mart can still be found on the Ensembl website under the BioMart tab.
The first thing you will notice is that you can now access each regulation data type separately. This lets you get the data quickly and keeps the filter and attribute sections neater.
Each Regulation section holds its own type of data. For example, you can get the following data for human GRCh38:
- Binding Motifs (TFBS Annotation)
  - Binding Matrix ID (e.g. MA0003.1)
  - Feature type data (e.g. BHLHE40, CTCFL,…)
- Other Regulatory Regions
  - Feature type data (e.g. FANTOM predictions, VISTA Enhancers)
  - Identifiers (e.g. hs1, 1:922877-923268,…)
- Regulatory Evidence (Regulatory Build Information)
- Regulatory Features (Regulatory Build Information)
  - Ensembl Regulatory Stable ID (e.g. ENSR00001516677)
  - Feature type (e.g. CTCF Binding Site, Enhancer, Open chromatin, Promoter, Promoter Flanking Region, TF binding site)
  - Cell type (e.g. A549, DND-41, GM12878, H1ESC,…)
- Regulatory Segments (Segmentation Information)
  - Feature type (e.g. CTCF enriched, Predicted Enhancer, Predicted heterochromatin, Predicted low activity, Predicted Promoter flank, Predicted Promoter with TSS, Predicted Repressed, Predicted Transcribed Region)
  - Cell type (e.g. HeLa-S3, HepG2)
- miRNA Target Regions (TarBase miRNA target predictions)
  - miRNA Identifier ID (e.g. hsa-miR-124-3p, hsa-miR-122-5p,…)
  - miRNA Accession ID (e.g. MIMAT0000069, MIMAT0000646,…)
Binding Motifs, Regulatory Evidence, Regulatory Features and miRNA Target Regions data are also available for mouse, and Other Regulatory Regions data is available for both mouse and fruit fly.
In addition to the above, we have added the following brand new information:
- Band Start/End, Marker Start/End and ENCODE Pilot Regions filters to the six Regulation data sections
- SO name and accession to the six Regulation data sections
- EFO Term accession to the Regulatory Evidence, Regulatory Features and Regulatory Segments sections
- “Has evidence”, which denotes whether or not Regulatory Features have supporting evidence in a particular cell type
- Chromosome Strand and Evidence to miRNA Target Regions
Working with R?
Did you know that you can access all our marts using the biomaRt Bioconductor R package?
To do this, first install the Bioconductor biomaRt package: http://bioconductor.org/packages/release/bioc/html/biomaRt.html.
The following R code will then give you the chromosome location and scores for the human GRCh38 Binding matrix ID MA0005.2:
> library(biomaRt)
> ensembl_regulation = useMart(biomart="ENSEMBL_MART_FUNCGEN", host="www.ensembl.org", dataset="hsapiens_motif_feature")
> binding_matrix = getBM(attributes=c('binding_matrix_id', 'chromosome_name', 'chromosome_start', 'chromosome_end', 'chromosome_strand', 'score'), filters='motif_binding_matrix_id', values="MA0005.2", mart=ensembl_regulation)
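If you are not sure which filters and attributes a Regulation dataset offers, including the new ones listed earlier, biomaRt can list them for you. The sketch below reuses the mart, host and dataset names from the query above. The search pattern "band" is only a guess at how the Band Start/End filters might be named, so treat it as a starting point rather than the definitive name.

```r
library(biomaRt)

# Connect to the same mart and dataset as in the query above
ensembl_regulation <- useMart(biomart = "ENSEMBL_MART_FUNCGEN",
                              host = "www.ensembl.org",
                              dataset = "hsapiens_motif_feature")

# Both calls return a data frame with 'name' and 'description' columns
filters <- listFilters(ensembl_regulation)
attributes <- listAttributes(ensembl_regulation)

# Look for band-related filters; the pattern "band" is an assumption,
# adjust it after browsing the full 'filters' table
filters[grepl("band", filters$name, ignore.case = TRUE), ]

# Show the first few attributes available for this dataset
head(attributes)
```

The values in the `name` column are what you pass to the `filters` and `attributes` arguments of `getBM()`.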
The new Regulation mart is available for human, mouse and fruit fly on both www.ensembl.org and www.grch37.ensembl.org.
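To work with the GRCh37 data from R, one option is to point biomaRt at the GRCh37 site and ask which datasets it serves. This is a minimal sketch that assumes the mart is also called ENSEMBL_MART_FUNCGEN on that site, which is not stated above. It deliberately selects no dataset, so no dataset names are assumed.

```r
library(biomaRt)

# Connect to the Regulation mart on the GRCh37 site without choosing a dataset.
# Assumption: the mart name matches the one used on www.ensembl.org.
grch37_regulation <- useMart(biomart = "ENSEMBL_MART_FUNCGEN",
                             host = "www.grch37.ensembl.org")

# List the datasets this mart offers, with their descriptions
grch37_datasets <- listDatasets(grch37_regulation)
grch37_datasets[, c("dataset", "description")]
```

Any name from the `dataset` column can then be passed to `useMart()` through its `dataset` argument.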