Interpreting non-coding variants is more challenging than interpreting coding variants because fewer prediction methods and less reference data are available. In addition to the annotation provided for human and mouse in the Ensembl Regulatory Build, the Ensembl Variant Effect Predictor (VEP) integrates two further human-specific datasets that describe how variants can affect gene expression. Both are provided as plugins, satMutMPRA and FunMotifs, for use with command-line VEP. satMutMPRA provides detailed information on how variants in the regulatory regions of disease-associated genes affect expression; FunMotifs provides an alternative set of genome-wide transcription factor binding motifs.
The satMutMPRA study used saturation mutagenesis of functionally characterised regulatory elements to investigate the effect of genomic variation on the expression of 19 genes. For 20 known human promoters and enhancers, all possible single nucleotide substitutions and deletions (30k in total) were introduced across each element, and reporter gene expression was then measured. The results (any increase or decrease in expression, plus p-values) can be incorporated into VEP output, providing important evidence for the interpretation of variants in these regions.
Also available as a plugin, FunMotifs characterises transcription factor binding motifs to identify their specificity across fifteen tissue types. You can fetch information about whether these motifs overlap your variants, and in which tissues they bind transcription factors.
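As a rough illustration of how the two plugins might be enabled together on the command line, the Python sketch below assembles a VEP invocation as an argument list. The `-i`, `-o`, `--cache` and `--plugin` options are standard VEP options; the layout of each plugin's arguments (a `file=` key for satMutMPRA, and a data file followed by tissue names for FunMotifs), the helper name `build_vep_command`, and all file names are assumptions made for illustration, so check the documentation in each plugin's header before relying on them.

```python
import shlex
from typing import List, Optional, Sequence


def build_vep_command(
    input_vcf: str,
    output_file: str,
    satmut_file: Optional[str] = None,
    funmotifs_file: Optional[str] = None,
    tissues: Sequence[str] = (),
) -> List[str]:
    """Assemble an argument list for command-line VEP with optional plugins.

    Hypothetical helper: it only builds the command, it does not run VEP.
    """
    # Standard VEP options: input file, output file, use the local cache.
    cmd = ["vep", "-i", input_vcf, "-o", output_file, "--cache"]

    if satmut_file:
        # ASSUMPTION: the plugin receives its data file as a `file=` argument.
        cmd += ["--plugin", f"satMutMPRA,file={satmut_file}"]

    if funmotifs_file:
        # ASSUMPTION: the data file comes first, followed by tissue names,
        # all comma-separated after the plugin name.
        parts = ["FunMotifs", funmotifs_file, *tissues]
        cmd += ["--plugin", ",".join(parts)]

    return cmd


if __name__ == "__main__":
    cmd = build_vep_command(
        "variants.vcf",                      # hypothetical input
        "variants.vep.txt",                  # hypothetical output
        satmut_file="satMutMPRA_data.gz",    # hypothetical data file
        funmotifs_file="funmotifs_all_tissues.bed.gz",  # hypothetical data file
        tissues=["liver"],
    )
    # Print a shell-quoted command line for inspection before running it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each `--plugin` value intact as one argument, so the list can be passed directly to `subprocess.run` once the plugin arguments have been confirmed.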