Current challenges in understanding the role of enhancers in disease



Main text
Regulation of gene expression is accomplished through the integration of events at regulatory elements that are proximal (promoters) and distal (enhancers) to gene transcription start sites (TSSs). Forty years after their discovery [1], enhancers are recognized as playing a central role in the spatiotemporal control of gene expression underlying human development and homeostasis [2]. Enhancers are short stretches of DNA that act as positive regulators of transcription via their ability to bind key proteins, the transcription factors (TFs), and complexes that control gene expression. Enhancer regulation of genes involves the three-dimensional topology of chromatin, which affects the frequency with which enhancers and gene promoters come into close proximity. Through this topology, several configurations can arise beyond single enhancer-gene pairs, including one-to-many and many-to-many enhancer-gene wirings, which may affect the robustness, strength or specificity of gene expression (Fig. 1A).
Enhancer dysfunction has emerged as a central mechanism in the pathogenesis of certain diseases [3,4] (Table 1). In particular, the dysfunction of enhancers by either point mutations or structural variants is a significant mechanism underlying aberrant gene regulation in cancer [5] and Mendelian diseases [6]. Moreover, genetic variants associated with common diseases are frequently found in cis-regulatory elements, including enhancers [7-12]. Depending on the nature of the genetic alteration, enhancer dysfunction can be classified into two main types [4]. The first type involves single nucleotide variants (SNVs) and small indels in the enhancer sequence that lead to changes in enhancer activity (Fig. 1B). Such variants can, for instance, alter the affinity of bound TFs or create new binding sites. The second type involves structural variants that lead to deletion, duplication or relocation of the entire enhancer, which impacts chromatin topology and enhancer function (Fig. 1C). Chromosomal rearrangements can lead to rewiring of enhancer-gene connections, which may involve both enhancer adoption/hijacking (gain-of-function; e.g., refs. [13,14]) and enhancer disconnection (loss-of-function; e.g., ref. [15]). Depending on the genomic alteration, enhancer dysfunction may result in either gain or loss of gene expression in a given tissue, as well as more complex alterations of expression patterns (Table 1).
Elucidating the molecular basis of enhancer function in normal and pathological conditions has far-reaching translational implications. Here, we discuss important challenges that need to be considered in the study of enhancer dysfunction in disease and highlight critical areas of research to address these challenges in the near future (Fig. 2).

1) Identifying and validating functional enhancers
A persisting challenge when studying transcriptional regulation in health and disease is to systematically identify functionally meaningful enhancers in a given cell type. This is partly because enhancers encompass a diverse group of regulatory sequences, which may use various mechanisms to control gene expression [2,16]. Conceptually, a nucleotide sequence can be designated a biologically meaningful enhancer once it is experimentally demonstrated to modulate the transcription of a gene in cis in its native context. Unfortunately, no high-throughput assay is able to do this globally outside of cell lines. As a result, enhancers are often defined in an indirect, operational manner [2]. For example, a sequence may be functionally qualified as an enhancer if it increases the activity of a (minimal) promoter in a plasmid reporter assay, or it may be qualified as an enhancer by association when it is linked with chromatin accessibility or transcription of enhancer RNAs (eRNAs), or is marked by epigenomic features that have been linked to enhancer activity (e.g., p300, H3K4me1, H3K27ac). Functional cataloging of sequences as candidate enhancers has been boosted by the development of Massively Parallel Reporter Assays (MPRAs), which allow systematic large-scale testing of the enhancer activity of any sequence in episomal contexts [2], while genome-wide approaches to find putative enhancers by association have been employed by several international consortia (ENCODE [17], ROADMAP [18], FANTOM [7] and BLUEPRINT [19]). While these approaches have greatly expanded our ability to map enhancers, they have limitations. One is that not all sequences predicted to act as enhancers based on MPRAs or association strategies necessarily increase gene transcription in their endogenous chromosomal contexts. Furthermore, MPRAs do not account for the effects that linear distance and chromatin environment might have on enhancer activity. In principle, CRISPR-based enhancer screens can overcome these limitations and can be used to assess the importance of enhancers in their endogenous context [20-22]. However, these assays may suffer from the intrinsic redundancy or additive effects of enhancers (see Challenge 3) and from high false-negative rates [2]. To further complicate matters, some proven biologically meaningful enhancers may lack the expected biochemical marks, while others may not show enhancer potential in reporter assays [23,24], perhaps because these assays are based on plasmids and may not reproduce the function of chromatin-dependent enhancers [24,25]. Therefore, it is not surprising that there is no strict overlap [16] between the hundreds of thousands of putative enhancers in the human genome predicted by indirect assays and the set of biologically confirmed enhancers. Future comparisons between predicted and functionally validated enhancers, including CRISPR screens with high sensitivity and low false-negative rates, should help improve our definition of the fundamental features of functionally operational enhancers.

2) Defining the spatiotemporal window in which a regulatory variant affects enhancer activity
Enhancer activity is highly specific during development and across cell types, cell states, and stimulatory conditions (e.g., inflammatory response, diet, drug treatment). As a consequence, the effect of enhancer dysfunction (Fig. 1B) may only be revealed in specific contexts or tissues. This leads to the challenge of defining the spatiotemporal window in which an enhancer dysfunction has a measurable and meaningful impact. This is particularly important for interpreting disease-associated genetic variants in non-coding regions, since the cell type in which an enhancer is active can be informative about the disease mechanism. For instance, the finding that genetic variants associated with Alzheimer's disease, a neurodegenerative condition, overlap enhancers in myeloid cells rather than neurons [26] has led to a shift in the research focus of the pathology [27]. Similarly, functional assessment of obesity-associated variants has identified putatively causal variants with regulatory properties in both adipose and neuronal cell lines [28].
To capture inducible and context-dependent enhancers, as well as those restricted to rare cell types or developmental stages, efforts in enhancer mapping need to focus on different stimulatory conditions, environmental contexts, developmental stages and rare cell types [29-]. Single-cell technologies are particularly well suited for studying rare cell types. Specifically, single-cell chromatin accessibility assays (scATAC-seq, single-cell Assay for Transposase-Accessible Chromatin using sequencing) can serve to operationally predict enhancer activity and have facilitated the functional interpretation of disease-associated non-coding variants in adult and fetal tissues [32-34]. scATAC-seq will be key for expanding putative enhancer maps in diverse (rare) cell types. Furthermore, since chromatin accessibility does not necessarily reflect enhancer activity [7], further development of single-cell technologies that employ orthogonal measures of enhancer activity, e.g., large-scale perturbation assays [35,36], will be crucial to gain confidence in assessing enhancer function in rare cell types.
In line with the importance of cataloging enhancers and their restricted activities, there is an urgent need to assess the functional impact of regulatory variants during development and differentiation. To this end, recent years have seen promising developments in biological models such as transgenic mice and zebrafish [37-39], genetically manipulated human induced pluripotent stem cells (hiPSCs) or immortalized precursor cells [15,27,40,41], human-mouse chimeras [42,43] and organoids [44]. These in vivo and ex vivo models, in combination with assays that assess developmental and differentiation potential, will facilitate the study of genetic variants and help determine their impact in contexts closer to human disease.

3) Understanding the interplay between cis-regulatory elements
Enhancers are not only highly context-dependent, but they also often work together in regulatory domains to achieve the correct gene expression output. Thus, a major challenge is to understand the interplay between enhancers and other regulatory elements, including promoters, and how the joint activity of a domain is influenced by disruptions of individual enhancers. Multiple enhancers for the same gene may allow distinct enhancers either to be activated under different conditions or to cooperate, both of which can lead to robustness in gene activity [24,37,45-47]. For instance, many developmental genes are associated with "shadow" enhancers with similar transcription factor (TF) binding that ensure robust expression under suboptimal conditions [45,48,49], an observation that has been confirmed by 3D topology-based methods that revealed a complex landscape of multiple enhancer interactions per gene [50-53].
In fact, highly coordinated enhancer activity has been linked to the regulation of cell identity genes [54], signal integration and compartmentalization of the genome [55]. As a consequence of such regulatory complexity, many enhancers might not, individually, reveal a strong phenotype when disrupted in their endogenous context [23,56], while still possessing endogenous enhancer activity. Thus, the presence of multiple enhancers [57] per gene may either additively or synergistically achieve a higher transcriptional output of a gene or provide redundancy and mutational robustness to its expression. Systematic testing of enhancer-promoter compatibilities will help to better understand the still unclear connectivity rules [58,59] that control gene transcription in the human genome.
Elucidating the mechanisms and contexts, including the cell type-specific 3D topology, by which regulatory domains and TFs establish robustness or synergism will therefore be crucial to further our understanding of enhanceropathies. Combinatorial interference or perturbation of multiple enhancers within a regulatory domain will be necessary to understand the principles by which enhancers act together and their effects on gene regulation.
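The distinction between additive, synergistic and redundant enhancer pairs can be made concrete with a small calculation on a two-enhancer perturbation experiment. The following is a minimal sketch under stated assumptions, not a published method: the function name `classify_enhancer_pair`, the 10% tolerance, and the three labels are illustrative choices, and the inputs are assumed to be normalized expression values of one target gene in wild-type cells, in each single-enhancer deletion, and in the double deletion.

```python
def classify_enhancer_pair(wt, del_e1, del_e2, del_both, tolerance=0.1):
    """Compare the joint output of two enhancers with the sum of their solo outputs.

    All arguments are expression levels of the same target gene (hypothetical,
    already-normalized units). Returns (label, deviation), where deviation is
    the excess of the joint contribution over the summed solo contributions,
    expressed as a fraction of the joint contribution.
    """
    # Expression remaining after deleting both enhancers is treated as baseline.
    only_e1 = del_e2 - del_both   # what E1 drives when E2 is deleted
    only_e2 = del_e1 - del_both   # what E2 drives when E1 is deleted
    together = wt - del_both      # what both enhancers drive jointly
    if together <= 0:
        raise ValueError("double deletion does not reduce expression")

    deviation = (together - (only_e1 + only_e2)) / together
    if deviation > tolerance:
        label = "synergistic"     # joint output exceeds the sum of solo outputs
    elif deviation < -tolerance:
        label = "redundant"       # either enhancer alone recovers most output
    else:
        label = "additive"
    return label, deviation


if __name__ == "__main__":
    # Illustrative numbers only: single deletions barely change expression,
    # whereas the double deletion nearly abolishes it.
    print(classify_enhancer_pair(wt=100, del_e1=95, del_e2=95, del_both=10))
```

In the redundant case, each single deletion would look like a negative result in a screen even though both enhancers are active, which is why combinatorial perturbation is needed to separate these scenarios.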

4) Identifying the target genes of enhancers
Enhancers ultimately need to be defined by their role in enhancing endogenous gene expression, which leads to the next challenge: the identification of their target genes. This is particularly challenging for enhancers that are located distally to any gene promoter. It is assumed that distal enhancers have to come into physical proximity to their target gene in order to function, as first demonstrated by chromosome conformation capture (3C) methods in the beta-globin locus [60]. Thus, for the operational mapping of target genes, chromatin-topology assays are key to determining the physical proximity between enhancers and their putative genes. These technologies can map direct contacts (chromatin loops) and at the same time identify larger domains, so-called topologically associating domains (TADs), which have a high density of physical chromatin interactions [61]. The main caveats of using direct contacts for mapping enhancer-gene pairs are that 3C technologies typically require large cell numbers (with some exceptions [50,52]) and may thus miss enhancer-gene pairs that are looped only in a subset of cells or contacts that are highly transient. TAD-based analyses suffer from low resolution, since TADs typically comprise multiple genes and enhancers and can, on their own, at best restrict the search space for putative target genes [62]. There is a complementary set of approaches to map enhancer-gene pairs, such as targeted Hi-C, where chromatin interactions of regions of interest such as promoters and/or enhancers are captured to increase resolution [55,63], or expression quantitative trait loci (eQTL) mapping, where enhancer genetic variants are associated with mRNA expression changes across individuals [64]. Other approaches use covariation between molecular phenotypes (e.g., histone marks, chromatin accessibility, expression) of enhancers and genes across individuals or cell types [9,65-67], or combine chromatin states and long-range interactions [68], to construct genome-wide maps of enhancer-gene connections in a given cell type. The advantage of these methods is that relying on enhancer-gene covariation does not assume a specific mechanism by which enhancers regulate gene expression, so they can also capture transient enhancer-gene contacts [69]. Here, the caveats are that these methods require molecular data across a large number of individuals or cell types, and that they may miss constitutively active enhancers that do not vary much across samples. Given their descriptive (for the 3C technologies) and correlative (for the covariation methods) nature, all of these approaches provide an operational prediction of putative enhancer-gene pairs. For a functional mapping of target genes, CRISPR-mediated enhancer deletion or inactivation, followed by gene expression analysis [29,68], is the most direct way to search for target genes. However, such CRISPR-based approaches may miss links due to low effect sizes and are often limited to cultured cells. In conclusion, current approaches still have difficulty identifying the target genes of enhancers with high confidence, and combining different strategies is likely to improve the efficiency of identifying disease-targeted genes [70]. Recent advances in applying machine learning to predict cell type-specific expression from DNA sequence [71] show great promise for generating defined and experimentally testable hypotheses. These models were enabled by the vast resources of transcriptomics and genomics data that have been assembled by the community, and additional data, particularly from less accessible cellular states and developmental stages, will further improve the power of these methods.
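The covariation idea described above can be illustrated with a toy calculation: correlate an enhancer's activity signal with the expression of each nearby gene across samples and rank the genes by the strength of the correlation. This is a minimal sketch, assuming hypothetical inputs (per-sample accessibility values for one enhancer and per-sample expression values for candidate genes); the function names `pearson` and `rank_target_genes` are illustrative and not taken from any of the cited methods, which use more elaborate statistics.

```python
from math import sqrt
from typing import Dict, List, Optional, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> Optional[float]:
    """Pearson correlation; returns None when either vector has no variance."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long vectors with at least 2 samples")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A signal that never varies across samples cannot be linked by covariation.
        return None
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def rank_target_genes(
    enhancer_signal: Sequence[float],
    gene_expression: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Rank candidate genes by absolute correlation with the enhancer signal."""
    scores = {}
    for gene, expression in gene_expression.items():
        r = pearson(enhancer_signal, expression)
        if r is not None:
            scores[gene] = r
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    # Made-up values for five samples (individuals or cell types).
    enhancer = [1.0, 2.0, 3.0, 4.0, 5.0]
    genes = {
        "gene_a": [2.0, 4.0, 6.0, 8.0, 10.0],   # tracks the enhancer
        "gene_b": [5.0, 5.0, 5.0, 5.0, 5.0],    # constant, cannot be scored
        "gene_c": [3.0, 1.0, 4.0, 1.0, 5.0],    # weakly related
    }
    for gene, r in rank_target_genes(enhancer, genes):
        print(gene, round(r, 3))
```

The `None` branch mirrors the caveat noted in the text: an enhancer or gene whose signal is constant across the sampled individuals or cell types yields no usable correlation, regardless of whether a regulatory link exists.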

5) Understanding the grammar of enhancer activity
Gene regulatory elements, including enhancers [7-9], are regulated by TFs, or TF-recruited coactivators, which bind to the enhancer element at any given time and cellular state. It is thus not surprising that genetic variants that disrupt a TF binding motif are enriched among variants associated with molecular phenotypes, such as histone marks [8] or tissue-specific expression levels [72], and can be disease-causative [73]. For example, a mutation in a SOX9 enhancer, associated with Pierre Robin Syndrome, disrupts the binding of the TF MSX1 [74] (see other examples in Table 1). However, the majority of molecular trait-associated Single Nucleotide Polymorphisms (SNPs) do not disrupt known TF binding sites [8], leading to the next open challenge in understanding enhancer dysfunction: to identify the rules by which enhancer sequence determines its activity. Concepts such as Variable Chromatin Modules (VCMs), where the effect of a lead SNP on a local chromatin domain (e.g., through TF binding site disruption) spreads into the local vicinity, can explain the missing mechanism to some extent [8,10,75]. Recent studies revealed that the flanking regions of TF binding sites are highly informative for the binding of some TFs [76] and that they impact the enhancer potential of the encompassing regulatory element [77], suggesting we are still missing part of the grammar of TF binding. In line with this, up to 30% of human TFs have no characterized binding motif [78].
Consequently, interpreting regulatory variant-to-phenotype associations requires fundamental insights into the sequence determinants of TF binding and enhancer activity.
Here, sequence-based machine learning models of TF binding [76,79,80], enhancer activity [25,77,81] and topologies [71] show promise. However, major challenges remain, including the difficulty of interpreting such models accurately, the lack of sufficient training or validation data, and the need to improve accuracy and generalization across cell types and contexts. In parallel, experimental approaches that measure the functional impact of genetic variants on regulatory activity and TF binding at large scale, such as MPRA-based approaches [82-86] and SNP-SELEX [87], can provide comprehensive experimental fine-mapping of likely causal variants.
Overall, these insights will be crucial for the interpretation of the potential effect and severity of enhancer dysfunction, and thus the potentially implicated genetic variants, within complex regulatory domains.
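The simplest sequence-based readout of a variant's effect on TF binding, motif disruption, can be sketched with a position weight matrix. The example below is a minimal illustration under stated assumptions: `TOY_MOTIF` is an invented four-position motif (consensus TGAC) and does not represent any real TF, the background is a uniform 0.25 per base, and the function names are hypothetical. It scores the best motif match in the reference and alternative sequences and reports the difference.

```python
from math import log2

BACKGROUND = 0.25  # assumed uniform base frequency

# Invented motif with consensus T-G-A-C; each column gives base probabilities.
TOY_MOTIF = [
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
]


def window_score(window, motif):
    """Log-odds score of one window that is as long as the motif."""
    return sum(log2(column[base] / BACKGROUND) for base, column in zip(window, motif))


def best_score(sequence, motif):
    """Highest log-odds score over all windows (forward strand only)."""
    k = len(motif)
    return max(
        window_score(sequence[i:i + k], motif)
        for i in range(len(sequence) - k + 1)
    )


def motif_delta(ref_sequence, position, alt_base, motif):
    """Change in best motif score caused by substituting one base (0-based position)."""
    alt_sequence = ref_sequence[:position] + alt_base + ref_sequence[position + 1:]
    return best_score(alt_sequence, motif) - best_score(ref_sequence, motif)


if __name__ == "__main__":
    reference = "AATGACAA"  # contains the toy consensus at positions 2-5
    print(motif_delta(reference, 4, "C", TOY_MOTIF))  # variant inside the motif
    print(motif_delta(reference, 0, "C", TOY_MOTIF))  # variant in the flank
```

A flanking variant scores as neutral under this model, which is exactly the limitation raised in the text: motif-only scoring cannot capture flank-dependent or cooperative effects, and it is unavailable for TFs without a characterized motif.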

6) Understanding how TF cooperation defines enhancer activity and specificity
Enhancers integrate non-mutually exclusive layers of molecular information: their function can be impacted by genetic variants or mutations, by epigenetic chromatin remodeling that is typically set up by lineage-specific TFs, or by signaling cascades regulated by stimulus-responsive TFs [88] (Fig. 1). Here, we focus on the challenge of understanding the role of TFs and epigenetics in enhancer dysfunction. Lineage- and developmental stage-specific enhancers, typically regulated by lineage-specific TFs, may define the gene expression potential of a cell and whether or not it will be able to mount a specific response to a given stimulus [89]. In particular, during development or differentiation, enhancers and whole chromatin domains can be primed in progenitor cells towards certain lineages before gene expression changes are obvious, e.g., during adipogenesis [55]. In contrast, enhancers that are under the control of stimulus-responsive TFs essentially act as signaling response elements and connect cell-extrinsic signals to gene expression programs. Conceptually, lineage-specific TFs and the chromatin accessibility landscape they set up determine the scope of stimulus-regulated TFs. In this way, stimulus-responsive TFs can access enhancers that are pre-marked and kept accessible by lineage-specific TFs, thus integrating the two layers of regulation [90]. As a consequence, some response TFs, such as NF-κB, bind completely different enhancers depending on the cell type in which they are activated [91]. This is consistent with observations that a TF can regulate completely different sets of genes depending on the cell type [92], which is partially explained by the cooperative interaction of TFs [93]. Yet, apart from a couple of well-studied examples, very little is known about the contribution of TF cooperativity, enhancer priming (which can also be TF-mediated) and permissive chromatin, which in turn may define the TF regulon (i.e., the set of target genes regulated by a given TF). To fully understand enhancer dysfunction, it is important to study cell type- or condition-specific TF regulons and how they are defined by the combinatorial or cooperative binding grammar of enhancer sequences in normal and pathological conditions. Diverse TF-centric studies are even more important given the current literature bias, with many studies focusing on a small set of TFs while the majority of TFs remain vastly understudied [94].

7) Deciphering the impact and interactions of regulatory mutations in disease
The challenges above culminate in the ultimate challenge of identifying and understanding pathogenic enhancer dysfunction and eventually using this knowledge in clinically relevant studies (Fig. 2). The specific challenges that need to be solved to understand a given disease depend on the type of enhancer dysfunction and the nature of the genetic alteration (rare vs. common). For rare diseases, few examples of causal enhancer mutations have been established compared to mutations in the coding genome. It is currently unclear whether this limited number of reported enhancer mutations in rare disorders is because they do not exist or because we have not been able to find them due to a lack of data and statistical power. Either way, the additional challenge for identifying causal mutations in enhancers vs. coding regions is that each genome carries around 2,000 structural and 8,500 private noncoding variants [95], which are often not even captured since exome sequencing is still the standard for diagnosing rare diseases. On the other hand, genome-wide association (GWA) studies have revealed hundreds of non-coding variants of significance for common disease risk, suggesting that the aggregated effect of variants in multiple enhancers modulates common disease risk. Fine-mapping studies aimed at identifying the causal variant(s) among those linked in a haplotype block typically integrate significantly associated variants with experimentally determined enhancer characteristics, as discussed above. While successful for the identification of some causal variants (e.g., [96,97]), this is often difficult because the relevant cell type and trans-acting nuclear environment are not known (challenge 2), the role of the encompassing regulatory domain is not well understood (challenge 3) and the target gene of the affected enhancer is not identified (challenge 4). Fine-mapping of causal signals and effect size predictions can be improved by expanding the battery of GWA studies with cohorts of diverse ancestries [98,99], and by computational tools that rank genes based on their dosage-dependent pathogenicity. This allows hypothesis-driven studies in which candidate target genes and enhancers are tested simultaneously to measure their combined effects on inferred functions [100]. Furthermore, for common diseases, both genetic and environmental factors contribute to the disease etiology. Therefore, the effects of certain non-coding genetic variants might only or preferentially be manifested under certain environmental conditions. Together with a significant shift toward whole-genome rather than whole-exome sequencing in diagnostics, and the resulting accumulation of whole-genome data from biobanks and cohort studies [101], these tools will likely provide much better constraints for assessing disease causality and could pave the way towards systematic prediction of the pathogenicity of regulatory variants and mutations, both rare and common [102].

Future directions
As disease-associated regulatory mutations at enhancers are increasingly identified, there is an urgent need to fully characterize enhancer mutations to enable their use in functional and clinical genomic studies (Box 1). Besides the complexity of studying enhancer function in normal contexts, the characterization of noncoding variants affecting enhancer activity in disease adds further challenges, ranging from the identification of a credible set of regulatory variants to the identification of the tissues and developmental contexts in which variants have an effect. Despite the wealth of data on enhancer activity across multiple cell and tissue types, it is challenging to fully utilize the vast potential of such datasets, highlighting the importance of good data-sharing practices. In addition, the majority of available data informing on enhancer activities are derived from populations of cells, disregarding the stochasticity and plasticity of regulatory events across individual cells.
However, due to the complexity of the regulatory landscape, we propose that the field should move beyond the generation of enhancer catalogs and invest more in experimental and computational efforts to identify their target genes, in particular for the prioritization of disease-relevant genes susceptible to dysfunction upon misregulation. This can only be uncovered using systems biology, computational modeling approaches, and targeted experimental systems. Focused efforts and datasets will enable hypothesis-driven investigations of a set of variants or genes for a given disease phenotype and further inform the modeling of enhancer function from catalog data. Ultimately, the acquired knowledge should allow the implementation of novel strategies to genetically or epigenetically modify enhancer function to treat the associated diseases.

Box 1

Area 1
The ongoing community-driven comparison and assessment of experimental approaches for the discovery of enhancer activity should be strengthened. This will help improve our definition of the fundamental features of functionally operational enhancers as well as determine the most appropriate assay for a given biological or disease context. We particularly see the benefits of further developments of CRISPR screens that improve sensitivity and allow measurements in non-cultured cells.

Area 2
We foresee major benefits in further efforts towards developing assays that will allow accurate assessment of enhancer activity in single cells. scATAC-seq is key for expanding the enhancer map repertoire, particularly in rare cell types. In addition, further development of single-cell technologies that employ large-scale perturbations of enhancers or TFs will be key to assessing enhancer function in such cell types. The output of such studies will also help to build models of enhancer regulation informed by the dynamics and stochasticity of regulatory events, and to discover mechanisms by which their perturbation contributes to pathology.

Area 3
To better understand the rules by which enhancers work together in regulatory domains to achieve robustness, specificity, or synergism, further efforts are needed to derive assays and strategies that allow combinatorial interference or perturbation of multiple enhancers.
It is further imperative to develop in vivo (i.e., in situ) assays that allow the study of the activity of an enhancer in isolation or in synergy with other enhancers. The outcomes of such studies would enable us to identify the biological mechanisms by which regulatory domains are formed and the rules by which TFs and the interplay between multiple regulatory elements yield robustness or additive effects. These insights will aid the interpretation of the potential effect and severity of regulatory genetic variants and enhancer dysfunction within complex regulatory domains.

Area 4
Experimental disease systems, such as humanized animal models, organoids, and engineered tissues, are becoming increasingly available for genetic engineering and in situ or ex vivo functional experiments. It will be key to fully employ these advanced disease models for assessing the functional and pathological consequences of non-coding regulatory variants (genetic and structural). Such experimental systems will allow interrogation of enhancer activity under a relevant internal or external stimulus for their dynamic and contextual assessment.

Area 5
The research community should build on the already promising work towards developing interpretable and generalizable computational models that can accurately predict TF binding, the activity of enhancers and their target genes, using molecular measurements in any given cell type and condition. Among these, the main efforts should ideally be focused on deriving the underlying regulatory DNA code, allowing for direct interpretation of the effects of genetic variants across cell types. Relatedly, we foresee great benefits in developing approaches to computationally predict dosage-sensitive and responsive genes, as they are more likely to be adversely affected by cis-acting mutations.

Area 6
To fully understand the molecular basis of enhancer dysfunction, we foresee the need to further develop and apply large-scale TF perturbation assays coupled with gene regulatory network (GRN) analysis to study cell type- or condition-specific TF regulons, and how they are defined by the combinatorial or cooperative binding grammar of enhancer sequences in normal and pathological conditions.

Area 7
Last but not least, we foresee great potential for implementing tools (e.g., CRISPR-based) to genetically or epigenetically modify the functions or chromatin contexts of enhancers to treat enhanceropathies. By targeting enhancers, one can avoid the potential pleiotropic effects associated with drugs or tools directed toward proteins or gene promoters.

Table 1
Disease | Gene | Enhancer location | Mechanism | Effect | Ref.
[...] | [...] | [...] | [...] and inversions disrupt the boundaries of a TAD containing the EPHA4 enhancer and rewire the connectivity with different genes | GOE | 13
[...] | [...] | [...] | [...] of a TAD boundary at the SOX9 locus causes neo-TAD formation and KCNJ2 misexpression | GOE | 110
Isolated atrial defect | TBX5 | 90 kb downstream | Rare variant abrogates heart-specific enhancer activity | LOE | 111
Isolated pancreatic agenesis | PTF1A | 25 kb downstream | Rare variants abolish enhancer activity and disrupt the binding of FOXA2 and PDX1 | LOE | [...]
Obesity | IRX3, IRX5 | FTO intronic | Multiple variants on a common haplotype increase the activity of [...] | [...] | [...]
[...] | [...] | [...] | [...] in SCN10A modulates SCN5A expression in the heart | LOE | 114
Hirschsprung disease | RET | Several enhancers | Several SNPs located in RET enhancers act synergistically to reduce gene expression | LOE | [...]
Parkinson | SNCA | Intronic | SNP alters the binding of EMX2 and NKX6-1 | LOE | 40
Burkitt lymphoma | MYC | IgH enhancer | Somatic translocation (enhancer hijacking) | GOE | [...]
[...] | [...] | [...] | [...] insertions introduce a MYB binding site and induce the formation of a neo-enhancer | GOE | 119,120
Ph-like ALL | GATA3 | Intronic | A rare variant increases enhancer activity | GOE | 121
CLL | AXIN2 | Upstream | Common variation in the AXIN2 enhancer modulates CLL susceptibility via differential MEF2 binding | GOE | [...]
[...] | [...] | [...] | [...] the affinity of TFs and switch promoter and enhancer activities | [...] | [...]

Figure 1: Different mechanisms of enhancer function and dysfunction. (A) Variations in the interplay between enhancers and target genes. Multiple enhancers can cooperate in a tissue to increase the transcription of a target gene or be active in different tissues to control a complex developmental gene expression pattern. Enhancers can further control multiple genes in a mutually exclusive or shared way. Color code indicates the enhancer activity and gene expression in different tissues or developmental contexts. (B-C) Erroneous regulatory wiring between enhancers and genes, by either enhancer disruption (B) or altered enhancer-gene connectivity (C), can result in dysregulation of gene expression and ultimately cause disease. Enhancer dysfunction can originate from deletions, duplications and mutations, which can result in either loss or gain of gene expression. Altered enhancer-gene connectivity can be caused by chromosomal translocations or large structural variations that can distort or merge Topologically Associating Domains (TADs). As a consequence, enhancer-gene connectivity can be lost or gained, resulting in dysregulated gene expression. Changes in gene expression are indicated by the number of arrows.

Figure 2: Challenges to unravel enhancer-associated diseases. Elucidating the molecular basis of enhancer dysfunction in disease requires critical areas of research to be addressed, each corresponding to one of the challenges described in the main text. Resolving challenges I to VI should lead to the ultimate challenge (VII) of identifying the causal variants, the impacted molecular mechanisms as well as the affected genes of a disease. TF: transcription factor. MPRA: Massively Parallel Reporter Assay.