D8. Classification of Transcription Factors - Biology

As inferred from above, transcription factors can be classified based on their protein structure. The classes of transcription factors include those that are:

  • constitutively active: always active in the nucleus of the cell, and probably activate transcription of genes that must always be turned on;

The rest must be activated by some means, which include those that are:

  • developmental or cell type-specific whose genes must be transcribed (probably in a regulated fashion) to form the transcription factor which then enters the nucleus;
  • signal dependent transcription factors, which are activated through a signaling event.

There are classes of signal-dependent transcription factors that are activated by:

  • steroids, which are cholesterol derivatives that can pass through the cell membrane and bind steroid-specific transcription factors which turn on specific sets of genes; most of these transcription factors are present in the nucleus and are activated there by steroid hormones. One exception is the glucocorticoid receptor (GR) which is found in the cytoplasm;
  • internal signals derived from the cell, such as internally made lipid signals;
  • cell surface receptor-ligand interactions.

There are two types of receptor-ligand interactions that lead to transcription factor activation.

  • small ligand molecules (like epinephrine) bind transmembrane receptors which leads to formation of second messengers or signals inside the cell, which ultimately activate Ser-phosphorylation activity. Nuclear transcription factors can become phosphorylated and activated.
  • small ligands bind transmembrane receptors which then bind to and activate latent transcription factors in the cytoplasm, which then migrate to the nucleus.

Figure: Transcription Factors: Functional Classification

Transcription Factors

Transcription factors are proteins possessing domains that bind to the DNA of promoter or enhancer regions of specific genes. They also possess a domain that interacts with RNA polymerase II or other transcription factors and consequently regulates the amount of messenger RNA (mRNA) produced by the gene.

Many families of molecules act as transcription factors. Some transcription factors are general ones that are found in virtually all cells of an organism. Other transcription factors are specific for certain types of cells and stages of development. Specific transcription factors are often very important in initiating patterns of gene expression that result in major developmental changes. They typically do so by acting on promoters or enhancers to activate or repress the transcription of specific genes. Based on their structure and how they interact with DNA, transcription factors can be subdivided into several main groups, the most important of which are introduced here.

Transcription Factors: What They Do

Transcription factors play many different roles, which vary according to the organism in question. For example, in vertebrates, transcription factors are directly responsible for development, with groups of different factors coming into play in specific tissues. Transcription factors are especially important during embryonic development, and specific factors are essential for the differentiation of pluripotent embryonic stem cells. Similarly, the activity of other factors must be maintained for stem cells to retain their ability to turn into any cell type and to self-renew. It is not surprising that many human diseases or abnormalities are caused by the malfunction of transcription factors. Similarly, somatic mutations or chromosomal rearrangements that affect certain transcription factors play a key role in the development of some human cancers. Understanding how the sequential deployment of transcription factors controls differentiation and development is a vibrant current area of research, and studies with mice, zebrafish, fruit flies, and nematodes have been valuable in showing how transcription factors drive development. The situation is different in unicellular organisms, where the primary role of transcription factors is to manage adaptation to environmental change, for example by sensing nutrients or coping with life in stressful niches. Detailed information on the number and nature of transcription factors in different organisms can be found on many websites.

Unblending of Transcriptional Condensates in Human Repeat Expansion Disease

Expansions of amino acid repeats occur in >20 inherited human disorders, and many occur in intrinsically disordered regions (IDRs) of transcription factors (TFs). Such diseases are associated with protein aggregation, but the contribution of aggregates to pathology has been controversial. Here, we report that alanine repeat expansions in the HOXD13 TF, which cause hereditary synpolydactyly in humans, alter its phase separation capacity and its capacity to co-condense with transcriptional co-activators. HOXD13 repeat expansions perturb the composition of HOXD13-containing condensates in vitro and in vivo and alter the transcriptional program in a cell-specific manner in a mouse model of synpolydactyly. Disease-associated repeat expansions in other TFs (HOXA13, RUNX2, and TBP) were similarly found to alter their phase separation. These results suggest that unblending of transcriptional condensates may underlie human pathologies. We present a molecular classification of TF IDRs, which provides a framework to dissect TF function in diseases associated with transcriptional dysregulation.

Keywords: activation domain; condensate; intrinsically disordered region; phase separation; repeat expansion; synpolydactyly; transcription factor; transcriptional condensate.

Copyright © 2020 Elsevier Inc. All rights reserved.

Conflict of interest statement

Declaration of Interests The Max Planck Society has filed a patent application based on this paper.


A fundamental feature of transcriptional regulation is the ability of TFs to recognize specific DNA binding sites. In this study, we present an alternative view to the established model of consensus sequence motif binding whereby endogenous G4 structures in promoters frequently serve as docking sites for TFs in human chromatin. Our work supports that DNA secondary structure recognition is an important mode by which TFs can read the genome. By mapping the G4 landscape in two human cancer cell lines and comparing these to hundreds of TF binding maps, we reveal that many TFs are highly enriched at endogenous G4 sites. This enrichment is comparable to that of dsDNA consensus binding making it highly probable that G4s have a similar capacity to recruit TFs in a cellular context.

Validating this model, we observe that several TFs bind G4s with affinities comparable to their consensus dsDNA both in vitro and in a chromatin context and that small molecule ligands can displace TFs from endogenous G4s, but not consensus dsDNA sites. Given that ENCODE has only mapped 2800 potential TFs in K562 and HepG2 cells [1], there is every prospect that many more TFs will be recruited to endogenous G4.

Recently, endogenous expression of a small, engineered G4-binding protein was reported for detection of DNA G4s via ChIP-seq in human cells [50]. This alternative mapping approach observed G4s to be enriched at promoters, associated with highly expressed genes, and enrichment of certain proteins (FUS, TAF15, RBM14, TARDBP, HNRNPK, PCBP1) at G4 loci. In contrast to G4 ChIP-seq on fixed chromatin, the study mapped over 100,000 G4s and observed considerable G4 formation downstream of the TSS in addition to promoter G4s. Endogenous expression of a probe may be able to detect weaker, more transient G4s. However, it may also perturb the endogenous G4 landscape and shift the equilibrium to stabilize G4s that do not normally form under physiological conditions.

A remaining challenge in understanding the mechanisms that regulate transcription is how a large number of different TFs can bind to the same genomic site when this cannot be explained by the presence of their respective consensus motifs [1]. For some TFs, our work offers an immediate explanation of how this might be resolved: through TF recruitment to G4 secondary structures rather than dsDNA consensus motifs. Furthermore, TF recruitment by G4s may explain the recognition mode for TFs with non-canonical binding properties. For example, recruitment of SP2, a TF with strong G4 association, is thought to be independent of its zinc finger dsDNA-binding domain and requires only a glutamine-rich, positively charged N-terminal region for binding [51]. Further structural investigation of TF-G4 complexes [21] will be needed to unravel the molecular details of how TFs bind G4 structures.

Based on computationally predicted G4 forming sequences, earlier work has proposed that G4s may interfere with TF binding causing transcriptional repression and that G4s may need to be resolved by G4 binding proteins to facilitate transcription [52,53,54]. In contrast, endogenous promoter G4s are predominantly found at highly active genes [16, 17]. Here, we now show that in fact several TFs can selectively bind G4s, with little interaction with corresponding dsDNA sequences, and that G4s are promiscuous hubs for the binding of many different TFs. We propose a fundamental mechanism of transcriptional regulation that may apply to many genes, whereby G4 structures recruit a multitude of TFs causing more frequent engagement of TFs in promoters and thereby stimulating transcriptional output (Fig. 4d). Further functional studies are required to ascertain whether there is a universally positive role of promoter G4s in transcription and to explore the details of mechanisms that maintain the endogenous G4 landscape in chromatin [55]. Alternative DNA structures should thus be seriously considered as a means to recruit TFs.


Transcription factors are essential for the regulation of gene expression and are, as a consequence, found in all living organisms. The number of transcription factors found within an organism increases with genome size, and larger genomes tend to have more transcription factors per gene. [12]

There are approximately 2800 proteins in the human genome that contain DNA-binding domains, and 1600 of these are presumed to function as transcription factors, [3] though other studies indicate it to be a smaller number. [13] Therefore, approximately 10% of genes in the genome code for transcription factors, which makes this family the single largest family of human proteins. Furthermore, genes are often flanked by several binding sites for distinct transcription factors, and efficient expression of each of these genes requires the cooperative action of several different transcription factors (see, for example, hepatocyte nuclear factors). Hence, the combinatorial use of a subset of the approximately 2000 human transcription factors easily accounts for the unique regulation of each gene in the human genome during development. [11]
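The combinatorial argument can be made concrete with a rough count. The sketch below is only illustrative: it takes the figure of about 2000 transcription factors from the text, while the gene count of 20,000 and the helper name `distinct_sets` are assumptions introduced here. It treats every unordered set of factors as equally available, which real regulation certainly does not.

```python
from math import comb

N_TFS = 2000     # approximate number of human transcription factors quoted in the text
N_GENES = 20000  # assumed rough number of protein-coding genes (not from the text)


def distinct_sets(n_factors: int, set_size: int) -> int:
    """Number of unordered sets of `set_size` distinct factors drawn from `n_factors`."""
    return comb(n_factors, set_size)


for size in (1, 2, 3):
    n_sets = distinct_sets(N_TFS, size)
    # Report whether sets of this size alone could give every gene a unique combination.
    print(f"sets of {size}: {n_sets:,} (enough for {N_GENES:,} genes: {n_sets >= N_GENES})")
```

Under these assumptions, single factors cannot label every gene uniquely, but pairs already can, and triples exceed the gene count by several orders of magnitude.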

Transcription factors bind to either enhancer or promoter regions of DNA adjacent to the genes that they regulate. Depending on the transcription factor, the transcription of the adjacent gene is either up- or down-regulated. Transcription factors use a variety of mechanisms for the regulation of gene expression. [14] These mechanisms include:

  • stabilize or block the binding of RNA polymerase to DNA
  • catalyze the acetylation or deacetylation of histone proteins. The transcription factor can either do this directly or recruit other proteins with this catalytic activity. Many transcription factors use one or the other of two opposing mechanisms to regulate transcription: [15]
      • histone acetyltransferase (HAT) activity – acetylates histone proteins, which weakens the association of DNA with histones and makes the DNA more accessible to transcription, thereby up-regulating transcription
      • histone deacetylase (HDAC) activity – deacetylates histone proteins, which strengthens the association of DNA with histones and makes the DNA less accessible to transcription, thereby down-regulating transcription

    Transcription factors are one of the groups of proteins that read and interpret the genetic "blueprint" in the DNA. They bind to the DNA and help initiate a program of increased or decreased gene transcription. As such, they are vital for many important cellular processes. Below are some of the important functions and biological roles transcription factors are involved in:

    Basal transcription regulation

    In eukaryotes, an important class of transcription factors called general transcription factors (GTFs) is necessary for transcription to occur. [17] [18] [19] Many of these GTFs do not actually bind DNA, but rather are part of the large transcription preinitiation complex that interacts with RNA polymerase directly. The most common GTFs are TFIIA, TFIIB, TFIID (see also TATA binding protein), TFIIE, TFIIF, and TFIIH. [20] The preinitiation complex binds to promoter regions of DNA upstream of the gene that it regulates.

    Differential enhancement of transcription

    Other transcription factors differentially regulate the expression of various genes by binding to enhancer regions of DNA adjacent to regulated genes. These transcription factors are critical to making sure that genes are expressed in the right cell at the right time and in the right amount, depending on the changing requirements of the organism.

    Development

    Many transcription factors in multicellular organisms are involved in development. [21] Responding to stimuli, these transcription factors turn on/off the transcription of the appropriate genes, which, in turn, allows for changes in cell morphology or activities needed for cell fate determination and cellular differentiation. The Hox transcription factor family, for example, is important for proper body pattern formation in organisms as diverse as fruit flies and humans. [22] [23] Another example is the transcription factor encoded by the sex-determining region Y (SRY) gene, which plays a major role in determining sex in humans. [24]

    Response to intercellular signals

    Cells can communicate with each other by releasing molecules that produce signaling cascades within another receptive cell. If the signal requires upregulation or downregulation of genes in the recipient cell, often transcription factors will be downstream in the signaling cascade. [25] Estrogen signaling is an example of a fairly short signaling cascade that involves the estrogen receptor transcription factor: Estrogen is secreted by tissues such as the ovaries and placenta, crosses the cell membrane of the recipient cell, and is bound by the estrogen receptor in the cell's cytoplasm. The estrogen receptor then goes to the cell's nucleus and binds to its DNA-binding sites, changing the transcriptional regulation of the associated genes. [26]

    Response to environment

    Not only do transcription factors act downstream of signaling cascades related to biological stimuli but they can also be downstream of signaling cascades involved in environmental stimuli. Examples include heat shock factor (HSF), which upregulates genes necessary for survival at higher temperatures, [27] hypoxia inducible factor (HIF), which upregulates genes necessary for cell survival in low-oxygen environments, [28] and sterol regulatory element binding protein (SREBP), which helps maintain proper lipid levels in the cell. [29]

    Cell cycle control

    Many transcription factors, especially some that are proto-oncogenes or tumor suppressors, help regulate the cell cycle and as such determine how large a cell will get and when it can divide into two daughter cells. [30] [31] One example is the Myc oncogene, which has important roles in cell growth and apoptosis. [32]

    Pathogenesis

    Transcription factors can also be used to alter gene expression in a host cell to promote pathogenesis. A well-studied example is the transcription activator-like effectors (TAL effectors) secreted by Xanthomonas bacteria. When injected into plants, these proteins can enter the nucleus of the plant cell, bind plant promoter sequences, and activate transcription of plant genes that aid in bacterial infection. [33] TAL effectors contain a central repeat region in which there is a simple relationship between the identity of two critical residues in sequential repeats and sequential DNA bases in the TAL effector's target site. [34] [35] This property likely makes it easier for these proteins to evolve in order to better compete with the defense mechanisms of the host cell. [36]

    It is common in biology for important processes to have multiple layers of regulation and control. This is also true with transcription factors: Not only do transcription factors control the rates of transcription to regulate the amounts of gene products (RNA and protein) available to the cell but transcription factors themselves are regulated (often by other transcription factors). Below is a brief synopsis of some of the ways that the activity of transcription factors can be regulated:

    Synthesis

    Transcription factors (like all proteins) are transcribed from a gene on a chromosome into RNA, and then the RNA is translated into protein. Any of these steps can be regulated to affect the production (and thus activity) of a transcription factor. An implication of this is that transcription factors can regulate themselves. For example, in a negative feedback loop, the transcription factor acts as its own repressor: If the transcription factor protein binds the DNA of its own gene, it down-regulates the production of more of itself. This is one mechanism to maintain low levels of a transcription factor in a cell. [37]
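The effect of such a negative feedback loop can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not a model taken from the cited work: it uses a Hill-type repression term, arbitrary parameter values, simple Euler integration, and a hypothetical function name `simulate`. It compares the steady-state level of a factor that represses its own gene with that of an unregulated factor produced at the same maximal rate.

```python
def simulate(production, degradation=1.0, k_half=None, hill=2.0, dt=0.01, steps=2000):
    """Euler integration of dx/dt = production * f(x) - degradation * x.

    f(x) = 1 when there is no feedback (k_half is None); otherwise
    f(x) = 1 / (1 + (x / k_half) ** hill), i.e. the factor represses its own gene.
    All parameter values are arbitrary and chosen only for illustration.
    """
    x = 0.0  # start with no transcription factor present
    for _ in range(steps):
        repression = 1.0 if k_half is None else 1.0 / (1.0 + (x / k_half) ** hill)
        x += dt * (production * repression - degradation * x)
    return x


unregulated = simulate(production=10.0)
autorepressed = simulate(production=10.0, k_half=1.0)
print(f"unregulated level:   {unregulated:.2f}")
print(f"autorepressed level: {autorepressed:.2f}")
```

With these particular values the unregulated factor settles near 10 arbitrary units and the autorepressed factor near 2, which is the qualitative point of the paragraph above: self-repression holds the factor at a lower level.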

    Nuclear localization

    In eukaryotes, transcription factors (like most proteins) are transcribed in the nucleus but are then translated in the cell's cytoplasm. Many proteins that are active in the nucleus contain nuclear localization signals that direct them to the nucleus. But, for many transcription factors, this is a key point in their regulation. [38] Important classes of transcription factors such as some nuclear receptors must first bind a ligand while in the cytoplasm before they can relocate to the nucleus. [38]

    Activation

    Transcription factors may be activated (or deactivated) through their signal-sensing domain by a number of mechanisms including:

    • ligand binding – Not only is ligand binding able to influence where a transcription factor is located within a cell but ligand binding can also affect whether the transcription factor is in an active state and capable of binding DNA or other cofactors (see, for example, nuclear receptors).
    • phosphorylation [39][40] – Many transcription factors such as STAT proteins must be phosphorylated before they can bind DNA.
    • interaction with other transcription factors (e.g., homo- or hetero-dimerization) or coregulatory proteins

    Accessibility of DNA-binding site

    In eukaryotes, DNA is organized with the help of histones into compact particles called nucleosomes, where sequences of about 147 DNA base pairs make about 1.65 turns around histone protein octamers. DNA within nucleosomes is inaccessible to many transcription factors. Some transcription factors, so-called pioneer factors, are still able to bind their DNA binding sites on the nucleosomal DNA. For most other transcription factors, the nucleosome must be actively unwound by molecular motors such as chromatin remodelers. [41] Alternatively, the nucleosome can be partially unwrapped by thermal fluctuations, allowing temporary access to the transcription factor binding site. In many cases, a transcription factor needs to compete for binding to its DNA binding site with other transcription factors and histones or non-histone chromatin proteins. [42] Pairs of transcription factors and other proteins can play antagonistic roles (activator versus repressor) in the regulation of the same gene.

    Availability of other cofactors/transcription factors

    Most transcription factors do not work alone. Many large TF families form complex homotypic or heterotypic interactions through dimerization. [43] For gene transcription to occur, a number of transcription factors must bind to DNA regulatory sequences. This collection of transcription factors, in turn, recruits intermediary proteins such as cofactors that allow efficient recruitment of the preinitiation complex and RNA polymerase. Thus, for a single transcription factor to initiate transcription, all of these other proteins must also be present, and the transcription factor must be in a state where it can bind to them if necessary. Cofactors are proteins that modulate the effects of transcription factors. Cofactors are interchangeable between specific gene promoters; the protein complex that occupies the promoter DNA and the amino acid sequence of the cofactor determine its spatial conformation. For example, certain steroid receptors can exchange cofactors with NF-κB, which acts as a switch between inflammation and cellular differentiation; in this way steroids can affect the inflammatory response and the function of certain tissues. [44]

    Interaction with methylated cytosine

    Transcription factors and methylated cytosines in DNA both have major roles in regulating gene expression. (Methylation of cytosine in DNA primarily occurs where cytosine is followed by guanine in the 5’ to 3’ DNA sequence, a CpG site.) Methylation of CpG sites in a promoter region of a gene usually represses gene transcription, [45] while methylation of CpGs in the body of a gene increases expression. [46] TET enzymes play a central role in demethylation of methylated cytosines. Demethylation of CpGs in a gene promoter by TET enzyme activity increases transcription of the gene. [47]

    The DNA binding sites of 519 transcription factors were evaluated. [48] Of these, 169 transcription factors (33%) did not have CpG dinucleotides in their binding sites, and 33 transcription factors (6%) could bind to a CpG-containing motif but did not display a preference for a binding site with either a methylated or unmethylated CpG. There were 117 transcription factors (23%) that were inhibited from binding to their binding sequence if it contained a methylated CpG site, 175 transcription factors (34%) that had enhanced binding if their binding sequence had a methylated CpG site, and 25 transcription factors (5%) were either inhibited or had enhanced binding depending on where in the binding sequence the methylated CpG was located.
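The breakdown above can be checked with a few lines of arithmetic. This snippet only re-derives the percentages from the counts quoted in the paragraph; the category labels are shortened paraphrases introduced here for readability.

```python
# Counts quoted in the text for the 519 transcription factors evaluated.
counts = {
    "no CpG in binding site": 169,
    "CpG present, no methylation preference": 33,
    "binding inhibited by methylated CpG": 117,
    "binding enhanced by methylated CpG": 175,
    "inhibited or enhanced depending on CpG position": 25,
}

total = sum(counts.values())
# Round each share to the nearest whole percent, as the text does.
percentages = {label: round(100 * n / total) for label, n in counts.items()}

print(f"total factors: {total}")
for label, pct in percentages.items():
    print(f"{pct:>3}%  {label}")
```

The five categories add up to the stated total of 519; the rounded percentages sum to 101 rather than 100 only because each share is rounded independently.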

    TET enzymes do not specifically bind to methylcytosine except when recruited (see DNA demethylation). Multiple transcription factors important in cell differentiation and lineage specification, including NANOG, SALL4A, WT1, EBF1, PU.1, and E2A, have been shown to recruit TET enzymes to specific genomic loci (primarily enhancers) to act on methylcytosine (mC) and convert it to hydroxymethylcytosine (hmC), in most cases marking it for subsequent complete demethylation to cytosine. [49] TET-mediated conversion of mC to hmC appears to disrupt the binding of 5mC-binding proteins including MECP2 and MBD (Methyl-CpG-binding domain) proteins, facilitating nucleosome remodeling and the binding of transcription factors, thereby activating transcription of those genes. EGR1 is an important transcription factor in memory formation. It has an essential role in brain neuron epigenetic reprogramming. The transcription factor EGR1 recruits the TET1 protein that initiates a pathway of DNA demethylation. [50] EGR1, together with TET1, is employed in programming the distribution of methylation sites on brain DNA during brain development and in learning (see Epigenetics in learning and memory).

    Transcription factors are modular in structure and contain the following domains: [1]

    • DNA-binding domain (DBD), which attaches to specific sequences of DNA (enhancer or promoter sequences) adjacent to regulated genes. DNA sequences that bind transcription factors are often referred to as response elements.
    • Activation domain (AD), which contains binding sites for other proteins such as transcription coregulators. These binding sites are frequently referred to as activation functions (AFs), transactivation domains (TADs), or trans-activating domains (TADs), not to be confused with topologically associating domains, which are also abbreviated TAD. [51]
    • An optional signal-sensing domain (SSD) (e.g., a ligand binding domain), which senses external signals and, in response, transmits these signals to the rest of the transcription complex, resulting in up- or down-regulation of gene expression. Also, the DBD and signal-sensing domains may reside on separate proteins that associate within the transcription complex to regulate gene expression.

    DNA-binding domain

    The portion (domain) of the transcription factor that binds DNA is called its DNA-binding domain. Below is a partial list of some of the major families of DNA-binding domains/transcription factors:

    Family | InterPro | Pfam | SCOP
    basic helix-loop-helix [52] | IPR001092 | PF00010 | 47460
    basic-leucine zipper (bZIP) [53] | IPR004827 | PF00170 | 57959
    C-terminal effector domain of the bipartite response regulators | IPR001789 | PF00072 | 46894
    AP2/ERF/GCC box | IPR001471 | PF00847 | 54176
    helix-turn-helix [54] | - | - | -
    homeodomain proteins, which are encoded by homeobox genes, are transcription factors. Homeodomain proteins play critical roles in the regulation of development. [55] [56] | IPR009057 | PF00046 | 46689
    lambda repressor-like | IPR010982 | - | 47413
    srf-like (serum response factor) | IPR002100 | PF00319 | 55455
    paired box [57] | - | - | -
    winged helix | IPR013196 | PF08279 | 46785
    zinc fingers [58] | - | - | -
    * multi-domain Cys2His2 zinc fingers [59] | IPR007087 | PF00096 | 57667
    * Zn2/Cys6 | - | - | 57701
    * Zn2/Cys8 nuclear receptor zinc finger | IPR001628 | PF00105 | 57716

    Response elements

    The DNA sequence that a transcription factor binds to is called a transcription factor-binding site or response element. [60]

    Transcription factors interact with their binding sites using a combination of electrostatic (of which hydrogen bonds are a special case) and Van der Waals forces. Due to the nature of these chemical interactions, most transcription factors bind DNA in a sequence specific manner. However, not all bases in the transcription factor-binding site may actually interact with the transcription factor. In addition, some of these interactions may be weaker than others. Thus, transcription factors do not bind just one sequence but are capable of binding a subset of closely related sequences, each with a different strength of interaction.

    For example, although the consensus binding site for the TATA-binding protein (TBP) is TATAAAA, the TBP transcription factor can also bind similar sequences such as TATATAT or TATATAA.
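One crude way to picture "closely related sequences" is to count how many positions of a candidate site differ from the consensus. The sketch below does this for the TBP sequences quoted above. Mismatch counting is an assumption made here for illustration: it is not a measure of binding strength, and the function names `mismatches` and `find_sites` are hypothetical helpers, not part of any library.

```python
CONSENSUS = "TATAAAA"  # TBP consensus binding site quoted in the text


def mismatches(site: str, consensus: str = CONSENSUS) -> int:
    """Count positions at which `site` differs from `consensus` (Hamming distance)."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(a != b for a, b in zip(site, consensus))


def find_sites(sequence: str, consensus: str = CONSENSUS, max_mismatches: int = 2):
    """Return (position, window, mismatch count) for every window close to the consensus."""
    k = len(consensus)
    hits = []
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        d = mismatches(window, consensus)
        if d <= max_mismatches:
            hits.append((i, window, d))
    return hits


for site in ("TATAAAA", "TATATAA", "TATATAT"):
    print(site, mismatches(site))

# Exact-match scan of a short made-up sequence.
print(find_sites("GGCTATAAAAGGC", max_mismatches=0))
```

The two alternative sites quoted in the text differ from the consensus at one and two positions respectively. A real analysis would weight each position separately rather than treating all mismatches as equal.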

    Because transcription factors can bind a set of related sequences and these sequences tend to be short, potential transcription factor binding sites can occur by chance if the DNA sequence is long enough. It is unlikely, however, that a transcription factor will bind all compatible sequences in the genome of the cell. Other constraints, such as DNA accessibility in the cell or availability of cofactors may also help dictate where a transcription factor will actually bind. Thus, given the genome sequence it is still difficult to predict where a transcription factor will actually bind in a living cell.
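How often a short site is expected to appear by chance can be estimated with a simple back-of-the-envelope calculation. The sketch below assumes a random sequence in which all four bases are equally likely and independent, which real genomes are not; the function name `expected_chance_sites` and the genome length used in the example are illustrative assumptions.

```python
def expected_chance_sites(genome_length: int, site_length: int, both_strands: bool = True) -> float:
    """Expected number of exact matches to one fixed site in a random sequence.

    Assumes independent, equally frequent bases (probability 1/4 each), so a given
    window matches with probability (1/4) ** site_length.
    """
    positions = genome_length - site_length + 1  # number of possible start positions
    if positions <= 0:
        return 0.0
    per_position = 0.25 ** site_length
    strands = 2 if both_strands else 1
    return strands * positions * per_position


# Illustrative genome length of about 3.1 billion base pairs (an assumption).
GENOME_LENGTH = 3_100_000_000
for k in (7, 10, 16):
    print(f"{k}-bp site: about {expected_chance_sites(GENOME_LENGTH, k):,.1f} chance matches")
```

Under these assumptions a 7-bp site is expected several hundred thousand times in a genome of this size, while a 16-bp site is expected only about once. This is why short sites alone cannot explain where a factor actually binds.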

    Additional recognition specificity, however, may be obtained through the use of more than one DNA-binding domain (for example tandem DBDs in the same transcription factor or through dimerization of two transcription factors) that bind to two or more adjacent sequences of DNA.

    Transcription factors are of clinical significance for at least two reasons: (1) mutations can be associated with specific diseases, and (2) they can be targets of medications.

    Disorders

    Due to their important roles in development, intercellular signaling, and cell cycle, some human diseases have been associated with mutations in transcription factors. [61]

    Many transcription factors are either tumor suppressors or oncogenes, and, thus, mutations or aberrant regulation of them is associated with cancer. Three groups of transcription factors are known to be important in human cancer: (1) the NF-kappaB and AP-1 families, (2) the STAT family and (3) the steroid receptors. [62]

    Below are a few of the better-studied examples:

    Condition | Description | Locus
    Rett syndrome | Mutations in the MECP2 transcription factor are associated with Rett syndrome, a neurodevelopmental disorder. [63] [64] | Xq28
    Diabetes | A rare form of diabetes called MODY (Maturity onset diabetes of the young) can be caused by mutations in hepatocyte nuclear factors (HNFs) [65] or insulin promoter factor-1 (IPF1/Pdx1). [66] | multiple
    Developmental verbal dyspraxia | Mutations in the FOXP2 transcription factor are associated with developmental verbal dyspraxia, a disease in which individuals are unable to produce the finely coordinated movements required for speech. [67] | 7q31
    Autoimmune diseases | Mutations in the FOXP3 transcription factor cause a rare form of autoimmune disease called IPEX. [68] | Xp11.23-q13.3
    Li-Fraumeni syndrome | Caused by mutations in the tumor suppressor p53. [69] | 17p13.1
    Breast cancer | The STAT family is relevant to breast cancer. [70] | multiple
    Multiple cancers | The HOX family are involved in a variety of cancers. [71] | multiple
    Osteoarthritis | Mutation or reduced activity of SOX9 [72] | -

    Potential drug targets

    Approximately 10% of currently prescribed drugs directly target the nuclear receptor class of transcription factors. [73] Examples include tamoxifen and bicalutamide for the treatment of breast and prostate cancer, respectively, and various types of anti-inflammatory and anabolic steroids. [74] In addition, transcription factors are often indirectly modulated by drugs through signaling cascades. It might be possible to directly target other less-explored transcription factors such as NF-κB with drugs. [75] [76] [77] [78] Transcription factors outside the nuclear receptor family are thought to be more difficult to target with small molecule therapeutics, since it is not clear that they are "druggable", but progress has been made on Pax2 [79] [80] and the Notch pathway. [81]

    Gene duplications have played a crucial role in the evolution of species. This applies particularly to transcription factors. Once a gene exists in duplicate, mutations can accumulate in one copy without negatively affecting the regulation of downstream targets. However, changes in the DNA binding specificities of the single-copy LEAFY transcription factor, which occurs in most land plants, have recently been elucidated. In that respect, a single-copy transcription factor can undergo a change of specificity through a promiscuous intermediate without losing function. Similar mechanisms have been proposed in the context of all alternative phylogenetic hypotheses, and the role of transcription factors in the evolution of all species. [82] [83]

    There are different technologies available to analyze transcription factors. On the genomic level, DNA sequencing [84] and database research are commonly used. [85] The protein form of the transcription factor can be detected on a western blot using specific antibodies. Using an electrophoretic mobility shift assay (EMSA), [86] the activation profile of transcription factors can be detected. A multiplex approach for activation profiling is a TF chip system, in which several different transcription factors can be detected in parallel.

    The most commonly used method for identifying transcription factor binding sites is chromatin immunoprecipitation (ChIP). [87] This technique relies on chemical fixation of chromatin with formaldehyde, followed by co-precipitation of DNA and the transcription factor of interest using an antibody that specifically targets that protein. The DNA sequences can then be identified by microarray or high-throughput sequencing (ChIP-seq) to determine transcription factor binding sites. If no antibody is available for the protein of interest, DamID may be a convenient alternative. [88]

    As described in more detail below, transcription factors may be classified by their (1) mechanism of action, (2) regulatory function, or (3) sequence homology (and hence structural similarity) in their DNA-binding domains.

    Mechanistic

    There are two mechanistic classes of transcription factors:

    • General transcription factors are involved in the formation of a preinitiation complex. The most common are abbreviated as TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH. They are ubiquitous and interact with the core promoter region surrounding the transcription start site(s) of all class II genes. [89]
    • Upstream transcription factors are proteins that bind somewhere upstream of the initiation site to stimulate or repress transcription. These are roughly synonymous with specific transcription factors, because they vary considerably depending on what recognition sequences are present in the proximity of the gene. [90]

    Functional

    Transcription factors have been classified according to their regulatory function: [11]

    • I. constitutively active – present in all cells at all times – general transcription factors, Sp1, NF1, CCAAT
    • II. conditionally active – requires activation
      • II.A developmental (cell specific) – expression is tightly controlled, but, once expressed, requires no additional activation – GATA, HNF, PIT-1, MyoD, Myf5, Hox, Winged Helix
      • II.B signal-dependent – requires external signal for activation
        • II.B.1 extracellular ligand (endocrine or paracrine)-dependent – nuclear receptors
        • II.B.2 intracellular ligand (autocrine)-dependent - activated by small intracellular molecules – SREBP, p53, orphan nuclear receptors
        • II.B.3 cell membrane receptor-dependent – second messenger signaling cascades resulting in the phosphorylation of the transcription factor
          • II.B.3.a resident nuclear factors – reside in the nucleus regardless of activation state – CREB, AP-1, Mef2
          • II.B.3.b latent cytoplasmic factors – inactive form resides in the cytoplasm, but, when activated, is translocated into the nucleus – STAT, R-SMAD, NF-κB, Notch, TUBBY, NFAT
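
The functional hierarchy above can be held in a small lookup table. The following Python sketch is illustrative only: the names `FUNCTIONAL_CLASSES` and `classify_tf` are hypothetical, and the memberships are limited to the examples named in the list.

```python
# Illustrative encoding of the functional classes listed above.
# FUNCTIONAL_CLASSES and classify_tf are hypothetical names chosen for this sketch;
# only the example factors named in the list are included.
FUNCTIONAL_CLASSES = {
    "I": ("constitutively active",
          ["general transcription factors", "Sp1", "NF1", "CCAAT"]),
    "II.A": ("developmental (cell specific)",
             ["GATA", "HNF", "PIT-1", "MyoD", "Myf5", "Hox", "Winged Helix"]),
    "II.B.1": ("extracellular ligand-dependent", ["nuclear receptors"]),
    "II.B.2": ("intracellular ligand-dependent",
               ["SREBP", "p53", "orphan nuclear receptors"]),
    "II.B.3.a": ("resident nuclear factors", ["CREB", "AP-1", "Mef2"]),
    "II.B.3.b": ("latent cytoplasmic factors",
                 ["STAT", "R-SMAD", "NF-κB", "Notch", "TUBBY", "NFAT"]),
}


def classify_tf(name):
    """Return (class code, description) for a factor named in the list, else None."""
    for code, (description, members) in FUNCTIONAL_CLASSES.items():
        if name in members:
            return code, description
    return None
```

For example, `classify_tf("CREB")` returns the code `II.B.3.a` with its description, while a factor that is not in the list returns `None`.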

          Structural

          Transcription factors are often classified based on the sequence similarity and hence the tertiary structure of their DNA-binding domains. [91] [10] [92] [9]

          From basic immunobiology to the upcoming WHO-classification of tumors of the thymus. The Second Conference on Biological and Clinical Aspects of Thymic Epithelial Tumors and related recent developments

          The Second Conference on Biological and Clinical Aspects of Thymic Epithelial Tumors in Leiden, The Netherlands, 1998, set the stage for an interdisciplinary meeting of immunologists, pathologists and members of various clinical disciplines to exchange their recent findings in the field of thymus-related biology, pathology, and medicine. The contributions covered such diverse subjects as the role of transcription factors and cytokines in the development of the thymic microenvironment, thymic T, B and NK cell development, the pathogenesis of myasthenia gravis and other thymoma-associated autoimmunities, the pathology of thymic epithelial tumors and germ cell neoplasms, and new approaches to their diagnosis and treatment. This editorial will briefly sum up the data presented at the Conference and will comment on related novel findings that have been reported since then. Because the proposal of the WHO committee for the classification of thymic tumors was also discussed for the first time at the Leiden Conference, a description of the upcoming WHO Classification of Tumors of the Thymus is given, with emphasis on the diagnostic criteria of thymic epithelial tumors, which should now be termed type A, AB, B1-3 and type C thymomas to make pathological and clinical studies comparable in the future.

          Relationship of Phylogeny to Function.

          There have been several attempts to categorize bHLH proteins into higher order groups of protein families (e.g., refs. 2, 9–11). Currently, the most widely followed classification of bHLH proteins is one based on how the proteins bind to the core CANNTG E-box (Fig. 1). This classification is naturally depicted by the NJ tree in the present analyses.
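
Because the core E-box CANNTG is a degenerate hexamer, candidate sites in a sequence can be located with a simple pattern match. The Python sketch below is a minimal illustration, assuming the input is a plain DNA string of A, C, G and T; `find_e_boxes` is a hypothetical helper name, and a pattern match says nothing about whether a site is actually bound.

```python
import re

# CANNTG: C, A, any two bases, T, G.
# The lookahead lets overlapping sites (e.g. CACATGTG) both be reported.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(dna):
    """Return (0-based position, hexamer) for every CANNTG match in `dna`."""
    seq = dna.upper()
    return [(m.start(), m.group(1)) for m in E_BOX.finditer(seq)]
```

Since the reverse complement of any CANNTG hexamer also fits the CANNTG pattern, scanning one strand is enough to locate the sites.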

          Deng et al. (9) categorized most of the bHLH proteins into groups A and B, and Swanson et al. (11) suggested that Ah and Sim proteins comprise a distinct group C based on their half-site pairing behavior within the E-box. We propose a natural fourth group of proteins (group D) for proteins like Id that lack the typical basic DNA binding region and have a very low frequency of basic residues in the first 13 amino acid sites. Our analyses established patterns of amino acids at sites 5, 8, and 13 (Fig. 1) that discriminate these four groups of bHLH proteins with considerable accuracy.

          Group A proteins bind to CAGCTG and have a distinctive pattern of amino acids at sites 5, 8, and 13, i.e., a basic amino acid at site 8 and a 5–8-13 configuration of xRx (where R = arginine at site 8 and x is another amino acid at 5 and 13). Furthermore, group A has only small aliphatic residues (A, G) at site 19. The only exceptions are dHand and AP-4, where lysine (K) is substituted at site 8. Accepting the low statistical support at the deep nodes, group A appears monophyletic and includes the protein families Lyl, Twist, Hen, Atonal, Delilah, dHand, AC-S, MyoD, E12, and Da. Validity of group A is strongly reflected in the NJ tree. AP-4 is an odd protein with an unusual E-box binding configuration and is the only group A sequence to contain an LZ (10). We consider AP-4 as a special case and did not include it as a group A protein for these discussions.

          Group B binds to CACGTG and has the 5–8-13 E-box configuration BxR with a basic amino acid (either K or H) at site 5 and arginine at site 13. Group B includes Arnt, Cbf, Esc, Hairy, Mad, No, Myc, Pho4, R, Srebp, Tfe, Usf, and others (Table 1). Group B can be further partitioned into protein families with or without an LZ motif. For group B proteins with an LZ, the E-box configuration is HxR. Additionally, sequences with an LZ have a very high frequency of N residues (93%) at site 6, the residues at site 8 are almost all aliphatic (I, L, or V), and site 56 is K at 88%.

          Group C represents a statistically well supported, separate lineage derived from group B but has no consistent amino acid configuration at sites 5, 8, or 13. Furthermore, group C could be further distinguished in these analyses by the absence of basic residues at site 2, A or K at site 9, and E at site 19.

          Group D proteins, which include Id, Emc, Heira, and Hhl462 (12), lack the basic DNA binding region, have a very low frequency of basic residues in the first 13 amino acid sites, and frequently have prolines at sites 4 and 9. Group D proteins do not bind DNA; rather, they form protein–protein dimers that function as negative regulators of DNA binding behavior (12). The Id lineage is a statistically well supported single lineage (bootstrap value = 99%), and the included proteins probably were derived from a common ancestor that possessed a DNA binding region that was subsequently lost during evolution.
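
The 5-8-13 configurations given for groups A (xRx) and B (BxR) amount to a small rule table. The Python sketch below is one hedged reading of those rules, not the authors' procedure: `classify_5_8_13` is a hypothetical name, `x` is assumed to mean any residue, and the lysine exceptions at site 8 (dHand, AP-4) are not handled. Groups C and D are not assigned, since the text gives group C no consistent configuration at these sites and defines group D by its missing basic region.

```python
def classify_5_8_13(site5, site8, site13):
    """Return 'A' or 'B' from one-letter residues at sites 5, 8 and 13, else None.

    Assumption for this sketch: 'x' in the xRx and BxR patterns means any residue.
    """
    matches_a = site8 == "R"                            # xRx: arginine at site 8
    matches_b = site5 in ("K", "H") and site13 == "R"   # BxR: K/H at 5, R at 13
    if matches_a and not matches_b:
        return "A"
    if matches_b and not matches_a:
        return "B"
    # Neither pattern fits, or both do: not decidable from these three sites alone.
    return None
```

A residue triple such as E, R, E is assigned to group A, and H, I, R to group B; a triple that fits both patterns is deliberately left unassigned.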

          Group D proteins act as dominant negative regulators of MyoD proteins. The question arises of whether a classification of the bHLH proteins based on the helix-loop-helix component alone (basic region removed) would place the group D proteins in the same clade as MyoD. Such an analysis was carried out, and the result was that the Id proteins were still distinct and separate from MyoD. Furthermore, the major clades as seen in Fig. 2 persisted, indicating that evolutionary relationships among protein groups persist when components of the motif are removed. However, the way the major clades were linked together deep in the tree was altered in several instances compared with the results using the full motif. These alterations would be expected in view of the low bootstrap values for the deep nodes described above.

          These four higher order groups (A-D) are depicted in the NJ tree in Fig. 2, but the bootstrap values this deep in the tree are low. Hence, we have used a simple procedure here to further explore the validity of these groups. At each site shown in Fig. 1, the most frequently occurring group is paired with its relevant amino acid at that site. This amino acid then can be used for classifying these sequences into the groups. Informative sites (those with probability values greater than 80%) are found in all four sequence components (i.e., the basic region, both helices, and the loop). Within the basic region, the sites and their respective probability values are 4 (81%), 5 (92%), 8 (98%), and 13 (95%). Within helix I, the sites are 14 (87%), 19 (92%), 21 (81%), and 25 (86%); within the loop, 29 (82%) and 46 (83%); and within helix II, 52 (85%), 55 (85%), and 56 (86%). Thus, it is clear that the major groups as elucidated by the NJ tree are consistent, and amino acid sites that are informative with regard to group classification occur in all components of the motif.
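
The site-by-site procedure is described only briefly, so the Python sketch below shows one possible reading of it rather than the authors' actual method. Each residue at a site is mapped to the group in which it occurs most often, and the site's score is the fraction of sequences whose group that mapping recovers. The function names and the 0-based site indexing are assumptions of this sketch.

```python
from collections import Counter, defaultdict


def site_accuracy(sequences, groups, site):
    """Fraction of sequences whose group is recovered from their residue at `site`.

    Each residue is mapped to the group in which it occurs most often
    (one possible reading of the procedure described above).
    """
    counts = defaultdict(Counter)  # residue -> Counter of group labels
    for seq, grp in zip(sequences, groups):
        counts[seq[site]][grp] += 1
    # For each residue, the sequences in its majority group count as correct.
    correct = sum(c.most_common(1)[0][1] for c in counts.values())
    return correct / len(sequences)


def informative_sites(sequences, groups, threshold=0.8):
    """Return the 0-based sites whose accuracy exceeds `threshold`."""
    length = len(sequences[0])
    return [s for s in range(length)
            if site_accuracy(sequences, groups, s) > threshold]
```

With a made-up toy alignment `["AR", "AR", "AK", "GK"]` labelled `["A", "A", "B", "B"]`, the first site scores 0.75 and the second 1.0, so only the second would count as informative at the 80% threshold.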


          The discovery of the helix-turn-helix motif was based on similarities between several genes encoding transcription regulatory proteins from bacteriophage lambda and Escherichia coli: Cro, CAP, and λ repressor, which were found to share a common 20–25 amino acid sequence that facilitates DNA recognition. [2] [3] [4] [5]

          The helix-turn-helix motif is a DNA-binding motif. Helix-turn-helix proteins recognize and bind DNA through the two α helices, one at the N-terminal end of the motif and the other at the C-terminus. In most cases, such as in the Cro repressor, the second helix contributes most to DNA recognition, and hence it is often called the "recognition helix". It binds to the major groove of DNA through a series of hydrogen bonds and various van der Waals interactions with exposed bases. The other α helix stabilizes the interaction between protein and DNA, but does not play a particularly strong role in its recognition. [2] The recognition helix and its preceding helix always have the same relative orientation. [6]

          Several attempts have been made to classify the helix-turn-helix motifs based on their structure and the spatial arrangement of their helices. [6] [7] [8] Some of the main types are described below.

          Di-helical

          The di-helical helix-turn-helix motif is the simplest helix-turn-helix motif. A fragment of Engrailed homeodomain encompassing only the two helices and the turn was found to be an ultrafast independently folding protein domain. [9]

          Tri-helical

          An example of this motif is found in the transcriptional activator Myb. [10]

          Tetra-helical

          The tetra-helical helix-turn-helix motif has an additional C-terminal helix compared to the tri-helical motifs. These include the LuxR-type DNA-binding HTH domain found in bacterial transcription factors and the helix-turn-helix motif found in the TetR repressors. [11] Multihelical versions with additional helices also occur. [12]

          Winged helix-turn-helix

          The winged helix-turn-helix (wHTH) motif is formed by a 3-helical bundle and a 3- or 4-strand beta-sheet (wing). The topology of helices and strands in wHTH motifs may vary. In the transcription factor ETS, the wHTH folds into a helix-turn-helix motif on a four-stranded anti-parallel beta-sheet scaffold arranged in the order α1-β1-β2-α2-α3-β3-β4, where the third helix is the DNA recognition helix. [13] [14]

          Other modified helix-turn-helix motifs

          Other derivatives of the helix-turn-helix motif include the DNA-binding domain found in MarR, a regulator of multiple antibiotic resistance, which forms a winged helix-turn-helix with an additional C-terminal alpha helix. [8] [15]

          Transcription factor

          Transcription factor, molecule that controls the activity of a gene by determining whether the gene’s DNA (deoxyribonucleic acid) is transcribed into RNA (ribonucleic acid). The enzyme RNA polymerase catalyzes the chemical reactions that synthesize RNA, using the gene’s DNA as a template. Transcription factors control when, where, and how efficiently RNA polymerases function.

          Transcription factors are vital for the normal development of an organism, as well as for routine cellular functions and response to disease. Transcription factors are a very diverse family of proteins and generally function in multi-subunit protein complexes. They may bind directly to special “promoter” regions of DNA, which lie upstream of the coding region in a gene, or directly to the RNA polymerase molecule. Transcription factors can activate or repress the transcription of a gene, which is generally a key determinant in whether the gene functions at a given time.

          Basal, or general, transcription factors are necessary for RNA polymerase to function at a site of transcription in eukaryotes. They are considered the most basic set of proteins needed to activate gene transcription, and they include a number of proteins, such as TFIIA (transcription factor II A) and TFIIB (transcription factor II B), among others. Substantial progress has been made in defining the roles played by each of the proteins that compose the basal transcription factor complex.

          During development of multicellular organisms, transcription factors are responsible for dictating the fate of individual cells. For example, homeotic genes control the pattern of body formation, and these genes encode transcription factors that direct cells to form various parts of the body. A homeotic protein can activate one gene but repress another, producing effects that are complementary and necessary for the ordered development of an organism. If a mutation occurs in any of the homeotic transcription factors, an organism will not develop correctly. For example, in fruit flies (Drosophila), mutation of a particular homeotic gene results in altered transcription, leading to the growth of legs on the head instead of antennae; this is known as the Antennapedia mutation.

          Transcription factors are a common way in which cells respond to extracellular information, such as environmental stimuli and signals from other cells. Transcription factors can have important roles in cancer if they influence the activity of genes involved in the cell cycle (or cell division cycle). In addition, transcription factors can be the products of oncogenes (genes that are capable of causing cancer) or tumour suppressor genes (genes that keep cancer in check).
