Proteins that are strongly overproduced in E. coli and S. cerevisiae?

I'm looking for some pointers to proteins that produce at really gigantic levels in E. coli and yeast (S. cerevisiae). Can anyone point to some champion proteins?

Even in inclusion bodies and non functional, just cases where the yield is very high - how much of a given protein can a liter of E. coli or S. cerevisiae make?

I've seen bands on gels like this, but I've probably not seen the champions:

It would be great if you could cite and discuss the plasmid/promoter used and whether codon optimization or other such tricks made a difference.

Heat shock proteins - HSP70 and HSP30.

You may also have ribosomal subunits as well.

The Escherichia coli SOS mutagenesis proteins UmuD and UmuD′ interact physically with the replicative DNA polymerase

The Escherichia coli umuDC operon is induced in response to replication-blocking DNA lesions as part of the SOS response. UmuD protein then undergoes an RecA-facilitated self-cleavage reaction that removes its N-terminal 24 residues to yield UmuD′. UmuD′, UmuC, RecA, and some form of the E. coli replicative DNA polymerase, DNA polymerase III holoenzyme, function in translesion synthesis, the potentially mutagenic process of replication over otherwise blocking lesions. Furthermore, it has been proposed that, before cleavage, UmuD together with UmuC acts as a DNA damage checkpoint system that regulates the rate of DNA synthesis in response to DNA damage, thereby allowing time for accurate repair to take place. Here we provide direct evidence that both uncleaved UmuD and UmuD′ interact physically with the catalytic, proofreading, and processivity subunits of the E. coli replicative polymerase. Consistent with our model proposing that uncleaved UmuD and UmuD′ promote different events, UmuD and UmuD′ interact differently with DNA polymerase III: whereas uncleaved UmuD interacts more strongly with β than it does with α, UmuD′ interacts more strongly with α than it does with β. We propose that the protein–protein interactions we have characterized are part of a higher-order regulatory system of replication fork management that controls when the umuDC gene products can gain access to the replication fork.

The Escherichia coli SOS response is the paradigm for how a cell responds to DNA damage (1). Although most of the repair and damage tolerance pathways induced as part of the SOS response are error-free, one component of the response is umuDC-dependent translesion synthesis (TLS), which is responsible for most of the mutagenesis induced by UV radiation and many chemicals (2, 3). TLS requires the UmuD′ protein (a posttranslationally modified form of the umuD gene product), UmuC, and RecA. Collectively, these proteins are thought to function in concert with the replicative polymerase, DNA polymerase III (pol III) holoenzyme, to enable replication over lesions in damaged DNA that otherwise would be strongly blocking (1, 2).

After the umuDC operon is induced in response to DNA damage, the UmuD protein undergoes an RecA-facilitated self-cleavage reaction that removes its N-terminal 24 residues to yield UmuD′ (4, 5), an event that activates it for TLS (6). Uncleaved UmuD is not only inactive in TLS, but is an inhibitor of this process as well (7). Recently, we have proposed that uncleaved UmuD acting together with UmuC plays a positive role in helping cells survive DNA damage by acting as a prokaryotic DNA damage cell-cycle checkpoint system that regulates DNA synthesis after DNA damage, thereby allowing time for accurate repair to take place (8). Thus, RecA-facilitated cleavage of UmuD to UmuD′ appears to function as a molecular switch that regulates the release of the checkpoint control while simultaneously helping to restart stalled replisomes by TLS.

Echols and colleagues (9) were the first to show that the addition of UmuD′, UmuC, and RecA, the three proteins genetically shown to be required for SOS mutagenesis (1), to pol III holoenzyme resulted in TLS. This finding since has been reproduced by others (10, 11). UmuC is a member of a large family of related proteins referred to as the UmuC superfamily (1–3, 12), which can be divided into four subfamilies defined by E. coli umuC, E. coli dinB, Saccharomyces cerevisiae REV1, and S. cerevisiae and human RAD30 (13, 14). Importantly, key representatives of the UmuC superfamily, including E. coli DinB (pol IV) (15), S. cerevisiae Rev1 (16), S. cerevisiae Rad30 (pol η) (17), and human Rad30 homolog, XP-V (18), all have been shown to exhibit a DNA polymerase activity. Consistent with this, UmuD′2C recently has been shown to contain an intrinsic, error-prone DNA polymerase activity (pol V) that, in combination with pol III, permits efficient replication past a synthetic abasic site in vitro (19). How this catalytic activity of UmuD′2C complex is coordinated with the action of E. coli’s replicative polymerase, an 18-polypeptide protein machine (20), is not yet understood.

In this paper we describe our efforts to determine whether the UmuD and/or UmuD′ proteins interact with E. coli pol III. Our results represent the first biochemical evidence that these two forms of the umuD gene product can interact with specific components of E. coli’s replicative DNA polymerase and suggest that the interactions are part of a higher-order regulatory system of replication fork management that serves to regulate access of the umuDC gene products to the replication fork.


Amyloids represent protein aggregates having an unusual structure formed by intermolecular beta-sheets and stabilized by numerous hydrogen bonds [1]. Such a structure called “cross-β” [2] gives amyloids the morphology of predominantly unbranched fibrils and unique physicochemical properties including (i) resistance to treatment with ionic detergents and proteinases, (ii) binding amyloid-specific dyes like Thioflavin T (ThT), and (iii) apple-green birefringence in polarized light upon binding with Congo Red (CR) dye [1,3].

The biological significance of amyloids is based on two aspects, namely, pathological and functional. Amyloid deposition is associated with the development of more than 40 incurable human and animal diseases including various types of amyloidoses and neurodegenerative disorders [4,5]. Nevertheless, amyloids may not be only pathogenic but functional as well [6]. A growing number of studies demonstrate that amyloids play vital roles in archaea [7], bacteria, and eukarya including humans [8]. Amyloids of prokaryotes fulfill mostly structural (biofilm and sheaths formation) and storage (toxin accumulation) functions [9]. In fungi, infectious amyloids called prions control heterokaryon incompatibility, multicellularity, and drug resistance [10–12]. In animals, amyloid formation is important for different functions including the long-term memory potentiation, melanin polymerization, hormone storage, and programmed necrosis [13]. Compared to other groups of organisms, plants remain to be poorly studied in the field of amyloid biology.

Notably, the “amyloid” term was initially introduced in 1838 by Matthias Schleiden to describe plant cell carbohydrates and attributed in 1854 by Rudolph Virchow to pathological protein deposits in human tissues [14]. Early studies performed in 1920s through the 1950s have led to hypotheses on the presence of the so-called “amyloids” in plant seeds [15]. However, these structures were found to be xyloglucans, the major cell wall matrix polysaccharides [16]. Nevertheless, recently, some plant proteins or their regions were shown to form fibrils with several properties of amyloids in vitro (in denaturing conditions [17,18], after the proteolytic digestion or other treatments—reviewed in the work by Jansens and colleagues and the work by Cao and Mezzenga [19,20]), suggesting that plants might form bona fide amyloids in vivo [21].

Previously, we performed a large-scale bioinformatic analysis of potentially amyloidogenic properties of plant proteins including all annotated proteomes of land plant species [22]. This screening demonstrated that seed storage proteins comprising the evolutionary conservative β-barrel domain Cupin-1 were rich in amyloidogenic regions in the majority of analyzed species [22]. Such proteins belonging mainly to 11S and 7S globulins [23] represent key amino acid sources for the growing seedlings, important components of human diet and major allergens [24]. We hypothesize that the amyloid formation could occur at seed maturation to stabilize storage proteins, thus preventing their degradation and misfolding during the seed dormancy. In order to test this hypothesis, we have analyzed whether amyloid proteins are present in seeds of an important agricultural crop and Mendel’s genetic model, garden pea Pisum sativum L.


Peroxisome numbers in oleate-induced S. cerevisiae dnm1 and fis1 cells

We analyzed peroxisome numbers in cells of S. cerevisiae DNM1 and FIS1 deletion strains, grown on glucose or in the presence of oleate, using wild-type (WT) cells and VPS1 deletion cells as controls. All strains produced GFP-SKL to label the peroxisomes. First, we performed a quantitative analysis of peroxisome abundance in the various strains and found a large variation in peroxisome numbers in oleate-grown WT cells (Fig. 1 and Table 1). In these cells, up to 12 fluorescent spots could be detected per cell. Most cells contained 2-7 fluorescent spots, with an average of ∼4.2 per cell (Fig. 1B, Table 1). The organelle numbers were reduced in oleate-induced dnm1 and fis1 cells, which showed comparable average numbers and frequency distributions (Fig. 1D,F, Table 1).

Average number of peroxisomes in S. cerevisiae WT and mutant cells

Strain                 Glucose      Oleate
WT                     1.62±0.11    4.15±0.18
vps1                   1.15±0.09    0.98±0.07
dnm1                   1.73±0.12    2.04±0.14
fis1                   1.55±0.11    1.83±0.13
dnm1 vps1              0.85±0.05    0.85±0.05
dnm1 vps1 GFP-Ant1p    0.86±0.03    0.85±0.03

Average numbers of fluorescent spots per cell observed in various glucose- or oleate-grown S. cerevisiae strains are presented as means ± s.e.m. Statistical analysis (Z-test) revealed that the differences in average number of peroxisomes in vps1, dnm1, fis1 and dnm1 vps1 cells relative to WT controls were significant (P<0.005) except for dnm1 and fis1 cells grown on glucose. The differences between vps1 and dnm1 vps1 cells grown either on glucose or in the presence of oleate were also significant (P<0.005). Peroxisomes were labeled with GFP-SKL, except for dnm1 vps1 GFP-Ant1p, where the peroxisomal membrane marker GFP-Ant1p was used.
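As an illustration of how a Z-test can be run on the means ± s.e.m. in the table, the following is a minimal sketch. It assumes a two-sample, two-sided Z-test with independent samples, in which the standard error of the difference is the root sum of squares of the two s.e.m. values. The table legend does not state which Z-test variant the authors used, so the function name `z_test_from_sem` and the exact formula are assumptions for illustration only.

```python
import math


def z_test_from_sem(mean_a, sem_a, mean_b, sem_b):
    """Return (Z, two-sided P) for two independent means given their s.e.m.

    Assumption: the standard error of the difference is
    sqrt(sem_a**2 + sem_b**2), and Z is compared with a standard normal.
    """
    z = (mean_a - mean_b) / math.sqrt(sem_a ** 2 + sem_b ** 2)
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Values taken from the table: WT versus dnm1.
z_ole, p_ole = z_test_from_sem(4.15, 0.18, 2.04, 0.14)  # oleate
z_glu, p_glu = z_test_from_sem(1.62, 0.11, 1.73, 0.12)  # glucose

print(f"oleate:  Z = {z_ole:.2f}, P = {p_ole:.3g}")
print(f"glucose: Z = {z_glu:.2f}, P = {p_glu:.3g}")
```

Under these assumptions, the WT versus dnm1 comparison falls below the P<0.005 threshold on oleate but not on glucose, which matches the pattern described in the legend for this pair of strains.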

A greater reduction in organelle numbers was observed in vps1 cells grown in the presence of oleate. On average, vps1 cells contained a single fluorescent spot, although a significant fraction of the cells still harbored two or more spots (Fig. 1H, Fig. 2, Table 1). The lowest peroxisome abundance was invariably observed in cells of the dnm1 vps1 double-deletion strain, of which the cells contained a single peroxisome with only rare exceptions (Fig. 1J, Fig. 2). Similar average peroxisome numbers were observed in dnm1 vps1 cells when the peroxisomal membrane marker GFP-Ant1p was used instead of GFP-SKL, indicating that all peroxisomes were labelled by GFP-SKL (Fig. 2C, Table 1). The difference in peroxisome abundance between glucose or oleate-induced vps1 and dnm1 vps1 cells is reflected in a small, but significant difference in the average number of spots per cell (Table 1).

The reduction in organelle numbers in dnm1 and fis1 cells compared with WT controls was not observed when cells were grown on glucose (Fig. 1A,C,E, Table 1). Under these conditions, a reduction in organelle number was only observed in vps1 cells and was further pronounced in dnm1 vps1 cells (Fig. 1G,I, Table 1). The average peroxisome number of fis1 vps1 cells (0.90±0.03) was identical to that of dnm1 vps1 cells. Based on these data, we conclude that the peroxisome phenotype of dnm1 and fis1 cells was not evident in glucose-grown cells, in line with the earlier observations by Hoepfner et al. (Hoepfner et al., 2001), but was evident in oleate-induced cells and further reinforced in dnm1 vps1 double-deletion cells.

Vps1p, Dnm1p and Fis1p play a role in regulating peroxisome numbers in S. cerevisiae. Quantitative data on peroxisome abundance were obtained by fluorescence microscopy of GFP-SKL producing S. cerevisiae strains cultured on glucose (A,C,E,G,I) or in the presence of oleate (B,D,F,H,J). The number of fluorescent spots per cell was counted from randomly taken fluorescence microscope images. For each sample, fluorescent spots were counted in 300 non-budding cells taken from two independent cultures (150 cells per culture). The frequency distributions of cells with different numbers of fluorescent spots are shown. Error bars represent the s.e.m.

Representative fluorescence images of oleate-induced cells of WT and the various mutant strains are shown in Fig. 2A,C. In the dnm1 and fis1 cells, the number of organelles is clearly reduced, but no strong alterations in peroxisome morphology were observed relative to WT cells. By contrast, in vps1 and dnm1 vps1 cells, the enlarged peroxisomes often showed constrictions (Fig. 2A). Mitochondrial staining (Fig. 2B) revealed the expected alterations in mitochondrial morphology in dnm1, fis1 and dnm1 vps1 cells (one long tubular structure instead of several, branched mitochondria) (Bleazard et al., 1999). In vps1 cells the mitochondrial morphology was similar to that observed in WT cells.

Peroxisome and mitochondrial morphology in oleate-induced cells. (A) Representative fluorescence images of GFP-SKL-producing, oleate-induced cells of WT and the various mutant strains. In dnm1 and vps1 cells the peroxisome number is reduced. In vps1 and dnm1 vps1 cells large peroxisomes are present, which are often constricted. (B) Examples of the mitochondrial morphology in oleate-induced cells of the same strains. Mitochondria were stained using MitoTracker Orange. In WT and vps1 cells a branched mitochondrial network is evident. In dnm1, fis1 and dnm1 vps1 cells generally one long tubular structure is present that contains fewer branches than in WT cells and remains near the cell cortex. (C) A GFP-Ant1p-producing dnm1 vps1 cell induced on oleate medium. GFP-Ant1p is localized to peroxisomal membranes. As in the dnm1 vps1 cells producing GFP-SKL, generally a single, large peroxisome is observed per cell. Bars, 5 μm.

Peroxisome positioning

To study whether deletion of VPS1, DNM1 or FIS1 affected organelle position in budding cells, we quantitatively analyzed the distribution of peroxisomes over mother cells and buds using fluorescence microscopy. In WT controls, grown on glucose or oleate, the expected peroxisome distribution pattern was observed: organelles accumulated in the neck region between the mother cell and the bud (Fig. 3A,B region 3) and were also abundant in the buds (Fig. 3A,B region 4). Comparable peroxisome distribution patterns were observed in each of the mutant strains (vps1, dnm1, fis1, dnm1 vps1), indicating that deletion of either of the genes encoding a DRP, although influencing total numbers, did not affect the patterns of peroxisome positioning in budding S. cerevisiae cells (Fig. 3C-J).

Organelle dynamics in dnm1 vps1 cells

Time-lapse videos were recorded by confocal laser-scanning microscopy (CLSM) to relate the process of peroxisome fission and inheritance in dividing WT, vps1 and dnm1 vps1 cells producing GFP-SKL. The data summarized in Fig. 4 are extracted from the videos of oleate-induced cells (Movies 1-5 in supplementary material). The time-lapse series presented in Fig. 4A shows that in WT cells, peroxisomes migrate into the buds at very early stages of their development (see also Movie 1 in supplementary material). Comparable patterns were observed in oleate-induced dnm1 and fis1 cells (data not shown). The remaining organelles are retained in the mother cells. The series of Fig. 4B shows that this process differs in oleate-induced vps1 cells. In these cells, elongated peroxisomes were often observed located in the neck between the mother cell and the bud. This morphology and position is similar to earlier observations by Hoepfner et al. (Hoepfner et al., 2001) in glucose-grown S. cerevisiae vps1 cells. However, more than one organelle was often also present in the mother cell before the onset of bud formation or at the initial stages of bud development. In such cells, one of these organelles migrates into the developing bud, in a similar manner to that observed in the WT control (see also Movie 2 in supplementary material).

The process of peroxisome segregation was different in dnm1 vps1 cells (Fig. 4C). Time-lapse videos revealed that the single elongated peroxisome protruded into the developing bud. This structure was maintained in this position until the very late stages of the cell division process (see also Movie 3 in supplementary material and Fig. 5). A detailed image of an elongated peroxisomal structure at this stage of yeast budding is shown in Fig. 5 (see also Movie 4 in supplementary material). This image, obtained by 3D CLSM, illustrates that at a very late stage of yeast budding, a single, elongated peroxisome protrudes from the mother cell into the bud.

Dnm1p and Fis1p are localized to mitochondria and peroxisomes

Our observation that Dnm1p plays a role in peroxisome abundance in yeast implies that the protein can be localized to these organelles. To study this, we analyzed WT S. cerevisiae, producing Dnm1-GFP and incubated with MitoTracker Orange to visualize mitochondria. As shown in Fig. 6A, most Dnm1-GFP fluorescence is observed as distinct spots at elongated mitochondrial structures, in line with earlier reports on Dnm1p localization (Bleazard et al., 1999). However, GFP fluorescent spots were regularly observed that did not co-localize with MitoTracker (Fig. 6A). To examine whether these spots were associated with peroxisomes, a strain was analyzed that co-produced the red fluorescent protein DsRed fused to a PTS1 (DsRed-SKL). In these cells, few of the Dnm1p-related green fluorescent spots co-localized with red fluorescence, in either glucose-grown cells (Fig. 6B, Movie 5 in supplementary material) or oleate-induced cells (data not shown). Association of Dnm1p-GFP with peroxisomes was not increased in oleate-induced cells relative to glucose-grown cells. These observations indicate that yeast Dnm1p is predominantly localized to mitochondria, but may also be present at peroxisomes.

Deletion of VPS1, DNM1 or FIS1 does not affect peroxisome positioning. Quantitative data on peroxisome positioning in budding cells were obtained by fluorescence microscopy of GFP-SKL-producing S. cerevisiae strains cultured on glucose (A,C,E,G,I) or in the presence of oleate (B,D,F,H,J). The number of fluorescent spots in each cell region (1-4) was counted from randomly taken fluorescence microscope images. Region 1 is the part of the mother cell opposite to the bud, region 2 is the central region in the mother cell, region 3 represents the region in the mother cell near the bud neck and region 4 is the developing bud (see A). For each sample, fluorescent spots were counted in 300 cells taken from two independent cultures (150 cells per culture). The frequency distributions of cell regions with different numbers of fluorescent spots are shown. Error bars represent the s.e.m.

Localization of Dnm1p-GFP in a fis1 deletion strain (Fig. 6C) revealed a strong reduction in the number of fluorescent spots, which is in line with earlier observations by Mozdy et al. (Mozdy et al., 2000). In these cells, no peroxisome-localized Dnm1-GFP was detected.

Finally, we analyzed the localization of Fis1p. As shown in Fig. 6D, a fusion protein consisting of GFP fused to full-length Fis1p (GFP-Fis1p) is mainly localized at large structures, which represent mitochondria. However, a portion of the protein is present in smaller spots, which co-localize with the peroxisomal marker protein DsRed-SKL (Fig. 6D). These findings indicate that in S. cerevisiae, both Dnm1p and Fis1p have a dual location on mitochondria and peroxisomes.


The GPI anchor sequence affects Epa1p localization

We examined the effect of changes to the GPI anchor addition signal on the function and localization of Epa1p, a GPI-anchored CWP. As shown in Fig. 1A, we made three constructs, all of which were derivatives of haemagglutinin (HA)-tagged Epa1p. The N-terminal 440 amino acids of Epa1p, which includes the ligand binding domain, and an HA epitope tag were fused to the C-termini of three GPI-anchored proteins, Epa1p itself, Cwp2p (a GPI-CWP) and Yps1p (a GPI-PMP). For each of these three proteins, the final fusion protein includes the GPI addition signal and the 50 amino acids N-terminal to the ω site. All these fusion proteins were expressed in S. cerevisiae under the control of a galactose-inducible promoter. We first determined whether swapping the GPI signal affected the ability of Epa1p to mediate adherence. Although full-length Epa1p, Epa1p(1–440)/ωEPA1 and Epa1p(1–440)/ωCWP2 all mediated similar levels of adherence, the adherence mediated by Epa1p(1–440)/ωYPS1 was significantly lower, ≈ 20–25% of that of the corresponding ωEPA1 or ωCWP2 constructs (Fig. 1B). The reason for this was clear when we examined cell surface expression of the various constructs, and found that Epa1p(1–440)/ωYPS1 was expressed at much lower levels on the cell surface of S. cerevisiae than any of the other constructs (Fig. 1C). To ensure that this result was not an artifact specific to the ωYPS1 sequence, we made four other constructs with GPI anchor sequences from other GPI proteins predicted to be GPI-PMs ( Caro et al., 1997 ). For all four of these constructs, Epa1p(1–440)/ωSPS2, Epa1p(1–440)/ωGAS5, Epa1p(1–440)/ωPST1 and Epa1p(1–440)/ωYNL190W, we saw a similar reduction in adherence (ranging from 5% to 33% of Epa1p(1–440)/ωEPA1) and approximately fivefold reductions in the level of surface fluorescence relative to the fluorescence of Epa1p(1–440)/ωEPA1 (data not shown). 
These data showed that the sequence of the GPI signal can strongly affect surface localization and function of the Epa1p adhesin. To determine biochemically how much of each protein was made and where each localized, we fractionated the cell into crude membrane and cell wall fractions. We then extracted the cell wall fractions with β1,3 glucanase and compared the amount of protein present in the cell wall or SDS-soluble (membrane) fractions. We could show that the Epa1p(1–440)/ωYPS1 fusion protein was seen primarily as a 300 kDa band in the membrane fraction, whereas the Epa1p(1–440)/ωCWP2 fusion protein was seen at the cell wall as a high-molecular-weight smear and in the membrane fraction running between 100 kDa and 150 kDa (Fig. 1D). This experiment suggested that the nature of the GPI anchor signal can strongly influence the distribution of a GPI protein between the cell wall and the membrane.

GPI anchor class determines cellular localization for a GPI-anchored protein. A. Fusion constructs of the N-terminus of Epa1p with different GPI anchor signal sequences. The white bar represents the HA epitope tag. B. In vitro adherence assay of various fusion proteins to cultured Lec2 cells. The constructs were all expressed from the GALS promoter, which was induced by the addition of 2% galactose for 2 h. Adherence shown is normalized to the adherence conferred by full-length HA-tagged Epa1p (denoted Epa1p). All experiments were carried out in triplicate. C. FACS analysis of expressed fusion proteins. Surface expression was monitored by immunofluorescence after labelling with anti-HA antibody and FITC-conjugated secondary goat anti-rabbit antibody. Fluorescence quantified on a FACScan instrument. Average fluorescence of the population indicated. The presence of a low fluorescence peak results from cells that have lost the plasmid (as demonstrated by analysis of this population by FACS). D. Western blot of fusion proteins. The fusion proteins were expressed as above, proteins prepared from cell wall and membrane as described in Experimental procedures and analysed by Western analysis. M denotes the membrane fraction and C the cell wall fraction.

Yps1p GPI signal mutagenesis

Although the data presented above suggested that GPI signals from different proteins can directly impact the distribution between plasma membrane and cell wall, it was not conclusive as the very different sequences of the GPI signals (all derived from different proteins) could in theory affect protein function or localization independent of the GPI signal itself. We therefore undertook a mutagenesis of the amino acids N-terminal to the Yps1p ω site. This was done in order to see whether point mutations could redirect this GPI protein from a predominantly membrane localization to localization in the cell wall. As shown above, Epa1p(1–440)/ωYPS1, a protein fusion of the N-terminus of Epa1p to the C-terminal 71 amino acids (including the ω site) of Yps1p, is located primarily in the membrane fraction (Fig. 1D). We started with this construct and, using oligonucleotide-directed mutagenesis, we constructed a library of mutants targeting the six amino acids preceding the ω site (ω-1 to ω-6) such that each mutated clone in the library had two amino acid changes on average (Fig. 2A). Cells expressing GPI-CWPs have a higher fluorescence when labelled with fluorescein isothiocyanate (FITC)-conjugated antibodies than do cells expressing GPI-PMPs (Fig. 1C), and we used this fact as the basis for selecting mutants of Epa1p(1–440)/ωYPS1 with increased cell wall localization. Starting with a library in yeast of 5 × 10^5 mutant proteins, we used fluorescence-activated cell sorting (FACS) to enrich moderately (10-fold) for fluorescent strains. We rescued the plasmids encoding the mutated proteins and sequenced them to determine the amino acid changes. The sequenced plasmids were then retransformed into a clean background and analysed for cell wall localization using antibody labelling. Table 1 shows the sequence of the mutants that we analysed and the fluorescence for the corresponding strains relative to the parent Epa1p(1–440)/ωYPS1 construct.
From this list, it is apparent that many amino acid changes are consistent with delivery of the protein to the cell wall, consistent with the model of Klis that the primary GPI signal is a PM retention signal found in GPI-PMs ( Caro et al., 1997 ). In that model, it would be predicted that many amino acid changes, any of which destroy the retention signal, would increase expression at the cell wall. In Fig. 2B, we plotted the number of mutations against the average fluorescence for the mutants analysed and found a correlation between the number of amino acid changes in the ω-1 to ω-6 region and the cell wall localization. This suggested to us that most of the six amino acids in this region contribute at least partially to membrane retention. Lastly, in Fig. 2C, we analysed the mutants that were at least threefold more fluorescent than the Epa1p(1–440)/ωYPS1 parent. Strikingly, at position ω-2, 93% (26/28) of the highly fluorescent mutants had an amino acid change (Fig. 2C). These mutants have a mutation away from the wild-type lysine to several other amino acids (Table 1). Within this group of highly fluorescent mutants, ω-1 and ω-4 were also mutated approximately two-thirds of the time, perhaps also pointing to an important role for amino acids at those positions. In the unselected population, a given position was altered in only 30–35% of clones (data not shown). We conclude that ω-2 (and to a lesser degree ω-1 and ω-4) is a particularly important residue for plasma membrane retention of Yps1p because, in our study, it had to be mutated to achieve moderately high levels in the cell wall.
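To put the 26/28 figure at ω-2 in context against the 30–35% background rate quoted for the unselected population, a simple binomial tail probability can be computed. This is only a sketch: it assumes each clone is an independent draw with a fixed per-position mutation probability, which the text does not claim, and the helper name `binom_upper_tail` is hypothetical. It is not an analysis reported by the authors.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """Probability of observing k or more successes in n independent trials."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_clones = 28    # highly fluorescent mutants (at least threefold over parent)
n_mutated = 26   # of these, clones with an amino acid change at position ω-2

fraction = n_mutated / n_clones
print(f"fraction mutated at ω-2: {fraction:.1%}")

# Background rates quoted for the unselected population (30-35% of clones).
tails = {bg: binom_upper_tail(n_mutated, n_clones, bg) for bg in (0.30, 0.35)}
for bg, tail in tails.items():
    print(f"background {bg:.0%}: P(X >= {n_mutated}) = {tail:.2e}")
```

Under this simplified model, seeing 26 of 28 clones mutated at one position would be very unlikely at either background rate, which is in line with the conclusion that ω-2 is enriched for changes in the highly fluorescent population.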

ω-site mutagenesis of Epa1p(1–440)/ωYPS1. A. The sequence of the ω site of Yps1p. The seven amino acids shown correspond to the ω site through ω-6. The 18 nucleotides preceding the ω site are mutated in the library. B. Graph of the average number of mutations per six amino acid region for a window of four mutants plotted against the fluorescence averaged across the same window of four mutants. Fluorescence shown is the average of three experiments and is presented as the fold increase over the parent Epa1p(1–440)/ωYPS1. C. Graph of the percentage of mutated clones at each position in the highly fluorescent population of mutants. The mutants with fluorescence over threefold higher than Epa1p(1–440)/ωYPS1 were selected, and the number of mutants with amino acid changes at each position is plotted. The percentage shown is the number of mutations divided by the total number of clones in the highly fluorescent population.

Mutant no. -6 -5 -4 -3 -2 -1 Fluorescence
YPS1 S T S S K R 1.0
41 T S 1.8
40 K T G I 1.8
12 Y S C 1.8
34 H R G 1.9
44 M P N G 2.3
36 C A I 2.3
42 R Q 2.4
39 T I M S 2.5
16 N N 2.5
30 T 2.5
20 A L G 2.6
29 C T S 2.6
17 I N T 2.8
28 Y T I S 2.8
37 P Q 3.0
33 F V 3.0
23 I T I 3.0
35 C I I 3.2
32 L G 3.2
51 L L H L M 3.6
31 A T P 3.7
25 I L 3.9
26 I Q P 3.9
15 F L I 4.0
4 A L T 4.0
19 F T Q G 4.2
18 V T I S 4.2
22 I Q L 4.3
14 C L Q G 4.3
21 Y L N 4.3
3 L A Q I 4.4
24 F L C N 4.6
50 C L R T T 4.6
11 M F I N 4.6
8 C I T Q G 4.7
13 S I W Q 4.8
6 L L I 5.2
5 A P P F T I 5.4
43 Y A L C E I 5.9
48 P A L T A 6.0
1 L T I 6.0
49 V I N I 6.0
CWP2 I S Q Q T E 6.0
  • Single clones were screened by FACScan for surface fluorescence of Epa1p(1–440)/ωYPS1 mutants. The amino acids shown are those that are mutated from the wild-type Yps1p amino acid sequence. Mutants are ordered according to the surface fluorescence averaged from triplicate FACS experiments. Fluorescence is shown as fold increase above the fluorescence of the parental Epa1p(1–440)/ωYPS1 construct. For reference, the fluorescence of Epa1p(1–440)/ωCWP2 and the sequence of the six amino acids upstream of the Cwp2p ω site are shown.
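
The windowed averaging behind Fig. 2B can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis: it assumes a sliding (overlapping) window of four mutants, which the legend does not specify, and it uses only the first eight rows of Table 1, counting the amino acid changes listed for each mutant. The names `window_means`, `changes` and `fluorescence` are hypothetical.

```python
def window_means(values, size=4):
    """Mean of each run of `size` consecutive values (sliding window; an assumption)."""
    return [sum(values[i:i + size]) / size
            for i in range(len(values) - size + 1)]

# First eight mutants of Table 1, in order of fluorescence:
# mutants 41, 40, 12, 34, 44, 36, 42 and 39.
changes = [2, 4, 3, 3, 4, 3, 2, 4]                       # amino acid changes listed per mutant
fluorescence = [1.8, 1.8, 1.8, 1.9, 2.3, 2.3, 2.4, 2.5]  # fold increase over the parent

mean_changes = window_means(changes)
mean_fluorescence = window_means(fluorescence)
for c, f in zip(mean_changes, mean_fluorescence):
    print(f"{c:.2f} changes per window, {f:.2f}-fold fluorescence")
```

Running the same function over all rows of Table 1 would give the paired values plotted in a graph of this kind.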

To verify that the mutation changes were affecting the distribution of the Epa1p(1–440)/ωYPS1 protein between plasma membrane and cell wall rather than simply increasing overall protein levels, we analysed several of the high and low fluorescent mutants by Western blot. When we compared the wild-type Epa1p(1–440)/ωYPS1 construct with three representative high fluorescence mutants (43, 5 and 6), we found that there was a significantly greater amount of protein at the cell wall in the mutants, demonstrating a redistribution in the localization from the plasma membrane to the cell wall (Fig. 3A). In contrast, three representative clones with mutations that resulted in no significant increase in expression at the cell wall (12, 40 and 41) showed primary localization in the membrane fraction, similar to that seen for the parental construct Epa1p(1–440)/ωYPS1.

Western analysis of selected ωYPS1 mutants. All constructs were expressed from a GALS promoter by induction with 2% galactose for 2 h. Cell wall and membrane proteins were extracted as described in Experimental procedures. Proteins were run on a 3–8% SDS-PAGE gel and detected with HRP-conjugated anti-HA antibody (Santa Cruz). M indicates the membrane fraction and C the cell wall fraction. A. Western analysis of Epa1p(1–440)/ωYPS1 and mutant derivatives of this parent construct. YPS1 denotes the parent Epa1p(1–440)/ωYPS1. Mutant numbers 12, 40 and 41 are from the low fluorescence population, and mutants 43, 5 and 6 are from the high fluorescence population. B. Western analysis of HA-ωYPS1 and mutant derivatives.

To show the functional consequences of these changes in localization, each mutant was assayed for adherence because the N-terminus of Epa1p contains the complete ligand binding domain. We found that mutants that showed the lowest fluorescence (<twofold above Epa1p(1–440)/ωYPS1), which included the three representative mutants analysed biochemically above (12, 40 and 41), conferred the same low level of adherence as the parental Epa1p(1–440)/ωYPS1 construct (data not shown). The majority of mutants, however, could confer adherence at levels similar to the adherence conferred by the Epa1p(1–440)/ωCWP2 construct. These data suggested that, even though mutants with low cell wall expression are clearly compromised for adherence, mutants showing intermediate levels of Epa1p in the cell wall have close to wild-type levels of adherence, possibly pointing to a threshold effect of Epa1p in terms of its function in our adherence assay.

We wanted to generalize our results by determining whether the Epa1p N-terminal 440-amino-acid domain was having any effect on distribution of the wild-type and mutant proteins between cell wall and membrane. We therefore deleted the ligand binding domain from the Epa1p(1–440)/ωYPS1 construct as well as from a subset of the mutant derivatives. In these constructs (HA-ωYPS1), the Epa1p signal sequence and HA tag are fused directly to the Yps1p wild-type and mutant GPI regions. These constructs were expressed under the control of the GALS promoter in S. cerevisiae and analysed by Western analysis (Fig. 3B). Consistent with what we found in Fig. 3A, the wild-type HA-ωYPS1 protein and three mutant derivatives with the lowest cell wall fluorescence were all retained in the membrane fraction, whereas the constructs corresponding to the most highly fluorescent mutants had a relative reduction of protein in the membrane fraction and significant amounts in the cell wall. To generate a mutant HA-ωYPS1 derivative that directs localization exclusively to the cell wall, we changed the ω-1 to ω-6 region of Yps1p to the corresponding amino acids from Cwp2p. The mutant protein (HA-ωYPS1(CW-like)) was directed essentially totally to the cell wall rather than the plasma membrane (Fig. 4A).

GPI-PMPs can be redirected to the cell wall by mutating the amino acids preceding the ω site. HA-ωYPS1 was mutated to contain the ω-minus region of Cwp2p (HA-ωYPS1(CW like)). Proteins were expressed from a GALS promoter by induction with 2% galactose for 2 h and extracted as described in Experimental procedures. Proteins were run on a 3–8% SDS-PAGE gel and detected with HRP-conjugated anti-HA antibody (Santa Cruz). C denotes the cell wall fraction and M the membrane fraction. YPS1 indicates the HA-ωYPS1 construct, and YPS1CW Like indicates the HA-ωYPS1(CW-like) construct. A. Analysis in a wild-type background with cells grown at 30°C. B. Analysis in a sec23-1 mutant background. The cells were grown at 23°C, then shifted to 37°C or maintained at 23°C, for the 2 h galactose induction.

To this point, we have shown that the nature of the GPI anchor signal can significantly alter both the amount of protein in the cell wall and the function of the protein. This alteration is not the result of a change in protein levels, but results rather from a redistribution of the protein between cell wall and membrane as point mutants that increase CW localization primarily alter the ratio of protein between cell wall and membrane fractions. One concern we had was that the amino acid changes that we identified were affecting the efficiency of GPI anchor addition rather than distribution between the plasma membrane and the cell wall. This might have resulted in an increase in SDS-soluble protein as GPI anchor addition is required for the trafficking of GPI proteins from the ER. We wanted to verify that the proteins we were following in the membrane fractions were not an ER form of the protein but rather were Golgi-modified forms of the protein, consistent with their plasma membrane localization. To address this possibility, we used the sec23-1 temperature-sensitive mutant, which blocks ER to Golgi transport at the non-permissive temperature ( Hicke and Schekman, 1989 ). In Fig. 4B, we expressed the HA-ωYPS1 and analysed its localization in wild type and the sec23-1 strains at 23°C and 37°C. At the non-permissive temperature, the protein migration shifted from ≈ 300 kDa to about 50 kDa. For cells expressing the HA-ωYPS1(CW-like) construct, there was significantly less signal in the membrane fraction, consistent with its localization primarily to the cell wall. For this construct as well, the molecular weight of the material in the membrane decreased dramatically from 300 kDa to about 50 kDa in the sec23-1 mutant at restrictive temperature. The 50 kDa form of the HA-ωYPS1 protein and its derivatives is therefore likely to be the ER form of the protein, and the 300 kDa membrane form is a mature Golgi-modified form. 
The form that accumulates in the membrane fraction in Figs 1, 3 and 4 therefore cannot be an ER form, and is most likely a plasma membrane form.

Converting a GPI-CWP to GPI-PM

Our analysis above of the cis sequences involved in localizing a GPI-anchored protein to the PM was consistent with the PM retention model of Klis. We decided to test the model rigorously by engineering a switch in localization of a GPI-CWP from the cell wall to the plasma membrane. We first generated an epitope-tagged version of the cell wall protein Cwp2p. Then, we mutated the amino acids preceding the ω site in HA-Cwp2p to be more GPI-PM-like, according to the dibasic motif model. We created a series of constructs in which lysines or arginines were added in the ω-1 to ω-4 region of the HA-Cwp2p-ωCWP2 construct. We also created a construct, HA-Cwp2p-ωYPS1, which has the ω-1 to ω−6 region mutated to the corresponding region of Yps1p (ISQQTE to STSSKR). These were expressed under the control of the TEF1 promoter, and their surface localization was assayed by FACS. Consistent with the dibasic motif PM retention model, there was little effect of any of the single mutations, but up to a sixfold reduction in surface fluorescence in the double mutants (Table 2). We also analysed HA-Cwp2p, HA-Cwp2p-ω-1R,-2K and HA-Cwp2p-ωYPS1 by FACS and Western blot when expressed under the control of the GALS promoter. The effect of these mutations was dramatic (Fig. 5A and B). Only HA-Cwp2p was strongly expressed at the cell surface (Fig. 5A). Likewise, HA-Cwp2p was present in the cell wall fraction, whereas HA-Cwp2p-ω-1R,-2K was present mostly in the plasma membrane fraction with a minority of signal present in the cell wall, and HA-Cwp2p-ωYPS1 was present only in the membrane fraction (Fig. 5B). Thus, by mutating the amino acids upstream of the Cwp2p ω site, the distribution between cell wall and membrane can be altered dramatically.

Cwp2p mutant % of Cwp2p fluorescence
Cwp2p 100
Cwp2p-ω-1R-2K 12
Cwp2p-ω-3R-4K 32
Cwp2p-ω-1K-2K 17
Cwp2p-ω-3K-4K 29
Cwp2p-ω-1R 84
Cwp2p-ω-2R 74
Cwp2p-ω-3R 101
Cwp2p-ω-4R 97
Cwp2p-ω-1K 94
Cwp2p-ω-2K 72
Cwp2p-ω-3K 99
Cwp2p-ω-4K 83
  • Point mutations were made in HA-CWP2, and surface fluorescence was assayed by FACS. The surface fluorescence for the mutants is normalized to the fluorescence of HA-Cwp2p.

Mutation of the ω region of a GPI-CWP decreases expression on the cell surface.

A. FACS analysis of mutants of the Cwp2p ω region. 4742 denotes the untransformed parental strain. CWP2 (HA-Cwp2p-ωCWP2), CWP2-ωKR (HA-Cwp2p-ω-1R-2K) and CWP2ωYPS1 (HA-Cwp2p-ωYPS1) were expressed from the GALS promoter by induction in 2% galactose for 2 h. Surface protein was labelled with anti-HA antibody and then with a FITC-conjugated secondary for analysis of surface expression with a FACScan instrument. X is the average fluorescence of the population, and 4742 is the unlabelled parent cell. The six amino acids normally upstream of the Cwp2p ω site are ISQQTE. In the HA-Cwp2p-ω-1R-2K mutant, they are changed to ISQQKR; in the HA-Cwp2p-ωYPS1 mutant, they are changed to STSSKR.

B. A GPI-CWP can be retained at the membrane by mutating the amino acids preceding the ω site. HA-Cwp2p-ωCWP2, HA-Cwp2p-ω-1R-2K and HA-Cwp2p-ωYPS1 were expressed from a GALS promoter with 2% galactose for 2 h and extracted as described in Experimental procedures. C is the cell wall fraction and M is the membrane fraction.

C. HA-Cwp2p-ωCWP2 and HA-Cwp2p-ω-1R-2K were expressed in a sec23-1 temperature-sensitive mutant strain from a GALS promoter by induction with 2% galactose for 2 h, the proteins fractionated into cell wall and membrane fractions and analysed by Western analysis. The cells were grown at 23°C, then maintained at 23°C or shifted to 37°C for the entire 2 h galactose induction.

To verify that the HA-Cwp2p-ωYPS1 and HA-Cwp2p-ω-1R,-2K mutant proteins were not being artifactually retained in the ER, we used the sec23-1 temperature-sensitive mutant again to induce an ER to Golgi block. In Fig. 5C, we expressed the three HA-tagged Cwp2 proteins in wild-type and sec23-1 yeast and analysed the proteins by Western analysis. In the membrane fraction, there were multiple HA-Cwp2p bands present that migrate between 40 kDa and 50 kDa. The HA-Cwp2p-ωYPS1 and HA-Cwp2p-ω-1R,-2K mutant proteins accumulated in the membrane fraction as a 50 kDa band. In contrast, in the sec23-1 strain at restrictive temperature, a lower molecular weight 40 kDa band accumulated with no 50 kDa band present at all. Thus, the 40 kDa band represents an ER form of the proteins, whereas the 50 kDa form of HA-Cwp2p is the mature Golgi-modified form, inconsistent with an artifactual retention in the ER and consistent with a plasma membrane localization.

Stuck in the middle with you: Coevolution in microbes

The emergence of coexistence from strong competitive interactions helps reinforce that evolution in the context of a community is far different from evolution in isolation.


Across all orders of life, from the smallest microbes to the largest mammals, one thing is for certain: no species evolves in isolation. All of them are embedded within communities, and have to evolve within the context of interactions with other species. When resources in environments (space, light, food, etc.) are limited, organisms can either get better at acquiring those resources than others (competition) or learn to live together in harmony (coexistence). In general, competition is more common than cooperation, but there are also mechanisms (outlined in Chesson's Modern Coexistence Theory) that can promote coexistence.

So, can two species that compete for the same resource evolve to co-exist? To answer this, we used experimental coevolution of two species, E. coli and S. cerevisiae. Both would be competing for essential resources at the start of the experiment, and thanks to the vast genetic resources available for these model organisms, we could investigate in depth the genetic causes of any evolutionary changes we saw.

One thing we noticed immediately was that these species are competitively mismatched in our environment: E. coli drove S. cerevisiae extinct in 58 of our 60 populations after just 6 weeks of co-evolution! To remedy our now too-low replicate count, we founded a new set of co-culture populations from the two that remained and continued on as before for another 8 weeks. At this point, whilst E. coli outcompeting S. cerevisiae in co-culture was still the most common outcome, we were starting to see signs of coexistence. Now, 4 of the 60 populations contained both species, and S. cerevisiae was even outcompeting E. coli in some populations! On top of this, the ratio of species within these co-cultures was consistent and ecologically stable.

Figure 1. Emergence of coexistence in E. coli and S. cerevisiae communities. Number of populations with both species over 420 generations (left) and 420-980 generations (centre). The equilibrium frequency of S. cerevisiae and E. coli co-culture for ancestral and co-culture evolved strains (right).

Now that we had found that initially strongly competitive species can evolve to coexist, the next big question was: what genetic changes had occurred to allow these species to coexist? Naturally, we expected to see multiple genetic adaptations in S. cerevisiae that would help explain its persistence in co-culture, as it was initially the weaker species of the two. However, we saw the exact opposite! E. coli was the species that showed the most evidence of co-culture specific genetic changes, namely in the genes btuB and fhuA. Even though mutations in these genes did not increase the fitness of our evolved strain compared to the ancestor, they were able to alter the equilibrium frequency in co-culture with S. cerevisiae, suggesting these mutations are only beneficial in a community context.

One interesting aspect of coevolution experiments, as opposed to studies on single species, is that extinction is on the table. In single species experiments, beneficial mutations and novel ecotypes always have the potential to arise. However, in coevolutionary systems, there is a strict time limit on how quickly the least fit species must adapt to coexistence before it is driven from the population completely. And this might help explain the surprising genetic changes we see. In most cases, S. cerevisiae is driven extinct from co-culture, unable to adapt quickly enough. However, in the rare cases where it does adapt, a strong selection pressure is subsequently applied to E. coli to adapt to the new co-culture environment, leading to the genetic changes in fhuA and btuB.

Figure 2. Schematic representation of the sequence of evolutionary steps leading to the evolution of coexistence.

We are really excited to continue exploring co-culture dynamics with this system, and hope to have more to come in the future!

Materials and methods

Strains and plasmids

The two-hybrid system used here is based on the version originally described by Brent and colleagues [63]. C. jejuni ORFs were cloned into the yeast two-hybrid vector pJZ4-NRT for expression of AD fusions driven by the yeast GAL1 promoter [22], and pHZ5-NRT for expression of LexA DNA BD fusions driven by the yeast MAL62 promoter [23]. Both vectors contain recombination tags for direct cloning of tagged inserts (see below). Yeast strain RFY231 (MATα trp1Δ::hisG his3 ura3-1 leu2::3LexAop-LEU2) contained the AD plasmids, while Y309 (MATa trp1Δ::hisG his3Δ200 leu2-3 lys2Δ201 ura3-52 mal- pSH18-34(URA3, lacZ)) contained the BD plasmids. The reporter genes include LEU2, facilitating growth on medium lacking leucine, and lacZ, expression of which turns yeast colonies blue when the substrate X-Gal is present.

Generation of yeast two-hybrid arrays for C. jejuni

PCR amplification of over 87% of the predicted ORFs from C. jejuni NCTC11168 genomic DNA was previously described [64]. The amplification products included the 21 bp recombination tags 5RT1 and 3RT1 at their 5' and 3' ends, respectively, which match identical sites flanking the insertion site in the yeast two-hybrid vectors. PCR products were cloned into the vectors via homologous recombination in yeast as described previously [22]. To validate the identity of the insert in each vector, the 5' ends of the inserted PCR products were sequenced. We generated 1,398 BD strains and 1,442 AD strains containing the two-hybrid vectors with inserts, of which 90% have been sequence verified. Most of the ORFs missing from the arrays failed PCR amplification prior to cloning.

High-throughput yeast two-hybrid analysis

We mated BD and AD strains using a two-phase pooling (pooled matrix) strategy as described previously [21, 22]. Briefly, 15 pools of approximately 96 AD strains each were generated, along with one additional pool of 32 strains. Each pool was mated with individual BD strains arrayed on 96-well plates, and the resulting diploids were assayed for reporter activities. Positive BD strains were then mated with each member of the positive AD pool arrayed on 96-well plates to identify the interacting pairs. Reporter activities were scored using a custom program for image analysis [65] and at least one manual scoring. LacZ scores ranged from 0 (white) to 5 (dark blue) and Leu scores ranged from 0 (no growth) to 3 (heavy growth); combined scores ranged from 0 to 8. Many BDs have some level of background activity due to activation independent of the AD fusion or non-specific interactions. To correct for these, we calculated the average interaction score for each BD based on at least 96 interaction assays and subtracted this background from the reporter scores for each of its interactions. Of these corrected scores, only those ≥ 1 were considered initial positives and were retested (see below). A small subset of BD strains (94 total) was also assayed using a library approach as described [21, 22]. Briefly, BD strains were individually mated with a single pool containing almost all of the AD strains (except Cj1718c (leuB) and Cj1546, which activate reporters without a BD). Up to 30 diploids with reporter activity were picked for each BD. Their AD inserts were PCR amplified and restriction digested to identify strains carrying the same clones. Single representatives from each restriction fragment class (RFC) were then sequenced to identify the inserts. Of the 134 interactions detected, 52 (39%) were also identified in the two-phase matrix screen.
Combined, 16,104 unique interactions were retested in one-on-one binary mating assays between individual AD and BD strains on 96-well plates. A total of 11,687 interactions proved repeatable (background-corrected combined activity score ≥ 1), including 73% of those from the two-phase matrix screen, 75% of those from the library screen, and 100% of those detected in both screens. The majority of interactions that failed to repeat had been low-scoring (less than 2) in the initial screen. The 11,687 interactions that repeated were combined with 325 non-repeated interactions that had high confidence scores (see below) to create a dataset containing 12,012 interactions, which we named CampyYTH v3.1. This version of the dataset was subsequently used for bioinformatics analysis as indicated. The interaction data can be visualized and downloaded at [17]. The CampyYTH v3.1 data are also listed in Additional data file 13.
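
The background correction described above (subtract each BD's average score from its raw scores, then keep corrected scores ≥ 1) can be sketched as follows. This is an illustration under stated assumptions rather than the scoring program cited in the text: the function names, the dictionary layout and the toy scores are all hypothetical.

```python
from statistics import mean

def background_corrected(scores):
    """scores maps (bd, ad) -> raw combined reporter score (0 to 8).

    The background of a BD is taken as the mean of all its raw scores.
    """
    by_bd = {}
    for (bd, _ad), score in scores.items():
        by_bd.setdefault(bd, []).append(score)
    background = {bd: mean(values) for bd, values in by_bd.items()}
    return {pair: score - background[pair[0]] for pair, score in scores.items()}

def initial_positives(scores, threshold=1.0):
    """Keep interactions whose background-corrected score is >= threshold."""
    corrected = background_corrected(scores)
    return {pair: value for pair, value in corrected.items() if value >= threshold}

# Toy data: bd1 has one strong interaction, bd2 is a uniform self-activator.
raw = {
    ("bd1", "ad1"): 5.0, ("bd1", "ad2"): 1.0, ("bd1", "ad3"): 0.0,
    ("bd2", "ad1"): 2.0, ("bd2", "ad2"): 2.0, ("bd2", "ad3"): 2.0,
}
positives = initial_positives(raw)
print(positives)
```

In this toy case the uniformly active bd2 contributes no positives, which is the purpose of the correction.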

Assignment of confidence scores

Confidence scores were determined for each interaction based on methods described by Bader and colleagues [24, 25]. We fit a generalized linear model [66] using experimental and topological attributes of yeast two-hybrid interactions, including the number of interactions for each protein in a pair and the Leu and lacZ reporter activities. Fitting the model required both positive and negative training sets. Because a reference set of known interactions is not available for C. jejuni, we derived a set of positive training data (85 interactions total) by assuming that the conserved interactions (reciprocal best match interologs) in common with either the E. coli low-throughput interaction set [28], the H. pylori yeast two-hybrid set [11], or the E. coli protein complex set [1] are likely to be true positives. We derived a set of likely true negatives (111 total) for the negative training data by considering interactions between proteins whose orthologs in E. coli or H. pylori were separated in the respective interaction maps by greater than the average distance of all pairs (≥ 4). Positive and negative training cases were weighted inversely to the number of interactions in each set. When training sets are weighted this way, a confidence score greater than 0.5 means that available data and features support that a specific interaction has a better than random chance to be a true interaction; this allows 0.5 to be used as the threshold between high and low confidence interactions. Validation using protein features not used in the scoring system supports the choice of 0.5 as a threshold for higher confidence interactions (discussed further in Additional data file 14; see also Figure 2c). Of the attributes tested, the numbers of interactions per protein were found to be negative predictors of biologically relevant interactions, while reporter activities were positive predictors. To evaluate the scoring model, we performed a stratified five-fold cross-validation.
Cross-validation reported a precision of 91.4% and a recall of 78.9%, which gave us confidence that it is a reasonably well-fitted model. We then used the full sets of positives and negatives in training and obtained our final logistic model. The final model was used to compute confidence scores for 16,104 initial positive interactions prior to retesting. Of these, 3,209 scored higher than 0.5, which we define as the high confidence set. Of the interactions with high confidence scores (> 0.5), 90% corresponded to interactions that repeated when retested, while only 68% of the low confidence interactions repeated. Further discussion and details of the confidence scoring system are available in Additional data file 14.
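
To make the weighting scheme concrete, the following sketch fits a small logistic model in which each training case is weighted inversely to the size of its class, so that positives and negatives carry equal total weight. It is a toy under loud assumptions, not the model used here: the two features (a reporter score and an interaction count), the training values, the plain gradient-descent fit and all function names are hypothetical, and cross-validation is omitted.

```python
import math

def class_weights(labels):
    """Weight each case by 1 / (size of its class); each class then sums to 1."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return [1.0 / n_pos if y == 1 else 1.0 / n_neg for y in labels]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def confidence(beta, x):
    """Predicted probability for feature vector x; beta[0] is the intercept."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))

def fit_weighted_logistic(X, y, weights, lr=0.05, epochs=5000):
    """Minimise the weighted logistic loss by batch gradient descent."""
    beta = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(beta)
        for xi, yi, wi in zip(X, y, weights):
            err = wi * (confidence(beta, xi) - yi)
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b - lr * g for b, g in zip(beta, grad)]
    return beta

# Hypothetical training cases: (reporter score, interactions per protein).
X = [(4, 1), (5, 2), (3, 1), (1, 6), (2, 8), (1, 5), (2, 7), (1, 9)]
y = [1, 1, 1, 0, 0, 0, 0, 0]
beta = fit_weighted_logistic(X, y, class_weights(y))
print(confidence(beta, (5, 2)), confidence(beta, (1, 9)))
```

Because both classes have the same total weight, a score of 0.5 corresponds to even odds between the two training sets, which is the rationale given above for using 0.5 as the threshold.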

Evaluating the confidence score model

Main role annotations 'mainrole' were downloaded from [67]. Excluding self-interactions, out of the 3,209 high confidence interactions, 2,599 have 'mainrole' annotations, and 454 share at least one 'mainrole' annotation. We generated 5,000 groups of 2,599 randomly selected interactions that have 'mainrole' annotations and have a confidence score lower than 0.5. The number of pairs in each set that share 'mainrole' annotations was counted. The distribution was plotted in a histogram and compared with the high confidence set (Figure 2b). To examine whether high confidence interactions tend to share more detailed GO [27] annotations, we grouped interactions into confidence bins so that each bin contains only interactions with scores falling into a specific range. For each interaction, we determined the deepest level of GO biological process annotations shared by the pair of genes, and calculated the average depth of shared biological process for each group. Since GO for C. jejuni NCTC11168 was not available, we used annotations for best match orthologs of C. jejuni RM1221 genes [68]. Figure 2c shows that there is a general pattern of increased depth of shared GO terms for interactions with confidence score higher than 0.5. This fact also suggests that our choice of 0.5 as a high confidence threshold is meaningful.
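
The random-group comparison can be sketched as below: draw many equally sized groups of low confidence interactions and count, in each, the pairs that share an annotation. This is an illustrative sketch with hypothetical names and toy data. The empirical p value at the end is an added convenience that is not described in the text, where the distribution is compared with the high confidence set graphically.

```python
import random

def count_shared(pairs, annotations):
    """Number of pairs whose two proteins share at least one annotation."""
    return sum(1 for a, b in pairs if annotations[a] & annotations[b])

def null_distribution(low_conf_pairs, annotations, group_size, n_groups, seed=0):
    """Shared-annotation counts for randomly drawn groups of low confidence pairs."""
    rng = random.Random(seed)
    return [count_shared(rng.sample(low_conf_pairs, group_size), annotations)
            for _ in range(n_groups)]

def empirical_p(observed, null_counts):
    """Fraction of random groups with a count at least as large as observed."""
    return sum(1 for c in null_counts if c >= observed) / len(null_counts)

# Toy annotations (sets of 'mainrole' labels) and interactions.
annotations = {"p1": {"x"}, "p2": {"x"}, "p3": {"y"}, "p4": {"z"}, "p5": {"w"}}
high_conf = [("p1", "p2")]
low_conf = [("p1", "p3"), ("p3", "p4"), ("p4", "p5"), ("p2", "p5")]

observed = count_shared(high_conf, annotations)
null_counts = null_distribution(low_conf, annotations, group_size=2, n_groups=50)
print(observed, empirical_p(observed, null_counts))
```

With the real data, `group_size` would be 2,599 and `n_groups` 5,000, matching the numbers given above.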

Assessment of functional enrichments

The frequency of each GO description from the iProClass database [69], amongst all of the proteins comprising the proteome was determined and compared to their frequency within the CampyYTH v3.1 dataset or the high confidence subset (Additional data file 3). A similar analysis was performed using the functional classifications assigned by the Sanger Institute [26] (Additional data file 2). We also looked for pairs of GO annotations that were enriched in the interaction data (Additional data file 10). To do this we counted the number of interactions having a specific pair of GO terms. We mapped the annotations to level 5; that is, for a protein with GO annotation A that is at a deeper level than 5, we mapped A to level 5 using 'parent' and 'part of' relationships in the ontologies, and we discarded A if it was above level 5. Self-interactions were excluded from the analysis. We did the same for all GO terms annotated to a protein. To compute the significance of finding specific GO pairs, we generated 2,000 random networks by randomly switching pairs of links while maintaining the degree distribution of the original map, and counted the number of times we found each GO pair in each randomized network. For each GO pair, a p value was computed based on the distribution of the 2,000 counts (assuming normal distribution) and the count in the original yeast two-hybrid map. The p value represents the probability of seeing such a pair in a random network. We listed only pairs with a p value less than 5%.
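
The two computational steps in this paragraph, degree-preserving link switching and a p value from a normal approximation, can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function names, the rejection rules for invalid swaps, the cap on attempts and the one-sided upper tail are assumptions.

```python
import math
import random
from statistics import mean, stdev

def switch_randomize(edges, n_switches, seed=0):
    """Randomize an undirected network by switching the endpoints of edge pairs.

    Each accepted switch turns (a-b, c-d) into (a-d, c-b), so every node keeps
    its degree. Needs at least two edges.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_switches and attempts < 100 * n_switches:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        # Reject switches that would create a self-interaction or a duplicate link.
        if len(new1) < 2 or len(new2) < 2 or new1 == new2:
            continue
        if new1 in present or new2 in present:
            continue
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges

def normal_upper_p(observed, random_counts):
    """Upper-tail p value of `observed` under a normal fit to the random counts."""
    mu, sd = mean(random_counts), stdev(random_counts)
    if sd == 0:
        return 0.0 if observed > mu else 1.0
    z = (observed - mu) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

network = [("a", "b"), ("c", "d"), ("e", "f"), ("a", "c"), ("b", "e")]
shuffled = switch_randomize(network, n_switches=10)
print(shuffled)
```

Counting a given GO pair in each of many such shuffled networks yields the list that `normal_upper_p` expects as `random_counts`.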

Comparative network analysis

Additional details are in Additional data file 14. Protein-protein interactions from C. jejuni were compared with those from E. coli [1], H. pylori [11] and S. cerevisiae from DIP [28]. Corresponding protein sequences were obtained from the following sources: C. jejuni NCTC11168 [26], E. coli [70], H. pylori [71] and S. cerevisiae [72]. We used NetworkBlast to identify significant conserved protein-protein interaction subnetworks [34]. A stand-alone Java version of the program is available at [73]. Briefly, the algorithm takes as input a pair of protein-protein interaction networks, one for each of two species, along with a set of homology relationships between the proteins of the two networks. We constructed the homology relationships from an all-versus-all BLAST of the complete set of protein sequences for each of the two species, taking the top 10 hits with an E-value cutoff of 10⁻¹⁰. Next, a network alignment graph was created where each node represents a homologous pair of proteins from species 1 and 2 (for example, a1 and a2) and each edge represents a conserved interaction (a1/a2 connects to b1/b2 if the a-b interaction is found in both species; interactions may be either direct (distance 1) or indirect (distance 2), in which a-b is connected through a common neighbor, that is, a-c-b). A greedy search is initiated from each node to identify conserved protein subnetworks, defined as dense subgraphs within the network alignment graph (of maximum size 15 proteins per species). When multiple subnetworks contain protein homologs that overlap by ≥ 50%, only the complex with the highest density was included in the final result. GO annotations [27] of proteins in each conserved complex were analyzed to identify significant functional enrichments (Additional data file 6).
We calculated a hypergeometric p value of enrichment for each GO annotation in the three divisions of the GO hierarchy and constrained the annotations by requiring that at least half of the proteins in a complex ascribe to the enrichment. The most specific annotations with hypergeometric p value < 0.05 in each of the three divisions were then assigned to each complex. A complete list of conserved complexes between C. jejuni and E. coli or S. cerevisiae is available for download at [73]. The significant conserved subnetworks provided predictions of 379 new C. jejuni protein-protein interactions not found in the two-hybrid screens (Additional data file 7). A protein pair (a, b) was predicted to interact directly if: first, both a and b were present in the same significant conserved complex; second, this pair was observed to interact indirectly in C. jejuni; and third, this pair corresponded to a direct interaction in the comparison species' network.
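
The three-part prediction rule at the end of this paragraph can be sketched as below. This is an illustration under simplifying assumptions, not NetworkBlast itself: each C. jejuni protein is given a single homolog, "indirect" is read as sharing a common neighbor without a direct link, and the function and argument names are hypothetical.

```python
from itertools import combinations

def adjacency(edges):
    """Neighbor sets for an undirected edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def predict_direct(cj_edges, other_edges, homolog, complexes):
    """Return C. jejuni pairs that satisfy all three conditions.

    1. both proteins are in the same conserved complex,
    2. they interact only indirectly (via a common neighbor) in C. jejuni,
    3. their homologs interact directly in the comparison species.
    """
    cj_adj = adjacency(cj_edges)
    other_adj = adjacency(other_edges)
    predicted = set()
    for members in complexes:
        for a, b in combinations(sorted(members), 2):
            na, nb = cj_adj.get(a, set()), cj_adj.get(b, set())
            indirect = b not in na and bool(na & nb)
            ha, hb = homolog.get(a), homolog.get(b)
            direct_in_other = ha is not None and hb in other_adj.get(ha, set())
            if indirect and direct_in_other:
                predicted.add((a, b))
    return predicted

# Toy example: a and b are linked through c in C. jejuni, and their
# homologs A and B interact directly in the comparison species.
cj_edges = [("a", "c"), ("c", "b")]
other_edges = [("A", "B")]
homolog = {"a": "A", "b": "B", "c": "C"}
predictions = predict_direct(cj_edges, other_edges, homolog, [{"a", "b", "c"}])
print(predictions)
```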

Clustering of conserved subnetworks

Since proteins can belong to more than one complex, we clustered the significant conserved subnetworks by protein membership, in effect 'superclustering' the interactions (Figure 5b). An n × m matrix was constructed, where n is the number of significant subnetworks and m is the number of unique proteins involved in any of the significant subnetworks. Using the open source tool ClustArray [74], we clustered the proteins hierarchically using the unweighted pair group method with arithmetic mean (UPGMA) and clustered the subnetworks with a combination k-means algorithm followed by UPGMA hierarchical clustering. The number of clusters k = 3 was chosen as the parameter that approximately minimized within-cluster variability and maximized between-cluster variability (data not shown). Identities of complexes and proteins are shown in the high resolution image of the hierarchical clustering in Additional data file 8. Lists of the proteins comprising complexes are available for download at [73].
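
The n × m membership matrix that serves as input to the clustering can be built as in this short sketch. Only the matrix construction is shown; the UPGMA and k-means steps performed with ClustArray are not reproduced, and the function name and toy subnetworks are hypothetical.

```python
def membership_matrix(subnetworks):
    """Rows are subnetworks, columns are proteins; 1 marks membership."""
    proteins = sorted(set().union(*subnetworks))
    matrix = [[1 if p in members else 0 for p in proteins]
              for members in subnetworks]
    return proteins, matrix

subnetworks = [{"A", "B"}, {"B", "C"}]
proteins, matrix = membership_matrix(subnetworks)
print(proteins)
for row in matrix:
    print(row)
```

Protein B appears in both rows, which is the kind of overlap that motivates clustering the subnetworks by membership.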

Essential gene analysis and network assembly

We generated lists of putative C. jejuni NCTC11168 essential proteins by identifying reciprocal best match orthologs of likely essential proteins from B. subtilis [75] and E. coli [76]. We removed genes from our putative essential list if viable null mutants have been reported (Dr. B. Wren, personal communication). To examine the relationship between essentiality and centrality in the interaction map, we computed the numbers of essential and non-essential proteins in groups having the same number of interactions (degree) in the higher confidence dataset (interactions with confidence scores > 0.5). The result is shown in Figure 7, where r values in the graphs represent Pearson correlation coefficients between the fractions and the degrees. Figure 7 shows that there is a correlation between degree of proteins and the likelihood of being essential. A similar result was obtained with the entire dataset CampyYTH v3.1 (not shown). Lastly, we computed the fraction of essential and non-essential neighbors of each essential protein and compared this to the fraction for random groups of proteins (of the same size as the set of essential proteins). The results shown in Additional data file 11 indicate that essential genes tend to have more neighbors that are also essential; p values indicate the probability of seeing the real fraction (the red dot) by chance.
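
The degree versus essentiality comparison can be sketched as follows: group proteins by degree, compute the essential fraction in each group, and correlate the fractions with the degrees. The function names and the toy data are hypothetical, and the values are chosen only to make the arithmetic easy to follow.

```python
import math
from collections import defaultdict

def essential_fraction_by_degree(degree, essential):
    """Map each degree to the fraction of proteins of that degree that are essential."""
    totals, hits = defaultdict(int), defaultdict(int)
    for protein, k in degree.items():
        totals[k] += 1
        if protein in essential:
            hits[k] += 1
    return {k: hits[k] / totals[k] for k in sorted(totals)}

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Toy network summary: protein -> number of interactions.
degree = {"p1": 1, "p2": 1, "p3": 2, "p4": 2, "p5": 3}
essential = {"p3", "p5"}

fractions = essential_fraction_by_degree(degree, essential)
r = pearson_r(list(fractions.keys()), list(fractions.values()))
print(fractions, r)
```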


Translation termination occurs when the elongating ribosome encounters a stop signal on mRNA (for a review see Kisselev and Buckingham, 2000 ). Two class 1 protein release factors (RFs), with different but overlapping specificity of recognition, are required for termination in bacteria: RF1 recognizes UAG and UAA, and RF2 recognizes UAA and UGA. In eukarya and archaea, a single protein, eRF1 and aRF1, respectively, recognizes all three stop codons. How the stop codons are recognized in any of these systems remains poorly understood, although a tripeptide motif has been suggested to define the identity of class 1 peptide RFs from bacteria ( Ito et al., 2000 ).
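
The overlapping specificities of the two bacterial factors can be written out as a small lookup. This is only a restatement of the sentence above in code; the names `RECOGNIZED` and `factors_for` are hypothetical.

```python
# Stop codons recognized by each bacterial class 1 release factor, as stated in the text
RECOGNIZED = {
    "RF1": {"UAG", "UAA"},
    "RF2": {"UAA", "UGA"},
}

def factors_for(stop_codon):
    """Return the release factors that recognize the given stop codon."""
    codon = stop_codon.upper()
    return sorted(rf for rf, codons in RECOGNIZED.items() if codon in codons)

shared = factors_for("UAA")   # recognized by both factors
rf1_only = factors_for("UAG")
rf2_only = factors_for("UGA")
```

The lookup makes the overlap explicit: UAA is read by both factors, while UAG and UGA each depend on a single factor.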

Eukaryotic and archaeal RFs are clearly homologous, but their primary and secondary structures differ radically from their bacterial counterparts. In fact, it is now recognized that the RFs from all kingdoms share only one sequence motif. This is the universal Gly–Gly–Gln (GGQ) tripeptide, flanked by sequences that are particularly rich in basic amino acids in eRF1 and aRF1 ( Frolova et al., 1999 ). Substitutions affecting the glycine residues in GGQ lead to almost complete loss of release activity in both human ( Seit Nebi et al., 2000 ) and Escherichia coli (L.Mora and A.Zavialov, unpublished results) RFs. In contrast, the glutamine residue can be changed to certain other amino acids with retention of partial release activity in vitro ( Seit Nebi et al., 2000; L.Mora, unpublished results). However, none of these mutants substituted at the glutamine position is able to complement RF1 or RF2 thermosensitive mutants in E.coli in vivo (L.Mora, unpublished results), or substitute for eRF1 in Saccharomyces cerevisiae ( Song et al., 2000 ).

The glutamine residue in the GGQ motif is modified post-translationally to N5-methylglutamine in both RF1 and RF2 in E.coli ( Dinçbas-Renqvist et al., 2000 ). This modification has an important stimulatory effect on the release activity of RF2, and explains earlier observations of a striking negative correlation between the specific activity of RF2 and the degree of overproduction of the factor ( Tate et al., 1993 ). Overproduction of E.coli RF2 leads to a non-modified protein, presumably due to insufficient activity of the RF methyltransferase (MTase), and such overproduction of poorly active RF2 is highly inhibitory to growth in E.coli K12 strains ( Uno et al., 1996; Dinçbas-Renqvist et al., 2000 ). These experiments have also revealed a striking functional interplay between the methylation of Gln252 in RF2 and the nature of the amino acid at position 246, four residues from the GGQ motif towards the N-terminus. The activity of RF2 in K12 strains is low compared with other E.coli strains due to the presence of a threonine residue at position 246, in place of an alanine or serine residue found in all other bacterial RFs. In vitro, the losses in activity due to the lack of glutamine methylation and to the presence of Thr246 instead of Ala246 are cumulative. Thus, overproduction of RF2 Ala246 has little inhibitory effect on cell growth ( Uno et al., 1996; Dinçbas-Renqvist et al., 2000 ).

So far, both the extent of RF methylation among different organisms and the identity of the RF MTase are unknown. A single instance of glutamine transmethylation has been reported in the literature, involving ribosomal protein L3 in E.coli ( Lhoest and Colson, 1977; Colson et al., 1979 ).

Here, we identify the MTases for both ribosomal protein L3 and the class 1 RF in E.coli. The L3 MTase is encoded by yfcB, and the RF MTase is encoded by hemK. The gene hemK, situated immediately downstream of and co-expressed with prfA (i.e. the gene for RF1), was suggested initially to encode a protoporphyrinogen oxidase ( Nakayashiki et al., 1995 ). YfcB and HemK are the first N5-glutamine MTases to be identified.


"Saccharomyces" derives from Latinized Greek and means "sugar-mold" or "sugar-fungus", saccharon (σάκχαρον) being the combining form "sugar" and myces (μύκης) being "fungus". [4] [5] cerevisiae comes from Latin and means "of beer". [6] Other names for the organism are:

  • Brewer's yeast, though other species are also used in brewing [7]
  • Ale yeast
  • Top-fermenting yeast
  • Baker's yeast[7]
  • Ragi yeast, in connection to making tapai
  • Budding yeast

This species is also the main source of nutritional yeast and yeast extract.

In the 19th century, bread bakers obtained their yeast from beer brewers, and this led to sweet-fermented breads such as the Imperial "Kaisersemmel" roll, [8] which in general lacked the sourness created by the acidification typical of Lactobacillus. However, beer brewers slowly switched from top-fermenting (S. cerevisiae) to bottom-fermenting (S. pastorianus) yeast. The Vienna Process was developed in 1846. [9] While the innovation is often popularly credited for using steam in baking ovens, leading to a different crust characteristic, it is notable for including procedures for high milling of grains (see Vienna grits [10] ), cracking them incrementally instead of mashing them with one pass as well as better processes for growing and harvesting top-fermenting yeasts, known as press-yeast. [ citation needed ]

Refinements in microbiology following the work of Louis Pasteur led to more advanced methods of culturing pure strains. In 1879, Great Britain introduced specialized growing vats for the production of S. cerevisiae, and in the United States around the turn of the century centrifuges were used for concentrating the yeast, [11] making modern commercial yeast possible, and turning yeast production into a major industrial endeavor. The slurry yeast made by small bakers and grocery shops became cream yeast, a suspension of live yeast cells in growth medium, and then compressed yeast, the fresh cake yeast that became the standard leaven for bread bakers in much of the Westernized world during the early 20th century. [ citation needed ]

During World War II, Fleischmann's developed a granulated active dry yeast for the United States armed forces, which did not require refrigeration and had a longer shelf-life and better temperature tolerance than fresh yeast; it is still the standard yeast for US military recipes. The company created yeast that would rise twice as fast, cutting down on baking time. Lesaffre would later create instant yeast in the 1970s, which has gained considerable use and market share at the expense of both fresh and dry yeast in their various applications. [ citation needed ]

Ecology

In nature, yeast cells are found primarily on ripe fruits such as grapes (before maturation, grapes are almost free of yeasts). [12] Since S. cerevisiae is not airborne, it requires a vector to move. [ citation needed ]

Queens of social wasps overwintering as adults (Vespa crabro and Polistes spp.) can harbor yeast cells from autumn to spring and transmit them to their progeny. [13] The intestine of Polistes dominula, a social wasp, hosts S. cerevisiae strains as well as S. cerevisiae × S. paradoxus hybrids. Stefanini et al. (2016) showed that the intestine of Polistes dominula favors the mating of S. cerevisiae strains, both among themselves and with S. paradoxus cells by providing environmental conditions prompting cell sporulation and spores germination. [14]

The optimum temperature for growth of S. cerevisiae is 30–35 °C (86–95 °F). [13]

Life cycle

Two forms of yeast cells can survive and grow: haploid and diploid. The haploid cells undergo a simple lifecycle of mitosis and growth, and under conditions of high stress will, in general, die. This is the asexual form of the fungus. The diploid cells (the preferential 'form' of yeast) similarly undergo a simple lifecycle of mitosis and growth. The rate at which the mitotic cell cycle progresses often differs substantially between haploid and diploid cells. [15] Under conditions of stress, diploid cells can undergo sporulation, entering meiosis and producing four haploid spores, which can subsequently mate. This is the sexual form of the fungus. Under optimal conditions, yeast cells can double their population every 100 minutes. [16] [17] However, growth rates vary enormously both between strains and between environments. [18] Mean replicative lifespan is about 26 cell divisions. [19] [20]
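
As a rough worked example of the doubling figure above, the sketch below assumes ideal exponential growth with no lag phase and no nutrient limitation. That is an idealization introduced here, not a claim from the text, and the function names are hypothetical.

```python
from math import log2

def population_after(n0, minutes, doubling_time_min=100.0):
    """Cell count after `minutes`, assuming unrestricted exponential growth."""
    return n0 * 2 ** (minutes / doubling_time_min)

def minutes_to_reach(n0, target, doubling_time_min=100.0):
    """Time needed to grow from n0 cells to `target` cells under the same assumption."""
    return doubling_time_min * log2(target / n0)

cells = population_after(1000, 300)     # three doublings of 1000 cells
elapsed = minutes_to_reach(1000, 8000)  # inverse of the calculation above
```

Three doubling times turn 1,000 cells into 8,000; as the text notes, real growth rates vary widely between strains and environments, so the 100-minute default is only the optimal-condition figure.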

In the wild, recessive deleterious mutations accumulate during long periods of asexual reproduction of diploids, and are purged during selfing: this purging has been termed "genome renewal". [21] [22]

Nutritional requirements

All strains of S. cerevisiae can grow aerobically on glucose, maltose, and trehalose and fail to grow on lactose and cellobiose. However, growth on other sugars is variable. Galactose and fructose are shown to be two of the best fermenting sugars. The ability of yeasts to use different sugars can differ depending on whether they are grown aerobically or anaerobically. Some strains cannot grow anaerobically on sucrose and trehalose.

All strains can use ammonia and urea as the sole nitrogen source, but cannot use nitrate, since they lack the ability to reduce it to ammonium ions. They can also use most amino acids, small peptides, and nitrogen bases as nitrogen sources. Histidine, glycine, cystine, and lysine are, however, not readily used. S. cerevisiae does not excrete proteases, so extracellular protein cannot be metabolized.

Yeasts also have a requirement for phosphorus, which is assimilated as a dihydrogen phosphate ion, and sulfur, which can be assimilated as a sulfate ion or as organic sulfur compounds such as the amino acids methionine and cysteine. Some metals, like magnesium, iron, calcium, and zinc, are also required for good growth of the yeast.

Concerning organic requirements, most strains of S. cerevisiae require biotin. Indeed, a S. cerevisiae-based growth assay laid the foundation for the isolation, crystallisation, and later structural determination of biotin. Most strains also require pantothenate for full growth. In general, S. cerevisiae is prototrophic for vitamins.

Mating

Yeast has two mating types, a and α (alpha), which show primitive aspects of sex differentiation. [23] As in many other eukaryotes, mating leads to genetic recombination, i.e. production of novel combinations of chromosomes. Two haploid yeast cells of opposite mating type can mate to form diploid cells that can either sporulate to form another generation of haploid cells or continue to exist as diploid cells. Mating has been exploited by biologists as a tool to combine genes, plasmids, or proteins at will. [ citation needed ]

The mating pathway employs a G protein-coupled receptor, G protein, RGS protein, and three-tiered MAPK signaling cascade that is homologous to those found in humans. This feature has been exploited by biologists to investigate basic mechanisms of signal transduction and desensitization. [ citation needed ]

Cell cycle

Growth in yeast is synchronised with the growth of the bud, which reaches the size of the mature cell by the time it separates from the parent cell. In well nourished, rapidly growing yeast cultures, all the cells have buds, since bud formation occupies the whole cell cycle. Both mother and daughter cells can initiate bud formation before cell separation has occurred. In yeast cultures growing more slowly, cells lacking buds can be seen, and bud formation only occupies a part of the cell cycle. [ citation needed ]

Cytokinesis

Cytokinesis enables budding yeast Saccharomyces cerevisiae to divide into two daughter cells. S. cerevisiae forms a bud which can grow throughout its cell cycle and later leaves its mother cell when mitosis has completed. [24]

S. cerevisiae is relevant to cell cycle studies because it divides asymmetrically by using a polarized cell to make two daughters with different fates and sizes. Similarly, stem cells use asymmetric division for self-renewal and differentiation. [25]

Timing

For many cells, M phase does not happen until S phase is complete. However, this is not true for entry into mitosis in S. cerevisiae. Cytokinesis begins with the budding process in late G1 and is not completed until about halfway through the next cycle. The assembly of the spindle can happen before S phase has finished duplicating the chromosomes. [24] Additionally, there is no clearly defined G2 between M and S. Thus, S. cerevisiae lacks the extensive regulation present in higher eukaryotes. [24]

When the daughter emerges, the daughter is two-thirds the size of the mother. [26] Throughout the process, the mother displays little to no change in size. [27] The RAM pathway is activated in the daughter cell immediately after cytokinesis is complete. This pathway makes sure that the daughter has separated properly. [26]

Actomyosin ring and primary septum formation

Two interdependent events drive cytokinesis in S. cerevisiae. The first event is contractile actomyosin ring (AMR) constriction and the second event is formation of the primary septum (PS), a chitinous cell wall structure that can only be formed during cytokinesis. PS formation resembles the process of extracellular matrix remodeling in animals. [26] When the AMR constricts, the PS begins to grow. Disrupting the AMR misorients the PS, and disrupting the PS likewise disrupts the AMR, suggesting that the actomyosin ring and primary septum are interdependent. [28] [27]

The AMR, which is attached to the cell membrane facing the cytosol, consists of actin and myosin II molecules that coordinate the cells to split. [24] The ring is thought to play an important role in ingression of the plasma membrane as a contractile force. [ citation needed ]

Proper coordination and correct positional assembly of the contractile ring depend on septins, which are the precursors to the septum ring. These GTPases assemble complexes with other proteins. The septins form a ring at the site where the bud will be created during late G1. They help promote the formation of the actin-myosin ring, although the mechanism is unknown. It is suggested they help provide structural support for other necessary cytokinesis processes. [24] After a bud emerges, the septin ring forms an hourglass. The septin hourglass and the myosin ring together are the beginning of the future division site. [ citation needed ]

The septin and AMR complex progresses to form the primary septum, consisting of glucans and other chitinous molecules sent by vesicles from the Golgi body. [29] After AMR constriction is complete, two secondary septa are formed by glucans. How the AMR disassembles remains poorly understood. [25]

Microtubules do not play as significant a role in cytokinesis as the AMR and septum. Disruption of microtubules did not significantly impair polarized growth. [30] Thus, the AMR and septum formation are the major drivers of cytokinesis. [ citation needed ]

Differences from fission yeast
  • Budding yeast form a bud from the mother cell. This bud grows during the cell cycle and detaches; fission yeast divide by forming a cell wall [24]
  • Cytokinesis begins at G1 for budding yeast, while cytokinesis begins at G2 for fission yeast. Fission yeast “select” the midpoint, whereas budding yeast “select” a bud site [31]
  • In budding yeast, the actomyosin ring and septum continue to develop during early anaphase, whereas in fission yeast the actomyosin ring begins to develop during metaphase-anaphase [31]

Model organism

When researchers look for an organism to use in their studies, they look for several traits. Among these are size, generation time, accessibility, manipulation, genetics, conservation of mechanisms, and potential economic benefit. The yeast species S. pombe and S. cerevisiae are both well studied; these two species diverged approximately 300 to 600 million years ago, and are significant tools in the study of DNA damage and repair mechanisms. [32]

S. cerevisiae has developed as a model organism because it scores favorably on a number of these criteria.

  • As a single-cell organism, S. cerevisiae is small with a short generation time (doubling time 1.25–2 hours [33] at 30 °C or 86 °F) and can be easily cultured. These are all positive characteristics in that they allow for the swift production and maintenance of multiple specimen lines at low cost.
  • S. cerevisiae divides with meiosis, allowing it to be a candidate for sexual genetics research.
  • S. cerevisiae can be transformed allowing for either the addition of new genes or deletion through homologous recombination. Furthermore, the ability to grow S. cerevisiae as a haploid simplifies the creation of gene knockout strains.
  • As a eukaryote, S. cerevisiae shares the complex internal cell structure of plants and animals without the high percentage of non-coding DNA that can confound research in higher eukaryotes.
  • S. cerevisiae research is a strong economic driver, at least initially, as a result of its established use in industry.

In the study of aging

For more than five decades S. cerevisiae has been studied as a model organism to better understand aging and has contributed to the identification of more mammalian genes affecting aging than any other model organism. [34] Topics studied using yeast include calorie restriction and the genes and cellular pathways involved in senescence. The two most common methods of measuring aging in yeast are Replicative Life Span (RLS), which measures the number of times a cell divides, and Chronological Life Span (CLS), which measures how long a cell can survive in a non-dividing stasis state. [34] Limiting the amount of glucose or amino acids in the growth medium has been shown to increase RLS and CLS in yeast as well as other organisms. [35] At first, this was thought to increase RLS by up-regulating the sir2 enzyme; however, it was later discovered that this effect is independent of sir2. Over-expression of the genes sir2 and fob1 has been shown to increase RLS by preventing the accumulation of extrachromosomal rDNA circles, which are thought to be one of the causes of senescence in yeast. [35] The effects of dietary restriction may be the result of decreased signaling in the TOR cellular pathway. [34] This pathway modulates the cell's response to nutrients, and mutations that decrease TOR activity were found to increase CLS and RLS. [34] [35] This has also been shown to be the case in other animals. [34] [35] A yeast mutant lacking the genes sch9 and ras2 has recently been shown to have a tenfold increase in chronological lifespan under conditions of calorie restriction; this is the largest increase achieved in any organism. [36] [37]

Mother cells give rise to progeny buds by mitotic divisions, but undergo replicative aging over successive generations and ultimately die. However, when a mother cell undergoes meiosis and gametogenesis, lifespan is reset. [38] The replicative potential of gametes (spores) formed by aged cells is the same as gametes formed by young cells, indicating that age-associated damage is removed by meiosis from aged mother cells. This observation suggests that during meiosis removal of age-associated damages leads to rejuvenation. However, the nature of these damages remains to be established.

During starvation of non-replicating S. cerevisiae cells, reactive oxygen species increase leading to the accumulation of DNA damages such as apurinic/apyrimidinic sites and double-strand breaks. [39] Also in non-replicating cells the ability to repair endogenous double-strand breaks declines during chronological aging. [40]

Meiosis, recombination and DNA repair

S. cerevisiae reproduces by mitosis as diploid cells when nutrients are abundant. However, when starved, these cells undergo meiosis to form haploid spores. [41]

Evidence from studies of S. cerevisiae bears on the adaptive function of meiosis and recombination. Mutations defective in genes essential for meiotic and mitotic recombination in S. cerevisiae cause increased sensitivity to radiation or DNA damaging chemicals. [42] [43] For instance, gene rad52 is required for both meiotic recombination [44] and mitotic recombination. [45] Rad52 mutants have increased sensitivity to killing by X-rays, methyl methanesulfonate and the DNA cross-linking agent 8-methoxypsoralen-plus-UVA, and show reduced meiotic recombination. [43] [44] [46] These findings suggest that recombination repair during meiosis and mitosis is needed for repair of the different damages caused by these agents.

Ruderfer et al. [42] (2006) analyzed the ancestry of natural S. cerevisiae strains and concluded that outcrossing occurs only about once every 50,000 cell divisions. Thus, it appears that in nature, mating is likely most often between closely related yeast cells. Mating occurs when haploid cells of opposite mating type MATa and MATα come into contact. Ruderfer et al. [42] pointed out that such contacts are frequent between closely related yeast cells for two reasons. The first is that cells of opposite mating type are present together in the same ascus, the sac that contains the cells directly produced by a single meiosis, and these cells can mate with each other. The second reason is that haploid cells of one mating type, upon cell division, often produce cells of the opposite mating type with which they can mate. The relative rarity in nature of meiotic events that result from outcrossing is inconsistent with the idea that production of genetic variation is the main selective force maintaining meiosis in this organism. However, this finding is consistent with the alternative idea that the main selective force maintaining meiosis is enhanced recombinational repair of DNA damage, [47] since this benefit is realized during each meiosis, whether or not out-crossing occurs.

Genome sequencing

S. cerevisiae was the first eukaryotic genome to be completely sequenced. [48] The genome sequence was released to the public domain on April 24, 1996. Since then, regular updates have been maintained at the Saccharomyces Genome Database. This database is a highly annotated and cross-referenced database for yeast researchers. Another important S. cerevisiae database is maintained by the Munich Information Center for Protein Sequences (MIPS). The S. cerevisiae genome is composed of about 12,156,677 base pairs and 6,275 genes, compactly organized on 16 chromosomes. [48] Only about 5,800 of these genes are believed to be functional. It is estimated at least 31% of yeast genes have homologs in the human genome. [49] Yeast genes are classified using gene symbols (such as sch9) or systematic names. In the latter case the 16 chromosomes of yeast are represented by the letters A to P, then the gene is further classified by a sequence number on the left or right arm of the chromosome, and a letter showing which of the two DNA strands contains its coding sequence. [50]

Systematic gene names for Baker's yeast (example gene name: YGL118W)
  • Y: the Y shows that this is a yeast gene
  • G: chromosome on which the gene is located
  • L: left or right arm of the chromosome
  • 118: sequence number of the gene/ORF on this arm, starting at the centromere
  • W: whether the coding sequence is on the Watson or Crick strand

  • YBR134C (aka SUP45 encoding eRF1, a translation termination factor) is located on the right arm of chromosome 2 and is the 134th open reading frame (ORF) on that arm, starting from the centromere. The coding sequence is on the Crick strand of the DNA.
  • YDL102W (aka POL3 encoding a subunit of DNA polymerase delta) is located on the left arm of chromosome 4; it is the 102nd ORF from the centromere and codes from the Watson strand of the DNA.
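
The naming scheme above is regular enough to decode mechanically. The following is a minimal sketch: the function name and the returned field names are hypothetical, and it assumes a three-digit ORF number and does not handle names that carry an extra suffix.

```python
import re
from string import ascii_uppercase

# Y, chromosome letter A-P, arm L/R, three-digit ORF number (assumed), strand W/C
SYSTEMATIC_NAME = re.compile(r"^Y([A-P])([LR])(\d{3})([WC])$")

def parse_systematic_name(name):
    """Decode a systematic yeast gene name such as 'YBR134C' into its parts."""
    match = SYSTEMATIC_NAME.match(name.upper())
    if match is None:
        raise ValueError(f"not a plain systematic gene name: {name!r}")
    chrom, arm, number, strand = match.groups()
    return {
        "chromosome": ascii_uppercase.index(chrom) + 1,  # A -> 1 ... P -> 16
        "arm": "left" if arm == "L" else "right",
        "orf_number": int(number),
        "strand": "Watson" if strand == "W" else "Crick",
    }

sup45 = parse_systematic_name("YBR134C")
pol3 = parse_systematic_name("YDL102W")
```

Applied to the two examples in the list, the sketch returns chromosome 2, right arm, ORF 134, Crick strand for YBR134C, and chromosome 4, left arm, ORF 102, Watson strand for YDL102W.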

Gene function and interactions

The availability of the S. cerevisiae genome sequence and a set of deletion mutants covering 90% of the yeast genome [51] has further enhanced the power of S. cerevisiae as a model for understanding the regulation of eukaryotic cells. A project underway to analyze the genetic interactions of all double-deletion mutants through synthetic genetic array analysis will take this research one step further. The goal is to form a functional map of the cell's processes.

As of 2010, the most comprehensive model of genetic interactions yet constructed contains "the interaction profiles for 75% of all genes in the Budding yeast". [52] This model was made from 5.4 million two-gene comparisons in which a double gene knockout for each combination of the genes studied was performed. The effect of the double knockout on the fitness of the cell was compared to the expected fitness. Expected fitness is determined from the sum of the results on fitness of single-gene knockouts for each compared gene. When there is a change in fitness from what is expected, the genes are presumed to interact with each other. This was tested by comparing the results to what was previously known. For example, the genes Par32, Ecm30, and Ubp15 had similar interaction profiles to genes involved in the Gap1-sorting module cellular process. Consistent with the results, these genes, when knocked out, disrupted that process, confirming that they are part of it. [52]

From this, 170,000 gene interactions were found and genes with similar interaction patterns were grouped together. Genes with similar genetic interaction profiles tend to be part of the same pathway or biological process. [53] This information was used to construct a global network of gene interactions organized by function. This network can be used to predict the function of uncharacterized genes based on the functions of genes they are grouped with. [52]
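
The comparison of observed and expected double-knockout fitness can be sketched as below. Several things here are assumptions for illustration: fitness is normalized so that wild type is 1.0, the "sum" in the description is read as adding the single-knockout fitness losses, and the threshold is an arbitrary hypothetical cutoff. A multiplicative expectation is another common choice and would give different scores.

```python
def expected_double_fitness(f_a, f_b):
    """Expected double-knockout fitness, adding the two single-knockout losses.

    Assumes wild-type fitness is 1.0 (an assumption of this sketch).
    """
    return 1.0 - ((1.0 - f_a) + (1.0 - f_b))

def interaction_score(f_ab, f_a, f_b):
    """Observed minus expected fitness; negative means sicker than expected."""
    return f_ab - expected_double_fitness(f_a, f_b)

def interacts(f_ab, f_a, f_b, threshold=0.1):
    """Call an interaction when the deviation exceeds a hypothetical threshold."""
    return abs(interaction_score(f_ab, f_a, f_b)) > threshold

# Toy fitness values, not measured data
score = interaction_score(f_ab=0.4, f_a=0.9, f_b=0.8)
neutral = interacts(f_ab=0.7, f_a=0.9, f_b=0.8)
```

With these toy numbers the expected double-mutant fitness is about 0.7, so an observed 0.4 scores about -0.3 (a negative interaction), whereas an observed 0.7 shows no deviation.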

Other tools in yeast research

Approaches that can be applied in many different fields of biological and medicinal science have been developed by yeast scientists. These include yeast two-hybrid for studying protein interactions and tetrad analysis. Other resources include a gene deletion library comprising 4,700 viable haploid single-gene deletion strains, a GFP fusion strain library used to study protein localisation, and a TAP tag library used to purify protein from yeast cell extracts. [ citation needed ]

Stanford University's yeast deletion project created knockout mutations of every gene in the S. cerevisiae genome to determine their function. [54]

Synthetic yeast genome project

The international Synthetic Yeast Genome Project (Sc2.0 or Saccharomyces cerevisiae version 2.0) aims to build an entirely designer, customizable, synthetic S. cerevisiae genome from scratch that is more stable than the wild type. In the synthetic genome all transposons, repetitive elements and many introns are removed, all UAG stop codons are replaced with UAA, and transfer RNA genes are moved to a novel neochromosome. As of March 2017, 6 of the 16 chromosomes have been synthesized and tested. No significant fitness defects have been found. [55]
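
The stop-codon swap mentioned above can be illustrated on a single coding sequence. This is a minimal sketch, not the project's actual design software: the function name is hypothetical, the input is assumed to be an in-frame DNA coding sequence (so UAG appears as TAG), and only the terminal codon is touched.

```python
def recode_terminal_stop(cds):
    """Replace a terminal TAG stop codon with TAA; leave other sequences unchanged."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if cds[-3:] == "TAG":
        return cds[:-3] + "TAA"
    return cds

recoded = recode_terminal_stop("ATGGCCTAG")       # ends in TAG, so it is recoded
untouched = recode_terminal_stop("ATGCTAGCATGA")  # TAG only out of frame, ends in TGA
```

Checking only the final in-frame codon avoids altering TAG triplets that occur out of frame inside the gene, which are not stop codons.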

Astrobiology

Among other microorganisms, a sample of living S. cerevisiae was included in the Living Interplanetary Flight Experiment, which would have completed a three-year interplanetary round-trip in a small capsule aboard the Russian Fobos-Grunt spacecraft, launched in late 2011. [56] [57] The goal was to test whether selected organisms could survive a few years in deep space by flying them through interplanetary space. The experiment would have tested one aspect of transpermia, the hypothesis that life could survive space travel, if protected inside rocks blasted by impact off one planet to land on another. [56] [57] [58] Fobos-Grunt's mission ended unsuccessfully, however, when it failed to escape low Earth orbit. The spacecraft along with its instruments fell into the Pacific Ocean in an uncontrolled re-entry on January 15, 2012. The next planned exposure mission in deep space using S. cerevisiae is BioSentinel. (see: List of microorganisms tested in outer space)

Brewing

Saccharomyces cerevisiae is used in brewing beer, when it is sometimes called a top-fermenting or top-cropping yeast. It is so called because during the fermentation process its hydrophobic surface causes the flocs to adhere to CO2 and rise to the top of the fermentation vessel. Top-fermenting yeasts are fermented at higher temperatures than the lager yeast Saccharomyces pastorianus, and the resulting beers have a different flavor than the same beverage fermented with a lager yeast. "Fruity esters" may be formed if the yeast undergoes temperatures near 21 °C (70 °F), or if the fermentation temperature of the beverage fluctuates during the process. Lager yeast normally ferments at a temperature of approximately 5 °C (41 °F), where Saccharomyces cerevisiae becomes dormant. A variant yeast known as Saccharomyces cerevisiae var. diastaticus is a beer spoiler which can cause secondary fermentations in packaged products. [59]

In May 2013, the Oregon legislature made S. cerevisiae the official state microbe in recognition of the impact craft beer brewing has had on the state economy and the state's identity. [60]

Baking

S. cerevisiae is used in baking; the carbon dioxide generated by the fermentation is used as a leavening agent in bread and other baked goods. Historically, this use was closely linked to the brewing industry's use of yeast, as bakers took or bought the barm or yeast-filled foam from brewing ale from the brewers (producing the barm cake); today, brewing and baking yeast strains are somewhat different.

Nutritional yeast

Saccharomyces cerevisiae is the main source of nutritional yeast, which is sold commercially as a food product. It is popular with vegans and vegetarians as an ingredient in cheese substitutes, or as a general food additive as a source of vitamins and minerals, especially amino acids and B-complex vitamins.

Uses in aquaria

Owing to the high cost of commercial CO2 cylinder systems, CO2 injection by yeast is one of the most popular DIY approaches followed by aquaculturists for providing CO2 to underwater aquatic plants. The yeast culture is, in general, maintained in plastic bottles, and typical systems provide one bubble every 3–7 seconds. Various approaches have been devised to allow proper absorption of the gas into the water. [61]

Saccharomyces cerevisiae is used as a probiotic in humans and animals. In particular, the strain Saccharomyces cerevisiae var. boulardii is industrially manufactured and clinically used as a medication.

Several clinical and experimental studies have shown that Saccharomyces cerevisiae var. boulardii is, to a lesser or greater extent, useful for prevention or treatment of several gastrointestinal diseases. [62] Moderate-quality evidence has shown that Saccharomyces cerevisiae var. boulardii reduces the risk of antibiotic-associated diarrhea both in adults [63] [62] [64] and in children [63] [62] and reduces the risk of adverse effects of Helicobacter pylori eradication therapy. [65] [62] [64] Some limited evidence also supports the efficacy of Saccharomyces cerevisiae var. boulardii in prevention (but not treatment) of traveler's diarrhea [62] [64] and, at least as an adjunct medication, in treatment of acute diarrhea in adults and children and of persistent diarrhea in children. [62] It may also reduce symptoms of allergic rhinitis. [66]

Administration of S. cerevisiae var. boulardii is considered generally safe. [64] In clinical trials it was well tolerated by patients, and the rate of adverse effects was similar to that in control groups (i.e. groups with placebo or no treatment). [63] No case of S. cerevisiae var. boulardii fungemia has been reported during clinical trials. [64]

In clinical practice, however, cases of fungemia caused by Saccharomyces cerevisiae var. boulardii have been reported. [64] [62] Patients with compromised immunity or those with central vascular catheters are at special risk. Some researchers have recommended not using Saccharomyces cerevisiae var. boulardii for treatment of such patients. [64] Others suggest only that caution must be exercised with its use in risk group patients. [62]

Saccharomyces cerevisiae has been shown to be an opportunistic human pathogen, though of relatively low virulence. [67] Despite widespread use of this microorganism at home and in industry, contact with it very rarely leads to infection. [68] Saccharomyces cerevisiae has been found in the skin, oral cavity, oropharynx, duodenal mucosa, digestive tract, and vagina of healthy humans [69] (one review found it to be reported for 6% of samples from human intestine [70] ). Some specialists consider S. cerevisiae to be a part of the normal microbiota of the gastrointestinal tract, the respiratory tract, and the vagina of humans, [71] while others believe that the species cannot be called a true commensal because it originates in food. [70] [72] The presence of S. cerevisiae in the human digestive system may be rather transient; [72] for example, experiments show that after oral administration to healthy individuals it is eliminated from the intestine within 5 days after the end of administration. [70] [68]

Under certain circumstances, such as weakened immunity, Saccharomyces cerevisiae can cause infection in humans. [68] [67] Studies show that it causes 0.45–1.06% of the cases of yeast-induced vaginitis. In some cases, women suffering from S. cerevisiae-induced vaginal infection were intimate partners of bakers, and the strain was found to be the same one that their partners used for baking. As of 1999, no cases of S. cerevisiae-induced vaginitis in women who worked in bakeries themselves had been reported in the scientific literature. Some cases were linked by researchers to the use of the yeast in home baking. [67] Cases of infection of the oral cavity and pharynx caused by S. cerevisiae are also known. [67]

Invasive and systemic infections

Occasionally Saccharomyces cerevisiae causes invasive infections (i.e. it gets into the bloodstream or another normally sterile body fluid, or into a deep-site tissue such as the lungs, liver or spleen) that can become systemic (involve multiple organs). Such conditions are life-threatening. [67] [72] More than 30% of cases of S. cerevisiae invasive infection lead to death even if treated. [72] S. cerevisiae invasive infections, however, are much rarer than invasive infections caused by Candida albicans [67] [73] even in patients weakened by cancer. [73] S. cerevisiae causes 1% to 3.6% of nosocomial cases of fungemia. [72] A comprehensive review of S. cerevisiae invasive infection cases found all patients to have at least one predisposing condition. [72]

Saccharomyces cerevisiae may enter the bloodstream or reach other deep sites of the body by translocation from the oral or enteral mucosa or through contamination of intravascular catheters (e.g. central venous catheters). [71] Intravascular catheters, antibiotic therapy and compromised immunity are major predisposing factors for S. cerevisiae invasive infection. [72]

A number of cases of fungemia were caused by intentional ingestion of living S. cerevisiae cultures for dietary or therapeutic reasons, including use of Saccharomyces boulardii (a strain of S. cerevisiae which is used as a probiotic for treatment of certain forms of diarrhea). [67] [72] Saccharomyces boulardii causes about 40% of cases of invasive Saccharomyces infections [72] and is more likely (in comparison to other S. cerevisiae strains) to cause invasive infection in humans without general problems with immunity, [72] though such an adverse effect is very rare relative to Saccharomyces boulardii therapeutic administration. [74]

S. boulardii may contaminate intravascular catheters through hands of medical personnel involved in administering probiotic preparations of S. boulardii to patients. [72]

Systemic infection usually occurs in patients who have their immunity compromised due to severe illness (HIV/AIDS, leukemia, other forms of cancer) or certain medical procedures (bone marrow transplantation, abdominal surgery). [67]

A case was reported in which a nodule was surgically excised from the lung of a man employed in the baking business, and examination of the tissue revealed the presence of Saccharomyces cerevisiae. Inhalation of dry baking yeast powder is presumed to be the source of infection in this case. [75] [72]

Virulence of different strains

Not all strains of Saccharomyces cerevisiae are equally virulent towards humans. Most environmental strains are not capable of growing at temperatures above 35 °C (i.e. at the body temperatures of humans and other mammals). Virulent strains, however, are capable of growing at least above 37 °C and often up to 39 °C (rarely up to 42 °C). [69] Some industrial strains are also capable of growing above 37 °C. [67] The European Food Safety Authority (as of 2017) requires that all S. cerevisiae strains capable of growth above 37 °C that are added to the food or feed chain in viable form must, to qualify as presumably safe, show no resistance to antimycotic drugs used for treatment of yeast infections. [76]

The ability to grow at elevated temperatures is an important factor in a strain's virulence, but not the only one. [69]


Preparation of E. coli cytosolic lysate

E. coli MC4100 cells were grown at 37°C in rich or minimum medium to exponential phase (OD600nm 0.4), as described [55]. Lysis was induced by dilution of the spheroplasts into an equal volume of 25°C hypo-osmotic lysis buffer (50 mM Tris-HCl (pH 8), 0.01% (w/v) Tween 20, 10 mM MgCl2, 25 U/ml benzonase, 2 mM Pefabloc (Roche), 10 mM glucose and 20 U/ml hexokinase (Roche)). The supernatant was cleared at 30,000 × g for 10 min.

Protein and peptide fractionation

See Supplementary Materials and Methods [see Additional file 1].

NanoLC-MS/MS Analysis

See Supplementary Materials and Methods [see Additional file 1].

Protein identification and abundance estimation

MS peak lists were created by scripts in Analyst QS (MDS-Sciex) or by Bioworks 3.1 (Thermoelectron) on the basis of the recorded fragmentation spectra and were submitted to the Mascot database searching engine (Matrix Sciences, London, UK) against the E. coli SwissProt database to identify proteins. The following search parameters were used in all Mascot searches: maximum of one missed trypsin cleavage, cysteine carbamidomethylation, methionine oxidation, peptide tolerance ± 0.2 Da for QSTAR data and ± 2.0 Da for LTQ data, MS/MS tolerance ± 0.2 Da for QSTAR data and ± 0.8 Da for LTQ data. All peptides with scores less than the identity threshold (p = 0.05) or a rank > 1 were automatically discarded. We also used the parent ion mass accuracy (mass deviation < 50 ppm for QSTAR data), the predicted retention times [56] (difference < 10 min), and protein molecular weight estimated from the gel slice as additional requirements for protein identification. Finally, using peptides within the above criteria, we only accepted proteins with two or more peptide hits. For decoy database searching, all peak lists were merged into two files to create QSTAR and LTQ peak lists. These merged peak lists were searched against a decoy database created by the Mascot script '' supplied by Matrix Sciences. The obtained false positive proteins from two searches were merged and the final false positive rate was estimated to be 4.26% for the final protein identification list (containing a total of 1103 proteins).

Protein abundance, expressed on the emPAI scale, was calculated using the number of observable peptides and the number of observed parent ions. To calculate the number of observable peptides per protein, proteins were digested in silico and the obtained peptide masses were compared with the scan range of the mass spectrometer. In addition, the expected retention times under our nanoLC conditions were calculated according to the procedure of Meek [56] and Sakamoto et al. [57] with our own coefficients based on results of approximately 1500 peptides. Peptides that were too hydrophilic or hydrophobic were eliminated. In-house software was used to calculate emPAI values; the program is accessible at the Keio University web site. Redundancy of unique parent ions in the entire dataset was removed and the number of unique parent ions per protein was counted. emPAI values were calculated as follows:

$$\mathrm{emPAI} = 10^{N_{\mathrm{obsd}} / N_{\mathrm{obsbl}}} - 1$$

where $N_{\mathrm{obsd}}$ and $N_{\mathrm{obsbl}}$ are the number of observed parent ions per protein and the number of observable peptides per protein, respectively.
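As a minimal sketch of this calculation, the snippet below uses the standard emPAI definition (10 raised to the ratio of observed parent ions to observable peptides, minus 1). The function name and the example counts are hypothetical and are not taken from the in-house software mentioned above.

```python
def empai(n_observed: int, n_observable: int) -> float:
    """Return emPAI = 10 ** (N_obsd / N_obsbl) - 1 for a single protein."""
    if n_observable <= 0:
        raise ValueError("number of observable peptides must be positive")
    return 10 ** (n_observed / n_observable) - 1


# Hypothetical counts: (observed unique parent ions, observable peptides).
example_counts = {"protein_A": (12, 20), "protein_B": (3, 30)}
empai_values = {
    name: empai(observed, observable)
    for name, (observed, observable) in example_counts.items()
}
```

Because the ratio sits in the exponent, a protein for which every observable peptide is seen exactly once gets an emPAI of 9, and a protein with no observed ions gets 0.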

Measurement of protein copy numbers per cell by isotope dilution

E. coli MC4100 cells were grown at 37°C in SILAC minimum medium containing Leu-D3 instead of Leu. A stock sample of unlabeled E. coli BW25113 cell pellet, including 59 enzymes with known amounts ranging from 9 to 70,000 copies per cell [33], was kindly provided by Drs. N. Sugiyama and K. Nakahigashi (Keio Univ). Based on total protein contents, these two samples were mixed at 1:1, 1:10 and 10:1, and were digested by trypsin. After desalting with C18-StageTip, each sample was analyzed with LC-MS/MS using QSTAR as described and was quantified by Mass Navigator version 1.2 (Mitsui Knowledge Industry, Tokyo, Japan). According to the dynamic range of the instrument, peptides with SILAC ratios of 0.1–10 were accepted for calculation of protein concentrations. A total of 40 proteins with at least two quantified peptides per protein were directly quantified from three samples.

Genome data

Amino acid sequences of all proteins identified in this study were obtained from Swiss-Prot [58]. Throughout this work the primary Swiss-Prot accession code in conjunction with the Swiss-Prot entry name are used as unique protein identifiers. Codon Adaptation Index values (CAI) according to the method of [52] were used as reported by [22]. Classification of E. coli genes into three groups – (E) genes essential for cell growth (essential), (N) those dispensable for cell growth (non-essential), and (U) those unknown to be essential or non-essential – was based on the comprehensive experimental analysis of [47]. In the latter work, 630 genes were identified as being essential and 3126 as being dispensable using a genetic fingerprinting technique. Data on predicted expression measure of E. coli proteins [40] were downloaded from the Stanford University web server. Proteins possessing significant sequence similarity (BLAST [59] E-value threshold 0.001) to one or several domains of known three-dimensional structure as classified in the SCOP database [41] were attributed to the corresponding SCOP fold. Assignment of genes to functional roles as defined by the MIPS functional catalog version 1.3 [60] was conducted manually at Biomax Informatics AG. Where necessary, correspondence between published protein datasets and the SwissProt database was established based on sequence identity (at least 98%), with some ambiguous cases resolved manually. Minor discrepancies such as a missing methionine at the sequence start or a single amino acid replacement were tolerated.

Coverage of the cytosolic protein content

To compare the coverage of our experimental cytosol sample with the theoretical protein content of cytosol we combined several recent sources of data as well as bioinformatics prediction techniques. For 13% (568 out of 4289) of E. coli proteins experimentally determined cellular localization information has been reported by Lopez-Campistrous et al. [34]. We further utilized the PSORT database [35] version 2.0 that provides localization annotation for 62% of the complete E. coli proteome (2678 proteins). The remaining E. coli proteins are classified in the PSORT database as "unknown" or "unknown with multiple possible localizations". We complemented this information with the number of transmembrane segments predicted using TMHMM [61] version 2.0.

Proteins with a high number of predicted transmembrane segments can be safely assumed not to be located within the cytosol. However, relying on TMHMM predictions alone may lead to an overprediction of cytosolic proteins, as this method allows reliable exclusion only of those proteins that have multiple integral membrane segments. Furthermore, the possibility of falsely predicted membrane segments needs to be considered. We therefore combined the three data sources described above – the number of transmembrane segments, PSORT localization, and experimental localization – to find the most accurate definition of the E. coli cytosol proteome. First we consider all proteins that have at most one predicted membrane region and are annotated as "cytosolic" or "unknown" in the PSORT database. This criterion predicts 61.46% (2636 of 4289) of the E. coli proteome to be cytosolic (Table 2). The advantage of this estimate is twofold. On the one hand, a false positive prediction of one membrane region is still tolerated and thus does not lead to loss of information. On the other hand, the intersection with the independent PSORT data ensures that overprediction of cytosolic proteins is avoided as much as possible. Finally, we extend this definition and add all proteins that were experimentally determined to be cytosolic. This results in 2680 proteins that we adopt as our final estimate of the E. coli cytosol proteome. It is notable that the experimental localization data hardly increase the number of defined cytosolic proteins (plus 1%, or only 44 of 2680). This shows the almost complete overlap of the first definition with the experimentally confirmed protein set and confirms the validity of our approach.
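The two-step rule described above can be sketched as a simple predicate. This is an illustration only: the `Protein` record, its field names and the exact label strings are assumptions, and the real PSORT and experimental annotations may use different vocabularies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Protein:
    accession: str
    tm_segments: int                    # predicted transmembrane segments (e.g. from TMHMM)
    psort: str                          # PSORT annotation label (label strings are assumed)
    experimental: Optional[str] = None  # experimentally determined localization, if any


def is_cytosolic(protein: Protein) -> bool:
    # Step 1: at most one predicted membrane region AND PSORT says cytosolic/unknown.
    predicted = protein.tm_segments <= 1 and protein.psort in {"cytosolic", "unknown"}
    # Step 2: also accept proteins experimentally determined to be cytosolic.
    confirmed = protein.experimental == "cytosolic"
    return predicted or confirmed


# Hypothetical records for illustration.
proteins = [
    Protein("P1", tm_segments=0, psort="cytosolic"),
    Protein("P2", tm_segments=1, psort="unknown"),
    Protein("P3", tm_segments=6, psort="inner membrane"),
    Protein("P4", tm_segments=2, psort="periplasmic", experimental="cytosolic"),
]
cytosolic_set = [p.accession for p in proteins if is_cytosolic(p)]
```

Tolerating a single predicted segment keeps proteins with one falsely predicted helix, while the PSORT condition limits the overprediction that a TMHMM-only filter would cause.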

Low vs. high abundance proteins

For convenience we considered proteins with copy number values greater than 2050 (emPAI > 29.0) highly abundant, while the rest of the proteins were attributed to the low abundance category. This optimal threshold was automatically found by clustering of the log-copy number values using the Expectation Maximization algorithm [62] as implemented in the WEKA machine learning workbench [63], version 3.5.6 using default parameters with the number of clusters set to two. As the copy number values are distributed according to the extreme value distribution, they were logarithmized to be useable with the Gaussian distribution approximation in the clustering process.
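To illustrate how a two-cluster Expectation Maximization fit can yield such a threshold, here is a small pure-Python sketch of a two-component one-dimensional Gaussian mixture fitted to log-transformed copy numbers. It is not the WEKA implementation: the initialization at the extremes, the base-10 logarithm, the fixed iteration count and the example copy numbers are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_gaussians(values, iterations=200):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances); component 0 starts at the minimum
    and component 1 at the maximum of the data.
    """
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = max(sum((v - overall_mean) ** 2 for v in values) / n, 1e-6)
    means = [min(values), max(values)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of the upper component for each value.
        resp = []
        for v in values:
            p0 = weights[0] * normal_pdf(v, means[0], variances[0])
            p1 = weights[1] * normal_pdf(v, means[1], variances[1])
            total = p0 + p1
            resp.append(p1 / total if total > 0 else 0.5)
        # M-step: re-estimate weight, mean and variance of each component.
        for k in (0, 1):
            r = resp if k == 1 else [1 - x for x in resp]
            n_k = sum(r)
            if n_k == 0:
                continue
            means[k] = sum(ri * v for ri, v in zip(r, values)) / n_k
            var_k = sum(ri * (v - means[k]) ** 2 for ri, v in zip(r, values)) / n_k
            variances[k] = max(var_k, 1e-6)  # floor avoids a degenerate component
            weights[k] = n_k / n
    return weights, means, variances


def split_by_abundance(copy_numbers, iterations=200):
    """Return one boolean per protein: True if assigned to the high-abundance cluster."""
    logs = [math.log10(c) for c in copy_numbers]
    weights, means, variances = fit_two_gaussians(logs, iterations)
    flags = []
    for x in logs:
        p_low = weights[0] * normal_pdf(x, means[0], variances[0])
        p_high = weights[1] * normal_pdf(x, means[1], variances[1])
        flags.append(p_high > p_low)
    return flags


# Hypothetical copy numbers per cell, for illustration only.
example = [10, 15, 20, 30, 40, 5000, 8000, 10000, 20000, 30000]
high_abundance = split_by_abundance(example)
```

The abundance threshold is then the copy number at which the two weighted component densities cross. On real data the two clusters overlap, so the cut-off depends on the fitted variances as well as the means.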

Statistical methods

All statistical tests and most figures were prepared with the R software package version 2.0 and PROMPT [64]. To compare the distributions of two unpaired samples with non-Gaussian or unknown distributions, the rank-sum Mann-Whitney (MW) test and the two-sample Kolmogorov-Smirnov (KS) test were applied using the significance threshold α = 0.05. The null hypothesis of the Mann-Whitney test is that the abundance means are equal. The null hypothesis of the Kolmogorov-Smirnov test is that the values of the two samples are drawn from the same continuous distribution. Both tests have the advantage that they make no assumptions about the distribution of the data. To ensure that our tests are not biased by small sample sizes when comparing essential genes with their counterparts, the test results were verified with additional random sampling, whereby each of the applied tests was repeated 10^5 times with a randomly drawn sample of the associated basic population. The p-value of the actual test was then compared with the p-value distribution of the random samples (data not shown). An observed p-value that lies in the 5% quantile indicates a reliable test outcome independent of sampling bias. Descriptive boxplot distribution statistics such as median, quartiles and outliers were generated with R. According to the canonical statistical definition, values greater than the 3rd quartile plus the interquartile range (IQR) were considered outliers. The IQR is defined as the 3rd quartile value minus the first quartile value. Relationships between variables were analyzed using least squares regression, loess estimation and the Pearson or Spearman rank correlation methods implemented in R with default parameters.
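The outlier rule stated above can be sketched in a few lines of Python, although the original analysis used R. The function name and the `iqr_multiplier` parameter are hypothetical. The default of 1.0 follows the wording of the text, whereas R's `boxplot` uses 1.5 by default, so the multiplier is left adjustable. The "inclusive" quantile method is chosen on the assumption that it matches R's default quantile type.

```python
from statistics import quantiles


def upper_outliers(values, iqr_multiplier=1.0):
    """Return values above Q3 + iqr_multiplier * IQR, where IQR = Q3 - Q1."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    threshold = q3 + iqr_multiplier * (q3 - q1)
    return [v for v in values if v > threshold]


# Hypothetical abundance values with one extreme observation.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
flagged = upper_outliers(sample)
```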

Operon structure

A set of known E. coli operons was obtained from RegulonDB [65]. For all operons with abundance information available for at least 3 proteins the variance of the natural logarithm of the emPAI values was calculated. The variance indicates how similar the abundance of the proteins within each operon is.
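A minimal sketch of this per-operon calculation follows. The dictionary layout, the gene and operon names, and the use of the sample variance (n − 1 denominator) are assumptions; the text does not say which variance estimator was used.

```python
import math
from statistics import variance


def operon_log_variance(operons, empai, min_proteins=3):
    """Variance of ln(emPAI) per operon, for operons with enough quantified proteins.

    operons: mapping operon name -> list of gene/protein identifiers
    empai:   mapping identifier -> emPAI value (only quantified proteins present)
    """
    result = {}
    for operon, members in operons.items():
        logs = [math.log(empai[m]) for m in members if m in empai and empai[m] > 0]
        if len(logs) >= min_proteins:
            result[operon] = variance(logs)  # sample variance; an assumption
    return result


# Hypothetical operons and emPAI values for illustration.
example_operons = {"opA": ["a", "b", "c"], "opB": ["d", "e"], "opC": ["f", "g", "h"]}
example_empai = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 2.0, "e": 3.0, "f": 0.5, "g": 5.0, "h": 50.0}
variances = operon_log_variance(example_operons, example_empai)
```

A variance near zero means that the proteins of an operon are present at similar levels, while a large value indicates divergent abundance within the same transcription unit.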

Function and structure of proteins

Functional roles of gene products were described in terms of the manually curated hierarchical functional catalog (FUNCAT) [60]. In this catalog each of the 16 main classes (e.g., metabolism, energy) may contain up to six subclasses. An essential feature of FUNCAT is its multidimensionality, meaning that any protein can be assigned to multiple categories. Carefully verified manual assignment of E. coli gene products to functional categories was obtained from Biomax Informatics AG, Martinsried, Germany. Likewise, the SCOP database [41] provides a hierarchical classification of protein structural domains. SCOP fold assignments to E. coli gene products were based on a BLAST E-value threshold of 0.001. In this work both FUNCAT and SCOP designators were truncated to include only the two upper levels of hierarchy. Proteins assigned to the same SCOP fold were grouped and the average emPAI value for each group was calculated. To limit the influence of individual outliers with very high or very low expression levels, only groups with 10 or more proteins were considered. The EC Enzyme Nomenclature information was taken from the Swiss-Prot protein descriptions.

Disorder predictions were taken from our PEDANT database where they are calculated with the software GlobPlot [66]. GlobPlot utilizes the statistics of proteins known to have unstructured regions [67, 68]. The number of alternating hydrophobic/hydrophilic stretches was computed as described [69]. The residues A, C, F, G, I, L, M, P, V, W and Y were considered to be hydrophobic and H, Q, N, S, T, K, R, D, E were considered hydrophilic in this study. The hydrophobicity of a protein was defined as $\frac{\sum_{i=1}^{n} H_i}{n}$, with $H_i$ denoting the hydrophobicity value of the amino acid at position i of a protein of n amino acids. Hydrophobicity values were calculated using the Kyte-Doolittle scale [70].
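The mean hydrophobicity defined above (the sum of per-residue Kyte-Doolittle values divided by the sequence length) can be sketched as follows. The function name and the example peptides are hypothetical; the scale values are the standard Kyte-Doolittle hydropathy indices.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(sequence: str) -> float:
    """Average Kyte-Doolittle value over all residues: (sum of H_i) / n."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    # A KeyError is raised for non-standard residue codes (e.g. X, B, Z).
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


# Hypothetical peptides for illustration.
hydrophobic_example = mean_hydrophobicity("AILV")
hydrophilic_example = mean_hydrophobicity("KDER")
```

Positive averages indicate a predominantly hydrophobic sequence and negative averages a predominantly hydrophilic one.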
