
What network motifs or other mechanisms can make the expression of a gene invariable to the environment?


Next to double positive feedback loops and chromatin modification, which other mechanisms can make a gene susceptible to a certain environment in one cell-type but not in another?


This phenomenon of being insensitive to certain fluctuations is called robustness. The fluctuations can be of two kinds for an input-output device such as a gene that is activated by a signal:

  1. Fluctuations in the signal
  2. Fluctuations in the intrinsic parameters

Signal fluctuations can be temporal, but parameter fluctuations are not (parameters are assumed to be time-invariant). Parameter variation can instead exist across genotypes or across a population of cells.

Some motifs are robust to signal perturbations whereas some are robust to parameter fluctuations.

Negative feedback and incoherent feed-forward loops can confer robustness by correcting output fluctuations that are caused by fluctuations in the input.
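As a rough illustration of the incoherent feed-forward case, here is a minimal sketch. It assumes a toy model that is not taken from any particular paper: the input X activates both a repressor Y and the output Z, and Y represses Z. The function name and the rate equations are assumptions chosen for simplicity. In this model the steady-state output is Z = 1 for any positive input level.

```python
def iffl_steady_output(x, t_end=50.0, dt=0.01):
    """Euler-integrate a toy incoherent feed-forward loop and return Z at t_end.

    Assumed model (illustrative only):
        dY/dt = x - Y        # X activates the repressor Y
        dZ/dt = x / Y - Z    # X activates Z, Y represses Z
    """
    y, z = 1.0, 1.0  # start at the steady state for x = 1
    for _ in range(int(t_end / dt)):
        dy = x - y
        dz = x / y - z
        y += dt * dy
        z += dt * dz
    return z


# The input level changes eightfold, the settled output barely moves.
for x in (0.5, 1.0, 4.0):
    print(x, round(iffl_steady_output(x), 3))
```

The output does respond transiently when the input changes. Only the settled level is insensitive to the input, which is the sense of robustness meant here.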

When all-or-none behaviour is important (binary, where one state need not be smaller in magnitude than the other), positive feedback confers robustness because it makes the system bistable.
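A minimal sketch of bistability from positive feedback, assuming a toy self-activating gene with basal production 0.05, a Hill-type feedback term and linear decay. All numbers are illustrative assumptions. Depending on where it starts, the same system settles into a low or a high expression state, and small perturbations around either state decay back to it.

```python
def settle(x0, t_end=50.0, dt=0.01):
    """Euler-integrate dx/dt = 0.05 + 2*x^2/(1 + x^2) - x starting from x0."""
    x = x0
    for _ in range(int(t_end / dt)):
        x += dt * (0.05 + 2.0 * x * x / (1.0 + x * x) - x)
    return x


low = settle(0.2)    # starts below the unstable threshold, ends in the low state
high = settle(1.0)   # starts above the threshold, ends in the high state
print(low, high)
```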

I cannot recall any particular motif that is robust to parameters. There are certain cases though (I am still working on that!! :P).

In any case environment can be thought of as an input.
For your question "what mechanisms can make a gene susceptible to a certain environment in one cell-type but not in another":

A heterogeneous population of cells can arise because of bistability or, even without it, because of stochastic effects. If a gene sits in a different one of two states (expression levels) in the two sub-populations, and this gene controls expression (the controller gene in a feedback loop), then one sub-population will be sensitive to the environment while the other will not be.

For more information on biological robustness read this review.


Gene regulatory network inference resources: A practical overview

Gene Regulatory Networks (GRNs) control all aspects of cellular behavior.

Several approaches exist to infer GRNs. These can be broadly categorized based on the input data.

GRN inference can stem from: coexpression, sequence motifs, ChIP-Seq, orthology, literature and Protein-Protein Interaction.

We provide an extensive and commented list of >90 current GRN inference tools.

Best Practices and Examples of GRN inference using multiple methods are described.


Background

Global interaction data are synthetically structured as networks, their nodes representing the genes of an organism and their links some, usually indirect, form of interaction among them. This type of schematization is clearly wiping out important aspects of the detailed biological dynamics, such as localization in space and/or time, protein modifications and the formation of multimeric complexes, that have been lumped together in a link. Given these limitations, an important open question is whether the backbone of the interaction network provides any useful hints as to the organization of the web of cellular interactions. A first observation in this direction is that the topology of biological interaction networks strongly differs from that of random graphs [1]. In particular, when transcriptional regulatory networks are compared to randomized versions thereof, some special subgraphs, dubbed motifs, have been shown to be statistically over-represented [2, 3]. An example of a motif composed of three units is the feed-forward loop, its name being inherited from neural networks, where this pattern is also abundant.
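To make the notion of a three-node motif concrete, the sketch below counts feed-forward loops in a small directed graph. The edge list is a made-up toy example, and the function name is an assumption. A motif analysis of the kind cited above would compare this count against the counts obtained in randomized versions of the same network, a step that is not shown here.

```python
from itertools import permutations


def count_feed_forward_loops(edges):
    """Count ordered triples (x, y, z) with edges x->y, y->z and x->z."""
    edge_set = set(edges)
    nodes = {n for e in edge_set for n in e}
    return sum(
        1
        for x, y, z in permutations(nodes, 3)
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set
    )


# Hypothetical toy network: A regulates B, C and D; B regulates C; C regulates D.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("A", "D")]
print(count_feed_forward_loops(toy_edges))
```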

Transcription factors often act in multimeric complexes and the formation of these plays a crucial role in the regulatory dynamics. In order to capture at least part of those effects, transcriptional networks may be integrated with the protein-protein interaction data that have recently become available [4–7]. An example is provided by the mixed network constructed in [8]. The network is mixed in the sense that it includes both directed and undirected edges, pertaining to transcriptional and protein-protein interactions, respectively. The motifs for the mixed networks were investigated in [9].

The dynamics of motifs has been thoroughly investigated in vitro and in silico, that is, in the absence of the rest of the interaction network and of additional regulatory mechanisms [10–12]. For instance, the feed-forward loop has remarkable filtering properties, with the downstream-regulated gene activated only if the activation of the most-upstream regulator is sufficiently persistent in time. The motif essentially acts as a low-pass filter, with a time-scale comparable to the delay taken to produce the intermediate protein. Furthermore, the same structure is also found to help in rapidly deactivating genes once the upstream regulator is shut off. Overabundance of motifs and their interpretation as basic information-processing units popularized the hypothesis of an evolutionary selection of motifs [2, 13].
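The filtering behaviour can be sketched with a deliberately simple model. Assume a coherent feed-forward loop in which the intermediate protein Y accumulates while X is on, and the target Z is on only while X is on AND Y exceeds a threshold. The rate equation, the threshold and the function name are illustrative assumptions, not values from the cited studies.

```python
def z_on_time(pulse_length, threshold=0.5, dt=0.001, t_end=5.0):
    """Total time Z is ON for an X pulse of the given length.

    Assumed toy model: dY/dt = X - Y, and Z is ON only while
    X is ON and Y exceeds the threshold (AND-gated coherent loop).
    """
    y, on_time = 0.0, 0.0
    for step in range(int(t_end / dt)):
        x = 1.0 if step * dt < pulse_length else 0.0
        if x > 0 and y > threshold:
            on_time += dt
        y += dt * (x - y)  # Y builds up while X is on, decays afterwards
    return on_time


print(z_on_time(0.3))  # brief pulse: Y never reaches the threshold
print(z_on_time(2.0))  # persistent pulse: Z switches on after a delay
```

Because of the AND gate, Z also switches off as soon as X is removed, which matches the rapid deactivation described above.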

In electrical engineering circuits, an abundant structure is likely to correspond to a module that performs a specific functional task and acts in a manner largely independent of the rest of the network. The point is moot for biological networks. A recent remark is that some of the motifs found in transcriptional networks are also encountered in artificial random networks [14, 15], where no selection is acting. However, the lists of motifs do not entirely coincide for the two cases [16]. A visually striking fact is that essentially none of the motifs exists in isolation and that there is quite a great deal of edge-sharing with other patterns (see [17] for the network of Escherichia coli). The function of the motifs might then be strongly affected by their context. The use of genetic algorithms to explore the possible structures that perform a given functional task has in fact shown a wide variety of possible solutions [18].

It is therefore of interest to address the issue of the functional role of the motifs in vivo, that is within the whole network, and examine the ensuing evolutionary constraints. In the following, we shall show that the instances of the network motifs are not subject to any particular evolutionary pressure to be preserved and analyze the biological information available on the pathways where some instances of motifs are found.


Background

Boolean networks

The attempt to model the most general aspects of gene regulatory networks dates back to the end of the 1960s, when Kauffman [14] proposed a first idealized representation of a typical gene network. He modeled the regulatory interaction among genes as a directed graph in which each gene receives inputs from a fixed number of selected genes. The state of each regulatory entity, i.e., a gene, is represented as a Boolean value, either 1, representing the activation of the entity (e.g., a gene is expressed), or 0, representing its inactivation (e.g., a gene is not expressed). Connections between genes are directed, and an edge from node x to node y implies that x influences (activates or silences) the expression of y. Formally, given a set of $N$ entities, such as genes, proteins etc., the state of the GRN is then naturally represented as a Boolean vector $\hat{X} = [x_1, \cdots, x_N]$, which generates a space of $2^N$ possible states. The behavior of the state of each node $x_i$ is described using a Boolean function $f_i$, which defines the value of the next state of $x_i$ using, as inputs, the states of its input nodes, i.e., those which directly affect its expression. Since the simulation of a BN is done in discrete time steps, the dynamics of a Boolean network modelling a regulatory system are described by:

$$\hat{X}(t+1) = \hat{F}(\hat{X}(t))$$

where $\hat{X}(t+1)$ is the next GRN state given the $\hat{F}$ vector of all functions $f_i$ that map the transition of a single node from the current state to the next one.
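A minimal sketch of one synchronous update, in which the next state is obtained by applying every function f_i to the same current state. The three-node network and its update functions are hypothetical and chosen only for illustration.

```python
def step(state, functions):
    """One synchronous update: every node reads the same current state."""
    return tuple(f(state) for f in functions)


# Hypothetical 3-node network (not taken from the paper):
functions = (
    lambda s: s[1],          # x1(t+1) = x2(t)
    lambda s: s[0],          # x2(t+1) = x1(t)
    lambda s: s[0] & s[1],   # x3(t+1) = x1(t) AND x2(t)
)

state = (0, 1, 1)
for t in range(4):
    print(t, state)
    state = step(state, functions)
```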

The transition between two states of a BN can be modeled in two ways: asynchronously, where each entity updates its state independently from the others, or synchronously, where all entities update their states together. The synchronous approach is the most widely used in the literature [29, 30]. In the synchronous model, a sequence of states connected by transitions forms a state-space trajectory. All trajectories end in a steady state or a steady cycle. These steady (or equilibrium) states are commonly referred to as point or dynamic attractors, respectively. Point attractors consist of only one state: once the system reaches that state, it is "frozen" and no longer able to move elsewhere. In contrast, dynamic (or periodic) attractors reveal a cyclic behavior of the system: once a trajectory falls into one of the states belonging to the dynamic attractor, the system can only move between states belonging to the same attractor. For each attractor, the set of initial states that leads to it is called its basin of attraction [31]. The characteristics of the attractors (such as their size, the size of their basin of attraction, and their trajectories) are very important clues used to infer general GRN characteristics [31, 32].
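For a small network, attractors and their basins can be found by brute force: start from every possible state, follow the synchronous trajectory until a state repeats, and record the cycle that was reached. The sketch below does this for a hypothetical three-node network, so the update rules and names are assumptions. Exhaustive enumeration is only practical for small N because the state space grows as 2^N.

```python
from itertools import product


def step(state):
    """Synchronous update of a hypothetical 3-node network."""
    x1, x2, x3 = state
    return (x2, x1, x1 & x2)


def find_attractor(state):
    """Follow the trajectory until a state repeats; return the cycle, sorted."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    cycle = seen[seen.index(state):]  # states from the first repeat onward
    return tuple(sorted(cycle))


# Group all 2^3 initial states by the attractor they lead to.
basins = {}
for start in product((0, 1), repeat=3):
    basins.setdefault(find_attractor(start), []).append(start)

for attractor, basin in basins.items():
    kind = "point" if len(attractor) == 1 else "dynamic"
    print(kind, attractor, "basin size:", len(basin))
```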

Post-transcriptional modeling

The starting point to model gene regulatory activities with Boolean Networks is the Gene Protein/Product Boolean Network model (GPBN) proposed by [32]. Unlike previous approaches, where regulatory networks were modeled using only genes, in that model the authors detail the regulatory interactions among genes by explicitly separating genes from their protein products (as separate nodes in the network). We now know that miRNAs also participate, post-transcriptionally, in the regulation of almost every cellular process, for instance cell metabolism, signal transduction, cell differentiation, and cell fate [33, 34]. In the present work, with the introduction of miRNAs, we show how it is possible to include post-transcriptional regulation in the GPBN model. In general, miRNAs target mRNA molecules by interfering, through still poorly understood mechanisms, with their translation, stability, or both [35]. Starting from the GPBN model, we extended the interaction between genes by explicitly introducing, as separate network nodes, their non-coding RNA products as well [36, 37]. In our extended GPBN model nodes are labeled in three possible ways: (1) genes (circular nodes), (2) mRNA/protein pairs (rectangular nodes), and (3) miRNAs (rhomboidal nodes). There are consequently four possible types of edges between nodes (see Figure 1):

Figure 1. Modeling regulatory mechanisms: 1) transcription/translation, 2) gene activation, 3) miRNA transcription, 4) post-transcriptional regulation.

transcription/translation: an edge from a gene to a protein product represents the process that leads from gene activation to protein expression

gene activation: an edge from a protein to a gene represents the activation of the gene by one or more protein products (Transcription Factors)

miRNA transcription: an edge from a host gene to a miRNA node means that the expression of the gene implies the transcription of the miRNA molecules encoded in the DNA transcript

post-transcriptional regulation: a (silencing) edge from a miRNA to a protein means that the protein is a miRNA target, so its translation is inhibited by the presence of the miRNA.

In order to properly model the post-transcriptional regulation mechanisms, it is necessary to carefully design the set of Boolean functions that define the (next) state of each transcriptional product targeted by a miRNA node. Post-transcriptional regulation acts at the mRNA level; hence, considering the final protein product, it has higher priority than gene expression activity. In terms of Boolean networks, it can be modeled by placing the miRNA expression state in Boolean AND with the mRNA expression state.

As already mentioned in [32], the introduction of gene products also requires taking into account the time at which each product is synthesized along the BN timeline of state evolution. Since the update of inner node values is synchronous [31], the synthesis products require s time steps to be ready (expressed). In the same way, once a gene is no longer expressed, its related products are silenced after d time steps. In our work, synthesis and decay times (s and d) are set to one for all entities so that, if a given gene is turned ON/OFF at time t, all its products will be accordingly turned ON/OFF at time t + 1. The lifecycle of miRNAs is the same as that of all other gene products.
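A minimal sketch of these rules for one target gene, one host gene and one miRNA, with unit synthesis and decay delays. The node names are hypothetical. The sketch also makes one explicit assumption about the gating: because the miRNA edge is a silencing edge, the product is taken to be ON when its gene was ON and the miRNA was OFF at the previous step, i.e. the AND is taken with the negated miRNA state. This is one reading of the text, not a transcription of the authors' equations.

```python
def update(state):
    """Synchronous update with unit synthesis/decay delays (s = d = 1)."""
    return {
        "G_target": state["G_target"],  # genes are held fixed as external inputs
        "G_host": state["G_host"],
        "miRNA": state["G_host"],       # miRNA follows its host gene one step later
        # Assumption: silencing is modeled as AND with the NEGATED miRNA state.
        "mRNA_P": state["G_target"] & (1 - state["miRNA"]),
    }


state = {"G_target": 1, "G_host": 1, "miRNA": 0, "mRNA_P": 0}
trajectory = [state]
for _ in range(3):
    state = update(state)
    trajectory.append(state)

for t, s in enumerate(trajectory):
    print(t, s)
```

In this run the product appears at t = 1, one step after its gene, and is silenced at t = 2, one step after the miRNA appears.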

Although the introduction of miRNA activity into the BN makes it possible to include post-transcriptional effects in the dynamics of the system, it is not enough to properly model the whole post-transcriptional activity. At this point, not all the states are biologically valid. Even if a well-designed GRN cannot evolve into a biologically illegal state through its own dynamics, there is no guarantee that an illegal state is not used as the initial state when simulating the network.

To avoid illegal states, the description of the BN is expanded to include a set of conditions identifying all illegal states of the network. These conditions are represented by an additional set of Boolean equations that must be evaluated every time an initial state of the network is considered. In Boolean terms, a state is considered legal if all conditions return zero, and illegal otherwise. As an example, let us consider a gene Gx and its related protein mRNAx_Px. The protein can be synthesized only if the related gene has been expressed. So, any state in which mRNAx_Px is equal to 1 (expressed) while Gx is equal to 0 (not expressed) is considered an illegal state.
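The Gx / mRNAx_Px example can be written as a single Boolean condition that returns 1 for an illegal state. The sketch below is a minimal illustration with hypothetical function names. A full model would list one such condition per constraint and would check candidate initial states against all of them.

```python
from itertools import product


def illegal_conditions(state):
    """Evaluate the illegality conditions; a state is legal if all are 0."""
    return [
        state["mRNAx_Px"] & (1 - state["Gx"]),  # product ON while its gene is OFF
    ]


def is_legal(state):
    return not any(illegal_conditions(state))


# Keep only the legal initial states of the two-node example.
legal_states = [
    {"Gx": gx, "mRNAx_Px": px}
    for gx, px in product((0, 1), repeat=2)
    if is_legal({"Gx": gx, "mRNAx_Px": px})
]
print(legal_states)
```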


Results

The motifs in Figure 2 can broadly be separated into two categories: A, B and D are generally referred to as "incoherent motifs", while C is a coherent motif. This nomenclature reflects the arrangement in C, in which both inputs act as promoters. By comparison, A and D are both fully incoherent (both second-tier proteins have both promoter and repressor inputs), whilst B may be considered partially incoherent, as one of the second-tier proteins has incoherent inputs whilst the other has coherent inputs (the model used in this case is given in equation (1), in which we consider co-operative binding in the case that the order of binding is unimportant). In addition, we add a derivative of the C model, which we denote as C', in which both inputs still act as promoters but the assumption about the way the promoters act is different: we no longer require both promoters to bind for P_W to be expressed (in practice, C and C' operate as an AND and an OR gate, respectively). Further discussion of the many ways in which the operation of the transcription factors can be modelled can be found in the methods section.
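The AND versus OR distinction between C and C' can be sketched with simple saturating input functions. This is only an illustration under stated assumptions: the occupancy form, the constant k and the function names are hypothetical and are not the paper's equation (1), which is not reproduced in this text.

```python
def occupancy(tf, k=10.0):
    """Assumed fraction of promoter sites bound at concentration tf."""
    return tf / (k + tf)


def and_gate(tf_a, tf_b):
    """Expression requires both factors to be bound (as in motif C)."""
    return occupancy(tf_a) * occupancy(tf_b)


def or_gate(tf_a, tf_b):
    """Expression requires at least one factor to be bound (as in motif C')."""
    a, b = occupancy(tf_a), occupancy(tf_b)
    return a + b - a * b


# One input high, the other off: only the OR gate responds.
print(and_gate(100.0, 0.0), or_gate(100.0, 0.0))
# Both inputs high: both gates respond.
print(and_gate(100.0, 100.0), or_gate(100.0, 100.0))
```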

Constant inputs

We first consider the effect of providing the motif with steady inputs, a biological scenario which corresponds to continued exposure either to an environmental condition triggering independent factors, or to a constant signal which is split into two signals by the network structure. To examine the various responses, we consider the effect of both signals being turned on continuously at a high level (denoted in Table 3 by a green square), and the result of one of the factors occurring at a high level whilst the other is either low (denoted by a yellow square) or off (denoted by red).

In this way one can see the variation in the responses of the two output proteins, which we denote here as P_Z and P_W for all motifs. In this table, we have simply provided the transcription factors at a reasonably high concentration and quantified the response of each protein after the system has stabilized. All of the simulations are performed with the same kinetic parameters, as detailed in the methods section, and the qualification of a high or low response is based on comparison with the response of other motifs and the response of the other protein. Again, these results are colour-coded for easy comparison, with green, yellow and red again corresponding to high, low or no expression.

We can see that there is significant variation in the characterization of the response of the variants of the motif.

Response to simultaneous pulses

We next look at the responses of the motif when both transcription factors are turned on simultaneously for 3600 seconds (an hour) and then turned off again. Again, the precise amplitudes and durations of the protein responses vary greatly and depend on the precise values of the kinetic parameters. Two illustrative cases, again corresponding to variants A and B of the motifs, can be seen in Figure 5. The qualitative results are summarized in Figure 4.

This table shows the variation in the outputs of the different motifs when both inputs are turned on to a high level for 3600 seconds and then abruptly turned off. This is denoted in the inputs as green then red. The outputs are then categorised and colour coded as either off, low or high, corresponding to red, yellow or green. This scenario corresponds to relatively short-term exposure to a transcription factor upstream in the transcription network. Again, the details of the model can be found in the methods section, and the parameters used are given in Table 2.

In these two plots we again see the dynamical response of motifs A and B when the motif is excited by simultaneous step functions for 3600 seconds. The two plots share their y-axes, with the scale for the response on the leftmost axis, and the strength of the input signal on the far right axis. Note not only the difference between the two in peak expression, but also the marked difference in total expression, time of peak expression, and behaviour after the peak between the two motifs. The curves in (a) have been separated to enhance readability. To produce the simultaneous steps, both I_X and I_Y are initially at 0 at t = 0, are then set to 100 for 3600 seconds, then again returned to 0. The kinetic parameters used are given in Table 2.

Staggered pulses

Finally we look at the effect of offset pulses in the levels of transcription factors. In the first case, the first transcription factor (denoted by I_X) is provided for 3600 seconds and then turned off, and the second factor (I_Y) is then turned on for 3600 seconds. This is then reversed. The results can be seen in Figure 6, again using the colour coding scheme described above.

In this table we can see the responses of the different motifs to successive inputs. The first two rows show the effect of first perturbing I_X and then I_Y with a step function. In the third and fourth rows this is reversed. Again the main feature to notice in this table is the variety in the responses of the motifs modelled. For the purposes of this diagram there were assumed to be two phases of behaviour corresponding to "after" the first perturbation and "after" the second perturbation. The strength of the response is again indicated by Red, Yellow and Green, corresponding to Off, Low or High response. As in the previous cases, the details of the model can be found in the methods section, and the parameters used are given in Table 2.

Two illustrative cases can be seen in Figure 7, corresponding to variants A and B, in the case in which first I_X and then I_Y are activated.

Here we see again the details of the dynamical response of motifs A and B to the inputs. Again, the two plots share y-axes, with the response scale on the left, and input signal strength on the far right axis. In this case, two successive step functions of 3600 seconds were used to model activation by first one transcription factor and then the other immediately after. In these figures, I_X is increased from 0 to 100 for 3600 seconds, and then at t = 3600, I_Y is set to 100 for 3600 seconds and I_X is turned off. Again the kinetic parameters used in the model can be found in Table 2. Note the significant differences not only in the absolute values of peak expression, but in the shape of the curves.
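The two input protocols described in this section, simultaneous and staggered 3600-second steps at level 100, can be written as small helper functions that return (I_X, I_Y) at time t. This is a sketch of the input schedules only. The function names are hypothetical, the pulses are assumed to start at t = 0, and the motif kinetics from the methods section are not reproduced here.

```python
def simultaneous_inputs(t, level=100.0, duration=3600.0):
    """I_X and I_Y are both at `level` for 0 <= t < duration, then 0."""
    on = level if 0.0 <= t < duration else 0.0
    return on, on


def staggered_inputs(t, level=100.0, duration=3600.0):
    """I_X is on for the first `duration` seconds, then I_Y for the next."""
    i_x = level if 0.0 <= t < duration else 0.0
    i_y = level if duration <= t < 2 * duration else 0.0
    return i_x, i_y


for t in (0.0, 1800.0, 3600.0, 5400.0, 7200.0):
    print(t, simultaneous_inputs(t), staggered_inputs(t))
```

Swapping the two return values of `staggered_inputs` gives the reversed order, with I_Y first and I_X second.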


4. Revisiting Classical Philosophical Questions

With the emphasis on systems and interacting networks, both systems and synthetic biology explicitly engage in one of the oldest philosophical discussions on the relationship between the whole and its parts, or between holism and reductionism. This section examines how classical questions are reframed in the new light of strategies for large-scale data production and dynamic modeling.

4.1 Reductionism and the Sum of the Parts

As mentioned in the introduction, proponents of systems biology often explicitly define their approach in contrast to reductionist strategies in molecular biology. Molecular biology is depicted as a field studying molecular components and pathways in isolation, whereas systems biology integrates the pieces of the puzzle in the context of the system as a whole (van Regenmortel 2004; Keller 2005; Kitano 2002a,b). The contrast between molecular biology and systems biology is often overstated, and much of systems biology research is also focused on specific molecular difference-makers (De Backer et al. 2010; Gross 2017; O'Malley & Dupré 2005). However, systems biology may give a novel interpretation of Aristotle's dictum that the whole is more than the sum of the parts by specifying what "more" means in the context of contemporary biology.

4.1.1 Modular and Bottom-Up Reductionism

When systems biologists criticize reduction in molecular biology, the issue at stake is typically the limitations of studying biological parts or modules in isolation. Because the target of the criticism often differs from the more traditional philosophical focus on reduction of higher-level to lower-level explanations (Brigandt & Love 2017), the notion of modular reductionism has been suggested (Gross & Green forthcoming). An anti-reductionist stance towards modular reductionism need not reject the idea that living systems can be modelled and explained bottom up. For instance, while global approaches within these fields reject the modularity assumption, some keep the focus on genomes and molecular networks as the primary determinants of biological functions (O'Malley et al. 2008). [7]

Many systems biologists have, however, also argued against reduction of higher-level models and explanations, and there is an ongoing debate about how far genomics, proteomics, etc. will take us in solving complex problems like understanding cancer (cf., Barabási et al. 2011; Hood et al. 2015; Wolkenhauer & Green 2013). Similarly, researchers involved in projects aiming to simulate multiscale structures like the human heart emphasize the need to include macroscale parameters as they provide functionally important constraints on the behavior of microscale processes (Bassingthwaighte et al. 2009; Kohl & Noble 2009). An interesting reframing of Aristotle's dictum in this discussion is that living systems are at the same time more and less than the sum of the parts (see Hofmeyr 2017; Noble 2012). In other words, the system as a whole constrains the degrees of freedom of lower-level parts and provides a functional organization of these that is required for some system capacities (see below).

Given the increasing emphasis on comprehensive multiscale models, systems biology research may have a unique potential for philosophical insights considering the explanatory role of macroscale properties and top-down effects. Systems biologists for instance point to how enzyme activity is constrained by the chemical environment and the cellular context (Hofmeyr 2017), or how the biophysical properties of muscle fibers and cell membranes provide functional constraints on ionic oscillations central to the generation of heart rhythms (Noble 2012). Interpreting top-down effects as constraining relations may exemplify what philosophers of science have called 'medium downward causation' (Emmeche et al. 2000), which interprets downward causation as boundary conditions. Noble (2012) explicitly endorses such a view when arguing that downward causation is necessary by pointing out how equations describing the kinetics of ion channels in cardiac modeling cannot be solved without defining the boundary conditions (e.g., the cell voltage). Multi-scale modeling may thus help to give a more concrete mathematical reinterpretation of the controversial notion of downward causation (see also Ellis et al. 2011; Gross & Green forthcoming). Moreover, discussions of downward causation in systems biology have practical as well as theoretical implications for cancer research as some proponents disagree on whether cancer is a genetic or tissue-based disease (Bertolaso 2011; Soto et al. 2008; see also Section 5.3).

4.1.2 Emergence and Predictability

Discussions of downward causation are often connected to debates about whether biological systems exhibit emergent properties. Emergence in the context of systems biology typically means that systems properties are explanatorily irreducible to the properties of parts (also called synchronic emergence). However, it is debatable which sense of emergence (weak or strong) systems biology supports (cf., Alberghina & Westerhoff 2005; Boogerd et al. 2005, 2007; Emmeche et al. 2000; Kolodkin et al. 2011). Moreover, systems biologists have different views on whether biological systems also exhibit diachronic emergence, i.e., system properties that are inherently unpredictable (Ellis et al. 2011).

Some systems biologists stress that biological complexity forces life scientists to draw on abstract and idealized models, and that there are practical and in-principle limitations to our ability to fully predict and control living systems (Noble 2012; Bassingthwaighte et al. 2009). Others are more optimistic that the limitations can be overcome by upscaling computational models. Some even argue that systems biology breaks with the methodological principle of Ockham's razor (Kolodkin & Westerhoff 2011; see also Hofmeyr 2017; Gross 2017). These debates center on fundamental questions about how far biological research can be taken by integrating more parameters and data points (Kolodkin et al. 2012), and how far we can 'extend ourselves' through computational approaches (Humphreys 2004; Vermeulen 2011).

Large-scale modeling projects such as the Virtual Cell, the Physiome Project, and the Virtual Physiological Human offer exciting cases for philosophical analysis of the prospects and challenges associated with parameterizing and validating complex models (Carusi et al. 2012; Carusi 2014; Hunter et al. 2013). Ultimately, such projects may push the boundaries for prediction and control in the life sciences, or may reveal deeper challenges of biological complexity.

4.2 Explanatory Pluralism

Systems biology also brings new insights to discussions of scientific explanation. The integration of different disciplinary inputs and emphasis on mathematical modeling make systems biology a particularly interesting case for discussions of whether the mechanistic account can fully capture the diversity of explanatory practices in contemporary biology.

Mechanistic accounts initially emphasized the differences between the explanatory ideal of covering laws in physics and explanations in molecular biology (Craver & Tabery 2015). Mechanistic explanations cite how biological functions arise from the interactions and organization of component parts or entities (Bechtel & Richardson 1993; Machamer et al. 2000; Glennan 2002). Since research in systems biology is often integrated with research in molecular and cell biology, some have argued that many cases in systems biology can readily be seen as an extension of mechanistic research (Boogerd et al. 2013; Richardson & Stephan 2007). However, the increasingly important role of mathematical and computational modeling for understanding non-linear dynamics, cyclic organization, and complex feedback relations has led some proponents to argue for a modified account of dynamic mechanistic explanations (Bechtel & Abrahamsen 2011, 2012; Brigandt 2013). [8]

Inspired by network analysis in systems biology, proponents of the dynamical mechanistic account have further argued for increased attention to strategies of abstraction used to identify organizational features and generalizable aspects of mechanisms (Bechtel 2015b; Levy & Bechtel 2013). The emphasis on spatial organization and mutual relations between components may possibly be better described as constitutive rather than causal aspects (Fagan 2015). Whereas causal relations are often prioritized in mechanistic accounts (or the two aspects conflated), distinguishing between these can help identify the distinct contributions of causal mechanistic modeling and mathematical analysis of network dynamics. To account for case examples from systems biology, Fagan (2015) suggests a joint account of collaborative explanations in which diverse perspectives are combined to account for the mutual relations of spatial organization and binding among components as well as the causal relations in the system.

Whereas the aforementioned accounts emphasize the importance of mathematical modeling for an updated mechanistic account, others view the emphasis on quantitative and dynamic aspects as a departure from mechanistic explanations. The interpretation of network models is often conducted in an abstract mathematical or engineering-inspired framework, and it has been stressed that it is hard or perhaps impossible to reconstruct a causal story from these models (Issad & Malaterre 2015; Gross 2015). Insofar as mathematical models are primarily used as inputs to or heuristics for mechanistic explanations, e.g., as mechanistic schemas, the reliance on these modeling strategies does not pose a challenge for mechanistic accounts (Darden 2002; Matthiessen 2015). Yet, although abstract modeling is compatible with mechanistic research, an important contested issue is whether abstract models are developed towards research aims that are distinct from mechanistic explanations.

Some have pointed to examples where the modeling process does not proceed from abstract to more detailed models but in the opposite direction, suggesting that mechanistic details may be vehicles for more generic explanations rather than the other way around (Braillard 2010; Green & Jones 2016). It has been argued that systems biologists sometimes aim for non-causal explanations. One candidate for non-causal explanation, called topological explanation, emphasizes how network architectures generically determine dynamic behaviors, independently of the causal details of the network (Huneman 2010). Mathematical analysis of network structures underlying biological robustness has been argued to exemplify this explanatory goal (Jones 2014).

Another candidate is what Wouters (2007) in the context of comparative physiology calls design explanations. Design explanations do not describe how a biological function is causally produced but clarify why a given design (and not an alternative design) is present. They do so by pointing to constraints on the possible designs that make some designs good, some suboptimal, and others impossible (see Shinar & Feinberg 2011 for a candidate example from systems biology). The relevance of design explanations for systems biology lies in the interest to specify relations between functional capacities (e.g., robustness) and system organization (e.g., integral feedback control) via design principles that are independent of specific contexts of implementation (Braillard 2010; Boogerd 2017).

Others have discussed whether explanations in systems biology are mergers of mathematical explanations and mechanistic explanations (Baker 2005; Brigandt 2013; Mekios 2015) or introduce a new explanatory category called Causally Interpreted Model Explanations (Issad & Malaterre 2015). These proposals are not considered as alternatives to mechanistic accounts but are proposed as ingredients in a pluralistic approach to biological explanation (Brigandt et al. forthcoming; Mekios 2015). There is, however, currently no consensus on whether the differences are sufficiently significant to support an explanatory pluralism involving non-mechanistic explanations.

Although much, or perhaps most, of the philosophical work on systems biology has focused on explanation, one cannot assume that explanation is the sole aim of systems biology research (MacLeod & Nersessian 2015; Kastenhofer 2013a,b). Philosophers have debated whether research on the relation between biological robustness and integral feedback control should be considered as a stepping-stone for mechanistic explanations, or whether the perspective from control engineering constitutes a non-mechanistic explanatory framework (Braillard 2010; Brigandt et al. forthcoming; Matthiessen 2015; Green & Jones 2016). But another important reason why some systems biologists seek generality and simplicity in their models may be that they, like synthetic biologists, wish to explore whether the same function could be realized in other or simpler ways (Briat et al. 2016). Thus, mathematical and computational modeling in synthetic and systems biology also calls for philosophers to examine epistemic aims other than explanation. Examples are prediction, control, and design, as well as theoretical and practical interests in understanding the minimal requirements for biological functions and life itself.


NETWORK DECOMPOSITION INTO FUNCTIONAL MODULES

The decomposition of large networks into distinct components, or modules, has come to be regarded as a major approach to deal with the complexity of large cellular networks [ 27–29]. This topic has witnessed great progress lately, and only representative examples of different approaches are presented here. In cellular networks, a module refers to a group of physically or functionally connected biomolecules (nodes in graphs) that work together to achieve the desired cellular function [ 8**]. To investigate the modularity of interaction networks, tools and measures have been developed that can not only identify whether a given network is modular or not, but also detect the modules and their relationships in the network. By subsequently contrasting the found interaction patterns with other large-scale functional genomics data, it is possible to generate concrete hypotheses for the underlying mechanisms governing e.g. the signaling and regulatory pathways in a systematic and integrative fashion. For instance, interaction data together with mRNA expression data can be used to identify active subgraphs, that is, connected regions of the network that show significant changes in expression over particular subsets of experimental conditions [ 30].

Motifs

Motifs are subgraphs of complex networks that occur significantly more frequently in the given network than expected by chance alone [ 29]. Consequently, the basic steps of motif analyses are (i) estimating the frequencies of each subgraph in the observed network, (ii) grouping them into subgraph classes consisting of isomorphic subgraphs (topologically equivalent motifs) and (iii) determining which subgraph classes are displayed at much higher frequencies than in their random counterparts (under a specified random graph model). While analytical calculations from random models can assist in the last step, exhaustive enumeration of all subgraphs with a given number of nodes in the observed network is computationally infeasible in practice for large networks. Kashtan et al. [ 31] therefore developed a probabilistic algorithm that allows estimation of subgraph densities, and thereby detection of network motifs, at a time complexity that is asymptotically independent of the network size. The algorithm is based on a subgraph importance sampling strategy, instead of standard Monte Carlo sampling. They found that network motifs could be detected with only a small number of samples in a wide variety of biological networks, such as the transcriptional regulatory network of E. coli [ 31]. Recently, efficient alternatives together with graphical user interfaces have also been implemented to facilitate fast network motif detection and visualization in large network graphs [ 32, 33].
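
As a rough illustration of step (i), the sketch below counts one subgraph class, the feed-forward loop, by brute force in a tiny directed graph. It is not the sampling algorithm of Kashtan et al.; the function name and the toy edge list are hypothetical, and the loop over all ordered node triples is only workable for very small networks.

```python
from itertools import permutations

def count_feed_forward_loops(edges):
    """Count ordered triples (x, y, z) with edges x->y, y->z and x->z."""
    edge_set = set(edges)
    nodes = sorted({n for e in edges for n in e})
    count = 0
    for x, y, z in permutations(nodes, 3):
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set:
            count += 1
    return count

# Hypothetical toy regulatory network (illustrative only, not real E. coli data).
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("A", "D")]
ffl_count = count_feed_forward_loops(toy_edges)
print(ffl_count)
```

In this toy graph the loops are A-B-C and A-C-D. If a graph contains mutual edges between two nodes, the same node triple can match in more than one order, so a fuller implementation would need to group matches by isomorphism class as in step (ii).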

Many of the methodologies recently introduced in network analysis are inspired by established approaches from sequence analysis. The concepts utilized in both fields include approximate similarity, motifs and alignments. As network motifs represent a higher-order biological structure than protein sequences, graph-based methods can be used to improve the homology detection of standard sequence-based algorithms, such as PSI-BLAST, by exploiting relationships between proteins and their sequence motif-based features in a bipartite graph representing a protein-motif network [ 34]. The definition of network motifs can be enriched by concepts from probability theory. The motivation is that if the network evolution involves elements of randomness and the currently available interaction data is imperfect, then functionally related subgraphs do not need to be exactly identical. Accordingly, Berg and Lässig [ 35] devised a local graph alignment algorithm, which is conceptually similar to sequence alignment methodologies. The algorithm is based on a scoring function measuring the statistical significance for families of mutually similar, but not necessarily identical, subgraphs. They applied the algorithm to the gene regulatory network of E. coli [ 35].

Motifs have increasingly been found in a number of complex biological and non-biological networks, and the observed over-representation has been interpreted as a manifestation of functional constraints and design principles that have shaped network architecture at the local level. Significance of motifs is typically assessed statistically by comparing the distribution of subgraphs in an observed network with that found in a particular computer-generated sample of randomized networks that destroy the structure of the network while preserving the number of nodes, edges and their degree distribution. It is debatable which kind of random model provides the most appropriate randomization, and especially whether it is realistic to assume that the edges in the randomized network are connected between the nodes globally at random and without any preference [ 36]. However, the principal application of network motif discovery should not originate from a rigorous statistical testing of a suitable null hypothesis, but from the possibility to reduce the complexity of large networks to a smaller number of more homogeneous components. Analogously with gene expression cluster analysis, where statistical testing is also difficult because of the lack of an established null model, network decomposition may be used as a tool to identify biologically significant modules, irrespective of their statistical significance.
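
One common way to build such a randomized network is repeated edge swapping, sketched below under the assumption of a simple directed graph without self-loops or duplicate edges. The function names and the toy edge list are hypothetical. Each accepted swap rewires two edges (a->b, c->d) into (a->d, c->b), which leaves every node's in-degree and out-degree unchanged.

```python
import random

def degree_preserving_shuffle(edges, n_swaps, rng):
    """Randomize a directed graph by swapping the targets of randomly chosen edge pairs."""
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (a, d), (c, b)
        # Reject swaps that would create self-loops or duplicate edges.
        if a == d or c == b or new1 in present or new2 in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return edges

def degrees(edges):
    """Return (out-degree, in-degree) dictionaries for a directed edge list."""
    out_deg, in_deg = {}, {}
    for s, t in edges:
        out_deg[s] = out_deg.get(s, 0) + 1
        in_deg[t] = in_deg.get(t, 0) + 1
    return out_deg, in_deg

# Hypothetical toy network (illustrative only).
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"),
             ("A", "D"), ("D", "E"), ("B", "E")]
shuffled = degree_preserving_shuffle(toy_edges, n_swaps=100, rng=random.Random(0))
```

Counting a subgraph class in many such shuffled networks gives an empirical null distribution against which the count in the observed network can be compared. Whether this null model is the appropriate one is exactly the question raised above.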

Clusters

An alternative approach to the identification of functional modules in complex networks is discovering similarly or densely connected subgraphs of nodes (clusters), which are potentially involved in common cellular functions or protein complexes [ 37]. As in expression clustering, the application of graph clustering is based on the assumption that a group of functionally related nodes are likely to highly interact with each other while being more separate from the rest of the network. The challenges of clustering network graphs are similar to those in the cluster analysis of gene expression data [ 6]. In particular, the results of most methods are highly sensitive to their parameters and to data quality, and the predicted clusters can vary from one method to another, especially when the boundaries and connections between the modules are not clear-cut. This seems to be the case at least in the PPI network of S. cerevisiae [ 38]. Moreover, it should be noted that modules are generally not isolated components of the networks, but they share nodes, links and even functions with other modules as well [ 8**]. Such hierarchical organization of modules into smaller, perhaps overlapping and functionally more coherent modules should be considered when designing network clustering algorithms. The functional homogeneity of the nodes in a cluster with known annotations can be assessed against the cumulative hypergeometric distribution that represents the null model of random function label assignments [ 20*].
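
The hypergeometric test mentioned above can be written in a few lines. The sketch assumes a network of N annotated nodes, of which K carry a given function label, and a cluster of n nodes, of which k carry that label; these symbol names are chosen here for illustration and do not come from the cited work.

```python
from math import comb

def hypergeom_enrichment_pvalue(N, K, n, k):
    """P(X >= k) when n nodes are drawn without replacement from N nodes, K of them labelled."""
    upper = min(n, K)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

# Toy numbers: 10 nodes, 4 labelled, a cluster of 3 nodes with 2 labelled.
p_value = hypergeom_enrichment_pvalue(N=10, K=4, n=3, k=2)
print(p_value)
```

For these toy numbers the probability is (6*6 + 4*1)/120 = 1/3, so two labelled nodes out of three would not count as enrichment. In practice the p-values would also need correction for testing many clusters and many labels.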

Highly connected clusters

Most algorithms for determining highly connected clusters in PPI networks yield disjoint modules [ 39]. For instance, King et al. [ 40] partitioned the nodes of a given graph into distinct clusters, depending on their neighbouring interactions, with a cost-based local search algorithm that resembles the tabu-search heuristic (i.e. it updates a list of already explored clusters that are forbidden in later iteration steps). Clusters with low functional homogeneity, small size or low edge density were filtered out. After optimizing the filtering cut-off values according to the cluster properties of known S. cerevisiae protein complexes from the MIPS database, their method could accurately detect the known protein complexes and predict new ones [ 40]. Other local properties such as centrality measures can be used for clustering purposes as well. A recent algorithm by Dunn et al. [ 41], for example, divides the network into clusters by removing the edges with the highest betweenness centralities, then recalculating the betweenness and repeating until a fixed number of edges have been removed. They applied the clustering method to a set of human and S. cerevisiae PPIs, and found that the protein clusters with significant enrichment for GO functional annotations included groups of proteins known to cooperate in cell metabolism [ 41].
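
The remove-and-recalculate procedure can be sketched as follows for an unweighted, undirected graph stored as a dictionary of neighbour sets. This is a minimal illustration using a Brandes-style betweenness computation, not the implementation of Dunn et al.; all function names and the toy graph are hypothetical.

```python
from collections import deque

def edge_betweenness(adj):
    """Edge betweenness for an unweighted, undirected graph given as {node: set(neighbours)}."""
    score = {}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma) to every node.
        dist, sigma = {s: 0}, {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path fractions on edges.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                edge = frozenset((v, w))
                score[edge] = score.get(edge, 0.0) + c
                delta[v] += c
    # Every path is seen from both of its endpoints, so halve the totals.
    return {e: b / 2.0 for e, b in score.items()}

def remove_top_edges(adj, n_remove):
    """Remove the highest-betweenness edge n_remove times, recalculating after each removal."""
    adj = {v: set(ns) for v, ns in adj.items()}
    for _ in range(n_remove):
        scores = edge_betweenness(adj)
        if not scores:
            break
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return adj

def components(adj):
    """Return the connected components of the graph as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the bridge 2-3.
toy_adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
pruned = remove_top_edges(toy_adj, n_remove=1)
clusters = sorted(sorted(c) for c in components(pruned))
```

In the toy graph all nine shortest paths between the two triangles run through the bridge, so it is removed first and the two triangles separate into clusters. The number of edges to remove is a free parameter here, as in the description above.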

Overlapping clusters

Corresponding to the fact that proteins frequently have multiple functions, some clustering approaches, such as the local search strategy by Farutin et al. [ 42], also allow overlapping clusters. As in motif analysis, the score for an individual cluster in the PPI network graph is assessed against a random-graph null model that preserves the expected node degrees. They also derived analytical expressions that allow for efficient statistical testing [ 18]. It was observed that many of the clusters in the human PPI network are enriched for groups of proteins without clear orthologues in lower organisms, suggesting functionally coherent modules [ 42]. Pereira-Leal et al. [ 43] used the line graph of the network graph (where nodes represent an interaction between two proteins and edges represent shared interactors between interactions) to produce an overlapping graph partitioning of the original PPI network of S. cerevisiae. Recently, Adamcsek et al. [ 44] provided a program for locating and visualizing overlapping, densely interconnected groups of nodes in a given undirected graph. The program interprets as motifs all the k-clique percolation clusters in the network (all nodes that can be reached via chains of adjacent k-cliques). Larger values of k provide smaller groups resulting in higher edge densities. Edge weights can additionally be used to filter out low-confidence connections in the graphs [ 44].
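
The idea of k-clique percolation can be illustrated with a brute-force sketch. It assumes that two k-cliques are adjacent when they share k-1 nodes, and it enumerates all k-node subsets, which is only practical for toy graphs. It is not the program of Adamcsek et al.; the function name and the edge list are hypothetical.

```python
from itertools import combinations

def k_clique_communities(adj, k):
    """Merge k-cliques that share k-1 nodes and return the resulting node groups."""
    nodes = sorted(adj)
    cliques = [frozenset(c) for c in combinations(nodes, k)
               if all(v in adj[u] for u, v in combinations(c, 2))]
    parent = list(range(len(cliques)))  # union-find over clique indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())

# Toy undirected graph built from a hypothetical edge list.
toy_edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]
toy_adj = {}
for u, v in toy_edges:
    toy_adj.setdefault(u, set()).add(v)
    toy_adj.setdefault(v, set()).add(u)

communities = k_clique_communities(toy_adj, k=3)
print(communities)
```

With k = 3 the triangles (0,1,2) and (1,2,3) share an edge and merge into one group, while the triangle (3,4,5) shares only node 3 with them and stays separate. Node 3 therefore belongs to both groups, which is the overlap that disjoint partitioning methods cannot represent.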

Distance-based clusters

Another approach to decompose biological networks into modules applies standard clustering algorithms on vectors of nodes’ attributes, such as their shortest path distances to other nodes [ 45]. As the output then typically consists of groups of similarly linked nodes, the approach can be seen as complementary to the above clustering strategies that aim at detecting highly connected subgraphs. To discover hierarchical relationships between modules of different sizes in PPI graphs, Arnau et al. [ 46] explored the use of hierarchical clustering of proteins in conjunction with the pairwise path distances between the nodes. They addressed the problem of lacking resolution caused by the ‘small world’ property (relatively short, and frequently identical, path length between any two nodes) by defining a new similarity measure on the basis of the stability of node pair assignments among alternative clustering solutions from resampled node sets. As ties in such bootstrapped distances are rare, standard hierarchical clustering algorithms yield clusters with a higher resolution. The clusters obtained in S. cerevisiae PPI data were validated using GO annotations and compared with those refined from gene expression microarray data [ 46]. A similar approach was also applied to decompose the metabolic network of E. coli into functional modules, based on the global connectivity structure of the corresponding reaction graph [ 47].
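
The node attribute vectors described at the start of this section can be computed with a breadth-first search per node. The sketch below assumes a connected, unweighted, undirected graph and compares two profiles with a Manhattan distance, which is one arbitrary choice of dissimilarity; the bootstrapped similarity of Arnau et al. is not implemented. All names are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def distance_profiles(adj):
    """Map each node to its vector of shortest path distances (assumes a connected graph)."""
    nodes = sorted(adj)
    profiles = {}
    for v in nodes:
        d = bfs_distances(adj, v)
        profiles[v] = [d[u] for u in nodes]
    return profiles

def manhattan(p, q):
    """Sum of absolute coordinate differences between two profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
toy_adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
profiles = distance_profiles(toy_adj)
within = manhattan(profiles[0], profiles[1])    # nodes in the same triangle
between = manhattan(profiles[0], profiles[4])   # nodes in different triangles
```

Nodes 0 and 1 see the rest of the graph from almost the same position, so their profiles differ far less than those of nodes 0 and 4. A matrix of such dissimilarities is what a standard hierarchical clustering routine would take as input.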

Supervised clustering

Provided that the eventual aim of module analysis is function prediction, it can be argued that supervised clustering (or classification), rather than unsupervised clustering methods, should be employed. In the context of cellular networks, classification aims at constructing a discriminant rule (classifier) that can accurately predict the functional class of an unknown node based on the annotation of neighbouring nodes and connections between them. To this end, Tsuda and Noble [ 48] considered a binary classification problem, and calculated pairwise distances on undirected graphs with a locally constrained diffusion kernel. They demonstrated a good protein function prediction with a support vector machine (SVM) classifier from S. cerevisiae PPI and metabolic networks. Supervised clustering methods in function prediction are challenged by their notorious dependence on the quality of the training examples [ 49]. As fully curated databases are rarely available, especially for less-studied organisms, the applicability of such methods is still limited. Therefore, an intermediate method between the two extremes of supervised and unsupervised clustering may be preferable. The protein function prediction algorithm by Nabieva et al. [ 50] suggests such an approach that exploits both global and local properties of the network graphs. They demonstrated better predictions than previous methods in cross-validation testing on the unweighted S. cerevisiae PPI graph. More importantly, they showed that the performance could be substantially improved further by weighting the edges of the interaction network according to information from multiple data sources and types [ 50].
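
For orientation, the simplest possible discriminant rule of this kind is a majority vote among a node's annotated neighbours, sketched below. It is a baseline only and is much cruder than the diffusion kernel with SVM of Tsuda and Noble or the method of Nabieva et al.; the protein names and labels are made up for illustration.

```python
from collections import Counter

def predict_by_neighbour_vote(adj, labels, node):
    """Return the most common known label among a node's direct neighbours, or None."""
    votes = Counter(labels[n] for n in adj[node] if n in labels)
    if not votes:
        return None
    return votes.most_common(1)[0][0]

# Hypothetical interaction neighbourhood of an unannotated protein "p1".
toy_adj = {"p1": {"p2", "p3", "p4", "p5"}}
toy_labels = {"p2": "metabolism", "p3": "metabolism", "p4": "transport"}
prediction = predict_by_neighbour_vote(toy_adj, toy_labels, "p1")
print(prediction)
```

The rule uses only local information and ignores unannotated neighbours such as p5. It also inherits every error in the training annotations, which is the dependence on training quality noted above.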


6 Conclusion

In this article we have reviewed existing work on the use of gene regulatory networks for computational purposes. We first introduced how genetic regulation works in living systems, followed by a discussion of existing computational models. This shows the diversity of encodings and dynamics that are currently being used; however, a rigorous comparison of models has yet to be performed. Recently, an initial study has been conducted in order to compare various encodings and dynamics [40]. Without doubt, the community would benefit from standardized benchmarks to facilitate the comparison of various models of gene regulation as well as other optimizable models, such as artificial neural networks, genetic programming, and handwritten scripts. The increasing frequency of competitions organized at conferences is one step in this direction, and serves as a good basis of comparison.

In our past experience of presenting artificial gene regulatory networks and applications to various real-world control problems, we are often asked about the difference between artificial gene regulatory networks and artificial neural networks (ANNs). While artificial gene regulatory networks and artificial neural networks can be used for similar purposes, AGRNs utilize a compact genetic representation: For instance, instead of encoding connection weights between neurons, which can mean the need to optimize millions of variables in recent deep neural networks, artificial gene regulatory networks only encode the 3D structure of proteins that codes for the dynamic interaction between them. This drastically reduces the number of variables, to a few hundred. This has widespread consequences, especially in the age of deep learning (DL) [60]. While applications of evolutionary algorithms to DL have just started to appear [101, 121], we expect that evolving DL neural networks with AGRNs will be a major application area in the future.

For a direct comparison between AGRNs and ANNs, however, the recurrent connectivity of AGRNs allows them to be best compared to recurrent neural networks. While there is significant work to be done in relating artificial gene regulatory networks to artificial neural networks, initial steps have been taken in [145]. There, Watson et al. show that evolving simple artificial gene regulatory networks is equivalent to the associative learning of weights in a Hopfield network. However, this observation has not been extended to artificial gene regulatory networks with more complex genetic representations. Also, Baran et al. recently proposed the use of AGRNs to study the evolution of social behavior and, more precisely, the underlying development of the brain's neural circuitry [8]. This opens new perspectives for studies of the connection between artificial neural networks and artificial gene regulatory networks.

Two other properties of AGRNs that are, in our opinion, particularly interesting and not yet fully used and understood are temporal dynamics and heterochrony. The first, temporal dynamics, allows a certain memory to emerge in the network: Concentrations can be updated constantly, at every time step of the simulation or problem resolution, while actions are executed once in a while. This provides the network with all the history of a given state of the environment, which is naturally kept by protein concentrations and provides a memory system of the GRN. Not yet mathematically studied or fully understood, these dynamics could be beneficial for long-term decision making.

The second, heterochrony, is a crucial property of these networks. As described previously in this article, this mechanism allows a slow modification of the network dynamics when mutation occurs. This mechanism is not yet sufficiently employed in current mutation operators of genetic algorithms. While crossover operators have been recently improved in [29], mutation is still crucial in AGRN optimization, since most approaches use very high mutation rates (∼75%), if not exclusively mutation. Whereas the NEAT algorithm has strongly impacted the evolution of neural networks [132], improving the evolutionary algorithm is a central question in order to find the best possible network for a given problem. More work is necessary in this domain in order to generate better results with artificial gene regulatory networks.

Finally, one domain in which artificial gene regulatory networks could excel but have not been well tested is online learning. Thanks to their easy-to-modify structure based on protein affinities, slight changes of proteins' tags while the agent is acting in the environment should be possible. A mechanism, comparable to backpropagation in artificial neural networks, will have to be designed in order to intelligently change these values according to the rewards obtained by the agent. The architecture of artificial gene regulatory networks should be helpful here, due to the small number of parameters one needs to modify in order to change entire networks. One could easily imagine particle-swarm-optimization-like motion, in which the AGRN's proteins would move in a 3D space (for a model based on three tags such as Cussat-Blanc et al.'s [39, 40, 125] model), attracted and repelled by other proteins according to the efficacy of the networks for a given task.

Possibilities opened by gene regulatory networks are numerous. Whereas biologists have made significant progress in understanding the inner mechanisms of gene regulation in living systems, much remains to be discovered and understood. These mechanisms produce extremely complex behaviors in living organisms, from embryogenesis to the regulation of everyday life. Computer science and more specifically artificial intelligence will benefit from these discoveries and, with gene regulatory networks, could produce more intelligent behaviors for artificial agents in the near future.


Synthetic Biology Comes into Its Own

Richard A. Muscat
Jun 1, 2016

Every two hours in Matthew Bennett’s Rice University lab, cyan and yellow lights flashed in synchronization. Bennett and his team had engineered 12 components to generate the coordinated oscillations. This circuit wasn’t electronic, however; it was biological. Two populations of E. coli, each carrying a synthetic gene circuit, cycled in synchronous pulses every 14 hours.
Bennett’s work, published last year in Science, 1 is a key application of modern synthetic biology: taking biological components and linking them together to form novel functional circuits. Instead of a program coded in Java and executed by a computer’s working memory, commands were written in DNA and carried out by the microbes’ cellular machinery. LEDs were replaced with fluorescent proteins, and molecular signaling cascades served as the system’s wires.

The first synthetic networks were created in 2000, when researchers built an oscillator and others constructed a bistable switch in E. coli. In an oscillator circuit, three genes form a cascade, in which each gene triggers the inactivation of the next gene. In the case of the landmark oscillator constructed by Rockefeller University’s Stanislas Leibler, then at Princeton, and his graduate student Michael Elowitz, one of the three repressor genes was also linked to green fluorescent protein (GFP), resulting in visible pulses of light. 2 A bistable switch, on the other hand, consists of just two genes that inactivate each other. When one gene is on, the other is off. Due to variation in the expression of the active gene, the inactive gene occasionally gets the chance to switch on and suppress the expression of the first gene. Like Leibler and Elowitz, MIT bioengineer James Collins and his grad student Tim Gardner, then at Boston University, linked one of the genes with a sequence encoding GFP, and they were able to see the cells switch between states. 3 (See “Tinkering With Life,” The Scientist, October 2011.)
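
The two-gene switch can be sketched with a commonly used dimensionless model in which each repressor's production rate falls as the other repressor accumulates. The equations, the parameter values (alpha = 10, Hill coefficient n = 2) and the simple Euler integration are assumptions chosen for illustration, not values taken from the 2000 study.

```python
def simulate_toggle(u0, v0, alpha=10.0, n=2.0, dt=0.01, steps=5000):
    """Euler-integrate du/dt = alpha/(1+v^n) - u and dv/dt = alpha/(1+u^n) - v."""
    u, v = u0, v0
    for _ in range(steps):
        du = alpha / (1.0 + v ** n) - u   # gene u is repressed by v
        dv = alpha / (1.0 + u ** n) - v   # gene v is repressed by u
        u, v = u + dt * du, v + dt * dv
    return u, v

# Same circuit, two different starting points.
u_on = simulate_toggle(u0=5.0, v0=0.1)   # starts with u ahead
v_on = simulate_toggle(u0=0.1, v0=5.0)   # starts with v ahead
print(u_on, v_on)
```

Whichever gene starts ahead ends up fully on while the other is held off, so the same circuit settles into one of two stable states depending on its history. This deterministic sketch has no noise, so it does not reproduce the occasional spontaneous switching described above.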

Since these early studies, engineers, computer scientists, mathematicians, and physicists have been applying their expertise to engineer synthetic gene networks. In addition to supporting the creation of novel functions, synthetic networks can also give insight into how naturally occurring ones work. As physicist Richard Feynman once wrote on his office blackboard, “What I cannot create, I do not understand.” Studying gene circuits in their natural context is complicated by the complex cellular environment in which they function; reconstructing and tuning gene interactions in vitro can provide a simplified model for how equivalent networks behave in nature. (See illustration below.)

“Naturally occurring gene oscillators, especially the circadian oscillator that regulates our daily rhythms, are hard to study,” says Bennett. “We can easily make changes and fine-tune synthetic gene circuits in ways that are difficult in natural systems. Though our synthetic circuits are inherently different from their natural counterparts, we can use them to study some of the basic principles of how genes dynamically regulate each other.”

Stripped back to its most basic components, a synthetic or natural biological network consists of a gene that either switches another gene on or turns it off.

Researchers at the J. Craig Venter Institute (JCVI) in San Diego have even gone so far as to create the smallest functional genome to date, a mycoplasma bacterium consisting of just 473 genes. 4 This stripped-down cell can now provide insights into what each of the genes and their respective proteins are doing to keep the organism alive.

Building man-made circuits can also lead to something entirely new, Bennett adds. “I try to find ways to engineer a new synthetic circuit that can mimic the unexplained phenomenon, even if my solution is not the same as nature’s. Sometimes this leads to new insights into the natural circuit and sometimes not. Either way it’s exciting.”

Genetic networking

In the early 2000s, Uri Alon of the Weizmann Institute of Science in Israel and colleagues studied the connections between genes in E. coli, discovering common motifs, or patterns of gene connectivity. Importantly, the researchers found that these motifs occurred more often than could be expected if you took the same number of genes and randomly connected them, suggesting that biological networks have evolved these patterns. 5 After early studies demonstrated researchers’ ability to create novel gene circuits, many synthetic biologists began making synthetic replicas of these natural motifs.

DECIPHERING THE NETWORK: A naturally occurring gene network consists of many interacting genes that can activate or repress each other (top). But embedded within a larger network, their function can be hard to study. Synthetic biology can simplify the study of such gene interactions by engineering analogous circuits separate from the larger network (bottom).

By isolating a subset of genes and inserting them into a new cell, synthetic biologists can assemble a motif that has little interaction with the molecular machinery of that cell; the genes are considered orthogonal. For example, viral promoters (sequences of DNA that drive gene expression) can be used to express GFP, a gene taken from jellyfish, inside mammalian cells. Modifications to promoters can allow them to be controlled by signaling molecules, not only allowing novel genes to be expressed, but giving synthetic biologists the ability to switch them on and off.

One of the simplest signaling motifs involves a gene that either activates or represses itself. Positive autoregulation is when a protein triggers its own expression. At first, due to the absence or very low concentration of protein, its expression is very low. After a while, however, an intermediate level of the protein builds up, speeding up the rise in expression levels. The overall effect of positive autoregulation is thus a delay before the gene reaches normal expression rates. 6 Conversely, negative autoregulation, when a gene inhibits its own expression, allows the fast activation of a gene upon exposure to a signaling molecule, but then slows its own production once it reaches a critical level, allowing it to rapidly reach a steady state. 7 (See illustration below.)
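
The difference in response time can be sketched with three one-variable models that all settle at the same steady-state level of 1. The rate laws and constants below are illustrative assumptions, chosen only so that the three cases are comparable; the helper measures how long each takes to reach half of its final level.

```python
def time_to_half(rate, x_ss=1.0, dt=1e-4, t_max=50.0):
    """Euler-integrate dx/dt = rate(x) from x = 0; return the time x first reaches x_ss / 2."""
    x, t = 0.0, 0.0
    while x < x_ss / 2.0 and t < t_max:
        x += dt * rate(x)
        t += dt
    return t

# All three rate laws have their steady state at x = 1 (degradation rate 1).
def no_autoregulation(x):
    return 1.0 - x

def negative_autoregulation(x):
    # Strong promoter that is shut down by its own product.
    return 101.0 / (1.0 + (x / 0.1) ** 2) - x

def positive_autoregulation(x):
    # Weak basal expression plus self-activation.
    return 0.1 + 1.8 * x / (1.0 + x) - x

t_simple = time_to_half(no_autoregulation)
t_negative = time_to_half(negative_autoregulation)
t_positive = time_to_half(positive_autoregulation)
print(t_negative, t_simple, t_positive)
```

In this sketch the negatively autoregulated gene reaches its half-level first and the positively autoregulated gene last, matching the qualitative picture above. The size of the gap depends entirely on the assumed promoter strengths.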

Another common motif includes the interaction of several genes forming feed-forward loops, in which one gene activates or represses the expression of another only under certain conditions. Synthetic implementation of one particular feed-forward loop has been shown to produce a pulse of gene activation—a large peak of gene expression followed by steady state expression. 8
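
A pulse of this kind can be sketched with a three-gene loop in which an input X, switched on at time zero, activates both Y and the output Z, while Y represses Z. The equations, the repression threshold k and the Hill coefficient n are illustrative assumptions and are not taken from the cited synthetic circuit.

```python
def simulate_pulse(k=0.3, n=2.0, dt=0.001, steps=20000):
    """Euler-integrate the loop and return the output level z at every time step."""
    y, z = 0.0, 0.0
    z_trace = []
    for _ in range(steps):
        dy = 1.0 - y                            # X activates Y
        dz = 1.0 / (1.0 + (y / k) ** n) - z     # X activates Z, Y represses Z
        y, z = y + dt * dy, z + dt * dz
        z_trace.append(z)
    return z_trace

z_trace = simulate_pulse()
peak = max(z_trace)
final = z_trace[-1]
print(peak, final)
```

Z rises quickly while Y is still scarce, then falls back as Y accumulates and represses it, so the peak is well above the final level. With these assumed parameters the final level is 1/(1 + (1/0.3)^2), roughly 0.08.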

Autoregulation and feed-forward loops highlight how synthetic biology can create direct replicas of naturally occurring circuits to understand their function. However, synthetic biologists also have engineered a number of novel behaviors in cells: for example, different types of computation. One of the choices a synthetic biologist might make when constructing a synthetic circuit is whether to make it digital (i.e., on/off) or analog (varying levels of output). Researchers have constructed digital circuits implementing Boolean computations such as AND/OR and NOT/NOR logic with up to 10 regulators and 55 component parts in E. coli. 9 Of course, many genes are not expressed in a digital manner; neither completely on nor completely off, they are, rather, expressed dynamically over a range of levels. As a result, synthetic biologists are increasingly taking inspiration from nature and designing computational circuits in analog, implementing functions such as addition, subtraction, and division. 10
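
The digital and analog views can be seen in one toy function. The sketch assumes a promoter that is repressed by two inputs, with each repressor modeled by a Hill function and the two effects multiplied; the function name, the threshold k and the Hill coefficient n are hypothetical.

```python
def nor_gate(a, b, k=0.2, n=2.0):
    """Steady-state output of a promoter repressed by input a and by input b."""
    repression_a = 1.0 / (1.0 + (a / k) ** n)
    repression_b = 1.0 / (1.0 + (b / k) ** n)
    return repression_a * repression_b

# Digital reading: inputs are either absent (0) or saturating (1).
for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(a, b, round(nor_gate(a, b), 3))

# Analog reading: an intermediate input gives an intermediate output.
half_output = nor_gate(0.2, 0.0)
```

With saturating inputs the output is high only when both repressors are absent, which is NOR logic. An input equal to the threshold k gives exactly half-maximal output, which shows that the same promoter behaves as an analog element between the two extremes.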

More than 15 years of constructing such biological gene networks has made waves in a wide variety of scientific fields. For example, Mary Dunlop of the University of Vermont is taking advantage of feedback circuits in the design of biofuel-producing bacteria. Her group has modified E. coli to express biofuels such as alcohols, diesels, or jet fuels that are exported from the cell by efflux pumps. Too much biofuel accumulating in a cell is toxic, and expression of too many efflux pumps places a strain on the cell. Either of these problems can prevent cell growth and the production of more biofuel. Through mathematical simulation, Dunlop has demonstrated that a negative-feedback sensor could control the balance by delaying pump expression until it is needed, when there is enough biofuel inside the cell to necessitate pumping it out. 11

DYNAMIC GENE EXPRESSION: A number of motifs that appear in naturally occurring networks have been reconstructed in synthetic circuits. Positive autoregulation (left) occurs when a gene is activated by its own product; this results in delayed activation. (The black dotted line provides a comparison to gene activation with no autoregulation.) Conversely, negative autoregulation occurs when a gene represses its own expression (middle), allowing its rapid activation until it reaches a steady state, and then preventing overexpression. Finally, a combination of several genes can form a motif known as a feed-forward loop (right). Depending on the way the genes are connected, activating a single gene triggers the simultaneous activation and repression of another gene, causing a pulse in expression followed by a lower steady state.

Synthetic circuits are also showing potential as valuable tools for diagnosing and treating disease. In one study, researchers created synthetic circuits designed to detect combinations of microRNAs associated with a particular case of cervical cancer, and inserted the circuits into cancer and noncancer cell lines. Using a combination of AND and OR logic allowed the detection of specific combinations of different microRNA species only present in HeLa cells. If the right microRNA combination was detected, the synthetic circuit expressed a gene that caused the cells to die. 12

While many technical challenges stand in the way of applying synthetic biology techniques in treating patients, a more near-term application may come in the form of paper-based diagnostics. In 2014, Collins and his colleagues at Harvard and Boston Universities developed synthetic gene circuits that function outside of cells and can be embedded in paper, which changes color through the expression of fluorescent proteins if certain markers are present in the sample. In this proof-of-concept study, the researchers showed that such paper-based diagnostics could be designed for a diverse range of applications, from glucose detection to the identification of different strains of Ebola virus, with outputs that can be seen by eye or a cheap microscope. 13 This year, the team updated the test to detect 24 RNA sequences found in the Zika genome; when a target RNA is present, a series of interactions turns the paper purple. 14 Paper-based diagnostics are easy to store through freeze-drying and to move to low-resource settings out of the lab, and researchers are now working to design such diagnostics for use in the field.

Working in tandem

BACTERIAL MOSAIC: Two populations of E. coli fluoresce yellow and cyan in unison as they activate or repress the other’s expression as well as their own. (See illustration below.) SCIENCE, 349:986-89, 2015, COURTESY OF MATTHEW BENNETT

The examples described so far have been of genetic circuits operating in isolation inside many identical cells or outside cells altogether. In nature, however, cells don’t exist in a vacuum; rather, small signaling molecules that can be easily transmitted across cell membranes allow cells to communicate with their neighbors.

In the case of the oscillator created in 2000, each individual cell in a population of bacteria would act in isolation, with one cell oscillating out of phase with its neighbors. Ten years later, University of California, San Diego’s Jeff Hasty and his colleagues used small signaling molecules that could pass out of one cell and into another to regulate the gene network within that cell. As a result, the oscillations of entire populations of bacteria were linked together and cycled in unison. 15
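The effect of a shared signal on a population of oscillators can be illustrated with a deliberately abstract model. The sketch below uses the Kuramoto phase model, a standard textbook abstraction swapped in here for illustration. It is not the gene-network model used in the study: each cell is reduced to a single phase, and the diffusible signal is represented by the population's mean field. The number of cells, the frequency spread, and the coupling strength are all assumed values.

```python
# Abstract sketch (Kuramoto phase model, not the published circuit):
# cells oscillate at slightly different rates and are either uncoupled
# or coupled through a shared mean-field "signal".
import cmath
import math
import random


def order_parameter(phases):
    """Synchrony measure r in [0, 1]: 1 means all cells share one phase."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def simulate(coupling, n_cells=50, dt=0.01, t_max=50.0, seed=0):
    """Euler-integrate d(theta_i)/dt = w_i + K * r * sin(psi - theta_i)."""
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_cells)]
    freqs = [1.0 + 0.05 * rng.gauss(0.0, 1.0) for _ in range(n_cells)]
    for _ in range(int(t_max / dt)):
        mean_field = sum(cmath.exp(1j * p) for p in phases) / n_cells
        r, psi = abs(mean_field), cmath.phase(mean_field)
        phases = [p + dt * (w + coupling * r * math.sin(psi - p))
                  for p, w in zip(phases, freqs)]
    return order_parameter(phases)


r_uncoupled = simulate(coupling=0.0)  # cells act in isolation
r_coupled = simulate(coupling=1.0)    # cells share a signal

print(f"synchrony without shared signal: {r_uncoupled:.2f}")
print(f"synchrony with shared signal:    {r_coupled:.2f}")
```

With no coupling the phases stay scattered and the synchrony measure remains low. With coupling the population locks to a common rhythm, a rough analogue of the population-wide oscillations described above.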

Bennett’s group at Rice University took this idea one step further when they made use of two interacting populations of E. coli carrying different genetic circuits to coordinate the long-term, stable oscillations of fluorescent protein expression. One population of bacteria acted as an activator strain while the other population acted as a repressor strain. The activator strain produced a signaling molecule that activated even more of its own signal production, triggering activation of the repressor strain. When activated, the repressor strain produced another signaling molecule that repressed both itself and the activator strain. Each strain also produced an enzyme that degraded the signaling molecules in the system, preventing the buildup of leftover signal.

“We took the circuitry of a single strain oscillator and reconfigured it so that two strains must work together to achieve the oscillations,” Bennett explains. “It’s similar to taking a book and giving the even pages to one person and the odd pages to another. To either individual, their portion of the book is useless. But if the two can communicate and work together, the book will make sense.”

COORDINATED OSCILLATIONS: Two populations of bacteria interact via signaling molecules to coordinate expression of fluorescent proteins. When using positive and negative autoregulation (top), the oscillations are robust as the two populations grow. The negative feedback loop of the repressor strain and the positive feedback loop of the activator strain thus reinforce oscillations; when feedback is removed from the circuit (bottom), oscillations are less coordinated and prone to failure. “You can think of feedback loops as self-correction mechanisms,” says Bennett. “They are constantly assessing the current performance of the circuit and make changes if necessary.”
See full infographic: WEB COURTESY OF RICHARD MUSCAT
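The idea of feedback as self-correction can be shown in the simplest possible setting: a single gene at steady state. The sketch below is an assumption-laden illustration that does not model the two-strain oscillator itself. It compares how much the steady-state level changes when the maximal production rate is doubled, with and without Hill-type negative feedback; the parameters `K` and `n` and the production rates are invented for the example.

```python
# Illustrative sketch: negative feedback buffers the steady state against a
# change in production rate. Rate laws and parameters are assumptions.


def steady_state(production, alpha=1.0, lo=0.0, hi=1e3):
    """Solve production(x) = alpha * x by bisection, assuming production(x)
    does not increase with x so that there is a single crossing."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if production(mid) - alpha * mid > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def open_loop(beta):
    # No feedback: production is constant.
    return lambda x: beta


def negative_feedback(beta, K=1.0, n=2):
    # Self-repression: production falls as the product accumulates.
    return lambda x: beta / (1.0 + (x / K) ** n)


# Double the maximal production rate and compare the resulting fold-changes.
fold_open = steady_state(open_loop(20.0)) / steady_state(open_loop(10.0))
fold_feedback = (steady_state(negative_feedback(20.0))
                 / steady_state(negative_feedback(10.0)))

print(f"fold-change without feedback: {fold_open:.2f}")
print(f"fold-change with feedback:    {fold_feedback:.2f}")
```

Without feedback, doubling production doubles the output. With self-repression the output changes far less, because the extra product partly shuts off its own synthesis.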

As each strain is activated, a fluorescent molecule is produced: cyan in the activator strain and yellow in the repressor strain. When mixed together, both populations are activated and repressed in unison, causing fluorescent oscillations over the entire cell population. When one strain is grown in isolation, no oscillations are observed.

Combining the principles of basic gene motifs, such as feedback loops, with cell population biology can thus expand the repertoire of synthetic biologists looking to create novel genetic circuits. Likewise, implementing synthetic biological circuits in mixed cell populations that have coordinated behavior might illustrate ways in which complex synthetic tissues and organs could be engineered.

The decreasing cost of DNA synthesis and sequencing, the ability to share plasmids, the creation of databases describing genetic components, and the development of novel techniques to easily assemble and edit genomes have greatly accelerated progress in this area. As researchers engineer new genetic components, the relatively new field of synthetic biology could soon begin to bear actionable fruit, with applications that include compound synthesis, diagnostics, and even medical treatments. In addition, the design and study of synthetic systems will continue to give us a deeper understanding of the biology that exists around us.

“I take a great deal of inspiration from nature,” says Bennett. “Sometimes I see a circuit that is well-characterized and wonder if we can build it just as well as nature. Other times, I look at a phenomenon in nature that is unexplained. Then I get really excited.”

Richard A. Muscat works at the London-based Cancer Research UK, bringing together multidisciplinary teams of researchers using engineering and physical sciences to find new ways to tackle cancer.


SRH, EAB, MC, and RSM performed analysis and wrote the paper. MB and MF performed analysis. BJW, GK, DGH, and RSM performed experiments. EAB, MC, and MB developed the dynamical system model. SRH wrote the impulse R package. AB developed the interactive website. RSM conceived and oversaw the study.

SRH, DGH, BJW, GK, AB, and RSM are employees of Calico Life Sciences LLC. EAB, MC, MF, and MB are employees of Google LLC. The authors declare that they have no conflict of interest and no competing financial interests from this work.

