1. Introduction
The
lac operon in
Escherichia coli is a paradigmatic example of a genetic regulation system involving the interaction of positive and negative regulatory molecules. The system encodes a pathway for lactose catabolism that is hierarchically controlled by glucose availability, and it has been reported to exhibit a bistable behavior, since the catabolic genes are either uninduced (OFF) or induced (ON) in a single cell, depending on previous activation history and specific extracellular lactose and/or glucose concentrations [
1,
2]. The
lac operon has been used as a model system for gene regulation since its initial description in [
3], where the concept of an operon was first introduced. Since the molecular components of the regulatory system have been well characterized, it is an excellent candidate for global analysis and modeling. The operon includes three genes involved in the uptake and catabolism of lactose and/or structurally similar sugars. The first gene,
lacZ, encodes for the enzyme
-galactosidase, which converts lactose to allolactose and to subsequent catabolic intermediates;
lacY, the second structural gene of the pathway, encodes for lactose permease, a membrane transporter for lactose uptake. Finally, the product of gene
lacA is an acetyltransferase that takes part in the degradation and excretion of non-lactose sugars that may be misrouted through the lactose degradation pathway.
The fundamental states of the
lac operon are described schematically in
Figure 1. In brief, when a low level of extracellular lactose occurs, a strong binding of the LacI protein to the operator sequences strongly represses lac gene expression, thus allowing only a very low amount of mRNA to be transcribed (
Figure 1a), which can be considered as an OFF state. However, even at this low expression level, a few LacY permease molecules can reach the cytoplasmic membrane and take up any eventual lactose that appears outside the cells. The presence of low amounts of LacZ in the cytoplasm leads to transformation of this lactose into
allolactose, the true natural inducer of the
lac operon. When this happens, allolactose interacts allosterically with the LacI repressor protein, decreasing its affinity for operator sequences and releasing the negative control of the
lacZ promoter, rendering the system into an ON state (
Figure 1b). Positive feedback loops, like the one described here for the lac system, have been proposed as a typical property of bistable systems [
4].
On the other hand, an essential feature of the regulatory system is that high extracellular glucose concentrations can also inhibit the expression of lac genes, a phenomenon traditionally considered as a type of
catabolite repression that is directly mediated by the glucose molecule. Such a repression has usually been assumed to involve the transcriptional activator protein Crp (cAMP receptor protein), which also takes part in lac gene expression as a cAMP-dependent positive regulator (
Figure 1). In the repression model, glucose uptake by the cells produces a drastic reduction in the intracellular cAMP levels, negatively affecting Crp activity and abolishing lac gene expression, which would explain the metabolic preference of glucose over lactose in
E. coli [
5]. An additional explanation of the catabolic hierarchy of glucose is given by the
inducer exclusion effect, which involves glucose-mediated inhibition of lactose uptake, by means of a direct reduction of LacY permease activity [
6]. This would greatly contribute to the regulatory effect of glucose by preventing the inducer molecule to reach the LacI repressor.
Bistability is normally defined as a condition where a system can respond to the same external signal or input in two different manners, depending on the internal state and/or the previous history of the system. However, true bistability requires the system to switch in an all-or-none way between alternative fixed points, without the appearance of intermediate states or a “quantitative” cellular response. Although a quantitative continuous response to the presence of lactose, or the gratuitous inducer thiomethyl-
-D-galactoside (TMG), was initially shown in experiments measuring the average response of whole culture populations, containing several millions of cells, observations with individual bacteria showed unequivocally that each cell responds in a discrete manner, switching alternatively among the OFF and ON states [
1,
2], with virtually no evidence of intermediate responses. This shows that bistability can be a characteristic of the
lac system. Using genetically engineered
E. coli reporter cells, gratuitous inducers and sophisticated experiments, the authors in [
7] showed a discrete hysteretic response of lac gene expression in individual bacteria when pre-incubated in the presence or absence of extracellular lactose and then transferred to fresh medium with different lactose concentrations. Their results showed that, depending on the cell’s previous history, the response profile to lactose in the fresh medium varied significantly within a specific concentration range, while outside its limits, cells would respond independently of their previous incubation conditions: Any concentration lower than 3
M would not lead to
lacZ induction, whereas any concentration higher than 30
M would induce the reporter gene. However, most values within the 3–30
M window were found to induce
lacZ expression only for lactose preincubated cells. This
window of bistability was very clearly defined in the absence of extracellular glucose but when this sugar was present, the concentration window moved towards higher lactose concentrations (see Figure 2c in [
7]).
Thus, the above all-or-none and bistable behavior that characterizes the lac system makes it suitable for discrete modeling. In particular, Boolean models have been used successfully to analyze genetic regulatory systems on numerous occasions [
8,
9,
10]. They are useful for the analysis of large regulatory networks when mechanistic details underlying are scarce, insufficiently defined, or even controversial, as it has been discussed that, in such models, the interaction type (inhibition or activation) and network topology are enough for capturing dynamics of gene networks [
11,
12], without the need of estimating parameters, thereby reducing model complexity. For example, they are being used extensively to represent interactions revealed by high throughput sequencing technologies in the context of systems biology [
13,
14]. Although some of these assumptions make them less applicable for certain biological processes, they have proved very valuable for predicting the behavior of genes during bacterial metabolism [
15] or in the context of interspecies interactions [
16]. Even more, other researchers have been able to reproduce bistability for eukaryotic cell differentiation [
17], surfactant production in bacteria [
18] and microbial signaling [
19] using Boolean models.
In this context, the authors of [
20] modeled the lac regulation system using a Boolean network that was able to reproduce the bistability of the system under certain but not all of the conditions described in [
7].
Figure 2 displays a scheme of this model, where
and
are parameters that account for different concentration levels of glucose and lactose, respectively. This allows to assign specific values as inputs in order to predict the final outcome of the network in terms of fixed points (steady states) that are consistent with the operon being ON, OFF, or in a bistable condition. When the nodes are updated in a parallel regime, the model can reproduce the experimental results in [
7] when extracellular glucose is low or absent. However, the predicted outcome of the model differs from the actual behavior of the lac system when lactose and glucose concentrations are high, a feature that limits the usefulness of this initial model for representing lactose fermentation by
E. coli.
In this work, we first analyze the dynamics of the Boolean models considered but under all different updating schemes in order to know how robust they are in the sense of whether or not new attractors appear. In general, this is a difficult task because the number of different updates schemes associated to a network grow exponentially with its size [
21]. Fortunately, there are currently mathematical tools that can significantly reduce this analysis [
22]. Secondly, we propose improvements for the performance of the model considered by a rational redesign of the specific Boolean functions in order to predict the operon behavior in each of the sugar combinations tested experimentally in [
7], including combined scenarios of low, medium and high concentrations of glucose and lactose. Furthermore, we have taken into account current biological data, regarding catabolic repression and inducer exclusion mechanisms, to bring function definitions up to date.
The paper is organized as follows;
Section 2 summarizes the basic mathematical concepts and definitions used along the manuscript,
Section 3 contains the most relevant aspects taking into account of [
20], in
Section 4 our main mathematical results are presented, in
Section 5 alternative improvements for the studied models are proposed. Finally, we discuss on our findings in the conclusions section.
2. Mathematical Background
A
Boolean Network (BN) is a couple
, where
is a finite directed graph named
interaction digraph (e.g.,
Figure 2(left)),
V is a set of
nnodes (
n also called the
network size),
is the set of
edges and
is a Boolean function composed by
n local functions
such that
(
x is called a
configuration while the value of the variable
associated with node
k is known as an
state). Besides, the local functions
depends only on variables
such that
(e.g.,
Figure 2(right)).
Among the different ways that the states of a BN can be updated is the family of
(deterministic) update schedules [
22] defined as functions
such that
for some
. In particular, the well-known
parallel (or
synchronous) and
sequential (or
asynchronous) update schedules are obtained when
(i.e., all states are updated at the same time) and
(i.e., all states are updated at different times but following a predetermined order), respectively. We denote by
the set of deterministic update schedules that exist for a BN of size
n. If the states of a configuration
are updated according to a given update schedule
s, a new configuration
will be obtained. The change from
x to
can be represented by the arc
and is known as
transition from
x to
. Thus, the
configurations and their respective transitions give rise to the
dynamics of the system which can be represented in a
state transition graph Furthermore, because it is finite, the dynamics has limit behaviors called
attractors, which are of 2 types:
Fixed point (or steady state): configuration such that .
Limit cycle: sequence of configurations pairwise distinct such that , for all , and is a integer named the length of the limit cycle.
The set of configurations that converge to a specific attractor is called the attraction basin.
6. Conclusions
The
lac Operon Boolean model without catabolite repression and its respective reduced version proposed in [
20] were analyzed. These models have the particularity of being simple but capable of reproducing the operon being OFF, ON and bistable for different levels of lactose and glucose, matching very well with biological experiments such as those published in [
7]. Unlike the models of [
20], the Boolean networks proposed here predict bistability even at high glucose concentrations. This feature has been observed experimentally when inducer (a non-metabolizable lactose analogue) concentrations are also high [
7]. Furthermore, our models take into account intermediate glucose concentrations, thus increasing sensibility to changes within the “window of bistability” of the lac system (see more details in the second part below).
In a first part of this paper, we have studied the dynamical robustness of these models where we made the following contributions:
For the first 3 cases described in
Section 3.2 and that included 5 of the 6 combinations of parameters allowed in the original and reduced models, we establish two Propositions proving the non-existence of any limit cycle, whatever the update schedule used.
For the case 4, where bistability appears, we made an exhaustive analysis of all its possible dynamics generated with any update schedule. Here we detail for both models the average sizes of their attraction basins, the number of dynamics without limit cycles (i.e., only with fixed points) and the number of dynamics with limit cycles (being less than 30% in both models).
Again in the case 4, its predominant attractor (those that have the bigger attraction basin), changes dramatically; OFF attraction basin being, in average, 8 times bigger than that of ON for the original model while that in the reduced one, the ON basin is almost 2 times bigger than that of OFF.
The above findings allow us to conclude that the effect of the glucose and lactose parameters in the interaction digraph associated to each network breaks its cycles for cases 1, 2 and 3, transformed into an acyclic network that will have only fixed points as attractors and, consequently, being completely robust in these situations. However, in the case of bistability, the role of the update schedule can change this property significantly because limit cycles with large attraction basins can appear. Moreover, this work supports the hypothesis of [
20] that network topology is a key factor for qualitative dynamical properties but not for quantitative ones.
As a second part, we have proposed two alternative improvements for the Boolean models studied, with biological support and that involve small modifications in their local functions. In the first improvement, the prediction is corrected for the case in which the parameters of the models represent high glucose and lactose levels, achieving the bistability observed in the biological experiments of [
7]. In this way, with such an improvement, the original and reduced models match perfectly with the above experiments of [
7] for all the 6 combinations of parameters. The second improvement increases the possible combinations of parameters for the original model, going from 6 to 9, enriching the dynamics of the models and matching the bistability window observed in [
7] with each of the 9 possible combinations of parameters. By keeping in mind that in our models, inducer exclusion can effectively explain operon regulation, which takes into account the current challenges against the glucose-mediated repression model, our results can also be compared with some continuous models such as those of [
42], based on differential equations, where bistability windows are displayed in a wide range of glucose concentrations. It is worth mentioning that the main softwares used in this manuscript were Matlab (for the exhaustive analysis of all the different dynamics of each model and which allowed us to build
Table 3 and
Table 5) and RStudio (for the visualization of most of the state transition graphs presented in this work as well as for the stochastic experiment;
Figure 15).
Although the lac operon is a classical model and many of its key molecular players were identified a long time ago, our understanding of how these players interact with each other has evolved continuously through the years [
43]. Full comprehension of the system requires robust and thorough knowledge of key regulatory features, like catabolite repression, where the view of a direct inhibitory effect of glucose on the cAMP-CRP regulatory system has been challenged during the last decades [
25,
26,
27,
28,
29,
30]. Furthermore, lac operon bistability is a feature that is hard to reproduce by models in general, and the applicability of a Boolean network including glucose-mediated regulatory systems has only been tried previously in [
20]. Our present work shows how the translation of updated mechanistic information about the actual role of glucose into the network allows taking better advantage of a Boolean model to test and reproduce bistability in a way that is faithfully representative of experimental data. This has multiple implications as, in general, bistability has been considered a ubiquitous feature in bacteria, involving several processes [
44], and understanding the dynamics of global gene regulatory systems revealed by high throughput technologies has become increasingly complex, thus demanding simpler, robust modelling algorithms, such as Boolean networks. Future models could include additional molecular features of the operon for the lac system, like repressor oligomerization, inducer (lactose) degradation by LacZ activity, and consideration of additional binding sites within the lac promoter region to provide a more detailed description of biological outcomes. Another natural future extension consists of giving a more quantitative approach to the models here studied, similar to what has been done in works such as [
45,
46,
47].