“One of the most remarkable developments in Com- puter Science over the past 50 years has been the rea- lization
that allowing computers to toss coins
can lead to algorithms that are more efficient, conceptu- ally simpler and more elegant than their best known deterministic counterparts.”
Alistair Sinclair, University of California, Berkeley
1. Introduction
The
probabilistic method in combinatorics is a very useful and powerful family of proof techniques; for a standard reference book, see Alon and Spencer [
1]. As an important tool in this area, the result known as
Lovász Local Lemma (LLL), which was first published in 1973 by Erdős and Lovász [
2], plays a fundamental role and is often referred to as a “gem” in this field.
A key (and quite fascinating) feature of the probabilistic method is that using probabilistic arguments, it allows proving completely deterministic claims. This is also the case for the Lovász Local Lemma. Specifically, it allows us to prove that a certain structure or property exists with positive probability in a random setting, which implies that the structure must indeed exist or the property must hold deterministically. In a sense, randomness and probability play here a catalyst role: they make possible the progress toward the goal, but eventually disappear in the end result. In a number of cases, however, no other way is known to achieve the same result, which is quite surprising.
In the next section, we describe a specific motivating problem and solve it with the Lovász Local Lemma, illustrating its strength.
2. A Motivating Problem and the Original LLL
Hypergraph 2-coloring. Let H be a hypergraph, in which every hyperedge has at least vertices, and each hyperedge intersects at most d others. We ask that for what values of k and d can it be guaranteed that such a hypergraph is 2-colorable—that is, we can color each vertex with one of two colors, such as red and blue, such that no hyperedge becomes monochromatic.
To describe a solution attempt, let be the hyperedges in H. Consider a random coloring—that is, for each vertex, flip a coin independently and, depending on the outcome, color the vertex red or blue. Let denote the event that is monochromatic. What we aim to achieve is that no hyperedge is monochromatic, which is expressed by the event . If we can show , a 2-coloring of H must exist.
To find the probability of
C, let us first estimate the probability of
. If
has
vertices, then out of its
possible colorings, only two are monochromatic (the all-red and all-blue colorings). Therefore,
where the inequality follows from
. Then, (
1) implies
. Now, if the events
were independent, we could argue that
holds. It would mean that with positive probability no edge is monochromatic, so a 2-coloring of
H must indeed exist. The problem with this argument is that the events
are
not independent whenever the hyperedges overlap. Consequently, the equality
that we used in (
2) generally does not hold.
This is the critical point where the LLL provides invaluable help. Before going into formal details, let us informally display the core message of the lemma:
Key message of LLL.For any system of events, in order for the conclusion to hold, it is enough if the events are “almost” independent, in the sense that each one of them depends only on a limited number of others, provided that is small enough.
Specifically, the historically first version of the lemma, proved in the paper [
2] published by Erdoős and Lovász in 1973, is the following claim:
Lemma 1 (LLL—symmetric version, original form). Let be events such that each event is mutually independent of all the others, except at most of them. If holds, then .
Let us apply it to the above-described problem of hypergraph 2-coloring. Each event
in the example depends only on those events
for which
, since we pick the color of each vertex independently. As each hyperedge intersects at most
d others,
may depend on at most
d other events. We know from (
1) that
. Therefore, if
holds, then
follows, so the inequality (
3) is satisfied. As a result, Lemma 1 implies
, which means that the desired 2-coloring must exist. Rearranging (
4), we obtain the condition
, or equivalently,
. Thus, by means of the LLL, we have proved the following:
Theorem 1. Let H be a hypergraph in which each hyperedge has at least vertices, and any hyperedge intersects at most d others. Then, the condition implies that H is2-colorable.
For example, if each hyperedge of H has at least 11 vertices, and intersects with at most 256 other hyperedges, then H is always 2-colorable, regardless of its structure. The reason is that with and the inequality becomes , which indeed holds.
Remark 1. The factor 4 in the condition can be reduced to (the base of the natural logarithm), using stronger versions of the LLL (see Section 4), yielding the somewhat less demanding condition , instead of . This, however, does not change the principle of the proof. It is fascinating to note that applying the probabilistic claim of the LLL, we have obtained the completely deterministic result of Theorem 1. Furthermore, to the author’s best knowledge, no proof is known of Theorem 1 without the use of the Lovász Local Lemma.
3. Naming Conventions
Before looking at stronger versions of the LLL, as well as related results, let us summarize some naming conventions that have become commonplace over the years.
Why “local?” The “local” adjective in the name refers to the situation that each event is typically dependent only on a small number of others. This can be visualized by a dependency graph, in which each event is represented by a vertex, and the event is mutually independent of all events to which its vertex is not connected. In this graph representation, each event is dependent only on its neighborhood—that is, subject to local dependencies only. This explains the “local” in the name.
Remark 2. The dependency graph is not the same as the graph that we could obtain by connecting two nodes whenever they are dependent. For example, if all the events are pairwise independent, but not mutually independent, then the latter graph would have no edge, wrongly suggesting that all the events are mutually independent.
Naming the lemma after Lovász. The first version of the LLL was published in the paper [
2], jointly authored by Paul Erdoős and László Lovász. Then, why is the lemma not called “Erdoős–Lovász Local Lemma?” The reason is that Erdoős insisted in every lecture he gave about the subject that this result was created by Lovász alone, even though they applied it together in their joint paper. This is mentioned in an interview with Lovász, see [
3] (in Hungarian).
Symmetric vs. asymmetric versions. The original version of the LLL (Lemma 1) is
symmetric in the sense that each event is treated equally: they all have to satisfy the same probability bound (
3). In the
asymmetric variant (see
Section 4.2), the events may satisfy different probability bounds.
4. Stronger Versions of LLL
4.1. Strengthening the Symmetric Version
A natural question in connection with the original symmetric LLL (Lemma 1) is this: how large can
be, such that the conclusion
is still guaranteed to hold? Note that a larger value of
makes
smaller, so the nonemptiness of
becomes less likely. To formalize it, let us introduce a parameter
and replace the condition (
3) with
Then, the largest value of p for which the conclusion still remains valid provides us with the strongest from of the symmetric LLL. Note that the value of d is kept fixed.
Shearer [
4] investigated the above question and computed a specific function
, which is the supremum of
p values in (
5) for which the conclusion
remains valid, for a fixed
d:
Lemma 2 (Shearer’s lemma). For any fixed integer , let be the supremum of p values, for which the the conclusion remains valid if p is used in place of in(
3).
Then, Since
is the
supremum (not necessarily the maximum) of
p values in (
5) for which the conclusion
holds, therefore, we can formulate the strongest version of the symmetric LLL this way:
Lemma 3 (Strongest symmetric LLL). Let be events, such that each event is mutually independent of all the others, except at most of them. Let be given by(
6)
. Then, if holds, then .
For example, if
, then
, but
, so in this case the latter allows an about 77.7% larger value for
, than (
3). Since, however, the formula (
6) for
is relatively complicated, one may look for a simpler bound. Spencer [
5] proved that
suffices in place of
in (
3), where
is the base of the natural logarithm. Harvey and Vondrák [
6] further improved it to
.
4.2. Asymmetric LLL
This is the more general version of the LLL, published by Spencer [
5], in which it is allowed that different events can have different probability bounds. We again consider arbitrary events
, and express their dependencies by a
dependency graph . In this graph,
, and each event
is assumed mutually independent of the set of events
. In other words, each event is mutually independent of all its non-neighbors in the dependency graph.
Lemma 4 (Asymmetric LLL). Let be a system of events with dependency graph . Suppose there are real numbers , such that In particular, holds.
5. Further Application Examples
We have already presented an application (hypergraph 2-coloring) in
Section 2. In this section, we present further interesting applications.
5.1. Disjoint Paths
Let G be a graph with two distinguished vertices . Further, let be given sets of paths, such that each set contains m different paths. We would like to select a path for each i, such that the selected paths are edge-disjoint. Under what conditions can this be done?
Assume the family of path systems is
diffuse in the following sense: any
is edge-disjoint from all but at most
k paths in
for every
. Let us now pick a path
uniformly at random from each
. Let
denote the event that
and
are not edge-disjoint. Then, we have
since, for any
, the other path
can be chosen
m different ways, and among these at most
k can share an edge with
, by assumption. Further, any event
is mutually independent of all events
for which
holds. The reason is that
involves paths selected from
and
, and these are selected independently from the paths picked from
and
whenever there is no common index, i.e.,
. This implies that
, where
d is the maximum degree in the dependency graph of the
events. Therefore, if we satisfy
then the upper bound
holds, due to
It means, the
events satisfy the conditions of the LLL (Lemma 1), yielding
Consequently, there must exist a path system , , such that all the paths are edge-disjoint. Thus, by means of the LLL, we have proved the following result:
Theorem 2. Let G be a graph with two distinguished vertices , and let be sets of paths in G, such that each set contains m different paths. Assume that any is edge-disjoint from all but at most k paths in , for every . Then, whenever holds, it is possible to select a path from each such that all the selected paths are edge-disjoint.
5.2. k-SAT
Let us recall some well-known concepts about Boolean formulas. Such a formula is a CNF (Conjunctive Normal Form) formula if it is the conjunction (logical AND) of clauses, where each clause is the disjunction (logical OR) of literals. A literal is either a Boolean variable or its negation. A formula is a k-CNF formula if every clause contains k literals. We assume that the same variable cannot occur multiple times in a clause. The problem called k-SAT is this: given a k-CNF formula, is it satisfiable? This is a well-known NP-complete problem, but in some cases, the LLL allows us to quickly show that certain k-CNF formulas are satisfiable.
Let us say that two clauses in a k-CNF formula overlap, if there is a variable occurring in both (regardless of whether the variable is negated or not in the clauses). We can show via the LLL that if any clause overlaps with at most other clauses in a k-CNF formula , then is satisfiable.
Assign random truth values to the variables independently. Let
be the clauses and let
denote the event that
is not satisfied by the random truth assignment. There are
possible truth assignments to the
k literals in
, and only one of them makes
false – the one in which all literals of
are false. Therefore,
. Furthermore, observe that each event
is mutually independent of the set of those
’s that correspond to clauses which do not overlap with
. Consequently, the maximum degree
d in the dependency graph of the events
satisfies
. Therefore, we can write
implying
Then, we can conclude from the LLL that holds, which means that at least one of the random truth assignments satisfies all clauses—that is, is satisfiable. It is interesting to note that this does not depend on n, the number of clauses, and it is also independent of the number of variables. Thus, by means of the LLL, we have proved the following result:
Theorem 3. If in a k-CNF formula Φ any clause overlaps with at most other clauses, then Φ is satisfiable.
For example, if each clause contains literals, and each clause overlaps with at most other clauses, then the formula is satisfiable, no matter how many clauses and variables it has.
5.3. Independent Sets in Graphs
Consider a graph with vertex set V and maximum degree . Assume V is partitioned as , where each satisfies , for some positive integer ℓ. We ask: what value of ℓ guarantees that one can always select a vertex from each , such that the selected vertices form an independent set, i.e., a set in which no two vertices are connected.
First, observe that we can assume , since otherwise we can just remove vertices from to achieve the equality. The reason is that if the claim holds with fewer vertices in , then it certainly must hold in the original graph. Set ; this will be the size of each , after possibly removing some vertices.
Now, pick a random vertex from each . For each edge e, let denote the event that e connects two of the randomly selected vertices. This can only happen if e connects vertices in two different sets . Since , holds for any e due to the random vertex selection.
Observe now that if e has its endpoints in the sets , then for another edge f, the event can only depend on if f has an endpoint in . How many such edges f can exist? The size of is and each vertex is adjacent to at most edges. Therefore, the maximum degree in the dependency graph of the events is at most .
Let us now apply the LLL, in its simplest form, presented in Lemma 1. We need to satisfy the condition
. Due to
, it is satisfied if
holds. From this, and from
, we obtain
After simplification, this yields
. Consequently, the Lovász Local Lemma implies that if
, then
holds, where
are the edges of the graph. This means, there is a positive probability that no edge connects two of the randomly selected vertices, so the special selection we are looking for must indeed exist. Thus, by means of the LLL, we have proved the following result:
Theorem 4. Let G be a graph with vertex set V and maximum degree Δ. Assume V is partitioned as , where each satisfies . Then, it is always possible to select a vertex from each such that the selected vertices form an independent set.
Remark 3. The constant 8 can be reduced to 2e
via the stronger versions of the symmetric LLL (see
Section 4.1), but this does not change the principle of the proof.
5.4. Graph Coloring with Additional Constraints
The well-known task of graph coloring means that we need to assign a color to each vertex of the graph such that neighboring vertices have different colors. If a coloring satisfies this requirement, it is called a proper coloring.
The standard question regarding graph coloring is: how many colors are needed for a proper coloring? A simple basic fact is that in a graph with maximum degree , it is always enough to use at most colors: index the colors by and color the vertices one by one, always using the lowest indexed color that has not occurred among the already colored neighbors of the considered vertex. It is easy to see that this leads to the use of at most colors.
What if, however, we want a coloring that also satisfies some extra requirements? Such an extra requirement is that the frequency of color occurrences in the neighborhood of any vertex is limited, as defined below.
Definition 1. Let be an integer. A proper coloring is called β-frugal if no color appears more than β times in the neighborhood of any vertex.
Hind, Molloy, and Reed [
7] analyzed the number of colors needed for
-frugal colorings. They found a relationship between the maximum degree and the number of colors that suffice, but this is much more complex than the simple
bound mentioned above for conventional coloring. Using the asymmetric Lovász Local Lemma, they proved the theorem below (we present a simplified version given by Sinclair [
8]):
Theorem 5. Let G be a graph. If the maximum degree Δ of G satisfies , then G has a β-frugal proper coloring using at most colors.
5.5. Packet Scheduling in Networks
Consider a communication network, which is modeled by a directed graph. We have a given system of directed paths in the network, and we want to send packets along these paths. The paths are edge-simple, which means that no path repeats any edge. Let us consider a synchronous communication model, in which for each time unit, one packet can traverse an edge. Assume a set of packets is given for each path. We would like to achieve that each packet is delivered on its respective path in the shortest possible time. Since the paths and packet sets are given, our only degree of freedom is to decide, whenever several packets compete for an edge, which one goes first, i.e., in which order they will traverse the edge (recall that only one can do it in one time unit). In other words, we are looking for the optimal packet scheduling policy. Furthermore, we would like to achieve it with a constant buffer size at every node.
Let us define some key parameters for this problem:
congestion c is the maximum number of paths that share a directed edge;
dilation d is the maximum path length that occurs in the path system. Let
T be the smallest time in which each packet can be delivered, no matter how many packets wait for each path. The time for each individual packet is measured from the instant when it starts traversing the first edge on its path. A
lower bound on
T in the worst case is
The justification for this lower bound is the following:
A path can be d edge long, and traversing such a path clearly takes d time units, implying .
An edge can be included in c paths. If a packet arrives at the edge on each of these c paths at the same time, then one of them will suffer time units delay, plus its own traversal time, resulting in in the worst case.
Then, and together yield the worst case lower bound
A remarkable thing is that this essentially trivial lower bound can be achieved, up to a constant factor, no matter how many packets wait for each route. Furthermore, a constant size buffer suffices at every node. Specifically, by a sophisticated use of the Lovász Local Lemma, Leighton, Maggs, and Richa [
9] proved the following theorem:
Theorem 6. In the above model, there is always a packet schedule that achieves delivery time for each packet, with a constant queue length at every node.
Observe that ; so is indeed optimal up to a constant factor.
6. The Algorithmic Lovász Local Lemma
The original LLL (Lemma 1) and its variants presented in
Section 4 are all
existence theorems: they prove that there exists a realization that satisfies each of
, but do not tell anything about how to find such a realization algorithmically.
In many cases, it is very natural to look for an algorithmic solution. For example, in the case presented in Example 3 (see
Section 5.2), we have shown that in a
k-CNF formula
, if every clause overlaps with at most
other clauses, then
is satisfiable. In most applications, however, it is not enough just to know that a formula is satisfiable, we also want to find an actual satisfying truth assignment. The original LLL, as applied to this problem, does not provide it.
A naive approach would be to try rejection sampling to solve the above problem: repeatedly pick random truth assignments until one of them eventually satisfies all clauses. Since the LLL guarantees , we must find a satisfying truth assignment after a finite number of trials, with probability 1. Unfortunately, this approach can be very inefficient, because the probability , although guaranteed to be positive by the LLL, is typically exponentially small. This would therefore lead to an exponential time algorithm, essentially not better than exhaustive search.
The first proof of the possibility of an efficient algorithmic LLL was published by Beck [
10] in 1991. His algorithm, however, was somewhat technical, and required a stronger condition on neighborhoods than the original LLL. After a sequence of improvements, the most elegant solution was published by Moser and Tardos [
11]. It is worth mentioning that the authors received the prestigious
Gödel Prize in 2020 for this work.
The Moser–Tardos algorithm can be best presented via a formulation called the variable version of the LLL. Let be mutually independent random variables. Denote the “bad” events that we want to avoid by and let be the probability of , . Let denote the subset of variables on which depends. In particular, if , then and are independent. The dependency graph is defined by and .
With the above framework, the Moser–Tardos algorithm is surprisingly simple and elegant, see Algorithm 1. We present the most concise version, found in Szegedy [
12].
Algorithm 1:Resample (Moser and Tardos). |
- •
Assign random values independently to the variables . - •
While there is an i, , such that is not satisfied by the current assignment, do: – Choose the smallest such i and resample all variables in . That is, choose new random values independently for each . - •
Return the current variable assignment.
|
The correctness and complexity of the algorithm is captured by the following theorem, which implies that this is a probabilistic polynomial time algorithm.
Theorem 7. If the conditions of the asymmetric LLL hold, i.e., there are real numbers , such thatthen the algorithm resamples each at most an expected number of times before finding an assignment that satisfies all the . Therefore, the expected overall number of resamplings is at most . Furthermore, if the maximum degree of the dependency graph is D, and holds, then is resampled at most an expected number of times, and the total expected number of resamplings is bounded by .
While the algorithm itself is surprisingly straightforward and elegant, the proof of the above Theorem is far from simple, see Moser and Tardos [
11]. It is interesting to note that the
Resample algorithm is similar to the WalkSAT heuristic, which is used to solve general Boolean satisfiability problems, see, e.g., Hoos and Stützle [
13].
7. Outlook
Let us briefly list some other recent research directions that are being pursued in connection with the Lovász Local Lemma and its algorithmic version, without going into details.
Derandomization of the Moser–Tardos algorithm is possible, leading to a deterministic polynomial time algorithm. This was already addressed by Moser and Tardos in [
11], and further developed by Chandrasekaran, Goyal, and Haeupler [
14].
Some recent papers deal with
sampling and counting problems related to the LLL, see, e.g., Jain, Pham, and Vuong [
15].
Approximate counting is also considered in the context, such as approximately counting the satisfying truth assignments of a CNF formula, see Moitra [
16].
A
quantum version of the LLL has also been introduced, referred to as
Quantum Lovász Local Lemma, see Ambainis, Kempe, and Sattath [
17] and He, Li, Sun, and Zhang [
18]. The key difference between the Quantum Lovász Local Lemma and the classical LLL is that in the quantum version, the events are substituted with subspaces, and the event probabilities are substituted with subspace dimensions. This makes it more suitable for quantum computing applications.
The LLL is also being used to analyze
parallel and distributed algorithms, see, e.g., Chang, He, Li, Pettie, and Uitto [
19].
There are attempts to extend the LLL to an infinite setting, see Bernshteyn [
20].
8. Conclusions
We have surveyed a classic and fundamental result, known as the Lovász Local Lemma. We reviewed some variants, related results, and applications, as well as the algorithmic version. We were delighted with the intriguing feature that a purely probabilistic claim can be used to prove completely deterministic statements.
As a closing conclusion, we can claim that in a high-level, abstract sense, the Lovász Local Lemma also provides efficiency in a way that is somewhat reminiscent of the fact that general graph problems often become more tractable in bounded degree graphs. Of course, the LLL case is more complicated, since it involves not only neighborhood sizes, but also probabilities, but it still deals with bounded degree style structures. As an example, let us refer to the
k-SAT problem (see
Section 5.2) that becomes solvable in polynomial time, whenever the clause overlaps are limited.