1. Introduction
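The existence of non-Shannon inequalities has received much attention since the first inequality of this type was discovered by Zhang and Yeung [1]. The basic observation is that any four random variables $X$, $Y$, $Z$ and $W$ satisfy the following inequality:
$$2I(Z;W) \le I(X;Y) + I(X;Z \uplus W) + 3I(Z;W \mid X) + I(Z;W \mid Y). \tag{1}$$
Here, $X \uplus Y$ denotes the random variable that takes a value of the form $(x,y)$ if $X = x$ and $Y = y$. As usual, $I(\cdot\,;\cdot)$ and $I(\cdot\,;\cdot \mid \cdot)$ denote mutual information and conditional mutual information given by:
$$I(X;Y) = H(X) + H(Y) - H(X \uplus Y), \tag{2}$$
$$I(X;Y \mid Z) = H(X \uplus Z) + H(Y \uplus Z) - H(X \uplus Y \uplus Z) - H(Z), \tag{3}$$
where $H$ denotes the Shannon entropy. The inequality (1) is non-Shannon in the sense that it cannot be deduced from the positivity, monotonicity and submodularity of the entropy function on the variables $X$, $Y$, $Z$, $W$ and their joins, i.e., satisfaction of the following inequalities:
$$H(X) \ge 0, \tag{4}$$
$$H(X) \le H(X \uplus Y), \tag{5}$$
$$H(X \uplus Z) + H(Y \uplus Z) \ge H(X \uplus Y \uplus Z) + H(Z). \tag{6}$$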
Positivity and monotonicity were recognized by Shannon [
2], while submodularity was first observed by McGill [
3]. It is easy to show that any inequality involving only three variables rather than four can be deduced from Shannon’s inequalities [
4]. The powerset of four variables is a Boolean algebra with 16 elements, and any smaller Boolean algebra corresponds to a smaller number of variables, so in a trivial sense, the Boolean algebra with 16 elements is the smallest Boolean algebra with non-Shannon inequalities.
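The quantities above are easy to evaluate numerically. The following Python sketch computes both sides of the Zhang-Yeung inequality, taken here in its standard form $2I(Z;W) \le I(X;Y) + I(X;Z \uplus W) + 3I(Z;W \mid X) + I(Z;W \mid Y)$, for a randomly generated joint distribution of four binary variables. The function names and the dictionary encoding of the distribution are illustrative assumptions and do not come from the paper.

```python
import itertools
import math
import random

def entropy(pmf, coords):
    """Shannon entropy (in bits) of the marginal of pmf on the given coordinates."""
    marginal = {}
    for outcome, prob in pmf.items():
        key = tuple(outcome[i] for i in coords)
        marginal[key] = marginal.get(key, 0.0) + prob
    return -sum(q * math.log2(q) for q in marginal.values() if q > 0)

def mutual_information(pmf, a, b, c=()):
    """I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C); a, b, c are coordinate tuples."""
    a, b, c = set(a), set(b), set(c)
    return (entropy(pmf, sorted(a | c)) + entropy(pmf, sorted(b | c))
            - entropy(pmf, sorted(a | b | c)) - entropy(pmf, sorted(c)))

def zhang_yeung_gap(pmf):
    """Right-hand side minus left-hand side of the Zhang-Yeung inequality."""
    X, Y, Z, W = (0,), (1,), (2,), (3,)
    rhs = (mutual_information(pmf, X, Y)
           + mutual_information(pmf, X, Z + W)
           + 3 * mutual_information(pmf, Z, W, X)
           + mutual_information(pmf, Z, W, Y))
    return rhs - 2 * mutual_information(pmf, Z, W)

def random_pmf(seed):
    """A random joint distribution of four binary variables (hypothetical test data)."""
    rng = random.Random(seed)
    outcomes = list(itertools.product([0, 1], repeat=4))
    weights = [rng.random() for _ in outcomes]
    total = sum(weights)
    return {o: w / total for o, w in zip(outcomes, weights)}

gap = zhang_yeung_gap(random_pmf(0))
print(gap >= -1e-9)  # the gap is non-negative up to rounding for every distribution
```

A numerical check of this kind can only fail to find a violation; it does not show that the inequality is independent of the Shannon inequalities.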
In the literature on non-Shannon inequalities, all inequalities are expressed in terms of sets of variables and their joins. Another way to formulate this is that the inequalities are stated for the free ∪-semi-lattice generated by a finite number of variables. In this paper, we will also consider intersections of sets of variables. We note that for sets of variables, we have the inequality:
$$I(X;Y) \ge H(X \cap Y). \tag{7}$$
Inequality (7) has even inspired some authors to use $I(X \wedge Y)$ as notation for mutual information.
Although non-Shannon inequalities have been known for two decades, they have found remarkably few applications compared with the Shannon inequalities. One of the reasons is that there exist much larger lattices than the Boolean algebra with 16 elements for which the Shannon inequalities are sufficient. The simplest examples are the Markov chains:
$$X_1 \to X_2 \to \cdots \to X_n, \tag{8}$$
where any variable $X_{i+1}$ is determined by its predecessor, i.e., the conditional entropies $H(X_{i+1} \mid X_i)$ are zero for $i = 1, 2, \dots, n-1$. For such a chain, one has:
$$H(X_1) \ge H(X_2) \ge \cdots \ge H(X_n) \ge 0. \tag{9}$$
The inequalities (9) are all instances of the entropy function being monotone, and it is quite clear that these inequalities are sufficient in the sense that for any sequence of values that satisfies these inequalities, there exist random variables related by a deterministic Markov chain with these values as entropies.
In this paper, we look at entropy inequalities for random variables that are related by functional dependencies. Functional dependencies give a partial ordering of sets of variables into a lattice. Such functional dependence lattices have many applications in information theory, but in this paper, we will focus on determining whether a lattice of functionally-related variables can have non-Shannon inequalities. In order to achieve interesting results, we have to restrict our attention to special classes of lattices.
Entropy inequalities have been studied using matroid theory, but finite matroids are given by geometric lattices, i.e., atomistic semi-modular lattices (see the textbook of Stern [
5] for definitions). For the study of non-Shannon inequalities, it is more natural to look at general lattices rather than geometric lattices because many important applications involve lattices that are not atomistic or not semi-modular. For instance, a deterministic Markov chain gives a lattice that is not atomistic. It is known that a function is entropic if and only if it is (approximately) equal to the logarithm of the index of a subgroup in a group [
6]. Therefore, it is natural to study entropic functions on lattices and their relations to subgroup lattices.
In this paper, we bridge lattice theory, database theory and the theory of conditional independence, but sometimes, the terminology in these fields does not match. In such cases, we give preference to lattice theory over database theory and preference to database theory over the theory of conditional independence. For instance, there is a property for closure operators that is called extensivity in the theory of lattices. We translate extensivity into a property for functional dependence, and it turns out that extensivity can be used instead of the property for functional dependences that is called augmentation. Extensivity is apparently a weaker condition than augmentation, but in the presence of monotonicity and transitivity, the two conditions are equivalent on finite lattices. Finally, we translate extensivity from functional dependencies to separoid relations that model the concept of conditional independence. In the literature on conditional independence, extensivity has been termed “normality” without any explanation why this term is used. We call it extensivity because it is equivalent to the notion of extensivity in lattice theory, which we consider the more fundamental theory.
The paper is organized as follows. In
Section 2, we describe the link between lattice theory and the theory of functional dependences in detail. We demonstrate how properties of closure operators associated with sub-semilattices correspond to the properties of functional dependence that are normally called Armstrong’s axioms. In
Section 3, we describe positive monotone submodular functions (polymatroid functions) and how they lead to separoid relations on lattices. These separoid relations generalize the notion of conditional independence known from Bayesian networks and similar graphical models. We demonstrate how properties of separoid relations correspond to properties of functional dependences.
In
Section 4, we describe entropy functions on lattices and how they correspond to subgroup lattices of a group. We conjecture that the Shannon inequalities are sufficient for describing entropic polymatroid functions of a lattice if and only if the lattice does not contain a special lattice as a sub-semilattice. In
Section 5, we develop some technical results related to “gluing” lattices together. The gluing technique is very useful for planar lattices, and in
Section 6, we demonstrate that entropic functions on planar modular lattices can be described by Shannon’s inequalities.
We finish with a short discussion, where we outline some future research directions. There is one appendix with some additional comments related to Armstrong’s axioms. These are mainly intended for readers that are familiar with the theory of functional dependencies in databases. A second appendix contains a long list of lattices that are used to document that polymatroid functions on lattices with seven or fewer elements can be described by Shannon’s inequalities.
Some of the results presented in this paper have been published in preliminary form and without proof [7,8], but most of them have since been strengthened or reformulated. In this paper, all proof details will be given.
2. Lattices of Functional Dependence
In this section, we shall briefly describe functional dependencies and their relation to lattice theory. The relation between functional dependence and lattices has been studied [7,9,10,11,12,13]. The relation between lattices and functional dependencies is closely related to minimal sets of Shannon-type inequalities [14,15]. Relations between functional dependencies and Bayesian networks have also been described [8,16]. Many problems in information theory and cryptography can be formulated in terms of functional dependencies.
Example 1. Consider a group consisting of n agents. One might be interested in giving each agent in the group part of a password in such a way that no single agent can recover the whole password, but any two agents are able to recover the password. Here, the password should be a function of the variables known by any two agents, but must not be a function of a variable held by any single agent. The functional dependence structure is the lattice illustrated in the Hasse diagram in Figure 1. The node at the top illustrates the password. Each of the intermediate nodes represents the knowledge of an agent. The bottom node represents no knowledge.
A ∧-semilattice is a set equipped with a binary operator ∧ that satisfies the following properties:
$$X \wedge Y = Y \wedge X \quad \text{(commutativity)}, \tag{10}$$
$$(X \wedge Y) \wedge Z = X \wedge (Y \wedge Z) \quad \text{(associativity)}, \tag{11}$$
$$X \wedge X = X \quad \text{(idempotence)}. \tag{12}$$
For a ∧-semilattice, the relation $X \wedge Y = X$ defines a preordering that we will denote $X \le Y$. If $(L, \wedge)$ is a semilattice, then we say that $M \subseteq L$ is a sub-semilattice if $M$ is closed under the ∧ operation. Let $(L, \wedge)$ denote a semilattice. Let $\downarrow\! X = \{Y \in L \mid Y \le X\}$. Then, $\downarrow\!(X \wedge Y) = \downarrow\! X \cap \downarrow\! Y$. Therefore, we can identify any finite semilattice with a ∩-semilattice in a powerset. Since we will usually identify semilattice elements with sets of variables, we will often use ⊆ and ∩ to denote the ordering and the meet operation.
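In this paper, we will assume that all semilattices and all lattices are finite. If a ∩-semilattice $(L, \cap)$ has a maximal element, then a binary operator ⊎ can be defined as:
$$X \uplus Y = \bigcap_{Z \supseteq X,\; Z \supseteq Y} Z, \tag{13}$$
and then, $(L, \cap, \uplus)$ is a lattice.
Let $(L, \cap, \uplus)$ denote a lattice with $(M, \cap)$ as a sub-semilattice with the same maximal element as $L$. Then, a unary operator $\mathrm{cl} : L \to L$ can be defined by:
$$\mathrm{cl}(X) = \bigcap_{Y \in M,\; Y \supseteq X} Y. \tag{14}$$
The operator $\mathrm{cl}$ is a closure operator [17], i.e., it satisfies:
$$X \subseteq \mathrm{cl}(X) \quad \text{(extensivity)}, \tag{15}$$
$$X \subseteq Y \text{ implies } \mathrm{cl}(X) \subseteq \mathrm{cl}(Y) \quad \text{(monotonicity)}, \tag{16}$$
$$\mathrm{cl}(\mathrm{cl}(X)) = \mathrm{cl}(X) \quad \text{(idempotence)}. \tag{17}$$
For any closure operator $\mathrm{cl}$, the element $X$ is said to be closed if $\mathrm{cl}(X) = X$. If $X$ and $Y$ are closed, then $X \cap Y$ is closed ([18], [Lemma 28]), so the closed elements of a lattice under a closure operator form a ∩-semilattice.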
Proposition 1. Let $(L, \cap, \uplus)$ denote a finite lattice. Assume that a subset $M$ of $L$ is closed under the meet operation and has the same maximal element as $L$. Then, $M$ is a lattice under the ordering ⊆ with the meet operation in $M$ given by ∩ and the join operation in $M$ given by $\mathrm{cl}(X \uplus Y)$, the closure of the join in $L$.
Example 2. If $G$ is a group, then a subgroup is defined as a subset that is closed under the group operations. The closure of a subset of $G$ is the subgroup generated by the subset. The lattice of subgroups forms a ∩-semilattice in the lattice of all subsets of the group. Let $G$ denote a finite group. For any subgroup $S \subseteq G$, we associate the variable $X_S$ that maps an element $g \in G$ into the left coset $gS$. Then, the subgroup lattice of $G$ is mapped into a lattice of variables where the subset ordering of subgroups is equivalent to functional dependences between the corresponding variables.
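The correspondence in Example 2 can be checked on a small case. The sketch below uses the cyclic group of order 6 written additively (an illustrative choice, not taken from the paper), maps each group element to its coset of a subgroup, and verifies two things: the coset variable of a subgroup $S$ has entropy $\log(|G|/|S|)$ under the uniform distribution, and the coset variable of $S$ determines that of $T$ exactly when $S \subseteq T$. All names in the code are assumptions made for the example.

```python
import math
from collections import Counter

N = 6                      # the cyclic group Z_6 under addition mod 6 (illustrative)
GROUP = range(N)
SUBGROUPS = [frozenset({0}), frozenset({0, 3}), frozenset({0, 2, 4}), frozenset(GROUP)]

def coset(g, subgroup):
    """The coset g + S, represented as a frozenset of group elements."""
    return frozenset((g + s) % N for s in subgroup)

def coset_entropy(subgroup):
    """Entropy (bits) of the coset variable when g is uniform on the group."""
    counts = Counter(coset(g, subgroup) for g in GROUP)
    return -sum((c / N) * math.log2(c / N) for c in counts.values())

def determines(sub_x, sub_y):
    """True if the coset of sub_y is a function of the coset of sub_x."""
    image = {}
    for g in GROUP:
        if image.setdefault(coset(g, sub_x), coset(g, sub_y)) != coset(g, sub_y):
            return False
    return True

for s in SUBGROUPS:
    print(sorted(s), round(coset_entropy(s), 4))
```

Note that the ordering is reversed relative to information content: the trivial subgroup gives the most informative variable, and the whole group gives a constant.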
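Proposition 2. If $\mathrm{cl}$ is a closure operator on a lattice, then the relation $\mathrm{cl}(X) \supseteq Y$ and the relation $\mathrm{cl}(X) \supseteq \mathrm{cl}(Y)$ are equivalent. The relation → given by $X \to Y$ when $\mathrm{cl}(X) \supseteq Y$ satisfies the following properties.
$$X \to Y \text{ implies } X \to X \uplus Y \quad \text{(extensivity)}, \tag{18}$$
$$X \supseteq Y \text{ implies } X \to Y \quad \text{(monotonicity)}, \tag{19}$$
$$X \to Y \text{ and } Y \to Z \text{ imply } X \to Z \quad \text{(transitivity)}. \tag{20}$$
Remark 1. The monotonicity of → is called reflexivity in the literature on databases. We reserve the notion of reflexivity to the relation $X \to X$, in accordance with the terminology for ordered sets. In database theory, the property $X \to X$ is called self determination.
In the literature on databases, extensivity (18) is replaced by an apparently stronger property called augmentation, but in a finite lattice, augmentation can be proven from extensivity, monotonicity and transitivity. See Appendix A for details. If the properties (18)–(20) are satisfied, we say that the relation → satisfies Armstrong’s axioms [19].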
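Proof. Assume that $\mathrm{cl}(X) \supseteq \mathrm{cl}(Y)$. Using extensivity (15), we get $\mathrm{cl}(Y) \supseteq Y$. The transitivity of ⊇ then gives $\mathrm{cl}(X) \supseteq Y$.
Assume $\mathrm{cl}(X) \supseteq Y$. Then, the monotonicity (16) gives $\mathrm{cl}(\mathrm{cl}(X)) \supseteq \mathrm{cl}(Y)$, and the idempotence (17) gives $\mathrm{cl}(X) \supseteq \mathrm{cl}(Y)$.
To prove the extensivity (18) of →, assume that $\mathrm{cl}(X) \supseteq Y$. Using the extensivity (15), we also get $\mathrm{cl}(X) \supseteq X$. Combining these two inequalities gives $\mathrm{cl}(X) \supseteq X \uplus Y$, as desired.
The monotonicity (19) of → follows directly from the extensivity (15) of $\mathrm{cl}$, since $X \supseteq Y$ gives $\mathrm{cl}(X) \supseteq X \supseteq Y$.
The transitivity (20) of → follows from the transitivity of ⊇. □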
If $(L, \cap, \uplus)$ is a lattice with a relation → that satisfies Armstrong’s axioms, then we say that a lattice element $X$ is →-closed if $X \to Y$ implies that $X \supseteq Y$.
Theorem 1. Let $(L, \cap, \uplus)$ be a finite lattice with a relation → that satisfies Armstrong’s axioms. Then, the set of →-closed elements forms a ∩-semilattice with the same maximal element as $L$. The relation $X \to Y$ holds if and only if $\mathrm{cl}(X) \supseteq Y$, where $\mathrm{cl}$ denotes the closure operator with respect to the semilattice.
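Proof. Assume that $X_1$ and $X_2$ are closed and that $X_1 \cap X_2 \to Y$. The monotonicity (19) implies $X_i \to X_1 \cap X_2$, and then, the transitivity (20) implies that $X_i \to Y$. Since $X_i$ is closed, we have $X_i \supseteq Y$. Since this holds for both $X_1$ and $X_2$, we have $X_1 \cap X_2 \supseteq Y$, implying that $X_1 \cap X_2$ is closed. The maximal element of $L$ is closed because it contains every lattice element, so the set of closed elements $M$ forms a ∩-semilattice with a closure operator $\mathrm{cl}$.
Let $\mathrm{cl}$ denote the closure with respect to $M$. We will prove that $X \to \mathrm{cl}(X)$. Let $X_1 = X$. Assume that $X_1$ is not →-closed. Then, there exists $Y_1$ such that $X_1 \to Y_1$ and $Y_1 \nsubseteq X_1$. Using the extensivity (18), we get $X_1 \to X_1 \uplus Y_1$. Define $X_2 = X_1 \uplus Y_1$. Then, $X_1 \to X_2$ and $X_1 \subset X_2$. Iterate this construction so that:
$$X_1 \subset X_2 \subset \cdots \subset X_n.$$
Since the lattice is finite, the construction must terminate, and when it terminates, $X_n$ is closed. Using transitivity, we get $X \to X_n$ and $X \subseteq X_n$. Since $\mathrm{cl}(X)$ is the smallest closed element greater than $X$, we have $\mathrm{cl}(X) \subseteq X_n$, so that $X_n \to \mathrm{cl}(X)$ by monotonicity (19) and $X \to \mathrm{cl}(X)$ by transitivity (20).
If $\mathrm{cl}(X) \supseteq Y$, then $\mathrm{cl}(X) \to Y$ by monotonicity (19), and then, $X \to Y$ by transitivity (20). If $X \to Y$, then $\mathrm{cl}(X) \to Y$, because $\mathrm{cl}(X) \to X$ by monotonicity (19). Using that $\mathrm{cl}(X)$ is →-closed, we get $\mathrm{cl}(X) \supseteq Y$. □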
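The iteration used in the proof is essentially the attribute-closure computation familiar from database theory. A minimal sketch, assuming that lattice elements are sets of attribute names and that the relation → is generated by a finite list of dependencies `(lhs, rhs)`; the function names and the toy dependencies are hypothetical and serve only as illustration.

```python
def closure(attrs, dependencies):
    """Smallest superset of attrs that is closed under the given dependencies.

    Each dependency is a pair (lhs, rhs) of frozensets meaning lhs -> rhs.
    The loop mirrors the chain X_1 ⊂ X_2 ⊂ ... in the proof: as long as some
    dependency yields new attributes, the current set is enlarged.
    """
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in dependencies:
            if lhs <= closed and not rhs <= closed:
                closed |= rhs
                changed = True
    return frozenset(closed)

def implies(x, y, dependencies):
    """X -> Y holds if and only if cl(X) contains Y."""
    return frozenset(y) <= closure(x, dependencies)

# Toy example (assumed data): a -> b and b -> c.
FDS = [(frozenset("a"), frozenset("b")), (frozenset("b"), frozenset("c"))]
print(sorted(closure("a", FDS)))
print(implies("a", "c", FDS), implies("c", "a", FDS))
```

Termination follows for the same reason as in the proof: the set can only grow, and there are finitely many attributes.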
We will look at functional dependencies in databases. Assume that a set of records is labeled by elements in a set $A$. In statistics, records are the individual elements of a sample. For each record $a \in A$, the database contains the values of various attributes given by a number of functions from $A$ to the set of possible attributes. Sets of such functions will be denoted by capital letters, and these will be our variables. We say that $X$ determines $Y$ and write $X \to Y$ if there exists some function $f$ such that $Y(a) = f(X(a))$ for any record $a \in A$. Then, the relation → satisfies Armstrong’s axioms. Armstrong proved that these axioms form a complete set of inference rules [19]. That means that if a set $\mathcal{A}$ of functional dependencies is given and a certain functional dependence $X \to Y$ holds in any database where all the functional dependencies in $\mathcal{A}$ hold, then $X \to Y$ can be deduced from $\mathcal{A}$ using Armstrong’s axioms. Therefore, for any functional dependence $X \to Y$ that cannot be deduced using Armstrong’s axioms, there exists a database where the functional dependence is violated [20,21]. As a consequence, there exists a database where a functional dependence holds if and only if it can be deduced from Armstrong’s axioms. Using the result that Armstrong’s axioms are equivalent to the closed sets forming a lattice, Armstrong’s result is easy to prove.
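Theorem 2. For any finite lattice $(L, \cap, \uplus)$, there exists a database with a set of related variables such that the elements of the lattice correspond to closed sets under functional dependence.
Proof. As the set of records, we take the elements of the lattice $L$. With each $X \in L$, we associate a function $f_X : L \to L$ given by $f_X(Z) = X \cap Z$. If $X \supseteq Y$, then:
$$f_Y(Z) = Y \cap Z = Y \cap (X \cap Z) = f_Y(f_X(Z)),$$
so that $f_Y = f_Y \circ f_X$. Therefore, $f_X \to f_Y$.
Assume that $f_X \to f_Y$. Let $Z_1 = X$ and $Z_2 = X \uplus Y$. Then, $f_X(Z_1) = f_X(Z_2) = X$, while $f_Y(Z_1) = X \cap Y$ and $f_Y(Z_2) = Y$. Using that $f_X \to f_Y$, we get $X \cap Y = Y$, so that $X \supseteq Y$. □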
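The construction in the proof can be tested on a small lattice. The sketch below assumes that the attribute attached to a lattice element $X$ is the map $Z \mapsto X \cap Z$ on the records (one construction for which the argument above goes through) and uses the diamond lattice with three atoms, encoded as an intersection-closed family of sets. The encoding and all names are illustrative assumptions.

```python
from itertools import product

# The diamond lattice with three atoms as an intersection-closed family of sets.
TOP = frozenset({1, 2, 3})
LATTICE = [frozenset(), frozenset({1}), frozenset({2}), frozenset({3}), TOP]

def attribute(x):
    """Tabulate the attribute f_X : record Z -> X ∩ Z over all records."""
    return {z: x & z for z in LATTICE}

def determines(fx, fy):
    """True if the value of fy on a record is a function of the value of fx."""
    image = {}
    for z in LATTICE:
        if image.setdefault(fx[z], fy[z]) != fy[z]:
            return False
    return True

# Functional dependence between attributes should coincide with the lattice order.
consistent = all(
    determines(attribute(x), attribute(y)) == (x >= y)
    for x, y in product(LATTICE, repeat=2)
)
print(consistent)
```

The same check runs unchanged on any finite family of sets that is closed under intersection and contains a largest set.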
We have seen that for a subgroup lattice of a group, there exists a lattice of functional dependence. The opposite is also true. To each database with attributes related by functional dependence, there is a group. The construction is as follows. Let $A$ denote a set of records. Let $\mathrm{Sym}(A)$ be the symmetric group consisting of permutations of the records. If $X$ is a function on $A$, then we define the stabilizer group $G_X$ as the set of permutations that leave $X$ invariant, i.e., permutations $\pi$ such that $X(\pi(a)) = X(a)$ for all $a \in A$. Then, $X \to Y$ if and only if $G_X \subseteq G_Y$. In this way, the functional dependence lattice of a database can be mapped into a lattice of subgroups of a group.
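For a handful of records, the stabilizer construction can be verified by brute force. In the sketch below, a variable is a tuple giving its value on records $0, \dots, n-1$, a permutation is a tuple `pi` with `pi[a]` the image of record `a`, and the two example variables are made-up data; none of these encodings come from the paper.

```python
from itertools import permutations

def determines(x, y):
    """True if y is a function of x, i.e., equal x-values force equal y-values."""
    image = {}
    for xa, ya in zip(x, y):
        if image.setdefault(xa, ya) != ya:
            return False
    return True

def stabilizer(x):
    """All permutations pi of the records with x[pi[a]] == x[a] for every record a."""
    n = len(x)
    return {pi for pi in permutations(range(n))
            if all(x[pi[a]] == x[a] for a in range(n))}

# Hypothetical variables on four records; Y is a refinement of X.
X = (0, 0, 1, 1)
Y = (0, 0, 1, 2)

print(determines(Y, X), stabilizer(Y) <= stabilizer(X))   # Y -> X and G_Y ⊆ G_X
print(determines(X, Y), stabilizer(X) <= stabilizer(Y))   # neither holds
print(len(stabilizer(X)))                                 # 2! * 2! permutations
```

The size of a stabilizer is the product of the factorials of the sizes of the level sets of the variable, which is what the last line illustrates.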
Combining Theorem 2 with the stabilizer subgroups of the symmetric group of a database, we get the following result that was first proven in 1946 by Whitman [22].
Corollary 1. Any finite lattice can be represented as a functional dependence lattice generated by subgroups of a group.
3. Polymatroid Functions and Separoids
Definition 1. On a lattice, the submodularity of a function $h$ is defined via the inequality $h(X) + h(Y) \ge h(X \uplus Y) + h(X \cap Y)$. If the submodular inequality holds with equality, we say that the function is modular. A polymatroid function on a lattice is a function that is non-negative, increasing and submodular.
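Example 3. Let $(L, \cap, \uplus)$ be a finite atomistic lattice with a ranking function $r$. Then, $L$ is a geometric lattice if and only if the function $r$ is polymatroid ([5], Corollary 1.9.10).
For a polymatroid function $h$ on a lattice, one may introduce a function $I_h(\cdot\,;\cdot \mid \cdot)$ that corresponds to conditional mutual information by:
$$I_h(X;Y \mid Z) = h(X \uplus Z) + h(Y \uplus Z) - h(X \uplus Y \uplus Z) - h(Z). \tag{24}$$
One can rewrite $I_h$ as:
$$I_h(X;Y \mid Z) = \bigl[h(X \uplus Z) + h(Y \uplus Z) - h(X \uplus Y \uplus Z) - h((X \uplus Z) \cap (Y \uplus Z))\bigr] + \bigl[h((X \uplus Z) \cap (Y \uplus Z)) - h(Z)\bigr]. \tag{25}$$
Since $h$ is monotone and submodular, we have:
$$I_h(X;Y \mid Z) \ge 0. \tag{26}$$
It is straightforward to verify that:
$$I_h(X;Y \mid Z) = I_h(Y;X \mid Z), \tag{27}$$
$$I_h(X;Y \uplus Z \mid W) = I_h(X;Y \mid W) + I_h(X;Z \mid Y \uplus W). \tag{28}$$
We will say that a function $I(\cdot\,;\cdot \mid \cdot)$ that satisfies positivity (26), symmetry (27) and the chain rule (28) is a separoid function.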
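The three properties can be confirmed exhaustively on a small example. The sketch below takes the Boolean lattice of subsets of a four-element set with union as join, uses the rank function of the uniform matroid $U_{2,4}$ as an illustrative polymatroid, and defines the analogue of conditional mutual information as $h(X \uplus Z) + h(Y \uplus Z) - h(X \uplus Y \uplus Z) - h(Z)$. The chain rule is tested in the form $I(X;Y \uplus Z \mid W) = I(X;Y \mid W) + I(X;Z \mid Y \uplus W)$, which is an assumption about how the rule is stated; all names are hypothetical.

```python
from itertools import combinations, product

GROUND = (0, 1, 2, 3)
SUBSETS = [frozenset(c) for r in range(len(GROUND) + 1)
           for c in combinations(GROUND, r)]

def h(s):
    """Rank function of the uniform matroid U_{2,4} (an illustrative polymatroid)."""
    return min(len(s), 2)

def info(x, y, z):
    """Polymatroid analogue of the conditional mutual information I(X;Y|Z)."""
    return h(x | z) + h(y | z) - h(x | y | z) - h(z)

positivity = all(info(x, y, z) >= 0 for x, y, z in product(SUBSETS, repeat=3))
symmetry = all(info(x, y, z) == info(y, x, z) for x, y, z in product(SUBSETS, repeat=3))
chain_rule = all(
    info(x, y | z, w) == info(x, y, w) + info(x, z, y | w)
    for x, y, z, w in product(SUBSETS, repeat=4)
)
print(positivity, symmetry, chain_rule)
```

Symmetry and the chain rule hold for any function $h$ by cancellation of terms, whereas positivity is the point where monotonicity and submodularity are needed.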
Proposition 3. If is a separoid function, then the following property is satisfied. Proof. Assume that
. We can use the chain rule (
28) to get:
Hence, monotonicity (
29) is satisfied. □
The relation $I_h(Y;Y \mid X) = 0$ is equivalent to $h(X \uplus Y) = h(X)$, and this relation will be denoted $X \to_h Y$. The first to observe that $\to_h$ defines a lattice was Shannon, who published a very short paper on this topic in 1953 [23]. Shannon did not mention the relation to the theory of functional dependences because that theory was only developed two decades later. Surprisingly, Shannon’s paper was only cited once until 2002!
The relation $\to_h$ satisfies Armstrong’s axioms, and the most instructive way to see this is via separoid relations. If $h$ is a polymatroid function, then the relation $I_h(X;Y \mid Z) = 0$ will be denoted $X \perp_h Y \mid Z$. Following Dawid et al. [24,25], we say that a relation $(\cdot \perp \cdot \mid \cdot)$ on a lattice $(L, \cap, \uplus)$ is a separoid relation, if it has the following properties:
$$Y \subseteq Z \text{ implies } X \perp Y \mid Z \quad \text{(monotonicity)}, \tag{31}$$
$$X \perp Y \mid Z \text{ implies } Y \perp X \mid Z \quad \text{(symmetry)}, \tag{32}$$
$$X \perp Y \mid Z \text{ and } X \perp W \mid Y \uplus Z \text{ hold if and only if } X \perp Y \uplus W \mid Z \quad \text{(chain rule)}. \tag{33}$$
Remark 2. The term monotonicity was used for a different concept by Paolini [26]. In [24,25], a weaker condition than monotonicity was used, but their condition together with the chain rule implies monotonicity.
With this definition we see that $\perp_h$ is a separoid relation. The properties (31)–(33) should hold for all $X, Y, Z, W \in L$. In this paper, we are particularly interested in the case where the subsets are not disjoint. In the literature on Bayesian networks and similar graphical models, the focus has been on disjoint sets where only the last two properties (32) and (33) are used to define a semi-graphoid relation [27]. See also [28], Remark 2.5, where it is noted that semi-graphoid relations can be defined on join semi-lattices.
A long list of properties for the notion of independence was given by Paolini [
26], but Studený has proven that one cannot deduce all properties of statistical conditional independence from a finite list of axioms [
28,
29].
Proposition 4. A separoid relation on a lattice satisfies the following properties. Remark 3. Property (34), which we call extensivity, was called normality by Paolini [26]. Proof. To prove the extensivity (
34), assume that
, which is equivalent to
. The monotonicity (
31) gives
. The conclusion
is obtained by the chain rule (
33).
To prove the transitivity (
35), assume that
and
. The chain rule (
33) applied twice gives
and
. □
In a set of random variables, we note that if $Y$ is independent of $Y$ given $X$, then $Y$ is a function of $X$ almost surely. If $Y \perp Y \mid X$, we write $X \to_\perp Y$.
Theorem 3. If $(L, \cap, \uplus)$ is a lattice with a separoid relation $(\cdot \perp \cdot \mid \cdot)$, then the relation $\to_\perp$ satisfies Armstrong’s axioms. The relation $(\cdot \perp \cdot \mid \cdot)$ restricted to the lattice of closed lattice elements is separoid.
Proof. The extensivity (
18) of
follows directly from the extensivity (
34) of
.
The monotonicity (
19) follows directly from the monotonicity (
31).
To prove the transitivity of
, assume that
and
. The monotonicity (
31) implies that
, which by the chain rule (
33), implies
. By the chain rule (
33), we have
The monotonicity (
31) also gives
, which together with
implies that
by transitivity (
35). The transitivity (
35) then implies
To prove that the relation restricted to the lattice of closed lattice elements is separoid, one just has to prove that if and only if if and only if . This follows from Armstrong’s results.
The significance of this theorem is that if we start with a separoid relation on a lattice, then this separoid relation is also a separoid when restricted to elements that are closed under the relation .
Theorem 4. Any finite lattice can be represented as a closure system of a separoid relation defined on a powerset.
Proof. For any finite lattice , one identifies the elements with subgroups of a group G. If the group G is assigned a uniform distribution, then the variable corresponding to a subgroup will also have a uniform distribution. With this distribution, a variable is independent of itself given another variable if and only if the other variable determines the first variable. Therefore, statistical independence with respect to the uniform distribution on G gives a separoid relation for which the closure is the original lattice. □
Assume that
X and
Y are
closed. Then:
Therefore, h restricted to the closed elements is polymatroid. We may summarize these observations in the following proposition.
Proposition 5. If h is a polymatroid function defined on the lattice , then the relation satisfies Armstrong’s axioms. The function h restricted to the lattice of closed elements is polymatroid.
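We recall that a pair of elements $(X, Y)$ is said to be a modular pair, and we write $X \, M \, Y$, if $Z \subseteq Y$ implies that:
$$(Z \uplus X) \cap Y = Z \uplus (X \cap Y).$$
If all pairs are modular, we say that the lattice is modular, and we have:
$$(Z \uplus X) \cap Y = Z \uplus (X \cap Y)$$
when $Z \subseteq Y$.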
Proposition 6. If is a separoid relation on a lattice and:then in the lattice of closed elements. In particular, if h is a polymatroid function on a lattice and:then in the lattice of closed elements. Proof. If
, then we have the following sequence of implications.
If
is separoid, then according to the extensivity (
34), the relation
implies:
so that
Following Dawid [
24], we define the relation
by:
Theorem 5. If a polymatroid function h on a lattice is modular, then the lattice of closed elements is modular. If the lattice is modular, then if and only if in the lattice of closed elements.
Proof. If the function
h is modular, then all pairs of elements are modular in the lattice of
h-closed elements, so the lattice of closed elements is modular. In a modular lattice:
so that
holds when
□
The following result appears in [
24] with a longer proof.
Corollary 2. For a lattice, the relation is separoid if and only if the lattice is modular.
Proof. Assume that the lattice is modular. Then, the ranking function r is modular, and if and only if . Therefore, is equivalent to the separoid relation .
Assume that the relation is separoid. Since , we have that . Since all pairs are modular, the lattice is modular.
4. Entropy in Functional Dependence Lattices
Let denote a lattice with maximal element m. Let denote the set of polymatroid functions on The set is polyhedral, and often, we may normalize the polymatroid functions by replacing by . In this way, we obtain a polytope that we will denote .
Definition 2. A function is said to be entropic if there exists a function f from into a set of random variables such that for any element X in the lattice.
Let denote the set of normalized entropic functions on , and let denote the closure of .
Definition 3. A lattice is said to be a Shannon lattice if any polymatroid function can be realized approximately by random variables, i.e.,
One may then check whether a lattice is a Shannon lattice by checking that the extreme polymatroid functions are entropic or can be approximated by entropic functions.
Example 4. Let $G$ denote a finite group. For any subgroup $S \subseteq G$, we associate the variable $X_S$ that maps an element $g \in G$ into the left coset $gS$. The number of possible values of $X_S$ is $|G|/|S|$. Assume that the subgroups are given a functional dependence structure where a variable $X$ is given by a function on a set $A$ of records. If $A$ has $n$ elements, then the group of permutations $G$ has $n!$ elements. The subgroup $G_X$ that leaves $X$ invariant has:
$$|G_X| = \prod_{x} \bigl|X^{-1}(x)\bigr|!$$
elements. Therefore:
$$\frac{|G|}{|G_X|} = \frac{n!}{\prod_{x} \bigl|X^{-1}(x)\bigr|!}.$$
If $U$ is the uniform distribution on the finite group $G$, then the distribution of $X_S(U)$ is uniform, and the entropy is $\log(|G|/|S|)$. It has been proven that the set of entropic functions generated form a convex cone. Therefore, the normalized polymatroid functions generated by groups have the same closure as the set of normalized entropic functions [4].
From Definition 3, we immediately get the following result.
Proposition 7. If is a Shannon lattice and M is a subset that is a ∩-semi-lattice, then M is a Shannon lattice. In particular, all sub-lattices of a Shannon lattice are Shannon lattices.
Proof. Assume that
is a Shannon lattice and that
M is a sub-lattice. Let
denote a polymatroid function. For
, let
denote the
that minimize
under the constraint that
Define the function
Now,
is an extension of
h, and with this definition,
is non-negative and increasing. For
, we have:
because
and
Hence,
is submodular. By the assumption,
is entropic, so the restriction of
to
M is also entropic. □
With these results in hand, we can start hunting for non-Shannon lattices. We take a lattice that may or may not be a Shannon lattice. We find the extreme normalized polymatroid functions. These extreme polymatroid functions can be found either by hand or by using some suitable software that can find extreme points of a convex polytope specified by a finite set of inequalities. For instance, the R program with package rcdd can find all extreme points of a polytope. For each extreme point, we determine the lattice of closed elements using Proposition 5. These lattices of closed sets will often have a much simpler structure than the original lattice, and the goal is to check if these lattices are Shannon lattices or not. It turns out that there are quite a few of these reduced lattices, and they could be considered as the building blocks for larger lattices.
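The reduction step, passing from a polymatroid function to its lattice of closed elements, is mechanical. The following sketch assumes that an element $X$ is closed when $h(X \uplus Y) = h(X)$ forces $Y \subseteq X$, works on the Boolean lattice of subsets of a three-element set, and uses a normalized polymatroid chosen purely for illustration (it is not claimed to be an extreme point). The names are hypothetical, and the vertex enumeration itself is not reproduced here.

```python
from itertools import combinations

GROUND = frozenset({0, 1, 2})
ELEMENTS = [frozenset(c) for r in range(len(GROUND) + 1)
            for c in combinations(sorted(GROUND), r)]

def h(s):
    """Illustrative normalized polymatroid: h = min(|S|, 2) / 2, so h(GROUND) = 1."""
    return min(len(s), 2) / 2

def is_closed(x):
    """X is closed if every Y with h(X ⊎ Y) = h(X) is already contained in X."""
    return all(y <= x for y in ELEMENTS if h(x | y) == h(x))

closed = [x for x in ELEMENTS if is_closed(x)]
print([sorted(x) for x in closed])
```

In this example, the two-element sets are not closed because adding the third element does not increase $h$. What remains is a bottom element, three incomparable elements and a top element, i.e., a much smaller lattice than the Boolean lattice one started with.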
We recall that an element $i$ is ⊎-irreducible if $i = X \uplus Y$ implies that $i = X$ or $i = Y$. An ∩-irreducible element is defined similarly. An element is double irreducible if it is both ⊎-irreducible and ∩-irreducible. The lattice denoted $M_n$ is a modular lattice with a smallest element, a largest element and $n$ double irreducible elements arranged in-between.
Theorem 6. For any n, the lattice $M_n$ is a Shannon lattice.
Proof. The proof is essentially the same as the solution to the cryptographic problem stated at the beginning of Section 2. The idea is that one should look for groups with a subgroup lattice $M_n$ and then check that the subgroups of such a group have the right cardinality.
Let the values in the double irreducible elements be denoted . If , the extreme polymatroid functions are and , and these points are obviously entropic. If , the extreme points are , and which are all entropic.
Assume
. Then, the values should satisfy the inequalities:
If
is an extreme point, then each variable should satisfy one of the inequalities with equality. Assume
Then, sub-modularity implies that
for
. The extreme point
is obviously entropic. If
, this gives no further constraint on the other values, so it corresponds to an extreme point on a lattice with one less variable. Finally, assume that
for all
. Then,
for all
□
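One concrete family of groups of the kind mentioned at the start of the proof is $\mathbb{Z}_p \times \mathbb{Z}_p$ for a prime $p$: its proper non-trivial subgroups are the $p+1$ lines through the origin, and any two of them intersect only in the identity. The sketch below (with $p = 3$; the choice of group and all names are assumptions made for illustration) checks that the coset variables of any two lines jointly identify the group element while a single one does not, which is the password-sharing property of Example 1 for $p + 1$ agents.

```python
import math
from itertools import combinations, product

P = 3                                        # illustrative prime
GROUP = list(product(range(P), repeat=2))    # the group Z_P x Z_P, written additively

def span(v):
    """The cyclic subgroup generated by v (a line through the origin)."""
    return frozenset(((k * v[0]) % P, (k * v[1]) % P) for k in range(P))

LINES = {span(v) for v in GROUP if v != (0, 0)}

def coset(g, subgroup):
    """The coset g + S as a frozenset of group elements."""
    return frozenset(((g[0] + s[0]) % P, (g[1] + s[1]) % P) for s in subgroup)

# Two agents together recover g: their cosets intersect in exactly {g}.
pairs_recover = all(
    coset(g, a) & coset(g, b) == {g}
    for g in GROUP for a, b in combinations(LINES, 2)
)
# A single agent only learns a coset with P candidates.
single_fails = all(len(coset(g, a)) == P for g in GROUP for a in LINES)

# Entropy of one coset variable relative to the entropy of the uniform group element.
normalized_entropy = math.log(P**2 / P) / math.log(P**2)
print(len(LINES), pairs_recover, single_fails, normalized_entropy)
```

With the uniform distribution on the group, each of the $p + 1$ intermediate variables carries half of the total entropy.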
Corollary 3. Any polymatroid function that only takes the values $0$, $\tfrac{1}{2}$ and 1 is entropic.
Proof. Assume that the polymatroid function $h$ only takes the values $0$, $\tfrac{1}{2}$ and 1. Then, $h$ defines a separoid relation, and the closed elements form a lattice isomorphic to $M_n$ for some integer $n$. The function $h$ is entropic on $M_n$, so $h$ is also entropic on the original lattice. □
Lemma 1. If h is submodular and increasing on ∩-irreducible elements, then h is increasing.
Proof. Assume that $h$ is submodular and increasing on ∩-irreducible elements. We have to prove that if $X \supseteq Z$, then $h(X) \ge h(Z)$. In order to obtain a contradiction, assume that $Z$ is a maximal element such that there exists an element $X$ such that $X \supseteq Z$, but $h(X) < h(Z)$. We may assume that $X$ covers $Z$. Since $h$ is increasing at ∩-irreducible elements, $Z$ cannot be ∩-irreducible. Therefore, there exists a maximal element $b$ such that $b \supseteq Z$, but $b \nsupseteq X$. Since $X$ covers $Z$, we have $X \cap b = Z$. According to the assumptions, $h(X) + h(b) \ge h(X \uplus b) + h(Z)$ and $h(X \uplus b) \ge h(b)$ because $Z$ is a maximal element that violates that $h$ is increasing. Therefore, $h(X) \ge h(Z)$, which is a contradiction. □
Theorem 7. Any lattice with seven or fewer elements is a Shannon lattice.
Proof. Up to isomorphism, there exist only finitely many lattices with seven or fewer elements. These are listed in Appendix B. Each of these lattices has finitely many extreme polymatroid functions. These extreme polymatroid functions can be found by hand or by using the R program with package rcdd. All the extreme polymatroid functions on these lattices can be represented by a trivial lattice, or by the two-element chain
2, or by
, or by
, or by
. All these lattices are representable, and thereby, they are Shannon lattices. □
The number of lattices grows quite fast with the number of elements, and the number of elements is not the best way of comparing lattices.
The Boolean lattice with four atoms is the smallest non-Shannon Boolean algebra. Nevertheless, there are smaller non-Shannon lattices.
Figure 2 illustrates the Matúš lattice, which is a lattice with just 11 elements that violates Inequality (
1). This corresponds to the fact that the lattice in
Figure 2 is not equivalent to a lattice of subgroups of a finite group. The lattices that are equivalent to lattices of subgroups of finite groups have been characterized [
30], but the characterization is too complicated to describe here. Using the ideas from [
31], one can prove that the Matúš lattice in
Figure 2 has infinitely many non-Shannon inequalities. Therefore, any lattice that contains the Matúš lattice as a ∩-semilattice also has infinitely many non-Shannon inequalities.
Conjecture 1. A lattice is a Shannon lattice if and only if the lattice does not contain the Matúš lattice as a ∩-semilattice.
The result of Matúš has recently found a parallel in matroid theory. An infinite set of inequalities is needed in order to characterize representable matroids [32,33,34].
5. The Skeleton of a Lattice
In this section, we will develop a cutting-and-gluing technique that can be used to handle many lattices, but it is especially useful for planar lattices. We present the notion of tolerance. Further details about this concept can be found in the literature [
5,
35].
Definition 4. A symmetric and reflexive relation Θ on a lattice is called a tolerance relation if and imply:and If is a tolerance relation, then for any X, the set is an interval in the lattice. These intervals are called the blocks of , and the blocks will be denoted For a tolerance relation, the blocks may be considered as elements of the factor , and this factor has a natural structure as a lattice. Congruence relations are special cases of tolerance relations, but in general, the blocks of a tolerance relation may overlap. We note that if the intersection of two blocks is non-empty, then the intersection is a sublattice. If , then will denote the block in determined by We defined a glued tolerance relation as a tolerance relation where X cover Y in , implying that
A tolerance relation can be identified with a subset of
, so tolerance relations are ordered by subset ordering. The trivial tolerance relation is the one where
holds for all
, and this tolerance relation is the greatest tolerance relation. A glued tolerance relation contains any covering pair, and glued tolerance relations are characterized by this property. Therefore, the intersection of two glued tolerance relations is a glued tolerance relation. Consequently, the set of glued tolerance relations forms a lattice. The smallest glued tolerance relation is denoted
and is called the skeleton of the lattice. An example of a planar modular lattice is given in
Figure 3, and the skeleton is given in
Figure 4.
Lemma 2. Let be a lattice with an increasing function h. If the function h satisfies:for all where is covered by X and Y, then the function h is submodular on Proof. First, we prove that if the function
h satisfies:
for all
where
is covered by
X, then the function
h is submodular on
Let
$A$ and
$B$ denote two lattice elements. Define sequences
and
by first defining
and
Assume that
is an element that covers
and such that
Let
be a cover of
, and let
Then:
Adding all these inequalities leads to:
and the inequality is obtained because
h is increasing so that
and because
by construction of the sequences.
The fact that we just need to check submodularity when $B$ covers $A \cap B$ as well is proven in the same way. □
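The content of Lemma 2, read here as saying that for an increasing function it suffices to test the submodular inequality on pairs $X, Y$ that both cover $X \cap Y$, can be checked by exhaustive search on a small lattice. The sketch below does this on the Boolean lattice of subsets of a three-element set for all increasing functions with values in $\{0, 1, 2\}$. The reading of the lemma, the choice of lattice and all names are assumptions made for the example.

```python
from itertools import combinations, product

GROUND = (0, 1, 2)
ELEMENTS = [frozenset(c) for r in range(len(GROUND) + 1)
            for c in combinations(GROUND, r)]

def covers(x, z):
    """In a Boolean lattice, x covers z when z ⊂ x and exactly one element is added."""
    return z < x and len(x) == len(z) + 1

def is_increasing(h):
    return all(h[z] <= h[x] for x in ELEMENTS for z in ELEMENTS if z <= x)

def submodular_on(h, pairs):
    return all(h[x] + h[y] >= h[x | y] + h[x & y] for x, y in pairs)

ALL_PAIRS = list(product(ELEMENTS, repeat=2))
COVER_PAIRS = [(x, y) for x, y in ALL_PAIRS
               if covers(x, x & y) and covers(y, x & y)]

# Compare the local test with the full test for every increasing function.
agree = True
for values in product(range(3), repeat=len(ELEMENTS)):
    h = dict(zip(ELEMENTS, values))
    if is_increasing(h):
        agree &= submodular_on(h, COVER_PAIRS) == submodular_on(h, ALL_PAIRS)
print(agree, len(COVER_PAIRS), len(ALL_PAIRS))
```

The local test involves far fewer pairs than the full test, which is what makes the lemma useful when polymatroid functions are assembled block by block.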
Proposition 8. Let $(L, \cap, \uplus)$ be a lattice with a tolerance relation Θ, and let $h : L \to \mathbb{R}$ denote some function. Then, h is polymatroid if and only if the restriction of h to any block is polymatroid.
If h is entropic, then the restriction to each block is entropic. Characterizing the blocks of a lattice has been done for certain classes of lattices, but here, we shall only mention a single result.
Theorem 8 ([36]). The blocks of a modular lattice are the maximal atomistic intervals. In particular, the skeleton of a modular lattice consists of blocks that are geometric lattices.
6. Results for Planar Lattices
In this section, we will restrict our attention to planar lattices. There are several reasons for this restriction. First of all, any poset with a planar Hasse diagram is a lattice if and only if it has a least element and a greatest element [
37]. As a consequence, any ∩-semilattice of a planar lattice is also a planar lattice. Certain cut-and-glue techniques are also very efficient for planar lattices. Finally, both planar distributive lattices and planar modular lattices have nice representations that will play a central role in our proofs.
Theorem 9. Let h denote a polymatroid function on a planar lattice . Then, h has an entropic representation if and only if the restriction to each block of has an entropic representation.
Proof. The proof is via induction over the number of elements in the lattice. For a trivial lattice, there is nothing to prove. Assume that the theorem has been proven for all lattices with fewer elements than the number of elements of Assume that h is a polymatroid. Since the lattice is planar, it has a left boundary chain and a right boundary chain where is the maximal element of Let be the minimal element of the right boundary chain such that We note that Let denote the largest element in the left boundary chain such that Then, there is a chain from to , and we have a glued tolerance relation with two blocks and and with the two element chain lattice 2 as the factor lattice. These two blocks are glued together along a chain that and share. There are two cases: either or
Assume that Then, the glued tolerance relation is non-trivial. Since h restricted to and are probabilistically representable, we may without loss of generality assume that there exist two groups and such that to , there is a subgroup such that We associate the variable that maps an element into the left coset The goal is to find a joint distribution to a set of variables associated with each We note that all variables in are functions of , so if we map into , all other variables in are determined. In particular, the chain is determined by The sequences are mapped into the sequence recursively, starting with mapping into This is possible since and are uniform distributions on sets of the same size. Now, there are equally many values of and that map into the same values of and , so the values of and can be mapped into each other. We continue like that until all the random variables along the chain have been identified.
If , then we make a similar construction with the role of the left chain and the right chain reversed. If this leads to a non-trivial glued tolerance relation, we glue representations together as we did above.
If both the left chain and the right chain lead to trivial glued tolerance relations, then is the maximal element of , and the whole lattice consists of a single block in In this case, the content of the theorem is trivial. □
Theorem 10. All planar modular lattices are Shannon lattices.
Proof. Without loss of generality, we may assume that the lattice consists of just one block for the tolerance relation A modular block is atomistic, but if a modular planar lattice is atomistic, it is equivalent to the trivial lattice or to the lattice 2, or to the lattice , or to one of the lattices □
Our construction actually tells us more. If the lattice is distributive, it is glued together with blocks that are either equivalent to
2 or to the lattice
Therefore, the lattice is a sublattice of a product of two chains, as illustrated in
Figure 5. This result was first proven by Dilworth [
38]. Other characterizations of planar distributive lattices can be found in the literature [
39]. Since the extreme polymatroid functions on the lattices
2 and the lattice
only take the values zero and one, the same is true for any planar distributive lattice.
A modular planar lattice will also contain blocks of the type
. Therefore, a modular planar lattice can be obtained from a distributive planar lattice by adding double irreducible elements [
40], as illustrated in
Figure 6.
Since has extreme polymatroid functions that take the values 0, and 1, the extreme functions are modular. Gluing such modular functions together leads to extreme polymatroid functions that are modular. Therefore, all extreme polymatroid functions on a planar modular lattice can be represented by a planar modular lattice with a modular function. Therefore, the independence structure is given by when
The extreme polymatroid functions on a planar modular lattice can be represented as follows. Let
denote independent random variables uniformly distributed over
for some large value of
Let
denote the random variable:
and let
denote the random variable:
for
The way to index the variables can be seen in
Figure 7. Then, the entropy is proportional to the ranking function. A polymatroid function
h that has a representation given by an Abelian group satisfies the Ingleton inequalities [
41], i.e., inequalities of the form:
Therefore, the Shannon inequalities imply the Ingleton inequalities as long as the polymatroid function is defined on a planar modular lattice. Paajanen [
42] has proven that under some conditions, the entropy function of a nilpotent
p-group can be represented by an Abelian group. The core of the proof was that the subgroup lattice of a nilpotent
p-group is also the subgroup lattice of an Abelian group. Many of these lattices are planar, and in these cases, the results by Paajanen follow from our results on planar lattices.
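As a small numerical illustration of the Ingleton inequality, the sketch below uses its standard four-variable form, H(A) + H(B) + H(A,B,C) + H(A,B,D) + H(C,D) ≤ H(A,B) + H(A,C) + H(A,D) + H(B,C) + H(B,D), which is equivalent to I(A;B) ≤ I(A;B|C) + I(A;B|D) + I(C;D). The variable names A, B, C, D, the helper names `entropy` and `ingleton_gap`, and the example built from two independent uniform bits are assumptions made for illustration; the paper states the inequality for lattice elements.

```python
from collections import Counter
from itertools import product
from math import log2

def entropy(samples, idx):
    """Entropy in bits of the coordinates idx under the uniform
    distribution on the list of outcomes `samples`."""
    counts = Counter(tuple(s[i] for i in idx) for s in samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())

def ingleton_gap(samples, a=0, b=1, c=2, d=3):
    """Right-hand side minus left-hand side of the Ingleton inequality.

    The inequality holds exactly when the returned value is >= 0.
    """
    def H(*idx):
        return entropy(samples, sorted(set(idx)))

    lhs = H(a) + H(b) + H(a, b, c) + H(a, b, d) + H(c, d)
    rhs = H(a, b) + H(a, c) + H(a, d) + H(b, c) + H(b, d)
    return rhs - lhs

# Assumed example with an Abelian (linear over GF(2)) representation:
# x and y are independent uniform bits, A = x, B = y, C = x XOR y, D = x.
samples = [(x, y, x ^ y, x) for x, y in product((0, 1), repeat=2)]

print(ingleton_gap(samples))
```

In this example the gap is non-negative, consistent with the statement that functions represented by Abelian groups satisfy the Ingleton inequalities.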
7. Discussion
In this paper, we have proven that the three basic Shannon inequalities are sufficient for certain lattices. It would be a major step forward if one could make a complete characterization of lattices without non-Shannon inequalities, but this may be too ambitious. In order to obtain results, one may have to restrict to certain classes of lattices like general modular lattices or geometric lattices. For handling such lattices, one would have to develop new techniques that may also be of wider interest.
Lattices seem to fall into two types. For one type, there are no non-Shannon inequalities. For the other type, there are infinitely many non-Shannon inequalities. We do not know of any lattice with non-Shannon inequalities where the entropic functions are characterized by finitely many inequalities. Apparently, the complexity increases from three basic inequalities to infinitely many inequalities, and this transition seems to happen due to the Matúš lattice. Similarly, matroids in general have no finite characterization, and conditional independence does not have a finite characterization. It appears that the leap from low complexity to infinite complexity happens for the same reason in each case and is related to the structure of the Matúš lattice. In this paper, we have provided some basic results and a common terminology that should be useful for further exploration of this research area.
Bayesian networks and similar graphical models have not been discussed in the present paper. Nevertheless, Bayesian networks are closely related to functional dependencies, so important properties of Bayesian networks can be translated into lattice language. This will be the topic of a separate publication [
43], but some preliminary results have already been published [
7].
We have seen how a separoid relation generates a notion of functional dependence. For modular lattices, we have also seen that the lattice structure generates a separoid relation. It is an open question to what extent general lattices are born with a canonical notion of conditional independence that can be formalized in terms of separoids. For functional dependencies corresponding to Bayesian networks, this question has been studied in detail [
16], but more general results related to these questions would be of great importance to our understanding of concepts related to cause and effect.