1. Introduction
This paper compares two notions of entropy that are relevant in the context of state estimation under communication constraints. Since the work of Savkin [
1], it has been well known that the topological entropy of a dynamical system characterizes the smallest rate of information above which an estimator, receiving its state information at the corresponding rate, is able to generate a state estimate of arbitrary precision. Topological entropy is a quantity that has been studied in the mathematical field of dynamical systems since the 1960s and has turned out to be a useful tool for solving many theoretical and practical problems, cf. the survey [
2] and the monograph [
3]. A big drawback of this notion in the context of state estimation is that topological entropy is highly discontinuous with respect to the dynamical system under consideration in any reasonable topology, cf. [
4]. As a consequence, estimation policies based on topological entropy are likely to suffer from a lack of robustness. Additionally, topological entropy is very hard to compute or estimate. There are only few numerical approaches that potentially work for multi-dimensional systems, cf. [
5,
6,
7,
8], and each of them has its drawbacks and restrictions.
A possible remedy for these problems is provided in the works [
9,
10] of Matveev and Pogromsky. One of the main ideas in these papers is to replace the topological entropy as a figure-of-merit for the necessary rate of data transmission with a possibly larger quantity, named restoration entropy, which describes the smallest data rate above which a more robust form of state estimation can be achieved (called regular observability in [
9,
10]).
Looking at one of the simplest types of nonlinear dynamical systems, namely Anosov diffeomorphisms, the main result of the paper at hand demonstrates that for most dynamical systems, we have to expect that the restoration entropy strictly exceeds the topological entropy. That is, to achieve a state estimation objective that is more robust with respect to perturbations, one has to pay the price of using a channel that allows for a larger rate of data transmission. More specifically, our result shows that the equality of topological and restoration entropy implies a great amount of uniformity in the dynamical system under consideration, which can be expressed in terms of the unstable Lyapunov exponents at each point, whose sum essentially has to be a constant. Such a property can easily be destroyed by a small perturbation, showing that arbitrarily close to the given system, we find systems whose restoration entropy strictly exceeds their topological entropy. Since Anosov diffeomorphisms are considered as a paradigmatic class of chaotic dynamical systems, this property can be expected for a much larger class of systems.
To prove our result, we need a number of high-level concepts and results from the theory of topological, measurable, and smooth dynamical systems. This includes the concepts of topological and metric pressure, Lyapunov exponents, SRB measures, and uniform hyperbolicity.
For further reading on the topic of state estimation under communication constraints, we refer the reader to [
1,
9,
10,
11,
12,
13,
14] and the references given therein.
The structure of this paper is as follows: In
Section 2, we collect all necessary definitions and results from the theory of dynamical systems.
Section 3 introduces the concept of restoration entropy and explains its operational meaning in the context of estimation under communication constraints. In
Section 4, we prove our main result and provide some interpretation and an example. Finally,
Section 5 contains some concluding remarks.
2. Tools from Dynamical Systems
Notation: By , we denote the set of all integers, by the set of positive integers, and . All logarithms are taken to the base two. If M is a Riemannian manifold, we write for the induced norm on any tangent space , . The notation is reserved for operator norms. We write and for the closure and the interior of a set A in a metric space, respectively. Finally, the notation (A subset of B) does not exclude the case .
In this paper, we use several sophisticated results from the theory of dynamical systems, in particular from smooth ergodic theory. In the following, we try to explain these results without going too much into technical details.
Let
be a continuous map on a compact metric space
. Via its iterates:
the map
T generates a discrete-time dynamical system on
X with associated orbits
,
. We call the pair
a topological dynamical system (TDS).
2.1. Entropy and Pressure
Let
be a TDS. The topological entropy
measures the total exponential complexity of the orbit structure of
in terms of the maximal numbers of finite-time orbits that are distinguishable w.r.t. to a finite resolution. One amongst different possible formal definitions is as follows. For
and
, a set
is called
-separated if for any
with
, we have:
That is, we can distinguish any two points in
E at a resolution of
by looking at their length-
n finite-time orbits. By the compactness of
X, there is a uniform upper bound on the cardinality of any (
n,
ε,
T)-separated set. Writing
r(
n,
ε,
T) for the maximal possible cardinality,
This definition is due to Bowen [
15] and (independently) Dinaburg [
16]. However, it should be noted that the first definition of topological entropy, given by Adler, Konheim, and McAndrew [
17], was in terms of open covers of
X and was modeled in strict analogy to the metric (= measure−theoretic) entropy defined earlier by Kolmogorov and Sinai [
18,
19].
To define metric entropy, one additionally needs a Borel probability measure
on
X that is preserved by
T in the sense that
for every Borel set
A. By the theorem of Krylov–Bogolyubov, every continuous map on a compact space admits at least one such measure, cf. [
20], Theorem 4.1.1. We write
for the set of all
T-invariant Borel probability measures. For any finite measurable partition
of
X, we define the entropy of
T on
by:
Here, ⋁ denotes the join operation. That is,
is the partition of
X whose elements are all intersections of the form
with
. Moreover,
denotes the Shannon entropy of a partition, i.e.,
for any finite partition
. The metric entropy of
T w.r.t.
is then defined by:
the supremum taken over all finite measurable partitions
of
X (replacing measurable partitions with open covers and Shannon entropy with the logarithm of the cardinality of a minimal finite subcover, the same construction yields the topological entropy as defined in [
17]).
To understand the meaning of , note that is the average amount of uncertainty as one attempts to predict the partition element to which a randomly-chosen point belongs. Hence, measures the average uncertainty per iteration in guessing the partition element of a typical length-n orbit.
The variational principle for entropy states that:
where the supremum is not necessarily a maximum. This variational principle can be regarded as a quantitative version of the theorem of Krylov–Bogolyubov.
Another concept (of which entropy is a special case) used in dynamical systems and inspired by ideas in thermodynamics is pressure. In this context, any continuous function
, also called a potential or an observable, gives rises to the metric pressure of
T w.r.t.
for a given
, defined as:
To define an associated notion of topological pressure, put
and:
Then, the topological pressure of
T w.r.t.
is given by:
The associated variational principle, first proven in [
21], reads:
which includes (
1) as a special case (simply put
).
2.2. Subadditive Cocycles
Let
be a map. A subadditive cocycle over
is a sequence
of functions
satisfying:
If equality holds in this relation, we call an additive cocycle over .
If X has the structure of a probability space with a -algebra and a probability measure on , T is measurable, and is T-invariant, we speak of a measurable subadditive cocycle provided that all are measurable. In the context of a TDS , we speak of a continuous subadditive cocycle if all are continuous.
The most fundamental result about subadditive cocycles is Kingman’s subadditive ergodic theorem, cf. [
3], Theorem 2.1.4:
Theorem 1. Let be a measure-preserving map on a probability space and a measurable subadditive cocycle over such that each is integrable. Then, the limit:exists for μ-almost every . If, additionally, μ is ergodic, then the limit is constant with: Observe that the limit on the right-hand side of (
3) always exists by Fekete’s subadditivity lemma (see [
3], Fact 2.1.1), because the sequence
is subadditive, i.e.,
. Kingman’s theorem can, in particular, be applied if
is a TDS,
, and
is a continuous subadditive cocycle.
Now, we consider again a TDS
and a continuous subadditive cocycle
over
. We define the extremal growth rate of
by:
The following result is well known and can be found in [
22], Theorem A.3, for instance:
Lemma 1. Let be a continuous subadditive cocycle over a TDS . Then: Here, all infima can be replaced with limits. Moreover, every supremum is attained.
2.3. Lyapunov Exponents, SRB Measures, and Pesin’s Formula
To describe the long-term dynamical behavior of smooth systems, the notion of Lyapunov exponents is crucial. Given a
-diffeomorphism
on a compact Riemannian manifold
M, the Lyapunov exponent at
in direction
is the number:
provided that the limit exists. Lyapunov exponents measure how fast nearby solutions diverge from each other. The most general result on their existence and their properties is the multiplicative ergodic theorem (MET), also known as Oseledets theorem, cf. [
23,
24]. We need the following version of the theorem (which is not the most general):
Theorem 2. Let be a -diffeomorphism of a compact Riemannian manifold M and . Then, there exists a Borel set with and such that the following holds: for every , there exist numbers , and the tangent space at x splits into linear subspaces as:such that the following properties hold: - (i)
For every , we have: - (ii)
The functions , , and are measurable and constant along orbits. Moreover, - (iii)
For every , the limit:exists, and the different eigenvalues of are (here, denotes the adjoint of ).
Typically, a given map has a huge number of associated invariant measures. To obtain a good description of the global dynamical behavior, one has to select specific invariant measures that determine the behavior of the system on a large set of initial states. In this context, the notion of an SRB measure (Sinai–Ruelle–Bowen measure) comes into play. An SRB measure is a measure with at least one positive Lyapunov exponent almost everywhere, having absolutely continuous conditional measures on unstable manifolds. We are not going to give a technical definition of the latter property. Instead, we state the following celebrated theorem due to Ledrappier and Young [
25], which characterizes this property in terms of metric entropy. Here, we use the short-cut:
for the sum of all positive Lyapunov exponents at a point
, counted with multiplicities.
Theorem 3. Let be a -diffeomorphism of a compact manifold M and . Then, the formula:holds if and only if μ has absolutely continuous conditional measures on unstable manifolds. Additionally, note that for any
-diffeomorphism
T and any
, the inequality:
holds, which is known as Ruelle’s inequality or Ruelle–Margulis inequality [
26] (Formula (
4) was first proven by Pesin for smooth invariant measures).
2.4. Anosov Diffeomorphisms
One of the simplest classes of smooth dynamical systems with complicated dynamical behavior is the class of Anosov diffeomorphisms. In this paper, we use these systems for two reasons. First, they have positive topological entropy, and second, they are very well understood and there are many tools available to describe their properties.
Let
M be a compact Riemannian manifold. A
-diffeomorphism
is called an Anosov diffeomorphism if there exists a splitting:
into linear subspaces such that the following conditions are satisfied:
- (A1)
and for all .
- (A2)
There are constants
and
, so that, for all
and
,
From (A1) and (A2), it automatically follows that
and
vary continuously with
x, cf. [
20], Proposition 6.4.4. The existence of a splitting as above is also known as uniform hyperbolicity.
The simplest examples of Anosov diffeomorphisms are hyperbolic linear torus automorphisms, i.e., maps on the
n-dimensional torus
of the form:
where
is an integer matrix satisfying
and
for all eigenvalues
of
A. Observe that the assumption
guarantees that
is invertible with inverse
(because
also has integer entries) and at the same time implies that
is area-preserving. That is, the normalized Lebesgue measure on
is an element of
. The assumption on the eigenvalues of
A together with the fact that the derivative
at any point
can be identified with
A itself implies the Anosov Properties (A1) and (A2).
It is well known that Anosov diffeomorphisms are structurally stable, i.e., any sufficiently small
-perturbation
of an Anosov diffeomorphism
is also an Anosov diffeomorphism, which is topologically conjugate to
T, see [
20], Proposition 6.4.6 and Corollary 18.2.2. That is, there exists a homeomorphism
, so that:
If we assume that
T is an arbitrary Anosov diffeomorphism of the torus, the existence of a unique entropy-maximizing measure
follows. That is,
is the unique element of
satisfying:
This follows from a combination of results that can be found in Katok and Hasselblatt [
20], namely Theorem 20.3.7, Proposition 18.6.5, Theorem 18.3.9, and Corollary 6.4.10. The entropy-maximizing measure
is also known as the Bowen-measure.
In this context, also the notion of topological mixing is important. An Anosov diffeomorphism (or simply a continuous map)
is called topologically mixing if for any two nonempty open sets
, there exists an integer
N such that
for all
. In particular, all Anosov diffeomorphisms on
are topologically mixing ([
20], Proposition 18.6.5).
3. State Estimation and Restoration Entropy
The notion of restoration entropy was introduced in [
10] for systems given by ODEs on
. However, it is immediately clear from the definition that restoration entropy can be defined for any continuous map on a compact metric space as follows. Let
be a continuous map on a metric space
and
a compact set with
. For every
,
and
, let
denote the smallest number of
-balls needed to cover the image
. If the map is not clear from the context, we also write
. Then:
The existence of the limit in
n follows from the subadditivity of the sequence
(using Fekete’s lemma). If we assume that
T is a
-diffeomorphism of a compact Riemannian manifold, the numbers
can be estimated in terms of the unstable singular values of
. This is related to the simple fact that the image of a ball under a linear map (in our case, the local linear approximation
to
) is an ellipsoid with semi-axes of lengths proportional to the singular values. This leads to the following result, proven in [
10], Theorem 11, for continuous-time systems. The proof carries over to discrete-time systems on Riemannian manifolds without any problem.
Theorem 4. Let be a -diffeomorphism of a d-dimensional Riemannian manifold M and a forward-invariant compact set of T with . Then:where denote the singular values of . For the analysis of , based on the above formula, the following observations are crucial:
We have
where
denotes the linear map induced by
between the full exterior algebras of the tangent spaces
and
, respectively; see [
27], Chapter I, Proposition 7.4.2.
The sequence
,
, is a continuous subadditive cocycle over
, since:
Alternatively, this follows from Horn’s inequality for singular values; see [
27], Chapter I, Proposition 2.3.1.
In the following, we explain the operational meaning of the quantity .
Consider the dynamical system given by:
Suppose that a sensor, fully observing the state , sends its data to an encoder. At the sampling times , the encoder sends a signal through a noise-free discrete channel to a decoder (without transmission delay). The decoder acts as an observer of the system, trying to reconstruct the state from the received data. We write for the estimate generated by the observer at time t. Moreover, we assume that we start with an initial estimate of a specified accuracy.
With
denoting the coding alphabet, the encoder and the observer are described by mappings:
and:
The argument
corresponds to the initial error at time zero, i.e.,
. In particular, we assume that both the encoder and the observer are given the data
and
.
We assume that the channel can transmit at least
and at most
bits in any time interval of length
r. The capacity of the channel is then defined by:
assuming that these limits exist and coincide.
We consider the following two observation objectives:
- (O1)
The observer observes the system with exactness
if there exists
, so that
with
implies:
- (O2)
The observer regularly observes the system if there exist
, so that for all
and
with
,
We say that the system is:
observable on K over a channel of capacity C if for every , an observer exists that observes the system with exactness over this channel;
regularly observable on K over a channel of capacity C if there exists an observer that regularly observes the system over this channel.
Then, we have the following data-rate theorem, cf. [
9], Theorem 8, and [
10], Theorem 9.
Theorem 5. The smallest channel capacity , so that System (6) is: observable on K over every channel of capacity is given by: regularly observable on K over every channel of capacity is given by:
Since regular observability implies observability, it is clear that:
As already pointed out in the Introduction, the quantity is highly discontinuous w.r.t. the dynamical system. Moreover, the corresponding data-rate theorem has the disadvantage that the final error may be much larger than the initial error , which cannot happen in the case of regular observability. From Theorem 4 in combination with Lemma 1, one sees that in the smooth case, is an infimum over functions that are continuous w.r.t. T in the -topology. This implies at least upper semicontinuity. Hence, we can expect that coding and estimation strategies based on restoration entropy enjoy better properties than those based on topological entropy.
4. Results
Before we present our main result, we prove two lemmas, which are of independent interest.
Lemma 2. Let be a -diffeomorphism on a compact Riemannian manifold M. Then, for any , we have: Proof. Let
. First observe that we have the identity:
where
are the singular values of
, see [
27], Chapter I, Proposition 7.4.2. Hence,
The maximum over
k is clearly attained when
k is the maximal number such that
for all
. Hence,
The numbers
are the eigenvalues of
. Theorem 2 states that
for
-almost every
and the logarithms of the eigenvalues of
are the Lyapunov exponents at
x. Since eigenvalues depend continuously on the matrix, it follows that:
and consequently
Applying the theorem of dominated convergence then yields the result. □
Lemma 3. Let be a -diffeomorphism on a compact Riemannian manifold M such that . Then, if T has an entropy-maximizing measure , it follows that: Proof. Assume to the contrary that
(using Ruelle’s inequality (
5)). Then, Lemma 2 implies:
According to Theorem 4 and the subsequent observation, an application of Lemma 1 yields:
Combining these observations gives , in contradiction to our assumption. □
Now, we are in a position to state our main result.
Theorem 6. Let be a topologically mixing -Anosov diffeomorphism on a compact Riemannian manifold M such that . Then, the unique entropy-maximizing measure is an SRB measure. Moreover, the function:is constant. Proof. First note that the existence and uniqueness of an entropy-maximizing measure
follows from [
20], Theorem 20.3.7, Theorem 18.3.9, and Corollary 6.4.10. Here, the assumption that
T is topologically mixing is crucial. By the preceding lemma combined with Theorem 3, we already know that
has absolutely continuous conditional measures on unstable manifolds. Since an Anosov diffeomorphism has positive Lyapunov exponents everywhere (where they exist), attained in all directions of the unstable subspace
, it follows that
is an SRB measure.
Now, let
be chosen arbitrarily. Due to the invariance of
, we have:
for every
, implying:
where we use Kingman’s subadditive ergodic theorem, applied to the continuous additive cocycle
(
), and the theorem of dominated convergence. Observe that the function
is continuous (using the fact that
is continuous). Hence, we can consider the affine function:
The variational principle (
2) for pressure tells us that:
Hence, , as the supremum over affine functions, is a convex function.
Using that
is the entropy-maximizing measure and Theorem 3, respectively, we obtain:
The second identity here follows from the fact that
and
by Ruelle’s inequality (
5). Hence,
.
By convexity of
and (
7), this implies:
From (
7), it now follows that all of the maps
have the same slope, i.e.,
is independent of
. □
The above theorem shows that the equality
is a very restrictive condition. Indeed, this can be seen as follows. Any topologically mixing Anosov diffeomorphism has an abundance of periodic points. Indeed, the set of periodic points is dense in
M; see [
20], Corollary 6.4.19. If we consider a periodic point
of period
, we can consider the invariant measure
given by:
with
being the Dirac measure at a point. The above theorem implies that, under
, the number:
is independent of the periodic point
p chosen. On the other hand, we know that every sufficiently small
-perturbation of
T yields another
-Anosov diffeomorphism, topologically conjugate to
T, hence also topologically mixing. If this perturbation is only performed in a small vicinity of a fixed periodic orbit, it can easily change the number
, while not changing it for most of the other periodic orbits. As a consequence, the perturbed diffeomorphism
cannot satisfy
.
The following corollary gives another characterization of Anosov diffeomorphisms with in a two-dimensional case.
Corollary 1. Consider a area-preserving Anosov diffeomorphism of the two-torus. Then, the equality is equivalent to the existence of a hyperbolic linear automorphism and a -diffeomorphism such that .
Proof. It follows immediately from Theorem 6 in combination with [
20], Corollary 20.4.4, that the identity
implies the existence of a
-conjugacy, as asserted. The other direction is easy to see, using the definition of restoration entropy. If
, then also
for all
. We use that a
-map on a compact manifold has a global Lipschitz constant. Let
and
be Lipschitz constants of
h and
, respectively. Then:
Observe that
. Let
denote the minimal number of
-balls needed to cover an
-ball in
for any
. Then, the minimal number of
-balls needed to cover
is bounded from above by
. This implies:
Taking the lim sup for
and subsequently the limit for
, we obtain that
. The other inequality can be proven analogously, so:
Since
T and
are topologically conjugate (the
-diffeomorphism
h is a homeomorphism, in particular), they also have the same topological entropy:
To complete the proof, it now suffices to show that
. We can compute
using Theorem 4. To this end, observe that
A is a hyperbolic matrix. If
are its eigenvalues, we obtain:
implying
. It is well known that this is also the value of the topological entropy
; see [
20],
Section 4. This also follows from the combination of the variational principle with Theorem 3. □
The following example demonstrates how restrictive the condition is by looking at small perturbations of Arnold’s Cat Map.
Example 1. Arnold’s Cat Map is the hyperbolic linear two-torus automorphism induced by the integer matrix:with determinant . Observe that the derivative can be identified with A for each . Since A is a hyperbolic matrix with eigenvalues:satisfying , it follows that is a area-preserving Anosov diffeomorphism. Hence, Corollary 1 yields: Now, we consider a perturbation of the form:which is well defined as a torus map, since the sine function is -periodic. By the structural stability of Anosov diffeomorphisms, for a sufficiently small ε, this map is topologically conjugate to , hence has the same topological entropy . However, its restoration entropy is strictly greater. This can be seen by looking at the fixed point with the associated derivative: The eigenvalues of this matrix can be computed as: Since , Theorem 4 yields for sufficiently small.
5. Conclusions
In this paper, we compared two notions of entropy for dynamical systems that have an operational meaning in the context of state estimation over digital channels: topological entropy and restoration entropy. Looking at Anosov diffeomorphisms (a paradigmatic class of chaotic dynamical systems), our main result demonstrates that the equality of these two quantities implies a great amount of uniformity in the given system. For area-preserving Anosov diffeomorphisms on the two-torus, this uniformity can be expressed in terms of the existence of a
-conjugacy to a linear system. Hence, we can conclude that for most dynamical systems, the strict inequality
holds. The operational meaning of this inequality is that for regular observability, as defined in
Section 3, a strictly larger channel capacity is necessary than for observability.