Review

Thermodynamic Formalism in Neuronal Dynamics and Spike Train Statistics

1 CIMFAV-Ingemat, Facultad de Ingeniería, Universidad de Valparaíso, Valparaíso 2340000, Chile
2 IPICYT/División de Matemáticas Aplicadas, San Luis Potosí 78216, Mexico
3 Inria Biovision team and Neuromod Institute, Université Côte d’Azur, 06901 CEDEX Inria, France
* Author to whom correspondence should be addressed.
Entropy 2020, 22(11), 1330; https://doi.org/10.3390/e22111330
Submission received: 9 October 2020 / Revised: 13 November 2020 / Accepted: 15 November 2020 / Published: 23 November 2020
(This article belongs to the Special Issue Generalized Statistical Thermodynamics)

Abstract:
The Thermodynamic Formalism provides a rigorous mathematical framework for studying quantitative and qualitative aspects of dynamical systems. At its core, there is a variational principle that corresponds, in its simplest form, to the Maximum Entropy principle. It is used as a statistical inference procedure to represent, by specific probability measures (Gibbs measures), the collective behaviour of complex systems. This framework has found applications in different domains of science. In particular, it has been fruitful and influential in neurosciences. In this article, we review how the Thermodynamic Formalism can be exploited in the field of theoretical neuroscience, as a conceptual and operational tool, in order to link the dynamics of interacting neurons and the statistics of action potentials from either experimental data or mathematical models. We comment on perspectives and open problems in theoretical neuroscience that could be addressed within this formalism.

1. Introduction

Initiated by Boltzmann [1,2], the goal of statistical physics was to establish a link between the microscopic mechanical description of interacting particles in a gas or a fluid, and the macroscopic description that is provided by thermodynamics [3,4].
Although this program is, even nowadays, far from being completed [1,5], the work of Boltzmann and his successors opened new avenues of research, not only in physics but also in mathematics. In particular, the term “ergodic”, coined by Boltzmann [1], inaugurated an important branch of mathematics that provides a rigorous link between the description of dynamical systems in terms of their trajectories and the description in terms of statistics of orbits and, more generally, between dynamical systems theory and probability theory. At the core of ergodic theory, there is a set of “natural” dynamically invariant probability measures in the phase space, somewhat generalising the Liouville distribution for conservative systems, with strong analogies with Gibbs distributions in statistical physics [6,7]. This strong connection, in particular, gave birth to the so-called Thermodynamic Formalism.
The introduction of Thermodynamic Formalism that occurred in the 1970s was primarily due to Yakov Sinai, David Ruelle, and Rufus Bowen [8,9,10]. The development of Thermodynamic Formalism initially served to derive rigorous criteria characterising the existence and uniqueness of Gibbs states in the infinite volume limit. Although Gibbs states and equilibrium states (see Section 2.1.1) are naturally defined in finite volume systems, the extension to infinite volume (“thermodynamic limit”) is far from straightforward. Indeed, it does not follow from the Carathéodory or Kolmogorov extension theorems [11,12] that the equilibrium states of the infinite volume define a measure, as there is no way to express the marginals associated to an infinite-volume Gibbs measure without making explicit reference to the measure itself [12]. Considering conditional probabilities, rather than marginals, led Dobrushin, Lanford, and Ruelle to a different consistency condition that allows the construction of infinite-volume Gibbs measures [7].
In the context of dynamical systems, Sinai, Ruelle, and Bowen were able to connect the theory of hyperbolic (Anosov) dynamical systems to results in statistical mechanics. Indeed, Sinai found an unexpected link between the equilibrium statistical mechanics of spin systems and the ergodic theory of Anosov systems through a codification using Markov partitions (see Section 2.1.1 for details). This idea was later extended to a much more general class of hyperbolic systems [10,13,14]. While the Thermodynamic Formalism started as a branch of rigorous statistical mechanics, it is nowadays viewed by different communities as a branch of dynamical systems or stochastic processes.
There have been a few attempts to use Thermodynamic Formalism in ways other than as a natural mathematical foundation of statistical mechanics, for example, studying population dynamics [15,16], self-organised criticality [17], the relative abundance of amino acids across diverse proteomes [18], analysing the difference between introns and exons in genetic sequences [19,20], coding sequence density estimation in genomes [21], and the statistics of spike trains in neural systems [22,23,24,25,26,27,28], which is the main topic of this review. Neuronal networks are biological systems whose components (neurons, synapses, ionic channels, etc.) are ruled by the laws of physics and described in terms of differential equations; hence, they are dynamical systems. On the other hand, because of their large dimensionality, it is natural to attempt to characterise neuronal networks using methods inspired by statistical physics, for example, mean-field methods [29,30,31,32], density methods [33], or Fokker-Planck equations [34]. Most neurons produce short electrical pulses, called action potentials or spikes, and it is widely believed that the collective spike trains emitted by neuronal networks encode information about the underlying dynamics and the response to stimuli [35,36,37]. Thus, researchers have devoted a lot of effort to understanding the correlation structure in the statistics of spike trains [38,39,40,41,42,43,44,45].
Because spikes can be represented as binary variables, it is natural to adapt methods and concepts from statistical physics, and more specifically the statistical physics of spin systems, to analyse spike train statistics. There have been many successful attempts in this direction. All of the approaches we know about are based on variational principles. The most direct connection from statistical physics to spike train statistics is made via the maximum entropy principle, which has attracted a lot of attention in recent years [38,39,43,46,47] (see [47] for a physicist-oriented review). Unfortunately, most of these articles are limited to the original form of an Ising spin-glass potential (pairwise interactions with random couplings) or variants of it with higher order interactions [40,41,42], where successive times are independent, thereby neglecting the time correlations and causality one may expect from a network of neurons with interactions (exceptions can be found in [48,49,50]). We focus on this approach and its extension to causal networks in the present review, as it is naturally linked with the Thermodynamic Formalism. Another approach, which actually appeared earlier in mathematical neuroscience, is the dynamic mean-field theory that takes into account dynamics and time correlations. In this approach, which originated in quantum field theory and the Martin–Siggia–Rose formalism (statistical dynamics of classical systems [51]), the variational principle is expressed via the minimisation of an effective action containing the equations for the dynamics. It was introduced in the field of theoretical neuroscience by Sompolinsky, who initially applied it to the study of spin-glass dynamics [52,53,54], before analysing neuronal network dynamics [29]. Here, the effective action can be summarised, in the limit when the number of neurons tends to infinity, by dynamic mean-field equations depending on the neurons’ firing rates and pairwise time correlations. Thus, here, the information about spike statistics is contained in the first two moments (Gaussian limit). This approach has inspired a lot of important works, where temporal correlations are taken into account, see, for example [55,56,57], or the recent work by M. Helias and collaborators (i.e., the review on dynamic mean-field theory in [58] and the link from dynamic mean-field theory to the large deviations results of Ben-Arous and Guionnet [59] and [60]). Another trend attempts to relate neuronal dynamics to spike statistics, expressed via a maximum-likelihood approach [61] (we apologise for other approaches that we may have overlooked). To our taste, the most accomplished work in this direction is Amari’s information geometry [62], making a beautiful link between probability measures (e.g., exponential families, like Gibbs measures) and differential geometry.
In this review, we show how Thermodynamic Formalism can be used as an alternative way to study the link between the neuronal dynamics and spike statistics, not only from experimental data, but also from mathematical models of neuronal networks, properly handling causal and memory dependent interactions between neurons and their spikes. It is also based on a variational principle (and, actually, the four methods that are discussed in this introduction certainly have a common “hat”, the large deviations theory [63]), although we depart from this principle at some point where we discuss extensions of this approach to non-stationary situations through a linear response formula. Additionally, Thermodynamic Formalism provides a rigorous mathematical framework to study phase transitions that may illuminate the current discussions regarding signatures of criticality observed in many examples of neuroscience [64,65,66,67,68,69,70,71,72] and, especially, in Gibbs distributions inferred while using the maximum entropy principle from experimental data [73,74,75]. The aim of this review is twofold. On the one hand, to bridge the gap between mathematicians working in the field of Thermodynamic Formalism and scientists interested in characterising spike statistics, especially those that apply the maximum entropy principle, which is placed here in a broader context. On the other hand, to show new perspectives in the field of mathematical neuroscience related to Thermodynamic Formalism, including phase transitions.
The rest of the article is organised, as follows. In Section 2, we introduce the main tools and ideas of Thermodynamic Formalism that we use later. In Section 3 and Section 4, we present the uses of this formalism in neuroscience, for spike train statistics. In Section 5, we end with a discussion and present perspectives for future works.

2. Mathematical Setting of Thermodynamic Formalism

Our goal in this section is to present the basic tools and ideas of Thermodynamic Formalism to the unfamiliar reader (detailed accounts of this subject can be found in [9,10,76,77,78]).

2.1. General Properties

In order to study a dynamical system from the perspective of the Thermodynamic Formalism, we need, first of all, the set of elements describing it to be either finite or countable. Thus, continuous particle characteristics such as position, speed, or momentum do not enter the setting we analyse here, unless one “coarse grains” the phase space with a finite or countable partition. Discrete particle characteristics, like spin, or symbols attached to specific features of a dynamical system, constitute the set of symbols denoted by $A$, also referred to as the alphabet. Let $A^M$ be the set of blocks of $M$ symbols of the alphabet, that is, sequences of the form $x_0 x_1 \cdots x_{M-1}$, where $x_i \in A$, $i = 0, \ldots, M-1$, and let $A^{\mathbb{N}}$ be the set of right-infinite sequences of the form $x = x_0 x_1 \cdots$, with $x_i \in A$ for all $i \in \mathbb{N}$. One may also consider the set of bi-infinite sequences $A^{\mathbb{Z}}$, but we will mostly stick to $A^{\mathbb{N}}$ in the sequel.
One can equip the space $A^{\mathbb{N}}$ with a distance. We associate to $\theta \in (0,1)$ the distance $d_\theta$ such that $d_\theta(x,y) = \theta^m$, where $m$ is the largest non-negative integer such that $x_i = y_i$ for every $0 \le i < m$. In particular, if $x = y$ then $m = \infty$ and $d_\theta(x,x) = 0$, and, by convention, if $x_0 \neq y_0$ then $m = 0$. When considering this distance, the metric space $(A^{\mathbb{N}}, d_\theta)$ is compact [10].
In order to consider time order, we introduce the time evolution in the form of a left shift, $\sigma : A^{\mathbb{N}} \to A^{\mathbb{N}}$, defined by $(\sigma x)_i = x_{i+1}$, for all $i \in \mathbb{N}$ and any $x \in A^{\mathbb{N}}$, where the index $i$ refers to time. That is, the $i$-th symbol of $\sigma x$ (the image of $x$ under $\sigma$) corresponds to the $(i+1)$-th symbol of the original $x$, for all $i$.
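To make the symbolic setting concrete, the following minimal Python sketch (ours, not from the reviewed article) implements the distance $d_\theta$ and the left shift on finite truncations of symbolic sequences; the parameter values and sequences are purely illustrative.

```python
def d_theta(x, y, theta=0.5):
    """d_theta(x, y) = theta**m, with m the length of the longest common
    prefix; finite tuples stand in for the infinite sequences of the text."""
    m = 0
    for a, b in zip(x, y):
        if a != b:
            break
        m += 1
    if m == len(x) == len(y):   # identical (finite) sequences: distance 0
        return 0.0
    return theta ** m

def shift(x):
    """Left shift sigma: (sigma x)_i = x_{i+1}."""
    return x[1:]

x = (0, 1, 1, 0, 1)
y = (0, 1, 0, 0, 1)
print(d_theta(x, y))   # 0.25: the sequences agree on their first two symbols
print(shift(x))        # (1, 1, 0, 1)
```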
Let us now consider functions $f : A^{\mathbb{N}} \to \mathbb{R}$. We introduce the modulus of continuity of $f$:
$$\mathrm{var}_k(f) := \sup\left\{ |f(x) - f(y)| : x_i = y_i, \ \text{for } i = 0, \ldots, k-1 \right\},$$
characterising the maximal variation of $f$ on the set of infinite sequences agreeing on their first $k$ symbols. The function $f$ is continuous if $\mathrm{var}_k(f) \to 0$ as $k \to \infty$, and the continuity is exponential if $\mathrm{var}_k(f) \to 0$ exponentially fast. A function $f : A^{\mathbb{N}} \to \mathbb{R}$ has range $R$ if $f(x) \equiv f(x_0, \ldots, x_{R-1})$, i.e., $f$ only depends on the first $R$ coordinates $x_0, \ldots, x_{R-1}$. Functions with finite range are obviously continuous. The notion of continuity, and especially exponential continuity, is essential when studying the Thermodynamic Formalism of functions with infinite range.
A useful concept for this formalism is that of Markov partitions. Let $\Omega$ be a general (continuous) state space and $G$ a general dynamics (a flow or a discrete-time mapping) on $\Omega$. Consider a finite covering of $\Omega$, made of sets called rectangles, $\mathcal{R} = \{R_1, \ldots, R_l\}$, with the property that, for every pair $i \neq j$, $\mathrm{int}(R_i) \cap \mathrm{int}(R_j) = \emptyset$, and such that the closure of the image set $G(R_i)$ equals a union of closures of sets in $\mathcal{R}$. There is a specific construction for different dynamical systems (for details, we refer the reader to [10]). This construction allows one to make a natural coding for the state space, where the trajectories of the dynamical system correspond to sequences of the symbols labelling the sets $R_i$. Rectangles allow us to partition a continuous phase space into a discrete finite partition. For hyperbolic dynamical systems, each point in the phase space admits local stable and unstable manifolds with a non-zero diameter. The edges of the rectangles in this case are made of these local stable and unstable manifolds, and the corresponding partition is called a “Markov partition”, because the image of a rectangle under the dynamics can be represented by a Markov transition matrix [79].

2.1.1. Gibbs Measures

We now define the measurable sets on $A^{\mathbb{N}}$. Let us denote the sequence $x_0, \ldots, x_{n-1}$ by $x_{0,n-1}$. The set $[x_{0,n-1}] := \{y \in A^{\mathbb{N}} : y_0 = x_0, \ldots, y_{n-1} = x_{n-1}\}$ is called a cylinder. Cylinders define a Borel sigma-algebra $\mathcal{B}$ on $A^{\mathbb{N}}$ (see, for instance, Chapter 1 in [80]). We are interested in probability distributions on $(A^{\mathbb{N}}, \mathcal{B})$, also referred to as (macro) states in physics. In the sequel we will skip the prefix “macro” and deal with “states” as probability measures on $(A^{\mathbb{N}}, \mathcal{B})$. There is a special class of states $\nu$ such that, for any measurable set $B \subset A^{\mathbb{N}}$, $\nu(\sigma^{-1}(B)) = \nu(B)$, where $\sigma^{-1}(B)$ stands for the set of all pre-images under $\sigma$ of elements of $B$. These distributions form the set of shift-invariant probability measures, denoted by $\mathcal{M}_\sigma(A^{\mathbb{N}})$. In general, it is possible to consider systems where some transitions between symbols are forbidden. In that case, we need to consider a subset of $A^{\mathbb{N}}$ invariant under the dynamics, i.e., $X \subset A^{\mathbb{N}}$ such that $\sigma(X) = X$.
The analogy with spin systems in statistical mechanics, at the root of the terminology “Thermodynamic Formalism”, goes as follows. Let $\phi : A^{\mathbb{N}} \to \mathbb{R}$ be a continuous function, also called “energy” or “potential”. Two important examples are finite-range potentials, where $\phi(x) \equiv \phi(x_{0,R-1})$, and exponentially continuous potentials with infinite range. Subsequently, the “energy” of a configuration of $n$ sites based on the sequence $x \in X$ is given by the “Birkhoff sums”:
$$S_n\phi(x) := \sum_{i=0}^{n-1} \phi(\sigma^i x).$$
We define the measures that assign probability to the cylinder sets based on the potential function, with the so-called “Gibbs property”: there exist constants $C > 1$ and $F(\phi)$ such that, for all $x \in X$ and for all $n \ge 1$:
$$C^{-1} \le \frac{\mu_\phi\left([x_{0,n-1}]\right)}{\exp\left(\sum_{i=0}^{n-1}\phi(\sigma^i x) - n\, F(\phi)\right)} \le C.$$
The measures satisfying the condition (1) are called “Gibbs measures”. The quantity:
$$F(\phi) = \lim_{n\to\infty} \frac{1}{n} \log \sum_{x_0 \cdots x_{n-1}} e^{S_n\phi(y)},$$
for all $y \in [x_{0,n-1}]$ [10], is called the “pressure” (or free energy) of the potential $\phi$. Observe that it does not depend on the measure $\mu_\phi$ itself, only on the potential. Thus, two Gibbs measures that are associated with the same potential have the same pressure.
Given a continuous potential $\phi$ such that $\mathrm{var}_k(\phi) \le b\,\theta^k$ for some constants $0 < \theta < 1$ and $b > 0$, there is a unique shift-invariant probability measure satisfying the Gibbs property (1) [10]. Furthermore, under the same assumption on the potential, the associated Gibbs measure is mixing, i.e., $\lim_{n\to\infty}\mu\left(A \cap \sigma^{-n}B\right) = \mu(A)\mu(B)$ for any measurable sets $A, B$, and thus ergodic (a measure $\mu$ is said to be ergodic for the dynamics $G$ if, for any measurable $G$-invariant set $A$, $G^{-1}(A) = A$, its measure is either $\mu(A) = 0$ or $\mu(A) = 1$; see, for instance, Chapter 3 of [79] for a detailed introduction) (see [10,80] for definitions and details). Moreover, two continuous potentials $\phi$ and $\psi$ are called cohomologous with respect to the shift $\sigma$ (and denoted by $\phi \sim \psi$) if there is a continuous function $u$ and a constant $K$ such that:
$$\phi = \psi + u - u\circ\sigma - K.$$
Cohomologous potentials have the same Gibbs measure, $\mu_\phi = \mu_\psi$.

2.1.2. Entropy and the Variational Principle

The Shannon entropy of a probability distribution $p$ on the (finite) alphabet $A$ quantifies the average level of “uncertainty” in its possible outcomes:
$$h(p) := -\sum_{x\in A} p(x)\log p(x).$$
More generally, let $\nu$ be a shift-invariant probability measure on $X$; we introduce the block entropy:
$$H_n(\nu) := -\sum_{x_{0,n-1}\in A^n} \nu\left([x_{0,n-1}]\right)\log \nu\left([x_{0,n-1}]\right).$$
For finite alphabets and because $\nu$ is shift-invariant, the following limit exists,
$$h(\nu) := \lim_{n\to\infty} \frac{1}{n} H_n(\nu),$$
and is called the Kolmogorov–Sinai entropy, or simply the entropy, of $\nu$ [76,81]. The key point is that the sequence $H_n$ is sub-additive and the possible values of $H_n/n$ lie in the compact set $[0, \log|A|]$ (see [76]).
Another important quantity is the Kullback–Leibler (KL) divergence, which quantifies the difference between probability distributions. Consider two probability distributions $p$ and $q$ on the same probability space $X$; the KL divergence is:
$$D(p\,||\,q) := \sum_{x\in X} p(x)\log\frac{p(x)}{q(x)}.$$
This divergence is finite whenever $p$ is absolutely continuous with respect to $q$, and it is zero only if $p = q$.
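As a small numerical illustration of these quantities (ours, not from the article), the sketch below estimates an entropy rate from empirical block frequencies and computes a KL divergence; the surrogate data are simply fair coin flips.

```python
import numpy as np
from collections import Counter

def shannon_entropy(p):
    """h(p) = -sum p log p (natural log), ignoring zero-probability entries."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def block_entropy(blocks):
    """H_n estimated from observed length-n blocks, using empirical
    block frequencies as a stand-in for nu([x_{0,n-1}])."""
    counts = Counter(blocks)
    total = sum(counts.values())
    return shannon_entropy([c / total for c in counts.values()])

def kl_divergence(p, q):
    """D(p||q) = sum p log(p/q); finite only if q > 0 wherever p > 0."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=10_000)                     # fair coin flips
n = 3
blocks = [tuple(x[i:i + n]) for i in range(len(x) - n + 1)]
print(block_entropy(blocks) / n, np.log(2))             # H_n / n approaches log 2
print(kl_divergence([0.7, 0.3], [0.5, 0.5]))            # vanishes only when p = q
```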
Following the analogy with systems in statistical mechanics, an equilibrium state is defined as a measure that satisfies the so-called variational principle, namely:
$$\sup\left\{ h(\nu) + \int \phi\, d\nu \; : \; \nu \in \mathcal{M}_\sigma(X) \right\} = h(\mu_\phi) + \int \phi\, d\mu_\phi = F(\phi),$$
where $\int\phi\,d\nu$ represents the expected value of $\phi$ with respect to the measure $\nu$ (it would be denoted $\mathbb{E}_\nu(\phi)$ in a probabilistic context). Let us comment on this result. The first equality establishes that, among all shift-invariant probability measures, there is a unique one, $\mu_\phi$, which maximises $h(\nu) + \int\phi\,d\nu$. If $\int\phi\,d\nu$ is fixed to some value, this corresponds to maximising the entropy under constraints. This is the Maximum Entropy Principle, but (4) is more general. The second equality states that the maximum, $h(\mu_\phi) + \int\phi\,d\mu_\phi$, is exactly the pressure defined in Equation (2). The equation $F(\phi) = h(\mu_\phi) + \int\phi\,d\mu_\phi$ establishes a link with thermodynamics, as $F(\phi)$ is equivalent to the free energy (in contrast to thermodynamics, where free energy and pressure refer to different thermodynamic ensembles, we do not make such a distinction here as it is irrelevant; also, note that we do not keep the minus sign and take the Boltzmann constant equal to one). Thus, (4) coincides with the principle of free energy minimisation in statistical mechanics, but it corresponds to a maximising measure in our formalism because of the sign convention. Gibbs measures associated to potentials satisfying $\mathrm{var}_k(\phi) \le b\,\theta^k$, as stated above, are equilibrium states. For more general potentials, the notions of Gibbs state and equilibrium state may not coincide.

2.2. Observables and Fluctuations of Their Time Averages

An observable $f$ is a function $f : X \to \mathbb{R}$ such that $|f| < \infty$ ($|\cdot|$ denotes the absolute value), and which is time-translation invariant, i.e., for any time $i$, $f(x_{0,R-1}) = f(x_{i,i+R-1})$ whenever $x_{0,R-1} = x_{i,i+R-1}$.
An important observable, considered in the following sections, is the potential:
$$U_\lambda(x) = \sum_{k=1}^{K}\lambda_k f_k(x),$$
that is, a linear combination of $K$ observables $f_k$, where the parameters $\lambda_k \in \mathbb{R}$. Here, we want to make a few important remarks. First, this decomposition is similar to the definition of potentials in thermodynamics, which are written in terms of a linear combination of intensive variables (e.g., temperature, chemical potential, and so on) and extensive functions of the configurations of the system (physical energy, number of particles). In order to emphasise this analogy, we use the symbol $U$ for the potential, instead of $\phi$. The function $U_\lambda$ depends parametrically on the $\lambda_k$’s; hence, we use the lower index $\lambda$ to denote $U_\lambda$, as well as the corresponding Gibbs measure, denoted here $\mu_\lambda$.
Now, we denote by $A_n(f)$ the empirical average of the observable $f$ in a sample of size $n$, that is,
$$A_n(f)(x) = \frac{1}{n}\sum_{i=0}^{n-1} f(\sigma^i x).$$
This quantity is important for empirical purposes. If $\mu$ is ergodic, the convergence of the empirical average to the actual expected value is $\mu$-almost sure, that is, $A_n(f) \xrightarrow{a.s.} \int f\,d\mu$ as $n\to\infty$. For an example of application of this theorem in the context of spike train statistics, see Section 2.6.

2.3. Correlations

Let us consider a pair of observables $f, g \in L^2(\mu)$ (square-integrable functions with respect to $\mu$). We define their correlation at time $n$ by
$$C_{f,g}(n) := \int f\cdot (g\circ\sigma^n)\, d\mu - \int f\, d\mu \int g\, d\mu.$$
One also might be interested in the auto-correlation (or auto-covariance) of $f$ at time $n$,
$$C_f(n) := \int f\cdot (f\circ\sigma^n)\, d\mu - \left(\int f\, d\mu\right)^2.$$
Observe that these quantities might decay fast: the mixing property implies that the correlations go to zero as $n\to+\infty$.

2.3.1. Properties of the Pressure

From the pressure (2), important statistical information regarding the system can be obtained, in particular correlations. A distinguished case corresponds to potentials of the form (5). When the corresponding pressure is differentiable to any order, taking successive derivatives of the pressure with respect to the conjugate parameters gives the average values of the observables, their correlations, and their higher-order cumulants with respect to the Gibbs measure. That is, in general:
$$\frac{\partial^n F(U_\lambda)}{\partial\lambda_k^n} = \kappa_n \quad \text{for all } k\in\{1,\ldots,K\},$$
where $\kappa_n$ is the cumulant of order $n$ of $f_k$ with respect to $\mu_\lambda$. In particular, $\kappa_1$ is the mean of $f_k$, $\kappa_2$ is its variance, $\kappa_3$ relates to the skewness, and $\kappa_4$ to the kurtosis. Partial derivatives with respect to pairs of parameters can also be considered [82]:
$$\frac{\partial^2 F(U_\lambda)}{\partial\lambda_k\,\partial\lambda_j} = C_{f_k,f_j}(0) + \sum_{n=1}^{\infty} C_{f_k,f_j}(n) + \sum_{n=1}^{\infty} C_{f_j,f_k}(n) = \frac{\partial\,\mu(f_k)}{\partial\lambda_j} = \frac{\partial\,\mu(f_j)}{\partial\lambda_k}.$$
Observe that we distinguish $C_{f_k,f_j}$ from $C_{f_j,f_k}$, as the dynamics may be irreversible in time. For an example of application of these formulas in the context of spike train statistics, see Section 2.6.
Remark 1.
The last two equations are fundamental. They establish a link between the variation of the average of the observable $f_k$, when slightly varying the parameter $\lambda_j$, and the sum of time correlations between the pair of observables $f_j, f_k$. This result is known, in statistical physics, as the fluctuation-dissipation theorem [83,84]. It relates, for example, in a ferromagnetic model, the variation of the magnetisation of the spin $k$ to the variations of the local magnetic field $h_k$, via the magnetic susceptibility, which is the second derivative of the free energy. This is also the context of linear response theory, which quantifies how a small perturbation of a parameter affects the average values of observables in terms of the unperturbed measure. Equation (7) can also be used to extend results of Thermodynamic Formalism to non-stationary situations, as we discuss in Section 4.5.
Now, in the classical formulation in statistical physics and in Maximum Entropy models, only the correlation $C_{f_k,f_j}(0)$ appears in (7), because successive times are independent (the correlations $C_{f_k,f_j}(n)$, $C_{f_j,f_k}(n)$, $n > 0$, vanish). When handling memory (thus, potentials with range $R > 1$), there is an infinite sum (series) of correlations appearing in the linear response. This infinite sum converges whenever correlations decay sufficiently fast with time $n$ (exponentially). In contrast, when they do not decay fast enough (e.g., power-law with a small exponent), the series diverges, leading to a divergence of the second derivative of the pressure, corresponding to a second-order phase transition. Here, we follow the Ehrenfest classification of phase transitions [85]: there is a phase transition of order $k$ if the pressure (free energy) is $C^{k-1}$ but not $C^k$. The known examples are first-order, second-order, or infinite-order (Kosterlitz–Thouless) phase transitions. Note that phase transitions in memory-less models can also happen if the instantaneous correlation function $C_{f_k,f_j}(0)$ diverges, e.g., when the number of degrees of freedom in the system (number of spins, neurons) tends to infinity. Therefore, in the present setting, second-order phase transitions can arise either when the number of degrees of freedom tends to infinity, or (not exclusively) when time correlations decay slowly.
Thus, Equation (7) connects the second derivative of the pressure, variations in the static average of observables, and dynamical correlations. In the next sections, we discuss theorems that relate the dynamical evolution to criteria ensuring that time correlations decay exponentially, preventing the possibility of second-order phase transitions for systems with a finite number of degrees of freedom.

2.3.2. Ruelle-Perron-Frobenius Operator

Let $C(X)$ be the set of continuous functions on $X$. Given the potential $\phi$ and a continuous function $f\in C(X)$, one can define a bounded linear operator associated to $\phi$ (transfer operator), called the Ruelle–Perron–Frobenius (RPF) operator, as follows (there is a close analogy between this operator and the propagator in quantum field theory or the Koopman operator in classical dynamics [5]; all of these operators characterise how measures or observables evolve under the dynamics):
$$\mathcal{L}_\phi f(x) = \sum_{y\in\sigma^{-1}x} e^{\phi(y)} f(y).$$
The spectral properties of (8) yield information to characterise the pressure and to study the ergodic properties of the system, in particular the rate of decay of its correlation functions [80]. For instance, if 1 is a simple eigenvalue and the modulus of every other eigenvalue is smaller than one, this is equivalent to mixing [80]. When the potential considered is of finite range, the transfer operator corresponds to a matrix and the whole formalism is equivalent to Markov chains defined on finite alphabets. A potential $\phi$ is called normalised if $\mathcal{L}_\phi(1) = 1$. The exponential of a normalised potential of range $R+1$ corresponds to the transition probabilities of a Markov chain with memory depth $R$. Moreover, in this case $F(\phi) = 0$. For Lipschitz observables in the finite-dimensional case, the Perron–Frobenius theorem assures a unique eigenvector associated to the maximal eigenvalue, from which the unique invariant (Markov) measure is obtained. This measure has mixing properties, exponential decay of correlations, a central limit theorem, and a large deviations principle (see Section 2.3.4). When the operator acts on an infinite-dimensional space (such as the space of continuous functions), the spectrum of a bounded linear operator $\mathcal{L}$ is given by the set $\mathrm{spec}(\mathcal{L}) = \{\lambda\in\mathbb{C} : (\lambda I - \mathcal{L}) \text{ has no bounded inverse}\}$; this set may contain points that are not necessarily eigenvalues (see, for instance, [86]). In this case, the strategy is to find a proper subspace where the spectrum of $\mathcal{L}$ has a finite number of such complex numbers whose norm is the spectral radius, say $\rho$, and the rest of the spectrum has norm strictly less than $\rho$ (spectral gap). In this scenario, it is known that there is exponential decay of correlations for a sufficiently regular class of observables (such as Lipschitz), and the central limit theorem holds. In the absence of a spectral gap, one has sub-exponential decay of correlations, which breaks down the central limit theorem, and the phase transition phenomenon appears (for further details and precise definitions, see [80] and the references therein).
Note that, given a potential $\psi$, one can explicitly find a normalised potential $\phi$ cohomologous to $\psi$, as follows:
$$\phi := \psi + \log R - \log R\circ\sigma - \log\rho,$$
where $R$ is the right eigenvector (real and positive) associated to the unique maximal eigenvalue $\rho$ of $\mathcal{L}_\psi$.
Remark 2.
Note that the normalisation of the potential $\psi$ does not require a partition function. In fact, as discussed below, the classical normalisation by a partition function is a particular case of (9), holding for memory-less potentials, and it does not generalise to potentials of range $R > 1$.

2.3.3. Time Averages and Central Limit Theorem

We have seen that, if the measure $\mu$ is ergodic, the time averages $A_n(f)$ converge $\mu$-almost surely to the expected value $\int f\,d\mu$. Now, we can ask about the fluctuations around this expected value. The observable $f$ satisfies the central limit theorem (CLT) with respect to $(\sigma,\mu)$ if:
$$\sqrt{n}\left(A_n(f) - \int f\, d\mu\right) \xrightarrow[n\to\infty]{\text{law}} \mathcal{N}\left(0, \sigma_f^2\right),$$
where $\mathcal{N}(0,\sigma_f^2)$ is the Gaussian distribution with zero mean and variance $\sigma_f^2$, given by the following expression involving temporal correlations:
$$\sigma_{f,g}^2 = C_{f,g}(0) + \sum_{n=1}^{\infty} C_{f,g}(n) + \sum_{n=1}^{\infty} C_{g,f}(n),$$
with $g = f$, which is a particular case of (7). We illustrate an application of this theorem in the context of spike train statistics later in Section 2.6.
Strong properties of convergence and exponential decay of correlations are ensured for Hölder continuous potentials in finite dimension. These properties are associated with the spectral gap property and they do not (necessarily) hold for less regular potentials or in non-compact spaces [78,80].

2.3.4. Large Deviations

The central limit theorem describes small fluctuations in the limit when $n$ goes to infinity. Rare events, whose probabilities are exponentially small, are the object of study of large deviations theory.
An empirical average $A_n(f)$ satisfies a large deviation principle (LDP) with rate function $I_f$ if the following limit exists:
$$I_f(s) := -\lim_{n\to\infty} \frac{1}{n}\log P\left[ A_n(f) > s \right],$$
for $s\in\mathbb{R}$. The above condition implies that, for large $n$, $P\{A_n(f) > s\} \simeq e^{-n I_f(s)}$. In particular, if $s > \int f\,d\mu$, then $P\{A_n(f) > s\}$ should tend to zero as $n$ increases. The rate function tells us precisely how fast this probability goes to zero. Computing the rate function from Equation (11) may be a laborious task. The Gärtner–Ellis theorem provides a way to compute $I_f$ more easily [63]. To this end, let us introduce the scaled cumulant generating function (SCGF) associated with the observable $f$, by
$$\Gamma_f(k) := \lim_{n\to\infty}\frac{1}{n}\log\int e^{\,n k A_n(f)}\, d\mu, \qquad k\in\mathbb{R},$$
whenever the limit exists. The name comes from the fact that the $n$-th cumulant of $f$ can be obtained by successively differentiating $\Gamma_f(k)$ with respect to $k$ and then evaluating the result at $k = 0$. If $\Gamma_f$ is differentiable, then the Gärtner–Ellis theorem ensures that the average $A_n(f)$ satisfies an LDP with rate function given by the Legendre transform of $\Gamma_f$, that is,
$$I_f(s) = \max_{k\in\mathbb{R}}\left\{ k s - \Gamma_f(k) \right\}.$$
Therefore, one can study the large deviations of empirical averages $A_n(f)$ by first computing their SCGF, checking its differentiability, and then finding the Legendre transform. We compute this function in the context of spike train statistics later in Section 2.6.
If $\Gamma_f(k)$ is differentiable, then $I_f(s)$ is convex [87]; thus, it has a unique global minimum $s^*$ such that $I_f'(s^*) = 0$ and $I_f(s^*) = 0$. Assume that $I_f(s)$ admits a Taylor expansion around $s^*$; then, for $s$ close to $s^*$,
$$I_f(s) = I_f(s^*) + I_f'(s^*)(s - s^*) + I_f''(s^*)\frac{(s-s^*)^2}{2} + O\left((s-s^*)^3\right).$$
Because $I_f(s^*) = 0$ and $I_f'(s^*) = 0$, for large values of $n$ we obtain from (11)
$$P\left[ A_n(f) > s \right] \simeq e^{-n I_f(s)} \simeq e^{-n I_f''(s^*)\frac{(s-s^*)^2}{2}}.$$
Therefore, the small deviations of $A_n(f)$ around $s^*$ are Gaussian with variance $1/\left(n\, I_f''(s^*)\right)$. In this way, the LDP can be regarded as an extension of the CLT.
The large deviation principle plays an important role in statistical mechanics, in particular in spin glass dynamics [59]. A large deviation principle can be used in order to relate entropy and free energy (here, pressure) through a Legendre transform and to explain why variational principles arise in statistical mechanics [63,88]. As mentioned in the introduction, large deviations theory is the common theoretical principle linking dynamic mean-field theory, the maximum entropy principle, maximum likelihood, and Thermodynamic Formalism, although this link has not been studied in detail, to our best knowledge.
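For a finite-state Markov chain, the Gärtner–Ellis recipe can be carried out numerically: the SCGF of $A_n(f)$ is the logarithm of the leading eigenvalue of a “tilted” transition matrix (a standard large deviations result for additive functionals of ergodic finite chains), and the rate function follows by a numerical Legendre transform. The Python sketch below is our own illustration, with an assumed two-state chain and observable; it is not taken from the article.

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])              # illustrative 2-state transition matrix
f = np.array([0.0, 1.0])                # observable: indicator of state 1

def scgf(k):
    """Gamma_f(k) = log of the leading eigenvalue of the tilted matrix
    P_k(y, x) = P(y, x) * exp(k * f(x))."""
    tilted = P * np.exp(k * f)[None, :]
    return np.log(np.max(np.abs(np.linalg.eigvals(tilted))))

def rate_function(s, k_grid=np.linspace(-20, 20, 4001)):
    """Numerical Legendre transform I_f(s) = max_k [k s - Gamma_f(k)]."""
    return np.max(k_grid * s - np.array([scgf(k) for k in k_grid]))

# The rate function vanishes at the stationary mean of f and is positive elsewhere.
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi /= pi.sum()
s_star = float(pi @ f)
print(s_star, rate_function(s_star))    # ~0.167 and ~0
print(rate_function(0.5))               # > 0: atypical averages are exponentially rare
```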

2.4. Potentials of Range One

A specific case where the variational principle (4) holds is when the potential has the form (5). Then, equilibrium states are probability distributions $\mu_\lambda$ that maximise the entropy (3) under constraints on the expected values of $K$ observables, $\mathbb{E}_{\mu_\lambda}(f_k) := \sum_x f_k(x)\,\mu_\lambda(x) = C_k$, for $k = 1,\ldots,K$. This problem can be solved by introducing a Lagrange multiplier $\lambda_k$ for each constraint in the potential (5):
$$F(U_\lambda) := \max_p\left\{ H(p) + \mathbb{E}_p(U_\lambda) \right\} = H(\mu_\lambda) + \mathbb{E}_{\mu_\lambda}(U_\lambda).$$
There exists a unique maximum entropy distribution $\mu_\lambda$ (equilibrium state) satisfying the constraints. It turns out that the maximising distribution can be explicitly found for range-one potentials, and it satisfies the Gibbs property (1), which, in this particular case, reduces to
$$\mu_\lambda(x) = \exp\left( U_\lambda(x) - F(U_\lambda)\right) = \frac{\exp U_\lambda(x)}{Z},$$
for all $x\in X$. Equation (13) is obtained by considering $F(U_\lambda) = \log Z$, where $Z$ is the “partition function”. From Equation (12), the expression for the entropy (3), and the Jensen inequality, one can obtain the formula for the pressure in this case:
$$F(U_\lambda) = \log\sum_{x\in X} e^{U_\lambda(x)}.$$
The constrained problem can be uniquely solved, because the map $\lambda \mapsto \mathbb{E}_{\mu_\lambda}(U)$ maps the real line monotonically onto the interval $(\min U, \max U)$ [76].
For range-one potentials, the measure of a block becomes a product distribution, as given by:
$$\mu_\lambda\left([x_{0,n-1}]\right) = \prod_{i=0}^{n-1} \frac{\exp U_\lambda(x_i)}{Z}.$$
As the index n corresponds to time, having an interaction depending on one single coordinate implies that configurations at distinct times are independent.
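The maximum entropy problem of this subsection can be solved numerically by exploiting the convexity of the pressure: since $\partial F(U_\lambda)/\partial\lambda_k = \mathbb{E}_{\mu_\lambda}(f_k)$, matching the constraints amounts to minimising $F(U_\lambda) - \sum_k\lambda_k C_k$ by gradient descent. The sketch below is our own minimal illustration on three hypothetical binary units with assumed target averages; it is not the fitting procedure used in the article.

```python
import numpy as np
from itertools import product

N = 3
states = np.array(list(product([0, 1], repeat=N)))       # all 2^N range-one configurations

def features(x):
    """Observables f_k: three firing rates and one synchronous pair (assumed choice)."""
    return np.array([x[0], x[1], x[2], x[0] * x[1]], dtype=float)

F_mat = np.array([features(x) for x in states])          # shape (2^N, K)
C = np.array([0.3, 0.2, 0.4, 0.1])                       # assumed target averages C_k

lam = np.zeros(F_mat.shape[1])
for _ in range(5000):
    w = np.exp(F_mat @ lam)
    mu = w / w.sum()                                     # Gibbs form exp(U_lambda) / Z
    grad = F_mat.T @ mu - C                              # E_mu(f_k) - C_k
    lam -= 0.5 * grad                                    # descend on F(U_lambda) - lambda . C

w = np.exp(F_mat @ lam)
mu = w / w.sum()
print("lambda:", np.round(lam, 3))
print("fitted averages:", np.round(F_mat.T @ mu, 3))     # matches the targets C
```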

2.5. Finite Range Potentials

Equation (9) can be used to find the unique Markov measure associated with a finite range potential. As an example, consider a potential $U$ of range two representing pairwise interactions in a graph with incidence matrix $I$. The entries $I(y,x) = 1$ represent the allowed transitions between symbols $y \to x$, and $I(y,x) = 0$ the forbidden ones. We introduce the finite $A\times A$ transfer matrix $L_U$, which corresponds to the RPF “operator” (8) restricted to a finite space:
$$L_U(y,x) = I(y,x)\, e^{U(y)}, \qquad y,x\in A,\quad y\in\sigma^{-1}x.$$
As anticipated in Section 2.3.2, calling $\rho$ the unique maximal positive eigenvalue of $L_U$ guaranteed by the Perron–Frobenius theorem, and $R(x)$ and $L(x)$ the $x$-th entries of the right and left eigenvectors associated with $\rho$, respectively, we define a normalised potential $\phi(y,x) = U(y) + \log R(y) - \log R(x) - \log\rho$, such that the matrix
$$P(y,x) = I(y,x)\, e^{\phi(y,x)} = I(y,x)\,\frac{R(y)\, e^{U(y)}}{\rho\, R(x)}$$
is stochastic, i.e., $\sum_x P(y,x) = 1$, and represents the transition probabilities of a Markov chain, $P(y,x) = P(x\mid y)$. The invariant measure $p$ associated to the matrix $P$, satisfying $pP = p$, is
$$p(x) = \frac{R(x)\,L(x)}{\langle R, L\rangle},$$
where $\langle R, L\rangle = \sum_x R(x)L(x)$. Note that the normalisation is done without defining a partition function. The Markov measure $\mu_{(p,P)}$ of a block is given by $\mu[x_{0,n}] = p(x_0)\,P(x_0,x_1)\cdots P(x_{n-1},x_n)$ for $x_k\in A$, $k = 0,\ldots,n$. Here, we have a nice way to show that this measure satisfies the Gibbs property, using the Markov property $\mu[x_{1,n}\mid x_0] = e^{\sum_{k=1}^{n}\log P(x_{k-1},x_k)}$, where we see that the conditioning upon the first time plays a role similar to left boundary conditions in statistical physics.
It follows from (9) and (17) that $\mu[x_{0,n}]$ obeys the variational principle and satisfies Equation (1), where $F(U) = \log\rho$. The Gibbs measure gives an exponential weight to each cylinder set, depending on the “energy” of its sub-blocks.
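A minimal numerical sketch of this construction is given below (our own illustration, not code from the article). It uses the standard matrix convention, in which $R$ is the right eigenvector of $L_U$ in the matrix sense; depending on whether one adopts the operator or the matrix convention for the transfer operator, the roles of the right and left eigenvectors are exchanged, so the normalised matrix in the sketch reads $P(y,x) = L_U(y,x)R(x)/(\rho R(y))$. The alphabet, incidence matrix, and potential values are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
I = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]], dtype=float)        # incidence matrix: allowed transitions y -> x
U = rng.normal(size=3)                        # potential attached to the symbol y (illustrative)

L_U = I * np.exp(U)[:, None]                  # transfer matrix L_U(y, x) = I(y, x) e^{U(y)}

evals, evecs = np.linalg.eig(L_U)
k = np.argmax(evals.real)
rho = evals[k].real                           # Perron eigenvalue; pressure F(U) = log(rho)
R = np.abs(evecs[:, k].real)                  # right eigenvector (positive by Perron-Frobenius)
wl, vl = np.linalg.eig(L_U.T)
Lv = np.abs(vl[:, np.argmax(wl.real)].real)   # left eigenvector

P = L_U * R[None, :] / (rho * R[:, None])     # normalised without any partition function
p = R * Lv / (R @ Lv)                         # invariant measure p(x) = R(x) L(x) / <R, L>

print(P.sum(axis=1))                          # each row sums to 1 (stochastic matrix)
print(np.allclose(p @ P, p))                  # p is invariant: p P = p
print("pressure F(U) =", np.log(rho))
```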

2.6. Example

To illustrate the maximum entropy principle and the statistical analysis that can be performed using tools and ideas from Thermodynamic Formalism, we include here a toy example. Consider the state space of all binary blocks of size $2\times 2$ and one-step transitions between them. We associate to each block an integer (23), and, indexing a matrix by this representation of blocks, we build the RPF matrix (15). There are allowed and forbidden transitions, as explained in Section 3.3.1 (see Figure 3). Assume that we obtain from data ($T$ samples) the empirical average values of the observables $A_T(x_0^1\cdot x_1^2) = 0.1$ and $A_T(x_1^1\cdot x_0^2) = 0.4$, and we want to find the maximum entropy Markov chain compatible with these constraints. Using Equations (6), (15) and (16), we obtain the maximum entropy Markov chain, defined by the following Markov transition matrix (rows and columns are indexed by the integers $0$–$15$ encoding the blocks):
$$\begin{pmatrix}
0.16 & 0.04 & 0.64 & 0.16 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.64 & 0.16 & 0.16 & 0.04 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.04 & 0.16 & 0.16 & 0.64 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.16 & 0.64 & 0.04 & 0.16 \\
0.16 & 0.04 & 0.64 & 0.16 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.64 & 0.16 & 0.16 & 0.04 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.04 & 0.16 & 0.16 & 0.64 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.16 & 0.64 & 0.04 & 0.16 \\
0.16 & 0.04 & 0.64 & 0.16 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.64 & 0.16 & 0.16 & 0.04 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.04 & 0.16 & 0.16 & 0.64 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.16 & 0.64 & 0.04 & 0.16 \\
0.16 & 0.04 & 0.64 & 0.16 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.64 & 0.16 & 0.16 & 0.04 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.04 & 0.16 & 0.16 & 0.64 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.16 & 0.64 & 0.04 & 0.16
\end{pmatrix}$$
From this Markov transition matrix, we can compute the fluctuations associated to each observable, either using numerical simulations or analytically. We illustrate in Figure 1 the limit theorems and fluctuations introduced in Section 2, applied to this example.
The entropy maximisation for this toy example can be explicitly solved, and the simulations can also be performed directly from the transition matrix. However, large-scale networks require sophisticated Monte Carlo sampling methods to fit maximum entropy models that include non-synchronous interactions [89]. In the first column of Figure 1, we directly sample from the Markov transition matrix for different sample sizes and average the empirical frequency of both observables considered in the toy example. In the second column, we plot the fitted Gaussian distributions of the empirical averages for different sample sizes. The third column corresponds to the large deviations rate function. As explained in Section 2.3.4, the second derivative at the minimum of $I_f$ characterises the Gaussian fluctuations around the expected value of $f$. The last column represents the auto-correlations (7).
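The sketch below (ours, for illustration) rebuilds the transition matrix above from its repeating row pattern, samples a long trajectory from it, and estimates the empirical average and autocorrelations of a simple observable on the states. The observable is the indicator of state 0, chosen only for illustration; recovering the two observables of the toy example would additionally require the block-to-integer encoding of Section 3.3.1.

```python
import numpy as np

# Row r has its nonzero 4-entry block at columns 4*(r mod 4).
row_blocks = {0: [0.16, 0.04, 0.64, 0.16],
              1: [0.64, 0.16, 0.16, 0.04],
              2: [0.04, 0.16, 0.16, 0.64],
              3: [0.16, 0.64, 0.04, 0.16]}
P = np.zeros((16, 16))
for r in range(16):
    c = 4 * (r % 4)
    P[r, c:c + 4] = row_blocks[r % 4]

rng = np.random.default_rng(0)
T = 50_000
traj = np.empty(T, dtype=int)
traj[0] = 0
for t in range(1, T):
    traj[t] = rng.choice(16, p=P[traj[t - 1]])

f = (traj == 0).astype(float)                 # illustrative observable on the states
print("empirical average:", f.mean())
for lag in range(1, 6):                       # empirical autocorrelation C_f(lag)
    print(lag, np.mean(f[:-lag] * f[lag:]) - f.mean() ** 2)
```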

2.7. Systems with Infinite Range Potentials, Chains with Infinite Memory and Gibbs Distributions

In this section, we somewhat depart from the strict setting of Thermodynamic Formalism, switching to the perspective of Markov chains and their extension to infinite memory. Although Thermodynamic Formalism allows one to consider infinite memory (infinite-range potentials), the advantage of the approach presented here is that it allows considering non-stationary dynamics, i.e., escaping from the variational principle (4), which is constrained by the entropy definition and thus requires stationarity.
A general class of stochastic processes dealing with infinite memory is the class of chains with complete connections [90,91]. These chains define non-Markovian processes. However, Markovian approximations are possible and useful [92]. This section closely follows [90].
Definition 1.
A system of transition probabilities is a family $\{P_n\}_{n\in\mathbb{Z}}$ of functions $P_n(\cdot\mid\cdot) : A\times A^{-\infty,n-1}\to[0,1]$, such that the following conditions hold for every $n\in\mathbb{Z}$:
(a) For every $x_n\in A$, the function $P_n\left(x_n\mid\cdot\right)$ is measurable with respect to the filtration $\mathcal{F}_{n-1}$.
(b) For every $x_{-\infty,n-1}\in A^{-\infty,n-1}$,
$$\sum_{x_n\in A} P_n\left(x_n\mid x_{-\infty,n-1}\right) = 1.$$
Definition 2.
A probability measure $\mu$ in $\mathcal{P}\left(A^{-\infty,n},\mathcal{F}\right)$ is consistent with a system of transition probabilities $\{P_n\}_{n\in\mathbb{Z}}$ if:
$$\int h\left(x_{-\infty,n}\right)\mu(dx) = \int \sum_{x_n\in A} h\left(x_{-\infty,n-1}\,x_n\right) P_n\left(x_n\mid x_{-\infty,n-1}\right)\mu(dx),$$
for all $n\in\mathbb{Z}$ and all $\mathcal{F}_n$-measurable functions $h$. The probability measure $\mu$, when it exists, is called a chain with complete connections consistent with the system of transition probabilities $\{P_n\}_{n\in\mathbb{Z}}$. It is possible that multiple measures are consistent with the same system of transition probabilities.
We now give conditions ensuring the existence and uniqueness of a probability measure consistent with the system of transition probabilities [90].
Theorem 1.
A system of continuous transition probabilities (i.e., $\mathrm{var}_m\left[P_n\left(x_n\mid\cdot\right)\right]\to 0$ as $m\to+\infty$) on a compact space has at least one probability measure consistent with it.
Definition 3.
A system of transition probabilities is non-null if, for all $n\in\mathbb{Z}$ and all $x_{-\infty,n}\in A^{-\infty,n}$:
$$P\left(x_n\mid x_{-\infty,n-1}\right) > 0.$$
Definition 4.
A normalized potential has bounded squared variations if, for all $n\in\mathbb{Z}$ and all $x_{-\infty,n}\in A^{-\infty,n}$:
$$\sum_{k\geq 0}\mathrm{var}_k^2\left[\log P\left(x_n\mid x_{-\infty,n-1}\right)\right] < +\infty.$$
There exists a unique probability measure consistent with the system of transition probabilities if these are non-null and the associated normalised potential has bounded squared variations [90].
There is a mathematically well-founded correspondence between chains with complete connections and Gibbs distributions presented up to now [7,90,93]. Let us now discuss the formal analogy.
Define the function $\phi(n,x)$, of the time index $n\in\mathbb{Z}$ and the sequence $x$, by:
$$\phi(n,x) \equiv \log P\left(x_n\mid x_{-\infty,n-1}\right),$$
and:
$$\Phi(m,n,x) = \sum_{r=m}^{n}\phi(r,x).$$
Then:
$$P\left(x_{m,n}\mid x_{-\infty,m-1}\right) = e^{\Phi(m,n,x)} = e^{\sum_{r=m}^{n}\phi(r,x)},$$
and:
$$\mu\left(\left[x_{m,n}\right]\right) = \int_{A^{-\infty,m-1}} e^{\Phi(m,n,x)}\,\mu(dx).$$
These last equations highlight the connection between chains with complete connections and Gibbs distributions in statistical physics. Indeed, the conditional probability $P\left(x_{m,n}\mid x_{-\infty,m-1}\right)$ has a “Gibbs” form where $\phi$ acts as an “energy” [90]. The correspondence is obtained considering “time” as a one-dimensional “lattice” and the “boundary conditions” as the past of the stochastic process. In contrast to statistical physics, there is no need to define a partition function (the potential is defined via transition probabilities, and is thus normalised).
While, for chains with complete connections defined through transition probabilities, the present is conditioned upon the past, Gibbs distributions, in general, also allow for conditioning “upon the future”. More generally, Gibbs distributions in statistical physics extend to probability distributions on Z d where the probability to observe a certain configuration of spins in a restricted region of space is constrained by the configuration at the boundaries of this region. Therefore, they are defined in terms of specifications [7,93], which determine finite-volume conditional probabilities when the exterior of the volume is known. In one spatial dimension ( d = 1 ), identifying Z with a time axis, this corresponds to conditioning both in the past and in the future. In contrast, families of transition probabilities with an exponential continuity rate define the so-called left-interval specifications (LIS) [90,94]. This leads to nonequivalent notions of “Gibbsianness” [95].
In contrast to the potentials studied up to now, the potential (19) is defined from transition probabilities (18), which are not necessarily time-translation invariant. This is the reason why the potential is denoted $\phi(n,x)$: it depends explicitly on time $n$ and on the configuration $x$. This case is closer to the setting in statistical physics where the potential or energy is not necessarily invariant when moving along a lattice, and is therefore not constrained by the stationarity assumption made up to now. As we discuss in the next section, this is quite helpful in the study of neuronal network dynamics.
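As a toy illustration of a chain with complete connections (ours, not an example from the article), the sketch below simulates a binary chain whose transition probability depends on the whole past through an exponentially weighted sum; the probabilities are bounded away from 0 and 1 (non-nullness) and their variation decays exponentially, and, for simulation, the memory is truncated to a finite depth (a Markovian approximation).

```python
import numpy as np

def p_one(past, theta=0.5, a=1.0, b=-0.5):
    """P(x_n = 1 | past), with past[-1] the most recent symbol.
    Exponentially decaying weights give continuous transition probabilities."""
    weighted = sum(theta ** (k + 1) * s for k, s in enumerate(reversed(past)))
    return 1.0 / (1.0 + np.exp(-(a * weighted + b)))     # strictly inside (0, 1)

rng = np.random.default_rng(2)
T, memory = 10_000, 60                                    # truncation depth of the memory
x = [0]
for n in range(1, T):
    x.append(int(rng.random() < p_one(x[-memory:])))

print("empirical frequency of 1s:", np.mean(x))
```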

3. Thermodynamic Formalism in Neuroscience

In this section, we make the connection between Thermodynamic Formalism and spiking neuronal dynamics. From the standpoint of mathematics, there are at least two ways to consider spiking neuronal networks. First, they can be considered as biological objects whose activity can be experimentally recorded while using Multi-Electrode Arrays (MEA), often requiring sophisticated mathematical methods and algorithms for data analysis [96,97]. Second, neuronal networks are characterised by dynamical models, more or less derived from biophysics [35,36].
Here, we begin by considering the statistical analysis of spike trains recorded from neuronal networks. For this case, Thermodynamic Formalism provides a powerful and insightful method to analyse the spatio-temporal statistics of experimental spike trains. We briefly mention that this formalism has allowed us to develop algorithms for spike train analysis [49,89,98], leading to the software PRANAS [99], freely available at https://team.inria.fr/biovision/pranas-software/, although we do not develop along these lines in this paper. We then focus on a specific question: when dealing with a model of spiking neurons, how much do the intrinsic dynamics of neurons, their interaction via synapses, and the influence of stimuli constrain the collective spatio-temporal spike statistics?
Neurons are (nonlinear) entities that evolve in a concerted way (as they interact via synapses) and respond to external stimuli. The theoretical analysis of these high-dimensional systems can be made thanks to methods from mathematics (dynamical systems, bifurcation theory, stochastic processes, partial differential equations) or theoretical physics (statistical physics, nonlinear physics). Here, one might be interested in what Thermodynamic Formalism can contribute to the analysis of neuronal model dynamics. In this spirit, we consider two models, the Integrate and Fire model and the Galves–Löcherbach model. Most of our presentation focuses on stationary situations that are characterised by equilibrium states. Nevertheless, we also consider the extension of Thermodynamic Formalism to non-stationary situations.
At the end of the section, we address a couple of open questions.
  • What is the natural alphabet for spiking neuron dynamics? As we shall see, although the binary representation of spikes is a good candidate, it is too naive, as the relevant alphabet can be constructed on time blocks of spikes. A subsidiary question is about the size (time depth) of these blocks.
  • Under which conditions can Thermodynamic Formalism machinery be faithfully applied to a spiking neuronal network model?
  • What are the limits when the main theorems of Thermodynamic Formalism can and cannot be applied and what are the consequences for neuronal dynamics and spike statistics?

3.1. Statistics of Spike Trains and Gibbs Distribution

The human brain is composed of about a hundred billion neurons that mostly communicate among themselves using sequences of spikes, which are binary events (sub-threshold oscillations also play an important role [100], notably in organs like the retina, where most neurons do not spike). Although action potentials can vary in duration, amplitude, and shape, depending, e.g., on the type of neuron, they have a stereotyped shape, so that they can be considered as identical events. The main physiological reason for spike occurrence is that spikes can propagate information on different scales in the nervous system (centimeters to one meter) essentially without attenuation (active conduction, as opposed to passive, Ohmic, conduction). However, from a contemporary point of view, spikes are also considered as events constituting “bits” of information. In this paradigm, it is tempting to consider spike trains as objects containing a “neural code” [37], i.e., a language that neurons use to communicate and that one could decipher. This terminology should not be taken literally because, as opposed to computer codes, spike trains have a wide variability (e.g., the repetition of the same stimulus, even under controlled experimental conditions, does not induce the same sequence of spikes as a response). In addition, nothing guarantees that there is only one code. When considered from the perspective of Thermodynamic Formalism, the notion of neural code can have several meanings: (1) spike trains constitute a symbolic coding of the voltage dynamics (which depend on neuronal interactions and stimuli); and (2) the way neuronal dynamics (especially spike trains) are affected by stimuli provides a way for downstream networks to infer the stimulus (e.g., the retina encodes a visual scene in spike trains, which are decoded by the visual cortex, capable of reconstructing a representation of the visual scene). Here, we essentially want to address the following questions: how to use Thermodynamic Formalism to fit experimental (or numerically generated) spike train data, and which Gibbs distribution is produced by a network of neurons whose dynamics is known.
It is useful to consider spikes as instantaneous events (although their duration is about 1 ms) and to identify the maximum of the action potential time course as the “time of the spike” [101]. This implicitly assumes that one considers dynamics on time scales larger than one millisecond. The binary representation is obtained by sliding a window of constant “binning size” (of order 10–20 ms) over the continuous time course of membrane potentials and counting how many spikes there are per neuron within each time bin. Two or more spikes may occur within the same time bin; in that case, the convention is to consider these events as equivalent to just one spike. This procedure [102] transforms experimental data into sequences of binary patterns (see Figure 2), leading to the following symbolic description.
Denoting the discrete time index by $n$, the spike-state of neuron $k$ is denoted by $x_n^k\in\{0,1\}$, depending on whether the $k$-th neuron emits a spike during the $n$-th time bin or not. A spike pattern is the spike-state of all the neurons in a network of $N$ neurons at a given time bin, and it is denoted by $x_n := \left(x_n^k\right)_{k=1}^{N}$. A spike block, denoted by $x_{n,r} := x_n x_{n+1}\cdots x_r$, is a sequence of spike patterns. The length of the spike block $x_{n,r}$ is $r - n + 1$. A spike train, denoted by $x$, is the spike block representing the whole sequence of spike patterns. We consider spike trains of finite and infinite length. The set of all possible spike blocks of length $R$ in a network of $N$ neurons is denoted by $A_R^N$.
Thus, in comparison to the previous section, and especially Section 2.1, symbols here are spike blocks of length $R$. The alphabet, previously denoted $A$, is denoted here $A_R^N$, making explicit the dependence on the number of neurons, $N$, and the block depth $R$. It is important to make this dependence explicit, as we consider, later in this review, the effect of increasing $N$ and $R$.
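The binning and block notation can be made concrete with the short sketch below (our illustration of the procedure described above, with invented spike times and a 20 ms bin): spike times are mapped to a binary array $x_n^k$, several spikes in one bin count as one, and spike blocks of length $R$ are read off as windows of consecutive bins.

```python
import numpy as np

def bin_spikes(spike_times, n_neurons, t_max, bin_size=0.02):
    """Binary array x[n, k]: 1 if neuron k spikes in time bin n."""
    n_bins = int(np.ceil(t_max / bin_size))
    x = np.zeros((n_bins, n_neurons), dtype=int)
    for k, times in enumerate(spike_times):
        idx = np.floor(np.asarray(times) / bin_size).astype(int)
        x[idx[idx < n_bins], k] = 1       # two or more spikes in a bin count as one
    return x

def spike_blocks(x, R):
    """All spike blocks of length R, shape (n_bins - R + 1, R, N)."""
    return np.stack([x[n:n + R] for n in range(len(x) - R + 1)])

spike_times = [[0.013, 0.051, 0.052, 0.170],   # neuron 1 (invented times, in seconds)
               [0.031, 0.090],                 # neuron 2
               [0.005, 0.178]]                 # neuron 3
x = bin_spikes(spike_times, n_neurons=3, t_max=0.2)
print(x)
print(spike_blocks(x, R=2).shape)
```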

3.2. Conditional Probabilities for Spike Trains

The probability that a biological neuron, embedded in a network, emits a spike in a given time bin depends on the history of all variables determining the evolution of the neural network (voltage, conductances, concentrations of ions, neurotransmitters, etc.). Most of these variables are not experimentally accessible. Even if they were, there would be no hope of predicting, from this huge amount of information, the statistics of spikes. Dealing with neuronal models, the situation is simpler as there are fewer variables to control and their dynamics are known explicitly. However, even in this case, it is generally not possible to access spike statistics from dynamics. A simplification is to consider that the probability of a spike pattern only depends on the spikes emitted in the past by the network. This way, one can ignore the hidden dynamics of inaccessible variables and compute the statistics from what can be measured. Still, characterising the probability of a spike pattern given the history of the system is generally out of reach (with, at least, two exceptions described in the next section).
The idea is to characterise the spike train statistics through a family of transition probabilities of the form:
$$P\left(x_{n+1}\mid x_{n-R,n}\right) \equiv e^{\phi\left(x_{n-R,n+1}\right)} > 0,$$
where $R$ is the memory of the spike sequence, i.e., the time horizon on which the present depends on the past. Having these transition probabilities and an initial condition (or initial distribution), one can define a Markov chain (or a chain with infinite memory if $R\to+\infty$). It is possible, for some models, to explicitly write these probabilities. In Equation (20), we have assumed that all transition probabilities are strictly positive. This is necessary to ensure the uniqueness of the corresponding Gibbs distribution. Subsequently, one can associate to (20) a range-$R$ potential $\phi\left(x_{n-R,n+1}\right) > -\infty$. On experimental grounds, the problem is to estimate these probabilities from data. Since there are $2^{NR}$ possible spike blocks, for $N$ and/or $R$ big (e.g., $NR > 20$), it rapidly becomes impossible to estimate these transition probabilities from experimental data using a frequentist approach, as most of these transitions do not even occur within the finite experimental sample.
However, one can try to guess the form of these transition probabilities. One possibility is to start with an ad hoc form, capturing the main features of neuronal dynamics. A canonical example is the Generalised Linear Model (GLM) [103], where the transition probabilities take the form:
$$P_n\left(x_{n+1}^i\mid x_{n-R,n}\right) = f\left( b_i + \sum_{j\in B_i} \left(H_{ij} * r_j\right)(n) \right),$$
where $f$ is a non-linear function. The term $b_i$ is a constant fixing the baseline firing rate of neuron $i$, $H_{ij}$ is the memory kernel, $*$ is the convolution product, and $r_j(n)$ is the spike train of neuron $j$ before time $n$. In this case, the memory kernel considers the spikes between $n-R$ and $n$, but $R$ can go arbitrarily far into the past. Here, we do not consider the influence of a time-dependent external stimulus.
As we show in Section 4.3, this form can be established for discrete-time Integrate and Fire models. Equation (21) gives an Ansatz for the marginal probability that neuron $i$ spikes at time $n+1$, given the history of the network. Equation (20) provides the joint probability of having the spike pattern at time $n+1$, and is obtained assuming that neurons are conditionally independent. This can be justified if one assumes that, due to synaptic transmission and delays, neurons do not have the time to interact within one time bin. This means that time bins must not be too large and/or synapses must not be too fast (like gap junctions [104]).
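A minimal simulator in the spirit of the Ansatz (21) is sketched below (our own illustration; the logistic nonlinearity, exponential kernels, and coupling values are assumptions, not the GLM fitted in the cited works): each neuron spikes at time $n+1$ with a probability obtained from its baseline plus the convolution of a memory kernel with the recent spike history, and neurons are drawn conditionally independently given that history.

```python
import numpy as np

rng = np.random.default_rng(3)
N, R, T = 4, 10, 5_000
b = np.full(N, -2.0)                                   # baselines b_i
W = 0.4 * rng.normal(size=(N, N))                      # coupling amplitudes (assumed)
kernel = np.exp(-np.arange(1, R + 1) / 3.0)            # assumed memory kernel over R bins

def spike_probabilities(history):
    """history: array (R, N) of the last R spike patterns, most recent row last."""
    drive = history[::-1].T @ kernel                   # kernel-weighted sum over the past
    return 1.0 / (1.0 + np.exp(-(b + W @ drive)))      # logistic nonlinearity f

x = np.zeros((T, N), dtype=int)
for n in range(R, T - 1):
    p = spike_probabilities(x[n - R + 1:n + 1])
    x[n + 1] = rng.random(N) < p                       # conditionally independent given the past

print("firing rates:", x.mean(axis=0))
```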
Remark 3.
The GLM, instead of describing conditional probabilities, characterises the spike rate or conditional intensity of an auto-regressive Poisson process.
A second approach is based on the variational principle (14), maximising entropy under constraints. Both of the approaches can be addressed from the perspective of Thermodynamic Formalism.

3.3. The Hammersley–Clifford Theorem

In our representation spikes take a binary value 0 or 1. Thus, any potential of range R is a function taking a finite set of values. A general theorem from Hammersley and Clifford [105,106] states that any range-R observable, in particular, the potential ϕ x n R , n , can be written in the form
$$\phi\left(x_{n-R,n}\right) = \sum_{l} \phi_l\, m_l\left(x_{n-R,n}\right), \tag{22}$$
where the coefficients $\phi_l$ correspond to the decomposition of $\phi$ in the space of finite range-$R$ observables. This is analogous to Equation (5), with two main differences. First, the linear combination in (5) is used as an example making a link with Thermodynamics and the Maximum Entropy principle, whereas the decomposition (22) is a systematic expansion of any potential of range $R$ defined over spike sequences. Second, in contrast to (5), the observables considered here (denoted $f_k$ in (5)) take only finitely many values.
Equation (22) is a linear decomposition on a basis of eigenfunctions referred to from now on as monomials [107]. They have the form:
$$m_l\left(x_{n-R,n}\right) = \prod_{k=1}^{d} x_{i_k}^{n_k},$$
where $n_k \in \{1,\dots,N\}$ is a neuron index and $i_k \in \{n-R,\dots,n\}$ a time index. Thus, $m_l\left(x_{n-R,n}\right) = 1$ if and only if, in the spike sequence $x_{n-R,n}$, neuron $n_k$ spikes at time $i_k$ for all $k = 1,\dots,d$; otherwise, $m_l\left(x_{n-R,n}\right) = 0$. The number $d$ is the degree of the monomial. Degree-one monomials have the form $x_{i_1}^{n_1}$, taking the value 1 if and only if neuron $n_1$ spikes at time $i_1$. Degree-two monomials have the form $x_{i_1}^{n_1} x_{i_2}^{n_2}$, taking the value 1 if and only if neuron $n_1$ spikes at time $i_1$ and neuron $n_2$ spikes at time $i_2$, and so on. Thus, monomials provide a notion of spike interaction, similar to spin interactions in magnetic systems. For example, monomials of degree two correspond to pairwise interactions, as, e.g., in an Ising model. In contrast to the Ising model, the interactions considered here may involve time delays between spikes.
There are $2^{NR}$ monomials for $N$ neurons and a given range $R$. One can index them by an integer $l$ in one-to-one correspondence with the set of pairs $(i_k, n_k)$. The advantage of the monomial representation is that it focuses on spike events, which is natural for spiking neuronal dynamics. Thus, the Hammersley–Clifford decomposition gives a canonical way to write any range-$R$ potential as a linear combination of monomials of maximum degree $R$. This includes the GLM potential, which can be embedded in the same framework [104].
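As an illustration of the decomposition (22), the following Python sketch (function names are ours) evaluates monomials, specified as lists of (neuron, time) pairs, on spike blocks, and estimates their empirical averages from a raster.

```python
import numpy as np

def monomial(block, pairs):
    """Value of the monomial prod_k x_{i_k}^{n_k} on one spike block.

    block : (N, R) binary array; block[n_k, i_k] is the spike state of neuron n_k
            at relative time i_k within the block.
    pairs : list of (neuron, time) index pairs defining the monomial (its degree is len(pairs)).
    """
    return int(all(block[n_k, i_k] == 1 for (n_k, i_k) in pairs))

def empirical_average(raster, pairs, R):
    """Empirical average of a monomial over all blocks of depth R in a raster of shape (N, T)."""
    N, T = raster.shape
    values = [monomial(raster[:, t:t + R], pairs) for t in range(T - R + 1)]
    return float(np.mean(values))

# Toy usage: firing rate of neuron 0 (degree 1) and a delayed pairwise interaction
# (neuron 0 at relative time 0, neuron 1 two bins later), with block depth R = 3.
rng = np.random.default_rng(1)
raster = (rng.random((4, 1000)) < 0.2).astype(int)          # 4 neurons, 1000 time bins
print(empirical_average(raster, [(0, 0)], R=3))             # close to the rate of neuron 0
print(empirical_average(raster, [(0, 0), (1, 2)], R=3))     # delayed pairwise term
```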
As emphasised above, the Hammersley–Clifford decomposition is analogous to the expression of thermodynamic potentials as a sum of products of an intensive quantity (e.g., temperature) with an extensive one (e.g., the energy). Depending on the physical constraints of the problem, one defines a thermodynamic ensemble, where the average value of extensive quantities (energy, number of particles, volume, magnetisation) is prescribed. Whereas first principles allow one to guess the form of the potential in thermodynamics, there is no such recipe in neuroscience. Moreover, one cannot use the complete expansion (22) on practical grounds, simply because large-degree monomials have a vanishing empirical probability. More precisely, the average value of a monomial of degree $d$ decays exponentially fast with $d$. This leads to two problems. (i) How can one determine (from data) the constraints that are necessary to correctly characterise the spike train statistics? (ii) Are there constraints that are equivalent? Problem (i) can be addressed in the context of information geometry [108], while (ii) can be approached using cohomology [107]. We do not further develop these aspects here, but rather refer the reader to the cited articles.
The variational principle (4) (or its finite version (12)) provides a systematic way of inferring Gibbs distributions from empirical average values of spike interactions (monomials). We make the construction explicit in the next subsections.

3.3.1. Finite Memory, Markov Chains and Gibbs Distributions

We now explicitly show how to build a Gibbs measure from a finite set of experimental averages taken as constraints of the maximum entropy variational problem. We assume that these constraints involve events (monomials) over a memory depth $R$. We build the corresponding Markov chain using the material of Section 2.3.2. We associate to each spike block $x_{n,n+R-1}$ an integer $w_n$,
$$w_n = \sum_{r=0}^{R-1} \sum_{k=1}^{N} 2^{\,k-1+Nr}\, x_{n+r}^{k},$$
and we identify $w_n$ with the block $x_{n,n+R-1}$. In this way, a sequence of spike patterns (spike blocks) can be encoded as a sequence of integers, which define the alphabet. Next, we define the incidence matrix ("grammar") between symbols of the alphabet. Not all transitions between symbols are legal, or allowed. A transition between two symbols, denoted $w_n \to w_{n+1}$, is legal if the corresponding blocks overlap consistently, i.e., they have the sub-block $x_{n+1,n+R-1}$ in common (see Figure 3).
This defines an incidence matrix: $I(w, w') = 1$ if the transition from symbol $w$ to symbol $w'$ is legal, and $0$ otherwise. The incidence matrix defines the grammar of allowed and forbidden words (sequences of symbols). From this incidence matrix, we define the Perron–Frobenius transfer matrix $\mathcal{L}_\psi$ in the same way as in (15). In order to obtain the unique Markov transition matrix of maximum entropy, we follow the procedure explained in Section 2.3.2.
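For small $N$ and $R$, the encoding of spike blocks into integers and the incidence matrix ("grammar") can be written out explicitly. The following is a minimal Python sketch (the function names are ours; 0-based indices are used, so the exponent $k-1+Nr$ above becomes $k+Nr$).

```python
import numpy as np
from itertools import product

def block_to_int(block):
    """Integer w associated with a spike block of shape (N, R),
    following w = sum_{r,k} 2^(k + N r) block[k, r] (0-based indices)."""
    N, R = block.shape
    return int(sum(int(block[k, r]) << (k + N * r) for k in range(N) for r in range(R)))

def int_to_block(w, N, R):
    """Inverse encoding: recover the (N, R) spike block from the integer w."""
    bits = [(w >> i) & 1 for i in range(N * R)]
    return np.array(bits).reshape(R, N).T      # block[k, r]: neuron k, relative time r

def incidence_matrix(N, R):
    """I(w, w') = 1 iff the two blocks overlap on their common R - 1 spike patterns."""
    M = 2 ** (N * R)
    I = np.zeros((M, M), dtype=int)
    for w, wp in product(range(M), repeat=2):
        b, bp = int_to_block(w, N, R), int_to_block(wp, N, R)
        if np.array_equal(b[:, 1:], bp[:, :-1]):
            I[w, wp] = 1
    return I

I = incidence_matrix(N=2, R=2)        # 16 x 16 grammar for 2 neurons, memory depth 2
print(I.sum(axis=1))                  # each block has 2^N = 4 legal successors
```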
Thus, for a given choice of monomials, we associate a potential of the form (22), where the $\lambda_l$'s that do not correspond to a chosen monomial are set to 0. Subsequently, one computes the empirical averages of the chosen monomials from the data. From the Perron–Frobenius theorem, there is a unique Gibbs measure, the Markov measure $\mu_\lambda$ of the Markov chain, which solves the variational problem (14), giving a statistical model of the data that minimises the KL divergence between the empirical measure and $\mu_\lambda$. Here, $\lambda$ is the set of parameters $\lambda_l$ achieving the variational principle. These parameters can be numerically computed, either by using the explicit form of the measure (17) [49] or by using Monte Carlo methods [89]. The software PRANAS handles spike train statistics by numerically computing the Gibbs distribution, solving (14) for up to 100 neurons [99].
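To make this construction concrete, here is a purely illustrative Python sketch (not PRANAS, and not the algorithms of [49,89]) that builds a transfer matrix for a toy potential with two monomial constraints, computes the pressure as the logarithm of its leading eigenvalue, and adjusts the $\lambda_l$ by gradient descent so that the model averages (the derivatives of the pressure) match prescribed empirical averages. The convention $e^{\phi(w,w')}$ on legal transitions is one common choice standing in for Equation (15); all names and numbers are ours.

```python
import numpy as np
from itertools import product

# Small example: N = 2 neurons, memory depth R = 2, so the alphabet has 2^(N R) = 16 symbols.
N, R = 2, 2
M = 2 ** (N * R)

def int_to_block(w):
    bits = [(w >> i) & 1 for i in range(N * R)]
    return np.array(bits).reshape(R, N).T              # block[k, r]: neuron k, relative time r

def legal(w, wp):
    b, bp = int_to_block(w), int_to_block(wp)
    return np.array_equal(b[:, 1:], bp[:, :-1])        # blocks overlap on R - 1 patterns

# Two hypothetical constraints (monomials) evaluated on a transition w -> w':
# the firing of neuron 0 and the synchronous pair (0, 1) in the newest spike pattern.
def monomials(w, wp):
    newest = int_to_block(wp)[:, -1]
    return np.array([newest[0], newest[0] * newest[1]])

def pressure(lam):
    """log of the leading eigenvalue of L(w, w') = I(w, w') exp(sum_l lam_l m_l(w, w'))."""
    L = np.zeros((M, M))
    for w, wp in product(range(M), repeat=2):
        if legal(w, wp):
            L[w, wp] = np.exp(np.dot(lam, monomials(w, wp)))
    return np.log(np.max(np.real(np.linalg.eigvals(L))))

# The gradient of the pressure equals the model averages of the monomials, so matching the
# empirical averages amounts to gradient descent on the convex function F(lam) - lam . target.
target = np.array([0.30, 0.10])                        # illustrative empirical averages
lam, eps = np.zeros(2), 1e-4
for _ in range(200):
    model_avg = np.array([(pressure(lam + eps * e) - pressure(lam - eps * e)) / (2 * eps)
                          for e in np.eye(2)])
    lam += target - model_avg
print(lam, model_avg)                                  # fitted parameters and model averages
```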

3.3.2. Spectral Gap and Thermodynamic Limit

For a potential of finite range $R$ and a finite number of neurons, provided $\lambda_l > -\infty$ for all $l$ in (22), the Perron–Frobenius theorem guarantees the uniqueness of the Gibbs measure $\mu_\lambda$ solving the variational principle (14). Moreover, the pressure, being a real analytic function for finite-range potentials, is infinitely differentiable with respect to the parameters, and there is an exponential decay of correlations with respect to time. This last aspect is due to the gap in the spectrum of the transfer matrix (15). These properties may not hold if either $R \to +\infty$ or $N \to +\infty$ (corresponding to a thermodynamic limit), where the potential may lose regularity. Here, one has to consider Thermodynamic Formalism in infinite dimension, on the functional space of continuous functions. The case $R \to +\infty$ corresponds to a potential with infinite range associated, in our case, to spike statistics with infinite memory. This is discussed in the next section and in Section 4, where we show that neuronal models can have such an unbounded memory. More generally, the limits $N \to +\infty$ or $R \to +\infty$ can induce important effects, such as phase transitions, which will be commented upon further in the discussion section.

4. Spiking Neuronal Network Models: The Leaky Integrate-and-Fire and Beyond

There are many models for neuronal dynamics, both at the level of individual neurons and of neuronal networks [35,36,109]. Here, we consider a canonical example of such a model, the Integrate-and-Fire model, which was the first model proposed to the scientific community, introduced by Lapicque in 1907 [110]. The main interest is to make a clear link between dynamics, coding, and spikes, paving the way to use Thermodynamic Formalism in order to analyse the spike train statistics.

4.1. Dynamics and Spikes

A fundamental equation in neuronal membrane potential dynamics is the conservation of electric charge, written in its most canonical form, as follows:
$$C \frac{dV}{dt} = -\sum_{X} g_X \left(V - V_X\right) + I(t),$$
where $C$ is the membrane capacitance of the neuron and $V$ is its membrane potential. The sum over $X$ runs over ionic currents of the form $i_X = g_X (V - V_X)$, involving specific ionic channels permeable to specific ions (e.g., Na$^+$, K$^+$, Cl$^-$, Ca$^{2+}$). Here, we include the neuron's intrinsic currents (e.g., the sodium and potassium currents triggering a spike [111]) and the synaptic currents [109]. The conductance $g_X$ of channels of type $X$ depends, in general, non-linearly on activation variables, themselves dependent on the voltage. $V_X$ is the Nernst reversal potential, i.e., the value of the membrane potential at which the current $i_X$ reverses its direction. Finally, $I(t)$ is an external current that can mimic, e.g., an injection by an electrode or an external stimulus.
In its simplest form, for a single isolated neuron, this equation takes the form:
$$C \frac{dV}{dt} = -\frac{1}{R} V + I(t), \tag{24}$$
where $R$ is the membrane resistance and the term $g = \frac{1}{R}$ corresponds to a unique passive conductance. In this case, we consider only a leak current, whose reversal potential is set to 0. This is the equation of an RC circuit, which is quite simple, but quite far from a real neuron, as this equation does not even produce spikes. To circumvent this problem, one introduces a threshold, $\theta$, such that Equation (24) holds whenever $V(t) < \theta$ (sub-threshold dynamics), corresponding to the "integrate" phase. In contrast, for all times $t^{(r)}$ such that $V(t^{(r)}) = \theta$, two effects take place: (i) the membrane potential of the neuron is reset instantaneously to a rest value, here 0 without loss of generality; (ii) a spike is recorded at the time $t^{(r)}$, called a "spike time". This is the "fire" phase (see Figure 2B, bottom).
While this is a simple artificial way to generate spikes, there is a huge price to pay on mathematical grounds, because the threshold introduces a singularity set in the phase space where the dynamics are not differentiable. We develop this aspect below.
The generalisation of (24) to a network of $N$ neurons is straightforward. Adding the contribution of the synaptic currents $I^{(syn)}_k(t)$, which build the network interactions, we obtain:
$$C_k \frac{dV_k}{dt} + \frac{1}{R} V_k = I^{(syn)}_k(t) + S_k(t) + \sigma_B\, \xi_k(t), \quad \text{if } V_k(t) < \theta, \tag{25}$$
where C k is the membrane capacitance of neuron k, V k , its membrane potential. The resistance R is assumed to be the same for all neurons. In the synaptic current,
$$I^{(syn)}_k(t) = \sum_{j, r} W_{kj}\, \alpha\!\left(t - t_j^{(r)}\right), \tag{26}$$
the parameter $W_{kj}$ represents the synaptic strength ("weight") from the pre-synaptic neuron $j$ to the post-synaptic neuron $k$ (see Figure 2B). Synaptic weights can be negative (inhibition) or positive (excitation). By convention, $W_{kj} = 0$ if there is no connection from $j$ to $k$. This way, the sum in (26) runs over $j = 1, \dots, N$.
The function $\alpha$ represents the time profile of the post-synaptic current induced by a pre-synaptic spike [112]. It has been experimentally observed that the tail of this function is exponential. On mathematical grounds, this is essential. The sum in Equation (26) considers all the spike times $t_j^{(r)}$ emitted by all the pre-synaptic neurons $j$ before time $t$. When considering the asymptotic regime $t \to +\infty$ (to get rid of initial conditions), this sum may contain an infinite number of terms. Thus, in order to ensure the summability of this series, one needs $\alpha$ to decay sufficiently fast (here, exponentially fast).
Equation (25) holds in the sub-threshold regime. The term $S_k(t)$ represents an external stimulus, and $\xi_k(t)$ is a white noise whose amplitude is modulated by $\sigma_B$. When the membrane potential of neuron $k$ reaches the firing threshold at a firing time $t_k^{(r)}$, for some $r$, i.e., $V_k(t_k^{(r)}) \geq \theta$, the neuron $k$ fires an action potential and its membrane potential is instantaneously reset to a fixed reset value (see Figure 2B).
While Equations (25) and (26) look rather simple, the right-hand side of Equation (25) depends, via (26), on a possibly uncountable set of events (the spike times), corresponding to a possibly infinite history of the voltage dynamics of the network. In this sense, these equations do not represent a classical dynamical system, where the knowledge of the variables at a given time allows one to compute the variables' value at a future time by integrating the flow. Here, we require knowledge of the spike history back to the last time the neuron was reset in order to perform the integration. This history can go quite far into the past, with a dependence decaying like the tail of the function $\alpha$.
In order to circumvent these problems, we first need to get rid of the fact that spike times belong a priori to an uncountable set. There are two alternatives. The first one, briefly explored in this section, consists of discretising time, as done, e.g., in [113]. This leads to important results related to Thermodynamic Formalism (see [23,114] for details). The second alternative, keeping time continuous, is commented on in the discussion section.

4.2. A Discrete Time Version of the Leaky-Integrate and Fire Model

The time discretisation of the model (25) reads:
$$\begin{cases} V_{n+1}^{k} = \gamma V_n^{k} + \sum_{j=1}^{N} W_{kj}\, x_n^{j} + S_n^{k} + \sigma_B\, \xi_n^{k}, & \text{if } V_n^{k} < \theta \quad \text{(Integrate phase)};\\[4pt] V_{n+1}^{k} = 0 \ \text{ and } \ x_n^{k} = 1, & \text{if } V_n^{k} \geq \theta \quad \text{(Firing phase)}. \end{cases} \tag{27}$$
For simplicity, we have assumed that all neurons have the same capacitance, $C_k = C$, and set $\gamma = 1 - \frac{dt}{\tau}$, where $\tau = RC$ is the characteristic time scale of the membrane response and $dt$ is the integration time step; $dt$ has to be much smaller than $\tau$ to preserve physical relevance, while remaining strictly positive to have a time-discretisation scheme, so that $0 < \gamma < 1$. We have then taken $dt = 1$. We have also assumed that synapses are instantaneous. The synaptic input is then $\sum_j W_{kj} x_n^j$: the pre-synaptic neuron $j$ acts on the post-synaptic neuron $k$ whenever $j$ spikes, i.e., $x_n^j = 1$. If, at some discrete time $n$, $V_n^k$ exceeds the threshold $\theta$, the membrane potential is reset at time $n+1$ and a spike is recorded at time $n$ for neuron $k$, i.e., $x_n^k = 1$. Below the threshold, the random dynamical system is ruled by (27). $S_n^k$ is the time discretisation of the external stimulus, and the $\xi_n^k$ are independent standard Gaussian random variables.
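Equation (27) can be simulated directly. The following minimal Python sketch (parameter values and names are illustrative, not taken from [113,114]) produces a spike raster from the discrete-time dynamics.

```python
import numpy as np

def simulate_discrete_lif(W, gamma=0.9, theta=1.0, sigma_B=0.2, T=1000, S=None, seed=0):
    """Simulate the discrete-time LIF network (27) and return the binary spike raster."""
    rng = np.random.default_rng(seed)
    N = W.shape[0]
    V = np.zeros(N)
    raster = np.zeros((N, T), dtype=int)
    for n in range(T):
        fired = V >= theta                       # firing phase: record x_n^k = 1 ...
        raster[fired, n] = 1
        stim = S[:, n] if S is not None else 0.0
        V_next = gamma * V + W @ raster[:, n] + stim + sigma_B * rng.standard_normal(N)
        V_next[fired] = 0.0                      # ... and reset those neurons at time n + 1
        V = V_next
    return raster

# Toy usage: 5 neurons with random weights (excitatory and inhibitory).
rng = np.random.default_rng(42)
N = 5
W = rng.normal(0.0, 0.3, size=(N, N))
raster = simulate_discrete_lif(W)
print(raster.mean(axis=1))                       # empirical firing rates
```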
It is easy to integrate Equation (27) conditionally upon a fixed spike sequence $x$. A trajectory $V = \left\{V_n^k\right\}_{k=1,\dots,N,\ n \in \mathbb{Z}}$ is compatible with this spike sequence if $\chi\!\left[V_n^k \geq \theta\right] = x_n^k$ for all $k = 1,\dots,N$ and $n \in \mathbb{Z}$, where $\chi[A]$ is the indicator function of the logical event $A$: $\chi[A] = 1$ if $A$ is true, $\chi[A] = 0$ otherwise. We discuss the compatibility condition in more detail in Section 4.4, when dealing with symbolic coding. For the moment, assume that $V$ and $x$ are compatible. We denote by $\tau_k(x,n) = \max\left\{l < n \,:\, x_l^k = 1\right\}$ the last time before $n$ at which neuron $k$ spiked, and thus at which its voltage was reset to 0. Then:
$$V_{n+1}^{k} = \sum_{j=1}^{N} W_{kj}\, \eta_{kj}(n,x) + \sum_{l=\tau_k(x,n)}^{n} \gamma^{\,n-l}\, S_l^{k} + \sigma_B \sum_{l=\tau_k(x,n)}^{n} \gamma^{\,n-l}\, \xi_l^{k}, \tag{28}$$
where:
$$\eta_{kj}(n,x) = \sum_{l=\tau_k(x,n)}^{n} \gamma^{\,n-l}\, x_l^{j},$$
integrates the influence of the pre-synaptic neuron $j$ over the time interval $\left[\tau_k(x,n)+1, n\right]$. Each spike emitted by this neuron at a time $l$ in this interval contributes with a weight $\gamma^{\,n-l}$, and there is no contribution at times where $x_l^j = 0$. The condition $\gamma < 1$ implies an exponential decay of the spike history dependence, with characteristic time:
$$\tau_\gamma = -\frac{1}{\log \gamma}.$$
Likewise, $\sum_{l=\tau_k(x,n)}^{n} \gamma^{\,n-l} S_l^k$ integrates the stimulus influence on neuron $k$, and $\sigma_B \sum_{l=\tau_k(x,n)}^{n} \gamma^{\,n-l} \xi_l^k$ is the integrated noise term. The latter is a Gaussian random variable with mean zero and variance $\sigma_B^2 \frac{1 - \gamma^{2(n+1-\tau_k(x,n))}}{1-\gamma^2}$. In (28), the dependence on the initial condition does not appear because we assume the initial time to be $n_0 \to -\infty$: either neuron $k$ has spiked in the time interval $(-\infty, n]$ and its voltage has been reset to 0, or it has not spiked, but the dependence on the initial condition decays like $\gamma^{\,n-n_0}$, which vanishes as $n_0 \to -\infty$.

4.3. Gibbs Distribution of the Discrete Lif Model

Thanks to the integrated Equation (28), and because the integrated noise is Gaussian, it is now easy to compute the probability that neuron $k$ spikes at time $n+1$ given the history $x$, $P\left[x_{n+1}^k = 1 \,\middle|\, x, n\right] = P\left[V_{n+1}^k \geq \theta \,\middle|\, x, n\right]$:
$$P\left[x_{n+1}^{k} = 1 \,\middle|\, x, n\right] = f\!\left(\frac{\theta - \sum_{j=1}^{N} W_{kj}\,\eta_{kj}(n,x) - \sum_{l=\tau_k(x,n)}^{n}\gamma^{\,n-l} S_l^{k}}{\sigma_B \sqrt{\dfrac{1-\gamma^{2(n+1-\tau_k(x,n))}}{1-\gamma^{2}}}}\right) \tag{29}$$
where $f(z) = \int_z^{+\infty} \frac{e^{-u^2/2}}{\sqrt{2\pi}}\, du$. Here, we have used a small abuse of notation: the conditioning upon $x, n$ means, in fact, conditioning upon the spike block going back to the time $R(x) = \min_{k=1,\dots,N} \tau_k(x,n)$, i.e., we condition upon the spike history up to time $n$, back to the last time at which all neurons have been reset at least once. While, for Equation (29), we just need to consider the history back to $\tau_k(x,n)$, the conditioning upon this longer block is necessary when considering the conditional joint probability of spiking patterns. The neurons being conditionally independent given the past, the joint probability factorises:
$$P\left[x_{n+1} \,\middle|\, x, n\right] = \prod_{k=1}^{N} \left( x_{n+1}^{k}\, P\left[x_{n+1}^{k} = 1 \,\middle|\, x, n\right] + \left(1 - x_{n+1}^{k}\right)\left(1 - P\left[x_{n+1}^{k} = 1 \,\middle|\, x, n\right]\right) \right). \tag{30}$$
Let us comment on this result. Equation (30) is a transition probability of the form (20), where the normalised Gibbs potential $\phi$ can be explicitly written in terms of the synaptic interactions and the parameter $\sigma_B$ controlling the noise amplitude. However, note that, in contrast to (20), where the memory of the spike sequence was fixed independently of $x$, here the memory depends on $x$, providing a variable-length Markov chain [115,116]. Actually, $R(x)$ is an unbounded function of $x$, as one can find, for all $r < n$, a sequence $x$ such that $R(x) = r$ (take a sequence where $x_l^k = 0$ for all $k = 1,\dots,N$ and $l > r$, and $x_r^k = 1$ for at least one $k$). Thus, we have to deal with the extension of Markov chains to chains with unbounded memory, introduced in Section 2.7. The existence and uniqueness of a Gibbs distribution compatible with this chain is guaranteed by the exponential decay of the memory, controlled by $\gamma < 1$ [23]. In this case, the potential fulfils the conditions described in Section 2.7. Finally, in (30), the transition probabilities explicitly depend on time because of the stimulus-dependent term. They are, therefore, not translation invariant. While the extension of Gibbs distributions to non time-translation invariant chains can be rigorously done (under the exponential decay of memory [90]), we restrict ourselves from now on to the case without stimulus ($S_n^k = 0$, $k = 1,\dots,N$, $n \in \mathbb{Z}$) in order to apply Thermodynamic Formalism, until Section 4.5, where we discuss the linear response.
Note that (29) bears a strong resemblance to the GLM Ansatz (21).
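For illustration, Equation (29) can be evaluated numerically in the stimulus-free case, the Gaussian tail $f$ being the survival function of a standard normal variable. The sketch below is ours (names and parameters are illustrative); when a neuron has not spiked within the available history, the sum is simply truncated at the beginning of the window.

```python
import numpy as np
from scipy.stats import norm

def conditional_spike_prob(x_hist, W, gamma, theta, sigma_B):
    """P[x^k_{n+1} = 1 | spike history], Equation (29), without stimulus.

    x_hist : (N, L) binary array of past spikes, the last column being time n."""
    N, L = x_hist.shape
    probs = np.zeros(N)
    for k in range(N):
        spikes_k = np.flatnonzero(x_hist[k, :-1])
        tau_k = spikes_k[-1] if spikes_k.size else 0      # last spike of k before n (truncated)
        window = x_hist[:, tau_k:]                        # times tau_k, ..., n
        decay = gamma ** np.arange(window.shape[1] - 1, -1, -1)      # gamma^(n - l)
        eta = window @ decay                              # eta_{kj}(n, x) for every j
        mean = W[k] @ eta                                 # deterministic part of V^k_{n+1}
        var = sigma_B ** 2 * (1.0 - gamma ** (2 * len(decay))) / (1.0 - gamma ** 2)
        probs[k] = norm.sf((theta - mean) / np.sqrt(var)) # f(z): Gaussian tail (survival function)
    return probs

# Toy usage.
rng = np.random.default_rng(3)
N, L = 4, 20
x_hist = (rng.random((N, L)) < 0.15).astype(int)
W = rng.normal(0.0, 0.4, size=(N, N))
print(conditional_spike_prob(x_hist, W, gamma=0.8, theta=1.0, sigma_B=0.3))
```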

4.4. Markov Partition and Symbolic Coding

In this section, we consider the deterministic discrete-time neuronal network model obtained by considering (27) with $\sigma_B = 0$, studied in detail in [114]. The threshold $\theta$ of the voltage in a network of $N$ neurons induces a natural partition of $\mathbb{R}^N$, $\mathcal{P} = \prod_{k=1}^{N} \mathcal{P}_{x^k}$, where $x^k \in \{0,1\}$, $\mathcal{P}_0 = [-B, \theta)$, $\mathcal{P}_1 = [\theta, B]$, and the bound $B$ depends on the synaptic weights [114,117]. If $V_n^k \in \mathcal{P}_0$, the neuron evolves according to the sub-threshold dynamics and does not emit a spike at time $n$. In contrast, if $V_n^k \in \mathcal{P}_1$, it emits a spike at time $n$ and its trajectory is set back to $\mathcal{P}_0$ at time $n+1$. Thus, $\mathcal{P}$ is a natural partition in the sense that it informs about the spikes of each neuron. Therefore, to each trajectory $V \equiv \left\{V_n^k\right\}_{k=1,\dots,N,\ n \in \mathbb{Z}}$, there is associated an infinite spike sequence $x$ such that $x_n^k = 0 \Leftrightarrow V_n^k \in \mathcal{P}_0$ and $x_n^k = 1 \Leftrightarrow V_n^k \in \mathcal{P}_1$.
This partition can be used to generate a Markov partition [114] but, in general, the Markov partition is not $\mathcal{P}$ itself but a finite refinement $\mathcal{Q}$ of $\mathcal{P}$. What ensures that a Markov partition exists is that (27) is contracting. More precisely, it contracts, in one step, at speed $\gamma$ in the directions (neurons) such that $V^k < \theta$, and it contracts with an infinite speed in the directions such that $V^k \geq \theta$ (reset). However, this generates discontinuities in the mapping (27), and a singularity set $\mathcal{S} = \left\{V \in \mathbb{R}^N \,:\, \exists k \in \{1,\dots,N\},\ V^k = \theta\right\}$, where the map associated with (27), hereafter denoted by $\mathbf{G}$, is discontinuous. Thus, $\mathbf{G}$ is piecewise continuous and piecewise contracting.
Now, recall that $\mathcal{Q}$ is a Markov partition for the dynamics with contracting map $\mathbf{G}$ if its elements satisfy $\mathbf{G}(Q_n) \cap Q_{n'} \neq \emptyset \Rightarrow \mathbf{G}(Q_n) \subseteq Q_{n'}$. In other words, the image of $Q_n$ is included in $Q_{n'}$ whenever the transition $n \to n'$ is legal. Here, in general, the elements of $\mathcal{P}$ do not satisfy this condition. This is because the image of a domain of $\mathcal{P}$ usually intersects several domains (in this case, the image intersects the singularity set). From the neural network's perspective, this means that, in general, it is not possible to know the spiking pattern at time $n+1$ knowing the spiking pattern at time $n$: there are several possibilities, depending on the membrane potential values and not only on the firing state of the neurons. Of course, if, say, $P_n$ is such that $\mathbf{G}(P_n)$ intersects several domains $P_{n_1}, \dots, P_{n_l}$, one can take the pre-images $\mathbf{G}^{-1}(P_{n_r})$ of these domains to construct a refinement of $\mathcal{P}$ such that the Markov partition requirement is satisfied in one iteration of the map. However, nothing guarantees that, at the second iteration, some elements of this new partition will not intersect the singularity set under $\mathbf{G}^2$.
Can we find a finite refinement of $\mathcal{P}$ such that the trajectory of each partition element never intersects several partition elements? It is shown in [114] that this property is satisfied for generic values of the synaptic weights $W_{ij}$. Essentially, the argument is based on the fact that the distance between the $\Omega$-limit set of (27) and the singularity set $\mathcal{S}$ is generically positive. In other words, each point in the partition elements $Q_n$ has a local stable manifold with a finite diameter.
As a consequence, the deterministic discrete-time neuronal network model (27) admits a Markov partition, a refinement of the natural partition, providing a symbolic coding of the membrane potential trajectories in terms of spike sequences. In particular, once the initial condition dependence has been removed, the evolution (28) (without noise) is only constrained by the stimulus. Thus, (28) provides a coding scheme of the stimulus in terms of spike sequences. The Markov partition is made of spike blocks, with finite memory depth R, which can be used to apply Thermodynamic Formalism in the presence of noise. However, R—the memory depth of the corresponding Markov chain—depends on the parameters and, in particular, the synaptic weights.
In addition, note that the presence of a singularity set induces a weak form of sensitivity to initial conditions. Although the dynamics are contracting, a small perturbation of a trajectory can induce an evolution drastically different from the unperturbed trajectory if the perturbation crosses the singularity set. In this case, there is, e.g., a neuron $k$ which does not spike at time $n$ in the unperturbed trajectory and spikes at time $n$ in the perturbed one, inducing a completely different evolution (cascade effect). This phenomenon has been exposed in the context of spiking neurons, where the coexistence of stable and unstable dynamics is investigated [118]. The singularity set also induces the existence of ghost orbits, i.e., orbits for which there is a neuron $k \in \{1,\dots,N\}$ such that $V_n^k < \theta$ for all $n > 0$ while $\limsup_{n \to +\infty} V_n^k = \theta$. However, ghost orbits are non-generic in a topological and in a measure-theoretic sense. As a corollary, the $\Omega$-limit set is generically composed of finitely many periodic orbits with a finite period (whose length depends on the parameters of the model, in particular the synaptic weights).

4.5. Extensions

4.5.1. Explicit Form of the Potential. GLM vs. MaxEnt

Model (27) makes a link between the dynamics of a neuronal network and the transition probabilities (20), where the dependence on the model parameters (in particular, synaptic weights and stimuli) is explicit. We have an explicit potential for this model, which here takes a GLM-like form (29), but is more general since, in contrast to the GLM, the effective interactions depend on time via powers of the leak term $\gamma$. This potential can also be written in terms of monomials using the Hammersley–Clifford decomposition (Section 3.3), through a series expansion of the function $f$. This procedure generates a series of monomials with coefficients that can be explicitly computed (using the fact that, from the definition of monomials in (22), $\left(x_{i_k}^{n_k}\right)^m = x_{i_k}^{n_k}$ for any integer $m > 0$). These coefficients are proportional to powers of $\gamma < 1$, so their strength decays exponentially fast, allowing for truncating the potential to a finite number of terms, which produces canonical Markov approximations of different orders [92]. One obtains, to the lowest order, a Bernoulli potential, then pairwise terms, and so on.

4.5.2. Linear Response

Another interesting consequence of the analysis of this model is that the potential may depend on a time-dependent stimulus. When considering that the stimulus is of small amplitude and additive, one can take a Taylor expansion of the potential in powers of the stimulus, allowing one to go beyond the stationarity assumption central to equilibrium statistical mechanics and Thermodynamic Formalism. In this case, it is possible to show that the variations in the spike statistics induced by the stimulus can be described in terms of a linear response theory [119,120,121,122]. The main result is that the variation in the average of an observable $f$, resulting from the application of a stimulus, reads:
$$\delta\mu\left[f\right](n) \equiv \mu^{S}\left[f\right](n) - \mu^{(sp)}\left[f\right] = \left(K_f * S\right)(n),$$
where $\mu^{(sp)}$ is the Gibbs distribution in spontaneous activity (without stimulus), $\mu^{S}$ is the Gibbs distribution in the evoked activity regime (with stimulus), $\mu^{S}\left[f\right](n)$ denotes the average of $f$ with respect to $\mu^{S}$, which depends on time (if the stimulus does), and $\mu^{(sp)}\left[f\right]$ denotes the average of $f$ with respect to $\mu^{(sp)}$, which does not depend on time. This variation in average is given by a convolution between a kernel $K_f$, depending on $f$ (and which can be expressed in terms of time correlation functions between monomials), and the stimulus. The coefficients in the expansion of $K_f$ depend on the parameters constraining the dynamics (e.g., the synaptic weights in (27)). The correlations are computed with respect to the invariant Gibbs measure $\mu^{(sp)}$. In addition, the influence of monomials in the expansion decreases with their order, so that one can obtain a reasonable approximation of the convolution kernel by considering only averages of order-two monomials (time-dependent pairwise correlations). Therefore, this is a result in the form of a fluctuation-dissipation "theorem" in statistical physics, with the difference that one considers time-dependent correlations. This formula has proven to give astonishingly good results when computing the response to a time-dependent stimulus in the model (27) [122].
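The structure of the linear response formula can be illustrated with a toy computation (a schematic Python sketch: the kernel below is a made-up stand-in for $K_f$, which in the theory is built from time correlation functions of monomials under $\mu^{(sp)}$; it is not derived from a model).

```python
import numpy as np

# Illustrative causal kernel standing in for K_f (in the theory it is built from
# time-correlation functions of monomials under the spontaneous measure mu^(sp)).
kernel = 0.4 * 0.8 ** np.arange(20)

# A time-dependent stimulus S(n): a brief pulse.
T = 200
S = np.zeros(T)
S[50:60] = 1.0

# Linear response: delta mu[f](n) = (K_f * S)(n), a causal convolution.
delta_mu_f = np.convolve(S, kernel)[:T]
print(delta_mu_f[45:75].round(3))                # deviation of <f> from its spontaneous value
```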

4.6. The Galves-Löcherbach Model

Here, we present a second example, where the Gibbs potential can be computed. This model is known as the Galves–Löcherbach model introduced by Antonio Galves and Eva Löcherbach in [91] (see also [123,124]). This model is a generalization of [22], but considering an infinite (countable) network of neurons interacting in time with memory of variable length.
The model is built considering a stochastic chain $(X_t)_{t \in \mathbb{Z}}$ taking values in $\{0,1\}^I$, where $I$ is a countable set of neurons. The probability of a spike depends on the accumulated activity of the system since the last spike; thus, each spike depends on a history of variable length, also defining a non-Markovian stochastic process. Extensions of this model have been made considering the hydrodynamic limit of the interacting neuronal system [125], classifying the collective behaviour according to parameter values [126], and generalising to continuous time [127,128].
For each neuron $i \in I$ and each time $t \in \mathbb{Z}$, let $\tau_i(x,t)$ denote the last time before $t$ at which neuron $i$ fired a spike in the spike train $x$:
$$\tau_i(x,t) = \sup\left\{s < t \,:\, x_s^i = 1\right\},$$
and suppose that the synaptic weights $W_{ij}$ have the uniform summability property:
$$\sup_{i \in I} \sum_{j} \left|W_{ij}\right| < \infty.$$
The neurons are conditionally independent given the past, so that the joint probabilities factorise:
$$P\left[x_t \,\middle|\, x, t-1\right] = \prod_{i \in I} P\left[x_t^i \,\middle|\, x_{\tau_i(x,t), t-1}\right], \tag{31}$$
where the probability of neuron i having a spike at time t is given by:
$$P\left[x_t^i = 1 \,\middle|\, x_{\tau_i(x,t), t-1}\right] = h_i\!\left(\sum_{j} W_{ij} \sum_{s=\tau_i(x,t)}^{t-1} g_j(t-s)\, x_s^j\right), \tag{32}$$
where $g_j(t-s)$ plays the role of the exponential $\alpha$-kernel in (26). The transition probabilities (32) have the form of a GLM model.
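The same kind of sketch as for the GLM applies to the Galves–Löcherbach transition probabilities (32), the distinctive ingredient being the variable-length memory $\tau_i(x,t)$. The following Python fragment (a finite truncation of the countable network, with an illustrative sigmoid $h_i$ and an exponential $g_j$; names are ours) evaluates the conditional spike probabilities.

```python
import numpy as np

def gl_spike_prob(x_hist, W, g_decay=0.8):
    """Conditional spike probabilities of Galves-Locherbach type, Equation (32),
    for a finite truncation of the network. x_hist is (N, T); its last column is time t - 1."""
    N, T = x_hist.shape
    h = lambda u: 1.0 / (1.0 + np.exp(-u))               # illustrative choice for h_i
    probs = np.zeros(N)
    for i in range(N):
        spikes_i = np.flatnonzero(x_hist[i])
        tau_i = spikes_i[-1] if spikes_i.size else 0     # variable-length memory: last spike of i
        u = 0.0
        for j in range(N):
            for s in range(tau_i, T):                    # s = tau_i(x, t), ..., t - 1
                u += W[i, j] * (g_decay ** (T - s)) * x_hist[j, s]   # g_j(t - s) = g_decay^(t - s)
        probs[i] = h(u)
    return probs

# Toy usage: 4 neurons (a finite truncation of the countable network).
rng = np.random.default_rng(7)
N, T = 4, 30
x_hist = (rng.random((N, T)) < 0.1).astype(int)
W = rng.normal(0.0, 0.5, size=(N, N))
print(gl_spike_prob(x_hist, W))
```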
Under technical conditions on the functions $h_i$ and $g_j$ and on the weights $W_{ij}$, there exists a unique probability measure consistent with (31) and (32) (see Theorem 1 of [91]). To prove this claim, the authors use a Kalikow-type decomposition of the infinite-order transition probabilities. This type of decomposition has also been considered in Refs. [91,129,130]. The setup considered in this work thus extends to infinite network size and infinite memory.

5. Discussion and Perspectives

In this review, we introduce different ideas and tools from Thermodynamic Formalism and show how they can be applied in theoretical neuroscience. As a summary, we grouped these approaches according to two main characteristics: the number of neurons $N$, which affects the cardinality of the alphabet considered, and the range $R$ of the potential (the memory of the transition probabilities) associated with the equilibrium measure characterising the system. This is represented in Table 1. The cases of an infinite number of neurons and of infinite range are further discussed in Section 5.2.
While there have been interesting applications of Thermodynamic Formalism to neuroscience, there are interesting ideas and developments still to come. In this concluding section we raise questions, challenges and new avenues for the application of Thermodynamic Formalism to theoretical neurosciences.

5.1. Thermodynamic Formalism for More Biologically Plausible Neuron Models

In this review, we have considered rather academic models of neuronal networks where, especially, time is considered to be discrete. There are good reasons for that. As we remarked at the beginning of Section 3, we were considering models, like the Integrate-and-Fire model, where spikes arise instantaneously, in continuous time, thereby providing a possibly uncountable set of potential spike trains. The question is whether we are dealing here with a realistic property of biological neuronal networks or with an artifact created by the instantaneous reset. Real spikes have a duration (a few ms) and a refractory period (also a few ms), so, for a fixed initial condition, the spike trains produced by a continuous-time neuronal network are countable. Now, it might be that the set of spike trains depends continuously on the initial condition, so that we are still left with an uncountable set of spike trains.
It is out of the scope of this review to discuss, from a biological perspective, whether or not the set of spike trains of a neuronal network has the cardinality of the continuum (see [101,117,131] for a discussion on this topic). Instead, we have considered a strategy consisting of discretising time, avoiding the problem of a potentially uncountable set of spike trains. Here, we briefly mention another strategy, allowing one to associate a countable set of spike trains to continuous-time networks. We first remark that the instantaneous reset of the voltage is a physical and biological nonsense, inducing pathologies in the dynamics [132]. On this basis, we use a convenient mathematical trick, explained in the next paragraph, which can certainly be criticised on phenomenological grounds [131], especially when the dynamical system representing the neuronal activity is deterministic.
After spiking, a biological neuron stays at rest for a certain time (the refractory period). Accordingly, the trick is the following. Fix $\delta > 0$ and define a spiking variable $x_n^k \in \{0,1\}$, where $n$ is an integer, such that $x_n^k = 1$ if neuron $k$ emits a spike in the time interval $[n\delta, (n+1)\delta]$ and $x_n^k = 0$ otherwise. Recall that $t_k^{(r)}$ denotes the time at which neuron $k$ emits its $r$-th spike. This reads:
$$x_n^k = \begin{cases} 1, & \text{if } \exists r,\ t_k^{(r)} \in [n\delta, (n+1)\delta];\\ 0, & \text{otherwise}. \end{cases}$$
Spiking variables are therefore time-discrete events with a time resolution $\delta$. When $V_k$ reaches the threshold at time $t_k^{(r)}$, it is reset to 0 and stays there until time $(n+1)\delta$. After this follows the sub-threshold evolution (25), until the next time at which $V_k$ reaches the threshold. Note that, in this modelling, $\delta$ can be quite small compared to the time scales of the dynamics. In this way, the set of spike trains $x$ becomes at most countable.
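The binning trick can be written directly; the sketch below (Python, with illustrative spike times and $\delta$; half-open bins are used so that a spike is never assigned to two bins) builds the spiking variables $x_n^k$ from continuous spike times.

```python
import numpy as np

def bin_spike_times(spike_times, delta, T_max):
    """Spiking variables x_n^k: x[k, n] = 1 iff neuron k emits at least one spike in
    the bin [n*delta, (n+1)*delta) (half-open bins in this sketch)."""
    n_bins = int(np.ceil(T_max / delta))
    x = np.zeros((len(spike_times), n_bins), dtype=int)
    for k, times in enumerate(spike_times):
        bins = np.floor(np.asarray(times) / delta).astype(int)
        x[k, bins[bins < n_bins]] = 1
    return x

# Toy usage: two neurons, spike times in ms, delta = 1 ms. The two spikes of neuron 0
# at 4.7 and 4.9 ms fall in the same bin, illustrating why delta must be small.
spike_times = [np.array([1.2, 4.7, 4.9]), np.array([0.3, 7.5])]
print(bin_spike_times(spike_times, delta=1.0, T_max=10.0))
```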
This trick can be used to generalise the Integrate-and-Fire model into a conductance-based Integrate-and-Fire model, introduced by Rudolph and Destexhe in [133] and mathematically studied in [23,24,117], where the synaptic conductance depends on the spike history. One can still show that a unique Gibbs distribution with an infinite-range potential exists in this case, characterising the spike train statistics. The potential can be explicitly computed as a function of the network parameters, even in the presence of a time-dependent stimulus.
Now, would Thermodynamic Formalism apply to more realistic neuronal models, like Hodgkin–Huxley [111], FitzHugh–Nagumo [134,135], Morris–Lecar models [136]? (see [36,109,137] for a complete presentation of canonical neuronal models). In these models, closer to biology, the spikes have a time course and, thus, are not considered as point events. However, we do not know any result establishing, e.g., the existence of Markov partitions and symbolic coding for these models and this seems to be out of reach for the moment. Still, one can bin the time and proceed as done in experiments where voltage is a time-continuous signal. Thus, one can still use the approach used in (20) to characterise the spike train statistics.
A natural question in this context is what is the link between spike train statistics and the underlying dynamical model, with “hidden” dynamical variables, such as membrane potential, but also, e.g., activation/inactivation variables? If we think in terms of spike coding, the alternative is the following. Either spike trains contain all the necessary information to characterise the dynamics, e.g., the spike response to a stimulus, and then characterising (20) is sufficient. Or, there is additional information, not conveyed by spikes (e.g., sub-threshold oscillations [100,138,139]), and the "neural code" is not entirely contained in the spikes, somewhat ruining the hope of encoding neuronal messages purely in terms of spikes. This question is much more general than the validity of the Thermodynamic Formalism approach for these models.
What Thermodynamic Formalism brings to the analysis of these models is twofold: (1) a way to rigorously handle probabilistic representations of spikes (20); and, (2) to provide conceptual and mathematical tools to analyse spiking neuronal network models, like (27), where a dynamical system formulation of biophysical variables can be mathematically related to spike coding and spike train statistics.

5.2. Phase Transitions

Several studies have shown that populations of vertebrate retinal ganglion cells responding to naturalistic stimuli are poised near a "critical state" [73,74]. From the maximum entropy joint distribution (13), a family of Gibbs distributions can be built by introducing a parameter $\beta$ (analogous to an inverse temperature $1/T$), which scales all of the Lagrange multipliers of the inferred Hamiltonian. When $\beta \to 0$ (infinite temperature), the uniform distribution is obtained, and when $\beta \to +\infty$ (zero temperature), the Dirac measure supported on the spike configuration(s) of minimal energy is obtained. The value $\beta = 1$ corresponds to the maximum entropy distribution inferred from the data. These studies have only analysed memoryless Gibbs distributions (13).
From this representation, it is possible to compute the fluctuations (variance) of the energy $U_\lambda$ as a function of the "temperature" parameter $T$. This quantity can be obtained as the second derivative of the pressure, Equation (6), and is, in thermodynamics, related to the heat capacity $C_T$. On numerical grounds, this quantity can be computed using Monte Carlo simulations and plotted as a function of the "temperature" $T = 1/\beta$, for different network sizes (see Figure 4).
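In the same spirit, the heat-capacity curve of Figure 4 can be reproduced qualitatively for a small memoryless maximum entropy model (13) by exact enumeration rather than Monte Carlo. The sketch below uses random fields and pairwise couplings as stand-ins for Lagrange multipliers inferred from data; it is not an analysis of retinal recordings.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
N = 8                                            # small enough for exact enumeration (2^8 states)
h = rng.normal(-1.0, 0.5, size=N)                # "fields" (stand-ins for inferred multipliers)
J = np.triu(rng.normal(0.0, 0.3, size=(N, N)), 1)    # pairwise couplings, i < j

patterns = np.array(list(product([0, 1], repeat=N)))
energy = -(patterns @ h + np.einsum('si,ij,sj->s', patterns, J, patterns))

def heat_capacity(T):
    """C(T) = Var_T(U) / T^2 for the rescaled Gibbs distribution P_T(x) ~ exp(-U(x)/T)."""
    w = np.exp(-(energy - energy.min()) / T)     # numerically stable Boltzmann weights
    p = w / w.sum()
    mean_U = p @ energy
    return (p @ (energy - mean_U) ** 2) / T ** 2

temperatures = np.linspace(0.2, 3.0, 60)
C = np.array([heat_capacity(T) for T in temperatures])
print(temperatures[np.argmax(C)])                # location of the peak of the C(T) curve
```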
The shape of the $C_T$ versus $T$ curve, for maximum entropy models of Ising type obtained from recordings of retinal ganglion cells responding to naturalistic stimuli, is shown in Figure 4 (redrawn from [74]). It can be observed that there is a clear, increasing peak at $T = 1$, which starts to manifest itself when larger and larger groups of neurons are considered. This presumable divergence of the heat capacity (or variance of $U$) when $N \to +\infty$ (thermodynamic limit) is interpreted as a second-order phase transition (a so-called "critical regime" [140]).
The behaviour of the specific heat observed in Figure 4 suggests that the heat capacity of a maximum entropy distribution, fitted over increasingly large groups of neurons in the retina, diverges. This phenomenon has been considered to be a "signature of criticality" (details of this study and a discussion about whether criticality is functional for retinal ganglion cells can be found in Ref. [74]). Some criticism regarding this approach to diagnosing criticality has appeared, arguing that the maximum entropy principle is likely to yield models that are close to singular values of parameters, akin to critical points in physics where phase transitions occur. Statistically distinguishable models tend to accumulate close to critical points, where the susceptibility (inverse Fisher Information) diverges in infinite systems [141]. These ideas have also been applied to numerical simulations of a canonical feed-forward population model, showing that the specific heat diverges whenever the average correlation strength is independent of the population size [75], as in the random subsampling of correlations used in [74]. Additionally, note that, for spike trains obtained from discrete Markov processes, binning generates a stochastic process with unbounded memory, which may induce spurious phase transitions [102].
This interesting approach leads, nevertheless, to several questions in the context of Thermodynamic Formalism.
  • Does this signature of criticality extend to Gibbs distributions with potentials of range R > 1 , i.e., with memory? How does it depend on R? We are not aware of any experimental results addressing this issue. This question is related to the following:
  • What is this signature of criticality from the point of view of Thermodynamic Formalism? The occurrence of a second-order phase transition mathematically means that the pressure is $C^1$ but not $C^2$ when some limit is taken. Here, we have two possible limits: the range of the potential $R$ tends to infinity, or the number of neurons $N$ tends to infinity. These two limits could also be addressed simultaneously, and they do not necessarily commute. For potentials associated with finite $R$ and $N$, the Perron–Frobenius theorem guarantees the existence and uniqueness of the Gibbs measure, and the analyticity of the pressure can also be proved, preventing phase transitions. When $R$ or $N$ are infinite, the properties of the RPF operator (Section 2.3.2) characterise the presence or absence of phase transitions. Indeed, there are conditions ensuring a spectral gap for this operator, ensuring the exponential decay of correlations. Now, Equation (7) characterises the second derivative of the pressure as a time series of correlations, which converges when the correlations decay exponentially. On the opposite side, the non-summability of the time correlation function implies the non-existence of the second derivative, and thus the occurrence of a second-order phase transition. Therefore, a possibility to have a second-order phase transition is the absence of the spectral gap property for the RPF operator when $R \to +\infty$ or $N \to +\infty$. In statistical mechanics, second-order phase transitions can be characterised by how the zeros of the partition function, written as a polynomial, pinch the real axis (Lee–Yang phenomenon) [142,143,144]. In our case, when $R > 1$, the object of interest is not the partition function, but rather the largest eigenvalue of $\mathcal{L}_\phi$, which has to stay analytic in the limit $R \to +\infty$ or $N \to +\infty$. The absence of the spectral gap property presents an analogy with the Lee–Yang phenomenon, although we do not know of results establishing a deeper link.
  • Can we relate known examples of dynamical systems exhibiting phase transitions to models in neuroscience? Another possible example to be interpreted in neuroscience is the Dyson model [145], in which there exists a phase transition in the sense of spontaneous magnetisation when the temperature goes to zero, due to an infinite-range potential whose correlations do not decay exponentially fast. In our case, the range of the potential should be taken in time, keeping (possibly) the number of neurons finite. Other examples exist of rigorous characterisations of phase transitions in the thermodynamic description of Pomeau–Manneville intermittent maps, passing from an integrable density function associated with the measure to heavy-tailed densities [146]. An interesting result may hint at the connection between topological Markov maps of the interval and stochastic chains of infinite order, or chains with complete connections. Ref. [147] presents how to build a topological Markov map of the interval whose invariant probability measure is the stationary law of a given stochastic chain of infinite order. This is interesting in our context because, as we presented in (27), there are mathematical models of spiking neurons whose spike statistics are represented by chains of infinite order. This result, or its inverse (i.e., how to build a stochastic chain of infinite order from a topological Markov map), may hint at conditions on the parameters of mathematical models of spiking neurons under which they exhibit second-order phase transitions.
  • What could the dynamical or mechanistic origins of a second-order phase transition be in a spiking neuronal network model? Handling experimental data is of course important but, for long experiments with living neuronal tissue, one cannot control the size of the sample, the stationarity of the data, and so on. Accordingly, assume that we have been able to find an example of a Gibbs distribution exhibiting a second-order phase transition when $R \to +\infty$ or $N \to +\infty$. Can we build a spiking dynamical system, with finite $R$ and $N$, which has this Gibbs distribution in the limit $R \to +\infty$ or $N \to +\infty$, so that we observe a phase transition in the model? Then, what are the mechanistic origins (in the neuronal dynamics) of second-order phase transitions? It would be interesting to study the existence of a second-order phase transition in a simple neuronal model. Returning to the discrete LIF model, the failure of second-order differentiability of the pressure means the loss of exponential mixing, which, in the model (27), can arise in at least two cases. First, if $\gamma = 1 - \epsilon$, $\epsilon \to 0$: this is a way to obtain a potential with increasing range as $\epsilon \to 0$, with a loss of summability of correlations. The corresponding orbits (reminiscent of the ghost orbits discussed in Section 4.4) are such that it may take a long time for some neurons to be reset, so the memory to be considered is very long. However, this is a case hardly interpretable from the neuroscience perspective. A second possibility is to analyse how the pressure depends on the spectrum of the synaptic weight matrix, and to check whether there are cases (e.g., small-world or scale-free lattices) where the spectral gap of the RPF operator vanishes.
From the perspective of maximum entropy distributions built from experimental data of spiking neurons, there have been interesting applications of the obtained Gibbs distributions to answer questions related to the retinal code that are not related to criticality [148]. From the maximum entropy joint distribution, the conditional distributions can be computed, and questions about the redundancy of the neural code can be addressed, such as how predictable the activity of each neuron is based on the knowledge of the activity of the other neurons in the population. Can we find a subset of neurons $J$ that together predict with high accuracy the spiking behaviour of a neuron $i$? Mathematically, this can be written as $p\left(x^i = 1 \,\middle|\, \{x^j\}_{j \in J}\right)$, where $J$ is a subset of neurons in the population of spiking neurons. Other questions related to neural coding and dimensionality reduction can be addressed by studying the energy landscape $U_\lambda(x)$ of (13). For example, the local minima of an energy landscape correspond to metastable states, and several configurations may correspond to the same "valley" near each local minimum. Transitions between valleys have been studied in the context of "retinal coding" (see details in [148]). Alternative methodologies using the maximum entropy principle to study networks of sensory neurons have been used in order to distinguish intrinsic interactions from extrinsic correlations [46] and to reveal excitatory and inhibitory correlations [45].

5.3. What Else May Thermodynamic Formalism and Gibbs Distributions Tell Us about Neuroscience?

The relationship between mathematics and physics has been historically symbiotic and Thermodynamic Formalism is an interesting example of how ideas from physics may help to solve problems and introduce ideas into the field of mathematics. The history of Thermodynamic Formalism also shows how purely mathematical results can be obtained as corollaries of physical laws, inverting the frequently assumed relationship between physics and mathematics [149].
However, in the case considered in this review (the link between Thermodynamic Formalism and neuroscience), the mathematical problem is motivated by biology, not by physics. While Eugene Wigner argues in favour of the "The unreasonable effectiveness of mathematics in the natural sciences" [150], Israel Gelfand, after spending several years working on mathematical problems related to biology, replied with "The unreasonable ineffectiveness of mathematics in biology." [151]. While there are reasons to argue that this is still the case, it is less clear if one can blame the field of mathematics or just the fact that we have not yet used the right tools or frameworks. In the quest for these "right tools", there is a long tradition of using ideas from statistical physics to study neural networks, and in particular, to represent the emergence of collective behaviour from microscopic interactions, with the hope that statistical aspects of the collective behaviour will be independent of the details of these systems. This gave rise to major branches of theoretical neuroscience, like dynamic mean-field methods [29,152,153,154] or the Maximum Entropy approach [38,49,148,155], mainly coming from physicists. Although there are considerably fewer articles using mathematical methods to rigorously analyse the collective behaviour of neuronal networks, some promising approaches have recently been proposed, based on large deviations [156,157], Kalikow-type decompositions [91,158], stochastic processes [159,160,161,162,163], dynamical systems [137,164], etc. As we have developed in this review, Thermodynamic Formalism could also be one of these tools, providing interesting connections between mathematics and physics, dynamics and statistics, applied to neuroscience.
Especially, we have described how Thermodynamic Formalism: (1) provides a conceptual and operational framework (i.e., allowing one to develop algorithms and software [99]) to analyse experimental spike train data; (2) allows one to derive explicit expressions linking spike statistics to neuronal network dynamics; (3) extends to non-stationarity via linear response theory; and (4) proposes a realm in which to address questions related to criticality.
Here, we would like to propose some other extensions, not yet explored and thus staying at the level of ideas, all based on the power of Thermodynamic Formalism to make explicit and operational links between dynamics, statistics and symbolic coding.
  • Geometry of the state space. A prominent aspect of Thermodynamic Formalism, that we haven’t discussed yet in this review, is its link to the characterisation of the geometry of attractors and, especially, fractal sets [165,166]. For example, the composition of contracting mappings along symbolic orbits defines the so-called Iterated Function Systems (IFS) [167] generating fractal sets with tunable geometry and structure. Now, it is interesting to remark that Integrate and Fire models are actually piecewise contracting dynamical systems having a structure similar to IFS where the contracting pieces are symbolically encoded by spike blocks [114]. It would be interesting to investigate, along these lines, the structure of attractors in Integrate and Fire models, and how orbits, encoded by spike blocks, are related to the geometry of attractors (the Ω -limit set).
  • Transitions between attractors. The concept of attractor is actually central in describing brain dynamics [168,169]. Especially, a current trend in neuroscience is to associate attractors (or ghost attractors, see [170] and references therein) to brain states. The transitions between these states correspond to transitions during tasks or spontaneous activity [171,172,173,174]. It is relatively natural to characterise such transitions by Markov chains [175], which is the first step toward the application of Thermodynamic Formalism and analysing these transitions from a statistical and statistical-physics perspective.
  • Non-stationarity and link with generating functional formalism. As we mentioned, Thermodynamic Formalism is constructed from a variational approach based on entropy and, thus, requires time translation invariance. We have briefly described how we can depart from this constraint using linear response theory. It would be interesting to explore beyond this point and consider general types of responses to stimuli (not requiring a small perturbation, as in linear response). For this, one would have to construct a Thermodynamic Formalism based on the optimisation of a quantity which is not the entropy. This is somehow what generating functional approaches, like the dynamic mean-field theory, do (see the introduction), although using other constraining hypotheses (essentially, to be able to describe the infinite size limit by a Gaussian process). It would be interesting to try to close the gap between these two approaches (e.g., via large deviations theory).
One of the biggest challenges in the science of the twenty-first century is to understand brain functions within a conceptual framework capable of unifying the multi-scale dynamics that take place in the brain. This framework should also make sense in the light of the overwhelming amount of experimental data, and be capable of predicting macroscopic phenomena, such as motor behaviour or visual experience, from the activity of billions of neurons.
Physicists have been able to make a deep connection between mechanics, statistical physics, and thermodynamics. A similar quest is presumably guiding the research of (some) theoretical and experimental neuroscientists. While there is still a long way to go before achieving this goal (as some argue we are still searching for principles [176]), during the last decades, mathematicians have been playing a relevant role in the rigorous description of neural phenomena, clarifying and raising conceptual problems in neuroscience.
We hope that theoretical tools and ideas from Thermodynamic Formalism and its current applications in neuronal dynamics and spike train statistics will lead to a better and unified understanding of neural phenomena. We also hope that the present review may serve as an encouragement for the mathematical community interested in applying Thermodynamic Formalism to study these interesting and important problems.

Author Contributions

All authors conceived the main ideas and concepts, and wrote and revised the manuscript. All authors have read and approved the final manuscript.

Funding

R.C. was supported by Fondecyt Proyecto 11181072 and CONICYT REDES Project No. 180151. C.M. acknowledges CONACYT (Mexico) for financial support through project No. A1-S-15528 and the CONICYT REDES Project No. 180151 which partially supported him. C.M. also wishes to thank the CIMFAV for their wonderful hospitality and facilities during his stay. B.C. and R.C. acknowledge the INRIA program “Associated teams” for funding part of the project via the associated team MAGMA https://team.inria.fr/biovision/associated-team-magma/.

Acknowledgments

The authors thank Antonio Galves, Godofredo Iommi and Ignacio Ampuero for valuable suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
KL: Kullback-Leibler
MEP: Maximum Entropy Principle
RPF: Ruelle-Perron-Frobenius
LDP: Large Deviation Principle
SCGF: Scaled Cumulant Generating Function
GLM: Generalized Linear Model
LIF: Leaky Integrate-and-Fire
MEA: Multi-Electrode Arrays

Symbol List

$x_n^k$: Spike-state of neuron $k$ at time $n$
$x_n$: Spike pattern at time $n$
$x_{n_1,n_2}$: Spike block from time $n_1$ to $n_2$
$\mathcal{A}_{n,n+m}$: Configuration space of spike blocks of $m$ spike patterns
$\mathcal{A}_R^N$: Configuration space of $N$ neurons and spike blocks of $R$ spike patterns
$E_\nu(f)$: Expected value of the observable $f$ w.r.t. the probability measure $\nu$
$A_T(f)$: Empirical average of the observable $f$ considering $T$ spike patterns
$H[\mu]$: Entropy of the probability measure $\mu$
$\lambda_k$: Lagrange multiplier parameter
$U_\lambda$: Potential or energy function
$F[U_\lambda]$: Pressure associated to the potential $U_\lambda$
$\mu_\psi$: Equilibrium measure associated to the potential $\psi$
$S_n\phi$: Birkhoff sums associated to the potential $\phi$
$\Gamma_f$: Scaled cumulant generating function of the observable $f$
$I_f$: Rate function of the observable $f$
$\mathcal{L}_\phi$: Ruelle-Perron-Frobenius operator associated to the potential $\phi$
$w_n$: Integer associated to the spike block $x_{n,n+R-1}$
$C_{f,g}(n)$: Correlation function between the observables $f$ and $g$ at time $n$
$m_l$: Monomial $l$
$C_T$: Heat capacity
$V_k$: Voltage of neuron $k$
$\theta$: Threshold

References

1. Gallavotti, G. Statistical Mechanics: A Short Treatise; Theoretical and Mathematical Physics; Springer: Berlin/Heidelberg, Germany, 1999.
2. Gallavotti, G. Nonequilibrium and Irreversibility; Springer Publishing Company: Basel, Switzerland, 2014.
3. Kardar, M. Statistical Physics of Particles; Cambridge University Press: Cambridge, UK, 2007.
4. Landau, L.; Lifshitz, E.M. Statistical Physics: Volume 5; Elsevier: Amsterdam, The Netherlands, 1980.
5. Gaspard, P. Chaos, Scattering and Statistical Mechanics; Cambridge Non-Linear Science Series: Cambridge, UK, 1998; Volume 9.
6. Ruelle, D. Statistical Mechanics: Rigorous Results; Addison-Wesley: New York, NY, USA, 1969.
7. Georgii, H.O. Gibbs Measures and Phase Transitions; De Gruyter Studies in Mathematics: Berlin, Germany; New York, NY, USA, 1988.
8. Sinai, Y.G. Gibbs measures in ergodic theory. Russ. Math. Surv. 1972, 27.
9. Ruelle, D. Thermodynamic Formalism; Addison-Wesley: Reading, PA, USA, 1978.
10. Bowen, R. Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms. Springer Lect. Notes Math. 2008, 470, 78–104.
11. Ash, R.; Doleans-Dade, C. Probability and Measure Theory, 2nd ed.; Academic Press: Cambridge, MA, USA, 1999.
12. Friedli, S.; Velenik, Y. Statistical Mechanics of Lattice Systems: A Concrete Mathematical Introduction; Cambridge University Press: Cambridge, UK, 2017.
13. Young, L.S. Statistical properties of dynamical systems with some hyperbolicity. Ann. Math. 1998, 147, 585–650.
14. Climenhaga, V.; Pesin, Y. Building thermodynamics for non-uniformly hyperbolic maps. Arnold Math. J. 2017, 3, 37–82.
15. Demetrius, L. The Thermodynamic Formalism in Population Biology. In Numerical Methods in the Study of Critical Phenomena; Della Dora, J., Demongeot, J., Lacolle, B., Eds.; Springer: Berlin/Heidelberg, Germany, 1981; pp. 233–253.
16. Demetrius, L. Statistical mechanics and population biology. J. Stat. Phys. 1983, 30, 709–753.
17. Cessac, B.; Blanchard, P.; Krüger, T.; Meunier, J.L. Self-Organized Criticality and thermodynamic formalism. J. Stat. Phys. 2004, 115, 1283–1326.
18. Krick, T.; Verstraete, N.; Alonso, L.G.; Shub, D.A.; Ferreiro, D.U.; Shub, M.; Sánchez, I.E. Amino Acid Metabolism Conflicts with Protein Diversity. Mol. Biol. Evol. 2014, 31, 2905–2912.
19. Jin, S.; Tan, R.; Jiang, Q.; Xu, L.; Peng, J.; Wang, Y.; Wang, Y. A Generalized Topological Entropy for Analyzing the Complexity of DNA Sequences. PLoS ONE 2014, 9, 1–4.
20. Koslicki, D. Topological entropy of DNA sequences. Bioinformatics 2011, 27, 1061–1067.
21. Koslicki, D.; Thompson, D.J. Coding sequence density estimation via topological pressure. J. Math. Biol. 2015, 70, 45–69.
22. Cessac, B. Statistics of spike trains in conductance-based neural networks: Rigorous results. J. Math. Neurosci. 2011, 1, 1–42.
23. Cessac, B. A discrete time neural network model with spiking neurons II. Dynamics with noise. J. Math. Biol. 2011, 62, 863–900.
24. Cofré, R.; Cessac, B. Dynamics and spike trains statistics in conductance-based Integrate-and-Fire neural networks with chemical and electric synapses. Chaos Solitons Fractals 2013, 50, 13–31.
25. Cofré, R.; Maldonado, C. Information Entropy Production of Maximum Entropy Markov Chains from Spike Trains. Entropy 2018, 20, 34.
26. Galves, A.; Galves, C.; García, J.E.; Garcia, N.L.; Leonardi, F. Context tree selection and linguistic rhythm retrieval from written texts. Ann. Appl. Stat. 2012, 6, 186–209.
27. Cofré, R.; Maldonado, C.; Rosas, F. Large Deviations Properties of Maximum Entropy Markov Chains from Spike Trains. Entropy 2018, 20, 573.
28. Cofré, R.; Videla, L.; Rosas, F. An Introduction to the Non-Equilibrium Steady States of Maximum Entropy Spike Trains. Entropy 2019, 21, 884.
29. Sompolinsky, H.; Crisanti, A.; Sommers, H. Chaos in Random Neural Networks. Phys. Rev. Lett. 1988, 61, 259–262.
30. Buice, M.A.; Chow, C.C. Beyond mean field theory: Statistical field theory for neural networks. J. Stat. Mech. Theory Exp. 2013.
31. Montbrió, E.; Pazó, D.; Roxin, A. Macroscopic Description for Networks of Spiking Neurons. Phys. Rev. X 2015, 5, 021028.
32. Byrne, A.; Avitabile, D.; Coombes, S. Next-generation neural field model: The evolution of synchrony within patterns and waves. Phys. Rev. E 2019, 99, 012313.
33. Chizhov, A.V.; Graham, L.J. Population model of hippocampal pyramidal neurons, linking to refractory density approach to conductance-based neurons. Phys. Rev. E 2007, 75, 114.
34. Brunel, N.; Hakim, V. Fast global oscillations in networks of integrate-and-fire neurons with low firing rates. Neural Comput. 1999, 11, 1621–1671.
35. Abbott, L.; Dayan, P. The effect of correlated variability on the accuracy of a population code. Neural Comput. 1999, 11, 91–101.
36. Gerstner, W.; Kistler, W. Spiking Neuron Models; Cambridge University Press: Cambridge, UK, 2002.
  37. Rieke, F.; Warland, D.; de Ruyter van Steveninck, R.; Bialek, W. Spikes, Exploring the Neural Code; M.I.T. Press: Cambridge, MA, USA, 1996. [Google Scholar]
  38. Schneidman, E.; Berry, M., II; Segev, R.; Bialek, W. Weak pairwise correlations imply strongly correlated network states in a neural population. Nature 2006, 440, 1007–1012. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  39. Shlens, J.; Field, G.D.; Gauthier, J.L.; Grivich, M.I.; Petrusca, D.; Sher, A.; Litke, A.M.; Chichilnisky, E.J. The structure of multi-neuron firing patterns in primate retina. J. Neurosci. 2006, 26, 8254–8266. [Google Scholar] [CrossRef] [Green Version]
  40. Tkačik, G.; Prentice, J.; Balasubramanian, V.; Schneidman, E. Optimal population coding by noisy spiking neurons. Proc. Natl. Acad. Sci. USA 2010, 107, 14419–14424. [Google Scholar] [CrossRef] [Green Version]
  41. Ganmor, E.; Segev, R.; Schneidman, E. The architecture of functional interaction networks in the retina. J. Neurosci. 2011, 31, 3044–3054. [Google Scholar] [CrossRef]
  42. Ganmor, E.; Segev, R.; Schneidman, E. Sparse low-order interaction network underlies a highly correlated and learnable neural population code. Proc. Natl. Acad. Sci. USA 2011, 108, 9679–9684. [Google Scholar] [CrossRef] [Green Version]
  43. Tkačik, G.; Marre, O.; Mora, T.; Amodei, D.; Berry II, M.; Bialek, W. The simplest maximum entropy model for collective behavior in a neural network. J. Stat. Mech. 2013, P03011. [Google Scholar] [CrossRef]
  44. Granot-Atedgi, E.; Tkačik, G.; Segev, R.; Schneidman, E. Stimulus-dependent Maximum Entropy Models of Neural Population Codes. PLoS Comput. Biol. 2013, 9, 1–14. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  45. Nghiem, T.A.; Telenczuk, B.; Marre, O.; Destexhe, A.; Ferrari, U. Maximum-entropy models reveal the excitatory and inhibitory correlation structures in cortical neuronal activity. Phys. Rev. E 2018, 98, 012402. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  46. Ferrari, U.; Deny, S.; Chalk, M.; Tkačik, G.; Marre, O.; Mora, T. Separating intrinsic interactions from extrinsic correlations in a network of sensory neurons. Phys. Rev. E 2018, 98, 42410. [Google Scholar] [CrossRef] [Green Version]
  47. Gardella, C.; Marre, O.; Mora, T. Modeling the correlated activity of neural populations: A review. Neural Comput. 2019, 31, 233–269. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  48. Marre, O.; El Boustani, S.; Frégnac, Y.; Destexhe, A. Prediction of spatiotemporal patterns of neural activity from pairwise correlations. Phys. Rev. Lett. 2009, 102, 138101. [Google Scholar] [CrossRef]
  49. Vasquez, J.; Palacios, A.; Marre, O.; Berry II, M.; Cessac, B. Gibbs distribution analysis of temporal correlation structure on multicell spike trains from retina ganglion cells. J. Physiol. Paris 2012, 106, 120–127. [Google Scholar] [CrossRef] [Green Version]
  50. Gardella, C.; Marre, O.; Mora, T. Blindfold learning of an accurate neural metric. Proc. Natl. Acad. Sci. USA 2018, 115, 3267–3272. [Google Scholar] [CrossRef] [Green Version]
  51. Martin, P.C.; Siggia, E.D.; Rose, H.A. Statistical Dynamics of Classical Systems. Phys. Rev. A 1973, 8, 423–437. [Google Scholar] [CrossRef]
  52. De Dominicis, C. Dynamics as a substitute for replicas in systems with quenched random impurities. Phys. Rev. B 1978, 18, 4913–4919. [Google Scholar] [CrossRef]
  53. Sompolinsky, H.; Zippelius, A. Dynamic theory of the spin-glass phase. Phys. Rev. Lett. 1981, 47, 359–362. [Google Scholar] [CrossRef]
  54. Sompolinsky, H.; Zippelius, A. Relaxational dynamics of the Edwards-Anderson model and the mean-field theory of spin-glasses. Phys. Rev. B 1982, 25, 6860–6875. [Google Scholar] [CrossRef]
  55. Wieland, S.; Bernardi, D.; Schwalger, T.; Lindner, B. Slow fluctuations in recurrent networks of spiking neurons. Phys. Rev. E Stat. Nonlinear Soft Matter Phys. 2015, 92, 040901. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  56. Lerchner, A.; Cristina, U.; Hertz, J.; Ahmadi, M.; Ruffiot, P.; Enemark, S. Response variability in balanced cortical networks. Neural Comput. 2006, 18, 634–659. [Google Scholar] [CrossRef] [PubMed]
  57. Mari, C.F. Random networks of spiking neurons: Instability in the xenopus tadpole moto-neural pattern. Phys. Rev. Lett. 2000, 85, 210. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  58. Helias, M.; Dahmen, D. Statistical Field Theory for Neural Networks; Springer Nature: Cham, Switzerland, 2020. [Google Scholar]
  59. Ben-Arous, G.; Guionnet, A. Large deviations for Langevin spin glass dynamics. Probab. Theory Relat. Fields 1995, 102, 455–509. [Google Scholar] [CrossRef]
  60. van Meegen, A.; Kühn, T.; Helias, M. Large Deviation Approach to Random Recurrent Neuronal Networks: Rate Function, Parameter Inference, and Activity Prediction. arXiv 2020, arXiv:2009.08889. [Google Scholar]
  61. Ladenbauer, J.; McKenzie, S.; English, D.F.; Hagens, O.; Ostojic, S. Inferring and validating mechanistic models of neural microcircuits based on spike-train data. Nat. Commun. 2019, 10, 4933. [Google Scholar] [CrossRef] [Green Version]
  62. Amari, S.i.; Nagaoka, H. Methods of Information Geometry; Oxford University Press: Oxford, UK, 2000; Volume 191. [Google Scholar]
  63. Ellis, R. Entropy, Large deviations and Statistical Mechanics; Springer: Berlin/Heidelberg, Germany, 1985. [Google Scholar]
  64. Beggs, J.M.; Plenz, D. Neuronal Avalanches in Neocortical Circuits. J. Neurosci. 2003, 23, 11167–11177. [Google Scholar] [CrossRef] [Green Version]
  65. Haldeman, C.; Beggs, J. Critical Branching Captures Activity in Living Neural Networks and Maximizes the Number of Metastable States. Phys. Rev. Lett. 2005, 94, 058101. [Google Scholar] [CrossRef] [Green Version]
  66. Kinouchi, O.; Copelli, M. Optimal dynamical range of excitable networks at criticality. Nat. Phys. 2006, 2, 348–351. [Google Scholar] [CrossRef] [Green Version]
  67. Shew, W.L.; Yang, H.; Petermann, T.; Roy, R.; Plenz, D. Neuronal Avalanches Imply Maximum Dynamic Range in Cortical Networks at Criticality. J. Neurosci. 2009, 29, 15595–15600. [Google Scholar] [CrossRef]
  68. Shew, W.L.; Plenz, D. The Functional Benefits of Criticality in the Cortex. Neuroscience 2013, 19, 88–100. [Google Scholar] [CrossRef] [PubMed]
  69. Gautam, S.H.; Hoang, T.T.; McClanahan, K.; Grady, S.K.; Shew, W.L. Maximizing Sensory Dynamic Range by Tuning the Cortical State to Criticality. PLoS Comput. Biol. 2015, 11, 1–15. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  70. Girardi-Schappo, M.; Bortolotto, G.S.; Gonsalves, J.J.; Pinto, L.T.; Tragtenberg, M.H.R. Griffiths phase and long-range correlations in a biologically motivated visual cortex model. Sci. Rep. 2016, 6, 29561. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  71. Touboul, J.; Destexhe, A. Can Power-Law Scaling and Neuronal Avalanches Arise from Stochastic Dynamics? PLoS ONE 2010, 5, 1–14. [Google Scholar] [CrossRef] [PubMed]
  72. Cocchi, L.; Gollo, L.L.; Zalesky, A.; Breakspear, M. Criticality in the brain: A synthesis of neurobiology, models and cognition. Prog. Neurobiol. 2017, 158, 132–152. [Google Scholar] [CrossRef] [Green Version]
  73. Mora, T.; Bialek, W. Are biological systems poised at criticality? J. Stat. Phys. 2011, 144. [Google Scholar] [CrossRef] [Green Version]
  74. Tkačik, G.; Mora, T.; Marre, O.; Amodei, D.; Berry II, M.; Bialek, W. Thermodynamics for a network of neurons: Signatures of criticality. Proc. Natl. Acad. Sci. USA 2015, 112. [Google Scholar] [CrossRef] [Green Version]
  75. Nonnenmacher, M.; Behrens, C.; Berens, P.; Bethge, M.; Macke, J.H. Signatures of criticality arise from random subsampling in simple population models. PLoS Comp. Biol. 2017, 13, e1005886. [Google Scholar] [CrossRef] [Green Version]
  76. Chazottes, J.; Keller, G. Pressure and Equilibrium States in Ergodic Theory. In Mathematics of Complexity and Dynamical Systems; Springer: Berlin/Heidelberg, Germany, 2011; pp. 1422–1437. [Google Scholar]
  77. Keller, G. Equilibrium States in Ergodic Theory; Cambridge University Press: Cambridge, UK, 1998. [Google Scholar]
  78. Sarig, O.M. Thermodynamic formalism for countable Markov shifts. Ergodic Theory Dyn. Syst. 1999, 19, 1565–1593. [Google Scholar] [CrossRef] [Green Version]
  79. Katok, A.; Hasselblatt, B. Introduction to the Modern Theory of Dynamical Systems; Cambridge University Press: Cambridge, UK, 1998. [Google Scholar]
  80. Baladi, V. Positive Transfer Operators and Decay of Correlations; World Scientific: Singapore, 2000; Volume 16. [Google Scholar]
  81. Shields, P.C. The ergodic theory of discrete sample paths. In Graduate Studies in Mathematics; American Mathematical Society: Providence, RI, USA, 1996; Volume 13, p. xii+249. [Google Scholar]
  82. Mayer, V.; Urbański, M. Thermodynamical formalism and multifractal analysis for meromorphic functions of finite order. Mem. Am. Math. Soc. 2010, 203, 954. [Google Scholar] [CrossRef] [Green Version]
  83. Kubo, R. Statistical-mechanical theory of irreversible processes. J. Phys. Soc. 1957, 12, 570–586. [Google Scholar] [CrossRef]
  84. Kubo, R. The fluctuation-dissipation theorem. Rep. Prog. Phys. 1966, 29, 255–284. [Google Scholar] [CrossRef] [Green Version]
  85. Jaeger, G. The Ehrenfest Classification of Phase Transitions: Introduction and Evolution. Arch. Hist. Exact Sci. 1998, 53, 51–81. [Google Scholar] [CrossRef]
  86. Dunford, N.; Schwartz, J. Linear Operators: Spectral Operators; Wiley-Interscience: New York, NY, USA; London, UK; Sydney, Australia; Toronto, ON, Canada, 1988; Volume 7. [Google Scholar]
  87. Dembo, A.; Zeitouni, O. Large deviations techniques and applications. In Stochastic Modelling and Applied Probability; Springer: Berlin/Heidelberg, Germany, 2010; Volume 38. [Google Scholar] [CrossRef]
  88. Touchette, H. The large deviation approach to statistical mechanics. Phys. Rep. 2009, 478, 1–69. [Google Scholar] [CrossRef] [Green Version]
  89. Nasser, H.; Marre, O.; Cessac, B. Spatio-temporal spike trains analysis for large scale networks using maximum entropy principle and Monte-Carlo method. J. Stat. Mech. 2013, P03006. [Google Scholar] [CrossRef] [Green Version]
  90. Fernandez, R.; Maillard, G. Chains with complete connections: General theory, uniqueness, loss of memory and mixing properties. J. Stat. Phys. 2005, 118, 555–588. [Google Scholar] [CrossRef] [Green Version]
  91. Galves, A.; Löcherbach, E. Infinite Systems of Interacting Chains with Memory of Variable Length-A Stochastic Model for Biological Neural Nets. J. Stat. Phys. 2013, 151, 896–921. [Google Scholar] [CrossRef] [Green Version]
  92. Fernández, R.; Galves, A. Markov approximations of chains of infinite order. Bull. Braz. Math. Soc. (N.S.) 2002, 33, 295–306. [Google Scholar] [CrossRef]
  93. Ruelle, D. Statistical mechanics of a one-dimensional lattice gas. Commun. Math. Phys. 1968, 9, 267–278. [Google Scholar] [CrossRef] [Green Version]
  94. Ny, A.L. Introduction to (Generalized) Gibbs Measures. Ensaios Mat. 2008, 15, 1–126. [Google Scholar]
  95. Fernandez, R.; Gallo, S. Regular g-measures are not always Gibbsian. Electron. Commun. Probab. 2011, 16, 732–740. [Google Scholar] [CrossRef]
  96. Yger, P.; Spampinato, G.L.; Esposito, E.; Lefebvre, B.; Deny, S.; Gardella, C.; Stimberg, M.; Jetter, F.; Zeck, G.; Picaud, S.; et al. A spike sorting toolbox for up to thousands of electrodes validated with ground truth recordings in vitro and in vivo. eLife 2018, 7, e34518. [Google Scholar] [CrossRef]
  97. Buccino, A.P.; Hurwitz, C.L.; Magland, J.; Garcia, S.; Siegle, J.H.; Hurwitz, R.; Hennig, M.H. SpikeInterface, a unified framework for spike sorting. eLife 2020, 9, e61834. [Google Scholar] [CrossRef] [PubMed]
  98. Nasser, H.; Cessac, B. Parameters estimation for spatio-temporal maximum entropy distributions: Application to neural spike trains. Entropy 2014, 16, 2244–2277. [Google Scholar] [CrossRef] [Green Version]
  99. Cessac, B.; Kornprobst, P.; Kraria, S.; Nasser, H.; Pamplona, D.; Portelli, G.; Viéville, T. PRANAS: A New Platform for Retinal Analysis and Simulation. Front. Neuroinform. 2017, 11, 49. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  100. Stiefel, K.M.; Fellous, J.M.; Thomas, P.J.; Sejnowski, T.J. Intrinsic subthreshold oscillations extend the influence of inhibitory synaptic inputs on cortical pyramidal neurons. Eur. J. Neurosci. 2010, 31, 1019–1026. [Google Scholar] [CrossRef]
  101. Cessac, B.; Paugam-Moisy, H.; Viéville, T. Overview of facts and issues about neural coding by spikes. J. Physiol. Paris 2010, 104, 5–18. [Google Scholar] [CrossRef]
  102. Cessac, B.; Ny, A.L.; Löcherbach, E. On the mathematical consequences of binning spike trains. Neural Comput. 2017, 29, 146–170. [Google Scholar] [CrossRef] [Green Version]
  103. Pillow, J.W.; Shlens, J.; Paninski, L.; Sher, A.; Litke, A.; Chichilnisky, E.J.; Simoncelli, E. Spatio-temporal correlations and visual signaling in a complete neuronal population. Nature 2008, 454, 995–999. [Google Scholar] [CrossRef] [Green Version]
  104. Cessac, B.; Cofré, R. Spike train statistics and Gibbs distributions. J. Physiol. Paris 2013, 107, 360–368. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  105. Hammersley, J.M.; Clifford, P. Markov Fields on Finite Graphs and Lattices. Computer Science. Available online: http://www.statslab.cam.ac.uk/~grg/books/hammfest/hamm-cliff.pdf (accessed on 14 November 2020).
  106. Moussouris, J. Gibbs and Markov Random Systems with Constraints. J. Stat. Phys. 1974, 10, 11–33. [Google Scholar] [CrossRef]
  107. Cofré, R.; Cessac, B. Exact computation of the maximum entropy potential of spiking neural networks models. Phys. Rev. E 2014, 89, 052117. [Google Scholar] [CrossRef] [Green Version]
  108. Herzog, R.; Escobar, M.J.; Cofre, R.; Palacios, A.G.; Cessac, B. Dimensionality Reduction on Spatio-Temporal Maximum Entropy Models of Spiking Networks. bioRxiv 2018. [Google Scholar] [CrossRef] [Green Version]
  109. Ermentrout, G.B.; Terman, D.H. Mathematical Foundations of Neuroscience, 1st ed.; Springer: Berlin/Heidelberg, Germany, 2010. [Google Scholar]
  110. Lapicque, L. Recherches quantitatives sur l’excitation électrique des nerfs traitée comme une polarisation. J. Physiol. Pathol. Gen. 1907, 9, 620–635. [Google Scholar]
  111. Hodgkin, A.; Huxley, A. A quantitative description of membrane current and its application to conduction and excitation in nerve cells. J. Physiol. 1952, 117, 500–544. [Google Scholar] [CrossRef] [PubMed]
  112. Destexhe, A.; Mainen, Z.F.; Sejnowski, T. Kinetic Models of Synaptic Transmission; MIT Press: Cambridge, MA, USA, 1998; pp. 1–26. [Google Scholar]
  113. Soula, H.; Beslon, G.; Mazet, O. Spontaneous dynamics of asymmetric random recurrent spiking neural networks. Neural Comput. 2006, 18, 60–79. [Google Scholar] [CrossRef] [Green Version]
  114. Cessac, B. A discrete time neural network model with spiking neurons. Rigorous results on the spontaneous dynamics. J. Math. Biol. 2008, 56, 311–345. [Google Scholar] [CrossRef] [PubMed]
  115. Bühlmann, P.; Wyner, A.J. Variable length Markov chains. Ann. Stat. 1999, 27, 480–513. [Google Scholar] [CrossRef]
  116. Mächler, M.; Bühlmann, P. Variable length Markov chains: Methodology, computing, and software. J. Comput. Grap. Stat. 2004, 13, 435–455. [Google Scholar] [CrossRef] [Green Version]
  117. Cessac, B.; Viéville, T. On Dynamics of Integrate-and-Fire Neural Networks with Adaptive Conductances. Front. Neurosci. 2008, 2. [Google Scholar] [CrossRef] [Green Version]
  118. Monteforte, M.; Wolf, F. Dynamic flux tubes form reservoirs of stability in neuronal circuits. Phys. Rev. X 2012, 2, 041007. [Google Scholar] [CrossRef] [Green Version]
  119. Lindner, B.; Schimansky-Geier, L. Transmission of noise coded versus additive signals through a neuronal ensemble. Phys. Rev. Lett. 2001, 86, 2934. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  120. Brunel, N. Dynamics of Sparsely Connected Networks of Excitatory and Inhibitory Spiking Neurons. J. Comput. Neurosci. 2000, 8, 183–208. [Google Scholar] [CrossRef]
  121. Schuecker, J.; Diesmann, M.; Helias, M. Modulated escape from a metastable state driven by colored noise. Phys. Rev. E-Stat. Nonlinear Soft Matter Phys. 2015, 92, 052119. [Google Scholar] [CrossRef] [Green Version]
  122. Cessac, B.; Ampuero, I.; Cofre, R. Linear Response for Spiking Neuronal Networks with Unbounded Memory. arXiv 2020, arXiv:1704.05344. [Google Scholar]
  123. Galves, A.; Löcherbach, E.; Pouzat, C.; Presutti, E. A system of interacting neurons with short term plasticity. J. Stat. Phys. 2019, 178, 869–892. [Google Scholar] [CrossRef] [Green Version]
  124. Galves, A.; Löcherbach, E. Stochastic chains with memory of variable length. In Festschrift in Honour of the 75th Birthday of Jorma Rissanen. Available online: https://arxiv.org/pdf/0804.2050.pdf (accessed on 14 November 2020).
  125. De Masi, A.; Galves, A.; Löcherbach, E.; Presutti, E. Hydrodynamic Limit for Interacting Neurons. J. Stat. Phys. 2014, 158, 866–902. [Google Scholar] [CrossRef] [Green Version]
  126. Fournier, N.; Löcherbach, E. On a toy model of interacting neurons. Ann. L’Institut Henri Poincare (B) Probab. Stat. 2016, 52, 1844–1876. [Google Scholar] [CrossRef] [Green Version]
  127. Yaginuma, K. A Stochastic System with Infinite Interacting Components to Model the Time Evolution of the Membrane Potentials of a Population of Neurons. J. Stat. Phys. 2016, 163, 642–658. [Google Scholar] [CrossRef] [Green Version]
  128. Hodara, P.; Löcherbach, E. Hawkes processes with variable length memory and an infinite number of components. Adv. App. Probab. 2017, 49. [Google Scholar] [CrossRef] [Green Version]
  129. Ferrari, P.A.; Maass, A.; Martínez, S.; Ney, P. Cesàro mean distribution of group automata starting from measures with summable decay. Ergod. Theory Dyn. Syst. 2000. [Google Scholar] [CrossRef] [Green Version]
  130. Comets, F.; Fernández, R.; Ferrari, P.A. Processes with long memory: Regenerative construction and perfect simulation. Ann. Appl. Probab. 2002, 12, 921–943. [Google Scholar] [CrossRef]
  131. Kirst, C.; Timme, M. How precise is the timing of action potentials? Front. Neurosci. 2009, 3, 2–3. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  132. Cessac, B. A view of Neural Networks as dynamical systems. Int. J. Bifurcations Chaos 2010, 20, 1585–1629. [Google Scholar] [CrossRef] [Green Version]
  133. Rudolph, M.; Destexhe, A. Analytical Integrate and Fire Neuron models with conductance-based dynamics for event driven simulation strategies. Neural Comput. 2006, 18, 2146–2210. [Google Scholar] [CrossRef] [Green Version]
  134. FitzHugh, R. Impulses and physiological states in models of nerve membrane. Biophys. J. 1961, 1, 445–466. [Google Scholar] [CrossRef] [Green Version]
  135. Nagumo, J.; Arimoto, S.; Yoshizawa, S. An active pulse transmission line simulating nerve axon. Proc. IRE 1962, 50, 2061–2070. [Google Scholar] [CrossRef]
  136. Morris, C.; Lecar, H. Voltage oscillations in the barnacle giant muscle fiber. Biophys. J. 1981, 35, 193–213. [Google Scholar] [CrossRef] [Green Version]
  137. Izhikevich, E. Dynamical Systems in Neuroscience: The Geometry of Excitability and Bursting; The MIT Press: Cambridge, MA, USA, 2007. [Google Scholar]
  138. Lampl, I.; Yarom, Y. Subthreshold oscillations of the membrane potential: A functional synchronizing and timing device. J. Neurophysiol. 1993, 70, 2181–2186. [Google Scholar] [CrossRef] [Green Version]
  139. Engel, T.A.; Schimansky-Geier, L.; Herz, A.V.; Schreiber, S.; Erchova, I. Subthreshold membrane-potential resonances shape spike-train patterns in the entorhinal cortex. J. Neurophysiol. 2008, 100, 1576–1589. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  140. Ma, S. Modern Theory of Critical Phenomena; Routledge: New York, NY, USA, 2001. [Google Scholar] [CrossRef]
  141. Mastromatteo, I.; Marsili, M. On the criticality of inferred models. J. Stat. Mech. 2011, P10012. [Google Scholar] [CrossRef]
  142. Yang, C.N.; Lee, T.D. Statistical Theory of Equations of State and Phase Transitions. I. Theory of Condensation. Phys. Rev. 1952, 87, 404–409. [Google Scholar] [CrossRef]
  143. Lee, T.D.; Yang, C.N. Statistical Theory of Equations of State and Phase Transitions. II. Lattice Gas and Ising Model. Phys. Rev. 1952, 87, 410–419. [Google Scholar] [CrossRef]
  144. Privman, V.; Fisher, M.E. Universal Critical Amplitudes in Finite-Size Scaling. Phys. Rev. B 1984, 30, 322–327. [Google Scholar] [CrossRef]
  145. Dyson, F.J. Existence of a phase-transition in a one-dimensional Ising ferromagnet. Comm. Math. Phys. 1969, 12, 91–107. [Google Scholar] [CrossRef]
  146. Venegeroles, R. Thermodynamic phase transitions for Pomeau-Manneville maps. Phys. Rev. E Stat. Nonlinear Soft Matter Phys. 2012, 86, 021114. [Google Scholar] [CrossRef] [Green Version]
  147. Collet, P.; Galves, A. Chains of Infinite Order, Chains with Memory of Variable Length, and Maps of the Interval. J. Stat. Phys. 2012, 149, 73–85. [Google Scholar] [CrossRef] [Green Version]
  148. Tkačik, G.; Marre, O.; Amodei, D.; Schneidman, E.; Bialek, W.; Berry, M.J. Searching for collective behavior in a large network of sensory neurons. PLoS Comput. Biol. 2014, 10, e1003408. [Google Scholar] [CrossRef] [Green Version]
  149. Ruelle, D. Is our mathematics natural? The case of equilibrium statistical mechanics. Bull. Am. Math. Soc. 1988, 19, 259–268. [Google Scholar] [CrossRef] [Green Version]
  150. Wigner, E.P. The unreasonable effectiveness of mathematics in the natural sciences. Richard Courant Lecture in Mathematical Sciences delivered at New York University, May 11, 1959. Commun. Pure Appl. Math. 1960, 13, 1–14. [Google Scholar] [CrossRef]
  151. Lesk, A.M. The unreasonable effectiveness of mathematics in molecular biology. Math. Intell. 2000, 22, 28–37. [Google Scholar] [CrossRef]
  152. Faugeras, O.; Touboul, J.; Cessac, B. A constructive mean field analysis of multi population neural networks with random synaptic weights and stochastic inputs. Front. Comput. Neurosci. 2009, 3. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  153. Schuecker, J.; Goedeke, S.; Dahmen, D.; Helias, M. Functional methods for disordered neural networks. arXiv 2016, arXiv:1605.06758. [Google Scholar]
  154. Helias, M.; Dahmen, D. Statistical Field Theory for Neural Networks; Lecture Notes in Physics; Springer: Berlin/Heidelberg, Germany, 2020; Volume 970. [Google Scholar]
  155. Tkacik, G.; Mora, T.; Marre, O.; Amodei, D.; Palmer, S.E.; Berry, M.J.; Bialek, W. Thermodynamics and signatures of criticality in a network of neurons. Proc. Natl. Acad. Sci. USA 2015, 112, 11508–11513. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  156. Faugeras, O.; MacLaurin, J. A large deviation principle for networks of rate neurons with correlated synaptic weights. BMC Neurosci. 2013, 14 (Suppl. 1), P252. [Google Scholar] [CrossRef] [Green Version]
  157. Faugeras, O.; Maclaurin, J. Asymptotic description of stochastic neural networks. I. Existence of a large deviation principle. Comptes Rendus Math. 2014, 352, 841–846. [Google Scholar] [CrossRef] [Green Version]
  158. Ost, G.; Reynaud-Bouret, P. Sparse space-time models: Concentration inequalities and Lasso. Ann. l’IHP Probab. Stat. 2020, 56, 2377–2405. [Google Scholar]
  159. Reynaud-Bouret, P.; Rivoirard, V.; Grammont, F.; Tuleau-Malot, C. Goodness-of-fit tests and nonparametric adaptive estimation for spike train analysis. J. Math. Neurosci. 2014, 4, 3. [Google Scholar] [CrossRef] [Green Version]
  160. Delarue, F.; Inglis, J.; Rubenthaler, S.; Tanré, E. Global solvability of a networked integrate-and-fire model of McKean-Vlasov type. Ann. Appl. Probab. 2015, 25, 2096–2133. [Google Scholar] [CrossRef]
  161. Cormier, Q.; Tanré, E.; Veltz, R. Hopf Bifurcation in a Mean-Field Model of Spiking Neurons. arXiv 2020, arXiv:2008.11116. [Google Scholar]
  162. Lambert, R.C.; Tuleau-Malot, C.; Bessaih, T.; Rivoirard, V.; Bouret, Y.; Leresche, N.; Reynaud-Bouret, P. Reconstructing the functional connectivity of multiple spike trains using Hawkes models. J. Neurosci. Methods 2018, 297, 9–21. [Google Scholar] [CrossRef] [PubMed]
  163. Albert, M.; Bouret, Y.; Fromont, M.; Reynaud-Bouret, P. Surrogate data methods based on a shuffling of the trials for synchrony detection: The centering issue. Neural Comput. 2016, 28, 2352–2392. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  164. Bressloff, P.C.; Coombes, S. Dynamics of strongly coupled spiking neurons. Neural Comput. 2000, 12, 91–129. [Google Scholar] [CrossRef] [PubMed]
  165. Falconer, K.J. The Geometry of Fractal Sets; Cambridge University Press: Cambridge, UK, 1985. [Google Scholar]
  166. Falconer, K. Techniques in Fractal Geometry; John Wiley & Sons, Ltd.: Chichester, UK, 1997. [Google Scholar]
  167. Barnsley, M.; Rising, H. Fractals Everywhere; Elsevier Science: Amsterdam, The Netherlands, 1993. [Google Scholar]
  168. McKenna, T.M.; McMullen, T.A.; Shlesinger, M.F. The brain as a dynamic physical system. Neuroscience 1994, 60, 587–605. [Google Scholar] [CrossRef]
  169. Hutt, A.; beim Graben, P. Sequences by Metastable Attractors: Interweaving Dynamical Systems and Experimental Data. Front. Appl. Math. Stat. 2017. [Google Scholar] [CrossRef] [Green Version]
  170. Deco, G.; Jirsa, V.K. Ongoing Cortical Activity at Rest: Criticality, Multistability, and Ghost Attractors. J. Neurosci. 2012, 32, 3366–3375. [Google Scholar] [CrossRef] [Green Version]
  171. Chang, C.; Glover, G.H. Time–frequency dynamics of resting-state brain connectivity measured with fMRI. NeuroImage 2010, 50, 81–98. [Google Scholar] [CrossRef] [Green Version]
  172. Hutchison, R.M.; Womelsdorf, T.; Allen, E.A.; Bandettini, P.A.; Calhoun, V.D.; Corbetta, M.; Della Penna, S.; Duyn, J.H.; Glover, G.H.; Gonzalez-Castillo, J.; et al. Dynamic functional connectivity: Promise, issues, and interpretations. Mapping the Connectome. NeuroImage 2013, 80, 360–378. [Google Scholar] [CrossRef] [Green Version]
  173. Allen, E.A.; Damaraju, E.; Plis, S.M.; Erhardt, E.B.; Eichele, T.; Calhoun, V.D. Tracking Whole-Brain Connectivity Dynamics in the Resting State. Cereb. Cortex 2012, 24, 663–676. [Google Scholar] [CrossRef]
  174. Cabral, J.; Kringelbach, M.L.; Deco, G. Functional connectivity dynamically evolves on multiple time-scales over a static structural connectome: Models and mechanisms. Functional Architecture of the Brain. NeuroImage 2017, 160, 84–96. [Google Scholar] [CrossRef] [PubMed]
  175. Vohryzek, J.; Deco, G.; Cessac, B.; Kringelbach, M.L.; Cabral, J. Ghost Attractors in Spontaneous Brain Activity: Recurrent Excursions Into Functionally-Relevant BOLD Phase-Locking States. Front. Syst. Neurosci. 2020, 14, 20. [Google Scholar] [CrossRef] [PubMed]
  176. Bialek, W. Biophysics: Searching for Principles; Princeton University Press: Princeton, NJ, USA, 2012. [Google Scholar]
Figure 1. Example of fluctuations of observables. The top row shows four measures of fluctuation for the observable $x_0^1 \cdot x_1^2$; the same analysis is shown in the bottom row for the observable $x_1^1 \cdot x_0^2$. The first column shows the sample average for different sample sizes; the averages converge towards the theoretical value, as predicted by the law of large numbers. The second column shows the Gaussians fitted to the histograms of the averages obtained for the sample sizes indicated in the legend (see (10)). The third column shows the large deviations rate function of each observable: the abscissa is the parameter $s$ in (11) and the ordinate is $I_f(s)$, where $f$ denotes the observable $x_0^1 \cdot x_1^2$ (top) or $x_1^1 \cdot x_0^2$ (bottom). The minimum of $I_f(s)$ indicates the expected value of $f$ (law of large numbers), and its behaviour in a neighbourhood of the minimum characterises the central limit theorem, as explained in Section 2.3.4. The expected values of both observables are determined by the constraints imposed in the maximum entropy problem. The fourth column shows the auto-correlations obtained using Formula (7).
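As a toy illustration of the convergence shown in the first two columns of Figure 1, the Python sketch below estimates the empirical average $A_T(f)$ of the observable $f(x) = x_0^1 \cdot x_1^2$ from synthetic spike trains of increasing length. The spikes here are drawn as independent Bernoulli variables, an assumption made for this sketch only (it is not the maximum entropy model used to produce the figure), and all function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_spikes(n_neurons, T, p=0.3):
    """Synthetic spike train: independent Bernoulli spikes (illustration only)."""
    return rng.binomial(1, p, size=(n_neurons, T))

def empirical_average(spikes, dt=1):
    """Empirical average of f(x) = x^1_0 * x^2_dt over a spike train."""
    x1, x2 = spikes[0, :-dt], spikes[1, dt:]
    return np.mean(x1 * x2)

# Law of large numbers: the empirical average approaches p^2 = 0.09 as T grows,
# and its fluctuations shrink roughly as 1/sqrt(T) (central limit theorem).
for T in (100, 1_000, 10_000, 100_000):
    samples = [empirical_average(simulate_spikes(2, T)) for _ in range(200)]
    print(f"T={T:>6}: mean={np.mean(samples):.4f}, std={np.std(samples):.4f}")
```

Fitting a Gaussian to the 200 sample averages at each $T$ reproduces the narrowing histograms of the second column of the figure.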
Figure 2. From experimental spike trains to mathematical modelling. (A) Experimental set-up. Multi-electrode arrays (MEA) detect spikes from living neuronal tissue. In this illustration, a mammalian retina is placed on the MEA and submitted to natural light stimuli. The activity of retinal ganglion cells is recorded and analysed to extract spikes using spike sorting algorithms [96,97]. (B) Mathematical models of biophysically inspired spiking networks can be used to study spike trains. Top: neurons, considered here as points on a lattice, interact via synaptic connections on an oriented weighted graph whose matrix of weights is denoted W. Bottom: a prominent mathematical class of models is the Integrate-and-Fire model, where the membrane potential follows a stochastic differential equation (black trajectory) with a threshold condition θ; the neuron is considered to spike whenever its membrane potential reaches the threshold, after which the potential is reset to a constant value. Binning time into windows of a few milliseconds, one can associate the continuous-time trajectory of the membrane potential with a discrete-time sequence of 0s and 1s characterising the spike state of the neuron in each time window. (C) Spike trains. Using the binary representation at the bottom of (B) for each neuron in a network, one obtains sequences of binary spike patterns (spike trains) that symbolically represent the underlying neuronal dynamics.
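The discrete-time picture described in panel (B) of Figure 2 can be sketched in a few lines: a leaky Integrate-and-Fire neuron driven by noise, with threshold and reset, followed by binning of its spike times into binary windows. All parameter values and function names below are illustrative choices for the sketch, not those of any specific model discussed in the text.

```python
import numpy as np

rng = np.random.default_rng(1)

def lif_spike_train(T_steps, dt=0.001, tau=0.02, v_reset=0.0, theta=1.0,
                    drive=60.0, noise=3.0):
    """Discrete-time (Euler) leaky Integrate-and-Fire neuron.

    Returns the membrane potential trace and the binary spike sequence
    (1 when the threshold theta is reached at that time step, else 0).
    """
    v = np.zeros(T_steps)
    spikes = np.zeros(T_steps, dtype=int)
    for n in range(1, T_steps):
        dv = (-v[n - 1] / tau + drive) * dt + noise * np.sqrt(dt) * rng.standard_normal()
        v[n] = v[n - 1] + dv
        if v[n] >= theta:          # threshold crossing: emit a spike ...
            spikes[n] = 1
            v[n] = v_reset         # ... and reset the membrane potential
    return v, spikes

def bin_spikes(spikes, bin_size=10):
    """Binary spike state per time window: 1 if at least one spike in the window."""
    n_bins = len(spikes) // bin_size
    return spikes[:n_bins * bin_size].reshape(n_bins, bin_size).max(axis=1)

v, s = lif_spike_train(10_000)
print("binned spike pattern (first 40 windows):", bin_spikes(s)[:40])
```

Running this for each neuron of a network yields the binary spike patterns of panel (C).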
Figure 3. Spike block transition. Example of a legal transition $w_n \to w_{n+1}$ between blocks of range four ($R = 4$). The two blocks are $w_n \equiv x_{n,n+3}$ and $w_{n+1} \equiv x_{n+1,n+4}$, and they have the block $x_{n+1,n+3}$ (light blue) in common.
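The legality condition illustrated in Figure 3 has a simple operational form: a transition $w_n \to w_{n+1}$ between blocks of range $R$ is legal when the last $R-1$ spike patterns of the first block coincide with the first $R-1$ spike patterns of the second. The sketch below checks this for two-neuron blocks of range $R=4$; the integer encoding of a block as $w_n$ is one possible convention, chosen here only for illustration.

```python
import numpy as np

def block_to_index(block):
    """Map a spike block (N x R binary array) to an integer w_n encoding it."""
    bits = block.flatten(order="F")            # column by column: pattern by pattern
    return int(bits @ (2 ** np.arange(bits.size)))

def legal_transition(block_a, block_b):
    """A transition w_n -> w_(n+1) is legal when the last R-1 spike patterns of
    block_a coincide with the first R-1 spike patterns of block_b."""
    return np.array_equal(block_a[:, 1:], block_b[:, :-1])

# Two neurons, range R = 4: block_b shifts block_a one step to the left and
# appends a new spike pattern, so the transition is legal.
block_a = np.array([[0, 1, 1, 0],
                    [1, 0, 1, 1]])
block_b = np.array([[1, 1, 0, 1],
                    [0, 1, 1, 0]])
print(legal_transition(block_a, block_b), block_to_index(block_a), block_to_index(block_b))
```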
Figure 4. Signatures of criticality. Generic plot of the heat capacity $C_T$ versus temperature $T$ for maximum entropy models built by constraining the firing rates and pairwise correlations of retinal ganglion cells responding to naturalistic stimuli [74]. A clear peak develops at $T = 1$ as groups of an increasingly large number of neurons are considered (thermodynamic limit).
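To make the quantity plotted in Figure 4 concrete, the following sketch computes the heat capacity $C_T = \mathrm{Var}(E)/T^2$ of a small pairwise maximum entropy (Ising-like) model by exact enumeration, with the temperature $T$ scaling the energy in the Boltzmann weight $e^{-E/T}$. The fields and couplings are random and purely illustrative, not fitted to retinal data, and the code is a minimal sketch rather than the procedure used in [74].

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)

def heat_capacity(h, J, T):
    """Heat capacity C_T = Var(E)/T^2 of a pairwise maximum entropy (Ising-like)
    model over binary spike patterns, computed by exact enumeration (small N only)."""
    N = len(h)
    patterns = np.array(list(itertools.product([0, 1], repeat=N)))
    E = -(patterns @ h + np.einsum("ki,ij,kj->k", patterns, J, patterns))
    w = np.exp(-E / T)
    p = w / w.sum()
    mean_E = p @ E
    return (p @ (E - mean_E) ** 2) / T**2

N = 8
h = rng.normal(-1.0, 0.5, size=N)            # illustrative biases
J = np.triu(rng.normal(0.0, 0.3, size=(N, N)), 1)   # pairwise couplings, i < j

for T in (0.5, 0.8, 1.0, 1.2, 2.0):
    print(f"T={T:.1f}  C_T={heat_capacity(h, J, T):.3f}")
```

Scanning $T$ for models fitted to groups of increasing size is what produces the growing peak near $T=1$ in the figure.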
Table 1. Types of Gibbs measures potentially found in experimental data analysis or in the analysis of mathematical models of networks of interacting spiking neurons.
Thermodynamic Formalism and Gibbs Measures

Number of neurons    Memory of the potential
                     Memoryless                   Finite                         Infinite
Finite               Boltzmann-Gibbs              Gibbs in the sense of Bowen    Chains with complete connections
Infinite             Countable state Bernoulli    Countable state Markov         Chains with variable length
