2. Summary and Outline
Strong interactions are described by a Yang–Mills theory, which generally involves CP-odd parameters through the masses of the quarks as well as through the topological term. The Lagrangian in Euclidean spacetime is
where we use the convention
,
for the Lie algebra generators
and the structure constants
. Above,
with
being the field strength tensors.
(
) is the Hodge dual of
. The covariant derivative takes the form
when
lives in the fundamental representation of the gauge group and
when
lives in the adjoint representation.
The Euclidean gamma matrices
are obtained from the Minkowskian counterparts
(where
and
with
the Pauli matrices) via
These matrices satisfy the Clifford algebras
Following Ref. [
7], we use the same
in Euclidean spacetime as for Minkowski spacetime (In Ref. [
7], it is mistakenly stated that
.)
Since we mostly work in Euclidean space in this paper, we drop the hat on the Euclidean gamma matrices from now on. Note that the $\gamma^5$ in Euclidean spacetime used here may differ from that used in some other papers, e.g., Ref. [11], by a minus sign. The chirality in Euclidean spacetime is thus defined oppositely in these two notations. The only effect of this appears in applying the Atiyah–Singer index theorem [12] when, e.g., deriving the anomalous axial current by counting the zero modes of the massless Dirac operator.
We note that the terms
and
are both parity-odd and charge-conjugation even [13]. The CP-odd parameters are therefore $\alpha$, the phase pertaining to the mass $m$ of the fermion $\psi$, and $\theta$, the coefficient of the topological term. The fermion $\psi$ in the fundamental representation is referred to as the quark. To focus on the principal aspects, we take just one quark flavor and the gauge group $SU(2)$ for most of the discussion. This is the minimal setup that allows us to study the interplay of the CP-odd parameters $\alpha$ and $\theta$. In particular, we are interested in how $\alpha$ and $\theta$ appear in the effective interaction (known as the ’t Hooft operator) that captures nonperturbative effects associated with the chiral anomaly. In the single-flavor model, this operator has the same form as a quark mass term, so the two would be hard to distinguish. Nonetheless, the main question of whether and how $\alpha$ and $\theta$ appear in the effective ’t Hooft operator can be answered within this setup. The generalization to the phenomenologically relevant case of several flavors is presented in Ref. [7]. In strong interactions, the group $SU(2)$ can be viewed as embedded within the $SU(3)$ color group. The technical details of this construction are reviewed in Ref. [14].
It is well known that the presence of a CP-odd Lagrangian term does not readily imply CP-violating physical effects. A necessary condition is the existence of a CP-odd combination of the Lagrangian terms that is invariant under field redefinitions. In the present case, this condition is met, as the parameter $\bar\theta$, which combines $\theta$ and the mass phase $\alpha$, is invariant under redefinitions of the quark fields, in particular under anomalous chiral transformations. The chiral anomaly [15,16] also implies that $\theta$ is an angular variable, i.e., all observables must be $2\pi$-periodic in $\theta$. Therefore, the integral that multiplies $\theta$ must be an integer in order to contribute to the action and thereby to the partition function.
Here, we ask whether a nonvanishing $\bar\theta$ is also a sufficient condition for CP violation in strong interactions. Does $\bar\theta$ have physical effects? In particular, is there a neutron EDM depending on its value? Evidently, $\theta$ has no impact on the classical equations of motion since the topological term is a total derivative. Nonetheless, under certain assumptions, based on the fact that the third homotopy group of the gauge group or one of its $SU(2)$ subgroups is $\mathbb{Z}$, the energy functional is periodic under so-called large gauge transformations. The situation is therefore reminiscent of a periodic quantum mechanical potential in a crystal, and $\theta$ would then correspond to the crystal momentum [4,5,6,11].
One way to see how far the analogy goes is to study canonical quantization of the gauge theory [
4]. Since non-Abelian gauge theories are typically handled through functional quantization, this possibility has not yet been investigated in all aspects pertinent to the present questions. We briefly comment on canonical quantization in
Section 4 and in more detail in a separate paper [
17].
For the time being, we focus on functional quantization since it has been the principal method to carry out calculations on CP violation in strong interactions ever since the matter was brought up [
5,
6,
11,
18]. In the functional approach, we take the partition function
as the defining point of the quantum field theory, where
A is the gauge potential. We write this as a functional of external fermionic sources as a provision in order to derive quark correlation functions that may or may not exhibit CP invariance. Furthermore, we make it explicit that the integral over Euclidean spacetime is understood as a limiting procedure, taking the spacetime volume $VT$ to infinity. It is this limit that allows us to state Equation (
9) without specifying boundary conditions on the path integral, and, moreover, this way, we obtain the vacuum correlation functions of the theory [
19]. We shall review this point in
Section 3.
Now, as argued in Ref. [
11], the partition function (
9) in infinite spacetime volume $VT\to\infty$ receives its nonvanishing contributions from saddle points of finite action and fluctuations about these. For these field configurations, the winding number $\Delta n$ is an integer that labels the topological sector:
This is the desired outcome because it is consistent with $\theta$ being an angular variable, as required by the chiral anomaly. As topological quantization, i.e., integer $\Delta n$, is a consequence of $VT\to\infty$, we must carry out this limit before summing over topological sectors.
Suppose now that it is valid to organize the calculation of the path integral by adding contributions from the individual topological sectors. Then, Equation (
9) implies that the partition function should be evaluated as
The subscript $\Delta n$ on the integration measure indicates that the path integral is supposed to cover the configurations with given $\Delta n$, i.e., to sweep over the given topological sector. The contributions of the topological sectors are evaluated in the limit $VT\to\infty$ before they are added together. A rearrangement of the limits will in general lead to different results and is therefore not justified.
In
Section 3, we expand on this argument and put it on a more formal footing. The path integrals within individual topological sectors correspond to steepest-descent contours for the exponent of the Euclidean path integral. For different $\Delta n$, these contours can only be connected by configurations of infinite action. It thus follows that the arrangement of limits in Equation (
11) indeed corresponds to a good contour for the path integral in Equation (
9). Further, this formally establishes that the decomposition of the path integral into topological sectors is valid in the first place.
This brings us to the salient point: in Ref. [
7], it has been shown that the limit $VT\to\infty$ does not commute with the sum over $\Delta n$ in Equation (11), with the consequence that the quark correlations do not exhibit CP violation. While we review this technical argument in
Section 5, the basic reason is that
, see also
Section 3. As explained in
Section 7, one can thus conclude that there is no EDM for the neutron, no matter what the value of $\bar\theta$ is. Since all integer values are taken in the sum over $\Delta n$, this is where the analogy with the quantum-mechanical crystal breaks down: for the latter, the number of potential minima may be large but remains finite. Therefore, the order of the path integral and the limit of an infinite spacetime volume are not an issue in that case.
Strong interactions are thus complete without the necessity of tuning the parameter $\bar\theta$ to be small or extending the theory by additional scalar fields and nonrenormalizable operators. While this is a gratifying conclusion, scrutiny of the argument is warranted, not least because the prevalent line of reasoning arrives at the contrary verdict: in order to deduce CP violation in strong interactions, one would have to impose that the limit $VT\to\infty$ is taken last, i.e.,
and, at the same time, specify boundary conditions on the finite surfaces
, such as
where
, which corresponds to a pure gauge and implies topological quantization, i.e., $\Delta n\in\mathbb{Z}$. Equations (
12) and (
13) together are either directly or indirectly implied in the bulk of the existing literature, including the initial papers on the topic [
5,
6,
11,
18]. Although Equation (
13) may be motivated by considering fields in the classical ground state, i.e., of vanishing classical energy, on the initial and final spatial hypersurfaces, the quantum ground state also receives contributions from other field configurations. Therefore, the boundary condition (
13) does not follow from the partition function (
12) (note that Equation (
13) moreover assumes that three-dimensional space is finite), unlike the topological quantization (
10) that is implied by the partition function (
9), see
Section 3 for more detail.
To our knowledge, the published papers do not provide a conclusive reason for how the procedure given by Equations (
12) and (
13) might be deduced from the functional (
9) that defines the theory. Given that the limits do not commute, i.e., that Equation (
11) is not equivalent to Equations (
12) and (
13), as shown explicitly in
Section 5, there also cannot be such a derivation. (Note that while in Ref. [
11], it is shown that the correlators in a large but finite box depend only on the boundary conditions through the Chern–Simons flux, the issue that the limits do not commute is not addressed in that work.) Neither are we aware of an argument why Equation (
9), which is the standard textbook expression (up to the fact that we write the infinite-spacetime limit explicitly, which is a purely notational matter), might be incorrect to start with. Unless one takes $VT\to\infty$, there is also no apparent reason why the boundary condition (
13) should be physical.
In the simplest terms, the reason for CP conservation in strong interactions can thus be stated as follows:
For $\theta$ to be physical, we must have $\Delta n\in\mathbb{Z}$ since $\theta$ is an angular variable. Since fixed boundary conditions on a finite surface are not physical, this topological quantization can only follow from $VT\to\infty$.
Given the order of limits that is thus implied, there is no violation for since .
While the main conclusions and technicalities on the absence of CP violation in strong interactions have been presented in Ref. [7], one objective of the present paper is to add a more formal interpretation of the difference between Equations (9) and (11) versus Equations (12) and (13). To that end, we recall in Section 3 the reason for taking Euclidean time to infinity in the first place. Then, we show that Equation (11) corresponds to a contour integration that can be derived and assembled from steepest-descent flows, while the prescription of Equations (12) and (13) does not correspond to a connected integration contour. Since the reasoning in the present work is based on infinite Euclidean spacetime as the analytic continuation of Minkowski spacetime, we briefly comment in Section 4 on how calculations in finite Euclidean spacetimes with and without boundaries can be made meaningful, but we leave a detailed discussion to a separate paper. Next, as it allows for an explicit demonstration of CP conservation as a consequence of Equation (11) and for an intuitive interpretation of the matter, we review in Section 5 the dilute instanton gas calculation of Ref. [7]. This lays the ground to address objections concerning the volume dependence of the partition function in Section 6. Since there is no physical interpretation of fixed boundary conditions on finite Euclidean surfaces, we show that the partition function in fact shows the expected behavior when evaluated in finite volumes with open boundary conditions. Further objections are based on effective field theory (EFT) descriptions, which is why we review the role of the $\theta$ parameter in the effective ’t Hooft vertex as well as in chiral perturbation theory in Section 7. With this preparation, we can reply in Section 8 to an objection using the topological term in hadronic matrix elements and the role of $\theta$ in the effective description of the dynamics of the topological current. After some additional comments on the sampling of topological configurations in the different orders of limits and the recent literature, we wrap up and conclude in Section 9. Except for Section 7, where we work in Minkowski spacetime, we work in Euclidean spacetime throughout.
5. Dilute Instanton Gas Approximation
The most sensitive probe of possible CP violation associated with strong interactions is the EDM of the neutron. At the relevant energy scale, QCD is deeply in the nonperturbative regime. This is a well-known and obvious drawback for any analytical approximation. Yet, one can observe from semiclassical calculations that instantons play a central role in the spontaneous breaking of chiral symmetry as well as in mediating the effects of the anomalous axial $U(1)_A$ symmetry, which notably explain the large mass of the $\eta'$ meson [29,30]. It is therefore also strongly indicated that the role of the topological term can be understood from a semiclassical evaluation of the effective fermion interaction mediated by instantons, i.e., the ’t Hooft operator [29,30]. This corresponds to the expectation that the presence or absence of CP violation should persist when crossing between the strongly and weakly coupled regimes at low and high energies, respectively. Moreover, the generic arguments in
Section 3 as well as in Refs. [
7,
19], where cluster decomposition and the index theorem are used, do not refer to the semiclassical approximation.
The semiclassical approximation therefore remains of substantial interest, being the only analytic procedure to make quantitative statements about CP violation in strong interactions. It also offers a very useful perspective on the central issues of this topic.
In the present context, the semiclassical approach is given by the dilute instanton gas approximation. Stationary and quasi-stationary points of the action are described in terms of instantons and their individual collective coordinates [
10,
11,
29,
30]. Stationary points are the classical solutions. These are the minima of the action for each topological sector characterized by winding number $\Delta n$. For $\Delta n=\pm 1$, they are given by Belavin–Polyakov–Schwarz–Tyupkin (BPST) (anti-)instanton solutions [31], whose classical Yang–Mills action is $8\pi^2/g^2$.
Explicitly, the BPST instanton reads in the regular gauge
where $x_0$ and $\rho$ are free parameters corresponding to the center location of the instanton and its size, respectively. Here, $\eta_{a\mu\nu}$ are the ’t Hooft symbols [30]
Similarly, one can define $\bar\eta_{a\mu\nu}$ by a change in the sign of $\delta$ in the above equation. For the anti-instanton, we should replace $\eta_{a\mu\nu}$ with $\bar\eta_{a\mu\nu}$ in Equation (20).
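As a quick consistency check, the (anti-)self-duality of the ’t Hooft symbols can be verified by brute force. The following Python sketch assumes the widely used convention $\eta_{a\mu\nu}=\epsilon_{a\mu\nu}$ for spatial $\mu,\nu$, $\eta_{a\mu 4}=\delta_{a\mu}$, $\eta_{a4\nu}=-\delta_{a\nu}$, $\eta_{a44}=0$ with $\epsilon_{1234}=+1$, which may differ from the convention of this paper by overall signs. Indices are zero-based in the code, and all function names are illustrative.

```python
def levi_civita(*idx):
    """Totally antisymmetric symbol for 0-based indices; 0 if any index repeats."""
    if len(set(idx)) != len(idx):
        return 0
    # The parity of the number of inversions gives the sign of the permutation.
    inversions = sum(
        1
        for i in range(len(idx))
        for j in range(i + 1, len(idx))
        if idx[i] > idx[j]
    )
    return -1 if inversions % 2 else 1


def thooft_symbol(a, mu, nu, sign=+1):
    """eta_{a mu nu} for sign=+1, eta-bar_{a mu nu} for sign=-1 (assumed convention).

    a in 0..2 labels the SU(2) generator; mu, nu in 0..3, with 3 the Euclidean time index.
    """
    if mu < 3 and nu < 3:
        return levi_civita(a, mu, nu)
    if mu < 3 and nu == 3:
        return sign * (1 if a == mu else 0)
    if mu == 3 and nu < 3:
        return -sign * (1 if a == nu else 0)
    return 0  # mu == nu == 3


def dual(a, mu, nu, sign=+1):
    """Hodge dual (1/2) * eps_{mu nu rho sigma} * eta_{a rho sigma}."""
    total = 0
    for rho in range(4):
        for sigma in range(4):
            total += levi_civita(mu, nu, rho, sigma) * thooft_symbol(a, rho, sigma, sign)
    return total / 2


def contraction(a, b, sign=+1):
    """eta_{a mu nu} eta_{b mu nu}, summed over mu and nu."""
    return sum(
        thooft_symbol(a, mu, nu, sign) * thooft_symbol(b, mu, nu, sign)
        for mu in range(4)
        for nu in range(4)
    )
```

In this convention, `thooft_symbol(..., +1)` coincides with its dual and `thooft_symbol(..., -1)` with minus its dual, which is the algebraic origin of the (anti-)self-duality of the field strength of the (anti-)instanton.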
To visualize the BPST instanton (in analogy to Ref. [
32] for one-dimensional instantons), we consider some explicit expressions with
, i.e., with the center set at the origin. For example,
is symmetric in the hyperplane
and
symmetric in the hyperplane
. Without loss of generality, in
Figure 2, we show these as a function of
by taking
in arbitrary units. The field strength components read
As an example, we plot
as a function of
and
in
Figure 3. These quantities are gauge-dependent. The gauge-independent quantities are
For the anti-instanton, one would have
. We plot these in
Figure 4. From the graph, one can see that the instanton indeed has a radius characterized by the value of
(
in the plot).
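The localization of the instanton can be cross-checked by integrating the gauge-invariant action density over Euclidean space. The following Python sketch assumes the textbook form of the BPST action density, $g^2\mathcal{L}=48\rho^4/(x^2+\rho^2)^4$ for an instanton centered at the origin (an assumption about the normalization, since conventions vary), and checks numerically that the integral equals $8\pi^2$ independently of $\rho$. The function names and the radial cutoff are illustrative choices.

```python
import math


def action_density_times_g2(r, rho):
    """g^2 times the action density of a BPST instanton at radius r (assumed normalization)."""
    return 48.0 * rho**4 / (r**2 + rho**2) ** 4


def instanton_action_times_g2(rho, r_max_factor=200.0, steps=200_000):
    """Integrate the density over R^4 with the 3-sphere area 2*pi^2*r^3.

    Uses the composite Simpson rule on [0, r_max_factor * rho]; steps must be even.
    The neglected tail beyond the cutoff falls off like 1/r_max^4.
    """
    r_max = r_max_factor * rho
    h = r_max / steps

    def integrand(r):
        return 2.0 * math.pi**2 * r**3 * action_density_times_g2(r, rho)

    total = integrand(0.0) + integrand(r_max)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3.0


if __name__ == "__main__":
    for rho in (0.5, 1.0, 2.0):
        print(rho, instanton_action_times_g2(rho), 8 * math.pi**2)
```

That the result does not depend on `rho` reflects the classical scale invariance of the Yang–Mills action, which is also why the integral over the instanton size requires separate treatment.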
For $|\Delta n|>1$, the classical solutions can be obtained from the Atiyah–Drinfeld–Hitchin–Manin (ADHM) construction [33]. In the dilute gas picture, they correspond to $|\Delta n|$ instantons and no anti-instantons for $\Delta n>0$, and to $|\Delta n|$ anti-instantons and no instantons for $\Delta n<0$. The collective coordinates for the individual instantons describe their size, their gauge orientation as well as their position in Euclidean spacetime. As we follow the steepest-descent contours, the action $S$ evolves towards larger values, and we can encounter quasi-stationary points. These can be described in terms of the number $n$ of instantons and $\bar n$ of anti-instantons, where both of these objects can coexist within such configurations. In the sector (i.e., on the thimble) characterized by $\Delta n$, it must hold that $\Delta n=n-\bar n$. Each of these individual instantons and anti-instantons is again parametrized in terms of the aforementioned collective coordinates. In
Figure 5, we illustrate the typical structure of the thimbles in the semiclassical approximation.
Now, we aim to integrate out the gluon fields in order to see explicitly which quark correlations breaking the anomalous chiral symmetry $U(1)_A$ they leave behind. We follow Ref. [7], but here, we work with Euclidean time for simplicity.
The relevant quark correlation function is given by
While this is a standard expression, one should note that the numerator and denominator in this equation are not well defined in the thermodynamic limit $VT\to\infty$, even when ultraviolet divergences have been renormalized. However, this does not force us to keep $VT$ finite. Rather, divergent extensive contributions in the numerator and denominator from spacetime regions far away from $x$ and $x'$ cancel. In standard perturbation theory, these contributions are represented by vacuum diagrams where the divergence results from their overall invariance under spacetime translations. We go into more detail regarding this point in
Section 6. In the present semiclassical evaluation, we shall see how to deal with these extensive contributions a bit further down the line of argument.
To proceed with the evaluation of Equation (
26), we approximate the Green’s function of the quarks in the background of one anti-instanton ($\Delta n=-1$) as
The middle expression is the exact spectral sum representation in terms of the eigenvalues
of the Dirac operator of massive quarks in the anti-instanton background, and
are the corresponding eigenfunctions. As for the approximation on the right, by
, we denote the ‘t Hooft zero modes of the massless Dirac operator in the corresponding one anti-instanton or instanton background [
29,
30], which are purely chiral and where their handedness is indicated by
. The nonzero eigenvalues of the massless Dirac operator are given by
and the pertaining eigenmodes by
. Note that the contribution breaking chiral symmetry, i.e., the first term in the approximate expression, aligns with the CP phase $\alpha$ pertaining to the quark mass and not with the angle $\theta$. For real masses, this approximation has been used in Refs. [34,35].
In the semiclassical approximation, we carry out the path integral by taking the quasistationary configurations of the action, i.e., with $n$ instantons and $\bar n$ anti-instantons, and evaluate the leading fluctuations, i.e., the functional determinants corresponding to one-loop order. For such a quasistationary background, the Green’s function for the quarks should be well approximated by [35]
Here,
and
are the locations of instantons and anti-instantons, respectively, and
is the Green’s function of a Dirac fermion with mass
in a translation-invariant (i.e., void of instantons) background. This approximation neglects contributions from overlapping instantons, which are more suppressed as the instanton gas becomes more dilute. While the Green’s function close to the individual instantons and anti-instantons is dominated and therefore approximated by the ‘t Hooft zero modes, sufficiently far away from the points
and
, the Green’s function is given by the form in the background without instantons, i.e.,
In Equation (28), we note the alignment of the instanton-induced breaking of chiral symmetry with the quark masses, so that there is no indication of CP violation at this level. Note, however, that $\theta$ has not yet entered into the calculation.
Given the Green’s functions (
28), we can proceed with evaluating the fermion correlation on the thimble (or equivalently in a fixed topological sector) characterized by the winding number $\Delta n$:
The symbol
implies that the path integral is evaluated in terms of fluctuations and moduli about the classical background, i.e., the (quasi-)stationary point, made up from $n$ instantons and $\bar n$ anti-instantons. The integrations over collective coordinates other than the locations of the instantons and anti-instantons are denoted by
, and the Jacobians from the transformation of the zero modes in the path integral in favor of the collective coordinates are denoted by
. The one-loop determinant of the gauge field about a single instanton or anti-instanton (denoted by
below) with the zero modes omitted and divided by the gauge field determinant in the background
is given by
where the prime on the determinant indicates the omission of zero eigenvalues. In an analogous manner,
represents the modulus of the ratio of the fermionic determinants in the one-(anti)instanton and
backgrounds,
As usual, the partition function diverges in the thermodynamic limit, so we keep the spacetime volume $VT$ finite for now. Nonetheless, we need to take $VT\to\infty$ before eventually summing over the topological sectors $\Delta n$, both because the latter only arise in infinite spacetime volume and in order to remain true to the integration contour implied by Equation (9), cf. the discussion in Section 3.
In order to normalize, i.e., to divide out vacuum contributions, we also need the partition function in a fixed topological sector. Proceeding as for the fermion correlation, we obtain
Next, we turn to the collective coordinates and integrate out the location of a single anti-instanton as
The dots above represent contributions from the zero modes of the (anti)-instantons whose centers were not integrated over. This expression defines the overlap function
—a rank-two tensor in spinor space:
Further, we integrate over the remaining collective coordinates as
Notice that we ignore here the fact that for the classical instanton, the integral over the dilatational mode is divergent. The running coupling will however render the correlations finite in a more complete calculation.
Strictly speaking, the dilute instanton gas approximation is only applicable when the integral over the dilatational mode converges, i.e., when contributions from both large and small instantons are cut off. This is naturally the case in the ultraviolet, where small instantons are suppressed for asymptotically free theories as
. In the infrared, one may attempt to keep
g perturbatively small by a bespoke particle content that controls the running coupling. As a matter of principle, an infrared cutoff can also be enforced when the gauge symmetry is spontaneously broken so that the size of the instantons is limited by the inverse gauge boson mass [
29,
30]. While none of this applies to strong interactions, such considerations show that the dilute instanton gas is a meaningful concept. Note that an infrared cutoff for the instanton size has no implications for the integration over the locations $x_0$ of the instanton centers. The preservation of Poincaré symmetry, and, in particular, Lorentz invariance, demands that the values of $x_0$ remain unconstrained. Hence, the dependence of the results on the spacetime volume is unchanged in the presence of a cutoff for the instanton size. Consequently, the cutoff has no bearing on the order of the limits of infinite spacetime volume and infinite maximal absolute value of the topological charge. Note that even with the aforementioned size cutoff, either expression (11) or (12) can technically be evaluated. Further, the presence or absence of divergences from infrared instantons does not decide which order of limits must be taken because the presence of sectors of integer $\Delta n$ follows from a topological argument that does not depend on the validity of the semiclassical expansion. Finally, we note that the validity of the dilute instanton gas and its generalization toward the inclusion of interactions between instantons has been addressed in Refs. [
9,
10].
The present point of view is that the saddle point approximation in the dilute instanton gas approach, while not quantitatively applicable to strong interactions, yields information about the symmetries that are respected by the theory. This does not only apply to the present work that argues in favor of the evaluation of the partition function according to Equation (
11) but also to Refs. [
5,
18] that assume Equations (
12) and (
13). To our knowledge, Ref. [
18] is the only paper that explicitly evaluates the ’t Hooft operator for nonzero $\theta$. While the saddle point approximation is an important cross check, we note that the conclusion about the absence of CP violation does not rely on it, cf. the boxed argument in
Section 2 and in the part of
Section 3 on the evaluation of the partition function and topological quantization as well as Refs. [
7,
19]. Furthermore, as will be reviewed at the end of
Section 6, the results obtained with the dilute instanton gas can be recovered from general arguments based on cluster decomposition and the index theorem, without making use of the dilute gas approximation.
Integrating now over all locations of instantons and anti-instantons, we obtain the correlation function for fixed $\Delta n$:
where $\kappa$ is the instanton density per spacetime volume and $I_\nu(x)$ is the modified Bessel function of the first kind. (Note that in Minkowskian spacetime, we define
in Ref. [
7]. The
in both cases is the same and is real due to the fact that the Jacobian
J in Minkowski spacetime contains an additional factor of
compared to its Euclidean counterpart.)
The terms involving the overlap function are due to the instanton effects on the quarks and break chiral symmetry. While we should expect that these scale in the same way with the spacetime volume $VT$ as the term with the instanton-free Green’s function, i.e., the contribution from regions between instantons, the explicit dependence on $VT$ in Equation (38) is different. However, the scaling turns out to be the same after all. Relax for the moment the constraint of fixed $\Delta n$ and use that $\kappa$ may be interpreted as the likelihood for finding an instanton in a unit four-volume. Then, for large $VT$, the sum is dominated by a particular value of the instanton number:
Moreover, the relative fluctuation vanishes in the infinite-volume limit [
7]:
This means that in the coefficients in front of the chiral projection operators within the middle expression in Equation (
38), we can replace
. This basic behavior, i.e., that the central value for the number of instantons is given by $\kappa VT$, is also reflected by the fact that for large arguments, the modified Bessel functions become independent of their index, i.e., $I_\nu(x)\sim e^x/\sqrt{2\pi x}$. Since all the modified Bessel functions in Equation (38) tend to the same value, we see directly from this expression that there is no relative CP phase between the terms from the quark masses and instanton-induced breaking of chiral symmetry in the infinite-volume limit. Correspondingly, the partition function for fixed $\Delta n$ turns out as [36]
Now, when calculating the correlation function as the sum over the topological sectors, we have to take the limit $VT\to\infty$ first for the reasons explained in
Section 3. Because of the divergence in the thermodynamic limit, the numerator and denominator have to be treated together, and we obtain
In
Section 6, we show that this procedure amounts to dropping the divergent extensive contributions that correspond to the vacuum diagrams in standard perturbation theory. In this final result, the phase from the quark mass in the instanton-free Green’s function, cf. Equation (29), is aligned with the phase from the instanton-induced effects in the term with the overlap function, so that there are no CP-violating effects.
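The index independence of the modified Bessel functions at large argument, on which this alignment rests, can be illustrated numerically. The following Python sketch evaluates $e^{-x}I_n(x)$ for integer $n$ from the standard integral representation $I_n(x)=\frac{1}{\pi}\int_0^\pi e^{x\cos t}\cos(nt)\,dt$ with a midpoint rule. Here `x` merely stands in for an argument that grows linearly with the spacetime volume, and the function names and the number of quadrature points are illustrative assumptions.

```python
import math


def scaled_bessel_i(n, x, points=20_000):
    """exp(-x) * I_n(x) for integer n, via (1/pi) * int_0^pi exp(x (cos t - 1)) cos(n t) dt."""
    h = math.pi / points
    total = 0.0
    for k in range(points):
        t = (k + 0.5) * h  # midpoint of the k-th subinterval
        total += math.exp(x * (math.cos(t) - 1.0)) * math.cos(n * t)
    return total * h / math.pi


def sector_ratio(delta_n, x):
    """I_{delta_n + 1}(x) / I_{delta_n}(x); the common factor exp(x) cancels."""
    return scaled_bessel_i(delta_n + 1, x) / scaled_bessel_i(delta_n, x)


if __name__ == "__main__":
    for x in (10.0, 100.0, 1_000.0, 10_000.0):
        leading = 1.0 / math.sqrt(2.0 * math.pi * x)  # leading asymptotic form of exp(-x) I_n(x)
        print(x, sector_ratio(0, x), sector_ratio(2, x), scaled_bessel_i(0, x) / leading)
```

For fixed index, the printed ratios approach one as `x` grows, with corrections of relative order $1/x$, so that Bessel functions of neighboring index become interchangeable only in the limit of large argument.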
One may wonder why, in Equation (42), we take the limit $VT\to\infty$ in front of the fraction, whereas, by Equation (9), it appears that it should be taken for numerator and denominator separately in the first place. As we have noted, though, without normalization by vacuum contributions, the partition function is not well defined in the thermodynamic limit. The present procedure is necessary to divide out the extensive contributions causing the divergence. It is unique in the sense that we carry out the integrals over each steepest-descent contour before interfering them. Doing otherwise would correspond to a partitioning and reordering of the full integration contour that consists of the steepest-descent contours connected via configurations of infinite action, see Figure 1 for illustration. This amounts to an incorrect manipulation of a path integral that is not absolutely convergent.
Now, we consider what happens when the limits are ordered the other way around, i.e., when we sum over the topological sectors before taking $VT\to\infty$, according to Equations (
12) and (
13). We reiterate though that this procedure is not valid because topological quantization can only be deduced in infinite spacetime volume. As for the fermion correlation, one obtains
and for the partition function
Taking the ratio, the overall exponential factors cancel, but now there is a misalignment between the phases in the instanton-free Green’s function and in the instanton-induced term. This means that as $VT\to\infty$, there is an infinite amount of destructive interference that suppresses the statistically more likely contributions with approximately equal numbers $n$ and $\bar n$ (see Equation (
39)) in favor of outliers for which
does not go to zero. Equations (
43) and (
44), if they were correct, would signal CP-violating effects. Note that in either result, terms that break the $U(1)_A$ symmetry from both instanton-mediated effects and the quark mass $m$ are present. When we turn to the phenomenology of strong interactions and the generalization to several flavors in Section 7, we shall recall that both the breaking of chiral symmetry through the quark masses and that from instantons are necessary in order to explain the spectrum of mesons, in particular why the $\eta'$ meson is much heavier than the pions [18,29,30]. In either order of limits, this phenomenology is explained. Therefore, the meson spectrum alone cannot be used to determine the correct order of limits.
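The sensitivity to the order of limits can be made concrete in a toy model that keeps only the Bessel-function structure of the sector sums. The following Python sketch is a simplified illustration under stated assumptions, not a reproduction of Equations (42)–(44): `phi` stands for the phase accumulated per unit of winding number, `x` for an argument proportional to the spacetime volume, and the sum over sectors is truncated at a hypothetical `n_max`. It relies on the generating function $\sum_{n\in\mathbb{Z}} e^{i\varphi n} I_n(x)=e^{x\cos\varphi}$, which implies that summing over sectors at fixed `x` yields a relative phase $e^{-i\varphi}$ between the index-shifted and the unshifted sum for every `x`, whereas the ratio within a single sector tends to one at large `x`.

```python
import cmath
import math


def scaled_bessel_i(n, x, points=2_000):
    """exp(-x) * I_n(x) for integer n from the integral representation (midpoint rule)."""
    h = math.pi / points
    total = 0.0
    for k in range(points):
        t = (k + 0.5) * h
        total += math.exp(x * (math.cos(t) - 1.0)) * math.cos(n * t)
    return total * h / math.pi


def sum_sectors_then_ratio(phi, x, n_max=60):
    """Sum over winding numbers at fixed x first, then form the ratio (toy model)."""
    sectors = range(-n_max, n_max + 1)
    shifted = sum(cmath.exp(1j * phi * dn) * scaled_bessel_i(dn + 1, x) for dn in sectors)
    unshifted = sum(cmath.exp(1j * phi * dn) * scaled_bessel_i(dn, x) for dn in sectors)
    return shifted / unshifted


def ratio_in_fixed_sector(x, delta_n=0):
    """Ratio within one sector; it tends to 1 when x is taken large first."""
    return scaled_bessel_i(delta_n + 1, x) / scaled_bessel_i(delta_n, x)


if __name__ == "__main__":
    phi = 1.0
    print(sum_sectors_then_ratio(phi, 20.0), cmath.exp(-1j * phi))
    print(ratio_in_fixed_sector(2_000.0))
```

In this toy model, the phase-carrying result of the first function arises from strong cancellations among sectors, while the second function shows that each sector on its own carries no such relative phase once the argument is large.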
6. Thermodynamic Limit and Cluster Decomposition
We establish here that with the limiting procedure in Equation (
42), contributions that are divergent due to the infinite spacetime volume cancel between numerator and denominator. This corresponds to the usual cancellation of vacuum diagrams when evaluating connected correlation functions in standard perturbation theory (i.e., without expanding around nontrivial classical solutions).
The present argument is also of interest with regard to Equation (9). The partition function is defined in the limit $VT\to\infty$ in the first place. This appears as an obstacle to using the partition function as an extensive, volume-dependent quantity in line with what is familiar from thermodynamics. We shall see here that an expression with such a property can nonetheless be defined when restricting $Z$ to some subvolume of the full spacetime. Note that as such a restriction is arbitrary, no boundary conditions on the subvolume can be placed.
While we are working here at zero temperature, we note that one can use the Polyakov line at finite temperature in order to control and study the deconfinement phase transition, including contributions from the gradient expansion of the quark determinant [
37].
We use a well-known line of reasoning [
38] and consider the expectation value of an operator
in an infinite spacetime volume
and interfere different topological sectors
as
where
is the path integral measure over all fields involved. Now, let
be an operator corresponding to a correlation function evaluated for some spacetime points. For example, in Equation (
42),
. For the action, we write
to indicate that it is obtained from integrating the Lagrangian over the spacetime volume
. As for the Lagrangian, we take it not to include the topological term
. Rather, we have the function
taking care of the dependence on the topological sector.
Now, consider partitioning the spacetime volume as
so that
. We further assume that the spacetime arguments of the operator fall within
, and we write
instead of
to indicate this. We can thus write
Since there may be instantons sitting right at the boundaries of the two subvolumes,
will not be strictly integer. However, if the instanton gas is sufficiently dilute, integer winding numbers may still be an adequate approximation.
Now, as required by the cluster decomposition principle, provided
is chosen large enough,
must not depend on contributions from
to the path integral. This is generally the case when the numerator and the denominator decompose into factors that only depend on
,
or
,
, respectively. Then, the contributions from the volume
cancel between the numerator and the denominator. This generally happens when
Therefore, the contributions from the topological term, which we have left aside thus far, can indeed be accounted for through
. Note that the argument holds for either order of limits, i.e., the one from Equation (
9) that implies Equation (
11), which is imposed here, as well as for the commuted version from Equation (
12).
To carry out the limits, we write Equation (
46) as
Corresponding to Equation (
41), the integrations over the volume
lead to
The explicit exponential factors here are phases from the fermion determinants. Note that these have not been absorbed in
, which we have defined to be real.
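Now, we are aiming for an expression for the expectation value of the operator in the finite subvolume that contains its spacetime arguments, without making reference to the complementary subvolume. This proves to be possible because the contributions from the complementary subvolume can be interpreted as vacuum factors that cancel in the normalized expectation value.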
Since we must take
to have well-defined integer
, we also have to take here
. The Bessel functions with a factor
in their argument then go to a common limit so that we can factor out the sum over
. We are left with
We therefore see that taking the limits as in Equation (
42) leads to the correct cancellation of “disconnected” terms, in particular those that originate from regions that are far separated from the spacetime arguments of the observable
.
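The statement that the Bessel functions go to a common limit can be checked numerically. The following is a minimal sketch, assuming that the functions in question are modified Bessel functions of the first kind with integer order given by the winding number and with an argument that grows with the spacetime volume (as in the standard dilute instanton gas result); the helper names `scaled_bessel_i` and `bessel_ratio` are ours and purely illustrative. It uses the integral representation I_n(x) = (1/π) ∫_0^π exp(x cos t) cos(n t) dt.

```python
import math

def scaled_bessel_i(n, x, steps=20000):
    """Return exp(-x) * I_n(x) for integer n and x >= 0.

    Uses I_n(x) = (1/pi) * int_0^pi exp(x cos t) cos(n t) dt, evaluated with
    the trapezoidal rule. The factor exp(-x) avoids overflow at large x.
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(x * (math.cos(t) - 1.0)) * math.cos(n * t)
    return total * h / math.pi

def bessel_ratio(n, x):
    """Ratio I_n(x) / I_0(x); the exponential prefactors cancel."""
    return scaled_bessel_i(n, x) / scaled_bessel_i(0, x)

# For fixed order n, the ratio approaches one as the argument grows,
# so the dependence on the winding number drops out in the large-volume limit.
for x in (1.0, 1.0e2, 1.0e4):
    print(x, bessel_ratio(1, x), bessel_ratio(3, x))
```

For a fixed order, the ratio tends to one as the argument increases, which is the property that allows the sum over the winding number of the complementary volume to be factored out.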
Moreover, in Equation (
50), the
-angle from the function
does not occur anymore. We can see this as a consequence of phases incurred in
being canceled against complementary phases from
. The remaining explicit dependence on the unphysical phase
cancels when the fermionic part of the path integral
is carried out. Since the path integral here is restricted to
, which is finite, we can compute the expectation values in finite volumes after all from a partition function in the form of Equation (
12) but with the parameter
set to zero. This way, the logarithm of the partition function can be taken as an extensive quantity.
In the previous derivation, when going from Equation (
48) to (
49) we made use of the result for the partition function in the dilute instanton gas approximation, Equation (
41). However, it is worth pointing out that the latter result can be derived from the cluster decomposition principle alone, without making use of the dilute instanton gas approximation. One can start by noting that the factorization of the path integrals in the denominator in Equation (
46) can be written in terms of the following relations between the partition functions
in the full volume and their counterparts
,
for the subvolumes
,
Equation (
51) is an infinite set of identities that can be used to solve for
from a set of minimal assumptions. First, we note that
are complex. To begin with, they receive a phase
due to the
term. Further complex phases in
can only come from the phases
of the fermion masses. At least at the leading order, the fermionic path integration yields determinants of the massive Dirac operator in a background of topological charge
, which can be fully general and is not assumed to be precisely captured by the dilute instanton gas approximation. The phase of the total fermionic determinant is then fixed by the Atiyah–Singer index theorem [
12] and for a single fermion is given by
. As a consequence, one can write
Parity considerations and appropriate limits of the cluster-decomposition relation (
51) can be used to motivate the simple ansatz [
7]
Notably, the previous ansatz together with the assumption of analyticity in
gives rise to a unique solution for the infinite tower of identities in Equation (
51), which can be written as
As advertised, this recovers the result of Equation (
41) without making use of the dilute instanton gas approximation.
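That a Bessel-function form can satisfy an infinite tower of factorization identities may be illustrated numerically. The sketch below rests on two assumptions that are not spelled out in the text above: that the identities have the convolution form Z_Δn(Ω) = Σ_Δn₁ Z_Δn₁(Ω₁) Z_{Δn−Δn₁}(Ω₂), and that the moduli of the partition functions are proportional to I_Δn of an argument linear in the volume. Under these assumptions, the identities reduce to the addition theorem I_n(x + y) = Σ_k I_k(x) I_{n−k}(y), while a phase of the form exp(iΔn φ) factorizes trivially. The function names are illustrative only.

```python
import math

def bessel_i(n, x, terms=40):
    """Modified Bessel function I_n(x) for integer n from its power series."""
    n = abs(n)  # I_{-n}(x) = I_n(x) for integer order
    return sum(
        (x / 2.0) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
        for k in range(terms)
    )

def convolved(n, x1, x2, cutoff=30):
    """Sum over k of I_k(x1) * I_{n-k}(x2), truncated at |k| <= cutoff."""
    return sum(
        bessel_i(k, x1) * bessel_i(n - k, x2) for k in range(-cutoff, cutoff + 1)
    )

# x1 and x2 play the role of the two subvolumes (in hypothetical units),
# n plays the role of the total winding number.
x1, x2 = 1.5, 2.5
for n in range(0, 4):
    print(n, convolved(n, x1, x2), bessel_i(n, x1 + x2))
```

The two printed columns should agree to numerical precision for each total winding number, for arguments of order one as chosen here; the truncations of the series and of the sum are not adequate for large arguments.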
Equation (
54) can be taken even further, as it allows one to rederive the phases of fermionic correlators and confirm the conclusions of this section without using the dilute instanton gas approximation. Defining a complex mass parameter
as
the mass terms in the Lagrangian of Equation (
1) can be written as
Then, one can view the complex mass parameters as sources for integrated correlators,
As the partition functions of Equation (
54) have been derived on general grounds, the previous correlators are meant to include nonperturbative effects. Noting that the reality condition in the
parameter of Equation (
54) implies
and writing
yields the following spacetime-averaged correlators [
7]
It is readily seen that the total phase of the fermionic correlators, including nonperturbative effects, is aligned with the phases of the tree-level masses in the Lagrangian. This generalizes the result of Equation (
42) and leads again to the conclusion of no
violation. Again, the order of limits plays a crucial role in Equation (
58).
7. Effective Theories and Effective Operators
We shall now connect the results of the semiclassical approximation, which corresponds to integrating out gluons as in
Section 5, to observables probing
conservation or violation in strong interactions. The main object of interest in that context is the ’t Hooft vertex, which can be inferred from the correlation function (
42) as the Lagrangian term
This vertex generates the same correlation functions as in Equation (
42) for the EFT where gluons have been integrated out.
Figure 6 illustrates how such a model fits into the picture of the different EFTs discussed in the present context. In addition, there will in general also be nonlocal operators from the long-range interactions of the gluons, because with quark degrees of freedom still in the theory, there is no cutoff parameter that allows for a local expansion. The new operators appear in place of the gluon kinetic term
as well as the topological term
, which disappear together with the gluons.
In the dilute instanton gas approximation in
Section 5, one has integrated out gluons in the semiclassical approximation. For this to be valid, the theory should be perturbative throughout, which can be achieved in principle by adding a bespoke matter content that controls the renormalization group evolution in this peculiar way. Certainly, however, this is neither the case for the theory specified in Equation (
1) with the gauge group
and one quark flavor nor for QCD with
and three flavors of light quarks. As a consequence, one should expect substantial deviations from the correlation function given in Equation (
42), in particular for large distances between
x and
. Nonetheless, there should still be a short-distance, high-energy contribution of this form. When drawing conclusions about
conservation or violation, one therefore must make the assumption that the
-odd coefficient of the ’t Hooft vertex appears in the same way within the extra operators that have to be added in principle to account for the low-energy behavior. Note, however, that this shortcoming applies to the conclusions based on either order of limits when the calculation is carried out semiclassically.
To some extent, the above matter is addressed by the concluding argument from
Section 6, where the leading fermion correlations are constrained without the dilute instanton gas approximation, using instead cluster decomposition and the index theorem. There, no assumption about the fermion correlation is made but for its
-violating form. While the resulting fermion correlation then can only be stated in the coincident limit, the conclusions about
conservation based on the order of the infinite volume limit and the sum over topological sectors should therefore extend to the nonperturbative low-energy regime as well.
The underlying theory that we are ultimately concerned with is QCD, which is specified (now with the gauge group
,
flavors of light quarks and in
Minkowski spacetime) as (we choose
in Minkowski spacetime)
where in the mass-diagonal basis
For the model as in Equation (
60), the vertex corresponding to Equation (
59) is
where
We need to sort out in what way (cf.
Figure 6) this is connected to the EFT of hadrons that is valid at low energies and should describe those possible
-violating effects that are accessible to current precision experiments. A principal obstacle to systematically deriving quantitative predictions is, of course, that perturbation theory is no longer valid at low energies.
Yet, the symmetries, even when realized only approximately, offer a standard method of constraining the EFT. In the fundamental theory, as well as on the EFT side, one can introduce operators of the physical fields coupled to external sources (sometimes called spurions) so that these operators are invariant under local symmetry transformations. On the side of the EFT, the coefficients of these operators have to be obtained through computational or experimental matching. Variation with respect to these sources then allows one to express matrix elements of the fundamental theory in terms of parameters of the EFT.
In the present case, we can apply this method by regarding the quark masses that break the chiral flavor symmetries
as well as the operators breaking
as external sources that transform according to these explicitly broken symmetries. In the following discussion, we occasionally let
, meaning that only up and down quarks are considered, for simplicity. But for expressions explicitly depending on
, we keep
general. First, we parametrize a chiral transformation as
where
and
are independent unitary matrices. For an axial transformation,
so that the
transformations are given by
The Lagrangian (
60) would remain invariant if the mass matrix transformed as
In this transformation,
M corresponds to a spurion field.
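The spurion idea can be made concrete with a small numerical check for two flavors. This is a minimal sketch under a hypothetical convention, which need not coincide with the one used in the equations of this paper: the meson field transforms as U → R U L† and the mass matrix, promoted to a spurion, as M → L M R†, with L and R independent unitary matrices. With this choice, Tr(M U) is invariant, whereas it is not if M is held fixed. All function names and numerical values are illustrative.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def unitary(alpha, theta, phi):
    """A 2x2 unitary matrix with overall phase alpha and mixing angles theta, phi."""
    pref = cmath.exp(1j * alpha)
    s, c = math.sin(theta), math.cos(theta)
    return [[pref * c, -pref * cmath.exp(1j * phi) * s],
            [pref * cmath.exp(-1j * phi) * s, pref * c]]

def transform(u, m, left, right):
    """Assumed convention: U -> R U L^dagger and spurion M -> L M R^dagger."""
    u_new = matmul(matmul(right, u), dagger(left))
    m_new = matmul(matmul(left, m), dagger(right))
    return u_new, m_new

# Hypothetical complex diagonal mass matrix and meson field configuration.
m = [[2.0 * cmath.exp(0.3j), 0j], [0j, 5.0 * cmath.exp(-0.1j)]]
u = unitary(0.2, 0.7, 0.4)
left, right = unitary(0.5, 0.3, 1.1), unitary(-0.4, 1.0, 0.2)

u_new, m_new = transform(u, m, left, right)
print("spurion transformed:", abs(trace(matmul(m_new, u_new)) - trace(matmul(m, u))))
print("mass held fixed:    ", abs(trace(matmul(m, u_new)) - trace(matmul(m, u))))
```

The first printed difference vanishes up to rounding because the transformation matrices cancel under the trace; the second generically does not, which is the explicit breaking by a fixed mass matrix.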
The corresponding EFT Lagrangian (cf.
Figure 6) with the lowest-order terms is (see, e.g., Refs. [
8,
39,
40] where the effective theory is derived from integrating out quark fields)
where
In the equations above,
is the pion decay constant and
,
are EFT coefficients to be determined experimentally or computationally and
is directly related to the magnitude of the chiral quark condensate. The phases of the latter correspond to
, and we have assumed a diagonal mass matrix
M. The squared pion and
masses are then given by
where we have made the phenomenologically valid approximation that
. We leave the terms with the parameter
aside for now, as
is invariant under
but transforms with
in a way that we shall address shortly. In correspondence with Equation (
65), the meson fields behave under axial transformations as
The term with the parameter
should be matched so that the correct correlation functions are produced. Corresponding to the invariance of the underlying theory (
60), the Lagrangian (
67) is invariant under the simultaneous
transformations (
66) and (
70).
Now, after all, the (up and down) quark masses do not transform under
; rather, they break this symmetry explicitly. We can nevertheless regard them as local sources that perturb the correlators of the theory about the case with full
symmetry. For the local source, we can then take the fixed physical values of
M so that Equation (
67) accounts for the perturbation through the quark masses to linear order. In the EFT, one can continue this to higher orders, depending on the precision that is aimed for.
Now, consider
transformations
and recall the expression for
from Equation (
63). The fundamental theory (
60) would remain invariant if the quark mass transformed as
The chiral anomaly requires that the coefficient
of the topological term goes as
in order to keep the Lagrangian invariant. Note that this implies that the combination
is invariant under chiral rephasings and is in general nonzero. The presence of such an invariant does not, however, guarantee that it leads to physical effects.
We thus see that there are two local sources that transform under the symmetry
:
and
. Noting that under this symmetry
the EFT Lagrangian (
67) remains invariant if either
In principle, one may also allow linear combinations of the parameters
and
. As this does not follow from either order of limits for the sum over topological sectors and spacetime volume that we discuss here, we do not consider such combinations further.
We also note that the operator with the coefficient
breaks
. Therefore, instead of the quark mass phase in
M, one could also use
to write this as an invariant operator with the help of chiral-variant source fields. However, the symmetric theory should respond to
-breaking perturbations through a quark mass term in the same way as it does for
-breaking. In this sense, the term with
is unique to linear order in
M. The explicit breaking of
through instantons is independent of the quark masses, cf. Equation (
42) together with the fact that
, and, therefore,
M does not appear in the terms with
.
Now, recall that Equation (
42) leads to the effective vertex (
62) in the theory where gluons have been integrated out. At this level,
has disappeared so that the only option for the EFT Lagrangian (
67) is
The
-odd coefficients can then be removed by an overall field redefinition. On the other hand, if it were
, there would be a residual
-odd term.
Further, note that Equation (
69) shows that the mass of the
in general does not vanish in the limit of
, no matter which of the values
takes in Equation (
76). In turn, the fact that the
is heavy compared to the pions as such does not lead to a conclusion about which is the correct order of limits.
Finally, the parameter
in the coefficient of the ’t Hooft operator enters the calculation of the nucleon EDM as follows. Given the EFT Lagrangian (
67) and choosing a basis in which
M is diagonal, the minimum of the field
U is given by
as in Equation (
68), where, in the limit of
, and for
in the first quadrant, one has [
41,
42]
Going beyond the assumption
leads to a mixing of the flavor eigenstates
and
within the mass eigenstates.
In order to expand in terms of the meson fields, following Equations (
68) and (
78) leads to
The operators in the second line are
-odd, and we note that
Substituting this into the term with
in Equation (
67) and expanding in the meson fields, one generally would obtain
-violating effects if
, the most immediate consequence of which would be
(recall that the meson fields are
-odd) through the interaction term
The latter expression (which follows when assuming
) is shown here for comparison with Equation (
8) of Ref. [
43]. In the latter, the quark masses are taken as real (see Equation (
5) in [
43]), which means that the parameter
of that reference should correspond to
in Equation (
74). Matching the resulting signs for the phases of the quark masses in Equation (
8) of Ref. [
43] leads to an identification with
. It then follows that, up to the central issue of which value in Equation (
76) is taken by
, the results from the present EFT description and the partially conserved axial currents in Ref. [
43] are therefore in agreement, as they should be. We further compare the two approaches given different values of
in
Section 8. Finally, let us note also that the coefficient in Equation (
81) is different in the approximation of three light flavors where an extra factor of
occurs.
To see what the above
-odd interactions of the pions and
would imply for the nucleons, one can add their interactions to the EFT Lagrangian as
where the nucleon doublet transforms as
Again, promoting
M to a source that transforms under the axial symmetries rather than breaking these, this Lagrangian is invariant. Substituting the expectation value of the chiral condensate (
68), (
78) for small
, expanding in the meson field and applying field redefinitions
so as to obtain the canonically normalized flavor eigenstates of the nucleons, one finds the interaction terms [
41]
The first of these is
-even, as it couples two axial currents,
being a pseudoscalar field. The second term is
-odd, as it couples a scalar density with a pseudoscalar field. At the one-loop level, if it were
, this would induce an EDM through the famous diagrams shown in
Figure 7. Note that the weak interactions make an additional contribution to the neutron EDM [
44,
45], which, however, is too small and usually neglected in the discussion of
in strong interactions.