1. Introduction
Physics started as a study of motion in Greek times and was formalized by Aristotle (c. 350 BC) by a set of “laws”, which he declared as “self-evident truths”, based on his view of the universe as it was then visualized. The law for motion on the Earth was based on the nature of the object moving (how much of earth, water, air or fire there is in it), and for motion in the heavens on the “truth” that heavenly objects are made of the perfect element, aether, and hence move along perfect circles, unless they are contaminated by proximity to the Earth, in which case epicycles (perfect circles about a point moving in a perfect circle, or further repetitions thereof) develop. The closer the object is to the Earth, the more epicycles it will have. Using this law the motion of the then known “planets”: the Sun, Moon, Venus, Mars, Mercury, Jupiter and Saturn, was supposedly explained using 127 epicycles. The number was reduced to 57 by Ptolemy (c. 150 AD) by extending from motion in a plane to motion in three-dimensional space and using spheres in place of circles. The Muslim scholars followed the same way of thinking, but Al-Zarkali (c. 1050 AD) was ready to use ellipses instead of perfect circles. Later, this work, and the astronomical data of Ulugh Beg (c. 1420 AD), led Nicholas Copernicus to heliocentric planetary orbits [
1] in 1543 AD, with the Earth replacing the Sun as a planet. Johannes Kepler [
2] replaced Copernicus’ circular orbits by ellipses in 1619, which finally led Isaac Newton to his laws of motion and universal gravitation [
3] in 1687. While Newton’s law is supposedly universal, his methods work well only for two bodies and become unwieldy for several bodies. The method was extended by Joseph Louis Lagrange in 1778 and 1779, [
4] and later used by William Rowan Hamilton [
5,
6] in 1834 and 1835, for systems of particles. Newton had used Calculus for his purpose and Lagrange used the Calculus of Variations that Leonhard Euler [
7] had fully developed by 1773. It was this formulation that Emmy Noether had used, which completed the classical view and led to the modern view of mechanics.
Emmy Noether (1882–1935) was a German woman, and in those days women could not formally enter German academia. However, she had remarkable mathematical ability and was able to get the support of David Hilbert to work in the field unofficially. Finally, she had to emigrate to America to achieve her true potential. She made numerous contributions in various branches of mathematics, but we are concerned with her contribution to the use of symmetry in mechanics.
In normal parlance, “symmetry” is used in an aesthetic sense to express balance and harmony, as in William Blake’s poem, The Tyger:
“Tyger, tyger burning bright;
In the forests of the night;
What immortal hand or eye
Could frame thy fearful symmetry?”
This is almost as common as its use in the geometric sense of leaving a figure or shape invariant under some transformation, such as reflection or rotation. It is odd, however, that the common use has such a strong hold that many do not realise that one can count the symmetries of objects. (AQ has even known an expert in differential topology to object to the idea of counting symmetries.)
The geometrical concept of symmetry had come from the Greeks. However, while considering solutions of polynomial equations of degree 5 or more in 1771, Lagrange extended the concept to invariance of polynomials under permutations of their elements [
8]. While this led to many other developments in Algebra, I am here concerned with its use by Abel [
9] and Galois [
10] to invent Group Theory, so as to prove that there could be no solution of quintic or higher degree polynomial equations by means of radicals. In particular, Abel’s work led to the Abelian Group and Galois’ to the Galois Group. This inspired Sophus Lie (1842–1899) to try to emulate the success of Abel and Galois for differential equations in 1883 [
11]. Note the leap over all other algebraic equations to reach out to
all differential equations. This was overambitious and Lie never managed to complete the attempt. Nevertheless, it led to enormous developments in the solution of nonlinear differential equations. This will be discussed in the next section.
One might think that Geometry and Dynamics had no contact until the time that Albert Einstein and Marcel Grossmann used Geometry to generalize the Restricted Theory of Relativity [
12], but that is not the case at all. As pointed out by Julian Barbour [
13], starting in antiquity and going through Copernicus, Kepler, Galileo and Newton, Kinematics and then Dynamics, have been inextricably entwined with Geometry. At the base of the link between them is the idea of symmetry. Aristotle insisted on perfect circles because he perceived them as the most symmetric figures possible. Ellipses, to the contrary, were perceived as imperfect, and hence not to be used for celestial motion. It took a deeper, hidden, symmetry for the ellipses to be perceived as “beautiful” by Kepler. We will see how the hidden symmetry was uncovered by Noether. This symmetry could not have been understood till the advent of calculus and it was the geometry that used calculus, differential geometry, that Einstein and Grossmann introduced into considerations of dynamics. We will be concerned with the importance of the not-so-obvious symmetries that have become all important in modern physics.
The plan of the paper is as follows. In the next section, we will briefly review Lie’s
Symmetry Analysis and go on, in the subsequent
Section 3 and
Section 4, to review Euler’s variational principle for particles and fields, respectively, and its use by Lagrange and Hamilton in the principle of least action and the equations of motion of Hamilton and of Euler and Lagrange. A geometrical application is given in
Section 5, to generalize the concept of a straight line in flat space to curved spaces. The applications of Noether's theorem in classical mechanics, economics, classical field theory, and relativistic field theory will be discussed in
Section 6. Some extensions in obtaining Noether symmetries and Noether invariants will also be given in this section. Some applications of quantum field theory will be given in
Section 7. Complex Lie and Noether symmetries will be considered in
Section 8, and a discussion and conclusion presented in the last section.
2. Lie Symmetry Analysis
Before Lie, the usual method to solve a differential equation (DE) was by ad-hoc approaches or by approximating it by a linear DE and solving that. In general the approximation will work well enough in some domain and become arbitrarily bad in others. Thus, one would need to prove the existence of a solution and to determine the domain in which the approximation is good enough. Since these will be different for each DE, one is reduced to solving one DE at a time and cannot rely on any method for whole classes of DEs. Among the methods that had been used for solving nonlinear DEs was the transformation of independent and dependent variables. Analogous to Abel and Galois, Lie looked for invariance of the DE under such transformations [
11,
14,
15,
16], so that it could be determined when the DEs could be solved, or their order reduced, by transformation, and then proceed with the transformation. Lie used not only the groups of symmetries, but the algebra of the corresponding infinitesimal symmetry generators. The DEs are not necessarily single (scalar) but could be systems of (vector) DEs. Further, he did not restrict the domain of the DEs to be real, but took it to be complex.
For completeness we start with basic definitions so as to present the notation used. If there are $l$ independent variables, represented as a vector $\mathbf{x}$, and $m$ dependent variables, represented by $\mathbf{u}$, a Lie point symmetry generator is the operator
$$\mathbf{X} = \boldsymbol{\xi}(\mathbf{x},\mathbf{u})\cdot\frac{\partial}{\partial\mathbf{x}} + \boldsymbol{\eta}(\mathbf{x},\mathbf{u})\cdot\frac{\partial}{\partial\mathbf{u}},$$
or, using indices $a$ for the independent variables and $i$ for the dependent variables,
$$\mathbf{X} = \xi^{a}(\mathbf{x},\mathbf{u})\frac{\partial}{\partial x^{a}} + \eta^{i}(\mathbf{x},\mathbf{u})\frac{\partial}{\partial u^{i}},$$
where the Einstein summation convention, that repeated indices are summed over, has been used. Further, if the DE is of order $n$, one must prolong the space and the generators to incorporate all the derivatives of the dependent variables with respect to the independent variables. For ordinary differential equations (ODEs) the prolonged generator is
$$\mathbf{X}^{[n]} = \mathbf{X} + \sum_{k=1}^{n}\eta^{(k)}\frac{\partial}{\partial u^{(k)}},\qquad \eta^{(k)} = D\eta^{(k-1)} - u^{(k)}D\xi,$$
with $\eta^{(0)}$ simply being $\eta$, and $D$ is the total derivative in the prolonged space,
$$D = \frac{\partial}{\partial x} + u'\frac{\partial}{\partial u} + u''\frac{\partial}{\partial u'} + \cdots + u^{(n)}\frac{\partial}{\partial u^{(n-1)}}.$$
For partial differential equations (PDEs), the derivative with respect to the single independent variable has to be replaced by derivatives with respect to each $x^{a}$, and $D$ by the total derivatives $D_{a}$, in Equation (3). While the former is easily converted to index notation, as $\xi^{a}$ and $\eta^{i}$ in Equation (3), for the latter one has to write $u^{i}_{,a_{1}\cdots a_{k}}$, which is the partial derivative with respect to all the $x^{a}$ to all orders up to $k$.
The set of all prolonged symmetry generators, $\mathbf{X}_{\alpha}$ $(\alpha = 1,\ldots,p)$, forms a $p$-dimensional Lie algebra, which determines what reduction there can be of the DE. It is a Lie algebra if the commutators of the symmetry generators satisfy
$$[\mathbf{X}_{\alpha},\mathbf{X}_{\beta}] = c^{\gamma}_{\alpha\beta}\,\mathbf{X}_{\gamma},$$
where the $c^{\gamma}_{\alpha\beta}$ are constants, called structure constants, that determine the structure of the Lie algebra. The generators must also satisfy the Jacobi identities,
$$\big[\mathbf{X}_{[\alpha},[\mathbf{X}_{\beta},\mathbf{X}_{\gamma]}]\big] = 0.$$
The square brackets in the subscript denote a skew linear combination, signifying that the terms with the suffices in an even permutation are positive and those in an odd permutation negative, so that interchanging any two indices reverses the sign of the expression. The total expression is divided by the factorial of the number of indices involved. A system of $m$ ODEs of order $n$, $\mathbf{E}(x,\mathbf{u},\mathbf{u}',\ldots,\mathbf{u}^{(n)}) = \mathbf{0}$, is said to be symmetric under the transformation generated by $\mathbf{X}$, if $\mathbf{X}^{[n]}\mathbf{E} = \mathbf{0}$ when restricted to the solutions of $\mathbf{E} = \mathbf{0}$, which is denoted by $\mathbf{X}^{[n]}\mathbf{E}\big|_{\mathbf{E}=\mathbf{0}} = \mathbf{0}$. The generalization to PDEs is as before, with the corresponding complications.
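To make the commutator and Jacobi conditions concrete, the following is a minimal sketch (assuming SymPy is available; the choice of the rotation generators in three dimensions is purely illustrative and not taken from the paper) that computes Lie brackets of vector fields, reads off a structure constant, and checks one Jacobi identity.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
coords = (x, y, z)

def commutator(X, Y):
    """Lie bracket of two vector fields given as coefficient tuples (a1, a2, a3)."""
    apply_ = lambda V, f: sum(V[i]*sp.diff(f, coords[i]) for i in range(3))
    return tuple(sp.simplify(apply_(X, Y[i]) - apply_(Y, X[i])) for i in range(3))

# an assumed concrete realization of so(3) by rotation vector fields
X1 = (0, -z, y)
X2 = (z, 0, -x)
X3 = (y, -x, 0)

print(commutator(X1, X2))   # equals X3, so the structure constant c^3_{12} = 1

# Jacobi identity: [X1,[X2,X3]] + [X2,[X3,X1]] + [X3,[X1,X2]] = 0
jac = tuple(sp.simplify(a + b + c) for a, b, c in zip(
    commutator(X1, commutator(X2, X3)),
    commutator(X2, commutator(X3, X1)),
    commutator(X3, commutator(X1, X2))))
print(jac)                  # (0, 0, 0)
```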
It is worth pondering the prolonged space. For a scalar ODE there are only two variables, the independent and dependent, and so the manifold considered is only two-dimensional. However, for a DE we have to treat the derivative as unknown, for if it were known the DE would already be reduced or solved. Hence, it must be treated as another dependent variable, leading to a three-dimensional manifold. If the DE is of second order, we would need to also treat the second derivative as an independent variable. Thus, for an $n$th order ODE, we need to use an $(n+2)$-dimensional space. This is the prolonged or extended space, also called a jet space. For an $m$-dimensional system of $n$th order ODEs we need an $(m(n+1)+1)$-dimensional space, and for a PDE of $l$ independent variables the dimension is larger still, since every partial derivative up to the order of the equation must be included. The largest group acting on a manifold of a given dimension provides the upper limit for the algebra; in fact, since the algebra of infinitesimal generators must leave the group identity out, and only add to it, the relevant group is somewhat smaller, but the number of its generators still grows rapidly with the dimension of the prolonged space. Obviously, the dimension of the manifold rapidly becomes unwieldy with increase in the order and the number of independent and dependent variables. Even for a second order, two-dimensional PDE of two variables the dimension is 66, and this ignores other complications arising for PDEs, which will be discussed later. This is what makes it necessary to use algebraic computing programmes.
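As a small illustration of working in the prolonged (jet) space, the sketch below treats the first derivative as an extra coordinate, builds the first prolongation of a point symmetry generator, and verifies the symmetry condition for a simple ODE. The ODE $y' = y/x$ and the scaling generator are illustrative choices of ours, not examples taken from the paper.

```python
import sympy as sp

x, y, yp = sp.symbols('x y yp')   # yp plays the role of y' in the jet space

# illustrative ODE written as E = 0:  y' - y/x = 0
E = yp - y/x

# assumed point symmetry generator X = xi d/dx + eta d/dy (a scaling symmetry)
xi, eta = x, y

# total derivative restricted to first order, acting on functions of (x, y)
D = lambda f: sp.diff(f, x) + yp*sp.diff(f, y)

# first prolongation coefficient eta^(1) = D(eta) - y' D(xi)
eta1 = D(eta) - yp*D(xi)

# prolonged action X^[1] E = xi E_x + eta E_y + eta^(1) E_{y'}
XE = xi*sp.diff(E, x) + eta*sp.diff(E, y) + eta1*sp.diff(E, yp)
print(sp.simplify(XE))   # 0: x d/dx + y d/dy is a Lie point symmetry of E = 0
```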
3. The Variational Principle for Particles
The laws of refraction had been explained by assuming that light takes minimum time to go from a point in one medium to a point in the other medium, taking into account the change in its speed in the media. This principle was used by Lagrange [
4] for his extension of Newton’s mechanics. He wrote Newton’s mechanics for a system of particles as if it were a single particle in a higher dimensional space. For
$N$ particles, Newton's laws would be written in terms of their positions, $\mathbf{x}_{j}$, and velocities, $\dot{\mathbf{x}}_{j}$ $(j = 1,\ldots,N)$. Lagrange wrote them as generalized coordinates $q^{i}$ and generalized velocities $\dot{q}^{i}$. This change makes it possible to also incorporate $m$ constraints between the generalized coordinates, so that the dimension of the space for the single particle is $3N - m$. He then required that the free energy, i.e., the difference between the kinetic and potential energy, be minimum over the entire motion. This is called the principle of least action.
Lagrange's methods have been used in areas far removed from mechanics, such as economics. The function to be minimized (with or without constraints) is called the Lagrangian, $L(q^{i},\dot{q}^{i},t)$, and the action is defined as the total of the minimized quantity over the time period from the initial time, $t_{1}$, to the final time, $t_{2}$:
$$S = \int_{t_{1}}^{t_{2}} L(q^{i},\dot{q}^{i},t)\,dt.$$
In economics, the $q^{i}$ would represent the quantity of a commodity and its time rate of change would be directly related to the price of that commodity in the market. The Lagrangian would be the cost for all things bought at the time, and the action would be the total money spent. It is useful to think of this analogy for mechanics. In that case, the money spent at one instant is the free-energy and the total money spent is the total energy spent. The object “wants” to spend the least energy to get from the start to the end, and “chooses” the path that will do this.
The original mechanical application of Lagrange's formalism regarded the Lagrangian as the difference between the kinetic and potential energy, $L = T - V$, where the explicit time-dependence is absent. This was because in the system of the Sun and planets the gravitational potential remained constant, and there was no meaning to the kinetic energy changing with time. The economic analogy replaces the kinetic energy by the profit and the potential energy by the loss. In economics, we are familiar with money “evaporating” due to inflation. Similarly, the potential can be made time-dependent and energy can be lost to friction, or be radiated away. Hence, there should be explicit time dependence. Not only may the system lose energy, it could gain energy. The corresponding phenomenon in economics is “deflation”, where the money appreciates in value. One might think that would be desirable, but it leads to the economy slowing down. In physics, one gets more energy, but it is unusable as it is thermalized; it would lead to a “heat death”.
For the action functional, $S$, to be minimal when we vary the functions that are its arguments by $\delta q^{i}(t)$, it must be unchanged, i.e., $\delta S = 0$. Writing this explicitly,
$$\delta S = \int_{t_{1}}^{t_{2}}\Big(\frac{\partial L}{\partial q^{i}}\,\delta q^{i} + \frac{\partial L}{\partial \dot{q}^{i}}\,\delta\dot{q}^{i}\Big)dt = 0.$$
If we require that the initial and final positions of the system of particles are fixed, then $\delta q^{i}(t_{1}) = \delta q^{i}(t_{2}) = 0$. Writing the variations in the integrand in terms of $\delta q^{i}$ and integrating the $\delta\dot{q}^{i}$ term by parts using the above boundary conditions,
$$\int_{t_{1}}^{t_{2}}\Big(\frac{\partial L}{\partial q^{i}} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}^{i}}\Big)\delta q^{i}\,dt = 0.$$
Since the $\delta q^{i}$ in the integrand is arbitrary and the integral is zero, the rest of the integrand must be zero. This gives the time-dependent Euler–Lagrange (EL) equations for a system of particles,
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}^{i}} - \frac{\partial L}{\partial q^{i}} = 0.$$
It is easily checked that, by the EL equations, $\frac{d}{dt}\big(\dot{q}^{i}\frac{\partial L}{\partial \dot{q}^{i}} - L\big) = -\frac{\partial L}{\partial t}$. Assuming that there is no explicit time dependence in the Lagrangian, $H = \dot{q}^{i}\,\partial L/\partial\dot{q}^{i} - L$ is a conserved quantity, which is the energy, and is called the Hamiltonian.
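A minimal SymPy sketch of this chain of reasoning, assuming the standard euler_equations helper; the harmonic-oscillator Lagrangian is an illustrative example of ours. It derives the EL equation and checks that the Hamiltonian defined above is conserved on solutions.

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

t = sp.symbols('t')
m, k = sp.symbols('m k', positive=True)
q = sp.Function('q')

# assumed example: harmonic-oscillator Lagrangian L = T - V
L = m*q(t).diff(t)**2/2 - k*q(t)**2/2

eom = euler_equations(L, q(t), t)[0]          # EL equation: m q'' + k q = 0 (up to sign)
qdd = sp.solve(eom, q(t).diff(t, 2))[0]       # q'' expressed from the EL equation

# Hamiltonian H = q' dL/dq' - L; check dH/dt = 0 on solutions of the EL equation
H = q(t).diff(t)*sp.diff(L, q(t).diff(t)) - L
dHdt = sp.diff(H, t).subs(q(t).diff(t, 2), qdd)
print(sp.simplify(dHdt))                      # 0: the energy is conserved
```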
So far, only a system of particles has been considered. In earlier times a fluid was generally regarded as distinct from particles, though people such as Democritus (c. 460–370 BC) argued that even water must finally be particulate. With Robert Boyle (1627–1691 AD) and John Dalton (1766–1844 AD) the idea was lifted out of the realms of Philosophy to a scientific base in Chemistry. Later, Daniel Bernoulli [
17] treated the flow of fluids mathematically as a nearly infinite system of particles. The term “molecule” was coined much later by Amedeo Avogadro [
18] and defines our modern view of material fluids. Daniel Bernoulli (and his father Johann, who claimed that he had written the work prior to Daniel) sent their work to Leonhard Euler.
As the third term comes from the explicit time-dependence of the Lagrangian, if it is positive it corresponds to dissipation of energy in physics, such as the velocity dependent friction. For the universe as a whole, i.e., in the cosmological context, it corresponds to the energy getting absorbed into the expanding spacetimes, such as water into an expanding sponge. In economics it would be the money “evaporating” by inflation. To the contrary, if it is negative, it corresponds to absorption of energy from the environment in physics and to energy getting “squeezed out” of the spacetime, as from a sponge. In economics it corresponds to deflation. If the analogy holds, we can expect the cosmological deflation to “squeeze out” thermal radiation. Instead of merely being crushed to death, the universe would be broiled and crushed to death.
4. The Variational Principle for Fields
Even if all material objects, including fluids such as water and air, are made of particles, there are so many particles that one might as well take them to be infinitely many. Even that would not provide us with the full power of calculus, so it was worthwhile to use the continuum limit. Further, with Michael Faraday [
19] and James Clerk Maxwell’s [
20] theory of electricity and magnetism, they had to be treated as continuous fields. As such, it became necessary to extend the Lagrange formalism to a continuum in space and in time. Faraday and Maxwell's electric and magnetic fields, $\mathbf{E}$ and $\mathbf{B}$, pervade all of space at all times. Let us briefly review the developments.
Faraday's law relates changes in the magnetic field over time to changes in the electric field over space. Maxwell believed that the converse effect should also hold and modified Ampere's law, relating the current density, $\mathbf{j}$, to a magnetic field varying over space, to include a change of the electric field over time. The Gauss equations use his divergence theorem to relate the divergence of the electric and magnetic fields to the strengths of their sources. Since there is no “magnetic charge”, the divergence of the magnetic field is zero. Since electric causes have magnetic effects and vice versa, Maxwell unified the theories of electricity and magnetism into electromagnetism. The theory was formulated mathematically in a set of four PDEs, written in the cumbersome formalism of the time. In more modern notation they are the two Gauss laws, Faraday's law and the modified Ampere law, written in electrostatic units, where $\rho$ is the charge density, $\epsilon$ is the dielectric constant of the material and $\mu$ its magnetic susceptibility (see, for example [21]). The vacuum has a dielectric constant and magnetic susceptibility, denoted by a subscript zero. Thus, the spacetime that carries the field takes the place of the aether (not a mechanical aether but an electromagnetic aether) which supports the field.
Notice the symmetry in these equations, in that two deal only with fields and two have material sources involved (the charge and current density). Even in the absence of these sources at some point in space at some time, the fields can still exist. There is also a symmetry between space and time in the equations. However, in Maxwell's theory there is a scalar potential, $\phi$, for the electric force and a vector potential, $\mathbf{A}$, for the magnetic field, so that $\mathbf{E} = -\nabla\phi - \partial\mathbf{A}/\partial t$ and $\mathbf{B} = \nabla\times\mathbf{A}$. While we can restore the symmetry of the Maxwell equations by considering the source-free case, how can we obtain a symmetry in the potentials? In his seminal paper on special relativity, Albert Einstein [22] brought out the symmetry by unifying space and time into a single spacetime, so that a point in spacetime is given by a four-vector, $x^{\mu}$. In the same way the electromagnetic potential is a four-vector, $A^{\mu} = (\phi, \mathbf{A})$. Though the first field theory was for fluids, that was artificial, as it dealt with discrete particles as if they formed a continuum. In this case, the fields are continua. In four-vectors we can write the Maxwell fields in terms of a tensor,
$$F_{\mu\nu} = A_{\nu,\mu} - A_{\mu,\nu},$$
or, in the language of forms, $\mathbf{F} = d\mathbf{A}$. The corresponding Maxwell equations are
$$F^{\mu\nu}{}_{;\nu} = J^{\mu}, \qquad F_{[\mu\nu,\rho]} = 0.$$
In fact, inserting the definition of the Maxwell field as the skew derivative (generalized curl) of the four-vector potential, the second equations are easily seen to be identities, $dd\mathbf{A} = 0$, as the exterior derivative, $d$, is nilpotent. In other words the definition of the four-vector potential makes the magnetic Gauss law and Faraday's law into identities, while the physical content now resides in the equations with sources: Gauss' electric law, and Ampere's modified law. The difference between “;” and “,” is explained in the next section.
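The claim that the homogeneous equations are identities can be checked symbolically. The sketch below (the potential component names are hypothetical placeholders, not notation from the paper) builds the field tensor from an arbitrary 4-potential and verifies that the cyclic derivative combination vanishes identically.

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
X = (t, x, y, z)

# arbitrary smooth 4-potential components (hypothetical functions)
A = [sp.Function(f'A{mu}')(*X) for mu in range(4)]

# field tensor as the 4-curl of the potential, F_{mu nu} = A_{nu,mu} - A_{mu,nu}
F = [[sp.diff(A[n], X[m]) - sp.diff(A[m], X[n]) for n in range(4)] for m in range(4)]

# the homogeneous Maxwell equations F_{[mu nu, rho]} = 0 hold identically
checks = set()
for m in range(4):
    for n in range(4):
        for r in range(4):
            cyc = sp.diff(F[m][n], X[r]) + sp.diff(F[n][r], X[m]) + sp.diff(F[r][m], X[n])
            checks.add(sp.simplify(cyc))
print(checks)   # {0}: the magnetic Gauss law and Faraday's law are identities
```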
It is easily shown (see for example [12]) that in the absence of any source the electric and magnetic fields satisfy the wave equation and the speed of the electromagnetic wave is $1/\sqrt{\epsilon\mu}$, so that the speed of light in vacuum is $c = 1/\sqrt{\epsilon_{0}\mu_{0}}$. Since the dielectric constant and magnetic susceptibility of the vacuum are less than for any material medium, the speed of the electromagnetic wave in vacuum is the maximum speed of these waves. Maxwell had already noted in his work [20] that this is the speed of light. Einstein had pointed out that it is the maximum attainable speed for any form of matter or energy [22,23,24,25,26].
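The numerical identification that Maxwell made is a one-line calculation. The snippet below uses SI values (the paper works in electrostatic units, so this is only an illustration of the relation $c = 1/\sqrt{\epsilon_{0}\mu_{0}}$).

```python
import math

# vacuum permittivity and permeability in SI units (standard approximate values)
eps0 = 8.8541878128e-12    # F/m
mu0 = 4e-7*math.pi         # H/m (classical defined value)

c = 1/math.sqrt(eps0*mu0)
print(c)                   # ~2.998e8 m/s, the speed of light
```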
We need to repeat the variational procedure for fields in $n$-dimensional spaces and four-dimensional spacetime, as slightly new features arise both times. Let us keep in mind three-dimensional spaces first, but be ready to extend the three to $n$. A scalar field, $\phi(t,\mathbf{x})$, is an explicit function of the independent variable, $t$, and of the three spatial variables $\mathbf{x}$. Though this is not necessary, we limit the Lagrangian to be a function of the field and its first total $t$-derivative, $L(\phi,\dot{\phi},t)$, where $\dot{\phi} = d\phi/dt$. The problem now is what is meant by the derivative of the functional $L$ with respect to a function $\phi$, rather than a continuous variable such as $t$? For the latter we just take the limit as $\delta t$ tends to 0. A function can tend to zero in infinitely many ways. As such, we need a measure of the function, let us say the root mean square norm, and then let that tend to zero. To distinguish between the two concepts of derivative, we use $\delta$ in place of $\partial$. Now repeating the previous variational procedure yields the EL equations for scalar fields,
$$\partial_{t}\frac{\delta L}{\delta(\partial_{t}\phi)} + \partial_{i}\frac{\delta L}{\delta(\partial_{i}\phi)} - \frac{\delta L}{\delta\phi} = 0,$$
where $\partial_{t}$ and $\partial_{i}$ stand for the partial derivative relative to $t$ and $x^{i}$, respectively. For use in mechanics we can take $n = 3$, but in economics we have to allow for all the commodities. It would be worth exploring the use of fields in economics. If, instead of a scalar field, there is a vector field, $A^{i}$, the EL equations become
$$\partial_{t}\frac{\delta L}{\delta(\partial_{t}A^{i})} + \partial_{j}\frac{\delta L}{\delta(\partial_{j}A^{i})} - \frac{\delta L}{\delta A^{i}} = 0.$$
You might have expected that the examples of the electric and magnetic fields would be given here, but the fact is that a field theory of one without the other would be incomplete. We need to use the relativistic electromagnetic field, $A^{\mu}$. However, there is a complication. In relativity, time is like any other coordinate, so we do not have a separate independent variable, but four independent variables, and the field is $A^{\mu}(x^{\nu})$. Now, the variation is with respect to each function of each variable. The EL equations for a relativistic vector field become
$$\partial_{\nu}\frac{\delta L}{\delta(\partial_{\nu}A^{\mu})} - \frac{\delta L}{\delta A^{\mu}} = 0.$$
Using the electromagnetic Lagrangian, consisting of a source term $J^{\mu}A_{\mu}$ and a pure-field term proportional to $F^{\mu\nu}F_{\mu\nu}$, in the above, the EL equations give the first of the Maxwell Equation (16). The other is, of course, an identity.
5. A Geometrical Application of the Lagrangian
The geometrical Lagrangian is obviously for a continuum, but not so obviously for a field theory. If we limit our discussion to three-dimensional space, or surfaces, or even to Minkowski space, it is not for a field. As we saw, electromagnetism, which is the epitome of a field theory, has a vector field, $A^{\mu}$. Geometrically, the arc length squared is given by
$$ds^{2} = g_{ab}\,dx^{a}dx^{b},$$
where $g_{ab}$ is the matrix representation of the metric tensor, $\mathbf{g}$, in some $n$-dimensional coordinate system in index notation and the $dx^{a}$ are infinitesimal changes of the position vector in those coordinates [21]. Thus geometry needs a (second rank) tensor field, $g_{ab}(\mathbf{x})$. Note that not only the metric coefficients, but the metric tensor itself, varies from point to point.
The shortest path between two points, P and Q, called a geodesic, is obtained by minimizing the integral of the arc length, $ds$, along the path from one to the other,
$$s = \int_{P}^{Q} ds = \int_{P}^{Q}\sqrt{g_{ab}\,\dot{x}^{a}\dot{x}^{b}}\;ds,$$
where $L = g_{ab}\,\dot{x}^{a}\dot{x}^{b}$ (with $\dot{x}^{a} \equiv dx^{a}/ds$) is the Lagrangian, which has a constant value as a function of $s$, but as a functional it depends on the position and velocity vectors. Using the EL Equation (11), without explicit dependence on the parameter $s$, we obtain
$$\frac{d}{ds}\Big(2\,g_{ab}\,\dot{x}^{b}\Big) = g_{bc,a}\,\dot{x}^{b}\dot{x}^{c}.$$
The total derivative of $g_{ab}$ is $dg_{ab}/ds = g_{ab,c}\,\dot{x}^{c}$ and, since $g_{ab}$ is an explicit function of the coordinates $x^{c}$ but not of $s$, no other term arises. Hence
$$2\,g_{ab}\,\ddot{x}^{b} + 2\,g_{ab,c}\,\dot{x}^{b}\dot{x}^{c} = g_{bc,a}\,\dot{x}^{b}\dot{x}^{c}.$$
As the metric tensor defines the length of a vector, so it and its inverse must exist at every point. In index notation we write the inverse as $g^{ab}$, such that $g^{ab}g_{bc} = \delta^{a}_{c}$, the Kronecker delta, which is the identity matrix in index notation. Multiplying (23) through by half the inverse metric, relabeling dummy indices and transposing the term on the right side, we obtain the geodesic equation
$$\ddot{x}^{a} + \Gamma^{a}_{bc}\,\dot{x}^{b}\dot{x}^{c} = 0,$$
where $\Gamma^{a}_{bc}$ is the Christoffel symbol, defined by
$$\Gamma^{a}_{bc} = \frac{1}{2}\,g^{ad}\big(g_{db,c} + g_{dc,b} - g_{bc,d}\big),$$
which gives the difference between the covariant derivative denoted by “;” and the partial derivative denoted by “,”, namely for any vector $A^{a}$,
$$A^{a}{}_{;b} = A^{a}{}_{,b} + \Gamma^{a}_{bc}\,A^{c}.$$
One of Euclid's theorems for a plane says that the shortest path between two points is the straight line joining them. Of course, there is no straight line in a curved space (like the surface of the Earth, where straight lines in three dimensions are excluded). The straightest available path in an $n$-dimensional space is the curve whose unit tangent vector, $\mathbf{t}$, does not change direction along it, i.e., $\nabla_{\mathbf{t}}\mathbf{t} = \mathbf{0}$. Now, we could write the tangent vector in the local coordinates discussed, with the basis vectors, $\mathbf{e}_{a}$, so that $\mathbf{t} = t^{a}\mathbf{e}_{a}$. Since every vector can be written as a linear combination of basis vectors, so the partial derivative of a basis vector along any direction can be so written. Hence, $\partial_{b}\mathbf{e}_{a} = \Gamma^{c}_{ab}\,\mathbf{e}_{c}$. As such the Christoffel symbol arises from the differentiation of the basis vectors. It is shown in Differential Geometry (see for example [21]) that this set of coefficients is given by (25). Thus, the generalization of Euclid's theorem to curved spaces is: “the shortest available path between two points is the straightest available curve.” In a flat space in Cartesian coordinates the basis vectors are constant, but in general, such as in Gauss' theory of surfaces, they vary. The simplest examples are of the polar basis vectors in a plane in polar coordinates, and all basis vectors on a sphere.
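The Christoffel symbols and the geodesic equation are easy to generate symbolically. The sketch below computes them for the two-sphere, an illustrative metric of our own choosing (not an example from the paper), using the formula quoted as (25).

```python
import sympy as sp

theta, phi, a = sp.symbols('theta phi a', positive=True)
x = (theta, phi)

# metric of a sphere of radius a: ds^2 = a^2 dtheta^2 + a^2 sin^2(theta) dphi^2
g = sp.Matrix([[a**2, 0], [0, a**2*sp.sin(theta)**2]])
ginv = g.inv()

# Christoffel symbols Gamma^a_{bc} = (1/2) g^{ad} (g_{db,c} + g_{dc,b} - g_{bc,d})
def Gamma(aa, b, c):
    return sp.simplify(sum(ginv[aa, d]*(sp.diff(g[d, b], x[c]) + sp.diff(g[d, c], x[b])
                                        - sp.diff(g[b, c], x[d])) for d in range(2))/2)

print(Gamma(0, 1, 1))   # -sin(theta)*cos(theta)
print(Gamma(1, 0, 1))   # cos(theta)/sin(theta)
# the equator theta = pi/2, phi = s/a then satisfies the geodesic equation identically
```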
6. Symmetries of Fields and Lagrangians
What would be meant by the symmetry of a field? For one thing, the field may not depend on some independent variable(s), such as time or position. Writing the field, which may be a scalar, vector or tensor, as $u(t,\mathbf{x})$, then $\partial u/\partial t = 0$, or the equivalent for one or more position variable. In that case we say that $\partial/\partial t$, or the equivalent for some position variable, is a translation symmetry of the field. On the other hand, it could be that the field itself “inflates” (or “deflates”) with time, or along some spatial direction. In that case $t\,\partial/\partial t$, or $x\,\partial/\partial x$ say, is called a scaling symmetry of the field. Even if the field depends on the independent variables, it could be that some physically (or economically) relevant quantities do not. For example it may be that the energy and momentum re-scale in such a way that the difference between the square of the energy and a constant times the square of the other is constant, as in special relativity, where $E^{2} - c^{2}p^{2} = m^{2}c^{4}$. In this case we say that the mass, $m$, is an invariant and the corresponding infinitesimal symmetry generator gives the energy-momentum four-vector. Notice that this says that energy-momentum is collectively, but not separately, conserved. The conserved quantity is the Hamiltonian, $H$. In physics, one would call the former conservation a “conservation law”, but people in symmetry analysis call the latter by that name. There can be other symmetry generators, such as a combined scaling symmetry $t\,\partial/\partial t + x\,\partial/\partial x$ along the line $x = t$, or rotations given for example by $x\,\partial/\partial y - y\,\partial/\partial x$ for rotation about the $z$-axis. In relativity, the Lorentz transformation for uniform linear motion in the $x$-direction is generated by $t\,\partial/\partial x + x\,\partial/\partial t$. Thus, in special relativity, this would be a conservation law. There can also be symmetries of combinations of the field, or of it and the first derivative of the field, but not of the field itself.
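The mass invariant mentioned above can be verified numerically: a Lorentz boost changes energy and momentum separately while leaving $E^{2} - c^{2}p^{2}$ unchanged. The numbers below are arbitrary illustrative values (units with $c = 1$).

```python
import numpy as np

beta = 0.6                               # illustrative boost speed along x (c = 1)
gamma = 1/np.sqrt(1 - beta**2)
Lambda = np.array([[gamma, -gamma*beta, 0, 0],
                   [-gamma*beta, gamma, 0, 0],
                   [0, 0, 1, 0],
                   [0, 0, 0, 1]])

p = np.array([5.0, 3.0, 1.0, 2.0])       # (E, px, py, pz) for some particle
p_boosted = Lambda @ p

minkowski = lambda q: q[0]**2 - q[1]**2 - q[2]**2 - q[3]**2
print(minkowski(p), minkowski(p_boosted))   # equal: the mass squared is the invariant
```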
Of special relevance are symmetries of the Lagrangian, because it gives the dynamics arising from the variational principle for the Lagrangian, while also allowing a reduction of the number of variables that are involved in the DE, or reducing its order. This double reduction makes these symmetries especially useful. They are called Noether symmetries. Noether’s theorem says that to each symmetry of the Lagrangian, there corresponds a conserved scalar quantity (called a Noether charge). Each conserved quantity is a first integral of some part of the equations of motion. At the same time, when we use that quantity as a new “variable”, it reduces the equation by trivializing that part of it. The most obvious invariant is the Lagrangian itself by definition. This is not useful for any reduction, as it is tautologically true. In fact, scaling it by a constant cannot change the equations of motion, since they are linear and homogeneous in the Lagrangian. The great thing about using Geometry for kinematics or dynamics is that in that case every geometrical symmetry will be a non-trivial Noether symmetry and provide a double reduction for the equations of the theory. The charges give us physical information and help to reduce the order of the equations. Before continuing with the applications, we need to present the definition of Noether symmetries and explain how they are determined.
6.1. Noether Symmetries
Noether’s theorem [
27] is applicable to a dynamical system of ordinary or partial differential equations obtained from a variational principle [
28,
29,
30,
31]. Let $x^{a}$ be $\ell$ independent variables and $u^{i}$ be $m$ dependent variables which are arbitrary (sufficiently smooth) functions of the independent variables. The total derivative operator $D_{a}$ given in (5) can be recast into the form
$$D_{a} = \frac{\partial}{\partial x^{a}} + u^{i}_{a}\frac{\partial}{\partial u^{i}} + u^{i}_{ab}\frac{\partial}{\partial u^{i}_{b}} + \cdots,$$
where the derivatives of $u^{i}$ with respect to $x^{a}$ are represented by $u^{i}_{a}$, and so on. Then, the Euler–Lagrange operator, for each $i$, is defined by
$$\frac{\delta}{\delta u^{i}} = \frac{\partial}{\partial u^{i}} + \sum_{s\geq 1}(-1)^{s}\,D_{a_{1}}\cdots D_{a_{s}}\frac{\partial}{\partial u^{i}_{a_{1}\cdots a_{s}}}.$$
We now consider the EL equations of motion
$$E_{i}\big(x, u, u_{(1)}, \ldots, u_{(N)}\big) = 0, \qquad i = 1,\ldots,m,$$
which is an $N$th-order system of $m$ PDEs or, the ODEs if $\ell = 1$. Equation (29) is assumed to be of maximal rank and locally solvable, where the collection of $N$th-order derivatives is denoted by $u_{(N)}$. If there exists a function $L$, $L = L(x, u, u_{(1)}, \ldots, u_{(N)})$, such that (29) is equivalent to
$$\frac{\delta L}{\delta u^{i}} = 0, \qquad i = 1,\ldots,m,$$
then $L$ is called a Lagrangian of (29). Here the $N$th prolonged operator has the form [32]
$$\mathbf{X}^{[N]} = \xi^{a}\frac{\partial}{\partial x^{a}} + \eta^{i}\frac{\partial}{\partial u^{i}} + \zeta^{i}_{a}\frac{\partial}{\partial u^{i}_{a}} + \cdots + \zeta^{i}_{a_{1}\cdots a_{N}}\frac{\partial}{\partial u^{i}_{a_{1}\cdots a_{N}}},$$
where the operator $\mathbf{X}$ is of the form
$$\mathbf{X} = \xi^{a}(x,u)\frac{\partial}{\partial x^{a}} + \eta^{i}(x,u)\frac{\partial}{\partial u^{i}},$$
and $\zeta^{i}_{a}$ and $\zeta^{i}_{a_{1}\cdots a_{s}}$ are defined by
$$\zeta^{i}_{a} = D_{a}(\eta^{i}) - u^{i}_{b}\,D_{a}(\xi^{b}),\qquad \zeta^{i}_{a_{1}\cdots a_{s}} = D_{a_{s}}\big(\zeta^{i}_{a_{1}\cdots a_{s-1}}\big) - u^{i}_{a_{1}\cdots a_{s-1}b}\,D_{a_{s}}(\xi^{b}).$$
A Noether symmetry generator corresponding to a Lagrangian $L$ is the operator $\mathbf{X}$ in (32) if there exists a vector $B^{a} = B^{a}(x, u, u_{(1)}, \ldots)$, or a function $B$ if there is only one independent variable (i.e., $\ell = 1$), such that
$$\mathbf{X}^{[N]}(L) + L\,D_{a}(\xi^{a}) = D_{a}(B^{a}),$$
where $W^{i} = \eta^{i} - \xi^{a}u^{i}_{a}$ is the Lie characteristic function. Further, the operator $\mathbf{X}$, which is also called the Lie–Bäcklund symmetry, is a Noether symmetry of $L$ corresponding to an Euler–Lagrange equations (30) if and only if the Lie characteristic function $W^{i}$ of $\mathbf{X}$ is also the characteristic of the conservation law
$$D_{a}(T^{a}) = 0,$$
where $T^{a}$ has the form
$$T^{a} = B^{a} - N^{a}(L),$$
and the Noether operator associated with the operator $\mathbf{X}$ is defined by Ibragimov [33] as follows:
$$N^{a} = \xi^{a} + W^{i}\,\frac{\delta}{\delta u^{i}_{a}} + \sum_{s\geq 1}D_{a_{1}}\cdots D_{a_{s}}\big(W^{i}\big)\,\frac{\delta}{\delta u^{i}_{a\,a_{1}\cdots a_{s}}}.$$
Here, $T^{a}$ will be called a conserved vector of the EL Equation (29), or a conserved quantity (or first integral) if $\ell = 1$. Notice that for variational problems with Lagrangian functions depending on higher-order derivatives, the main conservation theorems are valid, but the conserved vector (36) for a Lagrangian function depending on any order derivatives has a different form.
6.2. Classical Mechanics
Time-translational invariance of the Lagrangian in classical mechanics reduces the number of variables in the equations of motion from four to three and it implies energy conservation. The former property makes it easier to solve the ODEs involved, but the latter allows one to know the answer without solving the equations. For a simple harmonic oscillator, one can use the further spatial symmetry to reduce them to a single second-order ODE with constant coefficients, while the conservation of angular momentum gives planar motion and the law of energy conservation says that a frictionless oscillator will continue its motion forever, and that the greater the friction the faster the motion will die out. The formal aspects of the calculations of Noether symmetries in this case are explained below.
Consider a first order Lagrangian for only one independent variable, so that the EL equations of motion are given by (11). Then the energy functional associated with $L(t, q^{i}, \dot{q}^{i})$ is defined by
$$E_{L} = \dot{q}^{i}\frac{\partial L}{\partial \dot{q}^{i}} - L,$$
which is also the Hamiltonian of the system. This yields the first integral of a system of ODEs of the form
$$\ddot{q}^{i} = F^{i}(t, q^{j}, \dot{q}^{j}).$$
The Noether symmetry generator for this Lagrangian is given by
$$\mathbf{X} = \xi(t, q^{j})\frac{\partial}{\partial t} + \eta^{i}(t, q^{j})\frac{\partial}{\partial q^{i}},$$
if there exists a function $K(t, q^{j})$ and the Noether symmetry condition
$$\mathbf{X}^{[1]}(L) + L\,\frac{d\xi}{dt} = \frac{dK}{dt}$$
is satisfied. Here $\mathbf{X}^{[1]}$ is the first prolongation of the Noether symmetry generator $\mathbf{X}$, i.e.,
$$\mathbf{X}^{[1]} = \mathbf{X} + \Big(\frac{d\eta^{i}}{dt} - \dot{q}^{i}\frac{d\xi}{dt}\Big)\frac{\partial}{\partial \dot{q}^{i}},$$
where $d/dt$ is the total derivative with respect to $t$. For every Noether symmetry generator $\mathbf{X}$, the conservation law (36) becomes $dI/dt = 0$, where $I$ is the corresponding Noether flow, and it has the expression
$$I = K - \xi E_{L} - \big(\eta^{i} - \dot{q}^{i}\xi\big)\frac{\partial L}{\partial \dot{q}^{i}},$$
which is a conserved quantity of the system of Equation (39).
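The symmetry condition and the first integral can be checked symbolically. The sketch below works on the first-order jet coordinates for a free particle with a Galilean boost generator; the example, the symbol names, and the sign convention of the first integral are our own illustrative choices (conventions for the Noether flow vary between references).

```python
import sympy as sp

t, q, v = sp.symbols('t q v')          # time, coordinate and velocity as jet coordinates
m = sp.symbols('m', positive=True)
a = sp.symbols('a')                    # placeholder for the acceleration q''

L = m*v**2/2                            # assumed example: free-particle Lagrangian

# Galilean boost generator: xi = 0, eta = t, with gauge (divergence) term K = m*q
xi, eta, K = sp.Integer(0), t, m*q

# total derivative on the first-order jet space
Dt = lambda f: sp.diff(f, t) + v*sp.diff(f, q) + a*sp.diff(f, v)

# first prolongation and the Noether condition X^[1]L + L Dt(xi) = Dt(K)
eta1 = Dt(eta) - v*Dt(xi)
condition = xi*sp.diff(L, t) + eta*sp.diff(L, q) + eta1*sp.diff(L, v) + L*Dt(xi) - Dt(K)
print(sp.simplify(condition))          # 0: the boost is a Noether symmetry

# first integral I = xi*L + (eta - xi*v)*dL/dv - K; dI/dt = 0 on the motion q'' = 0
I = xi*L + (eta - xi*v)*sp.diff(L, v) - K
print(sp.simplify(Dt(I).subs(a, 0)))   # 0: m*(v*t - q) is conserved
```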
There is a second way of finding symmetries of a Lagrangian $L$ for a given dynamical system, called the strict Noether symmetry approach, that yields $\pounds_{\mathbf{X}}L = 0$, where $\pounds_{\mathbf{X}}$ is the Lie derivative operator along $\mathbf{X}$ [34]. The only difference is that here $K$ vanishes. Both approaches are useful in a variety of problems arising from physics and applied mathematics, and lead to first integrals. The important point is which approach yields a conserved quantity more directly. The classical Noether symmetries have the advantage of yielding conserved quantities or conservation laws directly [30]. Cyclic variables are also used and are related to Noether symmetries, but there is ambiguity in their choice. For more details see [35].
Another example is of a particle that would normally undergo geodesic motion, but is forced off it, such as a charged particle in an EM field. For the present purpose take a scalar potential $V(x^{a})$; then the geodesic Lagrangian describing the motion of the massive or massless (i.e., lightlike) particles can be written as
$$L = \frac{1}{2}\,g_{ab}\,\dot{x}^{a}\dot{x}^{b} - V(x^{a}),$$
which gives rise to the forced geodesic equations of motion
$$\ddot{x}^{a} + \Gamma^{a}_{bc}\,\dot{x}^{b}\dot{x}^{c} = F^{a},$$
where $F^{a} = -g^{ab}\,\partial V/\partial x^{b}$ is the conservative force field. For every Noether symmetry, there is a first integral for the system of Equation (45) of the form
$$I = K - \xi E_{L} - \big(\eta^{a} - \xi\dot{x}^{a}\big)g_{ab}\,\dot{x}^{b},$$
where the energy functional (38) for the geodesic Lagrangian is given by
$$E_{L} = \frac{1}{2}\,g_{ab}\,\dot{x}^{a}\dot{x}^{b} + V(x^{a}),$$
which is the Hamiltonian of the system.
6.3. Economics
Shifting from mechanics to economics, time-translational invariance of the Lagrangian implies that prices will stay constant over time and the total amount in the economy stays constant. “Amount of what?”, you ask. This hides an aspect of economics with no classical mechanics analogue. Two distinct quantities may be taken: wealth; or money. Wealth refers to the actual goods and services in the economy, while money is an arbitrary measure for that wealth, called a “numeraire”. Being arbitrary, the amount of wealth corresponding to a unit of money can be changed. The Government may do this to count in wealth that would be there in the economy, but not as yet registered in the accounting; or to appear to be richer. In the former case, there is a growing economy, and the “value of money” remains constant. In the latter case, the value of money will decline while the economy remains stagnant. The Government could even anticipate the wealth yet to be generated, before it has been, so as to accelerate economic growth. As such, it would “borrow from the future”. This is called “credit creation”. It can lead to runaway inflation and a credit crunch, as has been seen.
Already we have seen economic insights provided by the invariant beyond the benefit of reducing the number of variables in the EL-equations. More follow. In 1945, John von Neumann demonstrated [
36] that the rate of interest equals the rate of growth for an optimally growing economy. In other words, for optimal growth, the amount of money in the economy must grow at the same rate as the amount of wealth. It was then shown [
37,
38] that there must be inflation in a growing economy. The excess money behaves in much the same way as the entropy in physics corresponding to the shortfall of efficiency of a heat engine. It should be noted that the Noether invariant in both applications (physics and economics) has significance
far beyond its use for reducing the number of variables and order of the governing ODEs.
Though there is no analogue of the two ways of considering the “amount” of something in classical mechanics, there is in relativistic mechanics. The measures of length and time vary from observer to observer, since they are arbitrary measures of the invariant length and duration. Consequently, the invariants play a far more significant role in relativity than they do in classical mechanics. To explain this, we will need to provide a quick review of geometrical symmetries.
6.4. Geometrical Symmetries
There are two generalizations of the derivative for a manifold. One is to first map an open set on the manifold to a coordinate frame, $\mathbb{R}^{n}$, take the derivative along a vector in the usual way and then map the derivative back to the manifold. The second is to directly take the rate of change by infinitesimal movement along a curve on the manifold. The former is called the intrinsic derivative and the latter the Lie derivative. The former procedure obviously incorporates the derivatives of the basis vectors, while the latter does not. The intrinsic derivative, left in the coordinate system, is called the covariant derivative and denoted by “;”, as we saw. The Lie derivative along a vector field, $\mathbf{t}$, will be denoted by $\pounds_{\mathbf{t}}$. If the intrinsic derivative is used in a Taylor series to move from one point on the manifold to another, it is called parallel transport, as a curve traced out by transporting a vector field, $\mathbf{p}$, along a curve with tangent vector field $\mathbf{t}$, will appear parallel as seen in the coordinate system. However, it will not be parallel on the manifold. If the Lie derivative is used for moving on the manifold it is called Lie transport.
This is most easily seen by taking a sphere as the manifold and the base curve to be a line of latitude. Transporting a unit North pointing vector traces out the next line of latitude. At the equator it is not so obvious, but near the North pole, as we know from our school geography, on the map of the Earth it does not look parallel, and the square on the map does not look like a square on the globe. The reason is that the lengths of the upper and lower sides of the “square” on the globe are unequal, but the angles subtended by them are the same. Thus, if $\mathbf{t}$ takes point P to point Q on one line of latitude, and $\mathbf{p}$ takes P to R on the same line of longitude and Q to S on the next one, by definition of subtending the same angle, $\mathbf{t}$ must take R to S, so $\mathbf{p}$ is Lie transported along $\mathbf{t}$ and the square on the globe closes. However, since the lengths of the upper and lower are unequal, it cannot close on the map.
Let us be a bit more technical. If $\mathbf{p}$ is to be Lie transported from P to Q by $\mathbf{t}$, then $\pounds_{\mathbf{t}}\mathbf{p} = \mathbf{0}$. Now, if $\mathbf{t}$ is invariant under this Lie transport, then clearly $\pounds_{\mathbf{p}}\mathbf{t} = \mathbf{0}$. In this case, the “square” mentioned above, or more generally the “rectangle”, closes and so $\nabla_{\mathbf{t}}\mathbf{p} = \nabla_{\mathbf{p}}\mathbf{t}$. Taken into the coordinate system, $t^{b}p^{a}{}_{;b} = p^{b}t^{a}{}_{;b}$. Now, since the Christoffel symbol is symmetric, the terms involving it on both sides, $\Gamma^{a}_{bc}t^{b}p^{c}$ and $\Gamma^{a}_{bc}p^{b}t^{c}$, cancel and so we have $t^{b}p^{a}{}_{,b} = p^{b}t^{a}{}_{,b}$. This is the way that the derivative of the basis vector is removed from the Lie derivative.
6.5. General Relativity
First let us introduce general relativity (GR). It requires that all observers be equally good. “Observers” are conceived as disembodied persons—massless points endowed with a clock attached to a spatial frame of reference—who move on geodesics. “Equally good” means that physical laws are no simpler in one frame than another. This does not necessarily mean that one cannot distinguish between different frames. For example, one could feel acceleration, so a frame of zero acceleration could be determined, but the presence or absence of acceleration would not be relevant for the statement of a physical law. As such, those laws must be defined on manifolds. Special relativity (SR)
does assume that there is no way to distinguish between two un-accelerated frames. However, this must again be seen in a special context of two observers communicating with each other, without reference to a third [
21]. By choosing the frame of reference in which the cosmic microwave radiation is isotropic (up to statistical fluctuations) we
can determine the rest-frame of the universe. As such, GR entails using Lie derivatives and Lie transport.
As in the earlier example, let us take $\mathbf{t}$ to be the unit tangent vector to the geodesic of an observer, O, and $\mathbf{p}$ be the position vector of another observer, O′, relative to O. Then $\nabla_{\mathbf{t}}\mathbf{p}$ and $\nabla_{\mathbf{t}}(\nabla_{\mathbf{t}}\mathbf{p})$ are, respectively, the velocity and the acceleration of O′ as seen by O. In Geometry, the latter is called geodesic deviation and will be denoted by $\mathbf{A}$. In index notation
$$A^{a} = \big(p^{a}{}_{;b}\,t^{b}\big)_{;c}\,t^{c}.$$
Using the Lie transport requirement we can interchange the $\mathbf{t}$ and $\mathbf{p}$ inside the bracket and, on expanding the bracket, use the geodesic condition on one term. What remains is
$$A^{a} = \big(t^{a}{}_{;b;c} - t^{a}{}_{;c;b}\big)\,t^{b}\,p^{c}.$$
Now, by definition, the Riemann curvature tensor is defined [39] by the skewed second derivative of a vector, so that we get
$$A^{a} = R^{a}{}_{bcd}\,t^{b}\,p^{c}\,t^{d}.$$
This means that acceleration is related to the curvature of the spacetime manifold. Physically, acceleration is caused by matter and energy. We saw how energy conservation arose for a system of particles in classical mechanics. It would be useful to generalize that concept to other physical fields in a four-dimensional spacetime.
On account of mass-energy equivalence, we no longer have conservation of mass and energy separately, but only of the matter-energy tensor, also called the stress–energy tensor, $T^{\mu\nu}$ (see [39], Ch. 4). It includes the momentum 4-vector and the spatial fluid stress tensor $T^{ij}$, where $T^{ij}\,dS_{j}$ is the force acting on an area element of the fluid, $dS_{j}$. If it is an irrotational perfect fluid, the stress–energy tensor density is
$$T^{\mu\nu} = (\rho + p)\,u^{\mu}u^{\nu} - p\,g^{\mu\nu},$$
and the usual conservation laws for relativistic fluids are derived by the requirement that the flux of fluid and energy crossing a closed surface is conserved. Using Gauss' divergence theorem, it gives
$$T^{\mu\nu}{}_{;\nu} = 0.$$
For a field with a vector-valued potential, $A_{\mu}$, the stress–energy tensor density is
$$T^{\mu\nu} = F^{\mu\lambda}F^{\nu}{}_{\lambda} - \frac{1}{4}\,g^{\mu\nu}F^{\lambda\sigma}F_{\lambda\sigma}.$$
For a potential of rank zero we get a scalar field, for rank one a vector field, and for rank two a tensor field. For any number of fields, we simply have to add the stress–energy tensor for each one to get the total stress–energy tensor.
Returning to GR, we need that $T^{\mu\nu}$ be related to the Riemann tensor, $R^{\mu}{}_{\nu\rho\sigma}$, by a second rank tensor function, which is divergence-free. Since the function is to be second rank and the Riemann tensor is fourth rank, we need to take the trace of the Riemann tensor, namely the Ricci tensor. Using the Bianchi identities that are satisfied by the Riemann tensor, we obtain the linear combination of the Ricci tensor and scalar that is divergence-free, $G^{\mu\nu} = R^{\mu\nu} - \frac{1}{2}Rg^{\mu\nu}$, called the Einstein tensor, yielding the simplest non-trivial relation, called the Einstein Field Equations (EFEs),
$$G^{\mu\nu} + \Lambda g^{\mu\nu} = \kappa T^{\mu\nu},$$
where $\kappa = 8\pi G/c^{4}$ gives the coupling of gravity with matter, $G$ being Newton's constant, and $\Lambda$ is a constant of integration, called the “cosmological constant”. In their full generality they are a system of ten non-homogeneous, second order nonlinear PDEs for ten functions (metric coefficients) of four variables. Even given the source, it would be impossible to solve them generally.
Let us now take a quick look at Noether symmetries for the geodesic Lagrangian (
44) in GR for some well-known spacetimes, which have been classified according to their symmetry generators. This has been carried out for static plane, static spherical, and static cylindrically symmetric spacetimes by Feroze and her collaborators [
40,
41,
42,
43,
44,
45]. The complete classification of non-static plane and non-static spherically symmetric spacetimes via Noether symmetries was worked out by Jamil et al. [
46,
47]. The Lie and Noether symmetries of geodesic equations have been studied for the Friedmann metrics by Tsamparlis and Paliathanasis [
48]. These symmetries have also been obtained for some of the Bianchi-type spacetimes in [
49,
50,
51,
52]. The complete analysis of Noether symmetries for Gödel-type and pp-wave spacetimes was studied by Camci et al. [
53,
54,
55].
6.5.1. Spacetime Symmetries
This is where the need to use symmetries comes in. Given enough symmetries we can reduce the number of independent variables. Thus, if we have the maximum possible symmetry of a flat space, i.e., Minkowski space, the functions are fully given directly, being constant in Cartesian coordinates and only trivially dependent on the independent variables in other coordinates. This brings out a problem of determining whether the dependence is trivial or not. Can a change of the independent variables remove the apparent dependence on them? For this purpose we need an invariant characterization of the symmetry involved. Since the metric tensor is the (tensor) potential, we need that the symmetry direction be one along which $\mathbf{g}$ is Lie transported. If the symmetry vector is denoted by $\boldsymbol{\xi}$, we need that $\pounds_{\boldsymbol{\xi}}\mathbf{g} = 0$. In index notation this reduces to the Killing equation,
$$\xi_{\mu;\nu} + \xi_{\nu;\mu} = 0.$$
A vector, $\boldsymbol{\xi}$, satisfying this equation is called a Killing vector (KV), or an isometry. If $\boldsymbol{\xi}$ is a unit timelike isometry then we can choose coordinates such that $\boldsymbol{\xi} = \partial/\partial t$ and use it to define the time coordinate, and the metric coefficients will be time-independent in these coordinates. Similarly, if there are three unit spacelike isometries, $\boldsymbol{\xi}_{(i)}$, we can use them to define three spatial coordinates by $\boldsymbol{\xi}_{(i)} = \partial/\partial x^{i}$, for $i = 1, 2, 3$. In the former case, the metric coefficients will be independent of time, and in the latter of space. As such, we will have time translation invariance in the former case and space translation invariance in the latter case. The Noether invariants corresponding to them will be the energy and momentum. If the spacelike vectors generate the Lie algebra $so(3)$, there can only be trivial dependence of the metric coefficients on the three coordinates and there will be rotational invariance. The corresponding conserved quantity will be the angular momentum. If the timelike and spacelike vectors, in pairs, generate an $so(1,1)$, there will also be invariance under Lorentz transformations and the corresponding conserved quantity will be the spin angular momentum. If all (six) of these “rotational” symmetries exist, the Lie algebra will be $so(1,3)$. If the four translations also exist the spacetime must be flat and one can choose Cartesian coordinates, in which the metric tensor is diagonal, being 1 in the time component and $-1$ in each spatial component.
In a curved spacetime, the symmetry cannot be higher dimensional, but can have the same number of dimensions if the curvature is a non-zero constant. Thus, the number of symmetry generators will be ten and the associated group will be the de Sitter (dS) group, $SO(1,4)$, for positive curvature and the anti-de Sitter (AdS) group, $SO(2,3)$, for negative curvature, both of which contain the Lorentz group $SO(1,3)$ as a subgroup. Since the metric has a timelike KV, there is still energy conservation. However, the distinction between linear and angular momentum disappears. The point is that the geodesic on which the linear momentum was being conserved, now bends around and closes on itself for dS and bends hyperbolically away for AdS, so that it becomes rotational motion as well. In all other cases, some conservation laws will be lost.
To be concrete, the dS metric is:
$$ds^{2} = \Big(1 - \frac{r^{2}}{R^{2}}\Big)dt^{2} - \Big(1 - \frac{r^{2}}{R^{2}}\Big)^{-1}dr^{2} - r^{2}\big(d\theta^{2} + \sin^{2}\theta\,d\phi^{2}\big),$$
where distances are measured in light seconds and $R$ is a constant. If, instead, the sign of the $r^{2}/R^{2}$ terms is reversed, we have the AdS metric. In these cases the stress–energy tensor is proportional to the metric tensor and so, by the EFEs, the Ricci tensor must also be proportional to the metric tensor. Such spaces are called Einstein spaces. In the former case, the stress–energy tensor has a positive trace, and in the latter a negative trace. Were we to interpret that trace as the energy, as would be done for a fluid, the former would have a positive energy and the latter a negative energy. The difference this makes may be best visualized by conceiving of a very large space city, in which some live on the outside and others on the inside. Those on the outside would see a horizon below which there is nothing, albeit smaller than they would see on Earth. The latter would see everything collecting up at the horizon and above that a sky that would look just like the city around them. The former is what one gets with positive energy and the latter what arises with negative energy.
The gravitational field for a point particle of mass $m$ and charge $Q$ is given by
$$ds^{2} = \Big(1 - \frac{2m}{r} + \frac{Q^{2}}{r^{2}}\Big)dt^{2} - \Big(1 - \frac{2m}{r} + \frac{Q^{2}}{r^{2}}\Big)^{-1}dr^{2} - r^{2}\big(d\theta^{2} + \sin^{2}\theta\,d\phi^{2}\big),$$
in units with Newton's gravitational constant chosen to be unity, and yields a traceless Ricci tensor. If there is no charge, $Q = 0$, the spacetime is Ricci-flat. The former is called the Reissner–Nordström metric, and the latter the Schwarzschild metric. In both cases we are only left with time translational and rotational invariance, or energy and angular momentum conservation. The fact that momentum conservation is lost is easily seen by considering a test particle left near a gravitational source, in which case it will fall towards the source. That spin angular momentum conservation is lost has the consequence that precession can be generated or lost in a gravitational field, and can be tested. For gravitational waves time translation is also lost and so energy is not conserved.
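The surviving isometries can be verified directly from the Killing equation. The following sketch (assuming SymPy; the helper names are our own) checks that the time-translation and axial-rotation vectors of the Schwarzschild metric satisfy the Killing equation, while a radial vector does not.

```python
import sympy as sp

t, r, th, ph, M = sp.symbols('t r theta phi M', positive=True)
x = (t, r, th, ph)
f = 1 - 2*M/r

# Schwarzschild metric in units G = c = 1, signature (+ - - -)
g = sp.diag(f, -1/f, -r**2, -r**2*sp.sin(th)**2)
ginv = g.inv()

def Gamma(a, b, c):
    return sum(ginv[a, d]*(sp.diff(g[d, b], x[c]) + sp.diff(g[d, c], x[b])
                           - sp.diff(g[b, c], x[d])) for d in range(4))/2

def killing_residual(xi_up):
    """Components of xi_(a;b) + xi_(b;a) for a vector with upper components xi_up."""
    xi_dn = [sum(g[a, b]*xi_up[b] for b in range(4)) for a in range(4)]
    def cov(a, b):   # xi_{a;b}
        return sp.diff(xi_dn[a], x[b]) - sum(Gamma(c, a, b)*xi_dn[c] for c in range(4))
    return {sp.simplify(cov(a, b) + cov(b, a)) for a in range(4) for b in range(4)}

print(killing_residual([1, 0, 0, 0]))   # {0}: d/dt is a Killing vector
print(killing_residual([0, 0, 0, 1]))   # {0}: d/dphi is a Killing vector
print(killing_residual([0, 1, 0, 0]))   # not all zero: d/dr is not an isometry
```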
Some more conservation laws are found in an Einstein Universe, which has the symmetry group $\mathbb{R}\times SO(4)$, for which the coefficient of the time metric coefficient in Equation (56) is unity and the coefficient of the radial coefficient is given by Equation (57). This is an Einstein space and we recover the usual translation invariance as well, leading to linear momentum conservation. If Equation (58) is satisfied instead of Equation (57), it is the anti-Einstein Universe and the usual angular momentum conservation is replaced by a spin angular momentum conservation, as we get an $SO(1,3)$ instead of an $SO(4)$ symmetry group. The total number of conserved quantities is seven.
There are other metrics with 6 isometries, corresponding to spaces that have a constant coefficient for the solid angle element and they have the corresponding Noether invariants, but their physical significance is not quite that obvious. There are no spherically symmetric metrics with only 5 isometries as was shown in a complete classification of spacetimes by their isometries [
56]. There are many other spherically symmetric spacetimes with 4 KVs and any number with only the 3 of angular momentum that define spherical symmetry.
Dispensing with spherical symmetry, the Kerr metric represents a spinning point mass,
m, with angular momentum per unit mass,
a,
which has two KVs, for time translation and axial rotation. However, it has three Noether symmetries, two of which correspond to energy and angular momentum and an additional one coming from a Killing
tensor (see, for example, [
57]). It has been shown [
58] that this invariant corresponds to the total angular momentum squared. It is interesting to note that these are the same quantities that are conserved in quantum mechanics and that the angular momentum vector is not conserved there either [
59].
6.5.2. Conformal Symmetries
In school geometry, one first learns of congruent triangles and then of similar triangles. The congruent triangles give invariance of the figure under translation, but the similar triangles give it under translation and scaling, which is provided by changing lengths while leaving angles invariant. This is achieved by scaling the metric tensor, $g_{\mu\nu}\to\Omega^{2}(x)\,g_{\mu\nu}$, which is called a conformal transformation. In this case, there will be a conformal Killing vector (cKV), or conformal isometry. Thus, if there is a timelike cKV, though energy conservation is lost, a re-scaled energy is conserved. This applies, for example, to the Friedmann metrics
$$ds^{2} = dt^{2} - a^{2}(t)\Big[d\chi^{2} + \Sigma_{k}^{2}(\chi)\big(d\theta^{2} + \sin^{2}\theta\,d\phi^{2}\big)\Big],$$
where $k$ gives the normalized constant curvature, being $+1$ for a sphere, 0 for a plane and $-1$ for a hyperbola; the corresponding $\Sigma_{k}(\chi)$ are $\sin\chi$, $\chi$ and $\sinh\chi$. For the first case, the range of $\chi$ is $[0, \pi]$, for the other two it is semi-infinite. In this case, there are only 6 isometries as the timelike one is lost, and so energy is not conserved. However, there is a cKV, $\partial/\partial t$ (in terms of the conformal time), and so energy is conserved up to scaling. The scaling comes from the expansion factor for the universe, $a(t)$, and the re-scaled conserved quantity is the number of particles in the expanding volume. In terms of the Lagrangian for the metric tensor, the Lagrangian is not conserved but it is scaled. This yields a conformal Noether invariant, which is the energy.
6.5.3. Symmetries of the Electromagnetic Field
In the Lagrangian for the electromagnetic field, given immediately after Equation (20), the first term gives the source and the second represents the pure electromagnetic field. The corresponding stress–energy tensor for the pure electromagnetic field, given by Equation (53), is
$$T^{\mu\nu} = F^{\mu\lambda}F^{\nu}{}_{\lambda} - \frac{1}{4}\,g^{\mu\nu}F^{\lambda\sigma}F_{\lambda\sigma}.$$
Writing this in terms of the electric and magnetic fields, the Lagrangian is $(\mathbf{E}^{2} - \mathbf{B}^{2})/2$; for the pure time component of the stress–energy tensor, i.e., the Hamiltonian or energy, we get $(\mathbf{E}^{2} + \mathbf{B}^{2})/2$; the space-time part is the momentum vector, which gives the Poynting vector, $\mathbf{E}\times\mathbf{B}$, and the spatial part gives the Maxwell stress tensor, which signifies the symmetric stresses in a shear-free “fluid”.
Now, the 4-gradient of a scalar function can be added to the 4-vector potential, $A_{\mu}\to A_{\mu} + \lambda_{,\mu}$, without changing the Maxwell tensor, $F_{\mu\nu}$, since the latter is the 4-curl of the former, and the curl-grad vanishes in all dimensions, i.e., if $\tilde{A}_{\mu} = A_{\mu} + \lambda_{,\mu}$ then $\tilde{F}_{\mu\nu} = F_{\mu\nu}$. This non-uniqueness of the field is called “gauge-freedom” and we say that $\mathbf{A}$ is invariant under gauge transformations. This freedom must have a group associated with it, and consequently a “Noether charge”. A conservation associated with it comes from the observation that the second divergence of the Maxwell tensor must be zero, as the second derivative is symmetric but the Maxwell tensor is skew. Hence, the divergence of the first of the Maxwell Equation (16) implies that $J^{\mu}{}_{;\mu} = 0$, which amounts to $\partial\rho/\partial t + \nabla\cdot\mathbf{j} = 0$, which says that the total time derivative of the charge density is zero, i.e., the electric charge is conserved.
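The gauge freedom itself is also easy to verify symbolically: adding the gradient of an arbitrary function to the potential leaves the field tensor unchanged. The function names below are hypothetical placeholders.

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
X = (t, x, y, z)

A = [sp.Function(f'A{mu}')(*X) for mu in range(4)]   # hypothetical 4-potential
lam = sp.Function('lam')(*X)                          # arbitrary gauge function

def field_tensor(pot):
    return [[sp.diff(pot[n], X[m]) - sp.diff(pot[m], X[n]) for n in range(4)] for m in range(4)]

F = field_tensor(A)
F_gauged = field_tensor([A[mu] + sp.diff(lam, X[mu]) for mu in range(4)])

print({sp.simplify(F_gauged[m][n] - F[m][n]) for m in range(4) for n in range(4)})   # {0}
```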
For the transformation $\psi\to e^{i\alpha(x)}\psi$, the ordinary derivative does not transform in the same way, since it picks up an extra term in $\alpha_{,\mu}$. If we now define the so-called “covariant derivative”, $D_{\mu} = \partial_{\mu} - ieA_{\mu}$, then the covariant curl gives the same $F_{\mu\nu}$, so that we recover gauge invariance by multiplying by a position-dependent phase. The multiplication by a phase can be understood geometrically by considering $e^{i\theta}$ acting on a position vector in the complex plane, i.e., the number $z = x + iy$. This transforms to the new point $e^{i\theta}z$, which is just a rotation through the angle $\theta$. If $\theta$ is a constant we say that it is a global gauge transformation and if it is variable a local gauge transformation. Thus, for the electromagnetic field we have local gauge invariance. The physically relevant symmetry is the local gauge symmetry which yields the conservation of electric charge. Written as the addition of the gradient of a scalar function it does not indicate what the associated group is, but in the complex form employed above, this becomes the unitary group in one dimension, $U(1)$. In other words, we can regard Maxwell's theory as the consequence of a local gauge symmetry, called $U(1)$. Local gauge theories really came into their own in quantum theory (QT), which we shall be seeing in the next section.
7. Symmetries in Quantum Theory
In quantum theory (QT) discrete symmetries play a significant role. Since the Lagrangian is left invariant under their action, they are Noether symmetries and will give some conserved quantity. Since they are discrete, they will not reduce the number of variables or order of the equations, but will limit the range. Thus, a reflection symmetry halves the range and a global rotational symmetry limits the variable to a semi-closed interval that can be chosen to be $[0, 2\pi)$. If there is reflection symmetry, the number of solutions is more than halved (as they must all be even functions; or if there is antisymmetry they must all be odd). This break-up applies to translational shifts, and hence to momentum. However, for angular momentum, we get $\mathbf{r}\to-\mathbf{r}$ and $\mathbf{p}\to-\mathbf{p}$, so $\mathbf{L} = \mathbf{r}\times\mathbf{p}\to\mathbf{L}$. This symmetry is called parity. Thus, while in a mirror left is converted to right the angular momentum arrow remains unchanged. If a quantity remains unchanged, it is said to have positive parity and if it reverses direction, it is said to have negative parity. It was believed that all fundamental particles have either one or the other (with no cases of neither) and the total parity in any interaction is conserved.
Salam proposed that parity is violated in weak nuclear interactions (due to which there is radioactive decay), and he sent his draft paper for comments to Wolfgang Pauli, a leader of QT and especially of the spin quantum number, which would be reversed if parity is violated. Pauli sent him back the message, “Tell my young friend Salam to think of something better.” He added, “Parity is conserved—I can feel it in my bones,” punning on the use of the phrase “feel in my bones”, as the phosphorus in the bones would decay if parity is not conserved. The fact is that it is violated and the phosphorus in the bones does decay, but the rate of decay is orders of magnitude less than the biological degradation of the bones and could not be noted. Pauli's reasoning was specious and spurious. When Lee and Yang received the Nobel Prize for independently discovering the same principle, Pauli apologised to Salam, but that did not restore Salam's claim to priority.
There is also time reversal symmetry or asymmetry in any process, in that a film of the process run in reverse would be indistinguishable from the original, e.g., the collision of two billiard balls on a billiards table, or the swing of an ideal pendulum. However, if the surface of the table is rough, or the pendulum is damped, one can distinguish the forward direction from the reverse. Though non-frictional dynamics obeys time reversal invariance, for fundamental particles it was taken for granted that it is conserved, with positive or negative values. Another property of fundamental particles is electric charge. It is found that corresponding to each fundamental particle, there is an otherwise identical particle with opposite charge, called an antiparticle, even though one may be far more common than the other (such as electrons and positrons or such as protons and antiprotons). Writing the parity reversal operator as P, time-reversal as T and charge conjugation as C, though each one may be separately violated it has been proved that their product, CPT, is conserved in all fundamental processes, (the CPT Theorem).
More relevant for recent developments are continuous symmetries. Recall that QT uses a complex "wave function", satisfying the Schrödinger, Klein-Gordon, or Dirac equations. It gives a complex amplitude whose squared magnitude represents the probability of finding a quantum entity at some place at some time. (In fact, even for the classical electromagnetic field theory, it is convenient to use complex variables, see, e.g., [60,61].) When the classical electromagnetic field is "quantized" by Dirac's procedure (see, e.g., [62]), the 4-vector field $A_\mu$ corresponds to a spin-one particle, called the photon. If the gravitational field could be quantized, it would correspond to a spin-two field that people call the graviton. More generally, a field represented by a tensor of rank $n$ corresponds to a spin-$n$ quantized field. These have real representations, but Dirac showed that there would be half-integer spin fields as well, such as the electron, represented by a 4-dimensional complex vector of the representation space of $SL(2, \mathbb{C})$, called a Dirac spinor. Spin-$\frac{n}{2}$ has an $n$-index spinor. The usual vector corresponds to a two index spinor, or a single index tensor. Note that $SL(2, \mathbb{C})$ is locally isomorphic to the Lorentz group $SO(1, 3)$, as they both give rotations of spacetime, but the former is a double covering of the latter because of the complex representation.
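A quick way to see what a double covering means (a standard computation, here using only the rotation subgroup $SU(2)$) is to rotate a spin-$\frac{1}{2}$ state through $2\pi$:
\[
U(\theta) = \exp\!\left(-\tfrac{i}{2}\theta\,\sigma_z\right) = \begin{pmatrix} e^{-i\theta/2} & 0 \\ 0 & e^{i\theta/2}\end{pmatrix}, \qquad U(2\pi) = -I,
\]
so the two matrices $\pm U$ correspond to one and the same rotation, and the spinor only returns to itself after a rotation through $4\pi$.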
While rotations in two dimensions commute, and so $SO(2)$ and $U(1)$ are Abelian, rotations in higher dimensions do not in general, and so their groups are non-Abelian. As $U(1)$ is the simplest unitary group, the next simplest unitary group is $U(2)$. Just as the transpose of an orthogonal matrix is its inverse, the Hermitian conjugate of a unitary matrix is its inverse. Since a $2\times 2$ complex matrix has four complex entries subject to four real constraints, $U(2)$ has four independent parameters and hence $SU(2)$ has three. As the 3-d rotations are also three, $SU(2)$ is locally isomorphic to $SO(3)$, again with a double covering. This is the symmetry group for weak interactions at high enough energy, $\sim 150$ GeV, as shown by Glashow, Salam and Weinberg [63]. In fact, the weak and electromagnetic forces are unified at energies $\sim 150$ GeV but the electroweak (EW) symmetry group, $SU(2)_L \times U(1)_Y$, breaks down at lower energies to the usual $U(1)$ of electromagnetism below that. The "Y" is the conserved hypercharge, which mixes the charge of electromagnetism with that of the weak nuclear force, on account of which the photon of electromagnetism is a mixture of the bare $U(1)_Y$ gauge boson and the neutral component of the $SU(2)$ triplet, called a neutral current (the other two being charged currents), above the unification energy. Murray Gell-Mann [64] had proposed an $SU(3)$ gauge group for three quarks, for the strong nuclear force. Originally an energy-dependent term not respecting the symmetry was inserted in the Lagrangian to break the symmetry, which is negligible at lower energies, but dominates at higher energies, and so the symmetry breaks—gradually. Peter Higgs [65] suggested a mechanism whereby the symmetry breaks spontaneously at a critical energy, due to a field with a vacuum expectation value at higher energies that acquires a mass and becomes a physical spin zero particle, called a "Higgs boson", as the energy drops below its mass. This mechanism was used by Salam to develop the "electroweak" unified theory, and was then used for $SU(3)$-breaking. The new conserved quantum number in this case was called "colour", and the resulting theory called quantum chromo-dynamics (QCD), denoted by $SU(3)_c$. It has eight generators and that yields eight gauge bosons called "gluons". The standard model of particle physics is, then, $SU(3)_c \times SU(2)_L \times U(1)_Y$.
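Counting generators makes the structure concrete (standard group theory, independent of any particular reference): the dimension of $SU(n)$ is $n^2 - 1$ and that of $U(1)$ is 1, so for the standard model gauge group
\[
\dim\big[SU(3)_c \times SU(2)_L \times U(1)_Y\big] = (3^2 - 1) + (2^2 - 1) + 1 = 8 + 3 + 1 = 12,
\]
corresponding to the eight gluons, the $W^\pm$ and the neutral $W$ (the latter mixing with the hypercharge boson to give the $Z^0$ and the photon), and the hypercharge boson itself.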
7.1. Gauge Grand Unification Symmetry
The critical energy of the standard model is the same as the EW theory, but it consists of three forces with very different strengths: (a) the weak; (b) the electromagnetic; and (c) the strong, in increasing order of strength. As the interaction energy increases the strengths change; the weak and the electromagnetic become stronger, at relative rates that go inversely as their relative strengths, and the strong gets weaker. At first it appeared that all three should meet somewhere around $10^{14}$–$10^{16}$ GeV. Since the universe is cooling as it expands (like the gas in a refrigerator), this was taken to indicate that there may have been a time in the early stages of the universe when all three were unified and this "grand unified theory" (GUT) broke at the critical energy to yield the universe as we see it now. In that case we would need a bigger group that would break down to the groups of the standard model.
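The statement that the strengths change with energy can be made precise: at one loop each inverse coupling runs linearly in the logarithm of the energy (a textbook result; the coefficients $b_i$ depend on the particle content and are quoted only schematically here):
\[
\alpha_i^{-1}(\mu) = \alpha_i^{-1}(\mu_0) - \frac{b_i}{2\pi}\,\ln\frac{\mu}{\mu_0}, \qquad i = 1, 2, 3,
\]
so each $\alpha_i^{-1}$ is a straight line when plotted against $\ln\mu$. Grand unification requires the three lines to pass through a single point; the later, more precise, measurements mentioned below showed that in the standard model they instead form a small triangle.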
In 1971, Jogesh Pati and Abdus Salam proposed an $SU(4)$ GUT (published in 1974 [66]), with the weakly interacting particles (called leptons) as a fourth colour prior to symmetry breaking, and with four "flavours" of the leptons/quarks, giving the usual protons, neutrons and electrons, and a more massive set that was known at the time. Whereas Gell-Mann used fractional charges for his quarks: $\frac{2}{3}$ for $u$; $-\frac{1}{3}$ for $d$; $-1$ for $e$; and $0$ for $\nu$, Pati and Salam used integer charges, but it has an enormous 225 generators! The smallest simple group containing the full standard model is $SU(5)$, which could break into the standard model at some critical energy, taken to be $\sim 10^{15}$ GeV, and there would be a $\sim 10^{15}$ GeV Higgs boson associated with that unification. This was proposed in 1974 by Howard Georgi and Sheldon Glashow [67]. The number of generators would be 24 in this case. By this time, there was reason to believe that there were three sets of the basic four particles, and the two heavier sets were not needed for the unified theory, but came as redundant copies. The one set had three colours for the quarks and only one each for the leptons. However, the theory took left-handed and right-handed spins for the quarks and the electron, but only a left-handed version of the neutrino, $\nu_L$. This unaesthetic break-up was put in a cumbersome way into two representations of the group. To avoid this problem, in 1975 Harald Fritzsch and Peter Minkowski [68] proposed an $SO(10)$ theory that put all the particles into a single multiplet, but at the expense of expanding the number of generators to 45. While the $SU(5)$ breaks in a single step, $SO(10)$ can break into a left-right symmetric model, which then breaks down into the standard model.
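The group-theoretic counting behind these statements is standard: the dimension of $SU(n)$ is $n^2 - 1$ and that of $SO(n)$ is $n(n-1)/2$, and one generation of fermions fits as follows (the right-handed neutrino supplying the sixteenth state):
\[
\dim SU(5) = 5^2 - 1 = 24, \qquad \dim SO(10) = \tfrac{10\cdot 9}{2} = 45,
\]
\[
\text{one generation: } \bar{5}\oplus 10 \ \text{of } SU(5)\ (15\ \text{states}), \qquad 16 \ \text{of } SO(10)\ (\text{including } \nu_R).
\]
This is the precise sense in which the $SU(5)$ assignment is split "in a cumbersome way into two representations", while $SO(10)$ puts everything "into a single multiplet".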
More precise experiments showed that the three strengths do not come together at a single energy, with the extrapolations forming a triangle. As such, the basic raison d'être was lost and the whole unification enterprise seemed to be in serious jeopardy. Further, since the quarks and leptons could inter-convert at sufficiently high energies, the proton would be unstable. The predictions for proton decay of $SU(5)$ and $SO(10)$ were experimentally violated, and the only reason the Pati–Salam model escaped was that it did not have a definite prediction—which is hardly a recommendation. Something more was needed to save GUTs.
7.2. Supersymmetry and Unification
Quantum field theory (QFT) had problems since its inception. Taken beyond the lowest level, the calculations for any interaction yield infinite probabilities, called "divergences". Since the wave functions are unit norm vectors in a Hilbert space, it is argued that if they seem to become infinite they should be "renormalized". However, it turns out that not all theories can be renormalized; gravity, in particular, cannot. Renormalizing infinity seems suspect to many in any case. A finite theory, which can then be renormalized in a meaningful way, is needed. A method came from a novel proposal of treating spinors and tensors as different representations of a unified symmetry, but using commutators for the products of tensors and anti-commutators for the products of spinors (or multi-spinors), which complicates the Lie algebra to what is called a "super-algebra". This
supersymmetry (SUSY) was proposed by Julius Wess and Bruno Zumino [
69]. Their cumbersome formalism was put into a more usable form by Abdus Salam and his student, John Strathdee [
70].
Tensors correspond to fields with integer spins in units of Planck's constant (divided by $2\pi$), ℏ, while spinors correspond to half-integer spins. The thermodynamic distributions, giving the speeds or energies of the gas particles, are different for the two at low temperatures (the former particles are called bosons and the latter fermions) but behave in much the same way at higher temperatures. In no way could this be taken to be the unification talked of. However, there is no spontaneous symmetry breaking mechanism for it either. Nevertheless, assuming that the symmetry does apply in the sense of a fundamental theory at some higher energy and then breaks down, it would modify the standard model significantly by introducing an extra parameter of the symmetry-breaking energy. In the standard model the number of Higgs bosons is not constrained. For definiteness one takes the minimal number of Higgs fields, and this is called the minimal standard model (MSM). With SUSY one gets the MSSM. It was found that the three constants came together at $\sim 10^{16}$ GeV if SUSY is assumed to be broken at 1 TeV = $10^{3}$ GeV. Thus, if SUSY is to save GUTs, it must be seen to break at this energy, which was reached long ago at the Large Hadron Collider at CERN. It has not been seen so far, and people are jumping through hoops to keep SUSY alive, but she is on life-support.
Could a higher unification of all forces save the day? The divergences of each boson seemed to be canceled, at the lowest non-trivial level of calculation of interaction cross-sections, by those of its fermionic super-partner, and vice versa. Gravity is a non-renormalizable theory. With SUSY it is called
supergravity (SUGRA). The super-partner of the hypothetical “graviton” was a spin 3/2 field called a
gravitino. In SUGRA it turned out that at the next non-trivial level the divergences
did not cancel. This problem was resolved by using an
extended SUGRA, in which a second gravitino was inserted. This mechanism, of introducing a new gravitino at the next level, worked up to the eighth level because of the extra parameter inserted, but by the same token it will not work beyond that [
71]. One also needed to go to higher dimensions of spacetime and strings or membranes instead of point particles [
72].
7.3. Twistor Quantization and Unification
A totally different approach to combine GR and QT was proposed by Roger Penrose. As he saw it, the problem with combining the two is that since the field to be quantized is the metric tensor, which defines the distance between two
points, once the quantization is conducted the spacetime points will no longer form a continuum, but will be discrete. Thus, we would no longer be able to use Calculus for Geometry. He pointed out that the quantity that we know is quantized, and that we know how to deal with, is angular momentum or spin angular momentum. Regarding particles as just a collection of spins, one particle could only know of the existence of another by exchanging a unit of spin, $\hbar/2$. He wrote a paper entitled Spin Networks in 1967 or 1968, which he gave, unpublished, to one of us (AQ) when suggesting possible lines for AQ's PhD research, to see if he wanted to work on the idea. It was later published in 1971 [
73]. He regarded the bundles of spin as moving at the speed of light, so that he could express them in terms of
spinors (for which he had given a geometrical visualization [
74]).
Spinors can be thought of as vectors of the representation space of the symplectic group, $Sp(2n, \mathbb{C})$, over $\mathbb{C}$. This group leaves invariant the null structure, so that all vectors have zero magnitude. Penrose was dealing with two-component spinors, so that the symplectic structure is provided by the Levi-Civita symbol, $\epsilon_{AB}$, which is 0 when $A = B$, 1 when $A = 0, B = 1$ and $-1$ when $A = 1, B = 0$. Using the set of four Pauli spin matrices, $\sigma_\mu$, a vector, $V^\mu$, can then be written as a complex matrix, $V^{AA'} = V^\mu\sigma_\mu^{\;AA'}$ (up to a conventional normalization). If the vector is null then we can write $V^{AA'} = \kappa^A\bar{\kappa}^{A'}$. Here $\kappa^A$ is a two-component spinor (for spin vector) and can be visualised as a flagpole lying along the null cone with a half-plane element stuck on it like a pennant. The symmetry group for it is $SL(2, \mathbb{C})$. Multiplying the spinor by a complex number scales the flagpole by the squared magnitude of the factor and the pennant is rotated through twice the argument of the factor. This explains why $SL(2, \mathbb{C})$ is a double covering of the Lorentz group, since the scaling by $-1$ will leave the vector unchanged but will reverse the direction of the spinor.
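As a concrete check (a standard computation in the two-spinor formalism, in the convention without the factor of $1/\sqrt{2}$ that some texts insert), take the simplest spinor:
\[
V^{AA'} = V^\mu\sigma_\mu^{\;AA'} = \begin{pmatrix} V^0 + V^3 & V^1 - iV^2 \\ V^1 + iV^2 & V^0 - V^3 \end{pmatrix}, \qquad \det V^{AA'} = (V^0)^2 - (V^1)^2 - (V^2)^2 - (V^3)^2 ,
\]
\[
\kappa^A = \begin{pmatrix}1\\0\end{pmatrix} \;\Longrightarrow\; \kappa^A\bar{\kappa}^{A'} = \begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix} \;\Longrightarrow\; V^\mu = \tfrac{1}{2}\,(1, 0, 0, 1),
\]
which indeed lies on the null cone; since any matrix of the form $\kappa^A\bar{\kappa}^{A'}$ has zero determinant, every such vector is null, and rescaling $\kappa \to \lambda\kappa$ rescales $V^\mu$ by $|\lambda|^2$.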
The spinor representation of the covariant derivative operator $\nabla_a$ is $\nabla_{AA'}$, and the Killing equation can be written for a spinor as $\nabla_{A'}^{\;(A}\omega^{B)} = 0$, which is called the twistor equation [75]. The twistor contains the information of the spinor, $\omega^A$, and an associated constant spinor, $\pi_{A'}$. Thus, the twistor is given by the pair of spinors, $Z^\alpha = (\omega^A, \pi_{A'})$. Single or multi-index twistors are solutions of the zero rest-mass field equations [
76], with the number of indices corresponding to the spin of the field. A twistor has four complex components, corresponding to eight real components, but the relevant information does not depend on the overall magnitude, as the position can go on sliding up the flagpole out to infinity. Thus, the twistor is an entire null ray. As such one only needs the projective space of twistors, which is three complex, or six real, dimensional. The position in the twistor will generally be complex and the twist in the congruence of geodesics (given by the relevant spin coefficient, see [77]) is either positive or negative. The corresponding twistors are said to belong to $\mathbb{T}^+$ or $\mathbb{T}^-$. However, when the position is real, there is one extra constraint and there are only five real components left. Such twistors are called null twistors. Hence, there is a five-dimensional hypersurface, N, separating the positive and negative projective twistor spaces, $\mathbb{PT}^+$ and $\mathbb{PT}^-$. Elements of N correspond to entire real null rays in real Minkowski space, and the congruence of such rays is also five dimensional. The Penrose transform between the twistor space and complexified Minkowski space signifies a duality between the two.
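The correspondence can be made explicit through the incidence relation of twistor theory (standard material, quoted in one common convention; signs and factors of $i$ vary between texts):
\[
Z^\alpha = (\omega^A, \pi_{A'}), \qquad \omega^A = i\,x^{AA'}\pi_{A'},
\]
\[
Z^\alpha\bar{Z}_\alpha = \omega^A\bar{\pi}_A + \bar{\omega}^{A'}\pi_{A'} = i\,\big(x^{AA'} - \bar{x}^{AA'}\big)\,\pi_{A'}\bar{\pi}_A ,
\]
so the twistor is null ($Z^\alpha\bar{Z}_\alpha = 0$) precisely when it is incident with a real point $x$, which is the statement above that elements of N correspond to real null rays.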
We can generate solutions of the zero rest-mass field equations by using certain “intertwining integrals” [
78,
79]. This enables us to do contour integration over
N, for twistor fields, which will start in one of the six-dimensional spaces, pass through
N and then go back into the original one. This procedure yields scattering amplitudes, and hence probabilities, for the various scattering processes in the high energy limit [
80,
81]. Whereas standard QFT yields infinite probabilities, and SUSY, SUGRA, superstrings and supermembranes hope to achieve cancellations of the infinities, the twistor approach yields finite answers automatically, and the renormalization involved is only division by a magnitude obtained by summing a convergent series of finite magnitude terms.
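Schematically, the "intertwining integrals" take the form of the Penrose contour-integral representation of zero rest-mass fields (written here in a common convention, suppressing the details of the contour and of the homogeneity degree, which is fixed by the spin):
\[
\phi_{A'B'\cdots C'}(x) \;=\; \frac{1}{2\pi i}\oint_{\Gamma} \pi_{A'}\pi_{B'}\cdots\pi_{C'}\; f\big(i\,x^{AA'}\pi_{A'},\, \pi_{A'}\big)\; \pi_{E'}\,d\pi^{E'},
\]
where $f$ is holomorphic on a suitable region of twistor space and homogeneous of an appropriate negative degree; any such $f$, integrated over a contour, automatically yields a solution of the zero rest-mass field equations.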
8. Complex Lie and Noether Symmetries
In his work, Lie had considered the transformation of the independent and dependent variable by differentiable transformations, called
point transformations, as the space of the variables can be thought of as two-dimensional, one for the independent and one for the dependent, with the specific values of the variables represented by a point. When the variables are transformed the points get shifted. Thus,
$(x, y) \to (\bar{x}(x, y), \bar{y}(x, y))$ is a point transformation. It is particularly useful to make the differentiability explicit, by using the transformations in infinitesimal form, so that one sees a smooth path traced out by the moving point. Thus, we can write $\bar{x} = x + \epsilon\,\xi(x, y) + O(\epsilon^2)$, $\bar{y} = y + \epsilon\,\eta(x, y) + O(\epsilon^2)$, where $\epsilon$ is an infinitesimal. The infinitesimal generator of the transformation is then $\mathbf{X} = \xi(x, y)\,\frac{\partial}{\partial x} + \eta(x, y)\,\frac{\partial}{\partial y}$.
A scalar ODE is said to be
symmetric or
invariant under a transformation if the graph of its solution is preserved by the transformation. He provided various methods to check whether a given system of differential equations is invariant under a given point transformation and others were developed later, see, e.g., [
57]. In particular, he discussed when the equations could be converted to linear form by point transformations, called
linearization, and thereby solved comparatively easily. He especially studied the criteria for scalar second order ODEs to be so transformed [
15], see, e.g., [
82]. Despite the importance of PDEs in applications (especially in Fluid Dynamics), we restrict our attention to ODEs. The reason is that while the general solution of an ODE is unique up to (at most) as many arbitrary constants as the order of the equation, PDEs have infinitely many solutions. The definite statements available for ODEs are lost for PDEs.
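As a simple illustration in the notation introduced above (the particular equation is chosen only for definiteness), the uniform scaling generator leaves the simplest linear ODE invariant:
\[
\mathbf{X} = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}: \qquad \bar{x} = e^{\epsilon}x, \quad \bar{y} = e^{\epsilon}y \quad\Longrightarrow\quad \frac{d\bar{y}}{d\bar{x}} = \frac{dy}{dx},
\]
so the equation $y' = y/x$ is mapped to $\bar{y}' = \bar{y}/\bar{x}$, and the graph of any solution $y = cx$ is carried into the graph of a solution, which is exactly the invariance of the graph referred to above.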
For second order ODEs,
we need to include the first derivative as if it were an independent variable and use a 3-d space. For higher order scalar ODEs the space has to be
extended (or
prolonged) to include all derivatives up to the next to highest derivative, as if all are independent of each other. The graph of the solution must then remain invariant in the projected two-dimensional space. Thus, for the second order scalar ODE the prolonged infinitesimal generator is $\mathbf{X}^{[1]} = \xi\,\frac{\partial}{\partial x} + \eta\,\frac{\partial}{\partial y} + \eta^{(1)}\,\frac{\partial}{\partial y'}$, where $\eta^{(1)}$ is given in terms of $\xi$, $\eta$ (and their derivatives) and the derivative of $y$. As the order increases the generator gets prolonged further by adding another term involving the derivative with respect to the next order derivative. For an $n$th order ODE we need to prolong up to $\mathbf{X}^{[n-1]}$, where $\eta^{(k)} = \frac{d\eta^{(k-1)}}{dx} - y^{(k)}\frac{d\xi}{dx}$, with $\eta^{(0)} = \eta$ and $\frac{d}{dx}$ the total derivative.
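For instance (a standard computation with the formula just given), the first prolongation works out explicitly to
\[
\eta^{(1)} = \frac{d\eta}{dx} - y'\frac{d\xi}{dx} = \eta_x + (\eta_y - \xi_x)\,y' - \xi_y\,y'^{\,2},
\]
so for the rotation generator $\mathbf{X} = -y\,\partial_x + x\,\partial_y$ (with $\xi = -y$, $\eta = x$) one finds $\eta^{(1)} = 1 + y'^{\,2}$, the familiar statement that an infinitesimal rotation changes the slope $y' = \tan\theta$ by $\epsilon\,(1 + y'^{\,2})$.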
Lie had proved that second order scalar ODEs are linearizable only if they are at most cubically semilinear and the four coefficients satisfy a set of four first derivative constraints involving two arbitrary functions. Tresse [
83] reduced them to two second order constraints without the arbitrary functions. Note that these equations do not have to be
solved but only
checked. Methods were later developed to simply write down the solution of second order quadratically semilinear linearizable systems [
84] and were later generalized to the cubically semilinear case [
85,
86].
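For reference, the cubically semilinear form in question is (as stated in the linearization literature; the precise Lie/Tresse conditions on the coefficients are omitted here):
\[
y'' + A(x, y)\,y'^{\,3} + B(x, y)\,y'^{\,2} + C(x, y)\,y' + D(x, y) = 0,
\]
with linearizability by point transformations requiring the four coefficients $A$, $B$, $C$, $D$ to satisfy the two second order (Tresse) conditions mentioned above.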
Lie used complex functions of complex variables for his analysis of differential equations. Of course, he had to take the functions to be not only continuous but differentiable. Now complex differentiability implies analyticity. He did not explicitly use this fact in his analysis. However, it obviously has significant consequences, as the dependent and independent variables will be constrained by the Cauchy-Riemann (CR) equations. Thus, even a scalar first order ODE, split into its real and imaginary parts, is actually a system of four first order PDEs. This point was noted and exploited by Ali, Mahomed and Qadir [
87,
88], who called it
complex symmetry analysis (CSA). If the dependent variable is $u = f + ig$ and the independent variable $z = x + iy$, one function of one variable yields four functions of two variables, and so a system of PDEs. For the resulting system to be of ODEs, one must restrict the independent variable to the real part, x, only. Thus, the complex scalar ODE will now correspond to a system of two ODEs along with a set of CR equations. A first order ODE, $u' = w(x, u)$ with $w = w_1 + iw_2$, will then split into the two first order ODEs, $f' = w_1(x, f, g)$ and $g' = w_2(x, f, g)$, by taking the real and imaginary parts of the equation. The CR equations for the corresponding system will be $\partial w_1/\partial f = \partial w_2/\partial g$, $\partial w_1/\partial g = -\,\partial w_2/\partial f$. It is this system that would now be analysed. The formalism is easily extended to higher order ODEs.
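As a toy illustration of the splitting (an example chosen here only for definiteness, not one taken from [87,88]), consider the complex equation $u' = u^2$ with $u = f + ig$ and real $x$:
\[
u' = u^2 \;\Longrightarrow\; f' = f^2 - g^2, \qquad g' = 2fg,
\]
\[
w_1 = f^2 - g^2, \quad w_2 = 2fg: \qquad \frac{\partial w_1}{\partial f} = 2f = \frac{\partial w_2}{\partial g}, \qquad \frac{\partial w_1}{\partial g} = -2g = -\frac{\partial w_2}{\partial f},
\]
so the coupled real system automatically satisfies the CR conditions, and its solution can be read off from that of the single complex equation, $u = -1/(x - c)$ with a complex constant $c$.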
The complex generator $\mathbf{Z} = \xi\,\partial_x + \eta\,\partial_u$ splits into its real and imaginary parts, as $\mathbf{Z} = \mathbf{X}_1 + i\mathbf{X}_2$ with $\mathbf{X}_1 = \xi_1\partial_x + \tfrac{1}{2}\big(\eta_1\partial_f + \eta_2\partial_g\big)$ and $\mathbf{X}_2 = \xi_2\partial_x + \tfrac{1}{2}\big(\eta_2\partial_f - \eta_1\partial_g\big)$, the half coming from the requirement that $\partial_u = \tfrac{1}{2}\big(\partial_f - i\,\partial_g\big)$
. The prolongation is, of course, still more cumbersome to write, but easy to obtain. Ali, Mahomed and Qadir applied their CSA to the linearization of second order scalar ODEs [
89], and found that they could linearize the complex scalar second order ODE corresponding to a 2-d non-linearizable system. In fact, it was found that systems with fewer than the eight generators required for a single scalar second order ODE may correspond to a complex scalar linearizable ODE. Further, among systems without enough infinitesimal symmetry generators to be solvable by symmetry methods, there was an example of a second order 2-d system that has no generator, but corresponds to a complex scalar second order ODE, and its solution was obtained [
90,
91,
92].
Extending CSA to Lagrangians, to be able to use it for Noether symmetries, is non-trivial. The reason is that the Lagrangian is necessarily defined for the real domain. The extension presents us with two basic problems that may be thought of as two faces of the same coin. The Lagrangian is the kernel of a functional, and functionals map functions into $\mathbb{R}$, not $\mathbb{C}$. The whole purpose of defining it is to find the form of the dependent variable for which the functional takes a minimum value. As such the image space for it has to be an ordered set like $\mathbb{R}$, and not a partially ordered set like $\mathbb{C}$. The first problem is simply dealt with by fiat, redefining the range of the functional to be $\mathbb{C}$. The associated problem is dealt with by requiring the magnitude of the functional to be minimal, rather than the functional itself [
93].
It turned out that invariants could be obtained for the complex Lagrangians and provide new insights into the physical significance of the Noether invariant [
94]. In particular, when a complex scalar harmonic oscillator equation is split, it gives a system of coupled harmonic oscillators and the Noether invariant gives the energy in each oscillator
and, identifiably separately, in the field between them. This was expressed as "seeing the energy in the field through complex glasses". It is easy to extend this to the time-dependent harmonic oscillator and see how the energy in the coupled oscillators and in the field is transferred about. It turns out that the complex Noether symmetries do not provide any new invariants, but they
do provide them more easily and put them into different combinations that may be more insightful [
95].
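To indicate the kind of splitting involved (a minimal sketch with the simplest complex Lagrangian; the detailed treatment and interpretation are those of [94]), write $u = f + ig$ for the complex oscillator $u'' + u = 0$:
\[
L = \tfrac{1}{2}\big(u'^{\,2} - u^2\big) = L_1 + iL_2, \qquad L_1 = \tfrac{1}{2}\big(f'^{\,2} - g'^{\,2} - f^2 + g^2\big), \qquad L_2 = f'g' - fg,
\]
\[
E = \tfrac{1}{2}\big(u'^{\,2} + u^2\big) = E_1 + iE_2, \qquad E_1 = \tfrac{1}{2}\big(f'^{\,2} - g'^{\,2} + f^2 - g^2\big), \qquad E_2 = f'g' + fg,
\]
so the single complex Noether invariant associated with time translation splits into two real conserved quantities, which is what allows the energies in the two oscillators and in the coupling between them to be identified separately.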
9. Concluding Remarks
Mathematically, Noether symmetries provide a double reduction of the order and/or the number of variables in a differential equation. One of the methods of achieving a reduction is by determining an invariant combination of the dependent and independent variables and their derivatives. Setting the invariant equal to an arbitrary constant, it can be used to express, say, the highest derivative in the combination in terms of the other variables. It is particularly needed for PDEs, as "solving them" without the boundary conditions is not very meaningful, and the invariants can incorporate the boundary conditions. While other methods could be used for the same mathematical purpose, they cannot provide the physical insights of the "Noether charge" that the invariants do. In this paper, the physical insights obtained from Lie and Noether symmetries and invariants were reviewed. Worth special mention is the use of Noether symmetries in quantum theory, which has not received enough attention from those working in symmetry analysis. Since one of the most important outstanding problems in fundamental physics is the unification of quantum theory and general relativity, the introduction of new methods may help in solving it. One such method is the explicit use of the complex analyticity of Lie groups, which was also briefly mentioned, but there are many others.
There is considerable activity in spacetime symmetries [
96], including not only isometries (also called KVs), but also the scaling symmetry of the metric tensor, called a
homothety (the special case of a constant conformal factor). There is also interest in the symmetries of the Ricci (
Ric), Riemann (
Rie) and Weyl (
C) tensors, which are called
Ricci,
Riemann and
Weyl collineations, given by $\pounds_{\xi}\,\mathrm{Ric} = 0$, $\pounds_{\xi}\,\mathrm{Rie} = 0$ and $\pounds_{\xi}\,\mathrm{C} = 0$, respectively (with $\pounds_{\xi}$ the Lie derivative along the generating vector field $\xi$). Much of it is on their physical significance [
97,
98,
99,
100,
101,
102,
103] and many problems continue to arise, which would be worth exploring. A different line was developed of obtaining all metrics with isometry groups containing a minimal group, which was called a “complete classification” of spacetimes with the minimal symmetry [
56,
104,
105]. The idea was to be able to pick up spacetimes that have the desired symmetry and use the metric to obtain the stress–energy tensor by constructing the Einstein tensor for it. Thus, one manages to “solve the Einstein equations without having to solve them”. This was initially conducted for isometries of spherically symmetric, static metrics, but was then extended to isometry groups of only three dimensions. It was further extended to homotheties and collineations. Classification by Noether symmetries [
41,
42,
43,
44,
45,
46,
47] has yielded solutions of the Einstein equations along with their conserved quantities. This line is also worth pursuing. A more difficult problem is to completely classify by a two-dimensional isometry group, such as cylindrical symmetry. If that is conducted, it may be possible to extend it to homotheties and collineations.
As mentioned in
Section 7, there is a great need for workers in Lie symmetry analysis to enter into QFT, as the work using symmetries is pursued by physicists who are not generally well-versed in the methods developed by Lie and subsequent workers. (However, it is not enough for the workers to simply take an open problem in QFT without understanding what is entailed in it. It would be all too easy to get spurious and irrelevant results if the context is not understood.) In particular, any new solutions of the Newman-Penrose equations [
74,
77] would be
extremely useful as they come equipped with their physical significance. Since they deal with invariants, a Noether symmetry formulation for them may be possible, and would be a major contribution.
Coming to
Section 8, the field is wide open. Very powerful complex methods have been developed for the problems where they can be applied. One direction to go in following this up is to check open problems to see whether the powerful complex methods can be used there or not. Unfortunately, one runs the risk of wasting a lot of time finding that the methods are
not applicable to the chosen problem. One must also bear in mind that while CSA provides solutions of systems of ODEs that cannot be solved by traditional symmetry methods, Noether symmetries will not give new invariants, but only new combinations of the old invariants. As pointed out above, they
can provide new insights into the physical significance of the invariants. A very much more important line of work to follow in this area, is to find an explanation of
why the complex methods provide answers, where they do. If one can answer that question, one should be able to formulate criteria for the applicability of complex methods and thus avoid wasting time searching for problems where complex methods can be applied and go directly to those where they can; or perhaps even find ways to tweak the methods to make them more generally applicable.