1. Introduction
In the early 1950s, Dirac and Bergmann independently developed the Hamiltonian formalism for systems with singular Lagrangians [1,2,3,4,5,6,7,8,9,10]. These systems, often called “constrained Hamiltonian systems”, include gauge theories. Gauge freedom is more clearly and more completely displayed in the Hamiltonian setting, with the generators of gauge transformations expressed as functions on phase space. Historically, the main motivation for casting gauge theories in Hamiltonian form was to facilitate their canonical quantization. Dirac and Bergmann were primarily motivated by the prospect of developing a quantum theory of gravity based on a Hamiltonian formulation of general relativity.
Textbook treatments of Lagrangian and Hamiltonian mechanics invariably assume that the Lagrangian is nonsingular; that is, that the matrix of second derivatives of the Lagrangian with respect to the velocities is invertible. In classical mechanics, the nonsingular case appears to be sufficient to cover problems of physical interest. However, one might argue that textbooks avoid certain physically interesting problems simply because their Lagrangians are singular.
In field theory, the issue of singular Lagrangians and gauge freedom cannot be avoided. Nearly every field theory of physical interest—electrodynamics, Yang–Mills theory, general relativity, relativistic string theory—has gauge freedom.
The Dirac–Bergmann algorithm transforms a singular Lagrangian system into a Hamiltonian system. The formalism is elegant but at the same time rather complex. It consists of a large number of logical steps, linked together by a chain of reasoning that can be difficult to keep straight. Of course, there are many examples in the literature in which the Dirac–Bergmann algorithm is applied, converting a singular Lagrangian into Hamiltonian form. However, to my knowledge, all of these examples are designed to illustrate just one or two of the logical steps in the algorithm. The student of the subject is faced with the task of linking these examples together to create a complete picture of the algorithm.
For those who learn by example, what is needed is a single example that illustrates all of the major logical steps in the Dirac–Bergmann algorithm and shows how these steps are linked together. Such a “complete” example is not easy to identify because there is no obvious way to predict, starting with a particular Lagrangian, which of the steps in the algorithm will be needed.
The system analyzed in this paper is defined by the Lagrangian
where the dot denotes a time derivative. The matrix of second derivatives with respect to the velocities is
This matrix is singular; it has rank 2.
As we will see, the system defined by the Lagrangian (
1) is relatively complete.
1 It contains both primary and secondary constraints, both first and second class constraints, and restrictions on the Lagrange multipliers. The first class constraints for this system are not all primary; this allows us to address the Dirac conjecture. The second class constraints can be eliminated by introducing Dirac brackets. Finally, this system contains both physical and gauge degrees of freedom. The gauge freedom can be eliminated with suitable gauge conditions.
One characteristic of any complete example such as (1) is that the configuration space, the space of q’s, must be at least four-dimensional. Here is why: The number of physical degrees of freedom is equal to the dimension of the configuration space, minus the number of first class constraints, minus half the number of second class constraints. If the example is to have at least one physical degree of freedom, at least two first class constraints (one primary and one secondary), and at least two second class constraints (the number of second class constraints must be even), then the configuration space must be at least four-dimensional.
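The counting argument above can be written as a one-line function. This is only a minimal sketch; the function name and its argument names are my own labels and do not appear in the references.

```python
def physical_degrees_of_freedom(n_config, n_first_class, n_second_class):
    """Configuration-space dimension, minus the number of first class
    constraints, minus half the number of second class constraints."""
    if n_second_class % 2 != 0:
        raise ValueError("the number of second class constraints must be even")
    return n_config - n_first_class - n_second_class // 2


# Smallest "complete" case described in the text:
# four q's, two first class constraints, two second class constraints.
print(physical_degrees_of_freedom(4, 2, 2))  # 1

# With only three q's, the same constraints leave no physical freedom.
print(physical_degrees_of_freedom(3, 2, 2))  # 0
```

The second call shows why a three-dimensional configuration space cannot support a complete example.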
The study of constrained Hamiltonian systems predates Dirac and Bergmann with earlier work by Rosenfeld [12,13]. Like Rosenfeld, Bergmann and his collaborators [1,2,4,6,7,14] were focused on field theories such as general relativity that are covariant with respect to general four-dimensional coordinate transformations. (For a historical review, see [15].) Dirac took a more basic approach to the problem by considering a generic singular Lagrangian [3,5,8,9,10]. He developed the algorithm for the case of systems with a finite number of degrees of freedom. His view was that the generalization to field theory, with an infinite number of degrees of freedom, would be “merely a formal matter”.
Although the typical starting point for the Dirac–Bergmann analysis is a singular Lagrangian, not all gauge theories can be expressed in terms of a Lagrangian that depends only on the q’s. General relativity is one such example. The Einstein–Hilbert action is a functional of Lagrange multipliers (the lapse function and shift vector) as well as the configuration variables (the spatial metric). Nevertheless, the Dirac–Bergmann algorithm provides the foundation for our interpretation of general relativity as a constrained Hamiltonian system.
In this paper, I apply the Dirac–Bergmann algorithm to the Lagrangian (1), following closely the general treatment given by Henneaux and Teitelboim [11]. In turn, the account of Henneaux and Teitelboim closely follows Dirac’s 1964 Lectures on Quantum Mechanics [10]. Presentations of the Dirac–Bergmann algorithm can also be found in books by Hanson, Regge and Teitelboim [16], Sundermeyer [17], Rothe and Rothe [18], and Lusanna [19].
Throughout the paper, I attempt to explain the reasoning behind the logical steps of the Dirac–Bergmann algorithm, but avoid general proofs. The reader is referred to references [11,16,17,18,19] for more details.
We begin in Section 2 with a derivation of Lagrange’s equations for the singular Lagrangian (1). The general solution is derived, and in Section 3, we discuss the gauge freedom at the Lagrangian level. We begin construction of the Hamiltonian theory in Section 4 with a derivation of the primary constraints and canonical Hamiltonian. In Section 5, we introduce the primary Hamiltonian and the primary action. Section 6 is devoted to a discussion of the initial value problem and the need to go beyond the primary Hamiltonian. In Section 7, we apply Dirac’s consistency conditions to derive the secondary constraints and restrictions on the Lagrange multipliers. The concept of weak equality is introduced in Section 8, along with a formal analysis of the restrictions on the Lagrange multipliers. The total Hamiltonian is computed in Section 9, and in Section 10 we sort the constraints into first and second class. The first class Hamiltonian and gauge generators are identified in Section 11, where we also introduce the Dirac conjecture. In Sections 12 and 13, we define the extended Hamiltonian and extended action. Dirac brackets are used in Section 14 to eliminate the second class constraints, which yields a partially reduced Hamiltonian. The corresponding partially reduced action is derived in Section 15, and in Section 16 we eliminate the momenta to obtain a partially reduced Lagrangian. Gauge conditions are introduced in Section 17, and Dirac brackets are used to eliminate the constraints and gauge conditions. This yields a fully reduced Hamiltonian. The fully reduced action is derived in Section 18. Finally, Section 19 contains a summary of the Dirac–Bergmann algorithm and a discussion of the Einstein–Hilbert action for general relativity.
2. Lagrangian Analysis
The action is the integral of the Lagrangian (
1):
The notation
indicates that
S is a functional of the complete set of coordinates,
. The equations of motion are obtained by extremizing the action. For this example, we are not concerned with boundary conditions and integrate by parts freely. Lagrange’s equations are
We can rewrite these as follows. First, add Equations (4b) and (4d), then subtract Equation (4c). This gives
Next, subtract this result from Equation (4a) to obtain
The time derivative of this equation yields Equation (4c). Finally, we find the result
by solving Equation (5a) for
and using the equation of motion (4d).
Equations (5) are equivalent to Lagrange’s equations (4). In particular, Equation (4a) is the sum of Equations (5a,b); Equation (4b) is the sum of Equations (5a,c) with the time derivative of (5b) subtracted; Equation (4c) is the negative of the time derivative of (5b); and Equation (4d) is obtained by subtracting (5c) from (5a).
The equations of motion for this simple linear system are easily solved. Note that the combination
is determined by (5c) along with initial or boundary data; thus, we have
where
A and
B are constants. Now, Equation (5a) gives
If we knew
, we could solve the previous two equations for
and
, then integrate Equation (5b) to obtain
. Clearly, we do not have enough information to fully determine each of the
q’s as functions of time. One of the
q’s must remain undetermined. For example, let us choose
arbitrarily by setting
for some function
. We can then use the equations above to solve for
,
and
:
where
C is an integration constant. This is the general solution of the equations of motion.
3. Gauge Invariance
The undetermined function
that appears in the general solution (8) can be freely specified. This is the gauge freedom of the theory. We can express the gauge freedom in another way: the Lagrangian (
1) and the equations of motion (either (4) or (5)) are invariant under the replacements
where
is an arbitrary function of time.
Although each configuration of the system (that is, each set of q values) corresponds to a specific physical state of the system, the converse is not true. Because of the gauge freedom, there are many sets of q’s that describe one and the same physical state.
Let us examine the gauge freedom more closely, in anticipation of the Hamiltonian description of evolution. To begin, choose the gauge
and consider the general solution (8). This solution describes the evolution of the system from initial data
The configuration at some arbitrary final time
is
This configuration corresponds to a particular state of the physical system.
We can choose a different gauge in Equation (8). As long as the new gauge satisfies
, the solution will describe evolution from the same initial data (10). For example, with
where
, the configuration at
is
The configurations (11) and (13) represent the same physical state of the system, since they evolve from the same initial data.
We can express this result more compactly as
Here,
denotes the change in
at the generic time
T, due to the change in gauge function
.
Here is another example. With
we obtain a configuration that differs from Equation (11) by
This configuration is also evolved from the initial data (10), and represents the same physical state as the configurations (11) and (13).
Although the gauge transformation (9) contains a single arbitrary function of time, the gauge invariance naturally splits into two types. The first consists of variations subject to . The second consists of arbitrary variations in , with . This apparent “doubling” of the gauge freedom arises because the solution (9c) for (unlike the other variables) includes the integral of . There is enough freedom of choice in to allow variations in that are independent of the variations among the other variables. Both types of gauge transformations leave the physical state of the system unchanged.
The consequences of gauge invariance are most clearly expressed in the Hamiltonian formalism. The extended Hamiltonian defined in
Section 12 includes phase space generators for both types of gauge transformations.
4. Primary Constraints and the Canonical Hamiltonian
We now begin construction of the Hamiltonian description of the system. The conjugate momenta are defined as usual by
. For the Lagrangian (
1), we have
Because the Lagrangian is singular, the matrix of second derivatives
is not invertible and we cannot solve Equation (17) for the velocities as functions of the coordinates and momenta. The definitions (17) yield two
primary constraints,
that restrict the phase space variables
,
. We will denote these constraints collectively by
, where
. Note that in this simple example, the primary constraints are independent of the
q’s.
There is freedom in choosing how the constraints are written. For example, we could replace the
’s above with
In fact, any choice for the constraints is allowed, as long as they follow from the definitions
and satisfy the
regularity conditions. These conditions state that, roughly speaking, the constraints should have nonzero gradients on the constraint surface. More precisely, the Jacobian matrix formed from the derivatives of the constraints with respect to the
p’s and
q’s should have maximal rank on the constraint subspace [
10,
11]. For both choices, (18) and (19), the rank of the Jacobian matrix is 2. On the other hand, the set
is not permissible because the gradient of
vanishes on the constraint surface where
. Correspondingly, the rank of the Jacobian matrix is less than 2 on the constraint surface.
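To make the regularity condition concrete, here is a minimal worked case. It is a hypothetical illustration, not the example system of this paper: a two-dimensional phase space with coordinates $(q, p)$ and the single constraint surface $p = 0$. The symbols $\phi$ and $\tilde\phi$ are labels chosen for this sketch.

```latex
% Two functions that vanish on the same surface p = 0:
\begin{align}
  \phi = p :\qquad
    & \left( \frac{\partial \phi}{\partial q},\,
             \frac{\partial \phi}{\partial p} \right) = (0,\,1) \neq 0 , \\
  \tilde\phi = p^{2} :\qquad
    & \left( \frac{\partial \tilde\phi}{\partial q},\,
             \frac{\partial \tilde\phi}{\partial p} \right) = (0,\,2p)
      = (0,\,0) \quad \text{when } p = 0 .
\end{align}
```

Both functions define the same constraint surface, but only $\phi$ has a nonvanishing gradient there. The Jacobian of $\phi$ has rank 1 on the surface, while the Jacobian of $\tilde\phi$ has rank 0, so $\tilde\phi$ fails the regularity condition.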
The next step in constructing the Hamiltonian formalism is to compute the
canonical Hamiltonian. The canonical Hamiltonian
is defined from the usual prescription by writing
in terms of
p’s and
q’s. Although we cannot solve for all of the
’s in terms of
p’s and
q’s, it can be shown that the combination
depends only on the phase space variables [
10,
11]. For our example problem, the canonical Hamiltonian is
Note that the canonical Hamiltonian is ambiguous. For example, we could use the primary constraint (18b) to replace the term with .
5. Primary Hamiltonian and the Primary Action
The
primary Hamiltonian is obtained from the canonical Hamiltonian
by adding the primary constraints with Lagrange multipliers,
where a sum over the repeated index
a is implied. The
primary action is built from the primary Hamiltonian in the usual way:
. Explicitly, we have
The primary action is a functional of the complete set of phase space coordinates,
,
, as well as the Lagrange multipliers
and
.
The equations of motion are obtained by extremizing the primary action
. Extremization with respect to the momenta
gives
while extremization with respect to the coordinates
yields
Extremizing the action
with respect to the Lagrange multipliers gives the constraints,
These equations of motion (24) are equivalent to Lagrange’s equations (4). To show this, we first solve Equations (24c,d,i,j) for the momenta to obtain
Using these results along with Equations (24a,b) for the Lagrange multipliers, we find that Equations (24e,f,g,h) agree precisely with Lagrange’s equations (4).
6. Hamilton’s Equations and the Initial Value Problem
At this point one might ask whether the task of expressing the singular system (
1) in Hamiltonian form is complete. After all, the primary action (
23) provides the correct equations of motion for the phase space variables
and
. In fact, we can obtain the time evolution of any phase space function
F from
where
is the primary Hamiltonian and
denotes Poisson brackets. Hamilton’s equations for the coordinates and momenta,
and
, coincide with Equations (24a–h).
Our task of expressing the singular system in Hamiltonian form is not yet complete because we still need to interpret Hamilton’s equations (24a–h) as an initial value problem. That is, Hamilton’s equations should determine the future history of the system solely from initial data. In contrast, the primary action (23) defines a boundary value problem in which the configuration variables, the q’s, are specified at initial and final times.
The key difference between the equations of motion and Hamilton’s equations is that the former include the primary constraints, Equations (24i,j), whereas the latter do not. Thus, the phase space trajectories that extremize the action must lie entirely in the primary constraint surface. (The primary constraint “surface” is the subspace of phase space that satisfies the primary constraints.) In contrast, the trajectories obtained from Hamiltonian evolution are defined throughout the entire phase space. Note that we cannot simply append the primary constraint equations to Hamilton’s equations, because in that case the complete system would not be in Hamiltonian form.
Of course, the physically allowed phase space trajectories must satisfy the primary constraints. With an initial value interpretation of Hamilton’s equations, we can try to enforce the primary constraints with appropriate choices of initial data and Lagrange multipliers. In particular, we can choose initial data that lie on the primary constraint surface and . However, this is not enough, because the primary constraints are not necessarily satisfied at later times as the system evolves into the future.
We can describe the situation as follows. The trajectories that extremize the primary action, that is, the physical trajectories, do not necessarily fill the entire primary constraint surface. Instead they might span only a subspace of the primary constraint surface. If the initial data lie in the primary constraint surface but outside the subspace of physical trajectories, then the primary constraints will not be preserved as the data are evolved.
How should the initial data and Lagrange multipliers be restricted such that the primary constraints hold throughout the evolution? The primary constraints will hold for all time if they hold initially and their time derivatives (to all orders) also vanish initially. In the general case, this leads to a hierarchy of restrictions on the initial data in the form of secondary, tertiary, etc., constraints.
2 It can also lead to restrictions on the Lagrange multipliers.
The higher order (secondary, tertiary, etc.) constraints and restrictions on the Lagrange multipliers are not new—imposing them does not change the content or predictions of the physical theory. This is because the higher-order constraints and restrictions on the Lagrange multipliers are direct consequences of the equations of motion (24) that follow from the primary action (
23). They are simply “hidden” in those equations. The process of identifying the higher-order constraints and restrictions on Lagrange multipliers reveals these hidden conditions.
7. Consistency Conditions, Secondary Constraints and Restrictions on the Lagrange Multipliers
We can ensure that the primary constraints hold for all time by applying Dirac’s
consistency conditions [
10]. Begin by computing the time derivatives of the primary constraints with the primary Hamiltonian,
. Now set these equal to zero:
For each value of the index
a, there are three possibilities.
3 First,
might vanish on the constraint surface
, so that the consistency condition (
26) reduces to the identity
. Second,
could be a (non-constant) phase space function that is independent of the Lagrange multipliers. In this case, Equation (
26) is a
secondary constraint. Finally,
might depend on the Lagrange multipliers. Then, Equation (
26) fixes one of the Lagrange multipliers in terms of the phase space variables and the other Lagrange multipliers.
The secondary constraints that arise from this process must themselves satisfy the consistency conditions. This can lead to tertiary constraints and more restrictions on the Lagrange multipliers. In turn, the tertiary constraints can lead to quaternary constraints, and so forth. We must continue to apply the consistency conditions until the process naturally stops.
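The iteration just described can be sketched in code for a restricted class of systems. The sketch below makes several assumptions that are not part of the general algorithm: all constraints are linear and homogeneous in the phase space variables z = (q1..qn, p1..pn), the canonical Hamiltonian is quadratic, H_C = (1/2) z^T M z, and each consistency condition is examined on its own rather than solving the multiplier equations jointly. The toy Hamiltonian H_C = (1/2) p1^2 + p1 q2 with primary constraint p2 = 0 is hypothetical and is not the system studied in this paper. All function names are my own.

```python
from fractions import Fraction


def bracket(a, b, n):
    """Poisson bracket {a.z, b.z} of two linear functions (a constant)."""
    return sum(a[i] * b[n + i] - a[n + i] * b[i] for i in range(n))


def bracket_with_h(a, M, n):
    """Coefficient vector of {a.z, H_C} for H_C = (1/2) z^T M z, M symmetric."""
    w = [-a[n + i] for i in range(n)] + [a[i] for i in range(n)]  # w = a^T J
    return [sum(w[j] * M[j][k] for j in range(2 * n)) for k in range(2 * n)]


def rank(rows):
    """Rank of a list of row vectors, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk


def consistency(M, primary, n):
    """Apply the consistency conditions until no new constraints appear."""
    constraints = [list(phi) for phi in primary]
    restrictions = []  # pairs (drift, brackets with the primary constraints)
    i = 0
    while i < len(constraints):
        f = constraints[i]
        drift = bracket_with_h(f, M, n)                      # {f, H_C}
        couplings = [bracket(f, phi, n) for phi in primary]  # {f, phi_a}
        if any(c != 0 for c in couplings):
            # Third possibility: the condition involves the multipliers.
            restrictions.append((drift, couplings))
        elif rank(constraints + [drift]) > rank(constraints):
            # Second possibility: a new, independent constraint.
            constraints.append(drift)
        # First possibility: the condition holds identically on the surface.
        i += 1
    return constraints, restrictions


# Hypothetical toy: z = (q1, q2, p1, p2), H_C = (1/2) p1^2 + p1 q2, primary p2.
M = [[0, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 1, 1, 0],
     [0, 0, 0, 0]]
constraints, restrictions = consistency(M, [[0, 0, 0, 1]], n=2)
print(constraints)   # p2 and the secondary constraint -p1
print(restrictions)  # empty: no multiplier is restricted in this toy
```

In this toy the loop stops after one secondary constraint, because the time derivative of -p1 vanishes identically. A system with second class constraints would instead populate the list of multiplier restrictions.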
For our example, the primary constraints are
and their time derivatives are
Thus, we find the secondary constraints
These will be denoted collectively by
.
Applying the consistency conditions to the secondary constraints gives
These equations restrict the Lagrange multipliers to satisfy
The process has now terminated. In this example, there are no tertiary or higher-order constraints.
Recall from the previous section that our goal was to restrict the initial data and Lagrange multipliers such that the primary constraints vanish for all times under the Hamiltonian evolution defined by
. We achieve this by imposing the primary constraints at the initial time,
the secondary constraints at the initial time,
and restricting the Lagrange multipliers to satisfy Equation (
31) for
all time
t.
Let us review the reasoning. From Equation (30), the restriction (
31) on the Lagrange multipliers tells us that
for all time. By Equation (28),
vanishes initially, so we see that
must vanish for all time. Now we use Equation (32) to conclude that
must vanish for all time. Since
vanishes initially, because the primary constraints are imposed at the initial time, it follows that the primary constraints
must hold for all time
t.
8. Weak Equality and Lagrange Multiplier Analysis
It will be useful to follow the general Dirac–Bergmann algorithm closely and carry out a formal analysis of the restriction (
31) on the Lagrange multipliers [
10,
11]. We begin with the concept of
weak equality.
Let
denote the complete set of (primary, secondary, tertiary, etc.) constraints. For our example,
where the index
A runs from 1 to 4.
Two phase space functions F and G are weakly equal if they are equal when the (primary, secondary, tertiary, etc.) constraints hold. In other words, F and G are weakly equal if they coincide on the constraint surface, the subspace of phase space on which all of the constraints vanish. Weak equality is written as F ≈ G.
Functions F and G are strongly equal if they agree throughout phase space. Strong equality is written as F = G.
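A small worked case may help fix the distinction. It is a hypothetical illustration on a two-dimensional phase space $(q, p)$ with the single constraint $\phi = p$; it is not the example system of this paper.

```latex
\begin{align}
  F &= q + p^{2}, \qquad G = q , \\
  F - G &= p^{2} = p\,\phi \approx 0
  \quad\Longrightarrow\quad F \approx G ,
  \qquad \text{although } F \neq G \text{ wherever } p \neq 0 .
\end{align}
% Weak equalities must not be used before Poisson brackets are evaluated:
\begin{align}
  q + p \approx q , \qquad \text{but} \qquad
  \{ q ,\, q + p \} = 1 \neq 0 = \{ q ,\, q \} .
\end{align}
```

The second line shows why the order of operations matters: a bracket is computed first, throughout phase space, and only then is the result restricted to the constraint surface.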
Now we turn to the formal analysis of the restriction (
31) on the Lagrange multipliers. This restriction can be expressed as the weak equality
. From Equations (28) and (30), we have
which simplifies to
This is a system of inhomogeneous linear equations for the Lagrange multipliers. A particular solution is
and the homogeneous solutions are
where
is arbitrary. The general solution is the sum of particular and homogeneous solutions:
Thus, the restriction (
31) on the Lagrange multipliers yields
and
, where
is an arbitrary function of time.
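The structure of this solution, a particular solution plus an arbitrary multiple of the homogeneous solution, can be seen in a deliberately simple system. The equation below is a hypothetical stand-in chosen for illustration; it is not the system of equations obtained for the Lagrangian (1), and the symbols $\lambda_1$, $\lambda_2$, $\lambda$ are labels for this sketch only.

```latex
% One weak equation for two multipliers (hypothetical):
\begin{align}
  \lambda_{1} - \lambda_{2} + q_{1} &\approx 0 . \\
  \text{Particular solution:}\quad
    (\lambda_{1}, \lambda_{2}) &= (-q_{1},\, 0) , \\
  \text{Homogeneous solutions:}\quad
    (\lambda_{1}, \lambda_{2}) &= \lambda\,(1,\, 1) , \\
  \text{General solution:}\quad
    \lambda_{1} &= -q_{1} + \lambda , \qquad \lambda_{2} = \lambda .
\end{align}
```

One equation for two unknowns leaves one multiplier, $\lambda$, free. The free multiplier is what carries the gauge freedom into the total Hamiltonian.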
9. Total Hamiltonian
The
total Hamiltonian is obtained from the primary Hamiltonian
by inserting the general solution for the Lagrange multipliers:
Physical phase space trajectories are defined by the total Hamiltonian as the weak equality
, with initial data that satisfy the complete set of constraints,
.
Hamilton’s equations for the total Hamiltonian
are
and
Since these are weak equalities, we can use the constraints to simplify the results. Observe that the constraints
imply
,
and
. Therefore, we can set
and
to zero, replace
with
, and replace
with
. Then Equations (41d,e,g,h) are either redundant or vacuous, and the remaining equations are
These equations, along with the constraints
, give a complete description of the physical system.
Let us check the results. The Lagrange multiplier
can be eliminated from Equation (42a,b) to give
. Now differentiate this equation and eliminate
with Equation (42d) to obtain
. The constraints allow us to set
, which gives
This is Equation (5c), which follows directly from Lagrange’s equations. The result (5a) from Lagrange’s equations is simply the secondary constraint
. Finally, the result (5b) is obtained by summing Equations (42b,c).
Recall that Equations (5a–c) are equivalent to Lagrange’s equations. Thus, we have verified that Hamilton’s equations (41), along with the primary and secondary constraints, are equivalent to Lagrange’s equations.
10. First and Second Class Constraints
A first class function
F is a phase space function that has weakly vanishing Poisson brackets with all primary and secondary constraints:
It can be shown that the Poisson bracket of any two first class functions is itself a first class function [
10,
11].
The constraints themselves can be first class; constraints that are not first class are called second class. The constraints are separated into first and second class by examining the matrix of Poisson brackets:
The rank of this 4 × 4 matrix is 2, and its nullity is 4 − 2 = 2. It follows that there are two independent eigenvectors with eigenvalues equal to zero; for example,
and
. Then there are two independent combinations of constraints that are first class, namely
and
. (A sum over the repeated index
A is implied.) The first class constraints are
One can check that the first class conditions
and
hold. The most general first class constraint is a linear combination of
and
.
There are two remaining linear combinations of constraints, which we take to be
These are the second class constraints. They have nonvanishing Poisson brackets with each other,
The most general second class constraint is a linear combination of
,
,
and
, with nonzero coefficients on one or both of
and
.
The splitting of constraints into first and second class is independent of the splitting into primary and secondary. In this example, the first class constraints are mixtures of primary and secondary constraints. Likewise, the second class constraints are mixtures of primary and secondary constraints.
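The classification procedure of this section can be sketched numerically. The four constraints used below are hypothetical linear functions on an eight-dimensional phase space, chosen only so that the matrix of Poisson brackets has rank 2 as in the text; they are not the constraints of the Lagrangian (1). The helper names are my own, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction


def poisson_bracket(a, b, n):
    """{a.z, b.z} for linear functions of z = (q1..qn, p1..pn)."""
    return sum(a[i] * b[n + i] - a[n + i] * b[i] for i in range(n))


def rref(rows):
    """Reduced row echelon form and the list of pivot columns."""
    m = [[Fraction(x) for x in r] for r in rows]
    pivots, r = [], 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        lead = m[r][col]
        m[r] = [x / lead for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    return m, pivots


def null_space(rows):
    """Basis of vectors v with (matrix) v = 0."""
    m, pivots = rref(rows)
    ncols = len(rows[0])
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        v = [Fraction(0)] * ncols
        v[free] = Fraction(1)
        for r, pc in enumerate(pivots):
            v[pc] = -m[r][free]
        basis.append(v)
    return basis


n = 4  # z = (q1, q2, q3, q4, p1, p2, p3, p4)
C = [
    [0, 0, 0, 0, 0, 1, 0, 0],  # C1 = p2
    [0, 0, 0, 0, 0, 0, 1, 0],  # C2 = p3
    [0, 1, 1, 0, 0, 0, 0, 0],  # C3 = q2 + q3
    [0, 0, 0, 0, 0, 0, 0, 1],  # C4 = p4
]
brackets = [[poisson_bracket(a, b, n) for b in C] for a in C]
null_vectors = null_space(brackets)

n_first_class = len(null_vectors)        # nullity of the bracket matrix
n_second_class = len(C) - n_first_class  # rank of the bracket matrix
print(n_first_class, n_second_class)     # 2 2

# Each null vector v gives a first class combination v^A C_A.
for v in null_vectors:
    combo = [sum(v[A] * C[A][k] for A in range(len(C))) for k in range(2 * n)]
    assert all(poisson_bracket(combo, c, n) == 0 for c in C)
```

Because the bracket matrix is antisymmetric, its right and left null vectors coincide, so each null vector defines a combination of constraints whose brackets with every constraint vanish. In this toy the first class combinations are p3 − p2 and p4.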
11. First Class Hamiltonian, Gauge Generators and the Dirac Conjecture
The total Hamiltonian (
40) includes the product of an arbitrary Lagrange multiplier
with the first class constraint
. We refer to
as a
primary first class constraint, since it is constructed entirely from primary constraints.
If we remove the primary first class constraint from the total Hamiltonian, what remains is the
first class Hamiltonian. That is, the total Hamiltonian can be written as
where
is the first class Hamiltonian. A common notation for
, the notation used by Dirac [
10], is
.
We can check directly that the first class Hamiltonian (
50) is a first class function. However, this is not necessary, because we know that the constraints are preserved under the time evolution defined by
. That is,
. Thus, the total Hamiltonian must be first class,
. Of course, the primary first class constraint
is first class. It then follows from the definition (
49) that
must also be a first class function.
The splitting (
49) of the total Hamiltonian into the first class Hamiltonian and the primary first class constraint is not special to our example. This splitting will occur for any constrained Hamiltonian system [
10,
11]. In general,
will include the products of every primary first class constraint with an arbitrary multiplier.
Primary first class constraints generate gauge transformations. Consider the change in a phase space function
F generated by the primary first class constraint
,
This transformation does not change the physical state of the system. We can see this by considering
F to be evaluated as a function of the
q’s and
p’s at some particular time
t. At an infinitesimally later time
, this function becomes
. In terms of the first class Hamiltonian, we have
The Lagrange multiplier is arbitrary, so we can make a different choice during the time interval from
t to
, say,
. Then, the function
F at time
will be
The physical state of the system at
should not depend on our choice of Lagrange multiplier, so
and
must represent the same physical state. The result (
51) is obtained by subtracting Equation (
52) from Equation (
53) and defining
and
.
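The subtraction described above can be written out in a few lines. The notation here is assumed for the purpose of this sketch: $H_{\rm fc}$ for the first class Hamiltonian, $\phi$ for the primary first class constraint, $\lambda$ and $\lambda'$ for the two choices of Lagrange multiplier, and $F'$ for the function evolved with $\lambda'$.

```latex
\begin{align}
  F(t + \delta t)  &\approx F(t) + \delta t\,\{F, H_{\rm fc}\}
                      + \delta t\,\lambda\,\{F, \phi\} , \\
  F'(t + \delta t) &\approx F(t) + \delta t\,\{F, H_{\rm fc}\}
                      + \delta t\,\lambda'\,\{F, \phi\} , \\
  \delta F \equiv F'(t + \delta t) - F(t + \delta t)
                   &\approx \delta t\,(\lambda' - \lambda)\,\{F, \phi\}
                    = \epsilon\,\{F, \phi\} ,
  \qquad \epsilon \equiv \delta t\,(\lambda' - \lambda) .
\end{align}
```

The terms involving the first class Hamiltonian cancel in the difference, so only the bracket with the primary first class constraint survives. Since $\lambda' - \lambda$ is arbitrary, so is the infinitesimal parameter $\epsilon$.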
For the phase space coordinates, the gauge transformation generated by the primary first class constraint
is
The transformations of the
p’s all vanish. This result agrees with the gauge transformation from Equation (14), with the change of notation
. Here, we denote the gauge parameter by
because the transformation is infinitesimal; in Equation (14), we used
because the transformation was finite. It is clear that the infinitesimal transformation (54) can be iterated to obtain the finite transformation (14).
In general, a gauge transformation is defined as a transformation
that does not alter the physical state of the system. The function
is the gauge generator. We have seen that the primary first class constraints generate gauge transformations. However, not all gauge transformations are generated by primary first class constraints. In fact, it can be shown [
10,
11] that the Poisson bracket between any primary first class constraint and the first class Hamiltonian is itself a first class constraint that generates a gauge transformation.
4 For our example problem, the Poisson bracket of the primary first class constraint
and the first class Hamiltonian
is
This is the secondary first class constraint,
. Thus, we see that in this example both the primary and secondary first class constraints are generators of gauge transformations. Explicitly, the transformation
is
with the transformations of the
p’s all vanishing. We can iterate this infinitesimal gauge transformation (56) to obtain the finite transformation (16).
The “doubling” of the gauge freedom identified in
Section 3 appears quite naturally in the Hamiltonian formalism. The two types of gauge transformation are generated by the two first class constraints,
and
.
The Dirac conjecture [
10] says that
all first class constraints (whether they are primary, secondary, etc., or a combination of primary, secondary, etc.) generate gauge transformations. This conjecture does not hold as a general theorem—there are known examples in which the transformation generated by a secondary first class constraint does not coincide with any invariance of the original Lagrangian system.
5 Nevertheless, the Dirac conjecture is usually taken as an assumption. It appears that in practice, for systems of physical interest, all first class constraints generate gauge transformations.
12. Extended Hamiltonian
The Dirac conjecture tells us that all first class constraints generate gauge transformations and should be treated on an equal footing. The
extended Hamiltonian is defined by adding all first class constraints
with Lagrange multipliers
to the first class Hamiltonian:
(A sum over the index
a is implied.)
For our example, the extended Hamiltonian is
The equations of motion
are
and
Let us compare these results to the equations of motion (41) obtained from the total Hamiltonian
. There are just two differences. The first is trivial: the Lagrange multiplier
in Equation (41) has changed names to
in Equation (59). The second difference is significant: the equation for
has an extra term
on the right-hand side. This is a new feature of the extended Hamiltonian. It makes explicit the fact that the gauge freedom allows
to be changed arbitrarily, and independently, from the other variables.
We can check the equations of motion for
following the same reasoning that was applied to the equations of motion for
. First, recall that the (first and second class) constraints imply
,
and
. Then Equations (59e,g) are vacuous, and Equations (59f,h) are redundant. It also follows that with the constraints imposed, Equation (59d) is a consequence of Equations (59a,b). The remaining equations are
These agree with Equations (42), apart from the change of notation
and the extra term
on the right-hand side of the
equation.
By eliminating
, the equations of motion generated by the extended Hamiltonian
become
If we differentiate the first equation, combine with the second, and use the constraint
, we obtain
. This is the expected result (5c). In fact, the only difference between Hamilton’s equations
and the results (5) (which are equivalent to Lagrange’s equations) is the extra term
in Equation (61b) above. That term does not appear in the corresponding Lagrangian Equation (5b).
Note that we can use the first class constraints to simplify the extended Hamiltonian
. For example, using
, we can set
everywhere in Equation (
58), except of course in the term
. The extended Hamiltonian becomes
This amounts to replacing the Lagrange multiplier
in Equation (
58) by
This replacement does not change the physical content of the theory, since the Lagrange multiplier
is arbitrary.
13. Extended Action
The equations of motion for the extended theory can be derived from the action [
11]
which includes the second class constraints with Lagrange multipliers
. Recall that the first class constraints
are included in the extended Hamiltonian
with multipliers
, so
includes all four constraints.
We can use either form of the extended Hamiltonian, Equation (
58) or (
62), in the extended action. Let us use Equation (
58). Then, the equations of motion that follow from extremizing
with respect to the momenta
are
Extremizing
with respect to the coordinates yields
and the constraints
follow from extremizing
with respect to the Lagrange multipliers
and
.
Let us check these equations of motion. The constraints imply
,
and
. Then the equation of motion (65e) gives
. The constraints also imply
. The sum of Equations (65a,b,d) then yields
. Now, if we set
and
to zero, the equations of motion (65a–h) agree precisely with Hamilton’s equations (59) for the extended Hamiltonian
. In
Section 12, we showed that the equations generated by
agree with Lagrange’s equations apart from the extra term
in the equation for
. This term extends the original Lagrangian theory by making explicit the fact that the gauge freedom allows for independent transformations of
.
Finally, we note that the extended action is invariant under the transformation defined by
for the phase space variables and
for the Lagrange multipliers. Here, the gauge parameters
and
are functions of time. These equations express the gauge invariance at the level of the action
.
14. Dirac Brackets and the Partially Reduced Hamiltonian
We now return to the evolution defined by the extended Hamiltonian
of Equation (
58), and Hamilton’s equations (59). To obtain a physically allowed trajectory, we must choose initial data that satisfy the four constraints
and
. Apart from restricting the initial data, the second class constraints play no role in the formalism. It would be convenient if we could restrict the variables from the outset such that the second class constraints are automatically satisfied. For example, we could use
and
from Equation (65k,l) to replace
with
and replace
with
.
We are not allowed to apply the second class constraints in this way. For example, consider the Poisson brackets . If we were to replace with , we would find a different answer: . The second class constraints cannot be imposed before Poisson brackets are computed.
Dirac devised a way to allow the second class constraints to be imposed from the outset by modifying the Poisson brackets [
10]. The result is the Dirac brackets.
To construct Dirac brackets, we first compute the matrix of Poisson brackets among the second class constraints:
Let
denote the inverse of
. Then, the Dirac brackets
of two phase space functions
F and
G are defined by
Dirac brackets, like Poisson brackets, are antisymmetric and satisfy the Jacobi identity [
10,
11].
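For reference, the standard textbook form of this construction can be written as follows. The symbols $\chi_a$ for the second class constraints, $M_{ab}$ for their matrix of Poisson brackets, $M^{ab}$ for its inverse, and the subscript $D$ for the Dirac brackets are notational assumptions of this sketch; they may differ from the notation used elsewhere in this paper.

```latex
M_{ab} = \{\chi_a, \chi_b\}, \qquad M^{ab} M_{bc} = \delta^a_c,
\qquad
\{F, G\}_D = \{F, G\} - \{F, \chi_a\}\, M^{ab}\, \{\chi_b, G\}.
```

Repeated indices are summed over the second class constraints. The matrix $M_{ab}$ is antisymmetric, and it is invertible precisely because the constraints are second class.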
Explicitly, the Dirac brackets among the coordinates are
and the Dirac brackets between the
q’s and
p’s are
For our example, the Dirac brackets among the momenta all vanish:
.
There are two key properties that make Dirac brackets relevant. First, the Dirac brackets agree weakly with Poisson brackets if one of the two functions is first class. Since the extended Hamiltonian is first class, we have
for any
F. It follows that we can write the equations of motion as
using Dirac brackets.
The second key property of the Dirac brackets is that they weakly vanish if one of the functions is a second class constraint:
. This allows us to apply the second class constraints before computing brackets. For example, we can use either
or
to compute Dirac brackets with
:
With Dirac brackets, the second class constraints can be treated as strong equations and imposed before computing the equations of motion.
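The substitution property can be checked on a toy system. The sketch below is not the example problem of this paper: it assumes a hypothetical four-dimensional phase space with ordering z = (q1, q2, p1, p2) and a hypothetical second class pair chi1 = q2 - q1, chi2 = p2. It is restricted to linear phase space functions, for which every bracket is a constant, so exact rational arithmetic suffices.

```python
from fractions import Fraction

# Assumption of this sketch: phase-space ordering z = (q1, q2, p1, p2).
N = 2  # number of coordinate/momentum pairs


def pb(a, b):
    """Poisson bracket of the linear functions a.z and b.z (a constant)."""
    return sum(a[i] * b[N + i] - a[N + i] * b[i] for i in range(N))


def inverse(m):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination.

    Raises StopIteration if the matrix is singular (no pivot is found).
    """
    n = len(m)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def dirac_bracket(f, g, chis):
    """Dirac bracket {f, g}_D for second class constraints chis."""
    m = [[Fraction(pb(a, b)) for b in chis] for a in chis]
    m_inv = inverse(m)
    k = len(chis)
    correction = sum(pb(f, chis[a]) * m_inv[a][b] * pb(chis[b], g)
                     for a in range(k) for b in range(k))
    return pb(f, g) - correction


# Hypothetical coordinates, momenta, and second class constraints.
q1, q2 = (1, 0, 0, 0), (0, 1, 0, 0)
p1, p2 = (0, 0, 1, 0), (0, 0, 0, 1)
chi1 = (-1, 1, 0, 0)   # chi1 = q2 - q1
chi2 = (0, 0, 0, 1)    # chi2 = p2
constraints = [chi1, chi2]

# Replacing q2 by q1 (allowed by chi1 = 0) does not change the Dirac bracket,
# although the Poisson brackets {q1, p1} = 1 and {q2, p1} = 0 differ.
same = dirac_bracket(q1, p1, constraints) == dirac_bracket(q2, p1, constraints)
# The Dirac bracket of a second class constraint with any function vanishes.
vanishes = all(dirac_bracket(chi, f, constraints) == 0
               for chi in constraints for f in (q1, q2, p1, p2))
```

In this toy system the Poisson brackets of q1 and q2 with p1 disagree, while the Dirac brackets agree, which is what permits the constraint chi1 = 0 to be used as a strong equation.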
Let us use the second class constraints (65k,l) to eliminate
and
and write the extended Hamiltonian in terms of the smaller set of variables
,
,
,
,
and
. Setting
we have
This is the
partially reduced Hamiltonian, obtained from the extended Hamiltonian by applying the second class constraints.
Of course, the partially reduced Hamiltonian is not unique. We could use the second class constraints to eliminate some other pair of variables instead of and .
The equations of motion generated by the partially reduced Hamiltonian,
, are
We can also use
and the Dirac brackets to compute
and
. The results are equivalent to those obtained by differentiating the right-hand sides of Equation (74) and using the equations of motion (76).
Let us check the equations of motion. With the second class constraints applied, the first class constraints imply
and
. Thus, Equations (76d,e) are vacuous, and the remaining equations become
Compare these to the independent Equations (60) that follow from the extended Hamiltonian. Equations (60b–d) agree with Equations (77a,b,d) once we use
. The final Equation (60a) is obtained by differentiating
with respect to time and using Equations (77a,b).
15. Partially Reduced Action
The partially reduced equations of motion (76) can be obtained from the extended action
by eliminating the superfluous variables. Note that the equations of motion obtained by varying
with respect to
,
,
and
are Equations (65b,e,k,l), respectively. We can eliminate these variables by solving these equations and substituting the results into the action.
The results are:
Inserting these into the extended action (
64), we find
This is the partially reduced action.
The equations of motion obtained from varying
with respect to the phase space variables are
and the equations obtained by varying with respect to the Lagrange multipliers
and
are
These are, of course, the first class constraints, reduced by using the second class constraints to eliminate
and
.
We can now solve Equation (80) for the time derivatives of , , , , and . The result coincides with the equations of motion (76) obtained from the partially reduced Hamiltonian and the Dirac brackets.
The partially reduced action
is invariant under the transformation defined by
for the phase space variables and
for the Lagrange multipliers. These equations express the gauge invariance of the theory at the level of the action principle with the second class constraints eliminated.
16. Partially Reduced Lagrangian
It is not too difficult to find a change of variables that will bring
into “canonical form”. For example, let
define a new set of variables
,
for the secondary constraint surface. (The index
ranges over 1, 2 and 3.) The partially reduced action becomes
with
The equations of motion
include
and
, where
are the usual Poisson brackets. The partially reduced action (
84) is invariant under the transformation
where
and
are the first class constraints.
We can use the equations of motion
to eliminate the momenta
from the partially reduced action (
84). These equations are
with solutions
Inserting these results into the partially reduced action, we find
The
partially reduced Lagrangian is the integrand of this action.
We can go one step further and eliminate the Lagrange multipliers using the equations of motion
and
. These equations have solutions
Inserting these results back into the action yields
This is the action for a harmonic oscillator in the variable
. Note that the coordinate transformation (83) implies
, so once again, we find that the coordinate combination
describes a simple harmonic oscillator. In addition, we observe that the action (
91) leaves the variable
completely unspecified. This expresses the gauge freedom generated by the first class constraint
.
17. Gauge Conditions and the Fully Reduced Hamiltonian
Let us return to the theory described by the extended Hamiltonian, prior to the elimination of the second class constraints.
For our example problem, phase space is eight-dimensional. The physical trajectories fill the constraint “surface”, which is the four-dimensional subspace where all first and second class constraints hold. Each point in the constraint surface can be mapped into a physically equivalent state by the gauge generators, namely, the first class constraints . Since there are two independent gauge generators, each physical state of the system corresponds to a two-dimensional subspace of the constraint surface. The constraint surface is foliated by these two-dimensional slices, referred to as gauge “orbits”.
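The dimension counting in this paragraph can be summarized in a few lines. This is a minimal sketch that assumes all constraints are independent; the function name `count_dimensions` is introduced here for illustration only.

```python
def count_dimensions(phase_space_dim, n_first_class, n_second_class):
    """Return (constraint surface dim, gauge orbit dim, reduced phase space dim).

    Assumes the constraints are independent, so that each one removes
    one dimension, and each first class constraint generates one
    independent gauge direction within the constraint surface.
    """
    surface = phase_space_dim - n_first_class - n_second_class
    orbit = n_first_class
    reduced = surface - orbit
    return surface, orbit, reduced


# The counts quoted in the text: eight-dimensional phase space,
# two first class and two second class constraints.
surface, orbit, reduced = count_dimensions(8, 2, 2)
print(surface, orbit, reduced, reduced // 2)  # prints: 4 2 2 1
```

The last number is the count of physical degrees of freedom, half the dimension of the space of gauge orbits.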
We can select a single phase space point on each gauge orbit to represent the physical state. We do this by applying gauge conditions. In particular, we will consider a
canonical gauge, which takes the form
with
. A good canonical gauge condition must not be gauge invariant; otherwise, it would allow more than one point on the gauge orbit to represent the physical state of the system. To be precise, the matrix of Poisson brackets of gauge conditions and gauge generators,
, must be nonsingular [
11].
As an example, let us choose
as our gauge conditions. This is a good gauge:
The matrix
is nonsingular, as required.
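The nonsingularity test is easy to automate for linear gauge conditions. The sketch below does not use the constraints of this paper: it assumes a hypothetical phase space z = (q1, q2, p1, p2) with hypothetical gauge generators p1 and p2, and compares one good and one bad choice of gauge conditions.

```python
# Assumption of this sketch: phase-space ordering z = (q1, q2, p1, p2).
N = 2  # number of coordinate/momentum pairs


def pb(a, b):
    """Poisson bracket of the linear functions a.z and b.z (a constant)."""
    return sum(a[i] * b[N + i] - a[N + i] * b[i] for i in range(N))


def gauge_matrix(conditions, generators):
    """Matrix of brackets {condition_a, generator_b}."""
    return [[pb(g, c) for c in generators] for g in conditions]


def is_good_gauge(conditions, generators):
    """True if the 2x2 bracket matrix has a nonzero determinant."""
    m = gauge_matrix(conditions, generators)
    return m[0][0] * m[1][1] - m[0][1] * m[1][0] != 0


q1, q2 = (1, 0, 0, 0), (0, 1, 0, 0)
p1, p2 = (0, 0, 1, 0), (0, 0, 0, 1)
generators = [p1, p2]            # hypothetical first class constraints

good = [q1, (1, 1, 0, 0)]        # gauge conditions q1 = 0 and q1 + q2 = 0
bad = [q1, p2]                   # p2 is gauge invariant, so it fixes nothing

good_ok = is_good_gauge(good, generators)
bad_ok = is_good_gauge(bad, generators)
```

The second choice fails because p2 has vanishing brackets with both generators, which leaves a row of zeros in the matrix.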
The gauge conditions
, like the first and second class constraints, restrict the phase space variables. The full set of restrictions
reduces the available phase space from eight dimensions to two dimensions. (Here, the index
A ranges from 1 to 6.) Taken as a whole, the six conditions
are second class. We see this by computing the Poisson brackets
This matrix has a nonzero determinant,
, which is the condition for the set of constraints and gauge conditions to be second class.
We can eliminate the constraints and gauge conditions by constructing Dirac brackets. The inverse of
is
and the Dirac brackets are defined by
The Dirac brackets among the phase space variables are
with all other brackets vanishing.
The constraints can be solved in various ways and the results can be used freely, either before or after computing Dirac brackets. For example, the constraints imply
We can use these to eliminate the variables
,
,
,
,
and
. Then the extended Hamiltonian
becomes the
fully reduced Hamiltonian
which depends only on
and
.
The Dirac brackets of the variables that remain are
. Thus, the equations of motion become
These are the equations for a simple harmonic oscillator with solution
where
and
are arbitrary constants.
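As a numerical cross-check, Hamilton's equations for a simple harmonic oscillator can be integrated directly and compared with the closed-form solution. The sketch below assumes the generic form H = (p^2 + omega^2 q^2)/2 with canonical bracket {q, p} = 1; the coordinates and normalization of the fully reduced Hamiltonian in this paper may differ, and the names `leapfrog` and `energy` are illustrative.

```python
import math


def energy(q, p, omega):
    """Assumed oscillator Hamiltonian H = (p^2 + omega^2 q^2) / 2."""
    return 0.5 * (p * p + omega * omega * q * q)


def leapfrog(q, p, omega, dt, steps):
    """Integrate dq/dt = p, dp/dt = -omega^2 q with the leapfrog scheme."""
    for _ in range(steps):
        p += 0.5 * dt * (-omega * omega * q)   # half kick
        q += dt * p                            # drift
        p += 0.5 * dt * (-omega * omega * q)   # half kick
    return q, p


omega = 1.0
steps = 1000
dt = 2.0 * math.pi / (omega * steps)   # one full period in `steps` steps

q0, p0 = 1.0, 0.0
q_end, p_end = leapfrog(q0, p0, omega, dt, steps)

# After one period the exact solution returns to the initial data.
period_error = math.hypot(q_end - q0, p_end - p0)
energy_drift = abs(energy(q_end, p_end, omega) - energy(q0, p0, omega))
```

The leapfrog scheme is used here because it is symplectic, so the energy error stays bounded rather than growing over many periods.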
With the gauge fixed, the dynamics take place on the fully reduced phase space, the two-dimensional surface defined by the constraints and gauge conditions
. There are many different choices of coordinates for this surface. Instead of solving the constraints for
and
, we could solve them for
and
. In that case, the fully reduced Hamiltonian is
and Hamilton’s equations are
Again, this describes the simple harmonic oscillator.
We can choose other coordinates on the fully reduced phase space. For example, let
and
, then use the constraints to eliminate
,
,
,
,
and
. The fully reduced Hamiltonian becomes
The nonzero Dirac brackets are
, and the equations of motion are simply
and
.
In each case, the fully reduced theory exhibits the single physical degree of freedom that we expect.
18. Fully Reduced Action
The fully reduced equations of motion can be derived from the action that includes all of the constraints and gauge conditions. For lack of a better name, let us denote this action with the subscript “all”:
Now extremize
with respect to variations in
,
,
,
,
,
and
:
In addition, vary
with respect to the Lagrange multipliers to obtain the constraints
. The solution of the full set of equations, (107) and
, is given by Equation (99) along with
We now insert these results into
to obtain the
fully reduced action
which is a functional of
and
. The equations of motion
are
These are equivalent to Hamilton’s equations (101) for the fully reduced Hamiltonian.
We can place
into “canonical form” by defining new variables
and
. Then
which is the familiar action for the harmonic oscillator.
19. Summary and Discussion
Here is the Dirac–Bergmann algorithm:
Compute the conjugate momenta and define the canonical Hamiltonian as the Legendre transform of the Lagrangian, $p_i \dot{q}_i - L$ (summed over $i$), written in terms of p’s and q’s.
Identify the primary constraints. The primary Hamiltonian is obtained from the canonical Hamiltonian by adding the primary constraints with Lagrange multipliers.
Apply Dirac’s consistency conditions to identify higher-order constraints and restrictions on the Lagrange multipliers.
The total Hamiltonian is found from the primary Hamiltonian by incorporating the restrictions on the Lagrange multipliers.
Separate the primary, secondary, and higher-order constraints into first and second class.
The first class Hamiltonian is the part of the total Hamiltonian with the primary first class constraints removed.
The extended Hamiltonian is obtained from the first class Hamiltonian by adding all of the first class constraints with Lagrange multipliers.
The partially reduced Hamiltonian is found from the extended Hamiltonian by using Dirac brackets to eliminate the second class constraints.
Gauge freedom is removed by assigning gauge conditions. The fully reduced Hamiltonian is obtained from the extended Hamiltonian by using Dirac brackets to impose all constraints and gauge conditions.
The theory defined by the singular Lagrangian (
1) provides a relatively complete example of each step in the algorithm.
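The first two steps of the algorithm hinge on the matrix of second derivatives of the Lagrangian with respect to the velocities. When that matrix has constant rank, the number of independent primary constraints equals the number of velocities minus the rank. The sketch below illustrates this count for a hypothetical Lagrangian that is quadratic in the velocities, so that the matrix is constant; it is not the Lagrangian (1) of this paper.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix, computed exactly by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0),
                     None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def count_primary_constraints(hessian):
    """Number of velocities minus the rank of the velocity Hessian."""
    return len(hessian) - rank(hessian)


# Hypothetical example: L = (v1 + v2)^2 / 2 + v3^2 / 2 - V(q1, ..., q4),
# in which the fourth velocity does not appear at all.
singular_hessian = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
n_primary = count_primary_constraints(singular_hessian)

# A nonsingular case for comparison: L = (v1^2 + v2^2) / 2 - V(q1, q2).
regular_hessian = [[1, 0], [0, 1]]
n_regular = count_primary_constraints(regular_hessian)
```

In the singular example, the two primary constraints would be p1 - p2 = 0 and p4 = 0, since only the combination v1 + v2 and the velocity v3 can be solved for in terms of the momenta.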
One reason the Dirac–Bergmann algorithm can be confusing is that typical examples are chosen for simplicity, allowing some of the logical steps to be skipped. This causes the distinction between Hamiltonians to become blurred. For example, if there are no restrictions on the Lagrange multipliers, then the primary Hamiltonian and the total Hamiltonian coincide. Likewise, if there are no secondary (or higher-order) first class constraints, then the total Hamiltonian and the extended Hamiltonian coincide. Moreover, for theories with no second class constraints and no gauge conditions imposed, Dirac brackets and the reduction process are not needed.
Another confusing aspect of the Dirac–Bergmann algorithm is that for many important theories, the Lagrangian is given to us in a form that contains Lagrange multipliers. For example, consider the Einstein–Hilbert action of general relativity. The Lagrangian density is the spacetime curvature scalar. A 3 + 1 splitting of the spacetime metric [
31,
32] allows us to write the Lagrangian density as
apart from a total derivative term that integrates to the boundary. Here,
R and
are the spatial scalar curvature and spatial metric. In addition,
is the extrinsic curvature of space, built from the lapse function
N, shift vector
and spatial derivatives of
. The Lagrangian density depends on the Lagrange multipliers
N and
as well as the configuration space coordinates
.
Because the Einstein–Hilbert Lagrangian (
112) depends on Lagrange multipliers, it is not analogous to the singular Lagrangian (
1) of our example problem. Rather, it is analogous to the partially reduced Lagrangian that appears in the integrand of the action
of Equation (
89). Recall that the partially reduced Hamiltonian
contains only first class constraints and no restrictions on the Lagrange multipliers. Likewise, the Hamiltonian for general relativity [
32] is constructed from first class constraints (the Hamiltonian and momentum constraints), and the Lagrange multipliers (the lapse function and shift vector) are unrestricted.
We can attempt to eliminate the lapse and shift from the Einstein–Hilbert action, just as we eliminated the
’s from
and obtained the result
in Equation (
91). It is straightforward to eliminate the lapse
N—the result is the Baierlein–Sharp–Wheeler action [
33]. However, the shift vector cannot be eliminated algebraically because the equations of motion obtained by varying the action with respect to the lapse and shift depend on spatial derivatives of
.
So, for general relativity, we do not have a singular Lagrangian analogous to Equation (
1), and we cannot expect to apply the Dirac–Bergmann algorithm from beginning to end as laid out by Dirac [
10]. Nevertheless, the general Dirac–Bergmann algorithm serves as the foundation for our understanding and interpretation of the Hamiltonian form of the theory.