1. Introduction
The strongly-interacting quantum many-body problem is crucial to our understanding of many intriguing physical phenomena, but it is also inherently difficult to treat numerically owing to the exponential growth of the Hilbert space with system size. A commonly used approximate strategy is the variational method where a trial state, characterized by a tractable number of variational parameters, is optimized in energy. The effectiveness of this approach is highly dependent on the ansatz having an expressive form that can be systematically improved, to minimize bias, while also allowing relevant observables to be evaluated efficiently. Tensor networks have provided several examples of such ansatzes, with matrix product states (MPS) [1, 2] displaying impressive accuracy in one-dimensional systems, along with projected entangled-pair states (PEPS) [1, 3], tree tensor networks (TTN) [4, 5], and multi-scale entanglement renormalization ansatz (MERA) [6] making two-dimensional systems accessible. Recently, artificial neural networks (ANNs) have emerged as another class of highly flexible variational ansatzes with many variants, such as restricted Boltzmann machines (RBM) [7], Deep Boltzmann Machines (DBM) [8, 9, 10], convolutional neural networks (CNN) [11, 12, 13, 14, 15], and feed-forward neural networks (FFNN) [16, 17, 18, 19]. An important advantage of ANNs is that they are highly flexible and can be applied to any number of spatial dimensions, making them a powerful method for tackling the subtle physics seen in two-dimensional systems.
Although one of the simplest ANN variants, RBMs have seen widespread applications, including for open quantum systems [20, 21, 22, 23], frustrated spin problems [24, 25], quantum circuit simulation [26, 27, 28], and more. There are several reasons for their continued use. First, their simple structure allows for efficient sampling crucial for applying variational Monte Carlo (VMC) [29, 30]. Second, RBMs are also a good candidate for a weakly biased ansatz, given that they are capable of exactly representing arbitrary states when their hidden unit number M scales exponentially with system size N. Third, RBMs are capable of representing states with volume-law entanglement [31], which further distinguishes them from tensor networks [32], despite their conceptual similarities [33, 34, 35]. Finally, there are also numerous classes of states with efficient exact RBM representations, including graph states [8]; spin Jastrow states, such as Laughlin states [33, 36, 37]; and general stabilizer states, such as the toric code [38, 39, 40, 41], as well as more exotic hypergraph states and XS-stabilizer states [
40]. Recently, we found that all but the last class listed, in fact, have RBM representations requiring
hidden units [
42], illustrating how even very modestly sized RBMs have significant representational power.
Despite their efficacy for spin-1/2 systems, the application of RBMs to systems with a local on-site dimension $d > 2$, such as spin-1 or bosonic systems, has been limited, with convolutional or feedforward neural networks generally being favored [16, 17, 43]. The typical approach in machine learning to handle models with multinomial or categorical variables is so-called “one-hot” or “unary” encoding [44]. Rather than representing a physical degree of freedom directly with one visible unit, this approach encodes the possible local physical states into a set of binary visible units. While this approach leverages the power of binary or spin-1/2 RBMs, it multiplies the number of visible units by a factor $d$, significantly increasing the parameter count and complexity of the optimization. As a consequence, studies utilizing unary encoding so far, for example, on the Bose Hubbard model [45, 46], have been limited to small system sizes. Thus, there is a need to devise more efficient RBM constructions tailored for $d > 2$ systems.
Progress has been made in this direction in recent work [47], where multivalued RBMs were applied directly to the one-dimensional spin-1 anti-ferromagnetic Heisenberg (AFH) model and substantially enhanced by incorporating a transformation to a coupled SU(2) symmetric basis. Complementary to this, here, we propose and study a direct generalization of the RBM to spin-1 systems that retains key properties of the spin-1/2 RBM with a minimal increase in variational parameters (as we will not be examining other network architectures, we will use RBM and NQS interchangeably in this paper). Specifically, these properties are the ability to describe arbitrary product states without hidden units, invariance of the parameterization to the values assigned to visible variables (labeling freedom), and equivalence to the tensor network formulation. This leads to the introduction of new quadratic bias and interaction weights in the RBM effective energy function. We demonstrate the effectiveness of the new formulation via VMC calculations for the spin-1 AFH model, where it is seen to deliver the same accuracy as unary encoding but with substantially fewer variational parameters. Additionally, we also investigate how the local single-spin basis affects the hidden unit complexity of a state by performing calculations in both the $S^z$ and $xyz$ spin-1 bases. For the AFH model with periodic boundary conditions, we find that the $S^z$ basis is more accurate. A useful advantage of our new spin-1 RBM formulation is that it permits tensor network based analytic constructions. Focusing on the paradigmatic Affleck-Kennedy-Lieb-Tasaki (AKLT) model, we give explicit exact NQS representations in both the $S^z$ and $xyz$ bases. Our $S^z$ basis NQS construction displays the expected [36] $O(N^2)$ scaling, while the simplification of the amplitude structure in the $xyz$ basis gives an NQS construction with $M = 2N$ hidden units. By using VMC calculations, we find compelling evidence that the AKLT state, in fact, only requires $M = N$ hidden units to be represented exactly in the $xyz$ basis.
The structure of this paper is as follows. We briefly outline the VMC method applied to the many-body problem in Section 2 and discuss desirable properties for variational ansatzes. Next, in Section 3, we discuss the RBM in its spin-1/2 form and analyze its key properties. In Section 4, we introduce a new generalization of NQS to spin-1 systems designed to mimic these properties and present VMC results for the AFH model. We then introduce analytic constructions for the AKLT model in Section 5, followed by VMC calculations in both the $S^z$ and $xyz$ bases. Finally, in Section 6, we conclude and discuss some open problems.
4. Generalization to Spin-1 Systems
A tentative definition of a spin-1 RBM can be made by simply using the three-valued visible variables
with a natural Ising-like physical to visible mapping,
in the standard linear energy function Equation (
3). By retaining two-valued Ising-like hidden units
, the amplitudes
continue to be given by Equation (
6), but now admitting three-valued visible variables.
This spin-1 generalization of the RBM lacks the key properties of the spin-1/2 variant. First,
cannot easily describe a generic product spin state
with coefficients
, since the visible bias term
cannot discriminate each of the local spin-1 states. This points to issues of expressiveness since we are using the same set of
parameters
to describe a bigger state space. Second, the energy function Equation (
3) does not possess labeling freedom for the visible units values. An arbitrary physical to visible mapping,
is generated from an Ising-like variable
via the quadratic transformation
. Consequently, transforming visible variables in this way cannot be accommodated in
by changing only the parameters
. One way to avoid these issues is to use “one-hot” or “unary” encoding which has been successfully applied to both spin-1 [
40] and bosonic systems [
45,
46].
4.1. Unary Encoding Approach
Unary encoding applies the spin-1/2 RBM formalism of Equation (4) to a larger state space of each spin-1 by mapping the physical system into one comprising a larger number of spin-1/2 particles. Specifically, the local $S^z$ basis of each spin-1 is mapped onto three spin-1/2’s as
Physical states of the spin-1 system are now contained in the unary encoded subspace, where only a single excitation occurs within any three-site unit cell, which facilitates the efficient projection of the representation. This naturally generalizes to a $d$-dimensional local state space.
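As a concrete illustration of this mapping, the following minimal sketch encodes a spin-1 configuration into $3N$ binary visible units and checks the single-excitation constraint. The ordering of the unary sites (a, b, c) against the local states (+1, 0, −1), the use of ±1 values for the binary units, and the helper names are all assumptions made here for illustration.

```python
import numpy as np

# Assumed ordering of the local spin-1 states within a unary cell (a, b, c).
LOCAL_STATES = (+1, 0, -1)

def unary_encode(config):
    """Map a spin-1 configuration to 3N Ising-like (+1/-1) visible units."""
    cells = []
    for s in config:
        cell = -np.ones(len(LOCAL_STATES), dtype=int)  # all units "off"
        cell[LOCAL_STATES.index(s)] = +1                # single excitation
        cells.append(cell)
    return np.concatenate(cells)

def is_physical(units, d=len(LOCAL_STATES)):
    """True if every d-site unary cell holds exactly one excitation."""
    cells = np.asarray(units).reshape(-1, d)
    return bool(np.all((cells == +1).sum(axis=1) == 1))

encoded = unary_encode([+1, 0, -1, 0])
```

Configurations failing `is_physical` lie outside the unary encoded subspace and would be removed by the projection.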
By using this mapping, a spin-1/2 RBM can be applied, inheriting its labeling freedom. Owing to the single-excitation projection, product states in Equation (
13) are readily described by the visible biases
and
associated with the three spin-1/2’s {a,b,c} encoding a given spin-1. In general, for a $d$-dimensional local state space, unary encoding accounts for the enlarged state space by increasing the number of variational parameters to $dN + M + dNM$ compared to the naive spin-1 RBM. However, there are some obvious deficiencies of this approach. First, unary encoding appears to unnecessarily enlarge the parameter count, as evidenced by the fact that it increases it even for $d = 2$. This will significantly increase the computational cost. Second, splitting the interaction weights $w$ across a unary cell makes the interpretation of any individual hidden unit’s contribution to the physical state rather opaque.
4.2. Defining a Spin-1 RBM and Tensor Network
These deficiencies show that a more general RBM energy function is needed that is sensitive to the three-valued nature of the visible units through the inclusion of terms involving the square of visible variables (a similar approach is likely to have been used in Ref. [47] already, although it was not explicitly stated). This motivates the following definition:
Definition 2 (spin-1 NQS).
We introduce a direct spin-1 RBM as the ansatz with amplitudes via the energy function defined by the parameters . The new contributions to a spin-1 RBM are
, an
-dimensional matrix of quadratic interactions, and
, an
N-dimensional vector of quadratic visible biases. There are now
complex parameters in total. Tracing out the two-valued Ising-like hidden units gives amplitudes
The inclusion of a quadratic visible bias now allows any product state in Equation (
7) to be described without hidden units by setting
while the quadratic interaction term ensures labeling freedom for the visible variable.
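To make the role of the quadratic visible bias concrete, the sketch below solves for the biases that reproduce an arbitrary single-site state. It assumes, purely for illustration, that without hidden units the local amplitude takes the form $\exp(a v + A v^2)$ with $v \in \{+1, 0, -1\}$; the sign convention of the actual energy function may differ, and the function names are hypothetical. With $A = 0$ the three amplitudes are forced to satisfy $c_+ c_- = c_0^2$, which is why the linear bias alone cannot describe a generic product state.

```python
import numpy as np

def product_state_biases(coeffs):
    """Solve exp(a*v + A*v**2) proportional to c_v for v in (+1, 0, -1).

    coeffs = (c_plus, c_zero, c_minus), all assumed to be non-zero.
    """
    log_p, log_0, log_m = np.log(np.asarray(coeffs, dtype=complex))
    a = 0.5 * (log_p - log_m)            # linear bias fixes the ratio c_+ / c_-
    A = 0.5 * (log_p + log_m) - log_0    # quadratic bias fixes c_+ c_- / c_0^2
    return a, A

def local_amplitudes(a, A):
    """Amplitudes exp(a*v + A*v**2) for v = +1, 0, -1 (assumed form)."""
    v = np.array([+1, 0, -1])
    return np.exp(a * v + A * v**2)

coeffs = np.array([0.3 + 0.4j, 0.5, -0.2j])
a, A = product_state_biases(coeffs)
amps = local_amplitudes(a, A) * coeffs[1]   # restore the overall factor c_0
```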
A strong justification for Equation (
16) being the appropriate spin-1 generalization of RBMs is its relation to an NQS tensor network for spin-1. To handle spin-1 systems, we introduce a COPY tensor with three-dimensional legs copying the
basis states
. Its properties are summarized in
Figure 3a–f and are straightforward generalizations of the spin-1/2 case given in Figure 2a–f and discussed in Section 3.2. The major difference is that we now distinguish three-dimensional legs with ‘=’ lines instead of ‘−’. As before, the NQS tensor network follows from the conversion of the RBM graph in Figure 1, except that visible vertices are now replaced by three-dimensional COPY tensors giving:
Owing to the mixed dimensionality of the COPY tensors in this network, it now requires a rectangular $2 \times 3$ coupling matrix
between the
i-th hidden and
j-th visible unit. For the
i-th hidden unit, its set of coupling matrices
can be explicitly tabulated as
The amplitudes
of this hidden unit’s correlator then follow by summing the product of coupling matrix elements selected by
along each row, giving
The amplitudes of NQS tensor network are then the product of each of these hidden unit correlators
which, like an RBM, can be exactly and efficiently sampled.
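The evaluation rule just described can be sketched in a few lines. The array layout (one $2 \times 3$ matrix per hidden-visible pair), the column ordering (+1, 0, −1), and the helper `rbm_coupling`, which builds coupling matrices from an assumed RBM-like exponent with linear weights `w`, quadratic weights `W` and hidden bias `b`, are illustrative assumptions rather than the exact parameterization of the text.

```python
import numpy as np

STATE_INDEX = {+1: 0, 0: 1, -1: 2}  # assumed column ordering of each 2x3 matrix

def correlator(coupling, config):
    """Hidden-unit correlator from its N coupling matrices, shape (N, 2, 3)."""
    cols = [STATE_INDEX[s] for s in config]
    # Pick, for every site j, the column of matrix j selected by v_j -> (N, 2).
    selected = coupling[np.arange(len(config)), :, cols]
    # Multiply along each row (over sites), then sum the two rows.
    return selected.prod(axis=0).sum()

def amplitude(couplings, config):
    """NQS amplitude as the product of all M hidden-unit correlators."""
    return np.prod([correlator(c, config) for c in couplings])

def rbm_coupling(w, W, b):
    """Coupling matrices for one hidden unit h = +1/-1 (assumed RBM form)."""
    v = np.array([+1, 0, -1])
    h = np.array([+1, -1])
    theta = b / len(w) + w[:, None] * v + W[:, None] * v**2   # shape (N, 3)
    return np.exp(h[None, :, None] * theta[:, None, :])        # shape (N, 2, 3)
```

Under these assumptions the correlator of a single hidden unit reduces to $2\cosh\big(b + \sum_j (w_j v_j + W_j v_j^2)\big)$, which is the familiar result of tracing out a two-valued hidden unit.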
The spin-1 NQS tensor network appears to have $6NM$ complex variational parameters; however, again, gauge freedom allows the shuffling of diagonal matrices (of an appropriate dimension) through the COPY tensors, reducing this. Specifically, its equivalence to the generalized spin-1 RBM proposed in Equation (16) is made using the coupling matrix decomposition
The solution for the weights
and partial biases
is outlined in
Appendix B, from which the full RBM biases are formed as
,
and
. The spin-1 NQS tensor network thus reduces to $2NM + 2N + M$ complex parameters.
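The parameter counting for the different constructions can be tabulated with a short script. This is a sketch under the assumption that each ansatz carries only the parameter sets described in the text (visible biases, hidden biases and interaction weights, plus their quadratic counterparts for the direct spin-1 RBM); the function names are hypothetical.

```python
def rbm_param_count(n_visible, n_hidden):
    """Visible biases + hidden biases + interaction weights."""
    return n_visible + n_hidden + n_visible * n_hidden

def unary_param_count(n_sites, n_hidden, d=3):
    """Spin-1/2 RBM acting on d unary visible units per physical site."""
    return rbm_param_count(d * n_sites, n_hidden)

def direct_spin1_param_count(n_sites, n_hidden):
    """Linear and quadratic visible biases and weights, plus hidden biases."""
    return 2 * n_sites + n_hidden + 2 * n_sites * n_hidden

def coupling_matrix_count(n_sites, n_hidden):
    """Entries of one 2x3 coupling matrix per hidden-visible pair."""
    return 6 * n_sites * n_hidden

n_sites, n_hidden = 10, 10
saving = unary_param_count(n_sites, n_hidden) - direct_spin1_param_count(n_sites, n_hidden)
```

For any $N$ and $M$ the difference between the unary and direct counts is $N(M+1)$ under these assumptions.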
This correspondence between the spin-1 RBM and the spin-1 NQS tensor network highlights an advantage over unary encoding. Specifically, coupling matrices provide an intuitive tool for engineering the correlations and structures that a given hidden unit imprints on the amplitudes of an NQS. A trivial case is when all the elements of the
j-th coupling matrix are 1’s, denoted generically as
, equivalent to the hidden unit being disconnected from that visible unit. A more complex example with conditional correlations is a hidden unit with coupling matrices
built from
This hidden unit generates a correlator
, where a factor
is introduced conditional on the
kth spin being in the state
, with
being the number of
and
states in the configuration
between spins
and
N. Similar types of hidden units will be used extensively in
Section 5.1 to construct an exact representation of a state.
4.3. Projection of Unary Encoding into a Spin-1 RBM
The tensor network formalism provides further evidence that unary encoding from Section 4.1 is an over-parameterization of RBMs for spin-1 systems with $N(M+1)$ redundant parameters. Unary projection is implemented by an order-4 tensor
obeying:
By directly applying this projector to the spin-1/2 NQS tensor network for a unary encoded state and performing a graphical rewrite, we obtain the spin-1 NQS introduced in Equation (18). Figure 4 summarizes the crucial manipulations required.
In
Figure 4a, some representative examples of contractions between the projection tensor
and hidden units are shown. There are three steps to rewriting the network. The first step, shown in
Figure 4b, essentially pulls
through the three unary two-dimensional COPY tensors, leaving behind a single three-dimensional COPY tensor representing the physical spin-1. Two important cases are shown in the example in
Figure 4b. A hidden unit may have connections to each of the unary spin-1/2’s, whereupon they get bundled up by the
tensor. A hidden unit may connect to only a subset of the unary spin-1/2’s, which is handled by plugging the unused legs of
tensor with
. If a hidden unit couples to more than one of the unary spins, then the second step, shown in
Figure 4c, involves splitting the hidden unit’s two-dimensional COPY tensor to separate those connections. The final step is then to contract the split COPY tensor, coupling matrices and the projection
to form a rectangular $2 \times 3$ coupling matrix, as depicted in Figure 4d. If a hidden unit has connections exclusively within the unary spin-1/2’s, then it becomes an entirely local visible bias contribution in the spin-1 NQS.
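The net effect of this projection on a single hidden unit can be checked algebraically. In the sketch below, the three weights connecting a hidden unit to one unary cell are converted into a linear weight, a quadratic weight and a shift of the hidden bias. It assumes ±1 valued unary units with one-hot ordering (+1, 0, −1) and an exponent linear in the unary variables; these conventions and the function name are illustrative only.

```python
import numpy as np

def project_unary_weights(u):
    """u = (u_a, u_b, u_c): weights from one hidden unit to a unary cell.

    Returns (shift, w, W) such that, on the one-hot subspace,
    sum_k u_k * sigma_k == shift + w*v + W*v**2 for v in (+1, 0, -1).
    """
    u = np.asarray(u, dtype=complex)
    f = 2 * u - u.sum()                  # field for local states (+1, 0, -1)
    shift = f[1]                         # value at v = 0, absorbed by hidden bias
    w = 0.5 * (f[0] - f[2])              # odd part in v
    W = 0.5 * (f[0] + f[2]) - f[1]       # even part in v, relative to v = 0
    return shift, w, W
```

Three unary weights therefore carry only two independent spin-1 weights per hidden-visible pair, with the remainder absorbed into the hidden bias, which is one way to see where redundant parameters in unary encoding come from.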
4.4. Change of Local Spin Basis
For tensor network representations, such as MPS or PEPS, the complexity (internal bond dimension) of a given state’s description is rooted in its entanglement structure. As such, changing the local basis of the spins used in a calculation has no effect on this complexity. Moreover, transforming a representation from one basis to another is accomplished by simply admixing the local tensors. Within VMC, a change of local spin basis leaves the locality and the sparsity of the Hamiltonian essentially unchanged. However, it is pivotal to the method that the amplitudes of whatever ansatz is used can be efficiently evaluated in this new basis. This is not generally true of NQS since their sampleability is intimately tied to the basis that factorizes the COPY tensors they are built from.
To understand how NQS behave, consider a representation of some state
, for instance, of four spin-1’s sampleable in the
basis
Now, suppose we transform the local
basis
to a new basis
via a unitary
. Formally, from a tensor network perspective, we find the
basis NQS representation by sandwiching
on each physical leg and computing the
basis NQS representation of
. However, currently, there is no known procedure for updating exactly an NQS after the application of an arbitrary single-spin unitary, even allowing for an increased number of hidden units. While we are guaranteed that an NQS representation of
exists, as illustrated here schematically
there is no guarantee it will be efficient, even if the representation of
was originally. An example of such a catastrophic loss of NQS efficiency has been presented by Gao and Duan [
8]. They show that, so long as the polynomial hierarchy in computational complexity theory does not collapse, a two-dimensional cluster state, which has an efficient NQS representation, has no efficient NQS representation after a specific layer of translation-invariant single-spin unitaries are applied. Thus, NQS complexity depends non-trivially on the local spin basis used.
Motivated by this, we will consider NQS calculations in two different local spin-1 bases to examine how the complexity varies, specifically, the standard $S^z$ basis and the $xyz$ basis defined as
In the $xyz$ basis, the individual spin-1 operators all acquire the same off-diagonal form
as a consequence of them all contributing one eigenstate to the basis. On a practical level, using the
basis for the spin-1 RBM in Equation (
17) simply requires replacing
with
and a physical-visible mapping from
, such as
Equivalently, the visible unit COPY tensors for this NQS tensor network can be considered to be rotated into this $xyz$ basis.
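The common off-diagonal form of the spin operators in such a basis can be verified numerically. The sketch below assumes one standard convention for the Cartesian states, $|x\rangle = (|{-1}\rangle - |{+1}\rangle)/\sqrt{2}$, $|y\rangle = i(|{-1}\rangle + |{+1}\rangle)/\sqrt{2}$ and $|z\rangle = |0\rangle$, which may differ from the definition used in the text by phases; under it, each operator $S^a$ annihilates $|a\rangle$ and has matrix elements $-i\epsilon_{abc}$.

```python
import numpy as np

s = 1 / np.sqrt(2)
# Spin-1 operators in the S^z basis ordered (+1, 0, -1).
Sx = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Sy = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]])
Sz = np.diag([1, 0, -1]).astype(complex)

# Assumed convention: columns are |x>, |y>, |z> expanded in the S^z basis.
U = np.array([[-s, 1j * s, 0],
              [0,  0,      1],
              [s,  1j * s, 0]])

# Levi-Civita symbol eps[a, b, c].
eps = np.zeros((3, 3, 3))
for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[a, b, c], eps[a, c, b] = 1, -1

# Spin operators rotated into the xyz basis.
S_xyz = [U.conj().T @ S @ U for S in (Sx, Sy, Sz)]
```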
4.5. Numerical Example—Spin-1 Anti-Ferromagnetic Heisenberg Model
To confirm the effectiveness of our spin-1 NQS, we performed VMC calculations to reach the ground state of the well-known anti-ferromagnetic Heisenberg (AFH) model in one dimension. The Hamiltonian is given by
$$\hat{H} = J \sum_{j=1}^{N} \hat{\mathbf{S}}_j \cdot \hat{\mathbf{S}}_{j+1},$$
where $J > 0$ is the magnetic interaction strength, with periodic boundary conditions $\hat{\mathbf{S}}_{N+1} = \hat{\mathbf{S}}_1$. We focus on small systems allowing direct comparison to the ground state calculated from exact diagonalization via the overlap
. Additionally, we performed NQS optimizations in both the $S^z$ and $xyz$ bases to compare any differences in performance. This basis change alters the fixed quantum numbers of the system. In particular, in the $S^z$ basis, the AFH model preserves the total $S^z$ projection of the system, and its ground state lies in the $S^z_{\mathrm{tot}} = 0$ sector of the full Hilbert space, while, in the $xyz$ basis, the AFH model preserves the “parity” of the total $x$, $y$, and $z$ spin populations in the system, and, for even $N$, the ground state lies in the subspace of configurations where there is an even number of each basis state. As is common for VMC calculations, we only select configurations from the relevant subspace during sampling. We perform the calculations with hidden unit numbers
, initializing
with random small complex parameters. For each successive calculation, we seed the NQS with the parameters for
and initialize the
Mth hidden unit with random small parameters, gradually increasing the size of the network in a sequential manner. To check the robustness against initialization bias of the qualitative features we have discussed, we rerun optimization sequences 5 to 10 times and present the best results here.
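The exact-diagonalization reference used for such comparisons can be reproduced for very small chains with dense linear algebra. The sketch below assumes the standard nearest-neighbor form $J \sum_j \mathbf{S}_j \cdot \mathbf{S}_{j+1}$ with periodic boundaries and defines the infidelity as one minus the squared overlap; the function names are hypothetical, and the dense construction is only practical for a handful of sites.

```python
import numpy as np
from functools import reduce

s = 1 / np.sqrt(2)
# Spin-1 operators in the S^z basis ordered (+1, 0, -1).
Sx = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Sy = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]])
Sz = np.diag([1, 0, -1]).astype(complex)

def site_op(op, j, n):
    """Embed a single-site operator at site j of an n-site chain."""
    ops = [np.eye(3)] * n
    ops[j] = op
    return reduce(np.kron, ops)

def afh_hamiltonian(n, J=1.0):
    """H = J * sum_j S_j . S_{j+1} with periodic boundary conditions."""
    H = np.zeros((3**n, 3**n), dtype=complex)
    for j in range(n):
        k = (j + 1) % n
        for S in (Sx, Sy, Sz):
            H += J * site_op(S, j, n) @ site_op(S, k, n)
    return H

def infidelity(psi, phi):
    """1 - |<psi|phi>|^2 for normalized state vectors."""
    return 1.0 - abs(np.vdot(psi, phi)) ** 2

n = 4
energies, states = np.linalg.eigh(afh_hamiltonian(n))
ground = states[:, 0]
Sz_total = sum(site_op(Sz, j, n) for j in range(n))
```

For the four-site ring the ground state is a total-spin singlet, so it is annihilated by the total $S^z$ operator, consistent with restricting the sampling to the zero-magnetization sector.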
In
Figure 5, we show how the accuracy of spin-1 NQS and unary encoding representations improve with an increasing number of hidden units
M for both the
and
bases plotted in terms of the variational parameter count. The spin-1 NQS achieves a superior accuracy to unary encoding for a similar
. The inset of
Figure 5a shows the collapse of the same
data plotted against
M, indicating that the spin-1 NQS and unary encoding have in fact located the same solution for a given
M. However, by using
fewer parameters, the spin-1 NQS is considerably more efficient to optimize, especially noting that
scales with both system size and hidden unit number (for example, consider that the
spin-1 NQS has a comparable parameter count to an
unary NQS). In
Figure 5b, we observe a noticeable drop in accuracy for both NQS variants in the $xyz$ basis compared to the $S^z$ basis. This suggests that the AFH ground state amplitude structure with periodic boundaries is inherently more complicated in the $xyz$ basis regardless of encoding. Moreover, this confirms that the hidden unit number $M$ of an NQS is a basis-dependent quantity and cannot be used as a proxy for the entanglement.
5. Revisiting the AKLT Model
We now move on to benchmark our spin-1 NQS against the analytically solvable AKLT model [
52], which is a spin-1 chain governed by a bilinear-biquadratic SU(2)-isotropic Heisenberg Hamiltonian of the form
with periodic boundary conditions
. It has special significance since it was the first solvable spin-1 chain model that exhibits the ‘Haldane gap’ [
53]. The AKLT state
is the ground state of
at the AKLT point
, and has an energy of exactly zero.
As is well-known,
has a special structure of correlations which are related to a valence bond solid. Specifically, each spin-1 is envisaged as being a pair of spin-1/2 particles, each of which is entangled in a singlet state with a partner spin-1/2 in the nearest neighboring spin-1 on the chain. The AKLT state is then the projection
of the local pairs of spin-1/2 particles into the triplet subspace, as depicted in Figure 6. This also leads to
possessing a very compact MPS representation with matrices
where
, such that the (unnormalized) amplitudes of the ground state in the
basis follow as
Since the AKLT point of
lies in the gapped Haldane phase,
has finite-ranged magnetic correlations,
yet it also has an unbroken spin rotation symmetry which is a hallmark of a symmetry protected topological order. Specifically, the string-order parameter
reveals the presence of infinite-ranged anti-ferromagnetic correlations. This is evident from the structure of the MPS amplitudes. Any matrix product
, so any configuration containing a ferromagnetic segment, like “+ 0 0 0 +”, with any number of 0’s is not allowed. In contrast, allowed configurations contain only anti-ferromagnetic segments, such as “– 0 + 0 0 0 – + 0”, arising from sequences, like
, where every ± is partnered with a ∓ separated by an arbitrary string of 0’s.
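This selection rule is easy to confirm by evaluating the matrix-product amplitudes directly. The sketch below uses one common convention for the AKLT matrices, $A^{+} = \sqrt{2/3}\,\sigma^{+}$, $A^{0} = -\sqrt{1/3}\,\sigma^{z}$ and $A^{-} = -\sqrt{2/3}\,\sigma^{-}$, which is assumed here and may differ from the matrices of the text by signs or normalization; the nodal structure does not depend on that choice.

```python
import numpy as np

sp = np.array([[0, 1], [0, 0]], dtype=complex)   # sigma^+
sm = sp.T.copy()                                  # sigma^-
sz = np.diag([1, -1]).astype(complex)

# One common convention for the AKLT matrices (assumed here).
A = {+1: np.sqrt(2 / 3) * sp, 0: -np.sqrt(1 / 3) * sz, -1: -np.sqrt(2 / 3) * sm}

def aklt_amplitude(config):
    """Unnormalized periodic-chain amplitude tr(A^{s_1} ... A^{s_N})."""
    prod = np.eye(2, dtype=complex)
    for s in config:
        prod = prod @ A[s]
    return np.trace(prod)

ferro = aklt_amplitude([+1, 0, 0, 0, +1, -1])              # contains "+ 0 0 0 +"
antiferro = aklt_amplitude([-1, 0, +1, 0, 0, 0, -1, +1, 0])  # alternating signs
```

The first configuration contains a ferromagnetic segment and has zero amplitude, while the second alternates ± across strings of 0’s and survives.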
Despite its simple MPS representation, it is surprisingly non-trivial to capture the non-commutative matrix products making up the AKLT amplitudes with an NQS. Direct conversion to an NQS from the MPS representation gives two reasons why it must contain long-ranged hidden units. First, it has been shown [38] that any short-ranged translationally invariant NQS cast into an MPS form by mapping hidden units into virtual bonds has
matrices that are at most rank-1. Since the matrix
in the AKLT state is rank-2, it fails this condition. Second, if we divide the chain into a sequence of three contiguous parts
, once we make
b larger than the longest range of any hidden unit, so no hidden unit connects to visible units in both
a and
c, then the NQS amplitudes factorize as
implying that
and
are uncorrelated [
34]. The AKLT amplitudes
do not satisfy this property since region
b can be any length of 0’s, and there will always be non-zero amplitudes “+ 0 0 0 ...0 –” and “– 0 0 0 ...0 +” encoding string order correlations that are not factorizable.
It has been previously found that the AKLT state in the $S^z$ basis requires an NQS with $O(N^2)$ long-ranged hidden units [
36], and this was borne out in numerical calculations for small systems. In
Appendix C, we explicitly construct an NQS for the
basis AKLT amplitudes using
hidden units, many of which are extensive over the system. The $O(N^2)$ scaling can be readily understood as a consequence of having hidden units that each eliminate disallowed configurations, such as “± 0 0 0 ±”, and impose the sign for allowed configurations, such as “± 0 0 0 ∓”, for all $N$ separations and $N$ translations over the system. This is rather less efficient than the compact spin-1/2
NQS found for Jastrow, graph and stabilizer states in ref. [
42]. The AKLT state can be expressed with
hidden units but at the expense of needing a 2-layer DBM network [
38] that cannot in general be exactly sampled, complicating its use numerically. However, as we saw for the AFH model numerical results, the hidden unit complexity is basis dependent. Surprisingly, we will show next that an exact
spin-1 NQS representation of the AKLT state is obtained in the $xyz$ basis.
5.1. Exact Spin-1 NQS for AKLT State in the $xyz$ Basis
The AKLT state provides an instructive example of how a single spin basis change can significantly alter the amplitude structure. Transforming the MPS representation into the $xyz$ basis yields matrices
and, thus, renders the amplitudes into products of Pauli matrices
where
with
label the
basis. As expected, there is no change in the complexity/internal dimension of the MPS representation.
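The reduction of the MPS matrices to Pauli matrices can be checked explicitly. The sketch below assumes one common convention for the $S^z$-basis AKLT matrices together with a Cartesian basis $|x\rangle = (|{-1}\rangle - |{+1}\rangle)/\sqrt{2}$, $|y\rangle = i(|{-1}\rangle + |{+1}\rangle)/\sqrt{2}$, $|z\rangle = |0\rangle$; both are assumptions, and other conventions change only signs and phases of the resulting matrices.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.diag([1, -1]).astype(complex)
sp, sm = (sx + 1j * sy) / 2, (sx - 1j * sy) / 2

# Assumed S^z-basis AKLT matrices, ordered (+1, 0, -1).
A = np.array([np.sqrt(2 / 3) * sp, -np.sqrt(1 / 3) * sz, -np.sqrt(2 / 3) * sm])

s = 1 / np.sqrt(2)
# Assumed xyz basis: columns |x>, |y>, |z> expanded in the S^z basis.
U = np.array([[-s, 1j * s, 0], [0, 0, 1], [s, 1j * s, 0]])

# Amplitudes transform with <alpha|s> = conj(U[s, alpha]), so the new
# matrices are B^alpha = sum_s conj(U[s, alpha]) A^s.
B = np.einsum("sa,sij->aij", U.conj(), A)
```

With these conventions each `B[a]` equals a single Pauli matrix up to a factor of magnitude $1/\sqrt{3}$, and the bond dimension remains 2.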
The structure of the amplitudes in the $xyz$ basis is significantly simpler than in the $S^z$ basis. Amplitudes are now evaluated by tracking the anticommutations of Pauli matrices required to make the matrices of each type form a contiguous sequence, and then reducing the product repeatedly using the fact that each Pauli matrix squares to the identity. The resulting matrix trace is non-zero only when the overall product is proportional to the identity, and so all non-zero amplitudes have an equal magnitude. Depending on whether $N$ is even or odd, this condition requires that there is either an even or odd number of $x$’s, $y$’s, and $z$’s in any configuration string, respectively. Using this, we arrive at the following result:
Theorem 1 (AKLT state NQS). The AKLT state in the $xyz$ spin basis has an exact spin-1 NQS representation requiring $M = 2N$ hidden units.
Proof. We establish this result using a direct and intuitive construction for in which hidden units are devised to implement the nodal structure and sign structure of this state. The rules governing the amplitudes are as follows:
To implement the parity constraint on the number of
and
z’s in any configuration string
, we introduce the following
coupling matrices:
By defining two hidden units from these matrices as
and
, we arrive at the product filter
in which the hidden units cancel out any strings
that have odd numbers of both
x’s and
y’s, and
y’s and
z’s, respectively. Together, these hidden units completely establish for any
N the nodal structure of the AKLT state amplitudes in this basis.
To reproduce the sign structure arising from anticommuting Pauli matrices into a contiguous sequence, we require two types of hidden units. The first type of hidden unit uses a conditional coupling matrix for the local state
along with
to define a hidden unit of the form
where
appears in the
kth position in the sequence. The action of
is to induce on a configuration a factor
between site
and the left boundary, conditional on site
k being in state
. This is the sign that would occur if a
matrix was anticommuted to this boundary through the corresponding product of Pauli matrices. The second type of hidden unit uses two further coupling matrices
defining a hidden unit of the form
where
appears in the
kth position in the sequence. The action of
is to induce on a configuration a factor
between site
and the right boundary, conditional on site
k being in state
. This is the sign that would occur if a
matrix was anticommuted to the right boundary through the corresponding product of Pauli matrices, assuming that any
’s have already been anticommuted to the left boundary. To capture all locations
k for both types, thus, requires
hidden units which entirely establish the sign structure of the AKLT state amplitudes in this basis.
This gives a total of $M = 2N$ hidden units. □
The resulting amplitude-wise product decomposition of the AKLT state into hidden unit correlators is depicted in
Figure 7 for
.
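The nodal part of this construction can be tested by brute force on small chains. The sketch below uses one possible pair of $2 \times 3$ coupling matrices (columns ordered $x$, $y$, $z$) that produce the filters $1 + (-1)^{n_x + n_y}$ and $1 + (-1)^{n_y + n_z}$; these particular matrices are an illustrative choice and not necessarily those of the proof. The filters are compared with the zeros of the trace of the corresponding product of Pauli matrices, whose overall normalization and signs are irrelevant for the nodal structure.

```python
import itertools
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.diag([1, -1]).astype(complex)
PAULI = [sx, sy, sz]  # local states x, y, z are indexed 0, 1, 2

def pauli_amplitude(config):
    """Trace of the product of Pauli matrices selected by the configuration."""
    prod = np.eye(2, dtype=complex)
    for s in config:
        prod = prod @ PAULI[s]
    return np.trace(prod)

# Illustrative coupling matrices, applied identically on every site.
C_XY = np.array([[1, 1, 1], [-1, -1, 1]])   # gives 1 + (-1)^(n_x + n_y)
C_YZ = np.array([[1, 1, 1], [1, -1, -1]])   # gives 1 + (-1)^(n_y + n_z)

def filter_correlator(C, config):
    """Sum over the two rows of the product of selected matrix elements."""
    return C[:, list(config)].prod(axis=1).sum()

def nodal_filter(config):
    return filter_correlator(C_XY, config) * filter_correlator(C_YZ, config)

def filters_match_nodes(n):
    """Check that the filter vanishes exactly where the amplitude does."""
    return all(
        (abs(pauli_amplitude(c)) > 1e-12) == (nodal_filter(c) != 0)
        for c in itertools.product(range(3), repeat=n)
    )
```

Two fully connected hidden units therefore suffice to fix the nodal structure for both even and odd $N$, leaving only the sign structure to the remaining hidden units.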
5.2. Analytic Example—AKLT Unary Stabilizer State
An explicit NQS construction for the AKLT state in the $xyz$ basis has been given before in Ref. [40] using unary encoded cells of three spin-1/2’s. Their construction involves initializing the b spin-1/2’s in state
while entangling the a and c spin-1/2’s between adjacent unary cells in the state
. As a tensor network, this is represented as
where each box is a unary cell, and H denotes the Hadamard matrix. Each unary cell then has the following unitary applied to it
where S is the phase gate [
40] (we have switched the controlled-NOT gates in the circuit given in
Figure 5b of ref. [
40] into controlled-
Z gates here to expose the graph state equivalence). Putting these pieces together, the unary encoded spin-
state
is a stabilizer state constructed by the circuit
where the top part (above the dashed line) generates a graph state, and the bottom part applies local Clifford gates. Once unary projection to the spin-1 is applied,
generates the AKLT state [
40].
Previously, in ref. [42], we showed how any stabilizer state for $N$ spin-1/2’s can be readily converted into an NQS with
. Here, we just summarize the basic process. The first step in this conversion is to use the local Clifford equivalence of stabilizer states to graph states to relocate all the non-diagonal Clifford gates to independent vertices of the graph. This conversion takes the simple chain-like graph state and pattern of Clifford gates from Equation (
32) and gives the following for
spins:
As required, the resulting graph has diagonal Clifford gates on all vertices, except for a small independent set
highlighted. Notice that the three-site translational invariance of
, mirrored by the initial chain graph, is still formally present in the transformed graph but is now obscured by its highly connected topology. By forming a vertex cover
, we obtain an NQS [
42] with
hidden units (despite the more complex graph, this is the same number of hidden units required to describe the initial chain graph state as an NQS):
More generally, for
N spin-
’s, this procedure generates an NQS with
hidden units.
After applying the unary projection and contraction process directly to this spin-1/2 NQS, we obtain a spin-1 NQS tensor network, whose schematic structure is shown here for
:
It is evident that pairs of hidden units possess coordination
and 2, while one pair gets projected down to a spin-1 visible bias, giving
overall. The same structure applies for general
N with pairs of equal coordination spanning
, giving a spin-1 NQS for the AKLT state in the $xyz$ basis requiring $M = 2N - 2$ hidden units. This representation is essentially identical to the one presented in
Section 5.1, except that the hidden units implementing the nodal structure (the fully connected pair) also contribute to the sign structure, reducing the total hidden unit count by one pair. This raises an interesting question of whether an even more compressed spin-1 NQS representation of the AKLT state is possible. We finish by examining this using direct numerical optimizations.
5.3. Numerical Example—AKLT in $S^z$ and $xyz$ Bases
Although the AKLT state is translationally invariant, the hidden units encoding the sign structure of this solution are neither individually translationally invariant nor do their translates appear. Consequently, we performed VMC calculations with increasing
M using both the spin-1 NQS and unary NQS for
and compared against exact diagonalization. As with the AFH model (which is
with
), earlier in
Section 4.5, we considered both the $S^z$ and $xyz$ bases with their corresponding nodal structure enforced by sampling. We also utilize the same sequential growth scheme as we used in the Heisenberg calculations, again confirming the robustness of our qualitative conclusions by performing reruns of the optimization sequence and presenting the best results here.
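The sequential growth scheme amounts to carrying over the optimized parameters and appending one new hidden unit with small random complex parameters. A minimal sketch is given below; the dictionary layout, the key names (`b` for hidden biases, `w` and `W` for linear and quadratic weights) and the initialization scale are assumptions made for illustration and are not taken from the calculations reported here.

```python
import numpy as np

def grow_hidden_unit(params, rng, scale=1e-2):
    """Append one hidden unit with small random complex parameters.

    params: dict with hidden biases "b" of shape (M,), linear weights "w"
    of shape (M, N) and quadratic weights "W" of shape (M, N). Any other
    entries (for example visible biases) are carried over unchanged.
    """
    def small(shape):
        return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

    n_sites = params["w"].shape[1]
    grown = dict(params)                       # shallow copy; inputs untouched
    grown["b"] = np.concatenate([params["b"], small(1)])
    grown["w"] = np.vstack([params["w"], small((1, n_sites))])
    grown["W"] = np.vstack([params["W"], small((1, n_sites))])
    return grown
```

Each optimization with $M$ hidden units would then start from `grow_hidden_unit` applied to the converged parameters for $M - 1$.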
As we are performing stochastic optimization, intrinsic sampling noise will limit the accuracy to which any formally exact solution can be found. It is, therefore, crucial to quantitatively characterize when exactness may have been reached numerically. For a gapped Hamiltonian, the average energy of an approximate state
E can be related to its infidelity with the ground state using [
30]
where
is the energy gap of the Hamiltonian, and
is the energy deviation. As the ground state energy
of the AKLT Hamiltonian is zero, the energy deviation is simply the sampled energy of the state
. As shown in
Figure 8a, even if the exact analytic spin-1 NQS solution is used,
fluctuates when using a finite number of samples typical of an optimization step. For all optimizations presented in this paper, we used the following hyperparameters: number of samples per optimization step
, number of optimization steps
. Typically, the full wavefunction and its fidelity are calculated and checked every 1000 steps to gauge whether the solution has converged or requires further optimization. To account for fluctuations caused by a finite
employed throughout the stochastic optimization, we therefore use, in Equation (
35),
, the standard deviation of the sampled energy. As shown in
Figure 8b,
vanishes as
, and we estimate an algorithmic fidelity resolution of
for
sites, below which it may be hard to distinguish an exact solution from an extremely good approximate one.
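The relation between energy deviation and infidelity for a gapped Hamiltonian follows from a short spectral argument. As a sketch in assumed notation (the symbols $|\psi\rangle$, $|n\rangle$, $E_n$, $c_n$, $\mathcal{F}$, $\delta E$, and $\Delta = E_1 - E_0$ are chosen here for illustration and may differ from those used in the paper), expand a normalized trial state in the eigenbasis of a Hamiltonian with a unique ground state $|0\rangle$:

```latex
\begin{align}
  |\psi\rangle &= \sum_{n \ge 0} c_n |n\rangle ,
  \qquad \mathcal{F} = |c_0|^2 ,
  \qquad \sum_{n \ge 0} |c_n|^2 = 1 , \\
  \delta E &= \langle \psi | H | \psi \rangle - E_0
            = \sum_{n \ge 1} |c_n|^2 \,(E_n - E_0)
            \;\ge\; \Delta \sum_{n \ge 1} |c_n|^2
            = \Delta \,(1 - \mathcal{F}) , \\
  &\Rightarrow \quad 1 - \mathcal{F} \;\le\; \frac{\delta E}{\Delta} .
\end{align}
```

Under these assumptions, substituting the standard deviation of the sampled energy for $\delta E$ gives an estimate of the smallest infidelity that can be resolved at a given number of samples.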
In
Figure 9, we show the smooth decrease in
against
M for the
basis. While the
curve in
Figure 9a drops below
, this is not attained for larger
N shown in
Figure 9b–d within the
M values tested. This indicates that no “exact” NQS solutions have been located. As with the AFH model, the spin-1 NQS and unary NQS achieve a similar accuracy versus
M, although the former utilizes fewer variational parameters.
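The difference in parameter count can be made concrete with a small sketch. It rests on assumptions that should be checked against the definitions given earlier in the paper: the spin-1 NQS is taken to have a linear and a quadratic visible bias per spin, a linear and a quadratic visible-hidden weight per spin and hidden unit, and one bias per hidden unit, while the unary encoding is taken to use three one-hot binary visible units per spin in a standard RBM. The names `log_amplitude`, `n_params_spin1`, and `n_params_unary` are hypothetical.

```python
import numpy as np


def log_amplitude(s, a, A, b, w, W):
    """Log-amplitude of an assumed spin-1 NQS with hidden units traced out.

    s    : spin configuration with entries in {-1, 0, +1}, shape (N,)
    a, A : linear and quadratic visible biases, shape (N,)
    b    : hidden biases, shape (M,)
    w, W : linear and quadratic visible-hidden weights, shape (N, M)
    """
    s = np.asarray(s, dtype=float)
    theta = b + s @ w + (s ** 2) @ W  # effective field on each hidden unit
    return s @ a + (s ** 2) @ A + np.sum(np.log(2.0 * np.cosh(theta)))


def n_params_spin1(n, m):
    """Parameter count for the assumed spin-1 NQS form."""
    return 2 * n + m + 2 * n * m


def n_params_unary(n, m):
    """Parameter count for an RBM on 3 one-hot binary units per spin."""
    return 3 * n + m + 3 * n * m


if __name__ == "__main__":
    for n, m in [(4, 3), (8, 6)]:
        print(n, m, n_params_spin1(n, m), n_params_unary(n, m))
```

With these counts the unary encoding carries N(1 + M) more parameters than the spin-1 form at the same N and M, which is one way to read the statement that the two reach similar accuracy at different parameter cost.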
The analogous results in
Figure 10 for the
basis display remarkable features in comparison. For each
N, a sharp drop in
by over 4 orders of magnitude is observed at
that consistently pushes the infidelity below
. After reaching this point,
versus
M plateaus, and subsequent hidden units have negligible bearing on the accuracy of the wavefunction due to statistical fluctuations originating from the stochastic optimization. These features are consistently produced by both the spin-1 NQS and unary NQS, aside from the largest system size
in
Figure 10d, indicating that the increasing number of redundant variational parameters in the unary encoding is complicating the optimization. The overall behavior of
observed in this basis is strong evidence of a numerically exact solution with
(once the 2 hidden units implementing the nodal structure are included). This is substantially smaller than the analytic solutions introduced and is very suggestive of there being a
compact exact spin-1 NQS representation of the AKLT state in the
basis.
6. Conclusions
We have introduced the most natural and direct generalization of the RBM from spin-1/2 to spin-1. This necessitated including a quadratic visible bias and a quadratic visible-hidden interaction in the RBM energy function to ensure trivial product state representation, labeling freedom, and gauge equivalence to the tensor network formulation. We demonstrated its use numerically for the spin-1 AFH model in both the and bases, illustrating how the choice of basis can affect the accuracy and hidden unit complexity of an NQS representation. Using our spin-1 NQS, we then re-examined how to represent the AKLT state exactly. In the basis, it is known to require hidden units, yet, by changing to the basis, we construct an NQS with hidden units.
Numerical VMC calculations have indicated that, by capturing the nodal structure, either implicitly within the sampling or explicitly through the inclusion of extra hidden units, the optimization can find even more efficient constructions for the sign structure. The resulting spin-1 NQS for the AKLT state in the basis requires hidden units in total making it compact. This example raises the interesting possibility of improving the efficiency and accuracy of NQS calculations by including single-spin basis transformations to lower the hidden unit complexity.
Several important open questions remain about NQS representations. In particular, it would be instructive to build representations of classes of bosonic states using multinomial RBMs. In this case, a local Fock basis is typically employed; however, our findings suggest that it could be useful to explore a local basis that breaks the particle number symmetry when describing condensates. Moreover, the elevation of visible units from binary to multinomial raises the question of whether also using multinomial hidden units can enhance the expressiveness of NQS. This has been explored analytically for binary visible units in ref. [
54] to precisely represent certain two- and three-body interactions. The use of multinomial hidden units for numerical VMC calculations remains largely unexplored and is the subject of forthcoming work [
55] for the Bose-Hubbard model in two dimensions.