1. Introduction
Why is the physical world described by quantum theory? If we wish to sensibly address this question, we have to step beyond quantum theory and to consider it within a landscape of alternative theories. This, after all, permits us to ponder about how the world could have been different, possibly described by modifications of quantum theory. Such an endeavor forces us to leave the usual textbook formulation of quantum theory, and everything we take for granted about it, behind and to develop a more general language that also applies to alternative theories. Ideally, this language should be operational, encompassing the interactions of some observer with physical systems in a plethora of conceivable, physically-distinct worlds.
If we wish to also provide a possible answer to the above question, we then have to find physical properties of quantum theory that single it out, at least within the given landscape of alternatives. In particular, the goal should be to find an operational justification for the textbook axioms, i.e., ultimately for complex Hilbert spaces, unitary dynamics, the tensor product structure for composite systems, the Born rule, and so on. The result would be a reconstruction of quantum theory from operational axioms [1,2,3,4,5,6,7,8,9,10] and should ideally yield a better understanding of what quantum theory tells us about Nature and why it is the way it is.
In this manuscript, we shall review and summarize how the quantum formalism for arbitrarily many qubits can be reconstructed from operational rules restricting an observer’s acquisition of information about a set of observed systems [1,2]. The goal of this summary is to provide a didactic and easily accessible overview of this reconstruction. Its underlying framework is especially engineered for unraveling the architecture of quantum theory, and so many reconstruction steps are instructive for understanding the origin of quantum properties. As we shall see, this reconstruction provides a transparent, informational explanation for the structure of qubit quantum theory and especially also for its paradigmatic features, such as entanglement, monogamy and non-locality. The approach also produces novel ‘conserved informational charges’, indeed appearing in quantum theory, that turn out to characterize the unitary group and the set of pure states and which might find practical applications in quantum information.
The premise of the summarized approach is to only speak about information that the observer has access to. It is thus purely operational and survives without any ontological commitments. This approach is inspired, in part, by Rovelli’s relational quantum mechanics [11] and the Brukner–Zeilinger informational interpretation of quantum theory [12,13]; the reconstruction summarized here can be viewed as a completion of these ideas for qubit systems.
The rest of the manuscript is organized as follows. In Section 2, we review the landscape of alternative theories; in Section 3, we formulate the operational quantum axioms; in Section 4, we summarize the key steps of the reconstruction itself; and, finally, we conclude in Section 5.
2. Overview of a Landscape of Theories
We shall begin with an overview of a landscape of alternative theories, which has been developed in [1,2], to which we also refer for further details.
2.1. From Questions and Answers to Probabilities and States
Our first aim is to define a notion of a state both for a single system and an ensemble of systems.
Consider an observer O who interrogates an ensemble of (identically prepared [1]) systems $\{S_a\}_{a=1}^{n}$, coming out of a preparation device, with binary questions $Q_i$ from some set $\mathcal{Q}$. For example, in the case of quantum theory, such a question could read “is the spin of the electron up in x-direction?” This set $\mathcal{Q}$ shall only contain repeatable questions in the sense that O will receive m times the same answer whenever asking any $Q_i \in \mathcal{Q}$ m times in immediate succession to a single system $S_a$. We shall assume any $S_a$ to always give a definite answer if asked some $Q_i \in \mathcal{Q}$, which moreover is not independent of $S_a$’s preparation. Accordingly, $\mathcal{Q}$ can only contain physically-implementable questions, which are ‘answerable’ by the $S_a$, and not arbitrary logically conceivable binary questions. Furthermore, since we assume definite answers, we do not address the measurement problem. The answers to the $Q_i \in \mathcal{Q}$ given by the $S_a$ shall follow specific statistics for each way of preparing the $S_a$ (for n sufficiently large). The set of all possible answer statistics for all $Q_i \in \mathcal{Q}$ and all preparations is denoted by Σ.
O, being a good experimenter, has developed, through his experiments, a theoretical model for $\mathcal{Q}$ and Σ, which he employs to interpret the outcomes of his interrogations (and to decide whether a question is in $\mathcal{Q}$ or not). This permits O to assign, for the next $S_a$ to be interrogated, a prior probability $y_i$ that $S_a$’s answer to $Q_i \in \mathcal{Q}$ will be ‘yes’. Namely, O determines $y_i$ through a belief updating, in a broadly Bayesian spirit, according to his model of Σ, any prior information on the way of preparation and possibly the frequencies of ‘yes’ answers to questions from $\mathcal{Q}$, which he may have recorded in previous interrogation runs on systems identically prepared to $S_a$. (We add “broadly” here as we also consider the typical laboratory situation of an ensemble of systems.) In particular, O may also not have carried out previous interrogations on systems identically prepared to $S_a$ (e.g., if the ensemble contains only the single $S_a$), in which case he will estimate the prior $y_i$ for the single $S_a$ solely according to his model of Σ and any prior information about the preparation (more on this and on update rules will be discussed in Section 2.3 and Section 2.4).
While $\mathcal{Q}$ need not necessarily contain all binary measurements that O could, in principle, perform on the $S_a$, we shall assume that $\mathcal{Q}$ is ‘tomographically complete’ in the sense that the $\{y_i\}$ are sufficient to compute the probabilities for all other physically realizable measurements possibly not contained in $\mathcal{Q}$ as well. Hence, the $\{y_i\}$ encode everything O could possibly say about the future outcomes of arbitrary experiments on the $S_a$ in his laboratory. It will therefore be sufficient to henceforth restrict O to acquire information about the $S_a$ solely through the $Q_i \in \mathcal{Q}$. It is also natural to identify O’s ‘catalog of knowledge’ about the given $S_a$, i.e., the collection of $\{y_i\}$, with the state of $S_a$ relative to O. This is a state of information and an element of Σ. Conversely, any element in Σ assigns a probability $y_i$ to all $Q_i \in \mathcal{Q}$. Thus, we identify Σ with the state space of $S_a$.
The state $\{y_i\}$ is the prior state for the single $S_a$ to be interrogated next, but also coincides with the state O assigns to the ensemble $\{S_a\}_{a=1}^{n}$ (which may only contain a single member), given that its members are identically prepared [1].
2.2. Time Evolution of O’s “Catalog of Knowledge”
We permit O to subject the $S_a$ to interactions, which cause a state at time $t_0$ to evolve in time to another legitimate state. Any permitted time evolution shall be temporally translation invariant, thus defining a one-parameter map from Σ to itself, which only depends on the time interval $\Delta t = t - t_0$, but not on $t_0$. We denote by $\mathcal{T}$ the set of all time evolutions to which we allow O to expose the $S_a$.
Clearly, $\mathcal{T}$ is a further crucial ingredient of O’s world model; his model for describing his interrogations with the $S_a$ is thus encoded in the triple $(\mathcal{Q}, \Sigma, \mathcal{T})$.
2.3. Convexity and State of No Information
It will be our challenge to unravel what O’s world model is. This requires us to subject the triple $(\mathcal{Q}, \Sigma, \mathcal{T})$ to a number of further operational conditions that are ‘natural’ in the context of information acquisition in a broadly Bayesian spirit. Upon imposing the quantum postulates, this will turn out to restrict $\mathcal{Q}$ and $\mathcal{T}$ to incorporate only a ‘natural’ subset of all possible quantum measurements and time evolutions, namely projective binary measurements and unitaries, respectively (rather than arbitrary positive operator-valued measures (POVMs) and completely positive maps). However, this suffices for our purpose of reconstructing the textbook quantum formalism.
To account for the possibility of randomness in the method of preparation, we assume Σ to be convex. Consider a collection of identical systems (i.e., with identical $(\mathcal{Q}, \Sigma, \mathcal{T})$) that are not necessarily in identical states and for which O uses a cascade of biased coin tosses to decide which system to interrogate. Then O is able to assign a single prior state to this collection, which is a convex combination of their individual states.
Next, we assume the existence of a special method of preparation, which generates completely random answer statistics over all $Q_i \in \mathcal{Q}$. This preparation is described by a special state in Σ, namely $y_i = \frac{1}{2}$, $\forall\, Q_i \in \mathcal{Q}$, and shall be called the state of no information. This distinguished state is a constraint on the pair $(\mathcal{Q}, \Sigma)$. (E.g., in quantum theory, the pair (binary POVMs, density matrices) does not satisfy this condition because there exist inherently biased POVMs, while the pair (projective Pauli operator measurements, density matrices) does.) It plays two crucial roles: it defines (1) the prior state of $S_a$ that O will start with in a Bayesian updating when he has no ‘prior information’ about the $S_a$ (except what his model $(\mathcal{Q}, \Sigma, \mathcal{T})$ is); and (2) an unambiguous notion of the (in-)dependence of questions (cf. Section 2.4), which otherwise would be state dependent. (E.g., in quantum theory, the questions $Q_{x_1}$ = “Is the spin of Qubit 1 up in x-direction?” and $Q_{x_2}$ = “Is the spin of Qubit 2 up in x-direction?” are independent relative to the completely mixed state, but not relative to a state with entanglement in x-direction.)
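The state dependence of (in-)dependence mentioned in the parenthetical example can be checked numerically. The following is a minimal sketch, assuming the standard quantum representation (Pauli projectors for the two questions, the Born rule, and a projective posterior update); the helper names `prob_yes` and `update_on_yes` are illustrative and not part of the reconstruction itself.

```python
import numpy as np

# Pauli X and the 2x2 identity
X = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# Projectors for "spin up in x-direction" on Qubit 1 and on Qubit 2
P1 = (np.kron(I2, I2) + np.kron(X, I2)) / 2
P2 = (np.kron(I2, I2) + np.kron(I2, X)) / 2

def prob_yes(rho, P):
    """Born-rule probability of a 'yes' answer in state rho."""
    return float(np.real(np.trace(rho @ P)))

def update_on_yes(rho, P):
    """Posterior state after a 'yes' answer (projective update, assumed here)."""
    post = P @ rho @ P
    return post / np.trace(post)

# State of no information: the completely mixed two-qubit state
rho_mixed = np.eye(4, dtype=complex) / 4

# A state with entanglement in x-direction: (|00> + |11>)/sqrt(2)
phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_ent = np.outer(phi, phi.conj())

# Probability of 'yes' to the Qubit 2 question after a 'yes' to the Qubit 1 question
p_mixed = prob_yes(update_on_yes(rho_mixed, P1), P2)
p_ent = prob_yes(update_on_yes(rho_ent, P1), P2)
prior_ent = prob_yes(rho_ent, P2)

print(p_mixed, prior_ent, p_ent)
```

Relative to the completely mixed state, the answer to the first question leaves the probability for the second at one half; relative to the entangled state, the same answer moves it from one half to one.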
2.4. State Updating and (In)Dependence and Compatibility of Questions
There are two kinds of state update rules, one for the state of the ensemble $\{S_a\}_{a=1}^{n}$ (which coincides with the prior state assigned to the next $S_a$ to be interrogated) and one for the posterior state of a given ensemble member $S_a$. In a single shot interrogation, O receives a single $S_a$, assigns a prior state to it according to his prior information (cf. Section 2.1), interrogates it with some questions from $\mathcal{Q}$ (without intermediate re-preparation) and, depending on the answers, updates the prior to a posterior state valid for this specific $S_a$ only. This requires a consistent posterior state update rule, which permits O to update the probabilities $y_i$ for all $Q_i \in \mathcal{Q}$ in a manner that respects the structure of Σ and the repeatability of questions (i.e., an answer $Q_i$ = ‘yes’ or ‘no’ must have a posterior $y_i = 1$ or 0 as a consequence, respectively). This is also a belief updating, but about the single $S_a$, and is not the same as in Section 2.1 and Section 2.3. Specifically, the posterior state of $S_a$ may differ significantly from its prior state if O has experienced an information gain on at least some $Q_i$ (this will necessarily happen when complementary questions are involved; see below). This is the ‘collapse’ of the state: it is merely O’s update of information about the specific $S_a$ [1].
By contrast, in a multiple shot interrogation, O carries out a single shot interrogation on each member of an entire (identically prepared [1]) ensemble $\{S_a\}_{a=1}^{n}$ to do ensemble state tomography and estimate the state of the ensemble from his prior information about the preparation and the collection of posterior states from the single shot interrogations. With every further interrogated $S_a$, O updates the ensemble state, which coincides with the prior state of the next system from the ensemble to be interrogated. Accordingly, this requires a prior state update rule. This is the belief updating alluded to in Section 2.1 and Section 2.3 about the ensemble $\{S_a\}_{a=1}^{n}$.
It will not be necessary to specify these two update rules in detail; we just assume O uses consistent ones. Specifically, given a posterior state update rule, we shall call $Q_i, Q_j \in \mathcal{Q}$:
(maximally) independent | if, after having asked $Q_i$ to S in the state of no information, the posterior probability $y_j = \frac{1}{2}$. That is, if the answer to $Q_i$ relative to the state of no information tells O ‘nothing’ about the answer to $Q_j$.
dependent | if, after having asked $Q_i$ to S in the state of no information, the posterior probability $y_j \neq \frac{1}{2}$ (if $y_j = 0$ or 1, they are maximally dependent). That is, if the answer to $Q_i$ relative to the state of no information gives O at least partial information about the answer to $Q_j$.
(maximally) compatible | if O may know the answers to both simultaneously, i.e., if there exists a state in Σ such that $y_i, y_j$ can be simultaneously zero or one.
(maximally) complementary | if every state in Σ, which features $y_i = 0, 1$, necessarily implies $y_j = \frac{1}{2}$. Notice that complementarity implies independence (but not vice versa).
(One can also define partial compatibility similarly [1].) These relations shall be symmetric; e.g., $Q_i$ is independent of $Q_j$ if and only if $Q_j$ is independent of $Q_i$, etc.
We impose a final condition on the posterior state update rule: if $Q_i, Q_j$ are maximally compatible and independent, then asking $Q_i$ shall not change $y_j$, i.e., O’s information about $Q_j$.
2.5. Informational Completeness
The fundamental building blocks of the theories in the landscape that we are constructing are to be sets of pairwise independent questions. This will help to render the convoluted parametrization of a state by $\{y_i\}_{\forall Q_i \in \mathcal{Q}}$ more economical. Consider a set of pairwise independent questions $\mathcal{Q}_M = \{Q_1, \ldots, Q_D\}$; it is called maximal if no question from $\mathcal{Q} \setminus \mathcal{Q}_M$ can be added to $\mathcal{Q}_M$ without destroying the pairwise independence of its elements. We shall assume that any maximal $\mathcal{Q}_M$ is informationally complete in the sense that all $\{y_i\}_{\forall Q_i \in \mathcal{Q}}$ can be computed from the corresponding probabilities $\{y_i\}_{i=1}^{D}$ for all states in Σ. Any such $\mathcal{Q}_M$ features D elements [1], such that Σ becomes a D-dimensional convex set and states become vectors:
$$\vec{y} = (y_1, y_2, \ldots, y_D)^{T}. \qquad (1)$$
2.6. Information Measure
Our focus is O’s acquisition of information, so we need to quantify O’s information about the systems. Since $Q_i$ is binary, we quantify O’s information about $S_a$’s answer to it by a function $\alpha(y_i)$ with $0 \le \alpha(y_i) \le 1$ bit and $\alpha(y) = 0$ bit ⇔ $y = \frac{1}{2}$ and $\alpha(0) = \alpha(1) = 1$ bit. O’s total information about a $S_a$ must be a function of the state; we make an additive ansatz:
$$I(\vec{y}) := \sum_{i=1}^{D} \alpha(y_i). \qquad (2)$$
The quantum postulates will single out the specific function α.
Consider a set $\{Q_1, \ldots, Q_n\}$ of mutually (maximally) complementary questions. It is clear that whenever O has maximal information $\alpha(y_i) = 1$ bit about $Q_i$ from this set, he must have zero bits of information about all other questions in the set. We require more generally that such a set cannot support more than one bit of information, regardless of the state:
$$\alpha(y_1) + \cdots + \alpha(y_n) \le 1 \ \text{bit}, \qquad (3)$$
for otherwise O could, for some states, reduce his total information about such a set by asking another question from it. These complementarity inequalities represent informational uncertainty relations that describe how the information gain about one question enforces an information loss about questions complementary to it (see also the state ‘collapse’ in Section 2.4).
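As an illustration of such a complementarity inequality, the following sketch checks it for a single qubit and the three mutually complementary spin questions along x, y and z. It assumes, purely for illustration, the quadratic measure $\alpha(y) = (2y - 1)^2$ (which satisfies the conditions stated above on α) and the usual Bloch-ball parametrization of qubit states; neither is derived at this point of the text.

```python
import numpy as np

def alpha(y):
    """Quadratic information measure (an assumption here):
    0 bit at y = 1/2 and 1 bit at y = 0 or y = 1."""
    return (2.0 * y - 1.0) ** 2

def random_bloch_vector(rng):
    """Sample a point uniformly from the unit ball (a valid qubit state)."""
    v = rng.normal(size=3)
    v /= np.linalg.norm(v)
    return v * rng.uniform() ** (1.0 / 3.0)

rng = np.random.default_rng(0)

totals = []
for _ in range(1000):
    r = random_bloch_vector(rng)
    y = (1.0 + r) / 2.0            # 'yes' probabilities for spin up in x, y, z
    totals.append(float(alpha(y).sum()))

max_total = max(totals)

# Full information about the x-question leaves nothing for the other two
y_known_x = np.array([1.0, 0.5, 0.5])
total_known_x = float(alpha(y_known_x).sum())

print(max_total <= 1.0, total_known_x)
```

With this choice of α the sum equals the squared length of the Bloch vector, so the bound of one bit is saturated exactly by the pure states.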
2.7. Composite Systems and (Classical) Rules of Inference
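O must be able to decompose a composite system into its constituents purely by means of the information accessible to him through interrogation and thus ultimately by means of the question sets. Let systems $S_A, S_B$ have question sets $\mathcal{Q}_A, \mathcal{Q}_B$. It is then natural to say that they define a composite system $S_{AB}$ if any $Q_a \in \mathcal{Q}_A$ is maximally compatible with any $Q_b \in \mathcal{Q}_B$ and if:
$$\mathcal{Q}_{AB} = \mathcal{Q}_A \cup \mathcal{Q}_B \cup \tilde{\mathcal{Q}}_{AB},$$
where $\tilde{\mathcal{Q}}_{AB}$ only contains composite questions, which are iterative compositions, $Q_a *_1 Q_b$, $Q_a *_2 (Q_{a'} *_3 Q_b)$, ..., via some logical connectives $*_1, *_2, *_3, \ldots$, of individual questions $Q_a, Q_{a'}, \ldots \in \mathcal{Q}_A$ about $S_A$ and $Q_b, \ldots \in \mathcal{Q}_B$ about $S_B$. This definition is extended recursively to composite systems with more than two subsystems.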
Since O can never test the truthfulness of statements about the logical connectives of complementary questions through interrogations and since all propositions must have operational meaning, we shall permit O to logically connect two (possibly composite) questions directly with some * only if they are compatible. For the same reason, O is allowed to apply classical rules of inference (in terms of Boolean logic) exclusively to sets of mutually-compatible questions.
We stress that this definition of composite systems is distinct from the usual state tensor product rule in generalized probabilistic theories coming from local tomography [3,4,5]. In particular, this composition rule admits non-locally tomographic composites (see Section 4.3).
2.8. Computing Probabilities and Questions as Vectors
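Thanks to informational completeness, the probability function $Y(Q|\vec{y})$ that $Q$ = ‘yes’, given the state $\vec{y}$, exists for all $Q \in \mathcal{Q}$ and $\vec{y} \in \Sigma$. As shown in [2], the exhibited structure yields:
$$Y(Q|\vec{y}) = \frac{1}{2}\left(\vec{q} \cdot (2\vec{y} - \vec{1}) + 1\right), \qquad (4)$$
where $\vec{q}$ is a question vector encoding $Q$ and $\vec{1}$ is a vector with each coefficient equal to one in the basis corresponding to $\mathcal{Q}_M$. This equation gives rise to (part of) the Born rule.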
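For a single qubit, the link to the Born rule can be made concrete. The sketch below assumes the affine form $Y(Q|\vec{y}) = \frac{1}{2}(\vec{q} \cdot (2\vec{y} - \vec{1}) + 1)$ for the probability function, takes the three spin questions along x, y, z as the informationally complete set, and identifies $2\vec{y} - \vec{1}$ with the Bloch vector; the function names are illustrative only.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [sx, sy, sz]

def Y(q, y):
    """Affine probability function (assumed form): 'yes' probability of the
    question encoded by q in the state y."""
    return 0.5 * (np.dot(q, 2.0 * y - 1.0) + 1.0)

def density_matrix(y):
    """Density matrix whose Bloch vector is r = 2y - 1."""
    r = 2.0 * y - 1.0
    return 0.5 * (np.eye(2) + sum(ri * s for ri, s in zip(r, paulis)))

def projector(q):
    """Rank-one projector onto spin up along the unit vector q."""
    return 0.5 * (np.eye(2) + sum(qi * s for qi, s in zip(q, paulis)))

y = np.array([0.8, 0.5, 0.35])   # a state: 'yes' probabilities for x, y, z
q = np.array([0.0, 0.6, 0.8])    # a question vector of unit length

born = float(np.real(np.trace(density_matrix(y) @ projector(q))))
affine = float(Y(q, y))

print(affine, born)
```

Both numbers agree, and choosing q as a basis vector simply returns the corresponding component of the state vector.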
Suppose $Q, Q' \in \mathcal{Q}$ were both encoded by the same $\vec{q}$. Then, by (4), they would be probabilistically indistinguishable, and O must view them as logically equivalent. O is free to remove any such redundancy from his description of $\mathcal{Q}$, upon which every permissible question vector $\vec{q}$ will encode a unique $Q \in \mathcal{Q}$. Finally, for every $Q \in \mathcal{Q}$, there exists a state $\vec{y}_Q$, which is the updated posterior state of $S_a$ after O received a ‘yes’ answer to the single question Q from $S_a$ in the (prior) state of no information. O had zero bits of information before, and $\vec{y}_Q$ encodes a single independent question answer, so we naturally require that it encodes one independent bit. Hence, for every $Q \in \mathcal{Q}$, there exists $\vec{y}_Q \in \Sigma$ with $I(\vec{y}_Q) = 1$ bit, such that $Y(Q|\vec{y}_Q) = 1$. (In quantum theory, the $\vec{y}_Q$ will only turn out to be pure states for a single qubit; e.g., for two qubits and $Q_{z_1}$ = ‘Is the spin of Qubit 1 up in z-direction?’, represented by the rank-two projector $P_{z_1} = \frac{1}{2}(\mathbb{1} + \sigma_z \otimes \mathbb{1})$, $\vec{y}_{Q_{z_1}}$ corresponds to the mixed state $\rho_{z_1} = \frac{1}{2} P_{z_1}$. Clearly, $\mathrm{tr}(\rho_{z_1} P_{z_1}) = 1$.)
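The two-qubit example in the parenthesis can be verified directly. This is a sketch under stated assumptions: the 15 non-trivial two-qubit Pauli strings are taken as the informationally complete question set, the posterior update is projective, and the information measure is the quadratic one, $\alpha(y) = (2y - 1)^2$; none of these choices is derived in this section.

```python
import numpy as np
from itertools import product

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# The 15 non-trivial two-qubit Pauli strings (identity x identity is dropped)
strings = [np.kron(a, b) for a, b in product([I2, sx, sy, sz], repeat=2)][1:]

# Projector for 'Is the spin of Qubit 1 up in z-direction?'
P = (np.kron(I2, I2) + np.kron(sz, I2)) / 2

# Posterior state after a 'yes' answer, starting from the state of no information
rho_mixed = np.eye(4, dtype=complex) / 4
post = P @ rho_mixed @ P
rho_Q = post / np.trace(post)

rank = int(np.linalg.matrix_rank(P))
purity = float(np.real(np.trace(rho_Q @ rho_Q)))
p_yes = float(np.real(np.trace(rho_Q @ P)))

# 'Yes' probabilities for all 15 questions and the total (quadratic) information
y = np.array([(1.0 + np.real(np.trace(rho_Q @ s))) / 2.0 for s in strings])
info = float(np.sum((2.0 * y - 1.0) ** 2))

print(rank, purity, p_yes, info)
```

The posterior state is mixed (purity one half), answers the question with certainty, and carries one bit in total, all of it in the single question that was asked.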
3. The Quantum Principles as Rules Constraining O’s Information Acquisition
In the sequel, we consider the most elementary of information carriers. Within the introduced landscape of theories, we now establish rules on O’s acquisition of information that single out the quantum theory of a composite system $S_N$ of $N \in \mathbb{N}$ qubits, modeled in our language by a triple $(\mathcal{Q}_N, \Sigma_N, \mathcal{T}_N)$. Effectively, these rules constitute a set of ‘coordinates’ for quantum theory on this landscape. The rules are spelled out first colloquially, then mathematically and are motivated in more detail in [1,2].
Empirically, the information accessible to an experimenter about (characteristic properties of) elementary systems is limited. For example, an experimenter may know one binary proposition about an electron (e.g., its spin in x-direction), but nothing fully independent of it (and similarly for a classical bit). We shall characterize a composition of N elementary systems according to how much information is, in principle, simultaneously available to O.
Rule 1. (Limited information) “The observer O can acquire maximally N independent bits of information about the system $S_N$ at any moment of time.”
There exists a maximal set $Q_i$, $i = 1, \ldots, N$, of N mutually maximally independent and compatible questions in $\mathcal{Q}_N$.
O can thereby distinguish maximally $2^N$ states of $S_N$ in a single shot interrogation.
However, empirically, elementary systems admit more independent propositions than what, due to the information limit, they are able to answer at a time. This is Bohr’s complementarity. The unanswered properties must be random (and so ‘in superposition’) because the information limit makes it impossible to ascribe definite outcomes to them. For example, an experimenter may also inquire about the spin of the electron in y-direction. Yet doing so is at the total expense of his information about its spin in the x- and z-directions, and subsequent such measurements have random outcomes. For the N elementary systems, we assert the existence of complementarity.
Rule 2. (Complementarity) “The observer O can always get up to N new independent bits of information about the system $S_N$. However, whenever O asks a new question, he experiences no net loss in his total amount of information about $S_N$.”
There exists another maximal set $Q'_i$, $i = 1, \ldots, N$, of N mutually maximally independent and compatible questions in $\mathcal{Q}_N$, such that $Q'_i, Q_i$ are maximally complementary and $Q'_i, Q_{j \neq i}$ are maximally compatible.
The peculiar mathematical form of Rule 2 becomes intuitive upon recalling that $S_N$ is a composite system, such that complementarity should exist per elementary system [1].
Rules 1 and 2 are conceptually inspired by (non-technical) proposals made by Rovelli [11] and Zeilinger and Brukner [12,13]. These rules say nothing about what happens in-between interrogations. Naturally, we demand that O neither gains nor loses information without asking questions.
Rule 3. (Information preservation) “The total amount of information O has about (an otherwise non-interacting) $S_N$ is preserved in-between interrogations.”
$I(\vec{y}(t))$ is constant in time in-between interrogations for (an otherwise non-interacting) $S_N$.
Hence, O’s total information is a ‘conserved charge’ of any time evolution $T \in \mathcal{T}_N$.
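To see what such a conserved charge looks like in ordinary quantum theory, the sketch below evaluates a total information for two qubits before and after a random unitary. It assumes the quadratic measure $\alpha(y) = (2y - 1)^2$ summed over the 15 non-trivial two-qubit Pauli questions, with $y_i = \frac{1}{2}(1 + \mathrm{tr}(\rho\,\sigma_i))$; these are illustrative assumptions, not results established at this point of the text.

```python
import numpy as np
from itertools import product

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# The 15 non-trivial two-qubit Pauli strings (identity x identity is dropped)
strings = [np.kron(a, b) for a, b in product([I2, sx, sy, sz], repeat=2)][1:]

def total_information(rho):
    """Sum of (2 y_i - 1)^2 with y_i = (1 + tr(rho sigma_i)) / 2,
    i.e. the sum of squared Pauli expectation values."""
    return float(sum(np.real(np.trace(rho @ s)) ** 2 for s in strings))

def random_unitary(dim, rng):
    """Unitary factor of the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q

def random_density_matrix(dim, rng):
    """Random positive matrix with unit trace."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho)

rng = np.random.default_rng(1)
rho = random_density_matrix(4, rng)
U = random_unitary(4, rng)
rho_evolved = U @ rho @ U.conj().T

info_before = total_information(rho)
info_after = total_information(rho_evolved)
purity = float(np.real(np.trace(rho @ rho)))

print(info_before, info_after, 4 * purity - 1)
```

Under these assumptions the total information equals $4\,\mathrm{tr}(\rho^2) - 1$ for two qubits, so its conservation is the familiar invariance of the purity under unitary evolution.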
The more interactions to which O may subject $S_N$ are available, the more ways in which any state may, in principle, change in time and, thus, the more ‘interesting’ O’s world. We therefore demand that any time evolution is physically realizable as long as it is consistent with the other rules (since $\Sigma_N$ and $\mathcal{T}_N$ are interdependent, this is distinct from ‘maximizing the number’ of states).
Rule 4. (Time evolution) “O’s ‘catalog of knowledge’ about $S_N$ evolves continuously in time in-between interrogations, and every consistent such evolution is physically realizable.”
$\mathcal{T}_N$ is the maximal set of transformations $T_{\Delta t}$ on states such that, for any fixed state $\vec{y}$, $T_{\Delta t}(\vec{y})$ is continuous in $\Delta t$ and compatible with Rules 1–3 (and the structure of the theory landscape).
(If we did not require this ‘maximality’ of $\mathcal{T}_N$, we would still ultimately obtain a linear, unitary evolution, but not necessarily the full unitary group. This is the sole reason for demanding ‘maximality’. Note that Rules 3 and 4 are not equivalent to the axiom of ‘continuous reversibility’ of generalized probabilistic theories [3,4,5].)
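We shall also allow O to ask any question to $S_N$ which ‘makes (probabilistic) sense’.
Rule 5. (Question unrestrictedness) “Every question that yields legitimate probabilities for every way of preparing $S_N$ is physically realizable by O.”
Every question vector $\vec{q}$ that satisfies $Y(Q|\vec{y}) \in [0, 1]$ $\forall\, \vec{y} \in \Sigma_N$ and for which there exists $\vec{y}_Q \in \Sigma_N$ with $I(\vec{y}_Q) = 1$ bit, such that $Y(Q|\vec{y}_Q) = 1$, corresponds to a $Q \in \mathcal{Q}_N$.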
(Without Rule 5, we would still obtain the structure of an informationally complete set $\mathcal{Q}_M$, finding that it encodes a basis of projective Pauli operator measurements [2]; Rule 5 legalizes all such measurements.)
These five rules turn out to leave two solutions for the triple $(\mathcal{Q}_N, \Sigma_N, \mathcal{T}_N)$. Remarkably, they cannot distinguish between complex and real numbers. Namely, the two solutions are qubit and rebit quantum theory, the latter describing two-level systems over real Hilbert spaces [1,2]. Since the latter is both mathematically and physically a subcase of the former, these five rules can be regarded as sufficient. However, if one also wishes to rule out rebits operationally, then an extra rule, adapted from [3,4,5] and imposed solely for this purpose (it is partially redundant), succeeds.
Rule 6. (Tomographic locality) “O can determine the state of the composite system by interrogating only its subsystems.”
As shown in [1,2], Rules 1–6 are equivalent to the textbook axioms. More precisely:
Claim. The only solution to Rules 1–6 is qubit quantum theory, where:
$\Sigma_N$ is the space of density matrices over $\mathbb{C}^{2^N}$,
states evolve unitarily according to $\mathcal{T}_N \simeq \mathrm{PSU}(2^N)$, and the equation describing the state dynamics is (equivalent to) the von Neumann evolution equation,
$\mathcal{Q}_N$ is (isomorphic to) the set of projective measurements onto the eigenspaces of N-qubit Pauli operators (a Hermitian operator on $\mathbb{C}^{2^N}$ is a Pauli operator iff it has two eigenvalues $\pm 1$ of equal multiplicity), and the probability for $Q \in \mathcal{Q}_N$ to be answered with ‘yes’ in some state is given by the Born rule for projective measurements.
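The characterization of Pauli operators used in the claim can be illustrated for N = 2. The sketch below assumes eigenvalues ±1 of equal multiplicity as the defining property and uses the Born rule for the projector onto the +1 eigenspace; the helper `is_pauli_operator` is a hypothetical name introduced only for this check.

```python
import numpy as np
from itertools import product

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# The 15 non-trivial two-qubit Pauli strings (identity x identity is dropped)
strings = [np.kron(a, b) for a, b in product([I2, sx, sy, sz], repeat=2)][1:]

def is_pauli_operator(A, tol=1e-9):
    """Hermitian with eigenvalues +1 and -1 of equal multiplicity."""
    if not np.allclose(A, A.conj().T, atol=tol):
        return False
    ev = np.linalg.eigvalsh(A)
    plus = int(np.sum(np.abs(ev - 1.0) < tol))
    minus = int(np.sum(np.abs(ev + 1.0) < tol))
    return plus == minus and plus + minus == len(ev)

def random_unitary(dim, rng):
    """Unitary factor of the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q

all_strings_ok = all(is_pauli_operator(s) for s in strings)
identity_ok = is_pauli_operator(np.eye(4, dtype=complex))

# Any unitary rotation of a Pauli string is again a Pauli operator in this sense
rng = np.random.default_rng(2)
U = random_unitary(4, rng)
rotated = U @ np.kron(sz, I2) @ U.conj().T
rotated_ok = is_pauli_operator(rotated)

# Born rule for the projector onto the +1 eigenspace, in the completely mixed state
P = (np.eye(4) + rotated) / 2
rho_mixed = np.eye(4, dtype=complex) / 4
p_mixed = float(np.real(np.trace(rho_mixed @ P)))

print(len(strings), all_strings_ok, identity_ok, rotated_ok, p_mixed)
```

The equal-multiplicity condition is what makes every such question unbiased in the completely mixed state, in line with the requirement of a state of no information.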