1. Introduction
Building a large-scale quantum computer remains challenging, and many problems remain to be solved. For example, the performance of error correction depends strongly on how coherent the noise process is [1], and experimenters need to improve the quantum computing system through analysis of the physical noise [2]. When we prepare an imperfect quantum computing system, it is important to specify the noise based on a suitable error model. The process of understanding the physical error model for a prepared system corresponds to obtaining a certain amount of information on the system. Keeping this in mind, in the present paper we consider how to evaluate such information without any entropic concept.
An overly simple example is given by a pure rotation error model [2] with a parameter range. Let the ideal qubit state be
and the error model be
, where
and
denotes the parameter to be specified. Then the parameter range reflects the information we have on the system. We have more information when we know
than when we know
.
However, what if we consider a more complicated situation such as
=
, where
, with the parameter range of, say,
and
? How do we compare the range with the parameter range
and
? To simplify the problem, let us adopt discrete models. We consider two situations: First, the qubit is described by one of the candidate pure states,
, where
and
. Second, it is described by one of the candidate pure states
. Then, which state’s information is greater? Or, equivalently, which uncertainty is larger? We will give a definite answer in the present article (see Section 8).
Let us describe our problem in a slightly more formal way before going into detail. Let the quantum system be described in a separable Hilbert space and a subset of pure states be given. We call the subset a (pure-state) model. Suppose we know that the quantum system is described by one of the pure states in the model. Then we evaluate the amount of information obtained by knowing the model.
We write the model as a countable set for simplicity, but a model might consist of an uncountable set of pure states, e.g., the above parametric pure state
. Another continuous model would be a wavefunction with a certain continuous parameter
(see, e.g., Holevo [3]).
Our problem is closely related to so-called quantum estimation, but the above type of problem has not yet been investigated. In quantum state estimation [4,5,6,7,8,9] or quantum state discrimination [6,10,11,12,13], for a given model, we find an optimal quantum measurement to extract information on the quantum state and choose the true state in the model from the observation. This has been a typical problem and has been investigated by many authors. In quantum information experiments, quantum tomography has also been discussed [14,15,16,17]. As far as the author knows, these studies do not address the comparison of several models in terms of information quantity.
In our setting, we focus on the information that we obtain before preparation for a measurement. As we see later, we clearly obtain a certain amount of information other than the dimension of the Hilbert space.
We consider only pure state models so that we can neglect classical fluctuation. As we shall see later, there is no classical counterpart to this information. In other words, we calibrate so that the classical information becomes zero. If a positive amount of information remains under this calibration, we expect it to reflect truly quantum information. Lacking a proper name for this information, we tentatively call it model information or the information of the model. It could serve as an alternative to the usual entropy.
In the next section, we provide a rough idea of how to define model information and present pure state models as examples. Then, we formulate a pure state model and define its representative quantum state in a rigorous manner. In Section 4, we describe the equivalence between the problem of finding the representative quantum state and that of finding the minimax facility location on the sphere in operations research. In Section 5, we introduce the purely quantum information of a model and calculate it in several examples. We also describe the relationship of entropic concepts to our result and the extension to infinite-dimensional Hilbert spaces in Section 7. Finally, concluding remarks are given in Section 8.
2. Rough Idea on Defining Model Information
2.1. Preliminary Considerations
In this section, we describe a rough idea of how we evaluate model information. First, we recall classical information theory. Suppose that Alice picks a three-letter word , where and we set . If Bob knows , Bob does not feel that he obtains much information. However, if , then Bob feels that he obtains more information on the word Alice picks.
The above situation corresponds to a commutative case in quantum theory. Keeping this in mind, let us consider model information in the quantum system. We assume that Bob already knows that the quantum system is described in a d-dimensional Hilbert space. Since information quantity is a relative concept, let us compare two models. Let the first model consist of a d-dimensional orthonormal basis, i.e., , and let the second model consist of (). At least we can say that the second model gives Bob more significant information than the first, because the quantum state then lies in a proper subspace.
Now we tackle the case where some quantum states are nonorthogonal. For simplicity, we set
and consider the following models:
where
,
. In explicit calculations, we set
.
Suppose that we know that the quantum state is one of the candidate states in (hereinafter, we write for simplicity). Perhaps we agree that the information is more than and . Then, which is more informative, or ? Both models consist of three nonorthogonal state vectors. Likewise, which is more informative, or ? In the present article, we consider how to quantitatively evaluate the information obtained when Bob knows that the quantum state belongs to a model .
2.2. Full Rank Condition
In order to avoid technical difficulties, we make one important assumption here. Let us define the rank of a model as
We assume that the rank of a model is equal to the dimension of the Hilbert space, i.e., . We call this the full rank condition. The full rank condition implies that there exists no complementary subspace orthogonal to every state vector in the model .
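As a concrete illustration, the full rank condition can be checked numerically by stacking the state vectors into a matrix and computing its rank. A minimal sketch (the two-state qubit model below is a hypothetical example, not one of the models in the text):

```python
import numpy as np

def model_rank(states):
    """Rank of a model: the dimension of the subspace spanned by its state vectors."""
    return np.linalg.matrix_rank(np.column_stack(states))

# Hypothetical qubit model: two nonorthogonal states that still span C^2,
# so no subspace is orthogonal to every state in the model.
ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)

print(model_rank([ket0, ket_plus]))  # 2: the full rank condition holds
```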
2.3. Rough Idea on Defining Model Information
Under the full rank condition, we consider the case where a model has considerable information on the quantum system. Suppose that we are given the following model:
(
). While this satisfies the full rank condition, clearly all candidate quantum states are approximately in the same direction as
.
Then, the quantum system is approximately described by a representative state vector . When but , the model information is expected to increase.
From the above discussion, we find that the information quantity associated with is completely different from the number of elements, . Rather, a certain scale or a size of the model should be included in the definition of the model information.
Along the lines of the above rough idea, we discuss in the next section:
- (a)
How to determine a representative state vector for a given model ;
- (b)
One definition of the model information;
- (c)
The relationship with the concept of entropy.
We emphasize that none of these has a classical counterpart, and thus they might be difficult to grasp at first. Before going into detail, we give an overview of each item here.
For (a), we consider the maximin overlap between quantum states and define the representative quantum state of a model. Mathematically speaking, this is a variant of the facility location problem on the unit sphere [18,19], which appears in operations research, where many authors have developed algorithms for it. In particular, finding the minimax solution is our concern. For a finite model ( ), we present a naive algorithm that finds the representative quantum state of a model using this consideration.
In order to consider item (b), we introduce an imaginary two-person game called the quantum detection game. Bob benefits from the information of a given model to obtain a higher score than Alice. The value of the game, which is determined by a least favorable prior [20] in this game, defines an information quantity related to the model .
In (c), we compare our method with the formal analogue based on the von Neumann entropy. Later we will see that the newly proposed information quantity is related to the minimum entropy [21,22] rather than to the von Neumann entropy.
3. Basic Definitions
3.1. Definition of Pure State Models and Assumptions
In the present paper, let be a d-dimensional Hilbert space (d could be ∞). We call a finite-dimensional parametric family of quantum pure states a quantum statistical model of pure states, or briefly a (pure state) model. Note that . Basically, the parameter set is a compact subset of a finite-dimensional Euclidean space.
We assume the following two conditions:
- (1)
Identifiability: . Conventionally, we only consider quantum states up to the (global) phase factor below, and we often identify a pure state with a density operator .
- (2)
Continuity: For every sequence
and
,
holds. (
denotes the operator norm, i.e.,
).
For simplicity, we often consider a finite set of parameters, . Then, is denoted by . We often call it a discrete model, which is written as .
3.2. Preliminary Results
In the present paper, we introduce the information of a model . Although the formal definition is given in Section 5, we need several concepts to understand it analytically and geometrically.
In this section, we introduce the most fundamental concept, the representative quantum state of a model. We shall give a rough idea for when and . Specifically, we set with . When two quantum states are close to each other, , it is natural to consider that a representative quantum state of model should be a “midpoint between two quantum states”. We often identify the state vector with the point on the whole pure states specified by the vector.
Mathematically, we may try to define the point as the point
equidistant between
and
such that
holds.
However, the above equidistance condition does not determine the point
generally. Thus, we maximize the above “overlap” under the condition (
1). Then we obtain an explicit formula for the representative point of a model,
where
satisfies
and then the maximum overlap is given by
.
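A small numerical check of this two-state picture: for two states with real, phase-aligned amplitudes and overlap c, the equidistant maximizer is the normalized sum, and the maximum overlap works out to sqrt((1 + c)/2). The angle below is a hypothetical choice, and real amplitudes are assumed throughout:

```python
import numpy as np

theta = 0.8  # hypothetical separation angle
psi1 = np.array([1.0, 0.0])
psi2 = np.array([np.cos(theta), np.sin(theta)])
c = psi1 @ psi2  # overlap between the two candidate states

# Equidistant candidate: the normalized sum (phases already aligned).
phi = (psi1 + psi2) / np.linalg.norm(psi1 + psi2)
overlap = phi @ psi1  # equals phi @ psi2 by symmetry

# Brute-force search over real qubit states finds nothing better.
best = 0.0
for t in np.linspace(0.0, np.pi, 2001):
    cand = np.array([np.cos(t), np.sin(t)])
    best = max(best, min(abs(cand @ psi1), abs(cand @ psi2)))

print(overlap, np.sqrt((1 + c) / 2))  # both ~0.921
```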
Next, we consider the case where . Let us take and introduced in the previous section:
In
, the above idea applies, i.e., we find the quantum state to maximize the overlap
Up to the global phase, we set
as
Then, we obtain an explicit solution satisfying (
3),
after lengthy but straightforward algebra (see also Section 4.1.3).
However, we find no solution satisfying Equation (
3) in
. We need a more careful treatment. First, we fix an arbitrary quantum state
and consider the set of numbers
r satisfying
,
. The condition assures that the overlap between
and an arbitrary quantum state in
is not less than
r. For each
, the maximum of
r is equal to
.
We consider that the larger the overlap gets, the more suitable becomes as a representative quantum state of the model . Thus, we maximize r as a function of .
It is convenient for explicit calculation to use the squared overlap (i.e., the fidelity), , and we regard as a representative quantum state of the model . Based on the above idea, we give a more formal definition of the representative quantum state in the next subsection.
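For a small discrete qubit model, the maximin squared overlap can be searched for by brute force over a grid on the Bloch sphere. A sketch with a hypothetical three-state model (not one of the models in the text); for two mirror-symmetric outer states, the search recovers their geodesic midpoint:

```python
import numpy as np

def min_fidelity(phi, states):
    """Worst-case squared overlap of phi with the model."""
    return min(abs(np.vdot(phi, s)) ** 2 for s in states)

def representative_state(states, n=120):
    """Brute-force maximin search over a grid on the Bloch sphere (qubits only)."""
    best_phi, best_val = None, -1.0
    for t in np.linspace(0.0, np.pi, n):
        for p in np.linspace(0.0, 2 * np.pi, 2 * n, endpoint=False):
            phi = np.array([np.cos(t / 2), np.exp(1j * p) * np.sin(t / 2)])
            val = min_fidelity(phi, states)
            if val > best_val:
                best_phi, best_val = phi, val
    return best_phi, best_val

# Hypothetical model: |0> plus two states tilted symmetrically by +-a.
a = 0.6
model = [np.array([1.0, 0.0]),
         np.array([np.cos(a), np.sin(a)]),
         np.array([np.cos(a), -np.sin(a)])]
phi, val = representative_state(model)
print(val)  # cos(a)^2 ~ 0.681: here |0> itself is the maximin point
```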
3.3. Representative Quantum State
Now we are ready to define the representative quantum state of a given model formally. We adopt the distance rather than the overlap.
Definition 1. Let a model be given. A quantum state satisfying is called a representative quantum state of the model with respect to the distance . When we emphasize the model , we write . While the terms max and min suffice for discrete models, the terms sup and inf are generally unavoidable (see Section 7). We also use a condition equivalent to (5),
In the above definition, is also interpreted as the minimax estimate in quantum estimation with no observation. Suppose that a parametric family of pure states or a countable set of pure states is given. Then we give an estimate, say , of the true quantum state without any observation. The error is evaluated by the fidelity-based quantity, . The above representative quantum state is a minimax estimate in this setting.
In the context of quantum estimation, this may seem quite strange because we do not perform any measurement. However, it is not unnatural to consider estimation with no observation. For example, in classical information theory, we infer the outcome of an information source with no observation. For a given parametric model of source code distributions , this kind of estimation corresponds to constructing a minimax code [23].
Apart from actual application, quantum estimation with no observation also makes sense theoretically. In a quantum computer, a quantum bit will be processed under a certain quantum gate with an unknown parameter, say, , during the computing process. When is uncontrollable with a range , it might be necessary to estimate the quantum bit. Since there is no reason to estimate , we need a certain formulation to estimate the quantum bit.
We should also mention why, among several candidate closeness measures, we adopt as the distance in our definition. There are two reasons. One is the operational meaning of the quantum detection game, which is explained in Section 5. The other is the following property:
Lemma 1. Let a model be given. Let f be a continuous nondecreasing function on . When we adopt the distance , the representative quantum state remains the same.
Proof. It is enough to show that for every , holds.
If Equation (6) holds true, then the statement is shown in the following way. For every , from the definition of , holds. Since f is nondecreasing, applying f to both sides and using Equation (6) yields , which implies that is a representative quantum state with respect to the distance .
Now let us show Equation (6). Let be fixed and set . For every , due to the continuity of f, there exists such that . We take such that . Then
Since is arbitrary, we obtain .
Next, observe that for every since . Taking the supremum of the RHS with respect to , we obtain the converse inequality. Thus, Equation (6) is shown, and the proof is complete. □
In Section 5, we shall define the information quantity obtained when we know , which is denoted by . Once we find , it is shown to be easy to calculate .
Now let us consider the representative quantum state of a two-state model geometrically. Recall that each pure state in a two-dimensional Hilbert space is written in the form (4). If we switch to the Bloch representation, we obtain a one-to-one correspondence with points on the unit sphere (Bloch sphere). When one pure state is set to (P), the distance between this pure state and another pure state specified with (Q) is along the shortest path on the Bloch sphere. The shortest path connecting two points P and Q on the Bloch sphere is the arc of the great circle through them. The arc is called a geodesic connecting P and Q, and the point M on the geodesic equidistant from both endpoints is called the geodesic midpoint between P and Q. The representative quantum state corresponds to the geodesic midpoint. The concept of geodesics on the Bloch sphere is often useful and has been investigated in several works [24,25,26,27].
For every pair of independent quantum states and , let us consider the two-dimensional subspace . Then each state in the subspace is regarded as a point on the Bloch sphere. Using Formula (2), we summarize the above statements.
Lemma 2. Let a model be given. Then, for every pair of quantum states and with , the geodesic midpoint is given by
where δ satisfies , the arc length α between and is given by , and the arc length between and is . Understanding the geometry of the unit sphere is very helpful in finding the representative quantum state, which is discussed in Section 4.
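The correspondence in Lemma 2 can be checked numerically for qubits: the Bloch vector of the normalized-sum midpoint coincides with the spherical midpoint of the two Bloch vectors. A sketch (hypothetical angle; real amplitudes are assumed so the phases are already aligned):

```python
import numpy as np

def bloch(psi):
    """Bloch vector (x, y, z) of a qubit pure state (a, b)."""
    a, b = psi
    ab = np.conj(a) * b
    return np.array([2 * ab.real, 2 * ab.imag, abs(a) ** 2 - abs(b) ** 2])

theta = 1.1  # hypothetical separation
psi1 = np.array([1.0, 0.0], dtype=complex)
psi2 = np.array([np.cos(theta), np.sin(theta)], dtype=complex)

# Geodesic midpoint on the Bloch sphere: normalized sum of the Bloch vectors.
m = bloch(psi1) + bloch(psi2)
m /= np.linalg.norm(m)

# State-space midpoint: normalized sum of the state vectors.
mid = (psi1 + psi2) / np.linalg.norm(psi1 + psi2)

print(np.allclose(bloch(mid), m))  # True: the two midpoints coincide
```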
3.4. Example of a Representative Quantum State:
As a slightly nontrivial example, let us focus on and find a representative quantum state. First, we focus on the submodel . Its representative quantum state, , is the geodesic midpoint of and . Using Formula (7), is obtained.
Next, we use the following lemma.
Lemma 3. Let a model and its submodel be given. If the representative quantum state of satisfies
then is also the representative quantum state of . Proof. Let be an arbitrary quantum state. Since , which implies that is also the representative quantum state of . □
It is easily seen that . Due to Lemma 3 above, (8) is also the representative quantum state of .
5. Quantum Detection Game and Model Information
We have explained how to determine a representative quantum state of a given model. Based on the state, in the present section, we define a new information quantity, model information. Geometrically, this is the maximum radius from the representative quantum state as the center.
The basic strategy to define an information quantity is to introduce a certain imaginary two-person game where one player obtains points according to the information available.
For example, in classical information theory, we consider assigning the ideal code length to each letter x when we know the code distribution . Bob’s best score is given by his guessed distribution , and he obtains the score for each letter x. Taking the average with respect to , we obtain the Kullback–Leibler information [23], which is a very fundamental quantity in information theory.
According to Tanaka [20], we consider a quantum detection game as an imaginary two-person game.
5.1. Quantum Detection Game and Definition of Purely Quantum Information
As an example, we introduce a four-dimensional pure state model, , and set between . This consists of the following four-dimensional vectors:
While , it is enough to consider each vector in a real four-dimensional vector space.
Let us explain the quantum detection game under the model . First, Alice picks one pure state from the model (i.e., ) and sends it to Bob. Bob knows only the set of candidate pure states, i.e., the model, and prepares a two-outcome measurement of the form , where I is the identity operator and is a unit vector. We call a detector or a detector state. Bob's purpose is not to guess the number Alice has chosen but to obtain "detection" with higher probability.
The detection rate for the chosen state is given by when Alice sends to Bob and Bob prepares as a detector. ( denotes the matrix trace and is regarded as a matrix). As a game, Alice aims at making Bob’s detection rate smaller by choosing with a certain probability. In contrast, Bob aims at making the detection rate larger by preparing his detector based on the knowledge of the model. Later, we will evaluate the information of the model .
Now we go back to the general situation and explain the details. First, we seek the minimum detection rate for Bob among all possible models. Suppose that Alice picks among the whole pure states in a completely random way (i.e., with respect to the Haar measure). This is the worst case for Bob. When Bob is allowed to adopt a randomized strategy, the detection rate is (d is the dimension of the Hilbert space). If the model consists of the orthonormal basis, then again the detection rate is . It is the minimum detection rate.
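The 1/d baseline is easy to confirm by Monte Carlo: for Haar-random pure states, any fixed detector succeeds with average probability 1/d. A quick sketch (seed and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 20000

# Haar-random pure states: normalized complex Gaussian vectors.
z = rng.normal(size=(n, d)) + 1j * rng.normal(size=(n, d))
z /= np.linalg.norm(z, axis=1, keepdims=True)

# Fixed detector E = |e><e| with e the first basis vector:
# the detection rate per state is |<e|psi>|^2 = |first amplitude|^2.
rate = np.mean(np.abs(z[:, 0]) ** 2)
print(rate)  # close to 1/d = 0.25
```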
Next, suppose that Alice has a certain tendency for choosing the pure state, which is also described by the model and Bob knows this for some reason. Although we do not care about the origin of such models, there are various situations where they apply in quantum science and technology. For example, in the bipartite system without interaction, a pure state arises as a product state like . Then an entangled state such as is not expected. In quantum computation, the output qubit state under the unitary gate, which has some rotation error, would be . Then, Bob could obtain a detection rate larger than based on the information of the model. Following this idea, we propose one information quantity for a model below.
A detailed explanation of the quantum detection game and useful results are given in the author’s previous work [20]. Below, we present, in a formal way, only those results necessary to define the information quantity. These definitions also hold in an infinite-dimensional Hilbert space.
First, we define the Bayes mixture in a slightly formal way. Let be a model (see Section 3) and be a probability distribution on the parameter space . Then, the Bayes mixture is defined as
In the context of Bayesian statistics [31,32,33], we call a prior distribution, or briefly a prior. For a discrete model, the above integral is replaced with a finite sum . Then, when Alice sends to Bob with probability , it is equivalent to her sending to Bob in the quantum detection game.
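For a discrete model, the Bayes mixture is just the prior-weighted sum of projectors. A minimal sketch with a hypothetical two-state qubit model and a uniform prior:

```python
import numpy as np

def bayes_mixture(states, prior):
    """Discrete Bayes mixture: sum_i pi_i |psi_i><psi_i|."""
    d = len(states[0])
    rho = np.zeros((d, d), dtype=complex)
    for p, s in zip(prior, states):
        rho += p * np.outer(s, np.conj(s))
    return rho

psi1 = np.array([1.0, 0.0])
psi2 = np.array([1.0, 1.0]) / np.sqrt(2)
rho = bayes_mixture([psi1, psi2], [0.5, 0.5])

print(np.trace(rho).real)           # 1.0: a valid density operator
print(np.linalg.eigvalsh(rho)[-1])  # its operator norm (largest eigenvalue)
```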
Finally, we have come to our main theme: to define the information quantity of a model .
Definition 2. Let be a model in a d-dimensional Hilbert space. Then, we define the purely quantum information (PQI) of a model as For calibration, we subtract the lower bound
, and thus
. When Bob knows that the quantum state Alice prepares is among
, we interpret this as Bob obtaining
. As shown in
Section 5.3, the above infimum is related to the value of the quantum detection game (possible maximum score) through the minimax theorem [
20].
Let us rewrite in a slightly simpler form. For a discrete model, there exists a prior distribution that achieves the infimum of . We call such a prior a least favorable prior (LFP). The LFP is a standard notion in statistical decision theory and game theory (see, e.g., Section 1.7, p. 39 in Ferguson [30]). Using the least favorable prior , PQI is defined by
Even if the LFP is not uniquely determined, remains the same [20]. As some readers may recognize, the LFP completely agrees with the least favorable weight in Section 4.
5.2. Basic Properties of PQI
From the form (14), we easily obtain some properties of PQI. First, by definition, is independent of the choice of basis. In other words, both and , where U is a unitary operator, yield the same PQI. Second, clearly the following holds.
Lemma 4. Let be a model in a d-dimensional Hilbert space. The following conditions are equivalent.
- (i)
.
- (ii)
for every LFP.
When , Alice can send the completely mixed state effectively and then Bob obtains no information from the model to achieve a higher detection rate than . Geometrically speaking, such a model fully spreads with no specific direction.
In contrast, when , a certain bias exists and it prevents Alice from preparing the completely mixed state. Thus, Bob benefits after knowing the model. If satisfies the full-rank condition, then there exists a prior such that the Bayes mixture ( denotes the positive definiteness of a Hermitian matrix A). If does not satisfy the full-rank condition, we have a -dimensional subspace where (restricted to the subspace) satisfies the full-rank condition. Since , we have the lower bound of PQI, .
We mention the relation between PQI and the von Neumann entropy . (Recall that the von Neumann entropy is defined by .) It is easily shown that if and only if . The worst case for Bob also corresponds to the maximum entropy state. As we shall see in Section 6, our formulation is instead related to the minimum entropy.
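The stated equivalence (the von Neumann entropy is maximal exactly at the completely mixed state) is easy to verify numerically. A minimal sketch:

```python
import numpy as np

def von_neumann_entropy(rho):
    """H(rho) = -tr(rho log rho), with 0 log 0 := 0 (natural log)."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-(lam * np.log(lam)).sum())

d = 3
mixed = np.eye(d) / d            # completely mixed state
pure = np.diag([1.0, 0.0, 0.0])  # a pure state

print(von_neumann_entropy(mixed), np.log(d))  # both log 3 ~ 1.0986
print(von_neumann_entropy(pure))              # 0.0
```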
Next, we consider how to calculate the PQI of a given model. If the model has a certain symmetry, then we obtain the LFP analytically and calculate directly. On the other hand, owing to the minimax theorem in the author’s previous work [20], the infimum of the operator norm of a Bayes mixture, , is easily calculated by finding the representative quantum state of the model, which is defined in Section 3. Thus, we can calculate the PQI of a given model by finding the minimax point (the representative quantum state of the model); to do so, we utilize the algorithm shown in Section 4 for the minimax point of the facility location problem on the unit sphere. We present this procedure in detail in Section 5.4.
The mathematical structure is quite similar to the calculation of channel capacity in classical information theory [34,35]. However, we emphasize that, despite the formal analogy, we do not introduce any entropic quantity or any concept from information theory to define the above PQI. What we have used is an imaginary two-person game, the quantum detection game, and some basic rules of quantum physics. Considering the many works in quantum information theory [36], it is somewhat surprising that purely quantum information can be developed without reference to any classical concepts in information theory [37,38,39,40].
We also note that PQI is completely different from other kinds of information quantity such as the Fisher information [7,8]. For a parametric model of quantum states, , differentiable with respect to the parameter , the Fisher information evaluates the change of quantum states, . It is related to the distinguishability between two close quantum states and from observation after performing some measurements. Let us take a specific example to see the difference. Suppose that we have a continuous one-parameter model . Although quantum Fisher information has been defined in various ways as an extension of classical Fisher information, it is not defined for a discrete model such as . Indeed, for , we only consider distinguishing between two possible states (quantum state discrimination), while for , we have to consider parameter estimation (quantum state estimation), and the estimation error is bounded by the SLD Fisher information [4,5,6]. However, PQI yields the same value for both and .
5.3. Basic Formula for PQI Calculation
Below, we provide several examples to show how we calculate the PQI of a model. We give the following formula, which connects the representative quantum state and PQI; the formula is obtained from the minimax theorem (10).
Or equivalently, we have the formula
Using the above formula and the results in Section 4, we obtain the PQI of , , and (for the definitions, see Section 2.1).
First, we consider the PQI of . We already know the representative quantum state of from Section 4. Thus, using Formula (16), we obtain and . In the same way, for , using Formula (16), we obtain , . Straightforward calculation also yields , which coincidentally is equal to .
5.4. Example of PQI Calculation:
Next, as a more nontrivial example, we calculate the PQI of the model introduced in Section 5.1. First, following the algorithm in Section 4, let us find the minimax point (the representative quantum state of the model ).
In Step (1), we find the most distant pair. We mainly focus on the inner product between two vectors instead of the geodesic distance between them. The most distant pair then corresponds to the one whose inner product is closest to zero. Since , , , and , the most distant pairs are , , and .
Using Formula (7) in Lemma 2, we obtain the geodesic midpoint between and , which is denoted by and . Comparing the inner products, it is easily seen that is located at a point more distant from than and . Thus, we go to Step (2) of the algorithm. Note that in the model , all inner products are real and positive.
In Step (2), we find the most distant triplet. In our model, it is enough to consider the circumscribed hypercircle in a real four-dimensional Euclidean space. Due to the symmetry, we only check two triangles, and whose vertices are and , respectively.
First, let Q be the center of the circumscribed hypercircle of the triangle . (Each edge is a geodesic on the sphere). Generally, the point Q is not uniquely determined. However, by imposing the condition that Q is on the three-dimensional real subspace , the point Q is uniquely determined as the point achieving the minimum distance from each vertex (radius of the circumscribed hypercircle). The condition is equivalent to an orthogonality condition, i.e., , where .
Now let , be the vector corresponding to Q. Then it satisfies , , and , where denotes the Euclidean norm. We obtain the solution .
Next, we investigate the other circumscribed hypercircle of the triangle . In a similar way, we define the point R for . Then, the state vector corresponding to R is given by .
Let the radii of the circumscribed hypercircles be and . Then , and . Thus, , and the most distant triplet is .
Finally, we check whether the circumscribed hypercircle with center and radius includes the other point . (If not, we go to Step (3) of the algorithm.) Since we assume that , holds; this implies that is closer to the point than the other three points are. Thus, the algorithm stops, and is the minimax location.
Due to Equation (16), agrees with the infimum of the detection rate , and we obtain the PQI .
5.5. PQI Calculation from LFP
Now let us find the LFP in this model. Since the model has a certain symmetry, we obtain it directly.
First, let the support of be , that is, and , . Then we obtain one of the LFPs, which is given by the uniform distribution, . To see this, we use the following two facts. First, for every permutation (e.g., ), holds. (To see this, construct the unitary operator , which is the group homomorphism .) Second, the average over all permutations is given by
where is the permutation group acting on the set . Thus, for every , holds, and therefore the uniform distribution is the LFP when .
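As an independent toy illustration of how symmetry pins down a uniform LFP (this is a hypothetical example, not the model above): for the "trine" qubit states, whose Bloch vectors sit 120 degrees apart and are permuted among themselves by a rotation, the uniform prior yields the completely mixed state, and a grid search over priors finds nothing with a smaller operator norm:

```python
import numpy as np

# Trine qubit states: Bloch vectors 120 degrees apart.
states = [np.array([np.cos(k * np.pi / 3), np.sin(k * np.pi / 3)]) for k in range(3)]
projs = [np.outer(s, s) for s in states]

def opnorm_of_mixture(w):
    rho = sum(wi * P for wi, P in zip(w, projs))
    return np.linalg.eigvalsh(rho)[-1]

uniform = opnorm_of_mixture([1 / 3, 1 / 3, 1 / 3])
print(uniform)  # 0.5 = 1/d: the uniform mixture is completely mixed

# Coarse grid over the probability simplex: nothing beats the uniform prior.
best = min(opnorm_of_mixture([p, q, 1 - p - q])
           for p in np.linspace(0, 1, 101)
           for q in np.linspace(0, 1, 101) if p + q <= 1 + 1e-12)
print(best)  # 0.5
```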
Then, we relax the condition . We set the uniformly mixed state as
Then, the operator norm is given by
From direct but very tedious calculation, we can see that the above achieves the infimum even if we permit . Setting , we obtain
Therefore, we obtain the LFP as
Even when we do not find the representative quantum state directly, we can construct it from the LFP in the following way. Since the Bayes mixture with respect to the LFP is given by Equation (17), we find the first eigenvector with the maximum eigenvalue (no degeneracy), which is given by . Indeed, this agrees with the representative quantum state, , in Section 5.4.
Through the minimax theorem [20], we can also directly show that the norm (18) achieves the minimum. To see this, we introduce the following inequality:
Since (for details, see the author’s previous work [20]), we obtain , which implies . Thus, one LFP is given by (19). The above argument does not exclude the possibility of another LFP with .
5.6. Difference from Maximization of von Neumann Entropy
In our formulation, PQI has no direct relation to any entropic concept. Since some readers may expect a certain relationship, let us see what happens if we formally adopt the von Neumann entropy to obtain the LFP in the last example. We consider the maximization of over the prior . The concavity of yields
where denotes the LFP (19). For some , we numerically find a positive achieving the maximum of . Thus, a positive weight for could appear under the maximization, which is clearly different from our result.
While the LFP , which is obtained by minimization of , yields the minimax solution in the quantum detection game, the maximizer, say , is meaningless, at least in this example. Indeed, , which implies that is more informative than , and the prior is no longer least favorable to Bob.
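The divergence between the two criteria can be reproduced in a small independent example (a hypothetical three-state model in C^3, not the four-state model above): take |e1>, |e2>, and the uniform superposition (e1 + e2 + e3)/sqrt(3). By the e1 <-> e2 symmetry it suffices to scan priors of the form ((1-r)/2, (1-r)/2, r); norm minimization then puts no weight on the third state, while entropy maximization does:

```python
import numpy as np

e1, e2 = np.eye(3)[0], np.eye(3)[1]
s = np.ones(3) / np.sqrt(3)
projs = [np.outer(v, v) for v in (e1, e2, s)]

def mixture(r):
    """Bayes mixture for the symmetric prior ((1-r)/2, (1-r)/2, r)."""
    w = [(1 - r) / 2, (1 - r) / 2, r]
    return sum(wi * P for wi, P in zip(w, projs))

def opnorm(rho):
    return np.linalg.eigvalsh(rho)[-1]

def entropy(rho):
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-(lam * np.log(lam)).sum())

rs = np.linspace(0.0, 1.0, 1001)
r_norm = rs[np.argmin([opnorm(mixture(r)) for r in rs])]
r_ent = rs[np.argmax([entropy(mixture(r)) for r in rs])]
print(r_norm, r_ent)  # 0.0 vs a strictly positive weight: the optimizers differ
```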
We have carefully treated the information, or uncertainty, of a nonorthogonal pure state model and excluded classical fluctuation. As a consequence, the remaining uncertainty is no longer evaluated by the usual entropy. For a nonorthogonal pure state model, the von Neumann entropy as a measure of information lacks theoretical justification. At least in the quantum detection game, the method based on the von Neumann entropy is a mere formal extension; it makes sense only for a model consisting of orthogonal pure states (see Section 2).
However, there are many variants of entropy [37,38,39,40,41] in both classical and quantum information theory. We discuss a certain relationship between our information quantity and the minimum entropy in the next section.
6. Discussion: Relation to Entropy
In the previous section, we introduced an information quantity for a pure state model, called PQI. Under the full-rank condition, any classical model consists of an orthonormal basis, and the PQI of such a model necessarily vanishes.
We emphasize that PQI is literally purely quantum: it is not a formal extension of any quantity in classical information theory. It is an information quantity completely independent of the concept of entropy, a concept that originates in classical information theory. Thus, a natural question arises: what kind of relationship do entropy and PQI have? Actually, PQI is related to the minimum entropy rather than the von Neumann entropy, as we discuss below.
6.1. Jaynes Principle and Distinguishability
First, we briefly review the concept of entropy and the Jaynes principle [42,43].
Suppose that we are given an alphabet, i.e., a finite set of symbols. Then our lack of knowledge about the set is evaluated by the Shannon entropy through a probability distribution p = (p_1, …, p_n) satisfying ∑_i p_i = 1. (Recall that the classical Shannon entropy is defined as H(p) = −∑_i p_i log p_i.) The larger the entropy becomes, the larger the uncertainty we have.
The state of minimum information is interpreted as the maximum entropy state, that is, the uniform distribution p_i = 1/n, and thus H(p) ≤ log n holds. This central idea also provides the theoretical foundation for maximum entropy methods in data processing [44,45].
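The maximum entropy property can be checked numerically. The sketch below (with an arbitrary four-letter alphabet chosen for illustration) compares the uniform distribution against a skewed one:

```python
import numpy as np

def shannon_entropy(p):
    """H(p) = -sum_i p_i log p_i (natural log), ignoring zero entries."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

n = 4
uniform = np.full(n, 1.0 / n)
skewed = np.array([0.7, 0.1, 0.1, 0.1])

# The uniform distribution attains the maximum, H(p) <= log n.
print(shannon_entropy(uniform))  # = log 4
print(shannon_entropy(skewed))   # strictly smaller
```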
The underlying concept is distinguishability. In classical information theory, distinguishability holds trivially. In quantum theory, it is represented by the orthogonality of two quantum states. When the pure states corresponding to the symbols of the alphabet are orthogonal to each other, every result in classical information theory extends in a straightforward manner.
In statistical physics, a physical state of an ensemble is estimated through entropy maximization when we have no knowledge of the system. This way of thinking is called the Jaynes principle [42,43], and it is fundamental to statistical physics. For example, for a given set of eigenstates of a Hamiltonian, say, , with some conditions, we obtain a canonical ensemble by using the principle.
In quantum physics, we are able to consider the maximization of the von Neumann entropy S(ρ) = −Tr ρ log ρ of the density matrix (Bayes mixture) for orthogonal vectors. Since the eigenvalues of such a mixture are exactly the prior weights, this maximization completely reduces to the classical case. Then the maximizer is the completely mixed state, i.e., ρ = I/n, which corresponds to the uniform distribution. Formally, additional constraints also yield a quantum exponential family [46], which is the quantum analogue of the classical exponential family [47,48].
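The reduction to the classical case can be verified directly. In the sketch below (basis and weights are arbitrary illustrative choices), the von Neumann entropy of a mixture of orthonormal vectors coincides with the Shannon entropy of the weights, and the completely mixed state attains log n:

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr rho log rho, computed via the eigenvalues of rho."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

# Orthonormal vectors (here the computational basis) mixed with weights p
p = np.array([0.5, 0.3, 0.2])
basis = np.eye(3)
rho = sum(pi * np.outer(e, e) for pi, e in zip(p, basis))

# For orthogonal vectors, S(rho) equals the Shannon entropy H(p) ...
classical = float(-np.sum(p * np.log(p)))
print(von_neumann_entropy(rho), classical)

# ... and the maximizer is the completely mixed state I/n, with S = log n.
mixed = np.eye(3) / 3
print(von_neumann_entropy(mixed))  # = log 3
```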
However, we have no solid criterion such as the Jaynes principle for a nonorthogonal pure state model. For example, consider a qubit processed by a single unitary operation that is assumed to be one of finitely many candidates. In a sense, this is a simplified rotation error model (e.g., Kueng et al. [2]). In our formulation, a model is given. Suppose that we have no information or knowledge of which unitary gate processed the qubit. Then, how do we describe the qubit?
Mathematically, it is possible to extend the maximum entropy criterion to the noncommutative case. Then we consider the maximization of the von Neumann entropy of the Bayes mixture over the prior. Is this kind of formal extension enough in quantum information theory? There are many quantities, such as Rényi entropy [37,38,39,40,41], in both classical and quantum information theory. Is there another possibility to consider such quantities?
The quantum states in the model are no longer mutually orthogonal; thus, they are not distinguishable, which is completely different from the symbols of an alphabet. In spite of this, should we seek some justification for the maximization of the entropy from classical information theory?
In our formulation, the above formal argument breaks down. First, for the model, we describe the system by the representative quantum state, which is completely independent of the von Neumann entropy. Second, in the quantum detection game between Alice and Bob, we saw that the von Neumann entropy is useless in a specific example (Section 5.6). Rather, we consider the least favorable case, that is, the minimization of the detection rate, which contrasts with the maximization of entropy (20).
If we seek a purely quantum counterpart of the Jaynes principle, then minimization of the detection rate would be promising. Luckily, by the monotonicity of the logarithm, the minimization is equivalent to the maximization of its negative logarithm, a quantity known as the minimum entropy. Some of its properties are similar to those of the von Neumann entropy and others are not. In the next subsection, we review basic properties of the minimum entropy.
6.2. Properties of the Minimum Entropy
In the present subsection, we briefly review basic properties of the minimum entropy and then give another definition of purely quantum information. The minimum entropy of a density matrix ρ is defined by H_min(ρ) = −log ‖ρ‖, where ‖·‖ denotes the operator norm (the maximum eigenvalue); it is a special case of the quantum Rényi entropy.
The quantum Rényi entropy has a real parameter α and is defined by S_α(ρ) = (1/(1 − α)) log Tr ρ^α (see, e.g., p. 117 in Ohya and Petz [41]). In the limit α → ∞, we obtain the minimum entropy.
The minimum entropy inherits some common properties of the quantum Rényi entropy. For example, additivity holds: H_min(ρ ⊗ σ) = H_min(ρ) + H_min(σ). For every pure state, the minimum entropy equals zero.
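These two properties are easy to check numerically. The following sketch (with arbitrary diagonal states chosen for illustration) uses the operator-norm formula for the minimum entropy:

```python
import numpy as np

def min_entropy(rho):
    """H_min(rho) = -log ||rho||, with ||.|| the operator norm (max eigenvalue)."""
    return float(-np.log(np.max(np.linalg.eigvalsh(rho))))

rho = np.diag([0.7, 0.3])
sigma = np.diag([0.5, 0.25, 0.25])

# Additivity: H_min(rho ⊗ sigma) = H_min(rho) + H_min(sigma)
print(min_entropy(np.kron(rho, sigma)), min_entropy(rho) + min_entropy(sigma))

# A pure state has maximum eigenvalue 1, hence H_min = 0.
psi = np.array([1.0, 1.0]) / np.sqrt(2)
print(min_entropy(np.outer(psi, psi)))  # ≈ 0
```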
However, the concavity does not necessarily hold. Concavity of entropy means that a probability mixture of quantum states increases the uncertainty of the whole system.
This negative property is not due to noncommutativity. To see this, let us take two commuting density matrices, where . Then, since , convexity rather than concavity holds in the above example.
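The specific commuting pair from the displayed equation is not recoverable in this version, but the phenomenon is easy to reproduce. The sketch below uses an illustrative diagonal pair whose largest eigenvalues sit in the same position, so that the operator norm is affine along the segment between them and the convexity of the minimum entropy appears numerically:

```python
import numpy as np

def min_entropy(rho):
    """H_min(rho) = -log of the maximum eigenvalue of rho."""
    return float(-np.log(np.max(np.linalg.eigvalsh(rho))))

# Commuting density matrices with the largest eigenvalue in the same slot
rho = np.diag([0.8, 0.2])
sigma = np.diag([0.6, 0.4])

lam = 0.5
mix = lam * rho + (1 - lam) * sigma  # = diag(0.7, 0.3)

lhs = min_entropy(mix)                                         # -log 0.7
rhs = lam * min_entropy(rho) + (1 - lam) * min_entropy(sigma)
print(lhs, rhs, lhs <= rhs)  # convexity, not concavity
```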
Since the minimum entropy is based on the operator norm, we easily obtain a sufficient condition for convexity.
Lemma 5. Let ρ and σ be two density matrices. Suppose that ‖λρ + (1 − λ)σ‖ = λ‖ρ‖ + (1 − λ)‖σ‖ holds for every λ, 0 ≤ λ ≤ 1. Then H_min(λρ + (1 − λ)σ) ≤ λH_min(ρ) + (1 − λ)H_min(σ) holds.

Proof. Since −log t is convex, H_min(λρ + (1 − λ)σ) = −log(λ‖ρ‖ + (1 − λ)‖σ‖) ≤ −λ log ‖ρ‖ − (1 − λ) log ‖σ‖ holds. □
In view of the above lemma, if we were to introduce the minimum entropy as a variant of entropy over the whole set of density operators, its theoretical significance would seem very weak.
However, when we consider PQI in a pure state model, the situation changes drastically. For the pure state family, concavity of the minimum entropy necessarily holds in the following sense.
Lemma 6. Let a set of pure states be given. Then concavity of the minimum entropy holds, restricted to the model.
Proof. Choose a finite set of pure states from the model, say |ψ_1⟩, …, |ψ_m⟩. Then, for any weights λ_i ≥ 0 with ∑_i λ_i = 1, H_min(∑_i λ_i |ψ_i⟩⟨ψ_i|) ≥ 0 = ∑_i λ_i H_min(|ψ_i⟩⟨ψ_i|) holds. □
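A quick numerical check of the lemma, with arbitrarily chosen nonorthogonal qubit states and weights: each pure state has zero minimum entropy, so concavity on the model reduces to the nonnegativity of the mixture's minimum entropy.

```python
import numpy as np

def min_entropy(rho):
    """H_min(rho) = -log of the maximum eigenvalue of rho."""
    return float(-np.log(np.max(np.linalg.eigvalsh(rho))))

# Nonorthogonal qubit pure states (illustrative choice)
psis = [np.array([np.cos(t), np.sin(t)]) for t in (0.0, 0.5, 1.0)]
weights = np.array([0.2, 0.3, 0.5])

mix = sum(w * np.outer(s, s) for w, s in zip(weights, psis))

# Each pure state contributes H_min = 0 to the right-hand side, while
# the mixture's maximum eigenvalue is <= 1, so H_min(mix) >= 0.
print([min_entropy(np.outer(s, s)) for s in psis])
print(min_entropy(mix))  # nonnegative, so concavity holds on the model
```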
Other properties of the minimum entropy are usually shown in the context of quantum Rényi entropy (see, e.g., Hu and Ye [22] (Section III, p. 4), Dam and Hayden [21]).
Observing the above, we provide another possible definition of purely quantum information. In the quantum detection game, finding the LFP is equivalent to maximizing the minimum entropy rather than the von Neumann entropy. Taking the negative logarithm of the detection rate, we obtain another definition of purely quantum information. By definition, it vanishes if the model consists of orthogonal pure states satisfying the full-rank condition, which includes the classical case.
From Lemmas 5 and 6, we must treat the minimum entropy in this definition carefully. At the least, it should not be considered over the whole set of density matrices. As a consequence, a comparison of the two definitions of purely quantum information should also be performed carefully and would require a deeper understanding of the model information, which is left as a topic for future research.
Finally, we make two comments. First, our definition of the model information yields an operational meaning for the minimum entropy. It stands apart from the usual extension of entropic concepts in classical information theory; rather, it comes from an imaginary design of a quantum detector and a facility location problem on the unit sphere in a complex projective space. Second, we expect that a purely quantum version of the Jaynes principle can be established based on the minimum entropy (for some related work on maximum entropy methods, see [49]). It might be possible to develop data processing methods and some dynamics based on the new principle.
7. Infinite-Dimensional Hilbert Space
Thus far, we have considered the PQI of a model only in a finite-dimensional Hilbert space. While our definition of PQI applies to an infinite-dimensional Hilbert space, technical difficulties arise because the model becomes a parametric family of functions. In this section, we only sketch these difficulties in a specific example.
Let L²(ℝ) denote the set of square-integrable complex functions over ℝ and let g be a known continuous function in L²(ℝ) satisfying ‖g‖ = 1. Let us consider the quantum statistical model describing a wavefunction with a single parameter.
Parameter estimation of the shift parameter has been theoretically investigated [3]. If we replace the wavefunction g with a probability density function such as the Gaussian density, the estimation problem for the shift parameter is called estimation of the location parameter and is very common in classical statistics [50].
Before evaluating the PQI of the model, let us first formally consider quantum state estimation with no observation. It turns out that the worst-case error equals one for every estimate.
Lemma 7. Let and with . Then holds.

For the proof, see Appendix A.1. The above lemma says that every quantum state in the model would be a minimax location.
Since the parameter space is noncompact, the minimax theorem [20] does not hold in general. However, we can directly show that Formula (16) holds in this specific example. The first equality holds due to the following lemma; because of technical difficulties, we give the proof in Appendix A.2.
Lemma 8. Let with and let ϵ be an arbitrary positive constant. Then there exists a finite set and a uniform prior over the set such that holds.

Thus, the formal definition of PQI shows that it vanishes. We can interpret the result as follows. Even if Bob knows that the quantum state is in the model, i.e., that the quantum system is described by a shifted wavefunction, he obtains no information, which gives Bob no advantage over Alice in the quantum detection game.
We have not obtained conditions under which the PQI is positive, together with explicit examples. Even if Formula (16) holds under some conditions, the calculation of the PQI would become drastically different. A detailed investigation is left for future study.