The complex-valued image description is introduced. Different representations of complex-valued images are compared with respect to mutual information. An evaluation method for estimating the uniformity of feature distributions is proposed.
2.1. Complex-Valued Image Description
The data provided by active sensors consist of the active and passive intensities together with the range information. In this article, it is assumed that the latter is given by a phase measurement. In this widespread method, the phase is usually interpreted as an actual distance. However, this approach causes problems if the observed object lies at a distance beyond the uniqueness range. For this reason, we will always interpret that information as what it is: namely, a phase value. (Note: the measured phase difference, a physical quantity, can be represented by a phase value, a mathematical quantity.) Hence, a description of an active-passive image using complex numbers becomes quite natural.
Throughout this article, (x, y) are image coordinates, r = r(x, y) is the range image, I_A = I_A(x, y; r) the active intensity image, and I_P = I_P(x, y) the passive intensity image. The latter does not depend on the range r. The complex-valued image function is now defined as:

f(x, y) = (I_P(x, y) + I_A(x, y)) e^{iφ(x, y)}    (1)

where the phase φ ∈ [0, 2π) is defined via the range

r = (φ/(2π) + n) ℓ    (2)

with n = 0, 1, 2, …. Notice that passive intensity is treated here as an offset. In this setting, ℓ is the uniqueness range of the camera with 0 ≤ r − nℓ < ℓ. The natural number ℓ is a multiple of some unit of length, and n is the “wrapping number”.
The two standard ways of representing complex numbers yield two different image representations: the Polar representation

f = A e^{iφ}    (3)

where

A = |f|,    (4)

φ = arg f,    (5)

and the Cartesian representation

f = u + iv    (6)

where

u = Re f = A cos φ,    (7)

v = Im f = A sin φ.    (8)

Throughout the article, it is assumed that

A ≤ 1    (9)

for all complex images. This normalisation can be achieved through division by the maximal value of |f|. The remainder of this article will discuss these two representations of complex images coming from different types of sensors from the entropy perspective.
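For concreteness, the construction of Equations (1), (2) and (9) can be sketched in a few lines of NumPy; the identifiers (I_A, I_P, r, ell) and the synthetic data are illustrative only and not part of the original method.

    import numpy as np

    def complex_image(I_A, I_P, r, ell):
        # phase in [0, 2*pi) from the wrapped range, Equation (2)
        phi = 2.0 * np.pi * np.mod(r, ell) / ell
        # amplitude with passive intensity treated as an offset, Equation (1)
        A = I_P + I_A
        A = A / A.max()                # normalisation A <= 1, Equation (9)
        f = A * np.exp(1j * phi)
        return A, phi, f.real, f.imag  # Polar pair (A, phi), Cartesian pair (u, v)

    # toy example on synthetic data
    rng = np.random.default_rng(0)
    I_A, I_P = rng.random((64, 64)), rng.random((64, 64))
    r = 10.0 * rng.random((64, 64))    # ranges beyond the uniqueness range wrap around
    A, phi, u, v = complex_image(I_A, I_P, r, ell=7.5)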
2.2. Mutual Information in Complex-Valued Images
If a representation of a complex-valued image f with real values is given, the image-value dimension is at least two. However, the information content of data is known to depend on their representation. For complex-valued images, this means that some real-valued representations could be preferable to others from the viewpoint of information theory. For this purpose, the differential entropy is defined as

H(q) = −∫_R p(q) log p(q) dq    (10)

where R is the range of the quantity q, P is a probability measure and p = dP/dq is the distribution function of q.
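A simple plug-in estimate of Equation (10) from observed samples can be obtained by histogram binning. The following sketch is one standard way to do this (the bin count is an illustrative choice, not prescribed by this article):

    import numpy as np

    def differential_entropy(q, bins=64):
        # histogram estimate of the density p(q), then a Riemann sum of -p log p
        p, edges = np.histogram(np.ravel(q), bins=bins, density=True)
        dq = np.diff(edges)
        m = p > 0                      # skip empty bins, where p log p -> 0
        return -np.sum(p[m] * np.log(p[m]) * dq[m])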
If q = (A, φ), then H(A, φ) becomes the joint entropy of amplitude A and phase φ. Likewise, H(u, v) is the joint entropy of the real and imaginary parts u, v of the complex-valued image:

H(u, v) = −∬ p(u, v) log p(u, v) du dv.    (11)
It is a fact that the entropy of a system depends on the choice of coordinates, the change in entropy being dependent on the Jacobian of the transformation (cf. e.g., [12]). In the case of complex-valued images, this general result specialises to a preference of Cartesian over Polar coordinates:
Theorem 2.1. The transformation from the Polar to the Cartesian image representation increases the entropy. More precisely, it holds true that

H(u, v) = H(A, φ) + μ    (12)

where

μ = −E[log A]    (13)

and the mean is taken over p(A, φ), the joint distribution function of A and φ.

Proof. The statement follows from the well-known transformation rule of the distribution function, which on the level of entropies reads:

H(u, v) = H(A, φ) − E[log |J|]    (14)
where J is the Jacobian of T : (A, φ) ↦ (A cos φ, A sin φ), the transformation from the Polar to the Cartesian image representation. In this case, |J| = A ≤ 1, since A is the normalised amplitude by Equation (9). It follows that the mean E[log A] is negative, whence μ = −E[log A] > 0. ☐
As a consequence, Theorem 2.1 makes it possible to compute the difference in mutual information for the pairs (A, φ) and (u, v) from the individual entropies:

I(A, φ) − I(u, v) = H(A) + H(φ) − H(u) − H(v) + μ    (15)

where I(a, b) = H(a) + H(b) − H(a, b) denotes the mutual information of two quantities a and b, and the quantity μ becomes a measure for the gain of independence by transforming from the Polar to the Cartesian image representation. Namely, I(a, b) = 0 if and only if the quantities a and b are independent, and I(a, b) can be interpreted as the degree of dependence between a and b. This allows us to formulate:
Conjecture 2.2. For almost all complex-valued images, there is a gain of independence by transforming from the Polar to the Cartesian image representation. In other words:

I(A, φ) − I(u, v) > 0    (16)

for almost all complex-valued images.
In fact, the experiments of Section 3 indicate that

H(A) + H(φ) ≥ H(u) + H(v),    (17)

which means that I(A, φ) − I(u, v) ≥ μ > 0.
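The quantities in Equations (15)-(17) can be estimated from an image by histogram binning, along the same lines as above. The following sketch of the gain of independence is an illustration of the definitions, not the evaluation protocol of Section 3:

    import numpy as np

    def entropy_1d(x, bins=64):
        p, e = np.histogram(np.ravel(x), bins=bins, density=True)
        m = p > 0
        return -np.sum(p[m] * np.log(p[m]) * np.diff(e)[m])

    def entropy_2d(x, y, bins=64):
        p, ex, ey = np.histogram2d(np.ravel(x), np.ravel(y), bins=bins, density=True)
        area = np.outer(np.diff(ex), np.diff(ey))
        m = p > 0
        return -np.sum(p[m] * np.log(p[m]) * area[m])

    def mutual_information(x, y, bins=64):
        # I(x, y) = H(x) + H(y) - H(x, y)
        return entropy_1d(x, bins) + entropy_1d(y, bins) - entropy_2d(x, y, bins)

    def independence_gain(A, phi, bins=64):
        # I(A, phi) - I(u, v), cf. Conjecture 2.2
        u, v = A * np.cos(phi), A * np.sin(phi)
        return mutual_information(A, phi, bins) - mutual_information(u, v, bins)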
2.3. Naive Approach
For range measurements within the uniqueness range, the well-known inverse-square law of active intensity implies the approximation:

I_A ≈ c/φ²    (18)

where c > 0 is a constant and the phase φ is identified with the range r (w.l.o.g. ℓ = 2π and n = 0 in Equation (2)). This means that it does make sense to consider I_A and φ as correlated and to detect features only in the pair (I_P, φ). This is called the naive approach. Notice that, by Equation (18), the naive pair (I_P, φ) carries the same information as the measured intensity pair (I_A, I_P). Hence, there are two successive transformations leading through our complex image in Polar representation to the Cartesian one:

(I_A, I_P) ↦ (A, φ) ↦ (u, v)    (19)

where the first transformation T₁ is given by

A = I_P + I_A,  φ = (c/I_A)^{1/2},    (20)

and the second transformation T₂ is the Polar to Cartesian representation, with the Jacobian of the composite map T = T₂ ∘ T₁ being

J = J₂ · J₁,    (21)

where J_i denotes the Jacobian of T_i.
We wish to exclude the possibility that the benefit μ = −E[log |J₂|] of the second transformation (the Polar to Cartesian image representation with Jacobian J₂ = A) is jeopardised by the first transformation with Jacobian

J₁ = φ³/(2c),    (22)

which is computed from Equation (20) using Equation (18). The relation between the gain μ′ = −E[log |J|] of the composite transformation and μ for general composed transformations T = T₂ ∘ T₁ is known to be given by the chain rule:

log |J| = log |J₂| + log |J₁|.    (23)

Hence,

μ′ = −E[log |J₂|] − E[log |J₁|]    (24)

and it follows that

μ′ = μ − E[log |J₁|],    (25)
where the means are each taken over the corresponding probability distribution. Hence, we would like to exclude large positive values of E[log |J₁|]. From Equations (20), (22) and (25), it follows that

E[log |J₁|] = 3 E[log φ] − log(2c)    (26)

which depends only on φ. Notice that the denominator 2c in Equation (22) is strictly positive, and a closer look reveals that E[log |J₁|] ≤ 0 if φ is not concentrated in some specific small neighbourhood of π.
Notice that the inverse-square law in Equation (18) can be used to estimate missing values of I_A or φ in order to obtain our complex image representation. Use of this will be made in the following section.
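As an illustration of this use of Equation (18), the following sketch estimates missing active-intensity pixels from the phase; the median-based fit of the constant c is an implementation choice of this sketch, not taken from the original:

    import numpy as np

    def fill_active_intensity(I_A, phi, eps=1e-6):
        # fit the constant c of Equation (18) robustly on the valid pixels
        valid = ~np.isnan(I_A) & (phi > eps)
        c = np.median(I_A[valid] * phi[valid] ** 2)
        # fill missing pixels (marked NaN) with the inverse-square prediction
        out = I_A.copy()
        missing = np.isnan(I_A) & (phi > eps)
        out[missing] = c / phi[missing] ** 2
        return out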
2.4. Feature Distribution in Complex-Valued Images
Scale-space feature detection usually involves finding extrema in real-valued functions, and these are obtained from the image through filtering. In the case of complex-valued images f, it makes sense to detect features individually in the components of a representation over the real numbers. This means, for the Polar representation, the detection of features in A and in φ, and, for the Cartesian representation, in u and v. The classical SIFT can be applied to any kind of real-valued image. In particular, applying SIFT to the pairs (A, φ) or (u, v) componentwise defines ℂSIFT. If the complex-valued image is represented by the pair (u, v) of real values, a feature for ℂSIFT is defined as a point which is a classical SIFT feature for u or v.
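For illustration, the componentwise detection on the Cartesian pair (u, v) can be sketched with OpenCV's SIFT (assuming the opencv-python package, version 4.4 or later; the rescaling of the float-valued components to 8-bit images is an implementation choice of this sketch):

    import numpy as np
    import cv2

    def to_uint8(img):
        lo, hi = float(img.min()), float(img.max())
        return np.uint8(255 * (img - lo) / (hi - lo + 1e-12))

    def csift_keypoints(u, v):
        # a CSIFT feature is a classical SIFT feature of u or of v
        sift = cv2.SIFT_create()
        kp_u = sift.detect(to_uint8(u), None)
        kp_v = sift.detect(to_uint8(v), None)
        return list(kp_u) + list(kp_v)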
The preferred representation usually has the desired property that it contains more features, and that these are more uniformly distributed over the image grid than in other representations. More texture in an image can be obtained by increasing the entropy. Hence, a transformation whose Jacobian has absolute value less than one yields more texture by Equation (14), and Theorem 2.1 then says that the Cartesian representation yields more texture than the Polar representation. On the other hand, using the scale-space equation

∂L/∂σ = σ ΔL    (27)

where L denotes the Gaussian scale-space representation of the respective image component and σ the scale parameter, aims at finding texture which is sufficiently persistent through the filtering cascade. Hence, increasing the entropy of the image derivatives leads to more persistent texture. Also from this persistence point of view, the Cartesian representation turns out to be more advantageous than the Polar representation:
Theorem 2.3. Transforming from the Polar to the Cartesian image representation increases the entropy of the scale-space derivatives:

H(∂u, ∂v) = H(∂A, ∂φ) + ν    (28)

with

ν = −E[log A],    (29)

where ∂ stands for any of the derivatives ∂x, ∂y, ∂σ, and the expectation value is taken over the joint probability distribution of A and φ.

Proof. In the light of Theorem 2.1, the statement follows from the Jacobian of the transformation of derivatives

(∂A, ∂φ) ↦ (∂u, ∂v) = (cos φ · ∂A − A sin φ · ∂φ, sin φ · ∂A + A cos φ · ∂φ),

which again equals A. ☐
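Both Jacobian determinants can be checked symbolically. The following SymPy sketch verifies that the Polar to Cartesian map of Theorem 2.1 and the induced map on derivatives of Theorem 2.3 each have Jacobian determinant A:

    import sympy as sp

    A, phi, dA, dphi = sp.symbols('A phi dA dphi', real=True)

    # Polar -> Cartesian map (Theorem 2.1)
    u, v = A * sp.cos(phi), A * sp.sin(phi)
    J = sp.Matrix([u, v]).jacobian([A, phi])
    assert sp.simplify(J.det()) == A

    # induced linear map on the derivatives (Theorem 2.3)
    du = sp.cos(phi) * dA - A * sp.sin(phi) * dphi
    dv = sp.sin(phi) * dA + A * sp.cos(phi) * dphi
    J_d = sp.Matrix([du, dv]).jacobian([dA, dphi])
    assert sp.simplify(J_d.det()) == A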
A Cartesian feature in a complex-valued image is defined to be a scale-space feature for u or v, and, similarly, a Polar feature of f is a scale-space feature for A or φ. Consequently, one can formulate:
Conjecture 2.4. The expected number of Cartesian features is larger than the expected number of Polar features for almost all complex-valued images f.
Naturally, a mere increase in the number of features is not sufficient for many applications; e.g., the more the points of interest are concentrated in one small portion of the image, the less useful their number becomes for estimating the relative camera pose. Hence, an important issue is the distribution of features on the image grid. In fact, it is often desirable that they be sampled from a uniform distribution.
For n independent, identically distributed random variables X_1, …, X_n, the empirical distribution function F_n is defined as

F_n(x) = (1/n) Σ_{i=1}^{n} 1_{X_i ≤ x}    (30)

where 1_{X_i ≤ x} is the indicator function

1_{X_i ≤ x} = 1, if X_i ≤ x, and 0 otherwise.    (31)

Then, by the Glivenko-Cantelli Theorem [13,14], the F_n converge uniformly to their common cumulative distribution function F:

sup_x |F_n(x) − F(x)| → 0 almost surely    (32)

with increasing number n of observations. For an arbitrary cumulative distribution function F, the expression

KS(F_n, F) = sup_x |F_n(x) − F(x)|    (33)

is known as the Kolmogorov-Smirnov statistic. It has the general properties of a distance between cumulative distribution functions. Therefore, it will be called here the KS-distance.
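The KS-distance in Equation (33) can be computed directly; in the following sketch, the supremum is evaluated at the jump points of F_n, which suffices for a non-decreasing right-continuous F:

    import numpy as np

    def ks_distance(samples, F):
        # sup_x |F_n(x) - F(x)| for the empirical distribution function F_n
        x = np.sort(np.ravel(samples))
        n = x.size
        cdf = F(x)                     # F evaluated at the jump points of F_n
        d_plus = np.max(np.arange(1, n + 1) / n - cdf)
        d_minus = np.max(cdf - np.arange(n) / n)
        return max(d_plus, d_minus)

    # example: distance of a uniform sample on [0, 1) to the uniform CDF
    sample = np.random.default_rng(0).random(1000)
    d = ks_distance(sample, lambda t: t)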
In the case of a snapshot taken from a scene, one can assume that the observed n features are produced by the scene independently from one another. By viewing the scene as the single source of features, one can further assume that the features are identically distributed. In other words, they can be assumed to be drawn from a common cumulative distribution function F.
However, there seems to be no straightforward generalisation of the KS-distance to the multivariate case, as indicated by [15]; in particular, the proposed generalisations seem to lack robustness. Therefore, we simply propose the Euclidean norm of the two coordinate-wise KS-distances:

KS_E(S, λ) = ( KS(F_n^{(1)}, λ_1)² + KS(F_n^{(2)}, λ_2)² )^{1/2}    (34)

where S is a sample of n points in the plane, F_n^{(i)} is the empirical distribution function of the i-th coordinates of S, and λ_i and λ are the cumulative distribution functions of the uniform distribution on the i-coordinate axis and on the plane, respectively. This will be called the Euclidean KS-distance to uniformity. Conjecturally, there will be more uniformity in the detected scale-space features for the Cartesian representation than in those for the Polar representation. Let S_Cart be the sample of Cartesian features, and S_Pol the sample of Polar features of a given complex-valued image f. Then:
Conjecture 2.5. For almost all complex images f, the transformation from the Polar to the Cartesian image representation decreases the Euclidean KS-distance to uniformity:

KS_E(S_Cart, λ) ≤ KS_E(S_Pol, λ)    (35)

where λ is the uniform distribution on the image plane.

Conjecture 2.2 says that the pair (u, v) will be more independent than (A, φ). Conjecture 2.4 says that there will be more Cartesian than Polar features. Intuitively, these conjectures together support Conjecture 2.5.
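A sketch of the Euclidean KS-distance to uniformity of Equation (34) for a planar point sample follows; the image dimensions are passed as parameters, and all identifiers are illustrative:

    import numpy as np

    def euclidean_ks_to_uniformity(S, width, height):
        # S: (n, 2) array of feature positions on a width x height image
        def ks_to_uniform(coords, length):
            x = np.sort(coords) / length   # uniform CDF on [0, length] is t/length
            n = x.size
            d_plus = np.max(np.arange(1, n + 1) / n - x)
            d_minus = np.max(x - np.arange(n) / n)
            return max(d_plus, d_minus)
        d1 = ks_to_uniform(S[:, 0], width)
        d2 = ks_to_uniform(S[:, 1], height)
        return np.hypot(d1, d2)            # Euclidean norm of the two KS-distances

    # compare feature samples, cf. Conjecture 2.5:
    # euclidean_ks_to_uniformity(S_cart, W, H) <= euclidean_ks_to_uniformity(S_pol, W, H)?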