2. Methods
Let $L$ and $X$ be compact subsets of $\mathbb{R}^d$ with respect to the Euclidean metric. For points $x, y \in \mathbb{R}^d$, define $\mathrm{path}(x,y)$ to be the set of bounded piecewise-$C^1$ paths from $x$ to $y$, parametrized by Euclidean arc length. Similarly, $\mathrm{path}(x,S)$ denotes all paths from $x$ to a set $S$.
For any compact set $L \subset \mathbb{R}^d$, define $f_L : \mathbb{R}^d \to \mathbb{R}$ by:
$$f_L(z) := \min_{z' \in L} \|z - z'\|.$$
The length of a unit-speed path $\gamma : [0, \ell] \to \mathbb{R}^d$ is denoted as:
$$\mathrm{len}(\gamma) := \ell.$$
For $x, y \in \mathbb{R}^d \setminus L$, define:
$$d_L(x, y) := \inf_{\gamma \in \mathrm{path}(x,y)} \int_0^{\mathrm{len}(\gamma)} \frac{\mathrm{d}t}{f_L(\gamma(t))}$$
and:
$$\hat{d}_L(x, y) := \frac{\|x - y\|}{f_L(x)}.$$
Note that $d_L$ is a distance function, while $\hat{d}_L$ is not. The latter can be interpreted as a first-order approximation of the former.
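The quantities above are straightforward to approximate numerically when $L$ and $X$ are given as finite point sets. The sketch below is illustrative only (the names `f_L`, `d_hat`, and `induced_metric` are ours, not from the text): it evaluates the Euclidean distance-to-$L$ function, the first-order approximation $\hat{d}_L$, and a graph-based estimate of $d_L$ obtained by weighting each short edge by its Euclidean length divided by a local value of $f_L$, which approximates the integral defining $d_L$.

```python
# Minimal numerical sketch of f_L, the approximation d_hat, and a graph-based
# estimate of the induced metric d_L.  Assumes finite point arrays; all names
# here are illustrative and not taken from the paper.
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import cdist

def f_L(points, L):
    """Euclidean distance from each row of `points` to the finite set L."""
    return cdist(points, L).min(axis=1)

def d_hat(x, y, L):
    """First-order approximation ||x - y|| / f_L(x); note it is not symmetric."""
    return np.linalg.norm(x - y) / f_L(x[None, :], L)[0]

def induced_metric(points, L, k=8):
    """Estimate d_L by shortest paths in a k-nearest-neighbor graph whose edge
    (u, v) is weighted by ||u - v|| divided by the average of f_L at u and v."""
    f = f_L(points, L)
    D = cdist(points, points)
    W = np.full_like(D, np.inf)                      # inf marks a missing edge
    for i in range(len(points)):
        nbrs = np.argsort(D[i])[1:k + 1]             # k nearest neighbors of i
        W[i, nbrs] = D[i, nbrs] / ((f[i] + f[nbrs]) / 2.0)
    W = np.minimum(W, W.T)                           # make the graph symmetric
    return shortest_path(W, method="D")              # all-pairs estimate of d_L
```

The edge weight is a two-point quadrature of the integral $\int \mathrm{d}t / f_L(\gamma(t))$ along the straight segment between neighboring points, so the estimate improves as the point set becomes denser.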
Definition 1. For any compact set $X \subseteq \mathbb{R}^d \setminus L$, for some compact set $L \subset \mathbb{R}^d$, the α-offsets with respect to $d_L$ are:
$$X^\alpha_L := \{ y \in \mathbb{R}^d \setminus L \mid d_L(y, X) \le \alpha \}, \quad \text{where } d_L(y, X) := \inf_{x \in X} d_L(y, x).$$

The distance function $f_L$ can be transformed into an arbitrarily close smooth function $\tilde{f}_L$ [18], yielding a Riemannian metric $\tilde{d}_L$ defined in an identical manner as $d_L$. From this, one has corresponding α-offsets $\tilde{X}^\alpha_L$ that are arbitrarily close to $X^\alpha_L$. We will encounter this smoother version in Section 3.3.
We will approximate the offsets by a union of balls as follows.
Definition 2. For any compact set $X \subseteq \mathbb{R}^d \setminus L$, for some compact set $L \subset \mathbb{R}^d$, the approximate α-offsets with respect to $\hat{d}_L$ are:
$$\hat{X}^\alpha_L := \{ y \in \mathbb{R}^d \mid \hat{d}_L(x, y) \le \alpha \text{ for some } x \in X \} = \bigcup_{x \in X} \mathrm{ball}(x, \alpha f_L(x)).$$

A useful property of $d_L(\cdot, X)$ is that it is a one-Lipschitz function. In general, a function $f$ between two metric spaces $(\mathcal{X}, d_{\mathcal{X}})$ and $(\mathcal{Y}, d_{\mathcal{Y}})$ is said to be $k$-Lipschitz if for all $x, y \in \mathcal{X}$, $d_{\mathcal{Y}}(f(x), f(y)) \le k\, d_{\mathcal{X}}(x, y)$.
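When $X$ and $L$ are finite, membership in the approximate offsets can be tested directly from the union-of-balls form above. The following sketch is our own illustrative code (the names are hypothetical): it checks whether a query point lies in some ball $\mathrm{ball}(x, \alpha f_L(x))$.

```python
# Membership test for the approximate alpha-offsets, read as a union of balls
# ball(x, alpha * f_L(x)) over a finite set X; names are illustrative.
import numpy as np
from scipy.spatial.distance import cdist

def in_approx_offset(y, X, L, alpha):
    """True if y lies in ball(x, alpha * f_L(x)) for some row x of X."""
    radii = alpha * cdist(X, L).min(axis=1)       # adaptive radius per point of X
    return bool((cdist(y[None, :], X)[0] <= radii).any())

# Example: balls far from L are large, balls near L are small.
rng = np.random.default_rng(0)
X, L = rng.random((50, 2)), np.array([[0.0, 0.0]])
print(in_approx_offset(np.array([0.5, 0.5]), X, L, alpha=0.3))
```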
Lemma 1. The function $d_L(\cdot, X)$ is one-Lipschitz from the metric space $(\mathbb{R}^d \setminus L, d_L)$ to $(\mathbb{R}, |\cdot|)$.
Proof. Fix any $x, y \in \mathbb{R}^d \setminus L$. There exists a point $x' \in X$ and a path $\gamma \in \mathrm{path}(x, x')$ such that the length of $\gamma$ in the induced metric is $d_L(x, X)$. Likewise, there exists $\gamma' \in \mathrm{path}(y, x)$ whose length in the induced metric is $d_L(y, x)$.
This implies that the concatenation of $\gamma'$ and $\gamma$ is a path in $\mathrm{path}(y, X)$. Thus, $d_L(y, X) \le d_L(y, x) + d_L(x, X)$. As this holds for all $x, y$, we conclude that $|d_L(x, X) - d_L(y, X)| \le d_L(x, y)$, as desired. □
We can use $d_L$ to define the Hausdorff distance, which is a metric between compact sets. This metric is useful for stating bounds on the quality, or uniformity, of a sample near a set.
Definition 3. The Hausdorff distance between two compact sets $A$ and $B$, with respect to an underlying metric $d$, is defined as:
$$d_H(A, B) := \max\left\{ \max_{a \in A} d(a, B),\; \max_{b \in B} d(b, A) \right\}.$$

If the Hausdorff distance between a compact set and a sample is bounded, Lemma 3 shows that their α-offsets are interleaved at particular scales.
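For finite point sets, Definition 3 reduces to a max–min computation over the pairwise distance matrix. A short sketch (illustrative, using the Euclidean metric as the underlying distance):

```python
# Hausdorff distance between finite point sets A and B under the Euclidean
# metric: the larger of the two directed (max of min) distances.
import numpy as np
from scipy.spatial.distance import cdist

def hausdorff(A, B):
    """Hausdorff distance between finite point sets A and B (rows are points)."""
    D = cdist(A, B)
    return max(D.min(axis=1).max(),   # max over a in A of dist(a, B)
               D.min(axis=0).max())   # max over b in B of dist(b, A)
```

The same routine computes the Hausdorff distance with respect to $d_L$ if the Euclidean matrix `D` is replaced by a matrix of (approximate) induced distances.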
Lemma 2. Let be such that . Then, for all , and .
Proof. Let be any point. By the definition of , we have . Therefore, there exists such that . The Hausdorff assumption that implies that for all , we have . By Lemma 1, , implying . The second inclusion is proven by a symmetric argument. □
The following is the definition of an adaptive sample we will use throughout. For the special case when X is a manifold and L is its medial axis, it corresponds to the ε-sample used in surface reconstruction.
Definition 4. Given a compact set $L \subset \mathbb{R}^d$ and compact sets $P \subseteq X \subset \mathbb{R}^d \setminus L$, we say that $P$ is an ε-sample of $X$, for $0 < \varepsilon < 1$, if for all $x \in X$, there exists $p \in P$ such that $\|x - p\| \le \varepsilon f_L(x)$.
This definition is closely related to that of the approximate α-offsets, because if $P$ is an ε-sample of $X$, then for all $\alpha \ge \frac{\varepsilon}{1-\varepsilon}$, $X \subseteq \hat{P}^{\alpha}_L$.
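For finite sets, the adaptive sampling condition can be checked directly. The sketch below assumes the condition as stated in Definition 4, namely $\|x - p\| \le \varepsilon f_L(x)$ for every $x \in X$; the function names are ours.

```python
# Check the adaptive (epsilon-)sample condition of Definition 4 for finite
# point sets, assuming the condition ||x - p|| <= eps * f_L(x) for all x in X.
import numpy as np
from scipy.spatial.distance import cdist

def is_adaptive_sample(P, X, L, eps):
    """True if every x in X has a sample point p in P with ||x-p|| <= eps*f_L(x)."""
    f = cdist(X, L).min(axis=1)          # f_L evaluated at each point of X
    nearest = cdist(X, P).min(axis=1)    # distance from each x to its nearest p
    return bool((nearest <= eps * f).all())
```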
3. Results
Lemma 3. Consider compact sets $X, P \subset \mathbb{R}^d \setminus L$ such that the Hausdorff distance between $X$ and $P$ with respect to $d_L$ is at most $\varepsilon$. Then, for all $\alpha \ge 0$, $X^\alpha_L \subseteq P^{\alpha+\varepsilon}_L$ and $P^\alpha_L \subseteq X^{\alpha+\varepsilon}_L$.
Proof. Fix $y \in X^\alpha_L$. By definition, $d_L(y, X) \le \alpha$, which implies that there exists $x \in X$ such that $d_L(y, x) \le \alpha$. The Hausdorff assumption implies that for all $x \in X$, $d_L(x, P) \le \varepsilon$. Now, by Lemma 1, $d_L(y, P) \le d_L(y, x) + d_L(x, P)$, implying $d_L(y, P) \le \alpha + \varepsilon$, so $y \in P^{\alpha+\varepsilon}_L$. By a symmetric argument, the other statement holds. □
Lemma 4 relates the length of a path with respect to two distance-to-set functions, assuming the two sets are close in Hausdorff distance with respect to the Euclidean metric.
Lemma 4. Let $L$ and $L'$ be two compact sets such that $d_H(L, L') \le \delta$ for some $\delta > 0$. For all unit-speed paths $\gamma$ along which $f_L \ge c$ for some positive $c$, we have the following inequalities.
Proof. Take an arbitrary unit-speed path $\gamma$ along which $f_L \ge c$. Since the image of the path is a subset of $\mathbb{R}^d \setminus L$, then for all $t$, $f_L(\gamma(t)) \ge c$. By the Hausdorff distance between $L$ and $L'$, we have $f_{L'}(\gamma(t)) \ge f_L(\gamma(t)) - \delta$. Likewise, we have that $f_L(\gamma(t)) \ge f_{L'}(\gamma(t)) - \delta$. Rearranging both of these, we have that $f_L$ and $f_{L'}$ differ by at most $\delta$ pointwise along $\gamma$.
By the definition of the path lengths with respect to $d_L$ and $d_{L'}$, these inequalities imply the claimed bounds. □
The following lemma bounds how close to $L$ a shortest path to a compact set $X$ can come and provides a constant $c$ satisfying Lemma 4 that depends on the compact set in question.
Lemma 5. Take compact set , compact set , and , for . If γ is the shortest path from y to X with respect to , then: Proof. Since , , so there exists such that . Take as the shortest path from y to X. For all , .
By Lemma 10, , and by being Lipschitz, we have that . This means that every point on the path is at least distance away from L. □
We define a noisy ε-sample, for $\varepsilon > 0$, of a compact set $X$ with respect to the distance to $L$, for some compact set $L$, as a compact set $P$ such that for all $x \in X$, there exists $p \in P$ with $\|x - p\| \le \varepsilon f_L(x)$. Likewise, for all $p \in P$, there exists $x \in X$ such that $\|x - p\| \le \varepsilon f_L(x)$. The following lemmas relate a noisy ε-sample to the Hausdorff distance between the sample and the set X and vice versa.
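For finite sets, this two-sided condition can be checked in the same way as Definition 4. The sketch below assumes both directions are measured against $\varepsilon f_L$ evaluated at the point of $X$, which is one plausible reading of the definition above; adjust the normalization if the condition is read differently.

```python
# Check a two-sided (noisy) adaptive sample condition for finite sets, assuming
# both directions use the threshold eps * f_L(x), with x the point of X.
import numpy as np
from scipy.spatial.distance import cdist

def is_noisy_sample(P, X, L, eps):
    """True if every x in X is near some p in P and every p in P is near some
    x in X, where 'near' means ||x - p|| <= eps * f_L(x)."""
    f = cdist(X, L).min(axis=1)                              # f_L at points of X
    D = cdist(X, P)                                          # |X| x |P| distances
    x_covered = (D.min(axis=1) <= eps * f).all()             # every x has a close p
    p_covered = (D <= eps * f[:, None]).any(axis=0).all()    # every p is close to some x
    return bool(x_covered and p_covered)
```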
Lemma 6. Consider compact set L and compact . If is a noisy ε-sample of X with respect to , for , then .
Proof. Given , by definition, there exists such that . By Lemma 10, , so for all, , .
Furthermore, given , there exists such that , so for all , ; thus, . □
Lemma 7. Consider compact set L and sets . If , then is a noisy -sample of X with respect to .
Proof. implies that for all , . Thus, there exists such that . By Lemma 10, .
Similarly, implies that for all , ; thus, there exists such that , and thus, . Since , then , so is a noisy -sample of X. □
Lemma 8. Given compact set and compact set , for , .
Proof. Take so that . Thus, there exists such that . By Lemma 10, this implies that , which implies that . □
Lemma 9. Given compact set and compact set , for , .
Proof. Consider . Thus, , for some , so . Applying Lemma 10, we then have that , and as , . □
3.1. Adaptive Sampling
In this section, we prove that a uniform sample in the induced metric corresponds to an adaptive sample in the Euclidean metric and vice versa. The key to this proof is the following lemma, which will also be used for the more elaborate interleaving results of Section 3.2.
Lemma 10. Let $L \subset \mathbb{R}^d$ be a compact set, and let $a, b \in \mathbb{R}^d \setminus L$. Then, the following two statements hold for all $\varepsilon \in (0, 1)$.
- (i) If $d_L(a, b) \le \varepsilon$, then $\|a - b\|$ is correspondingly bounded in terms of $\varepsilon$ and $f_L(a)$.
- (ii) If $\|a - b\| \le \varepsilon f_L(a)$, then $d_L(a, b)$ is correspondingly bounded in terms of $\varepsilon$.
Proof. To prove (i), we assume $d_L(a, b) \le \varepsilon$. Let $\gamma$ be the path in $\mathrm{path}(a, b)$ realizing $d_L(a, b)$. Then, the Lipschitz property of $f_L$ yields inequalities bounding $f_L$ along $\gamma$ in terms of $f_L(a)$. It follows that the Euclidean length of $\gamma$ is bounded accordingly. Because $\|a - b\|$ is the length of the shortest path between $a$ and $b$ in the Euclidean metric, we conclude that $\|a - b\|$ satisfies the claimed bound.
Next, we prove (ii). Assume $\|a - b\| \le \varepsilon f_L(a)$. For all points $z$ in the straight line segment from $a$ to $b$, $f_L(z) \ge f_L(a) - \|a - z\| \ge (1 - \varepsilon) f_L(a)$. This implies the desired bound on $d_L(a, b)$. □
We can now state the main theorem relating adaptive samples in the Euclidean metric to uniform samples in the metric induced by a set L.
Theorem 1. Let $L$ and $X$ be compact sets; let $P \subseteq X$ be a sample; and let $\varepsilon \in (0, 1)$ be a constant. If $P$ is an ε-sample of $X$ with respect to the distance to $L$, then the Hausdorff distance between $X$ and $P$ with respect to $d_L$ is bounded in terms of $\varepsilon$. Furthermore, if this Hausdorff distance is at most $\varepsilon$, then $P$ is an $\varepsilon'$-sample of $X$ with respect to the distance to $L$, for an $\varepsilon'$ depending only on $\varepsilon$.
Proof. Given , there exists such that . By Lemma 10, , so for all , . As , this proves .
Furthermore, implies that for all , ; thus, there exists such that . Thus, by Lemma 10 . Since , then , so is an -sample of X. □
3.2. Interleaving
A filtration is a nested family of sets. In this paper, we consider filtrations $F$ parameterized by a real number $\alpha$ so that $F_\alpha \subseteq \mathbb{R}^d$, and whenever $\alpha \le \beta$, we have $F_\alpha \subseteq F_\beta$. Often, our filtrations are sublevel filtrations of a real-valued function $f : \mathbb{R}^d \to \mathbb{R}$. The sublevel filtration $F$ corresponding to the function $f$ is defined as:
$$F_\alpha := f^{-1}\big((-\infty, \alpha]\big) = \{ x \in \mathbb{R}^d \mid f(x) \le \alpha \}.$$
Definition 5. A pair of filtrations $(F, G)$ is $(h_1, h_2)$-interleaved in an interval $I$ if $F_\alpha \subseteq G_{h_1(\alpha)}$ whenever $\alpha \in I$ and $G_\alpha \subseteq F_{h_2(\alpha)}$ whenever $\alpha \in I$. We require the functions $h_1$ and $h_2$ to be nondecreasing in $\alpha$.
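When two filtrations are the sublevel filtrations of functions sampled on a common finite domain, the interleaving condition of Definition 5 can be checked pointwise: $F_\alpha \subseteq G_{h_1(\alpha)}$ for every $\alpha$ is equivalent, for nondecreasing $h_1$, to $g(p) \le h_1(f(p))$ at every sample point $p$, and symmetrically for $h_2$. The sketch below ignores the restriction to an interval $I$ and uses illustrative names.

```python
# Pointwise check that the sublevel filtrations of two functions, sampled on a
# common finite domain, are (h1, h2)-interleaved (interval restriction ignored).
import numpy as np

def is_interleaved(f_vals, g_vals, h1, h2):
    """Check g <= h1(f) and f <= h2(g) pointwise, for nondecreasing h1, h2."""
    f_vals, g_vals = np.asarray(f_vals, float), np.asarray(g_vals, float)
    forward = (g_vals <= np.vectorize(h1)(f_vals)).all()   # F_a contained in G_{h1(a)}
    backward = (f_vals <= np.vectorize(h2)(g_vals)).all()  # G_a contained in F_{h2(a)}
    return bool(forward and backward)

# Example: additive interleaving of two distance-like functions.
f = np.array([0.1, 0.4, 0.9])
g = np.array([0.2, 0.5, 0.8])
print(is_interleaved(f, g, h1=lambda a: a + 0.1, h2=lambda a: a + 0.1))
```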
The following lemma gives us an easy way to combine interleavings.
Lemma 11. If $(F, G)$ is $(h_1, h_2)$-interleaved in $I$ and $(G, H)$ is $(h_1', h_2')$-interleaved in $I$, then $(F, H)$ is $(h_1'', h_2'')$-interleaved in $I$, where $h_1'' = h_1' \circ h_1$ and $h_2'' = h_2 \circ h_2'$.
Proof. If $\alpha \in I$, then we have $F_\alpha \subseteq G_{h_1(\alpha)} \subseteq H_{h_1'(h_1(\alpha))}$. Similarly, if $\alpha \in I$, then $H_\alpha \subseteq G_{h_2'(\alpha)} \subseteq F_{h_2(h_2'(\alpha))}$. □
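As a quick illustration of the composition order in Lemma 11 (with hypothetical bound functions), the combined maps are obtained by chaining the two interleavings:

```python
# Compose interleaving bound functions as in Lemma 11: the forward map of the
# combined interleaving is h1' after h1, and the backward map is h2 after h2'.
def compose_interleavings(h1, h2, h1p, h2p):
    """Combine an (h1, h2)-interleaving of (F, G) with an (h1', h2')-interleaving
    of (G, H) into the bound functions of an interleaving of (F, H)."""
    return (lambda a: h1p(h1(a))), (lambda a: h2(h2p(a)))

# Example: composing an additive shift with a multiplicative one.
h1pp, h2pp = compose_interleavings(lambda a: a + 0.1, lambda a: a + 0.1,
                                   lambda a: 2 * a, lambda a: 2 * a)
print(h1pp(1.0), h2pp(1.0))   # 2.2 and 2.1: the order of composition matters
```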
3.2.1. Approximating X with a Sample
Ultimately, the goal is to relate $X^\alpha_L$, the offsets in the induced metric, to $\hat{P}^\alpha_{L'}$, the approximate offsets computed from approximations (or samples) to both $X$ and $L$. This relationship will be given by an interleaving that is built up from an interleaving for each approximation step. For each of the following lemmas, let $X$, $P$, $L$, and $L'$ be compact sets, with $P$ a sample of $X$ and $L'$ an approximation of $L$.
Lemma 12. If the Hausdorff distance between $X$ and $P$ with respect to $d_L$ is at most $\varepsilon$, then $(X^\alpha_L)$ and $(P^\alpha_L)$ are $(h, h)$-interleaved in $(0, \infty)$, where $h(\alpha) = \alpha + \varepsilon$.
Proof. This lemma is a reinterpretation of Lemma 3 in the interleaving notation. □
3.2.2. Approximating the Induced Metric
It is much easier to use a union of Euclidean balls to model the sublevel sets of the induced distance function. Below, we show that this is a reasonable approximation. The following results may also be viewed as a strengthening of the adaptive sampling result of the previous section (Theorem 1).
Lemma 13. The pair is -interleaved in , where .
Proof. It will suffice to show that for , , and for , .
Take so that . Thus, there exists such that . By Lemma 10, this implies that , which implies that .
Consider any point . For some , we have , so . Applying Lemma 10, we have that . Finally, , because . □
3.2.3. Approximating L with L′
Usually, the set $L$ is unknown at the start and must be estimated from the input. For example, if $L$ is the medial axis of $X$, there are several known techniques for approximating $L$ by taking some vertices of the Voronoi diagram [5,6]. We would like to give some sampling conditions that allow us to replace $L$ with an approximation $L'$. Interestingly, the sampling conditions for $L'$ are dual to those used for $P$: we require that $L'$ be close to $L$ in the metric induced by $X$. In other words, $L'$ must be an adaptive sample with respect to the distance to $X$.
Lemma 14. If , then is -interleaved in , where .
Proof. Fix any $x \in \hat{P}^\alpha_L$. There is a point $p \in P$ such that $x \in \mathrm{ball}(p, \alpha f_L(p))$. Moreover, there is a nearest point $\ell \in L$ to $x$, so that $\|x - \ell\| = f_L(x)$. Lemma 10 and the assumption on $L'$ together imply that there exists a point of $L'$ close to $\ell$. It then follows from the definitions that $f_{L'}$ and $f_L$ are comparable at the relevant points. Therefore, we can bound the radius of the ball around $p$ with respect to $L'$ in terms of its radius with respect to $L$, and so, we conclude that $x$ lies in the corresponding approximate offset with respect to $L'$. The proof is symmetric to show that the other containment holds. □
3.2.4. Putting It All Together
We can now use Lemma 11 to combine the interleavings established in Lemmas 12–14.
Theorem 2. Let $X$, $P$, $L$, and $L'$ be compact sets. If $P$ and $L'$ satisfy the sampling conditions of Lemmas 12–14 with respect to $X$ and $L$, then $(X^\alpha_L)$ and $(\hat{P}^\alpha_{L'})$ are $(h_1, h_2)$-interleaved in the corresponding interval, where $h_1$ and $h_2$ are the compositions of the bound functions from those lemmas.
Proof. We use Lemma 11 to combine the interleavings from Lemmas 12–14 to conclude that the pair $(X^\alpha_L, \hat{P}^\alpha_{L'})$ is interleaved with the composed bound functions. To complete the proof, we expand these compositions to obtain the stated forms of $h_1$ and $h_2$. □
3.3. Smooth Adaptive Distance and Homology Inference
In the preceding sections, we showed how to approximate (via interleaving) $X^\alpha_L$, the sublevel sets of the distance to $X$ in the induced metric, using a finite set of Euclidean balls, $\hat{P}^\alpha_{L'}$. Now, we show how and when such an approximation gives a guarantee about the underlying space $X$ itself. This is substantially more difficult, because it requires us to relate the sublevel sets of the induced metric to an object we do not have direct access to. As such, we will require some stronger hypotheses.
We will first review the critical point theory of distance functions. Next, we show how to smooth the induced metric to an arbitrarily close Riemannian metric, rendering the critical point theory applicable. Finally, we put these together to prove the main inference result of the paper, Theorem 3.
3.3.1. Critical Points of Distance Functions
In this section, we give a minimal presentation of the critical point theory of distance functions to explain and motivate the results about interleaving offsets of distance functions in Riemannian manifolds. The main fact we use is that such interleavings lead immediately to results about homology inference (Lemma 16).
For a smooth Riemannian manifold $M$ and a compact subset $X \subset M$, one can consider the function $d_X : M \to \mathbb{R}$ that maps each point in $M$ to the distance to its nearest point in $X$, as measured by the metric on the manifold. The gradient of $d_X$ can be defined on $M$, and the critical points are those points for which the gradient is zero. The critical values of $d_X$ are those values of $r$ such that the level set $d_X^{-1}(r)$ contains a critical point. The critical point theory of distance functions developed by Grove and others [11] extends the ideas from Morse theory to such distance functions. In particular, the theory gives the following result.
Lemma 15 (Grove [11]). If the interval $[r_1, r_2]$ contains no critical values of $d_X$, then the inclusion $d_X^{-1}([0, r_1]) \hookrightarrow d_X^{-1}([0, r_2])$ is a homotopy equivalence.
This means that for intervals that do not contain critical values, the inclusion maps in the filtration are all homotopy equivalences and therefore induce isomorphisms in homology. This is used to give some information about the homology of filtrations that are interleaved with $F$.
We write $\mathrm{H}_*(\cdot)$ to denote homology over a field. Therefore, for a set $A$, we have a vector space $\mathrm{H}_*(A)$, and for a continuous map $f : A \to B$, we have a linear map $\mathrm{H}_*(f) : \mathrm{H}_*(A) \to \mathrm{H}_*(B)$. For the canonical inclusion map of a subset $A \subseteq B$, we will denote the corresponding linear map in homology as $\mathrm{H}_*(A \hookrightarrow B)$. The image of this map is denoted $\mathrm{im}\,\mathrm{H}_*(A \hookrightarrow B)$.
Lemma 16. Let be the distance function to a compact set in a Riemannian manifold such that contains no critical values of . Let F be the sublevel filtration of , and let G be a filtration such that are -interleaved in . If , then: Proof. The interleaving and the hypotheses imply that we have the following inclusions.
The preceding lemma implies that the maps , , and all induce isomorphisms in homology. It follows that , because the inclusion of spaces in G is factored through a space in F, and it factors an inclusion of spaces, all of which are isomorphic in homology. □
3.3.2. Smoothing the Metric
To apply the critical point theory of distance functions to the induced metric directly, we would need it to be a smooth Riemannian manifold. Although it is not smooth, we can smooth it with an arbitrarily small change. The process, though a little technical, is not surprising, nor very difficult. It proceeds in three steps.
We smooth the distance to $L$. This is the source of non-smoothness in the induced metric. This replaces $f_L$ with a smooth approximation, $\tilde{f}_L$.
The smoothed distance to $L$ is used to define the smoothed induced metric $\tilde{d}_L$ analogously to the original construction of $d_L$.
The induced distance function $d_L(\cdot, X)$ can then be replaced by its smoothed version $\tilde{d}_L(\cdot, X)$, and the corresponding smoothed offsets are then well defined.
The complete construction of the smoothed offsets is presented in Appendix A. The end result is an interleaving between the induced offsets and the smoothed version, as expressed in the following lemma.
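As a concrete (if crude) illustration of the first smoothing step, one standard device is to replace the minimum over $L$ by a soft-min: for a finite $L$, the log-sum-exp of the individual distances is smooth away from $L$ and converges to $f_L$ as the sharpness parameter grows. This is only a sketch of the idea; it is not the specific construction of Appendix A, and the parameter `beta` below is an assumption of this example.

```python
# Soft-min (log-sum-exp) smoothing of the distance to a finite set L: one
# standard way to obtain an arbitrarily close smooth surrogate for f_L.
# Illustrative only; not the construction used in Appendix A.
import numpy as np
from scipy.spatial.distance import cdist
from scipy.special import logsumexp

def smooth_f_L(points, L, beta=200.0):
    """Smooth approximation of f_L; the error is at most log(|L|) / beta."""
    D = cdist(points, L)                          # Euclidean distances to L
    return -logsumexp(-beta * D, axis=1) / beta   # soft-min over the columns
```

A smoothed induced metric can then be built from `smooth_f_L` exactly as $d_L$ was built from $f_L$, for example by reweighting the graph edges in the earlier `induced_metric` sketch.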
Lemma 17. Given , consider compact sets and compact sets , such that and , then are -interleaved on , where and .
3.3.3. The Weak Feature Size
Chazal and Lieutier [19] introduced the weak feature size (wfs) as the least positive critical value of a Riemannian distance function. We denote the weak feature size of a compact set $S$ with respect to a metric $d$ as $\mathrm{wfs}^d(S)$. In light of the critical point theory of distance functions, a bound on the weak feature size gives a guaranteed interval with no critical points. This allows one to infer the homology from another filtration (usually one that is discrete and built from data) as long as the second filtration is interleaved in that critical point free interval.
Lemma 18 (Adapted from [19], Theorem 4.2; see also [20]). Let $S$ and $S'$ be compact subsets of $\mathbb{R}^d$. If $d_H(S, S') \le \varepsilon$ and $\varepsilon$ is sufficiently small compared to $\mathrm{wfs}(S)$, then for all sufficiently small $\alpha > 0$, the homology of the α-offsets of $S$ is recovered as the image of the map in homology induced by including offsets of $S'$ at appropriate scales.
The key idea in that proof is that the Hausdorff bound gives an interleaving, while the weak feature size bound gives the interval without critical points. The technical condition regarding sufficiently small $\alpha$ is present to account for strange compact sets that may be homologically different from their arbitrarily small offsets. It is reasonable to assume that, for some sufficiently small $\alpha$, the α-offset of $S$ has the same homology as $S$ itself, and thus, one could "compute" the homology of $S$ using only the sample $S'$.
Most previous uses of the weak feature size have been applied in Euclidean spaces, but the critical point theory of distance functions can be applied more broadly to other smooth Riemannian manifolds. This is why we introduced it as $\mathrm{wfs}^d$ (with the superscript) to indicate the underlying metric.
3.3.4. Homology Inference
We have now introduced all the necessary pieces to prove our main homology inference result.
Theorem 3. Given $\varepsilon > 0$, consider compact sets $X$ and $P$ and compact sets $L$ and $L'$ satisfying the sampling conditions of the preceding sections, and define the corresponding real-valued bound functions $h_1$ and $h_2$. Given any $\alpha \le \beta$ such that the relevant interval contains no critical values of the smoothed distance function, the homology of the smoothed offsets of $X$ is isomorphic to the image of the map in homology induced by the inclusion of the approximate offsets of $P$ at scale $\alpha$ into those at scale $\beta$.
Proof. Given $\alpha \le \beta$ as in the statement, we have the following sequence of inclusions as a result of Lemma 17.
As we assume that the interval contains no critical values, by the definition of the weak feature size, Lemma 16 implies that two of the inclusions in this sequence are homotopy equivalences. We remind the reader that if two spaces are homotopy equivalent, all the induced homology maps between the spaces are isomorphisms. By applying homology to each space and inclusion in the previous sequence, we obtain a sequence of homology groups in which the two corresponding maps are isomorphisms.
The aforementioned isomorphisms factor through the maps induced by the inclusions of the approximate offsets, proving that one of those maps is surjective and the other is injective. We then have the claimed isomorphism. □
3.3.5. Computing the Homology
The last step is to relate the smoothed offsets to something that can be computed. It will generally be the case that the approximation $P$ of $X$ is not just compact, but also finite. Then, for any scale $\alpha$, the approximate offset $\hat{P}^\alpha_{L'}$ is the union of a finite set of Euclidean balls.
The nerve theorem provides a natural way to compute the homology of a union of Euclidean balls. The nerve of a collection $U$ of sets is the set of all subsets of $U$ that have a nonempty common intersection. It has the structure of a simplicial complex, whose homology can be directly computed by standard matrix reduction algorithms. When all nonempty intersections are contractible, the cover is said to be good. A cover by Euclidean balls (or any convex shape) is always good. For good covers, the nerve theorem, a standard result in algebraic topology [21], implies that the union of the sets in $U$ is homotopy equivalent to the nerve, and in particular has the same homology.
This is the most basic way to compute the homology of a union of balls and is used throughout topological data analysis.
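For a finite sample, the nerve of the adaptive union of balls $\{\mathrm{ball}(p, \alpha f_{L'}(p))\}$ can be enumerated directly. The sketch below is illustrative (the names and the tolerance are ours): it tests whether a subset of balls has a common point by numerically minimizing $\max_i(\|y - c_i\| - r_i)$, a convex feasibility problem, and collects the simplices up to a chosen dimension. Off-the-shelf topological data analysis software can then compute the homology of the resulting complex and of inclusions between nerves at two scales.

```python
# Enumerate the nerve of a finite union of adaptive balls ball(p, alpha*f_L(p)).
# A subset of balls spans a simplex when the balls share a common point, tested
# here by a numerical convex feasibility check.  Illustrative only.
from itertools import combinations
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import cdist

def balls_intersect(centers, radii):
    """Approximate test for a common point of the given balls."""
    obj = lambda y: np.max(np.linalg.norm(centers - y, axis=1) - radii)
    res = minimize(obj, centers.mean(axis=0), method="Nelder-Mead")
    return res.fun <= 1e-9

def nerve(P, L, alpha, max_dim=2):
    """Simplices (index tuples) of the nerve of {ball(p, alpha*f_L(p)) : p in P},
    where P and L are numpy arrays whose rows are points."""
    radii = alpha * cdist(P, L).min(axis=1)          # adaptive radius per sample
    simplices = [(i,) for i in range(len(P))]
    for k in range(2, max_dim + 2):                  # edges, triangles, ...
        for idx in combinations(range(len(P)), k):
            sel = list(idx)
            if balls_intersect(P[sel], radii[sel]):
                simplices.append(idx)
    return simplices
```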
In our case, we are not just computing the homology of the union, but also the homology of the inclusion map. This computation will require a slightly stronger result. The persistent nerve lemma [20], applied to Diagram (4) when combined with the above isomorphisms, yields the following.
This last statement turns the isomorphism into an algorithm, because standard algorithms [22] can compute the homology of the inclusion of the nerves.
4. Conclusions
We present an alternative metric in Euclidean space that connects adaptive sampling and uniform sampling. We show how to apply classical results from the critical point theory of distance functions to infer topological properties of the underlying space from such samples. This provides a connection between methods in surface reconstruction (based on adaptive sampling) and homology inference (based on uniform sampling).
We show in Theorem 1 that there is a precise relationship between samples that are uniform with respect to $d_L$ at some scale and those same samples being adaptive in the Euclidean metric. In Theorem 2, we show that we can interleave the sublevel sets of our distance function under this alternative metric with the metric balls resulting from our approximation of the metric, assuming that both $X$ and $L$ are uniformly well sampled with respect to the corresponding Hausdorff distances. Finally, we show how to fully extend the critical point theory of distance functions and the weak feature size to give theoretical guarantees on homology inference from finite samples of $X$ and $L$ using the induced metric (Theorem 3).
The main limitation of adaptive metrics is that they require two sets as input, one to define the set and one to define the metric. In many instances, this is not available. However, we expect that the approach could find wider use in problems with labeled data. For example, data with binary labels may be viewed as the two sets X and L. Then, each set defines a metric on the other, where the metric is scaled according to how close it is to the other set. This is the subject of ongoing and future work.