1. Introduction
Imaging is extremely important for acquiring the light field information of the target [
1]. Single-pixel imaging (SPI) [
2,
3], as a novel imaging alternative, can obtain images by using only a point detector without spatial resolution, having extraordinary application prospects, especially at non-visible wavelengths. Except for point-by-point scanning, SPI has also developed some imaging techniques based on Fourier [
4] or Hadamard transforms [
5,
6,
7] (i.e., complete orthogonal basis modulation, full sampling). Both Fourier and Hadamard SPI are computational imaging techniques. It is worth mentioning that there is another computational imaging technology named ghost imaging (GI) [
8], which acquires the object image by correlating the intensity signals recorded in both reference arm and object arm. GI generally requires random modulation and oversampling, it has several modalities, such as quantum GI [
8], thermal light GI [
9], pseudo-thermal light GI [
10] and computational GI [
11,
12,
13]. In the first three schemes, the reference patterns usually need to be captured by a scanning detector or a spatially resolved array imager. While in computational GI, the reference patterns can be precomputed from a programmable spatial light modulator, thus the imaging apparatus can be simplified to one optical beam without reference arm and one can compute the correlation between single-pixel intensities and the known patterns. Therefore, among various GI schemes, only computational GI can be regarded as SPI. Lately, with the rapid development of information theory, the single-pixel camera [
14] based on compressed sensing (CS) [
15,
16] has been implemented. This technology can obtain images of sparse/compressive targets at very low sampling (subsampling) rates, breaking the Nyquist-Shannon sampling limit. Generally, there is a trade-off between sparsity and subsampling, and the optimal recovery order of
M (the measurement number) proportional to
k (the object sparsity) can be cast in the phase transition framework of CS [
17,
18]. To our knowledge, the phase transition is bounded by two alternative methods: the polytope analysis [
19] and the geometric functional analysis [
20]. To some extent, single-pixel compressive imaging shares the same mathematical measurement model with thermal light, pseudo-thermal light or computational GI, all of
, where
y denotes the single-pixel/bucket measured vector,
A is a measurement matrix consisting of
M rows, each of which is stretched from every modulated/reference pattern, and
e stands for the measurement noise. But their reconstruction principles are different, the former relies upon sparse recovery, and the latter applies correlation functions. Generally, the computational complexity of sparse recovery is much higher than that of correlation functions. For common resolutions, one can obtain good performance via random compressive sampling when the sampling ratio is higher than 30% (an empirical value, because the sparsity of natural images cannot be very low) [
21], which has become the bottleneck of compressive imaging for realizing real-time practical applications, especially in large pixel-scale imaging.
Recent studies have demonstrated that not all patterns contribute to the reconstructions of computational imaging, resulting in a technology named correspondence imaging (CI) [
22,
23,
24,
25,
26], where the reference patterns with the corresponding single-pixel/bucket values well above or below the mean of detected values are picked up for reconstruction after oversampling the object image. Since there is such a phenomenon, why not just construct a universal deterministic (instead of random) measurement matrix where the modulated patterns with the biggest contributions to the reconstruction are preferentially displayed. In particular, an earlier research work [
21] reported a GI method based on “Russian Dolls” (RD) ordering of the Hadamard basis to improve imaging efficiency, which provided us with inspiration.
In this paper, we propose an origami model to construct deterministic patterns for SPI. The row vectors stretched from each constructed pattern happen to be orthogonal to each other. It turns out that the origami patterns are reordered Hadamard patterns. Under the CS framework, the sampling ratio required by this method can approach the bounds limited by phase transition. This technique may open a door to practical compressive video applications [
27,
28] with fewer measurements.
2. Theory and Methods
Before we start introducing our approach, we first need to present some GI principles and related consensus. Given the consistency of the measurement models of GI and CS, some conclusions found in GI can be easily extended to CS. In GI fields, according to the theory of CI [
22,
23,
24,
25,
26], a positive (or negative) ghost image can be retrieved by only averaging some small fractions of the reference subset
(or
), i.e.,
(or
), corresponding to
(or
). Although only a small number of patterns are involved in the calculation, the total number of measurements is not reduced, and the conditional averaging of reference patterns is made after oversampling. The mechanism behind it was firstly elucidated with some attempts [
24,
25], and recently was strictly explained through a probability theory, which regards the light intensities as stochastic variables and uses a joint probability density function, followed by an analysis of the visibility and contrast-to-noise ratio (reconstruction quality) vs. the threshold parameter (i.e., former percentage) and the pixel number of the object part [
29]. According to this theory, it can be concluded that not all the measured bucket/single-pixel intensities are necessary for reconstruction, and that the bucket signal with larger values generally has a higher reconstruction contribution.
Now, let us review the definition of differential ghost imaging (DGI) [
30], which greatly improves the quality of conventional GI. The bucket/single-pixel signal is defined as
, where
and
represents the intensity and transmission function at the spatial position
of the object
x of
pixels, respectively. Similarly, the reference bucket signal (the total intensity of corresponding reference pattern) can be defined as
, where
stands for the values at the spatial position
of reference speckle patterns. By using a differential bucket signal
instead of
in second-order correlation
, where
signifies the ensemble average and
is a constant, we can compute a differential ghost image
from
measurements via
. A high image quality can be guaranteed by DGI, but at the cost of oversampling. Assume that the light field distribution of the source and the object beam are
O and
, and that the minimum variation of the object transmission function
T is denoted by
, then we can get the signal-to-noise ratio (SNR) of recovered ghost image
where
indicates the number of speckles in the area of the light beam
.
equals the square of each speckle size
, which can be determined by calculating the autocorrelation half height width of each point in the light field. For a fixed object, the term
is constant, then SNR scales as
. It means that making the speckle coherence area in each group change from large to small (i.e., making Nspeckle in each group change from small to large) can achieve better performance. Let us recall the RD method, it first divides the Hadamard basis sequence into four quarters and splits the first quarters in the next layer, and finally reorders the patterns within each quarter, also following the above rule, i.e., making the speckle coherence area for each pattern in every quarter change from large to small. Although this sorting method can produce a good image quality relative to that of CS, all quadrants (quarters) in each layer except the first quadrant are rough and contain too many patterns, which will cause many patterns in every quadrant of equal speckle coherence areas (i.e., resulting in multiple feasible solutions).
It is easy to understand that the random patterns are obviously not the best choices to minimize the number of measurements, because there is a high probability that half of all pixels transmit or reflect the light, which is more likely to generate an averaged single-pixel value, undoubtedly degrading image quality or weakening imaging efficiency. Moreover, the random sampling can be regarded as a relatively blind process with a large amount of redundant measurements. And it is really difficult for us to determine which random pattern has a higher reconstruction contribution or produces a higher bucket intensity. Although there are some definite matrices such as Toeplitz matrix [
31] and polynomial matrix [
32], which have been used in SPI and are much easier to be implemented, but their reconstruction qualities are generally much poorer than that of random measurement matrix. Recently, it was found that by using the two-dimensional polynomial to characterize optical aberrations at the pupil plane helps to recover more details of the object images and to decrease the average error [
33].
According to common sense, the patterns that are highly similar or correlated with the object have a higher reconstruction contribution, i.e., there are many effective modulation spots falling within the object area. This undoubtedly requires us to know the object contour beforehand and let the shape of the patterns change with the target object, which is the main idea of adaptive compressive ghost imaging [
12], i.e., adaptively sample the significant object regions at multi-scales. However, if the object is unknown or there is no rough profile measurements in advance, how can we generate a pattern that produces a higher measured value or reconstruction contribution? Fortunately, we find that, if we encode on the spatial light modulator (SLM) with a pattern consisting of all-one, which can be treated as a connected domain (CD, or block, which will be elaborated below), the detected intensity value will be the highest. If the pattern is divided equally into two CDs with the white-black pixels 50%:50%, then the detected value will be the second highest with a high probability. On this basis, a good construction of CDs in patterns can make a positive contribution to the image quality. To some extent, the area of CD is equivalent to the speckle coherence area. Therefore, the patterns with smaller number of CDs generally have the larger contributions to the recovery. Following above reasons and ideas, here we propose a novel origami pattern construction method, which makes full use of symmetric reverse folding (reverse the values on the corresponding pixels at the symmetric positions) and the axial symmetry of the rescaled pattern. The steps of our origami pattern construction are as follows.
Step 1: Assume that the pixel-sizes of patterns are all
square matrices, with the same size as the object, where
p,
q (
) are all even numbers. Suppose there are
n patterns in total, we divide them into
groups, each of which with four patterns. The group number is denoted by
. Note, as an initial setting, the first pattern
in the first group is a matrix with all ones. Then, the second and the third patterns in this group
,
are obtained by inversely folding the first one
in this group up and down, left and right, respectively. That is, keeping the upper (or the left) half unchanged while the lower (or right) half is axisymmetric with the upper (or left) half but with its values in the lower (or right) half taking the opposite. By this means, we realize black (−1) and white (+1) inverting. The fourth pattern
in this group is formed by performing both up-down and left-right reverse folding operations (the sequence of operations is interchangeable) on the first pattern
. Similarly, for all the subsequent groups, the last three patterns in the
ith group,
,
and
will be generated from
following the same processes:
Step 2: The first pattern
in the
ith group (also the
th pattern in the complete sequence) is built on the basis of the
ith pattern in the current complete sequence:
Take
in the second group for example; we firstly scale both horizontal and vertical pixel dimensions of the second pattern in the complete sequence (i.e.,
) to their
size, and place the scaled one on the upper left
part of a
square matrix with all zeros. Secondly, we perform the up and down, left and right axial symmetry about the midline lines on the vertical and horizontal axes of the matrix. After that, the pattern
(also the 5th pattern in the complete sequence) is generated. Then, repeat Step 1 to acquire the second to fourth patterns in the second group,
,
and
(also the 6th to 8th patterns in complete sequence). By analogy, we can create all
groups.
Figure 1a gives a good illustration of Steps 1 and 2, and
Figure 1b shows the result obtained after Step 2. To some extent, Step 2 is similar to the multiscale/pyramid methods, such as wavelet transforms. The front pattern with the absolute subscript
i in the current complete sequence actually captures low-frequency characteristics (rough contours) of the target object, while the four patterns
,
and
in the
ith group are actually generated from their front pattern
but with finer CDs and will undoubtedly add the details of high-frequency components with respect to the object. This also explains the operational mechanism of our method.
Step 3: Adjust the pattern sequence to ensure that the number of CDs for each pattern in the
ith group is in an incremental order. Here, a CD is defined as an up-down-left-right connected area consisting of equal values (see the upper right corner of
Figure 2, which is just an example to illustrate how a CD works, but not related to the actual patterns used). The neighborhoods on both sides of the axis may cause the partition of CDs, which is worth of investigating. Let us first look at the pattern
whose pixel values on both sides of the symmetry axis are the same, its contribution to the recovery is the largest in the
ith group. Since the pattern
needs double symmetric reverse folding, its number of CDs is largest (with lowest reconstruction contribution) in the
ith group. The patterns
and
are produced by only once symmetric reverse folding, whether to first perform up-down or left-right operation is approximately equivalent, thus they are particularly worth considering. As shown in
Figure 2, the patterns
and
with a group indication (ID) (note: not the absolute subscript)
need to be order-reversed for the case of
, while those with a group ID
require adjustments for the case of
. This rule also exists in the higher-order cases. Given this, we set fourfold as a level. From top to bottom, the total
groups (as a whole) can be catalogued into four parts: only the second part remains unchanged, the groups in the third part all need to be adjusted, while the groups in the first and fourth parts also need to be adjusted but with the same group IDs, depending on the recursive layer. Then, repeat the operation for the first part (as a new whole) until the number of groups in each new quarter part equals to 1. In the last layer, only the third part needs to be adjusted. Then, switch the order of the patterns
and
in these found groups.
If we perform left-right and up-down reverse folding to generate
and
in Step 1, respectively, then the ID positions should be adjusted accordingly in Step 3. After the above three steps, we will get the final pattern sequence.
Figure 3 gives an example of the final pattern sequence for
. In order to deeply analyze the advantages of our method, we compare it with the RD method. It is interesting to find that these two approaches have similar numbers of CDs in the low-order part (especially for the first 16 patterns, they are the same). Since their subsequent results are both derived from these first 16 patterns, it is advisable to set each 16 patterns as a comparison unit.
Figure 4 shows the differences in their high-order parts (the last 64 patterns in 256-order sequence which also comprises the largest division of the RD method). As shown in
Figure 4a, the RD ordering becomes very rough in this part, and has a lot of pattern pairs with the same number of CDs in each comparison unit. Because the maximum partition length of RD is
and its minimum segmentation length is 4, it will inevitably incur too much uncertainty. Although the internal order exchange in these pattern pairs does not affect the RD sorting too much, it does bring hundreds of millions of possibilities of finding the optimal pattern sequence to get the best imaging quality, while in our scheme, each four patterns are treated as a group, i.e., the partition length is always 4. As shown in
Figure 4b, there are only four red marks in our method for the high-order part, with only
uncertainty. An intuitive explanation is that we rearrange the patterns into
groups, each of which comprises four patterns, since the folding projections are implemented in the entire part or the first half part of one previous pattern with only four limited combinations. Fewer patterns with the same number of CDs in much smaller groups may make our method more accurate. Further mathematical interpretation will be the focus of our future work.
Now, each pattern
can be sequentially reshaped into a row vector of
, and then
n such row vectors constitute a full-rank square matrix. It happens to be an orthogonal matrix and turns out to be a row-reordered Hadamard matrix in which only the row order is different from that of a natural-ordered Hadamard matrix. Such matrix is quite suitable for forming a measurement matrix
of CS by just selecting the first
m row vectors. Here, we use TVAL3 solver [
34] to recover the image, in which our row number arrangement can be easily combined with its large-scale image reconstruction algorithm, and of course, our origami patterns are also available for GI.