1. Introduction
Compressed sensing (CS) [1,2,3,4] is well suited to resource-constrained applications because of its low encoder complexity [5,6,7,8]. In CS, the real-valued measurements must be quantized; the CS framework that includes this quantization process is called quantized compressed sensing (QCS) [8].
QCS research primarily focuses on optimizing the encoder or decoder to improve reconstruction quality under the commonly used quantization methods. Advanced compression techniques for block-based compressed sensing (BCS) have been a focal point of recent research. One such advance is the application of Differential Pulse Code Modulation (DPCM) to quantized block-based compressed sensing of images. This strategy exploits the spatial correlations within image blocks to reduce quantization errors and improves bit-rate efficiency without sacrificing reconstruction quality [9]. However, the quantized measurements must still undergo entropy coding to attain ideal performance. Ref. [10] proposes a progressive quantization method, which essentially improves the encoding and decoding strategies while still relying on uniform scalar quantization. On the decoder side, the reconstruction algorithm [11,12,13,14] has been a primary focus for optimization in CS.
To improve quantization performance, vector quantization has been used to quantize CS measurements [15,16]. Subsequently, to further improve the efficiency of vector quantization, ref. [17] leveraged deep neural networks. Owing to the high computational complexity of vector quantization, however, scalar quantization is more suitable for CS measurements. In data compression theory, a scalar quantizer whose output is entropy coded is usually called an entropy-constrained or entropy-coded quantizer [18]. When quantization error is used as the distortion measure, the uniform scalar quantizer is the optimal entropy-coded quantizer in terms of rate-distortion performance [19,20]. In other words, for BCS measurements, the rate-distortion performance of uniform or non-uniform scalar quantization alone is inferior to the joint performance of uniform quantization and entropy coding. Therefore, in current research on CS for images, the measurements are usually quantized with a uniform scalar quantizer [21], and the quantized measurements are entropy coded to improve compression performance [22,23]. However, since the computational cost of entropy coding is usually high [23,24], its use erodes the low-complexity advantage of the CS encoder.
There are two main ways to improve rate-distortion performance: reducing the bitrate while keeping the distortion constant, or reducing the distortion while keeping the bitrate constant. Applying entropy coding to quantized data follows the first way; without entropy coding, the second way is the only route to better rate-distortion performance. There are also strategies aimed at enhancing the compression efficiency of CS itself. For example, ref. [25] introduces Zadoff–Chu sequences, known for their excellent autocorrelation properties, into the measurement matrix; this enhances sparsity in the compressive domain and improves recovery accuracy for quantized CS data. Ref. [26] explores the use of the Discrete Fourier Transform (DFT) for measurement, enabling parallel processing and improving the efficiency of block compressive sensing by exploiting the computational advantages of the FFT while maintaining reconstruction fidelity. Although many approaches can improve compression efficiency at the encoding stage [25,26,27,28], they are not necessarily applicable or effective for QCS.
While the field of BCS has witnessed significant advancements in recent years, traditional quantization techniques employed in image/video coding applications continue to face pivotal challenges. Specifically, uniform quantization, a common choice due to its simplicity and compatibility with entropy coding, often incurs substantial quantization errors, compromising the fidelity of the reconstructed images. Moreover, it tends to overlook the inherent structure and correlations within the data, leading to encoding redundancy and inefficiency. On the other hand, Lloyd–Max quantization, although theoretically optimal for minimizing mean squared error, necessitates computationally intensive offline training and struggles to adapt dynamically to varying image characteristics. Furthermore, the reliance on entropy coding as a supplementary step to mitigate the loss from quantization adds to the computational burden and complexity of the encoding process.
In light of these challenges, our work introduces a novel convolutional neural network (CNN)-based quantization method specifically designed to overcome the drawbacks of traditional approaches. By leveraging the power of deep learning, our method transcends the uniformity imposed by classic quantizers, achieving a more nuanced mapping of measurements that closely follows their underlying distribution. This adaptive quantization, coupled with a sophisticated dequantization mechanism that harnesses correlation from quantized data, significantly reduces quantization errors without resorting to additional entropy coding.
In this paper, we propose a quantization method for BCS measurements that reduces distortion while maintaining the bitrate. On the encoder side, the proposed method models the distribution of the measurements of each image block and then, based on this distribution model, maps the measurements to quantized data that follow a uniform distribution. The distribution parameters of each image block serve as side information of the encoder, and the same strategy is used to quantize this side information with 1 bit. On the decoder side, the proposed method first restores the quantized data to data that conform to the distribution of the original measurements, and then extracts correlation information from the quantized data of adjacent blocks to correct the measurements. Before the measurements are dequantized, the same strategy is used to dequantize the side information. All quantization and dequantization processes are implemented as convolutional neural networks (CNNs), and all networks are jointly learned offline. The CS coding structure based on the proposed method is shown in Figure 1.
The main contributions of this article are as follows:
A CNN-based quantization method for BCS measurements is proposed. The proposed method constructs and jointly trains the CNNs of the quantization and dequantization processes, aiming to maximize the coding output entropy and minimize the quantization error.
A quantization process based on the distribution of the measurements is proposed. Using the properties of the cumulative distribution function (CDF), a neural network model is constructed to map the measurements to quantized data that follow a uniform distribution, which maximizes the amount of information carried in the quantized data (see the illustrative sketch after this list). An activation function with a constrained output range is designed to reduce the computational complexity of the network.
A dequantization process based on the neighborhood information of the measurements is proposed. The inverse of the quantization process is used as a module to restore the quantized data to data that conform to the distribution of the original measurements. In addition, an information correction module is introduced to extract correlation information from multiple quantized values and correct the measurements. The two modules are combined through a residual connection to improve the quality of the dequantized measurements.
The distribution parameters of the block measurements are used as side information, which is quantized with 1 bit by the same quantization process.
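To make the CDF-based mapping concrete, the following is a minimal illustrative sketch rather than the paper's trained network: it assumes, purely for illustration, a Laplace distribution with location and scale parameters for the block measurements, applies the probability integral transform so that the mapped values are approximately uniform on [0, 1], and then applies a uniform scalar quantizer. The functions `laplace_cdf`, `quantize_block`, and `dequantize_block`, as well as the choice of distribution, are assumptions of this sketch.

```python
import numpy as np

def laplace_cdf(y, loc, scale):
    """CDF of a Laplace(loc, scale) distribution, used here as an
    illustrative stand-in for the learned distribution model."""
    z = (y - loc) / scale
    return np.where(z < 0, 0.5 * np.exp(z), 1.0 - 0.5 * np.exp(-z))

def quantize_block(y, loc, scale, bit_depth=8):
    """Map measurements to (approximately) uniform values via the CDF,
    then apply a uniform scalar quantizer with 2**bit_depth levels."""
    levels = 2 ** bit_depth
    u = laplace_cdf(y, loc, scale)            # probability integral transform
    return np.clip(np.floor(u * levels), 0, levels - 1).astype(np.int32)

def dequantize_block(q, loc, scale, bit_depth=8):
    """Inverse mapping: midpoint of each uniform cell, then inverse CDF."""
    levels = 2 ** bit_depth
    u = (q.astype(np.float64) + 0.5) / levels
    # Inverse Laplace CDF (quantile function)
    return loc - scale * np.sign(u - 0.5) * np.log(1.0 - 2.0 * np.abs(u - 0.5))

# Toy usage: one block's measurements with assumed location/scale parameters.
rng = np.random.default_rng(0)
y = rng.laplace(loc=2.0, scale=5.0, size=26)
q = quantize_block(y, loc=2.0, scale=5.0, bit_depth=3)
y_hat = dequantize_block(q, loc=2.0, scale=5.0, bit_depth=3)
print(np.mean((y - y_hat) ** 2))              # quantization MSE for this toy block
```

In the proposed method, the mapping and its inverse are learned by CNNs rather than given in closed form, but the sketch captures the underlying idea: a measurement distribution mapped to a uniform one carries the maximum amount of information per quantized symbol.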
While conventional approaches such as uniform quantization and Lloyd–Max quantization with entropy coding have been widely employed, they often introduce significant quantization errors and encoding inefficiencies. Our work diverges from these methodologies by introducing a CNN-based quantization strategy that not only maps measurements to a uniformly distributed quantized space to maximize the information content but also incorporates a novel dequantization process that leverages correlations from the quantized data to minimize reconstruction errors. This innovative method bypasses the need for entropy coding, offering a more efficient and adaptive solution for BCS applications.
In comparison to uniform quantization, which assigns equal intervals to the entire dynamic range, and Lloyd–Max quantization, known for minimizing the mean squared error but requiring complex optimization, our CNN-based approach dynamically adapts to the underlying data distribution. Unlike entropy coding methods, which reduce redundancy at the expense of computational complexity, our method directly optimizes the quantization process through learning, achieving superior performance without additional encoding steps.
The core of our method lies in the design of the CNN architecture, which jointly learns the quantization and dequantization processes. This contrasts with traditional quantization techniques that rely on predetermined, static decision boundaries. Our network specifically tailors the quantization levels based on the input data’s statistical properties, ensuring a closer match to the original signal characteristics. Additionally, the utilization of correlation information from the quantized data for post-processing further distinguishes our method, leading to reduced quantization artifacts.
The rest of this paper is organized as follows. Section 2 introduces the proposed method, which mainly includes the BCS quantization process, parameter estimation, parameter quantization and dequantization, and the BCS dequantization process. Section 3 presents the experimental results. The conclusion is given in Section 4.
3. Results
In this section, we present experimental results that validate the performance of the proposed method. The method is implemented primarily through CNNs, which requires a training dataset for network training. The training dataset comprises 200 training images taken from the BSDS500 dataset [32]. Each image was cropped into 256 × 256 grayscale patches with a stride of 60 pixels, and the block size used in BCS is 16 × 16. For each image, both the sample and the label in the training set are its matrix of BCS measurements. The measurement matrix used to collect these measurements has a sampling rate of 0.8, so the trained network can be applied to any measurement matrix with a sampling rate lower than 0.8. Each block typically requires at least ten measurements for efficient reconstruction from the BCS measurements, so we take ten as the number of partial measurements.
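For illustration, the following sketch generates one training sample of BCS measurements under the setup described above (16 × 16 blocks, sampling rate 0.8). The random Gaussian measurement matrix and the function name `bcs_measure` are assumptions of this sketch; the paper only requires that the sampling rate not exceed 0.8.

```python
import numpy as np

def bcs_measure(image, block_size=16, sampling_rate=0.8, seed=0):
    """Split a grayscale image into non-overlapping blocks and measure each
    block with the same matrix Phi (rows = round(sampling_rate * block_size**2)).
    The Gaussian Phi is an illustrative assumption, not the paper's choice."""
    h, w = image.shape
    n = block_size * block_size
    m = int(round(sampling_rate * n))
    rng = np.random.default_rng(seed)
    phi = rng.standard_normal((m, n)) / np.sqrt(m)

    measurements = []
    for i in range(0, h - block_size + 1, block_size):
        for j in range(0, w - block_size + 1, block_size):
            block = image[i:i + block_size, j:j + block_size].reshape(-1)
            measurements.append(phi @ block)
    # One column per block: an (m x num_blocks) measurement matrix for the image.
    return np.stack(measurements, axis=1)

# Toy usage on a random 256 x 256 "image".
img = np.random.default_rng(1).random((256, 256)).astype(np.float64)
Y = bcs_measure(img)
print(Y.shape)   # (measurements_per_block, num_blocks)
```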
All CNNs were implemented in the PyTorch framework. The CNNs of the quantization and dequantization processes were trained jointly. The batch size was set to 32, and optimization was performed with the Adam algorithm, initialized with a learning rate of 0.001. After the initial 10,000 training epochs, the learning rate was reduced by a factor of 10, and all networks were trained for an additional 20,000 epochs. Training was conducted on a server with an Intel Xeon CPU, an Nvidia RTX 2080Ti GPU with 11 GB of memory, and 128 GB of DDR4 RAM. The test images consist of APC, aerial, airplane, airport, building, moon surface, tank, and truck, as illustrated in Figure 12. The publicly available datasets Set5 [33] (5 images), Set11 [34] (11 images), Set14 [35] (14 images), and BSD68 [36] (68 images) were also employed. All experiments were run on an Intel Core i5-8300H CPU @ 2.30 GHz, and the proposed method's performance was measured using the PSNR of the reconstructed images.
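The optimization schedule described above can be set up in PyTorch roughly as follows; this is a minimal sketch in which `quant_net` and `dequant_net` are placeholder modules standing in for the paper's quantization and dequantization networks, and the loss computation is omitted.

```python
import torch
import torch.nn as nn

# Placeholder modules standing in for the actual quantization/dequantization CNNs.
quant_net = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.LeakyReLU(), nn.Conv2d(16, 1, 3, padding=1))
dequant_net = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.LeakyReLU(), nn.Conv2d(16, 1, 3, padding=1))

params = list(quant_net.parameters()) + list(dequant_net.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)   # initial learning rate 0.001

# Reduce the learning rate by a factor of 10 after the first 10,000 epochs,
# then continue for another 20,000 epochs (30,000 in total).
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[10_000], gamma=0.1)

for epoch in range(30_000):
    # ... iterate over batches of size 32, compute the joint loss, then:
    # optimizer.zero_grad(); loss.backward(); optimizer.step()
    scheduler.step()
```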
3.1. Analysis of Measurement Reconstruction Results
In this section, we analyze the number of reconstruction levels and the quantization errors of the quantization methods.
Current studies on improving CS quantization methods typically focus on sparse signals, and these methods are not suitable for images, which contain a large number of elements. To keep the encoder complexity low, BCS encoders usually process the measurements with uniform quantization followed by entropy coding. In addition, the uniform quantizer is regarded in data compression theory as the optimal quantizer for entropy-coded quantization [23,24], which is why BCS encoders tend to pair uniform quantization with entropy coding. Currently, the most advanced quantization techniques for BCS of images are considered to be the prediction quantization method [9] and the progressive quantization method [10]. However, these essentially improve the coding strategy while still employing uniform quantization, so improvements can be explored by comparing the commonly used quantization methods with the proposed method. To simplify the experiments, this paper compares only the quantization techniques, using entropy-coded uniform quantization, µ-law quantization [37,38], and Lloyd–Max quantization [39,40] as benchmarks. Entropy-coded uniform quantization refers to applying entropy coding after uniform quantization of the measurements. The codebook for Lloyd–Max quantization was obtained through offline training.
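For reference, the µ-law benchmark is a standard companding quantizer: measurements are compressed with the µ-law characteristic, quantized uniformly, and expanded at the decoder. A minimal sketch is given below, assuming inputs normalized to [-1, 1] and µ = 255 (a common but here assumed choice, not necessarily the exact setting used in [37,38]).

```python
import numpy as np

def mu_law_quantize(x, mu=255.0, bit_depth=8):
    """Compress x in [-1, 1] with the mu-law characteristic, then apply a
    uniform scalar quantizer with 2**bit_depth levels."""
    levels = 2 ** bit_depth
    compressed = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)   # in [-1, 1]
    q = np.clip(np.floor((compressed + 1.0) / 2.0 * levels), 0, levels - 1)
    return q.astype(np.int32)

def mu_law_dequantize(q, mu=255.0, bit_depth=8):
    """Map indices back to [-1, 1] cell midpoints and expand with the inverse mu-law."""
    levels = 2 ** bit_depth
    compressed = (q.astype(np.float64) + 0.5) / levels * 2.0 - 1.0
    return np.sign(compressed) * ((1.0 + mu) ** np.abs(compressed) - 1.0) / mu

# Toy usage on normalized measurements.
y = np.clip(np.random.default_rng(0).laplace(scale=0.2, size=1000), -1, 1)
y_hat = mu_law_dequantize(mu_law_quantize(y, bit_depth=3), bit_depth=3)
print(np.mean((y - y_hat) ** 2))
```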
In scalar quantization methods, the number of reconstruction levels is typically equal to the number of quantization levels, as shown in Table 1. Table 2 shows the number of reconstruction levels of the proposed method for the eight test images at a measurement rate of 0.2. Comparing Table 1 and Table 2 shows that the proposed method produces more distinct values in the dequantized measurements. This is mainly because the proposed method uses the information of multiple quantized values during dequantization, which gives it the advantage of a many-to-many mapping. Moreover, each row of the measurement matrix adopts different maximum and minimum values for local denormalization, and this local normalization approach also increases the number of reconstruction levels as the measurement rate increases.
The greater the number of reconstruction levels of the dequantized measurements, the more the quantization error can be reduced. Table 3, Table 4 and Table 5 list the MSE of the various quantization methods for measurements quantized with 3, 6, and 8 bits, respectively. Table 3 shows that with 3-bit quantization, the proposed method reduces the MSE by 788.07, 670.48, and 585.35 compared with uniform quantization, µ-law quantization, and Lloyd–Max quantization, respectively. Similarly, Table 4 shows that with 6-bit quantization the proposed method reduces the MSE by 8.25, 6.77, and 10.66, and Table 5 shows that with 8-bit quantization it reduces the MSE by 0.4665, 0.3765, and 1.8162 compared with uniform quantization, µ-law quantization, and Lloyd–Max quantization, respectively. Table 3, Table 4 and Table 5 together demonstrate that the proposed method achieves a significantly lower MSE than the other quantization methods.
3.2. Analysis of the Impact of Entropy Loss Constraints
In this section, we analyze the effect of the entropy constraint. The weighting parameter in the loss function determines the degree of the entropy constraint. To determine an appropriate value for this parameter, we select a few common values and train the CNNs of the quantization and dequantization processes with each of them. Table 6 shows the MSE of the dequantized measurements and the information entropy of the quantized measurements when 8 bits are used to quantize the measurements of the BSDS500 dataset.
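For reference, the two quantities reported in Table 6, namely the MSE of the dequantized measurements and the information entropy of the quantized measurements, can be computed as in the following generic sketch (this illustrates the evaluation metrics only, not the training loss itself).

```python
import numpy as np

def measurement_mse(y, y_dequantized):
    """Mean squared error between original and dequantized measurements."""
    return float(np.mean((np.asarray(y) - np.asarray(y_dequantized)) ** 2))

def empirical_entropy_bits(q):
    """Empirical entropy (bits/symbol) of quantized measurements; values close
    to the bit-depth indicate that entropy coding would gain little."""
    _, counts = np.unique(np.asarray(q).ravel(), return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

# Toy usage with 8-bit symbols.
rng = np.random.default_rng(0)
q = rng.integers(0, 256, size=10_000)
print(empirical_entropy_bits(q))   # close to 8 bits for (near-)uniform symbols
```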
It can be seen from Table 6 that the entropy constraint increases the information entropy of the quantized measurements but has only a slight impact on reducing the MSE of the dequantized measurements. The MSE of the dequantized measurements is smallest for one particular value of the constraint weight, which is therefore the value we use when training the proposed network.
In addition, it can be observed that the entropies of the quantized measurements are very close to the bit-depth, so entropy coding with a fixed code table would yield little or no compression for some images. In other words, the measurements quantized by the proposed method do not require entropy coding.
3.3. Analysis of the Impact of the Measurement Information Correction Module
In this section, we analyze the impact of the information correction module on the dequantization process. Table 7 shows the quantization performance on the BSDS500 dataset when different information correction modules are used in the proposed method. In Table 7, Contrast Scheme 1 uses no measurement information correction module, Contrast Scheme 2 uses a correction module composed of three convolutional layers, and Contrast Scheme 3 uses a correction module composed of six convolutional layers. Table 7 shows that the proposed method has significant advantages over the µ-law quantization method. Compared with Contrast Scheme 1, the MSE of Contrast Scheme 2 is reduced by 0.0271 while its entropy is increased by 0.067; similarly, the MSE of Contrast Scheme 3 is reduced by 0.0457 and its entropy is increased by 0.1616. These results show that the information correction module effectively improves the quantization performance, and its correction capability strengthens as the number of convolutional layers increases.
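A minimal sketch of what such a correction module could look like is given below: a stack of convolutional layers with a residual connection, in the spirit of Contrast Schemes 2 and 3. The channel width, kernel size, and input layout are assumptions of this sketch, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class CorrectionModule(nn.Module):
    """Illustrative measurement information correction module: num_layers
    convolutions (3 or 6 in Contrast Schemes 2 and 3) with a residual connection."""
    def __init__(self, in_channels=1, width=32, num_layers=3):
        super().__init__()
        layers, c = [], in_channels
        for _ in range(num_layers - 1):
            layers += [nn.Conv2d(c, width, kernel_size=3, padding=1), nn.LeakyReLU()]
            c = width
        layers += [nn.Conv2d(c, in_channels, kernel_size=3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        # Residual connection: the module predicts a correction to the
        # coarsely dequantized measurements rather than replacing them.
        return x + self.body(x)

# Toy usage: a batch of dequantized measurement maps (1 channel).
x = torch.randn(4, 1, 16, 16)
y = CorrectionModule(num_layers=3)(x)
print(y.shape)   # torch.Size([4, 1, 16, 16])
```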
3.4. Rate-Distortion Performance Comparison
In this section, we compare the rate-distortion performance of the proposed method with entropy-coded uniform quantization, µ-law quantization, and Lloyd–Max quantization. Entropy-coded uniform quantization refers to applying entropy coding after uniform quantization of the measurements and is denoted “uniform quantization + entropy coding” in this paper. To draw the rate-distortion curve, we traverse multiple quantization bit-depths and sampling rates to encode and decode the test images, then choose the optimum Bitrate-PSNR points and connect them with a line. The bit-depth takes the seven values {2, 3, 4, …, 8}, and the sampling rate takes the 77 values {0.04, 0.05, 0.06, …, 0.8}. The image reconstruction algorithm is BCS-SPL-DCT [41]. When calculating the bitrate of “uniform quantization + entropy coding”, the average codeword length of the entropy code is replaced by the information entropy.
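One way to read “choose the optimum Bitrate-PSNR points” is to keep only the Pareto-optimal (bitrate, PSNR) pairs among all traversed (bit-depth, sampling rate) combinations, as in the following sketch; this selection rule is an assumption for illustration.

```python
def pareto_points(points):
    """Keep (bitrate, psnr) pairs that are not dominated, i.e. no other pair
    offers higher PSNR at a lower-or-equal bitrate."""
    pts = sorted(points, key=lambda p: (p[0], -p[1]))   # bitrate ascending, PSNR descending
    best, best_psnr = [], float("-inf")
    for rate, psnr in pts:
        if psnr > best_psnr:          # strictly better than every cheaper (or equal-rate) point
            best.append((rate, psnr))
            best_psnr = psnr
    return best

# Toy usage: candidates from different (bit-depth, sampling-rate) combinations.
candidates = [(0.10, 19.7), (0.12, 19.2), (0.20, 21.3), (0.18, 20.1), (0.20, 20.9)]
print(pareto_points(candidates))      # [(0.1, 19.7), (0.18, 20.1), (0.2, 21.3)]
```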
Figure 13 shows the PSNR curves of the eight test images. In Figure 13, the proposed method has the best rate-distortion performance on all eight test images, particularly for the aerial, building, and tank images. The PSNR curve of “uniform quantization + entropy coding” is better than those of the µ-law and Lloyd–Max quantization methods, which confirms that existing quantizers without entropy coding have inferior rate-distortion performance compared with “uniform quantization + entropy coding”.
Figure 14 shows the reconstructions of the eight test images produced by the different methods at a compression bit rate of 0.1 bpp. The Lloyd–Max quantization approach generates an adaptive quantization dictionary for each image; since we do not count the bits of this dictionary in its compression bit rate, the Lloyd–Max results in Figure 14 represent the best case for the conventional quantizer. As shown in Figure 14, the proposed method achieves the best visual quality and PSNR, followed by “uniform quantization + entropy coding” and Lloyd–Max quantization. Compared with “uniform quantization + entropy coding”, the PSNR of the proposed method increases by 0.65 dB, 0.44 dB, 1.97 dB, 0.02 dB, 0.46 dB, 0.09 dB, 0.37 dB, and 0.29 dB on the eight test images, respectively. Compared with Lloyd–Max quantization, the PSNR increases by 2.1 dB, 0.75 dB, 1.8 dB, 0.28 dB, 0.78 dB, 1.55 dB, 1.76 dB, and 1.53 dB, respectively.
The four quantization methods were also tested on the four test image datasets, and the average PSNR curves are shown in Figure 15. In Figure 15, the average PSNR is the mean PSNR of the reconstructed images at a given bit rate over all images in the dataset. The PSNR at a given bit rate is obtained by linear interpolation from each image's Bitrate-PSNR curve, and the given bit rates are {0.1, 0.2, …, 1 bpp}. For Set5, Set11, Set14, and BSD68, the average PSNR curve of the proposed method is better than those of “uniform quantization + entropy coding”, µ-law quantization, and Lloyd–Max quantization. In particular, at a low bit rate (around 0.1 bpp), the proposed method's PSNR is much higher than that of the other methods.
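The averaging procedure described above amounts to interpolating each image's Bitrate-PSNR curve at a common grid of bit rates and then averaging, as in this minimal sketch (the function name and data layout are illustrative).

```python
import numpy as np

def average_psnr_curve(per_image_curves, grid=np.arange(0.1, 1.01, 0.1)):
    """per_image_curves: list of (bitrates, psnrs) pairs, one per image.
    Each curve is linearly interpolated at the common bit-rate grid, then averaged."""
    interpolated = []
    for rates, psnrs in per_image_curves:
        order = np.argsort(rates)   # np.interp needs increasing x values
        interpolated.append(np.interp(grid, np.asarray(rates)[order], np.asarray(psnrs)[order]))
    return grid, np.mean(interpolated, axis=0)

# Toy usage with two hypothetical images.
curves = [([0.05, 0.3, 0.9], [17.0, 22.0, 28.0]),
          ([0.08, 0.5, 1.2], [18.0, 24.0, 30.0])]
grid, avg = average_psnr_curve(curves)
print(avg.round(2))
```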
The test set was further extended with the SunHays80 [42] and Urban100 [43] datasets (all images converted to 256 × 256 grayscale). Reconstruction quality is evaluated by the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) between the reconstructed image and the original image. Table 8 shows the PSNRs and SSIMs of the four datasets at a bit rate of 0.1 bpp, and Table 9 shows the PSNRs and SSIMs of the four datasets at a bit rate of 0.2 bpp.
For all images of the six datasets, when the bitrate is set to 0.1 bpp, the proposed method, “uniform quantization + entropy coding,” µ-law quantization, and Lloyd–Max quantization achieve average PSNRs of 19.69 dB, 19.24 dB, 17.54 dB, and 18.67 dB, respectively. Compared with “uniform quantization + entropy coding,” the proposed method improves the PSNR by an average of 0.45 dB without entropy coding. The proposed method, “uniform quantization + entropy coding,” µ-law quantization, and Lloyd–Max quantization achieve average SSIMs of 0.1855, 0.1739, 0.1408, and 0.1547, respectively. Compared with “uniform quantization + entropy coding,” the proposed method improves the SSIM by an average of 0.0116 without entropy coding.
For all images of the six datasets, when the bitrate is set to 0.2 bpp, the proposed method, “uniform quantization + entropy coding,” µ-law quantization, and Lloyd–Max quantization achieve average PSNRs of 21.27 dB, 21.09 dB, 20.88 dB, and 20.93 dB, respectively. Compared with “uniform quantization + entropy coding,” the proposed method improves the PSNR by an average of 0.18 dB without entropy coding. The proposed method, “uniform quantization + entropy coding,” µ-law quantization, and Lloyd–Max quantization achieve average SSIMs of 0.2738, 0.2683, 0.2543, and 0.2567, respectively. Compared with “uniform quantization + entropy coding,” the proposed method improves the SSIM by an average of 0.0055 without entropy coding.
Across all images in the six datasets, the proposed method demonstrates superior performance compared to “uniform quantization + entropy coding,” as well as the µ-law and Lloyd–Max quantization schemes.
3.5. Analysis of Computational Complexity
On the encoding side, the computation of the proposed quantization method involves four networks: the position parameter estimation network, the scale parameter estimation network, the parameter quantization network, and the measurement quantization network. The position parameter estimation network and the scale parameter estimation network share an identical structure, as shown in Table 10. Similarly, the parameter quantization network and the measurement quantization network share an identical structure, as shown in Table 11.
The position and scale parameters are derived from the partial measurements. According to Table 10, each of the three convolutional layers requires a fixed number of multiplications and additions per block; summing over the layers gives the total numbers of multiplications, additions, and LeakyReLU operations needed for the position and scale parameter estimation.
According to Table 11, quantizing a single parameter or measurement with the quantization network requires about 48 multiplications, 48 additions, 12 LeakyReLU operations, and one evaluation of the output activation function. For the measurement matrix of an image, the total cost follows by multiplying these per-value counts by the number of measurements and the number of parameters to be quantized.
The proposed method requires approximately the same number of multiplications and additions, and the activation functions involve only linear operations. Since addition is much faster than multiplication in practice, we compare only the number of multiplications. With an image size of 256 × 256 and a block size of 16 × 16, there are 256 blocks. Assuming a BCS measurement rate of 0.1, each block yields 26 measurements, and each measurement needs about 256 multiplications and 255 additions, so computing the BCS measurements takes roughly 256 × 26 × 256 ≈ 1.7 × 10^6 multiplications and a similar number of additions. Summing the multiplication, addition, LeakyReLU, and activation-function counts of the proposed quantization networks and comparing them with the cost of computing the BCS measurements, the proposed quantization process requires only about 9.92% of the multiplications needed for the BCS measurements.
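The arithmetic behind this comparison can be reproduced with the short sketch below; the quantization-side count is inferred from the stated 9.92% ratio rather than recomputed from the per-layer figures in Tables 10 and 11.

```python
# Back-of-the-envelope check of the complexity comparison in Section 3.5.
image_size, block_size = 256, 16
blocks = (image_size // block_size) ** 2          # 256 blocks
measurements_per_block = 26                       # measurement rate 0.1 (26 of 256 coefficients)
mults_per_measurement = block_size * block_size   # 256 multiplications per measurement

bcs_mults = blocks * measurements_per_block * mults_per_measurement
print(bcs_mults)                                  # 1,703,936 multiplications for all BCS measurements

# The paper reports that the proposed quantization process costs about 9.92%
# of the BCS measurement computation, i.e. roughly:
print(round(0.0992 * bcs_mults))                  # ~169,030 multiplications
```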