1. Introduction
Digital images are quantized signals, and quantization introduces quantization error. De-quantization approximates the original signal based on the digital samples. Most of the time, the pixel intensities of digital images are represented with integer values in the range $[0, 2^{n}-1]$, where $n$ is called the bit depth. Low bit depth images have fewer levels than the display used for presentation. De-quantization aims to decrease artifacts originating from the quantization process. The terms bit-depth extension or enhancement are also used to describe de-quantization.
One of the most disturbing artifacts of low bit depth images is the appearance of false contours. For instance, a sunset photo shows a large sky area, but the low number of available shades leads to large homogeneous regions, and a disturbing contour appears at the border between two regions (see
Figure 1).
There are several ways to mitigate the problem. Dithering [
1] was invented to de-correlate the quantization error from the original signal. This is a very effective technique and is still widely used.
Another class of low bit depth images was created using limited-precision analog-to-digital converters and sensors, which led to rounding or truncation. A large amount of old content exists in such low bit depth formats where a high-quality source is not available, and the only way to improve quality is de-quantization and bit depth expansion.
Theoretically, both dithered and non-dithered images could be subjected to bit depth expansion, but the number of possible dithering methods is infinite, and every dithering method would require a dedicated algorithm. Most papers, including the present one, focus on non-dithered, truncated, or rounded images.
2. Related Work
Consider an image that is discretized using $n$ bits. Any value between two discrete levels is rounded; the rounding strategy can be the nearest integer or the nearest integer smaller/larger than the signal. In both cases, the original signal ($f$) is rounded to a discretized signal ($g$). The difference ($\Delta = f - g$) is the lost information, which is the target of the recovery operation. Using the rounding operator $Q(\cdot)$ and spatial coordinates $(x, y)$, the problem is as follows:
$$g(x, y) = Q\big(f(x, y)\big), \qquad \Delta(x, y) = f(x, y) - g(x, y).$$
The lost signal must remain within certain limits. If the rounding operator is “rounding to the nearest integer”, then the limit is $\pm\tfrac{1}{2}$ unit, but the operator could also be “rounding down” or another scheme. The general formulation is:
$$\Delta_{\min}(x, y) \le f(x, y) - g(x, y) \le \Delta_{\max}(x, y),$$
where the bounds depend on the rounding operator.
What is the approximation that makes the recovered signal the most plausible, most appealing?
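For illustration, a minimal Python sketch of this forward (degradation) model, assuming an 8-bit source truncated to $n$ bits; the function name and scaling are illustrative choices, not taken from the paper:

```python
import numpy as np

def reduce_bit_depth(f, n, source_bits=8):
    """Truncate a source_bits-deep image f to n bits by dropping the
    least significant bits; the result g lies in [0, 2**n - 1]."""
    return f.astype(np.uint16) >> (source_bits - n)

# The lost signal, f / 2**(source_bits - n) - g, stays within one
# quantization step, which is what the recovery must respect.
```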
Early strategies utilized blurring for false contour removal. For instance, Nishihara and Keda [
2] used random blurring. Blurring and exploitation of the human visual system’s properties remained a research interest, and led to low- and high-frequency separation techniques and adaptive cancellation, as reviewed by Daly and Feng [
3]. The most challenging issue with this approach is the fact that blurring might also affect real edges. The separation of false contours from real edges is a very difficult problem on its own.
The most recent approaches describe the false contour removal as a pure mathematical de-quantization challenge, and they do not aim to model the human visual system. Cheng’s algorithm [
4] uses flooding-based downward/upward distance maps to calculate a step ratio and interpolate linearly. This method achieves very nice results for gradients, but fails to recover the local maximal/minimal (LMM) regions.
Wan et al. [
5] developed a content-adaptive method using a virtual skeleton marking (VSM) model which transforms the 2D extrapolation problem into a 1D interpolation, but it is still based on linear interpolation. In a follow-up paper [
6], a spatially varying sparsity-based de-quantization method was published, based on the observation that the second derivatives are positive/negative in the LMM regions and approximately zero in the other regions. The derived model is a Tikhonov regularization-based anisotropic total variation minimization problem.
A similar approach was used by Ouyang [
7], who developed a bounded interval regularization method, where the recovered signal must remain within limits, which is enforced with a penalty term outside the limits. Inside the allowed interval, the algorithm minimizes an energy function that is quadratic in the absolute value of the gradient ($|\nabla f|$). This formulation is equivalent to a bounded Laplace problem, where internal points (those not exceeding the limits) satisfy the Laplace equation:
$$\Delta f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0.$$
This partial differential equation approach has several advantages, including the possibility of handling non-uniform quantization levels. However, it fails to reconstruct LMM regions, and perceived smoothness of the image might be low when the original signal’s second derivative is significantly different from zero.
Wan et al. [
8] developed an AC/DC separation model with maximum a posteriori AC estimation, where DC denotes the mean of the signal and AC is the difference between the original signal and the DC component. They viewed the problem from a minimum mean square error perspective and used a graph smoothness prior. The method works on blocks, with overlapping blocks used to minimize artifacts at block edges.
3. Problem Statement
Partial differential equation-based signal de-quantization is a very appealing approach, but we identified the following areas where the previous approaches could be improved:
handling of non-continuous regions,
incorporation of available additional information into the model,
higher-order signal approximation for non-LMM regions.
Non-continuous regions appear when a foreground object splits the background into segments. Each segment belongs to the same underlying object (e.g., sky), and the false contours should be removed from all segments. The segments should match as if there was no foreground. See
Figure 1 for an example.
The incorporation of available information means that sometimes additional information can be used to get a better approximation. Consider the following example: ski trails in white snow or a thin rope in a sunset have low contrast, and the difference between the foreground object and background might be just one discretization level. In these cases, it is very important to keep these edges, and removing them would degrade the impression of the images.
Finally, low-order signal approximation for medium intensities might cause a second-order false contour. By second-order, we mean that the intensity change is continuous, but the first derivative has a discontinuity and the observer might detect a break in the pace of lightness change.
4. Material and Methods
4.1. Plate Bending Model
The solution of the Laplace equation is similar to a stretched rubber membrane, which has no stiffness. This is obviously a smooth solution, but it is also known that such a solution always attains its maximum (and minimum) values at the borders.
Reconstructing the LMM regions would be possible if the gradients from the internal regions were smoothly continued into the LMM regions, just as a bent beam continues its shape beyond the points where external forces bend it. This observation inspired a linear elasticity-based [
9] approach:
The signal is modeled as a thin plate.
The maximum and minimum allowed values deform this plate.
The stationary solution gives the approximation of the signal.
One simple model is the pure bending model, which can be described with the biharmonic equation:
$$\Delta^2 f = \frac{\partial^4 f}{\partial x^4} + 2\,\frac{\partial^4 f}{\partial x^2 \partial y^2} + \frac{\partial^4 f}{\partial y^4} = 0.$$
4.2. Boundary Conditions
The quantization error of the signal can be taken into account with an upper ($U$) and a lower envelope ($L$). The original signal cannot be larger than the upper envelope and cannot be smaller than the lower envelope:
$$L(x, y) \le f(x, y) \le U(x, y).$$
The distance between the envelopes does not have to be constant. If there is a priori knowledge about the signal, or the quantization levels are uneven, then appropriately chosen envelopes can take care of this. For instance, if the two envelopes touch each other, then this is equivalent to a Dirichlet boundary condition.
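A minimal sketch of envelope construction, assuming rounding to the nearest level and a quantization step of one unit; the helper name is illustrative:

```python
import numpy as np

def build_envelopes(g, step=1.0):
    """Upper/lower envelopes for a rounded-to-nearest quantized image g."""
    lower = g - 0.5 * step   # the original cannot be smaller than this
    upper = g + 0.5 * step   # ...nor larger than this
    return lower, upper

# For truncation ("rounding down") the interval is one-sided instead:
#     lower, upper = g, g + step
# Setting lower == upper at selected pixels acts as a Dirichlet condition
# that pins the reconstruction to known values there.
```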
Numerical solution becomes much more tractable if mirrored periodic boundary conditions are assumed. This is a frequent assumption which does not limit the generality of the approach. Both “odd” and “even” mirroring could be used. In our examples, we used “odd” mirroring, which subtracts the reflected values from twice the edge value. Odd mirroring is depicted in
Figure 2.
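Odd mirroring is readily available in NumPy; a small sketch (the pad width of 2 matches the 5 × 5 biharmonic stencil used later):

```python
import numpy as np

row = np.array([1.0, 2.0, 3.0, 4.0])
padded = np.pad(row, 2, mode="reflect", reflect_type="odd")
# padded -> [-1.  0.  1.  2.  3.  4.  5.  6.]
# Each mirrored value equals twice the edge value minus the reflected
# value, so the signal continues linearly across the border instead of
# folding back on itself as with "even" mirroring.
```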
Non-Continuous Regions
The envelopes should only be defined for the regions of interest. In irrelevant regions, the envelopes should take the maximum/minimum allowed values. Using the plate analogy, supporting points (defined envelope values) can bend the plate. In irrelevant regions there is no external force; the plate remains continuous, governed by the biharmonic equation. We use a binary mask to define regions which should not be processed and where the envelope limits should be removed. After solving the numerical problem, these regions are kept intact from the low-bit-depth input image.
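A minimal sketch of this masking step, assuming `mask` is True where reconstruction is wanted; the variable names and the 8-bit range are illustrative:

```python
import numpy as np

def mask_envelopes(lower, upper, mask, vmin=0.0, vmax=255.0):
    """Remove the envelope limits outside the region of interest."""
    lower = np.where(mask, lower, vmin)   # no lower support where masked out
    upper = np.where(mask, upper, vmax)   # no upper support where masked out
    return lower, upper

# After the iterative solution the masked-out pixels are restored from
# the low-bit-depth input:  f[~mask] = g[~mask]
```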
4.3. Numerical Solution
The non-continuous regions and the additional information are taken into account through the upper–lower limits, and the higher-order smoothness is guaranteed by the biharmonic equation.
A straightforward solution can be implemented using finite difference schemes with a gradient descent method. However, a clipping step is required after every iteration: if a value is lower than the lower envelope or higher than the upper envelope at a given point, then the approximation is set equal to the corresponding envelope. The Laplace and biharmonic equations belong to the polyharmonic equations, and they can be described with similar equations where the difference is the order ($p$) of the Laplace operator, which is 1 for the Laplace and 2 for the biharmonic operator. The gradient for the time step is as follows:
$$\frac{\partial f}{\partial t} = (-1)^{p+1}\,\Delta^{p} f.$$
While initially the solution function ($f$) does not satisfy the Laplace/biharmonic equation, evolving it with small time steps converges it to the solution. In this formulation, the solution of the problem is obtained in the limit $t \to \infty$.
During the discretization of the partial differential equations, we assume that the pixel sizes are equal in both dimensions and that the lowest-order central difference schemes are satisfactory for the numerical solution. With these assumptions, we use the following stencils approximating the Laplace and biharmonic operators:
$$\Delta \approx \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \qquad \Delta^2 \approx \begin{bmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 2 & -8 & 2 & 0 \\ 1 & -8 & 20 & -8 & 1 \\ 0 & 2 & -8 & 2 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}.$$
These stencils can be used everywhere in the image, including the edges, due to the periodic boundary conditions.
The convergence rate of this method is slow. To reduce execution time, the implementation uses a multi-grid approach, where one level means a factor of 2 change in sampling points along an axis. First, both the envelopes and the initial image are downscaled. The image pixels take the average value of the higher-resolution pixels, the lower envelope takes the minimum of the higher-resolution envelope, and the upper envelope takes their maximum. This lower-resolution problem is solved first. When the approximation reaches convergence, bicubic resampling is used to upscale the approximate solution, which serves as input for the next level. The pseudo-code of the iteration scheme is as follows:
1. Load the initial image.
2. If the smallest edge is longer than 31 pixels and divisible by two:
   - 2a. Downsample the problem by a factor of 2:
     - lower envelope: minimum of the grid points;
     - upper envelope: maximum of the grid points;
     - image: average of the grid points.
   - 2b. Solve the downsampled problem (jump to Step 1).
   - 2c. Upscale the solution by a factor of two using bicubic interpolation.
   - 2d. Use the upscaled image as the initial image for the following steps.
3. Calculate the gradient for the time step.
4. Update the image.
5. Clip the image where it exceeds the limits (envelopes).
6. Continue with Step 3 if the iteration number is less than the specified maximum.
7. Return the image to the higher level or as the final solution.
Alternatively, any other de-quantization method could be used as an initial input instead of Steps 2a–2d, and the algorithm should converge to the same solution.
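As a concrete illustration, a single-level Python sketch of Steps 3–7 (the multi-grid part, Steps 2a–2d, is omitted; the step size `tau`, the iteration count, and the SciPy-based implementation are our illustrative choices, not the paper’s exact implementation):

```python
import numpy as np
from scipy.signal import convolve2d

# 13-point stencil of the biharmonic operator (lowest-order central differences).
BIHARMONIC_STENCIL = np.array([
    [0,  0,  1,  0, 0],
    [0,  2, -8,  2, 0],
    [1, -8, 20, -8, 1],
    [0,  2, -8,  2, 0],
    [0,  0,  1,  0, 0],
], dtype=float)

def biharmonic(f):
    """Apply the stencil with 'odd' mirrored boundary handling."""
    padded = np.pad(f, 2, mode="reflect", reflect_type="odd")
    return convolve2d(padded, BIHARMONIC_STENCIL, mode="valid")

def dequantize(g, lower, upper, tau=0.01, iterations=5000):
    """Gradient descent on the plate-bending energy with clipping to the
    [lower, upper] envelopes after every step (Steps 3-6)."""
    f = g.astype(float).copy()
    for _ in range(iterations):
        f -= tau * biharmonic(f)          # Steps 3-4: df/dt = -laplacian^2 f
        np.clip(f, lower, upper, out=f)   # Step 5: enforce the envelopes
    return f                              # Step 7

# Usage sketch for a rounded-to-nearest input:
#     lower, upper = g - 0.5, g + 0.5
#     f_rec = dequantize(g, lower, upper)
```

Note that the explicit time step must remain small for stability of the fourth-order scheme; the coarse-to-fine initialization mainly serves to reduce the number of fine-level iterations.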
5. Results
One-dimensional comparison can be seen in
Figure 3 for demonstration purposes. The Laplace solution cuts the top and bottom of the dynamic range, while a higher-order solution allows overshoot and reconstructs values in the LMM regions.
Figure 4 shows two signals: a Gaussian and a Rosenbrock function [
10]. The Gaussian does not have saddle points, while the Rosenbrock function is a frequently used test function in optimization due to its saddle-like shape with a relatively flat valley in the middle. The sub-figures show the high-bit-depth original, the reduced-bit-depth version, and the Laplace- and biharmonic-reconstructed solutions. The advantages of the higher-order approximation are obvious in the figure: the maximal region is smoothly recovered, and there is no second-order false contour.
Table 1 compares peak signal-to-noise ratio (PSNR) [
11], normalized root mean square error (NRMSE), and structural similarity index (SSIM) [
12] for a few selected bit depths (2, 3, and 5 bits) of the Gaussian and Rosenbrock images extended with zero padding, with the Laplacian method, and with the proposed method.
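For reference, such metrics can be computed, for example, with scikit-image; the sketch below uses synthetic data and only illustrates the API calls:

```python
import numpy as np
from skimage.metrics import (peak_signal_noise_ratio,
                             normalized_root_mse,
                             structural_similarity)

# Synthetic stand-ins for the original and reconstructed images.
rng = np.random.default_rng(0)
original = rng.uniform(0, 255, size=(64, 64))
reconstructed = original + rng.normal(0, 2, size=original.shape)

psnr = peak_signal_noise_ratio(original, reconstructed, data_range=255)
nrmse = normalized_root_mse(original, reconstructed)
ssim = structural_similarity(original, reconstructed, data_range=255)
```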
Finally, an image of the night sky with the moon was used to demonstrate masking. When multiple objects are present in the scene, the transitions lead to overshoots; this is expected, since the bent plate tries to keep its shape at the edges. However, if the image is segmented into parts, then reconstruction can be applied partially, ignoring highly textured areas. Similarly, if neighboring pixels differ by more than one discretization unit, then it can be assumed that they belong to different image components, and the assumption about signal smoothness does not hold. For an example, see the moon in
Figure 5. If these regions (moon and sky) were not handled separately, then due to the stiffness of the plate in the model, the moon would generate a halo around itself. The synthetic examples were generated and the photo was taken by the first author.
6. Discussion
One advantage of the biharmonic model is that it yields smooth surfaces. On the other hand, the bending model has an issue: it tries to continue the surface. This is very useful for large, related, but disconnected regions, but it leads to inferior image quality at the border of two different regions (e.g., hair and skin). This is easily understandable from the plate analogy. This is the trade-off for the handling of unconnected regions and higher-order approximation.
This could be mitigated with carefully chosen image masks, where large flat regions are marked for reconstruction but textured regions are left intact. For instance, neighboring regions within a mask cannot differ by more than one unit; otherwise, the difference does not originate from quantization error.
Note that limited applicability is a general phenomenon. The binary image is an extreme form of low-bit-depth images. The Laplace algorithm, Cheng’s algorithm, and the proposed algorithm would assign a single value to every pixel in a binary image. This happens because the extreme conditions violate (implicit) model assumptions.
6.1. Dither and Noise
Another general challenge for bit depth enhancement is noise and artifacts from dithering. This affects every method, but more advanced methods tend to use more information from the image. Unfortunately, this also means higher sensitivity to noise.
At the beginning of the article it was assumed that the image was not subject to dithering. However, if it was, this could generate serious artifacts: close to an edge, a dithering algorithm generates points from both shades. There is a function that fits this data nicely: a flat surface equidistant from both quantization levels. This would lead to an image with a flat region in the neighborhood of the edges, which is exactly the opposite of the original aim. Further away from the edges, a smooth transition is achieved.
This artifact could be mitigated with relaxed upper and lower envelopes. The original assumption was that the rounding/flooring of the signal is perfect. Depending on the gradients in the image and the dithering method, a locally increased range for the allowed values (decreased lower envelope, increased upper envelope) would allow the algorithm to effectively mitigate dithering artifacts. However, the cost of this flexibility might be some blurring in neighboring areas. Similarly, noise which exceeds a quantization level deforms the plate.
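A minimal sketch of such local relaxation, assuming `suspect` is a boolean map of pixels where dithering or noise is expected and `slack` is the extra allowed margin (both names are illustrative):

```python
import numpy as np

def relax_envelopes(lower, upper, suspect, slack=0.5):
    """Locally widen the allowed interval where dithering/noise is suspected."""
    lower = lower - slack * suspect.astype(float)
    upper = upper + slack * suspect.astype(float)
    return lower, upper
```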
From an algorithmic point of view, masking and relaxing are the same numerical problem, with two exceptions: in masked-out points, the upper and lower envelopes are set to the dynamic range limits, and after the iterative solution, the original low bit depth pixels are used in these points.
Adding a data fidelity term to the original problem, so that the sum of the bending energy and the error term is minimized, would transform the problem into a denoising problem. This seems appealing, and modeling the signal with a bent plate could be beneficial in some applications; however, it does not suit the de-quantization problem. The reason lies in the fact that minimizing the error term would flatten out the reconstructed signal further from the edges. Technically, if the fidelity term is too small, it has no effect, and if it has a significant effect, it forces the reconstructed signal to closely follow the flat regions.
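For concreteness, such a combined functional could be written as follows (a sketch; the quadratic fidelity term and the weight $\lambda$ are illustrative choices, not part of the proposed method):
$$E(f) = \int_{\Omega} (\Delta f)^2 \,\mathrm{d}x\,\mathrm{d}y \, + \, \lambda \int_{\Omega} \big(f - g\big)^2 \,\mathrm{d}x\,\mathrm{d}y.$$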
Considering these arguments, our recommendation is to locally increase the distance between the envelopes, or remove the problematic pixels with noise filters or anomaly detection, as discussed in the next section.
6.2. Cross-Terms
There are previous higher-order polynomial approximations for this problem, but to our knowledge, the axes are handled individually. For instance, a fourth-order generalization of the Laplace equation without cross-terms could be an “almost biharmonic” equation in which the cross-terms are downweighted; while they give some contribution, their effect is much more subtle.
Not connecting the two axes has both advantages and disadvantages. In the case of noise, it will not propagate into the other dimension and will not generate wave-like patterns, but the smoothness will also be limited. There are at least two ways to find a balance between the approaches.
First, a lower-order stretching could be applied. In this approach, not only the plate bending energy is minimized, which leads to the biharmonic equation, but also an extra energy term that is proportional to the surface area. Minimizing this second energy term alone would lead to the Laplace equation. However, the sum of the two energies leads to non-trivial solutions, because the surface minimization tends to generate sharp edges at the border of LMM regions. The weighting between the two energy terms could be chosen based on the application.
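A sketch of such a combined energy (the weights $\alpha$ and $\beta$ are application-dependent choices):
$$E(f) = \alpha \int_{\Omega} |\nabla f|^2 \,\mathrm{d}x\,\mathrm{d}y \, + \, \beta \int_{\Omega} (\Delta f)^2 \,\mathrm{d}x\,\mathrm{d}y,$$
whose Euler–Lagrange equation, $\alpha\,\Delta f = \beta\,\Delta^2 f$, reduces to the Laplace equation for $\beta = 0$ and to the biharmonic equation for $\alpha = 0$.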
Second, anomalous points could be detected. The main idea is as follows: if the reconstructed function has an extremum at a given point, and this point also lies at the border of two regions in the low-bit-depth image, then it likely originates from noise. The reason is simple: if a large smooth region has an extremum, it is unlikely to lie exactly on the border between regions. Therefore, this point should be masked. However, if the point was in fact a real extremum, removing it would not significantly change the image, because the rest of the points should still produce a continuous, smooth surface, and this process would regenerate the extremum even without the control point. This process can be performed iteratively until all the anomalous points are taken care of. The approach can also be applied axis-by-axis, which can take care of possible ringing at edges. The main disadvantage of this approach is its time-consuming nature due to the iterations.
Third, alternative elasticity models could also be used. Referring to the literature, solving the biharmonic equation minimizes the mean curvature ($B$) [13] of a surface. Using the principal curvatures ($\kappa_1$, $\kappa_2$), the mean curvature is as follows:
$$B = \frac{\kappa_1 + \kappa_2}{2}.$$
An alternative approach could be minimizing the total curvature ($T$) [13]:
$$T = \kappa_1^2 + \kappa_2^2.$$
This is equivalent to minimizing the thin plate energy functional. The biharmonic model is permissive towards saddle points, while the thin plate model prefers surfaces which can be approximated locally with a spherical surface.
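For completeness, the thin plate energy functional referred to above is commonly written, in the small-deflection approximation, as:
$$E_{\mathrm{TP}}(f) = \int_{\Omega} \left( f_{xx}^2 + 2 f_{xy}^2 + f_{yy}^2 \right) \mathrm{d}x\,\mathrm{d}y.$$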
The benefit of a higher-order plate-like approximation, a more traditional spline approximation, or a combined Laplace–biharmonic model depends on the content, especially on the structuredness, noise, and presence of saddle-like areas. From an application point of view, if similar high-resolution content exists, a simulated degradation–reconstruction process could be used to select the best solution from the aforementioned options in order to maximize a chosen metric or the observers’ opinion score.
6.3. Mask Generation
Mask generation is a segmentation problem, which is a big field on its own. Depending on the input image, even a simple strategy could work. False contours are only perceivable if the affected area is large enough, which means that the signal changes slowly locally. In this case, the quantized signal can change by only one discretization level in a single step. If it changes by more, then the edge belongs to an object border. However, if the difference between separate objects is only one discretization level (as in binary images), then this approach fails, and more sophisticated algorithms would be required. The mask in our example was hand-generated.
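A minimal sketch of this simple strategy, assuming a quantization step of one level; the function name is illustrative, and, as noted above, the paper’s example mask itself was hand-generated:

```python
import numpy as np

def smooth_region_mask(g, step=1):
    """True where the quantized image changes by at most one level between
    4-neighbours, i.e. where an edge can plausibly be a false contour."""
    mask = np.ones(g.shape, dtype=bool)
    jump_v = np.abs(np.diff(g, axis=0)) > step   # vertical jumps of 2+ levels
    jump_h = np.abs(np.diff(g, axis=1)) > step   # horizontal jumps of 2+ levels
    mask[:-1, :] &= ~jump_v   # pixel above a large jump
    mask[1:,  :] &= ~jump_v   # pixel below a large jump
    mask[:, :-1] &= ~jump_h   # pixel left of a large jump
    mask[:, 1:]  &= ~jump_h   # pixel right of a large jump
    return mask
```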
6.4. Advantages and Limitations
As far as we know, other approaches have not tried to recover signals in non-continuous regions. Our masking approach has the potential to handle such situations, but it requires a mask. Identifying which sub-regions should be handled together is a segmentation problem, and it is outside of the scope of this paper.
Does a single-pixel LMM region carry real information, or is it noise? The Laplace model effectively removes LMM regions (see the examples in [
7]), and higher-order approximations (spline, biharmonic) try to take these points into account. If the pixel contains noise, these higher-order methods, including the proposed method, reconstruct a false local peak. On the other hand, if the data do not originate from noise, these methods better reconstruct the signal in the LMM region. In this case, the proposed model handles saddle points well, and yields good results for very low bit depths (2–3 bits).
The biharmonic equation is based on a simple linear elasticity model [
9], where the bending energy should be minimal. This model assumes constant stiffness and can easily be extended to more general elastic models. However, the biharmonic model is of high enough order to achieve smooth results while remaining simple in terms of parameters, whereas applying statistical models [
8] requires assumption about the signal distribution, and using neural networks [
14] requires representative training data.
Handling non-equidistant discretization levels is trivial in the proposed model: only the upper and lower envelopes need to be set appropriately.
In terms of computational time, the proposed method is slower than many competing methods, because large flat areas need significant time to converge, and the convergence is also affected by the cross-terms. The Laplace method converges faster, and the Laplace operator can be calculated more efficiently than the biharmonic one. However, elastic problems have been studied for a long time, and optimized solvers have been developed to handle them [
15], which means that even more generalized models would be tractable.
While the proposed method requires both masking and long computation until it achieves convergence, it processes the whole image. Many algorithms use block-based processing [
8], which requires overlapping blocks in order to avoid artifacts, or try to detect the false contours, and operate around the contours [
2,
14]. Using blocks and/or the neighborhood of the false contours becomes a limitation at extremely low bit depths (2–3 bits), where processing blocks are smaller than the distance between the contours.
7. Conclusions and Future Directions
Our proposed method based on the bending of deformable plates gives an intuitive way to handle non-continuous regions, and allows the incorporation of extra information into the model while ensuring a high-order signal approximation. To our knowledge, this is the first algorithm with such properties for the de-quantization problem.
Several questions remain open for future research. Mask generation requires the separation of foreground–background objects. It is a research area on its own, and low-bit-depth input makes it even more challenging.
Using biharmonic, Laplace, or other differential equations assumes an energy-like functional to be minimized. These are derived from physical assumptions such as minimal surface or minimal bending energy. The biharmonic equation is one of the simplest models in elasticity, with location-independent “stiffness”. Total curvature minimization was mentioned as an alternative approach. More advanced physical models [
9] (e.g., location-dependent elastic properties) could take further parameters into account. In general, elastic problems are most frequently solved with finite element methods. Solving these advanced models would require more computational power, and choosing the local parameters might be challenging.
If more advanced models are not derived from physical analogies, then machine learning could be used to learn signal properties. Normal images and their simulated down-quantized versions could be used as training samples, similarly to the super-resolution problem where this approach is already used [
21], or to detect false contours [
14] for later processing.
Finally, temporal consistency and high-throughput processing become interesting research questions when de-quantization is applied to video sources [
22,
23].
References