3.1. Decoder Complexity Reduction
Obtaining further acceleration of the Multi-ctx decoder requires simplifying Formula (12). In place of the MED predictor, we propose the much more efficient GBSW predictor. At the same time, we omit the constant weight, thus reducing the formula for the predicted value to a simple linear combination of the pixels closest to the coded pixel:

$$\hat{P} = \sum_{j=1}^{r} b_j \cdot P_j, \qquad (13)$$

where $P_j$ is the $j$-th pixel of the causal neighborhood, $b_j$ is the corresponding prediction coefficient, and $r$ is the prediction order.
Given the possibility of a hardware implementation of the decoder with low power consumption and modest hardware resources, it is worth noting that this formula is feasible using fixed-point computing. The decoding process requires access to only three image rows (the currently decoded row and the two previous rows) for the lower prediction order (we use the 12 closest pixels of the coded pixel) and up to four image rows for the higher order.
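As an illustration of this point, the following C++ fragment sketches such a linear combination computed entirely in fixed-point arithmetic over a three-row window. The Q16 scaling, the neighbor offsets, and all identifiers are illustrative assumptions, not the paper's exact implementation.

```cpp
#include <cstdint>
#include <vector>

// Minimal sketch of the fixed-point linear predictor discussed above.
constexpr int kFracBits = 16;  // coefficients stored as Q16 fixed point (assumed)
constexpr int kOrder    = 12;  // lower prediction order: 12 nearest pixels

// Causal neighborhood offsets (dx, dy) reaching back at most two rows,
// so the decoder only has to buffer three image rows.
constexpr int kOffs[kOrder][2] = {
    {-1, 0}, {0, -1}, {-1, -1}, {1, -1},
    {-2, 0}, {2, -1}, {-2, -1}, {0, -2},
    {-1, -2}, {1, -2}, {-2, -2}, {2, -2}};

// Predict one pixel from already-decoded data; b holds Q16 coefficients.
int32_t PredictFixedPoint(const std::vector<uint8_t>& img, int width,
                          int x, int y, const int32_t b[kOrder]) {
    int64_t acc = 0;
    for (int j = 0; j < kOrder; ++j) {
        int px = x + kOffs[j][0], py = y + kOffs[j][1];
        // Border handling omitted for brevity; interior pixels assumed.
        acc += static_cast<int64_t>(b[j]) * img[py * width + px];
    }
    // Round to nearest integer when leaving the Q16 domain.
    return static_cast<int32_t>((acc + (1 << (kFracBits - 1))) >> kFracBits);
}
```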
Since the proposed codec assumes time asymmetry between the encoder and the decoder, a flexible approach is possible in determining the prediction order and in choosing the prediction coefficients, which are placed in the file header. One approach may be to use ISA, but in most cases the resulting encoding time may not be acceptable. It is also possible to determine the prediction coefficients with the fast classic MMSE method or with Algorithm 2 described in Section 3.2, which offers a compromise between the speed of operation and the bit-average level.
Unchanged from Multi-ctx is the context-dependent correction of the cumulative prediction error (module CDCCR, Context-Dependent Constant Component Removing, in Figure 5, which shows the block coding scheme proposed in this work). For a detailed description of CDCCR, see paper [28]. Similar solutions were previously used in codecs such as JPEG-LS and CALIC. Removing the constant component associated with one of the 1024 contexts requires a slight modification of Formula (1), in which the constant component accumulated for the current context is subtracted from the predicted value.
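Since the exact CDCCR formulas are given in [28], the following is only a hedged sketch of how such a context-dependent constant component can be tracked and removed, in the spirit of the bias cancellation used in JPEG-LS and CALIC; the 1024-context count comes from the text, while the halving threshold and rounding style are assumptions.

```cpp
#include <array>
#include <cstdint>

// Per-context bias tracking, JPEG-LS/CALIC style (sketch, not the exact CDCCR).
struct ContextBias {
    std::array<int32_t, 1024> errSum{};  // cumulative prediction error per context
    std::array<int32_t, 1024> count{};   // occurrences of each context

    // Correction currently associated with context ctx (rounded mean error).
    int32_t Correction(int ctx) const {
        if (count[ctx] == 0) return 0;
        int32_t s = errSum[ctx], n = count[ctx];
        return (s >= 0) ? (s + n / 2) / n : -((-s + n / 2) / n);
    }

    // After coding a pixel, fold its prediction error into the context stats.
    void Update(int ctx, int32_t err) {
        errSum[ctx] += err;
        ++count[ctx];
        if (count[ctx] >= 128) {  // periodic halving keeps the estimate adaptive
            errSum[ctx] /= 2;
            count[ctx] /= 2;
        }
    }
};
```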
Another improvement over Multi-ctx is the use of a newer prediction error encoder: a combination of an adaptive variant of the Golomb code and a binary arithmetic encoder (see paper [25] for details).
The use of these modifications, despite the lack of code optimizations (such as the introduction of fixed-point operations in place of floating-point operations or the conversion of the code from C++ to assembler), made it possible to reduce the decoding time of the Lennagrey image (with dimensions of 512 × 512 pixels) on an i5 3.4 GHz processor from 0.35 s for the Multi-ctx decoder to 0.245 s. This represents a 30% reduction in time. At the same time, the bit average decreased, even though the prediction order in the encoder was reduced. The fast classical MMSE approach was used to determine the prediction coefficients instead of ISA or Algorithm 2. A detailed comparison of bit averages with other codecs known from the literature will be discussed in Section 4.
3.2. Algorithm for Fast Determination of Prediction Coefficients Using the Non-MMSE Method
As discussed in Section 2.2, the ISA algorithm is characterized by excessively long coding times at high prediction orders. In the literature on determining prediction coefficients using non-MMSE methods, one also encounters attempts to use, for example, genetic algorithms. However, in works such as [46,47,48], such algorithms were designed for low-order prediction models. Our experiments indicate that, at a high prediction order, it is difficult to obtain in this way a significantly faster convergence of linear prediction model construction than when using ISA. Thus, there was a need to design a compromise solution that could determine linear predictor coefficients quickly while allowing lower entropy than the classic MMSE approach with forward adaptation. To this end, the selection of the final prediction model had to take advantage of the possibilities offered by methods with backward adaptation, which can iteratively improve compression efficiency.
The most common prediction coefficient adaptation methods use least mean squares (LMS). In image coding, it is among the simplest but, at the same time, the least efficient methods, even in its normalized version (NLMS). This is due to two reasons. First, the selection of the right learning rate $\mu$ is problematic under varying levels of input data variability. In the few works using LMS in lossless image compression, the value of $\mu$ is for this reason set relatively low [49], which results in slow convergence to the expected results. Second, most of the NLMS improvement methods known from the literature, including the selection of a locally variable $\mu$ value, are optimized for time series, in which the input is a one-dimensional sequence. This also applies to more computationally complex methods such as RLS.
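To make the learning-rate issue concrete, the textbook NLMS update is sketched below; this is the generic one-dimensional form discussed above, not code from the proposed codec, and the regularizing constant eps is an assumption.

```cpp
#include <cstddef>

// Textbook NLMS update: w <- w + mu * e * x / (eps + ||x||^2).
// A small fixed mu (as in [49]) keeps the filter stable on image data but
// converges slowly, which is the drawback discussed above.
void NlmsUpdate(double* w, const double* x, std::size_t n,
                double err, double mu, double eps = 1e-6) {
    double norm2 = eps;  // regularizer avoids division by zero
    for (std::size_t i = 0; i < n; ++i) norm2 += x[i] * x[i];
    const double step = mu * err / norm2;
    for (std::size_t i = 0; i < n; ++i) w[i] += step * x[i];
}
```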
In contrast, images exhibit correlations between adjacent pixels in two dimensions. One can largely overcome these difficulties by introducing transformations that use, for example, differences of neighboring pixel values [50]. This reduces the input signal's dynamics while increasing the predictive model's learning speed. Various input data transformations are often used in the literature to achieve faster convergence of adaptive methods [51]. For this purpose, a vector is introduced:

$$\mathbf{y} = \mathbf{H} \cdot \mathbf{x}, \qquad (15)$$

where $\mathbf{H}$ is the transformation matrix and the vector $\mathbf{x}$ contains the neighborhood pixels of the currently encoded pixel. The transformed data vector $\mathbf{y}$ is used to calculate the predicted value based on the prediction model included in vector $\mathbf{w}$:

$$\hat{P} = \mathbf{w}^{T} \cdot \mathbf{y}. \qquad (16)$$
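A small sketch of this two-step prediction follows; the matrix dimensions and all identifiers are illustrative assumptions, with H standing for the transformation matrix of Formula (15) and the dot product implementing Formula (16).

```cpp
#include <cstddef>
#include <vector>

// Transform-domain prediction sketch: y = H * x, then p_hat = w . y.
// In the codec, H has m rows and r columns; sizes here are generic.
std::vector<double> Transform(const std::vector<std::vector<double>>& H,
                              const std::vector<double>& x) {
    std::vector<double> y(H.size(), 0.0);
    for (std::size_t i = 0; i < H.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j)
            y[i] += H[i][j] * x[j];  // Formula (15)
    return y;
}

double Predict(const std::vector<double>& w, const std::vector<double>& y) {
    double pHat = 0.0;
    for (std::size_t i = 0; i < w.size(); ++i) pHat += w[i] * y[i];
    return pHat;  // Formula (16): inner product of model and transformed data
}
```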
Typically, the matrix $\mathbf{H}$ is a square matrix of size $r \times r$, where $r$ is most commonly a power of two; for example, the Walsh–Hadamard matrix at $r = 4$ looks like this:

$$\mathbf{H} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}.$$
In this case (as in the DCT), the first element of vector $\mathbf{y}$ is proportional to the arithmetic mean of vector $\mathbf{x}$, and the values in each of the remaining rows of matrix $\mathbf{H}$ sum to zero. In the case of lossless image coding, this first row can be dispensed with, and at the same time, it can be noted that matrix $\mathbf{H}$ does not have to be square. Indeed, there are more rows composed of, for example, the numbers $-1$, $0$, and $+1$, in which the sum is zero. The following formula determines their number:

$$\sum_{k=0}^{\lfloor r/2 \rfloor} \binom{r}{2k} \binom{2k}{k},$$
where the parameter $k$ determines the number of pairs of the numbers $+1$ and $-1$ in a row of such a matrix, and $r - 2k$ is the number of zeros in the row. For example, with $r = 4$, we have 19 rows. Omitting the row of the matrix consisting of only zeros, we obtain the following matrix:

$$\mathbf{H} = \begin{bmatrix}
1 & -1 & 0 & 0 \\
1 & 0 & -1 & 0 \\
1 & 0 & 0 & -1 \\
0 & 1 & -1 & 0 \\
0 & 1 & 0 & -1 \\
0 & 0 & 1 & -1 \\
1 & 1 & -1 & -1 \\
1 & -1 & 1 & -1 \\
1 & -1 & -1 & 1 \\
-1 & 1 & 0 & 0 \\
-1 & 0 & 1 & 0 \\
-1 & 0 & 0 & 1 \\
0 & -1 & 1 & 0 \\
0 & -1 & 0 & 1 \\
0 & 0 & -1 & 1 \\
-1 & -1 & 1 & 1 \\
-1 & 1 & -1 & 1 \\
-1 & 1 & 1 & -1
\end{bmatrix},$$
in which a certain symmetry can be observed: the second half of the rows contains the negations of the first half, and this half can thus be rejected. Then, after using Formula (15), we obtain a vector $\mathbf{y}$ of the data after transformation with nine elements (in general, with $\frac{1}{2}\left(\sum_{k=0}^{\lfloor r/2 \rfloor} \binom{r}{2k} \binom{2k}{k} - 1\right)$ elements) and, consequently, also a vector of nine prediction coefficients $\mathbf{w}$.
By using the relationships between the data in the $\mathbf{y}$ vector and the data in the $\mathbf{x}$ vector, there is a chance of obtaining faster convergence with the adaptive LMS-type methods. At the same time, it should be noted that after determining the final prediction model $\mathbf{w}$, it is possible to obtain a prediction model $\mathbf{b}$ of order $r$ using an inverse transformation, as will be shown later in this section with a practical example. Due to the rapidly increasing number of rows of matrix $\mathbf{H}$ with the prediction order, in the practical implementation we decided to use only the rows of matrix $\mathbf{H}$ that contain exactly one pair of non-zero values, $+1$ and $-1$. This choice boils down to the fact that the elements of $\mathbf{y}$ are differences between two pixels. In practice, we reduce this even further to pixels that are immediately adjacent to each other (and available in the decoding procedure), that is, each pixel of the neighborhood paired with its nearest causal neighbors. Then, through experiments, we select (as in work [50]) a subset of these differences that works well with the prediction coefficient adaptation method. Taking into account the prediction model proposed in Formula (13), we obtain an input data vector composed of such differences of neighboring pixels. Assuming an initial form of the vector $\mathbf{w}$, we modify Formula (16) to the form used during adaptation, denoted as Formula (20).
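To make the difference-based input concrete, the sketch below builds such a feature vector from immediately adjacent causal pixels; the specific pairs are placeholders, since the actual subset was chosen experimentally.

```cpp
#include <cstdint>
#include <vector>

// Sketch of building the transformed input vector y from differences of
// immediately adjacent causal pixels. Each feature corresponds to a row
// of H containing exactly one +1/-1 pair; the pairs here are illustrative.
std::vector<double> DifferenceFeatures(const std::vector<uint8_t>& img,
                                       int width, int x, int y) {
    auto P = [&](int dx, int dy) -> double {
        return img[(y + dy) * width + (x + dx)];  // interior pixels assumed
    };
    return {
        P(-1, 0)  - P(-1, -1),  // W  - NW
        P(-1, -1) - P(0, -1),   // NW - N
        P(0, -1)  - P(1, -1),   // N  - NE
        P(-1, 0)  - P(0, -1),   // W  - N
        P(-2, 0)  - P(-1, 0),   // WW - W
        P(0, -2)  - P(0, -1),   // NN - N
    };
}
```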
We designed two sets of transformations, one for each of the two prediction orders used. Table 4 shows the experimentally selected values of the $\mathbf{w}$ vector for each of these two orders.
In designing Algorithm 2, we used a modified form of the activity-level classification model (ALCM) as the means of adaptively determining the prediction coefficients $\mathbf{w}$; ALCM is closer to minimizing the MAE than the MSE, which is what NLMS minimizes. In the original, ALCM operated on fifth- or sixth-order linear prediction models [52]. After encoding each pixel, only a selected two of these several coefficients were adapted (by a small constant value). In the solution proposed here (see Algorithm 2), as many as eight of the coefficients of vector $\mathbf{w}$ are adapted for each successively encoded pixel. For this purpose, the four lowest and the four highest values from the $\mathbf{y}$ vector are determined. We denote them as the four smallest values, $y_{\min(i)}$, and the four largest values, $y_{\max(i)}$, respectively. Then, assuming a non-zero prediction error $e$, the adaptation presented in Algorithm 2 is performed.
Algorithm 2 Prediction coefficients adaptation
1: for all pixels do calculate the estimated value based on Formula (20)
2:   if $e > 0$ then
3:     increase by the learning coefficient the four weights $w_j$ associated with $y_{\max(i)}$
4:     decrease by the learning coefficient the four weights $w_j$ associated with $y_{\min(i)}$
5:   else if $e < 0$ then
6:     decrease by the learning coefficient the four weights $w_j$ associated with $y_{\max(i)}$
7:     increase by the learning coefficient the four weights $w_j$ associated with $y_{\min(i)}$
8:   end if
9: end for
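The following C++ sketch shows one way to implement this adaptation step together with the coarse-to-fine decay of the learning coefficient over the five passes. The update directions, the single learning constant mu, and the halving schedule are assumptions consistent with the description above, not the paper's exact formulas.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// ALCM-style adaptation sketch: after coding each pixel, the eight weights
// tied to the four largest and four smallest transformed inputs are nudged
// by +/- mu in the direction that reduces the prediction error.
void AdaptWeights(std::vector<double>& w, const std::vector<double>& y,
                  double err, double mu) {
    std::vector<std::size_t> idx(y.size());
    std::iota(idx.begin(), idx.end(), 0);  // 0, 1, ..., n-1
    std::sort(idx.begin(), idx.end(),
              [&](std::size_t a, std::size_t b) { return y[a] < y[b]; });
    const double step = (err > 0) ? mu : (err < 0) ? -mu : 0.0;
    for (std::size_t i = 0; i < 4; ++i) {
        w[idx[i]] -= step;                   // four smallest y values
        w[idx[idx.size() - 1 - i]] += step;  // four largest y values
    }
}

// Five passes over the image, reducing mu each pass (coarse to fine).
// The starting value and the division-by-two rule are placeholders.
void TrainOverImage(std::vector<double>& w) {
    double mu = 0.01;  // assumed starting value
    for (int pass = 0; pass < 5; ++pass, mu /= 2) {
        // for each pixel: build y, predict via Formula (20), compute err,
        // then call AdaptWeights(w, y, err, mu);
    }
}
```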
The learning coefficients are determined according to a schedule whose values change in successive iterations. This is because the entire image is scanned five times, and in each successive iteration the influence of the learning rate should be reduced (according to the principle of going from coarse to fine tuning). The parameters of this schedule were selected experimentally for each of the two prediction orders.
Thus, we obtain (regardless of the prediction order) the final form of vector $\mathbf{w}$ after only five initial image encodings, which we then transform into the vector $\mathbf{b}$ of prediction coefficients. This requires Table 5 (for the lower prediction order) or Table 6 (for the higher one), which contain the sets of transformations of coefficients from the domain of vector $\mathbf{w}$ to vector $\mathbf{b}$. Thus, after the adaptation stage, we can use the target Formula (13) for the predicted value in the encoder and in the decoder. The final step is to reduce, using Formula (3), the accuracy of the prediction coefficients' notation to a fixed number of bits after the decimal point, after which we perform the full final encoding according to the scheme shown in Figure 5. In the case of the Algorithm 2 implementation, the total time to determine the prediction coefficients and encode the Lennagrey image is 1.515 s for the lower prediction order and 1.925 s for the higher one (the decoding times are 0.24 and 0.245 s, respectively).
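To close the loop, here is a minimal sketch of the final precision-reduction step, assuming each coefficient is rounded to g fractional bits of a fixed-point representation; the parameter name g and the rounding style are assumptions, since Formula (3) itself is defined earlier in the paper.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Round each prediction coefficient to g bits after the binary point,
// as in the final precision-reduction step (cf. Formula (3)).
std::vector<double> QuantizeCoefficients(const std::vector<double>& b, int g) {
    const double scale = static_cast<double>(1 << g);
    std::vector<double> q(b.size());
    for (std::size_t j = 0; j < b.size(); ++j)
        q[j] = std::round(b[j] * scale) / scale;  // exactly g fractional bits
    return q;
}
```

Quantizing the coefficients this way keeps the decoder's fixed-point prediction bit-exact with the encoder's, which is what makes the lossless reconstruction and the hardware-friendly implementation discussed in Section 3.1 possible.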