A New Transformation Technique for Reducing Information Entropy: A Case Study on Greyscale Raster Images
Abstract
:1. Introduction
2. Background
2.1. Move-To-Front Transform
- Initialisation: fill L with all ,
- For each :
- –
- Find the index l in L where is located;
- –
- Send l to Y;
- –
- Increment the positions of all , where ;
- –
- Move to the front of L.
2.2. Inversion Frequencies
- ;
- ;
- ;
- .
2.3. Burrows–Wheeler Transform
- Generating permutations of X through rotational shift-right operations;
- Sorting the obtained permutations lexicographically;
- Reading the BWT(X) from the last column of the sorted permutations;
- Determining the position of X in the sorted array of permutations. This position is essential for reconstruction, and is considered a BWT index, ;
3. Materials and Methods
3.1. Move with Interleaving
- Mode 1:
- MTF is applied when .
- Mode 2:
- The temporal array T is filled with interleaved values, starting with when . The values from T are then inserted at the front of L, shifting all the remaining values in L accordingly. This is why we named this transformation Move with Interleaving (MwI).
- Function InitialiseL in Line 4 of Algorithm 1 obtains the first element from X () and as input, and populates the list L. The first element in L becomes , while elements from interval are interleaved around this value. The remaining elements of L are then filled from the smallest up to the largest possible value from the interval , according to the alphabetical order in . also becomes the first element in Y (see Line 5) to enable the same initialisation of L during the decoding phase. The situation after the initialisation is therefore:
- .
- The remaining elements from X are then transformed within the For-loop starting at Line 6 and ending in Line 15. Let us follow the algorithm for some elements from X.
- :
- Function in Line 7 finds the position for . l is inserted into Y in Line 8. The MTF transform in Line 10 is applied as to obtain the following situation:
- .
- :
- For , function GetPosition returns , which is inserted into Y. As , function FillT in Line 12 generates the interleaved values around . It returns . Function ModifyL in Line 13 moves T in front of L, while other values are placed after it. We obtained:
- .
- :
- GetPosition returns for and inserts l into Y. Function MTF is applied because . The obtained situation is:
- .
- :
- , , . Sequence contains only 6 elements this time, as the values being outside are not inserted. The obtained situation is therefore:
- .
Algorithm 1 Transformation MwI | ||
1: | function MwI(X, ) | ▹ Returns transformed sequence Y |
2: | ▹ X: input sequence; : tolerance | |
3: | ||
4: | ▹ InitialiseL(, ) | Initialisation of L is done according to |
5: | ▹ The first entry in Y is | |
6: | for to do | ▹ For all other |
7: | GetPosition(L, ) | ▹ Find the position of in L |
8: | AddToY(Y, l) | ▹ Store the position in Y |
9: | if then | ▹ If position is smaller that |
10: | MTF(l, L) | ▹ Then rearrange L according to MTF |
11: | else | ▹ otherwise |
12: | FillT(, ) | ▹ Fill temporal sequence T |
13: | ModifyL(L, T) | ▹ Place symbols in T in front of L |
14: | end if | |
15: | end for | |
16: | return Y | ▹ Returns transformed sequence |
17: | end function |
Algorithm 2 Inverse MwI transformation | ||
1: | function IMwI(Y, ) | ▹ Returns restored sequence X |
2: | ▹ Y: input sequence of indices; : tolerance | |
3: | ▹ The first entry in X is | |
4: | InitialiseL(, ) | ▹ Initialisation of L is done according to first element |
5: | AddToX(X, ) | ▹ is sent to the reconstructed sequence X |
6: | for to do | ▹ For all other |
7: | ▹ Get the position from Y | |
8: | ▹ Get the value from L | |
9: | AddToX(X, v) | ▹ Store the value in X |
10: | if then | ▹ If position is smaller that |
11: | MTF(l, L) | ▹ Then rearrange L according to MTF |
12: | else | ▹ Otherwise |
13: | FillT(v, ) | ▹ Fill temporal sequence T |
14: | ModifyL(L, T) | ▹ Place symbols in T in front of L |
15: | end if | |
16: | end for | |
17: | return X | ▹ Returns restored sequence |
18: | end function |
Time Complexity Estimation
- MTF:
- In the worst-case scenario, the last element of L should always be moved to the front. There are elements in L and consequently, . Since , .
- IF:
- For each , all elements in X are always visited, resulting in . Again, since , .
- BWT:
- The algorithm presented in Section 2.3 has, unfortunately, A time complexity of , which limits its practical use for longer sequences. Later, it was shown that BWT can be constructed from the suffix array in linear time [23], and since there are known algorithms for constructing the suffix array in linear time [29,30], BWT itself can be obtained in time.
- BWT+MTF:
- Based on the above analysis, the combination BWT, followed by MTF, works in .
- BWT+IF:
- Similarly, as above, the combination BWT, followed by IF, operates in time complexity .
- MwI:
- MwI operates in two modes. In mode 1, then . In mode 2, the algorithm performs two tasks. Firstly, it fills the auxiliary sequence T with elements. After that, it applies MTF times resulting in a total of operations. Since , .
3.2. Rearranging Raster Data in the Sequence
4. Experiments
I | N × M | H | MwI() | MTF | IF | MwI 1 |
BWT MTF | BWT IF |
BWT MwI 1 | |
---|---|---|---|---|---|---|---|---|---|---|
(1) | 7.357 | 25 | 6.471 | 7.308 | 7.415 | 6.567 | 6.656 | 6.680 | 6.661 | |
(2) | 7.346 | 5 | 4.074 | 5.649 | 5.497 | 4.172 | 3.989 | 3.920 | 4.008 | |
(3) | 7.484 | 13 | 6.033 | 6.936 | 6.868 | 6.037 | 6.285 | 6.112 | 6.248 | |
(4) | 7.343 | 16 | 5.351 | 6.769 | 6.888 | 6.370 | 6.156 | 6.020 | 6.518 | |
(5) | 7.325 | 40 | 6.757 | 7.517 | 7.849 | 7.016 | 6.792 | 6.873 | 6.786 | |
(6) | 6.828 | 9 | 4.526 | 5.454 | 5.364 | 4.535 | 4.617 | 4.459 | 4.625 | |
(7) | 7.088 | 11 | 5.043 | 6.008 | 5.909 | 5.043 | 5.266 | 5.095 | 5.271 | |
(8) | 6.904 | 11 | 5.551 | 6.349 | 6.323 | 5.551 | 5.834 | 5.667 | 5.799 | |
(9) | 7.155 | 17 | 5.544 | 6.740 | 6.702 | 5.598 | 5.569 | 5.511 | 5.555 | |
(10) | 7.410 | 8 | 4.764 | 6.375 | 6.288 | 4.803 | 4.553 | 4.463 | 4.515 | |
(11) | 7.305 | 17 | 5.608 | 6.832 | 6.668 | 5.641 | 5.823 | 5.629 | 5.808 | |
(12) | 7.366 | 10 | 4.840 | 6.294 | 6.218 | 4.844 | 4.805 | 4.657 | 4.740 | |
(13) | 7.288 | 12 | 5.111 | 6.489 | 6.455 | 5.114 | 5.166 | 5.067 | 5.178 | |
(14) | 7.530 | 14 | 5.262 | 6.485 | 6.308 | 5.268 | 5.516 | 5.382 | 5.547 | |
(15) | 7.348 | 10 | 5.184 | 6.629 | 6.720 | 5.189 | 5.240 | 5.118 | 5.233 | |
(16) | 7.792 | 14 | 5.427 | 6.994 | 6.865 | 5.439 | 5.326 | 5.153 | 5.321 | |
(17) | 6.964 | 6 | 3.912 | 5.056 | 4.932 | 4.036 | 3.927 | 3.961 | 4.051 | |
(18) | 7.524 | 15 | 5.480 | 6.900 | 6.834 | 5.511 | 5.662 | 5.523 | 5.682 | |
(19) | 7.729 | 6 | 4.138 | 5.739 | 5.673 | 4.204 | 4.060 | 3.930 | 4.083 | |
(20) | 4.711 | 16 | 3.707 | 4.352 | 4.313 | 3.728 | 3.850 | 3.687 | 3.854 | |
(21) | 7.18 | 8 | 4.770 | 6.082 | 5.935 | 4.787 | 4.770 | 4.632 | 4.768 | |
(22) | 2.983 | 0 | 0.125 | 0.125 | 0.173 | 0.133 | 0.133 | 0.126 | 0.143 | |
(23) | 7.585 | 12 | 6.042 | 7.230 | 7.095 | 6.042 | 5.961 | 5.788 | 5.875 | |
(24) | 7.256 | 7 | 4.460 | 5.842 | 5.652 | 4.482 | 4.590 | 4.455 | 4.638 | |
(25) | 7.482 | 13 | 5.203 | 6.977 | 6.845 | 5.216 | 4.908 | 4.819 | 4.846 | |
(26) | 7.594 | 14 | 5.258 | 6.840 | 6.967 | 5.288 | 5.459 | 5.373 | 5.451 | |
(27) | 7.088 | 21 | 4.632 | 5.127 | 5.022 | 4.658 | 4.732 | 4.510 | 4.711 | |
(28) | 7.131 | 9 | 4.774 | 5.865 | 5.677 | 4.775 | 4.966 | 4.845 | 5.039 | |
(29) | 6.950 | 5 | 3.599 | 4.533 | 4.261 | 3.641 | 3.551 | 3.360 | 3.713 | |
(30) | 7.328 | 8 | 4.290 | 4.741 | 4.523 | 4.310 | 4.411 | 4.269 | 4.493 | |
(31) | 7.560 | 8 | 5.093 | 6.421 | 6.473 | 5.105 | 5.098 | 4.976 | 5.025 | |
(32) | 7.334 | 11 | 4.830 | 6.536 | 6.436 | 4.830 | 4.941 | 4.889 | 4.975 | |
Average | 7.102 | 12.2 | 4.870 | 6.039 | 5.973 | 4.935 | 4.949 | 4.842 | 4.974 | |
Rank | 6 | 5 | 2 | 3 | 1 | 4 |
Order | MTF | IF | MwI | BWT MTF | BWT IF | BWT MwI |
---|---|---|---|---|---|---|
Scan-line | 6.039 | 5.973 | 4.935 | 4.949 | 4.842 | 4.974 |
Left-right | 6.004 | 5.905 | 4.925 | 4.959 | 4.843 | 4.977 |
Strips | 5.111 1 | 5.036 2 | 4.962 3 | 5.118 4 | 5.000 4 | 5.149 1 |
Hilbert | 5.349 | 5.905 | 4.754 | 5.012 | 4.889 | 5.050 |
Image | No. of Pixels | BWT and MTF | BWT and IF | MwI |
---|---|---|---|---|
(8) | 65,536 | 0.042 | 0.068 | 0.037 |
(1) | 262,144 | 0.260 | 0.308 | 0.162 |
(6) | 414,720 | 0.432 | 0.465 | 0.217 |
(18) | 1,048,576 | 1.365 | 1.589 | 0.616 |
(16) | 1,745,280 | 2.554 | 2.781 | 0.992 |
(27) | 2,073,600 | 8.910 | 9.451 | 2.901 |
(30) | 17,448,000 | 31.966 | 32.669 | 9.331 |
5. Discussion
- When BWT is not used before MwI, it is considerably more efficient than MTF and IF.
- MwI is as efficient as BWT, followed by MTF or IF.
- BWT followed by MwI yields worse results in comparison to the results obtained by MwI alone.
- MwI is less sensitive to the arrangements on the input data compared to MTF and IF.
- MwI is the most efficient transformation technique when the Hilbert data arrangement is used.
- BWT, for its operations, requires the knowledge of the whole sequence in advance, while MwI operates incrementally and can, therefore, also be used in streaming applications.
- Implementing MwI is easier compared to BWT, as it does not require the implementation of a prefix array for computational efficiency.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley: Hoboken, NJ, USA, 2006. [Google Scholar]
- Shannon, C.E. A Mathematical Theory of Communication. AT&T Tech. J. 1948, 27, 379–423. [Google Scholar]
- Liu, S.; Xu, M.; Qin, Y.; Lukać, N. Knowledge Graph Alignment Network with Node-Level Strong Fusion. Appl. Sci. 2022, 12, 9434. [Google Scholar] [CrossRef]
- Gray, R.M. Entropy and Information Theory, 2nd ed.; Springer: New York, NY, USA, 2011. [Google Scholar]
- Sabirov, D.S.; Shepelevich, I.S. Information Entropy in Chemistry: An Overview. Entropy 2021, 23, 1240. [Google Scholar] [CrossRef] [PubMed]
- Vasco-Olmo, J.M.; Díaz, F.A.; García-Collado, A.; Dorado-Vicente, R. Experimental evaluation of crack shielding during fatigue crack growth using digital image correlation. Fatigue Fract. Eng. Mater. Struct. 2013, 38, 223–237. [Google Scholar] [CrossRef]
- Ben-Naim, A. Entropy, Shannon’s Measure of Information and Boltzmann’s H-Theorem. Entropy 2017, 19, 48. [Google Scholar] [CrossRef]
- Sayood, K. Introduction to Data Compression, 4th ed.; Morgan Kaufman: Waltham, MA, USA, 2012. [Google Scholar]
- Rahman, M.A.; Hamada, M. Lossless Image Compression Techniques: A State-of-the-Art Survey. Symmetry 2019, 11, 1274. [Google Scholar] [CrossRef]
- Salomon, D.; Motta, G. Handbook of Data Compression, 5th ed.; Springer: London, UK, 2010. [Google Scholar]
- Ryabko, B.Y. Data compression by means of a ‘book stack’. Probl. Pereda. Inform. 1980, 16, 265–269. [Google Scholar]
- Arnavut, Z.; Magliveras, S.S. Block sorting and compression. In Proceedings of the IEEE Data Compression Conference, DCC’97, Snowbird, UT, USA, 25–27 March 1997; Storer, J.A., Cohn, M., Eds.; IEEE Computer Society Press: Los Alamitos, CA, USA, 1997; pp. 181–190. [Google Scholar]
- Abel, J. Improvements to the Burrows-Wheeler Compression Algorithm: After BWT Stages. 2003. Available online: https://api.semanticscholar.org/CorpusID:16110299 (accessed on 1 November 2023).
- Bentley, J.L.; Sleator, D.D.; Tarjan, R.E.; Wei, V.K. A Locally Adaptive Data Compression Scheme. Commun. ACM 1986, 29, 320–330. [Google Scholar] [CrossRef]
- Deorowicz, S. Improvements to Burrows-Wheeler Compression Algorithm. Softw. Pract. Exper. 2000, 30, 1465–1483. [Google Scholar] [CrossRef]
- Binder, E. Distance Coding. 2000. Available online: https://groups.google.com/g/comp.compression/c/96DHNJgf0NM/m/Ep15oLxq1CcJ (accessed on 14 November 2023).
- Albers, S. Improved randomized on-line algorithms for the list update problem. SIAM J. Comput. 1998, 27, 682–693. [Google Scholar] [CrossRef]
- Burrows, M.; Wheeler, D.J. A Block-Sorting Lossless Data Compression Algorithm; Technical Report No. 124; Digital Systems Research Center: Palo Alto, CA, USA, 1994. [Google Scholar]
- Abel, J. Post BWT stages of the Burrows-Wheeler compression Algorithm. Softw. Pract. Exper. 2010, 40, 751–777. [Google Scholar] [CrossRef]
- Dorrigiv, R.; López-Ortiz, A.; Munro, J.I. An Application of Self-organizing Data Structures to Compression. In Experimental Algorithms, Proceedings of the 8th International Symposium on Experimental Algorithms, SEA 2009, Dortmund, Germany, 3–6 June 2009; Vahrenhold, J., Ed.; Lecture Notes in Computer Science 5526; Springer: Berlin, Germany, 2009; pp. 137–148. [Google Scholar]
- Žalik, B.; Lukač, N. Chain code lossless compression using Move-To-Front transform and adaptive Run-Length Encoding. Signal Process. Image Commun. 2014, 29, 96–106. [Google Scholar] [CrossRef]
- Arnavut, Z. Move-To-Front and Inversion Coding. In Proceedings of the IEEE Data Compression Conference, DCC’2000, Snowbird, UT, USA, 28–30 March 2000; Cohn, M., Storer, J.A., Eds.; IEEE Computer Society Press: Los Alamitos, CA, USA, 2000; pp. 193–202. [Google Scholar]
- Adjeroh, D.; Bell, T.; Mukherjee, A. The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching, 2nd ed.; Springer Science + Business Media: New York, NY, USA, 2008. [Google Scholar]
- Lee, Y.L.; Han, K.H.; Sullivan, G.J. Improved lossless intra coding for H. 264/MPEG-4 AVC. IEEE Trans. Image Process. 2006, 15, 2610–2615. [Google Scholar] [PubMed]
- Khademi, A.; Krishnan, S. Comparison of JPEG 2000 and other lossless compression schemes for digital mammograms. IEEE Trans. Image Process. 2005, 25, 693–695. [Google Scholar]
- Barina, D. Comparison of Lossless Image Formats. arXiv 2021, arXiv:2108.02557. [Google Scholar]
- Ulacha, G.; Łazoryszczak, M. Lossless Image Coding Using Non-MMSE Algorithms to Calculate Linear Prediction Coefficients. Entropy 2023, 25, 156. [Google Scholar] [CrossRef] [PubMed]
- Kohek, Š.; Strnad, D.; Žalik, B.; Kolmanič, S. Interactive synthesis and visualization of self-organizing trees for large-scale forest succession simulation. Multimed. Syst. 2019, 25, 213–227. [Google Scholar] [CrossRef]
- Nong, G.; Zhang, S.; Chan, W.H. Two efficient algorithms for linear time suffix array construction. IEEE Trans. Comput. 2011, 60, 1471–1484. [Google Scholar] [CrossRef]
- Kärkkäinen, J.; Sanders, P.; Burkhardt, S. Linear work suffix array construction. J. ACM 2017, 53, 918–936. [Google Scholar] [CrossRef]
- Bader, M. Space-Filling Curves—An Introduction with Applications in Scientific Computing; Springer: Berlin, Germany, 2013. [Google Scholar]
- Chung, K.-L.; Huang, Y.-L.; Liu, Y.-W. Efficient algorithms for coding Hilbert curve of arbitrary-sized image and application to window query. Inf. Sci. 2007, 17, 2130–2151. [Google Scholar] [CrossRef]
- Žalik, B.; Mongus, D.; Rizman Žalik, K.; Lukač, N. Boolean Operations on Rasterized Shapes Represented by Chain Codes Using Space Filling Curves. J. Vis. Commun. Image Represent. 2017, 49, 420–430. [Google Scholar] [CrossRef]
- Lawder, J.K.; King, P.J.H. Using state diagrams for Hilbert curve mappings. Int. J. Comput. Math. 2001, 78, 327–342. [Google Scholar] [CrossRef]
- Žalik, B.; Strnad, D.; Kohek, Š.; Kolingerová, I.; Nerat, A.; Lukač, N.; Lipuš, B.; Žalik, M.; Podgorelec, D. FLoCIC: A Few Lines of Code for Raster Image Compression. Entropy 2023, 25, 533. [Google Scholar] [CrossRef]
X | / 1 | b | a | r | b | a | r | a | | | b | a | r | b | a | r | a | |
L | 0 | a | b | a | r | b | a | r | a | | | b | a | r | b | a | r | a |
1 | b | a | b | a | r | b | a | r | a | | | b | a | r | b | a | r | |
2 | r | r | r | b | a | r | b | b | r | a | | | b | a | r | b | b | |
3 | | | | | | | | | | | | | | | | | b | r | r | | | | | | | | | | | |
Y | 1 | 1 | 2 | 2 | 2 | 2 | 1 | 3 | 3 | 2 | 3 | 2 | 2 | 2 | 1 |
i | Step 1 | Step 2 | Step 3 | Step 4 |
---|---|---|---|---|
0 | barbara|barbara | abarbara|barbar | r | |
1 | arbara|barbarab | arabarbara|barb | b | |
2 | rbara|barbaraba | ara|barbarabarb | b | |
3 | bara|barbarabar | arbarabarbara|b | b | |
4 | ara|barbarabarb | arbara|barbarab | b | |
5 | ra|barbarabarba | a|barbarabarbar | r | |
6 | a|barbarabarbar | barabarbara|bar | r | |
7 | |barbarabarbara | bara|barbarabar | r | |
8 | barbarabarbara| | barbarabarbara| | | | |
9 | arbarabarbara|b | barbara|barbara | a | ← |
10 | rbarabarbara|ba | rabarbara|barba | a | |
11 | barabarbara|bar | ra|barbarabarba | a | |
12 | arabarbara|barb | rbarabarbara|ba | a | |
13 | rabarbara|barba | rbara|barbaraba | a | |
14 | abarbara|barbar | |barbarabarbara | a |
i | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Y | r | b | b | b | b | r | r | r | | | a | a | a | a | a | a |
C | a | a | a | a | a | a | b | b | b | b | r | r | r | r | | |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Žalik, B.; Strnad, D.; Podgorelec, D.; Kolingerová, I.; Lukač, L.; Lukač, N.; Kolmanič, S.; Žalik, K.R.; Kohek, Š. A New Transformation Technique for Reducing Information Entropy: A Case Study on Greyscale Raster Images. Entropy 2023, 25, 1591. https://doi.org/10.3390/e25121591
Žalik B, Strnad D, Podgorelec D, Kolingerová I, Lukač L, Lukač N, Kolmanič S, Žalik KR, Kohek Š. A New Transformation Technique for Reducing Information Entropy: A Case Study on Greyscale Raster Images. Entropy. 2023; 25(12):1591. https://doi.org/10.3390/e25121591
Chicago/Turabian StyleŽalik, Borut, Damjan Strnad, David Podgorelec, Ivana Kolingerová, Luka Lukač, Niko Lukač, Simon Kolmanič, Krista Rizman Žalik, and Štefan Kohek. 2023. "A New Transformation Technique for Reducing Information Entropy: A Case Study on Greyscale Raster Images" Entropy 25, no. 12: 1591. https://doi.org/10.3390/e25121591
APA StyleŽalik, B., Strnad, D., Podgorelec, D., Kolingerová, I., Lukač, L., Lukač, N., Kolmanič, S., Žalik, K. R., & Kohek, Š. (2023). A New Transformation Technique for Reducing Information Entropy: A Case Study on Greyscale Raster Images. Entropy, 25(12), 1591. https://doi.org/10.3390/e25121591