A Method for In-Loop Video Coding Restoration
Abstract
1. Introduction
2. Related Works and Theoretical Background
2.1. In-Loop Restoration in Existing AV1 Video Codec
2.1.1. Deblocking Filter
2.1.2. Constrained Directional Enhancement Filter (CDEF)
2.1.3. Self-Guided Filter
2.1.4. Wiener Filter
2.2. In-Loop Restoration Based on Deep Learning
3. Sparse Restoration Method (SRM)
3.1. Sparse Decoding Residuals
3.2. Sparse Coefficient Magnitude Estimation
3.3. Sparse Coefficient Position Estimation
Algorithm 1 Sparse prediction algorithm at the encoder, on a per-block basis.
Input: Reference block x, decoded block y, block size ()
Task: Set the encoder-flag
Algorithm 2 Sparse prediction algorithm at the decoder.
Input: Decoded block y, block size ()
Task: Predict the decoding residual block at the decoder ()
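As a rough illustration of the per-block decision in Algorithm 1, the sketch below assumes the encoder mimics the decoder-side sparse prediction of Algorithm 2, forms a candidate restored block, and sets the guiding flag only when restoration improves block quality. The helpers `predict_decoding_residual` and `block_quality` are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np

def set_encoder_flag(x, y, predict_decoding_residual, block_quality):
    """Hypothetical per-block flag decision (sketch of Algorithm 1).

    x : reference (original) block, y : decoded block.
    predict_decoding_residual(y) stands in for the decoder-side sparse
    prediction of Algorithm 2, so the encoder evaluates exactly what the
    decoder would later reconstruct; block_quality(ref, test) is any
    block-level quality score where higher means better.
    """
    dr_hat = predict_decoding_residual(y)          # predicted decoding residual
    y_restored = np.clip(y + dr_hat, 0.0, 255.0)   # candidate restored block
    # Signal restoration only when the predicted residual actually helps;
    # otherwise the block is passed through untouched.
    return 1 if block_quality(x, y_restored) > block_quality(x, y) else 0
```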
4. Experiment Design
4.1. Dataset
4.2. Computing Details
5. Results
6. Conclusions and Future Work
- We proposed SRM as a low-complexity method that has shown its suitability for synthetic video sequences. A large market of digital games, children's movies, and educational and training videos, among others, can benefit from the use of SRM.
- SRM was able to predict a proper decoding residual (DR) block using the GGD shape parameter (a) as a block-level quality selector in the DCT domain, which maximized the objective visual quality of a restored block (a sketch of this estimation is given after this list). Moreover, SRM achieved a 2.5% BD-rate gain, in terms of VMAF, against the existing switchable restoration filter in AV1/AV2 over synthetic content (class B), while complexity was kept between 105% and 110%. SRM leverages a guiding restoration flag to determine which blocks require restoration and which blocks can be passed through. A future improvement for SRM will target predicting this flag at the decoder side, which should yield a significant bitrate reduction (approx. 10–15%).
- Sparse representation is, without a doubt, an efficient method for image restoration tasks. In the video coding in-loop restoration scenario, the critical challenge was minimizing the information that must be transferred between the encoder and the decoder to represent the nonzero coefficients. However, moving this computationally intensive task entirely to the decoder is not an option, considering the real-time requirements of the decoding process. Therefore, we developed a hybrid approach in which most of the required information is predicted at the decoder and only a one-bit guiding encoder-flag is transmitted. The reason for using a guiding bit is the low precision obtained when predicting whether a block should be collapsed or expanded in terms of the GGD.
- Traditional full-reference quality metrics, such as PSNR, SSIM, and VMAF, are not completely consistent for assessing image/video restoration. These metrics rely on an existing reference image that is not always artifact free (e.g., noise, blocking, and blurring). Therefore, future work should address defining computationally efficient, real-time no-reference metrics at the frame and block levels, in order to provide human-correlated data to the decoder so it can perform restoration without requiring context information shared by the encoder. Such a mechanism would improve in-loop restoration efficiency in terms of the required amount of signaling bits.
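As referenced in the second bullet, the GGD shape parameter can be estimated directly from the AC DCT coefficients of a block by moment matching, a standard technique in DCT-statistics quality models. The sketch below is a generic illustration of that estimation under the usual zero-mean GGD assumption; it is not the authors' exact estimator, and the search-grid bounds are arbitrary choices.

```python
import numpy as np
from scipy.fft import dct
from scipy.special import gamma

def block_ac_dct(block):
    """Orthonormal 2-D DCT-II of an image block, returning only the AC coefficients."""
    d = dct(dct(block.astype(np.float64), axis=0, norm="ortho"),
            axis=1, norm="ortho")
    return d.ravel()[1:]                     # drop the DC term

def ggd_shape(coeffs, grid=np.linspace(0.1, 4.0, 4000)):
    """Moment-matching estimate of the GGD shape parameter of zero-mean
    coefficients: match the empirical ratio (E|c|)^2 / E[c^2] against its
    closed form Gamma(2/a)^2 / (Gamma(1/a) * Gamma(3/a))."""
    c = np.asarray(coeffs, dtype=np.float64).ravel()
    e2 = np.mean(c ** 2)
    if e2 < 1e-12:
        return np.nan                        # (near-)constant block, shape undefined
    rho = np.mean(np.abs(c)) ** 2 / e2
    r = gamma(2.0 / grid) ** 2 / (gamma(1.0 / grid) * gamma(3.0 / grid))
    return grid[np.argmin(np.abs(r - rho))]

# Example: compare the DCT-domain shape of a decoded block against its reference.
# ref, dec = two same-sized luma blocks as NumPy arrays
# a_ref, a_dec = ggd_shape(block_ac_dct(ref)), ggd_shape(block_ac_dct(dec))
```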
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Correction Statement
References
| | On | Off | Perf. On vs. Off |
|---|---|---|---|
| Average speed (fps) | 15 | 22 | % |
| Total encoding time (ms) | 40,856 | 27,951 | % |
| QP | DC | AC |
|---|---|---|
| 85 | | |
| 110 | | |
| 135 | | |
| 160 | | |
| 185 | | |
| 210 | | |
| Class | Sub-Class | Resolution | Total |
|---|---|---|---|
| Natural Videos (A) | A2 | | 10 |
| | A3 | | 6 |
| | A4 | | 6 |
| | A5 | | 3 |
| Synthetic (B) | B1 | | 7 |
| Sequence | Implementation | BD-Rate * (PSNR) | BD-Rate * (SSIM) | BD-Rate * (VMAF) |
|---|---|---|---|---|
| A2 | AV2 + SF | −1.816 | −0.035 | −1.994 |
| | AV2 + SRM | 0.487 | −0.337 | −2.206 |
| A3 | AV2 + SF | −1.813 | −0.183 | −2.310 |
| | AV2 + SRM | 0.794 | 0.763 | −1.642 |
| A4 | AV2 + SF | −2.156 | −1.890 | −0.326 |
| | AV2 + SRM | 1.158 | 1.355 | 0.730 |
| A5 | AV2 + SF | −0.499 | −0.627 | −1.33 |
| | AV2 + SRM | 0.615 | −0.174 | −1.276 |
| B1 | AV2 + SF | −0.337 | 0.025 | −1.603 |
| | AV2 + SRM | 1.047 | −0.949 | −2.585 |
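The BD-rate values in the table above follow the standard Bjøntegaard delta-rate convention, where negative numbers mean bitrate savings at equal quality. The snippet below is a generic sketch of that computation using the classic cubic fit of log-rate against the quality metric (PSNR, SSIM, or VMAF); it is not the authors' measurement script, and the example numbers in the comment are purely illustrative.

```python
import numpy as np

def bd_rate(rate_anchor, metric_anchor, rate_test, metric_test):
    """Bjontegaard delta-rate in percent: average bitrate difference of the
    test configuration vs. the anchor at equal quality, from a cubic fit of
    log(rate) as a function of the quality metric."""
    p_a = np.polyfit(metric_anchor, np.log(rate_anchor), 3)
    p_t = np.polyfit(metric_test, np.log(rate_test), 3)
    lo = max(min(metric_anchor), min(metric_test))   # overlapping quality range
    hi = min(max(metric_anchor), max(metric_test))
    int_a = np.polyval(np.polyint(p_a), hi) - np.polyval(np.polyint(p_a), lo)
    int_t = np.polyval(np.polyint(p_t), hi) - np.polyval(np.polyint(p_t), lo)
    avg_log_diff = (int_t - int_a) / (hi - lo)
    return (np.exp(avg_log_diff) - 1.0) * 100.0      # negative => test saves bitrate

# Example with four rate points per configuration (illustrative numbers only):
# bd_rate([1000, 1800, 3200, 6000], [78, 84, 90, 95],   # anchor (kbps, VMAF)
#         [ 950, 1700, 3050, 5800], [78, 84, 90, 95])   # test   (kbps, VMAF)
```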
| Sequence | AV2 + SF * | AV2 + SRM * | Time |
|---|---|---|---|
| A2–A5 | 18.32 | 20.21 | +110.31% |
| B1 | 14.84 | 15.61 | +105.18% |