Multi-Type Self-Attention-Based Convolutional-Neural-Network Post-Filtering for AV1 Codec
Abstract
1. Introduction
Related Work
2. Proposed Method
2.1. Shallow Feature Extraction
Residual Convolution Block
2.2. Deep Feature Extraction
2.2.1. Channel-Wise Self-Attention
2.2.2. Block-Wise Spatial Self-Attention
2.2.3. Patch-Wise Self-Attention
2.3. Image Reconstruction
Structure of the Image Reconstruction
2.4. Training and Testing Configuration
2.4.1. Training Dataset
2.4.2. Training Strategy
2.4.3. Experimental Setup
3. Results
3.1. Evaluation Process
3.1.1. Testing Dataset
3.1.2. Testing Strategy
3.1.3. Rate-Distortion Plot Analysis
3.1.4. Visual Quality Evaluation
4. Discussion
4.1. Computational Complexity Increase by Using Patch-Wise Self-Attention
4.2. Ablation Studies
4.2.1. Impact of Overlapping Sequences between Training and Testing Datasets
4.2.2. Processing of Boundary Pixels in Training Data
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Class | Video Resolution | Number of Videos | Frames | Bit Depth | Chroma Sampling |
---|---|---|---|---|---|
A | 3840 × 2176 | 200 | 64 | 10 | 4:2:0 |
B | 1920 × 1088 | 200 | 64 | 10 | 4:2:0 |
C | 960 × 544 | 200 | 64 | 10 | 4:2:0 |
D | 480 × 272 | 200 | 64 | 10 | 4:2:0 |
Configuration Parameter | Command Line | Range | Setting Used | Description |
---|---|---|---|---|
RateControlMode | --rc | [0–2] | 0 | Rate control mode [0: CRF or CQP (if --aq-mode is 0) [Default], 1: VBR, 2: CBR] |
AdaptiveQuantization | --aq-mode | [0–2] | 0 | Set adaptive QP level [0: off, 1: variance base using AV1 segments, 2: deltaq pred efficiency] |
QuantizationParameter | --qp | [1–63] | 20, 32, 43, 55, 63 | Initial QP level value |
FrameRate | --fps | [1–240] | Sequence dependent | Input video frame rate, integer values only, inferred if y4m |
EncoderColorFormat | --color-format | [0–3] | 1 | Color format, only yuv420 is supported at this time [0: yuv400, 1: yuv420, 2: yuv422, 3: yuv444] |
EncoderBitDepth | --input-depth | [8, 10] | 10 | Input video file and output bitstream bit-depth |
PredStructure | --pred-struct | [1–2] | 2 | Set prediction structure [1: low delay, 2: random access] |
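The settings in the table above can be collected into one encoder invocation per QP value. The sketch below only assembles the argument list and does not run the encoder. The executable name `SvtAv1EncApp`, the `-i` and `-b` flags, and the file names are assumptions that do not appear in the table; the remaining flags and values are copied from it.

```python
from typing import List

# QP values from the "Setting Used" column of the table above.
QP_SETTINGS = [20, 32, 43, 55, 63]


def build_encoder_args(input_path: str, output_path: str, qp: int, fps: int) -> List[str]:
    """Assemble an argument list that mirrors the table's settings."""
    if not 1 <= qp <= 63:
        raise ValueError("--qp must lie in [1, 63]")
    if not 1 <= fps <= 240:
        raise ValueError("--fps must lie in [1, 240]")
    return [
        "SvtAv1EncApp",          # assumed executable name (not in the table)
        "-i", input_path,        # assumed input flag (not in the table)
        "-b", output_path,       # assumed bitstream-output flag (not in the table)
        "--rc", "0",             # CRF/CQP rate control
        "--aq-mode", "0",        # adaptive quantization off, so fixed QP
        "--qp", str(qp),
        "--fps", str(fps),       # integer values only, sequence dependent
        "--color-format", "1",   # yuv420
        "--input-depth", "10",
        "--pred-struct", "2",    # random access
    ]


if __name__ == "__main__":
    # Hypothetical file names, one command per QP setting.
    for qp in QP_SETTINGS:
        print(" ".join(build_encoder_args("input.y4m", f"output_qp{qp}.ivf", qp, 60)))
```

Keeping the arguments as a list rather than a single string makes it straightforward to pass them to `subprocess.run` without shell quoting issues.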
Model | QP Base Range |
---|---|
Model QP20 | QPbase < 26 |
Model QP32 | 26 ≤ QPbase < 37.5 |
Model QP43 | 37.5 ≤ QPbase < 49 |
Model QP55 | 49 ≤ QPbase < 59 |
Model QP63 | 59 ≤ QPbase ≤ 63 |
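The QP-range table above amounts to a threshold lookup. A minimal sketch follows; the function name `select_model` is hypothetical. It assumes that every QPbase of 59 or more, including 63 itself, is served by Model QP63.

```python
def select_model(qp_base: float) -> str:
    """Return the name of the model whose QP base range contains qp_base."""
    # Exclusive upper bounds taken from the table above, in ascending order.
    thresholds = [
        (26.0, "Model QP20"),
        (37.5, "Model QP32"),
        (49.0, "Model QP43"),
        (59.0, "Model QP55"),
    ]
    for upper_bound, model in thresholds:
        if qp_base < upper_bound:
            return model
    # Assumption: everything from 59 upward, including 63, uses Model QP63.
    return "Model QP63"


if __name__ == "__main__":
    for qp in (20, 26, 37.5, 48, 59, 63):
        print(qp, "->", select_model(qp))
```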
Class | Sequence | Resolution | Frame Rate | Bit-Depth |
---|---|---|---|---|
A1 | BoxingPractice | 3840 × 2160 | 59.94 | 10 |
A1 | Crosswalk | 3840 × 2160 | 59.94 | 10 |
A1 | FoodMarket2 | 3840 × 2160 | 59.94 | 10 |
A1 | Neon1224 | 3840 × 2160 | 29.97 | 10 |
A1 | NocturneDance | 3840 × 2160 | 60 | 10 |
A1 | PierSeaSide | 3840 × 2160 | 29.97 | 10 |
A1 | Tango | 3840 × 2160 | 59.94 | 10 |
A1 | TimeLapse | 3840 × 2160 | 59.94 | 10 |
A2 | Aerial3200 | 1920 × 1080 | 59.94 | 10 |
A2 | Boat | 1920 × 1080 | 59.94 | 10 |
A2 | CrowdRun | 1920 × 1080 | 50 | 8 * |
A2 | DinnerSceneCropped | 1920 × 1080 | 29.97 | 10 |
A2 | FoodMarket | 1920 × 1080 | 59.94 | 10 |
A2 | GregoryScarf | 1080 × 1920 | 30 | 10 |
A2 | MeridianTalksdr | 1920 × 1080 | 59.94 | 10 |
A2 | Motorcycle | 1920 × 1080 | 30 | 8 * |
A2 | OldTownCross | 1920 × 1080 | 30 | 8 * |
A2 | PedestrianArea | 1920 × 1080 | 25 | 8 * |
A2 | RitualDance | 1920 × 1080 | 59.94 | 10 |
A2 | Riverbed | 1920 × 1080 | 25 | 8 * |
A2 | RushFieldCuts | 1920 × 1080 | 29.97 | 8 * |
A2 | Skater227 | 1920 × 1080 | 30 | 10 |
A2 | ToddlerFountainCropped | 1080 × 1920 | 29.97 | 10 |
A2 | TreesAndGrass | 1920 × 1080 | 30 | 8 * |
A2 | TunnelFlag | 1920 × 1080 | 59.94 | 10 |
A2 | Verticalbees | 1080 × 1920 | 29.97 | 8 * |
A2 | WorldCup | 1920 × 1080 | 30 | 8 * |
A3 | ControlledBurn | 1280 × 720 | 30 | 8 * |
A3 | DrivingPOV | 1280 × 720 | 59.94 | 10 |
A3 | Johnny | 1280 × 720 | 60 | 8 * |
A3 | KristenAndSara | 1280 × 720 | 60 | 8 * |
A3 | RollerCoaster | 1280 × 720 | 59.94 | 10 |
A3 | Vidyo3 | 1280 × 720 | 60 | 8 * |
A3 | Vidyo4 | 1280 × 720 | 60 | 8 * |
A3 | WestWindEasy | 1280 × 720 | 30 | 8 * |
A4 | BlueSky | 640 × 360 | 25 | 8 * |
A4 | RedKayak | 640 × 360 | 29.97 | 8 * |
A4 | SnowMountain | 640 × 360 | 29.97 | 8 * |
A4 | SpeedBag | 640 × 360 | 29.97 | 8 * |
A4 | Stockholm | 640 × 360 | 59.94 | 8 * |
A4 | TouchdownPass | 640 × 360 | 29.97 | 8 * |
A5 | FourPeople | 480 × 270 | 60 | 8 * |
A5 | ParkJoy | 480 × 270 | 50 | 8 * |
A5 | SparksElevator | 480 × 270 | 59.94 | 10 |
A5 | VerticalBayshore | 270 × 480 | 29.97 | 8 * |
Class | Sequence | BD-BR (Y) | BD-BR (Cb) | BD-BR (Cr) |
---|---|---|---|---|
A1 | BoxingPractice | −18.31% | −27.86% | −22.23% |
A1 | Crosswalk | −13.06% | −19.48% | −11.35% |
A1 | FoodMarket2 | −12.61% | −19.29% | −21.84% |
A1 | Neon1224 | −16.37% | −23.06% | −25.46% |
A1 | NocturneDance | −19.60% | −24.55% | −20.02% |
A1 | PierSeaSide | −15.52% | −26.94% | −25.54% |
A1 | Tango | −16.64% | −34.52% | −30.50% |
A1 | TimeLapse | −6.60% | −18.85% | −12.40% |
A1 | Average | −14.84% | −24.32% | −21.17% |
A2 | Aerial3200 | −4.54% | −16.19% | −24.99% |
A2 | Boat | −10.03% | −42.92% | −26.80% |
A2 | CrowdRun | −13.21% | −32.24% | −27.83% |
A2 | DinnerSceneCropped | −14.09% | −35.18% | −18.85% |
A2 | FoodMarket | −11.67% | −28.86% | −25.61% |
A2 | GregoryScarf | −11.08% | −37.28% | −22.58% |
A2 | MeridianTalksdr | −12.51% | −33.22% | −21.34% |
A2 | Motorcycle | −10.98% | −21.85% | −23.55% |
A2 | OldTownCross | −13.94% | −45.59% | −21.14% |
A2 | PedestrianArea | −15.98% | −17.45% | −20.61% |
A2 | RitualDance | −16.57% | −26.43% | −34.26% |
A2 | Riverbed | −9.93% | −12.56% | −10.13% |
A2 | RushFieldCuts | −10.73% | −15.35% | −13.92% |
A2 | Skater227 | −15.60% | −19.88% | −12.87% |
A2 | ToddlerFountainCropped | −12.13% | −21.78% | −21.56% |
A2 | TreesAndGrass | −4.39% | −17.25% | −6.69% |
A2 | TunnelFlag | −21.59% | −40.11% | −45.87% |
A2 | Verticalbees | −11.45% | −10.59% | −12.68% |
A2 | WorldCup | −18.46% | −23.86% | −22.63% |
A2 | Average | −12.57% | −26.24% | −21.78% |
A3 | ControlledBurn | −7.83% | −26.81% | −21.66% |
A3 | DrivingPOV | −14.20% | −24.55% | −22.48% |
A3 | Johnny | −12.36% | −13.77% | −14.01% |
A3 | KristenAndSara | −10.60% | −13.89% | −12.38% |
A3 | RollerCoaster | −17.18% | −18.66% | −27.76% |
A3 | Vidyo3 | −11.03% | −6.71% | −7.97% |
A3 | Vidyo4 | −9.93% | −11.60% | −15.35% |
A3 | WestWindEasy | −12.67% | −38.16% | −25.02% |
A3 | Average | −11.98% | −19.27% | −18.33% |
A4 | BlueSky | −10.21% | −14.54% | −29.31% |
A4 | RedKayak | −6.13% | −16.46% | 14.80% |
A4 | SnowMountain | 4.45% | 1.49% | 2.52% |
A4 | SpeedBag | −9.24% | −8.94% | −11.27% |
A4 | Stockholm | −10.53% | −10.64% | −19.96% |
A4 | TouchdownPass | −7.64% | −21.13% | −16.71% |
A4 | Average | −6.55% | −11.70% | −9.99% |
A5 | FourPeople | −7.66% | −14.09% | −13.12% |
A5 | ParkJoy | −3.92% | −16.67% | −8.84% |
A5 | SparksElevator | −4.11% | −10.28% | −10.62% |
A5 | VerticalBayshore | −8.61% | −17.32% | −12.70% |
A5 | Average | −6.07% | −14.59% | −11.32% |
Overall | Average | −10.40% | −19.22% | −16.52% |
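The final row of the table above matches the unweighted mean of the five class averages rather than a mean over all sequences, so each class counts equally regardless of how many sequences it holds. The short check below reproduces that row from the class averages in the table; the variable and function names are illustrative only.

```python
# Class averages (Y, Cb, Cr) in percent, copied from the table above.
CLASS_AVERAGES = {
    "A1": (-14.84, -24.32, -21.17),
    "A2": (-12.57, -26.24, -21.78),
    "A3": (-11.98, -19.27, -18.33),
    "A4": (-6.55, -11.70, -9.99),
    "A5": (-6.07, -14.59, -11.32),
}


def overall_average(class_averages):
    """Unweighted mean of the per-class averages for each colour component."""
    count = len(class_averages)
    columns = zip(*class_averages.values())  # regroup rows into Y, Cb, Cr columns
    return tuple(round(sum(column) / count, 2) for column in columns)


if __name__ == "__main__":
    print(overall_average(CLASS_AVERAGES))
```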
Network | Number of Parameters |
---|---|
RCB | 295,682 |
CWSA | 49,536 |
BWSSA | 49,536 |
PWSA | 12,583,296 |
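As a purely arithmetic observation, reading the PWSA entry as 12,583,296, both attention counts in the table can be reproduced by three linear projections with bias, as is common for query, key and value layers. The layer sizes below (128 → 128 and 32,768 → 128) are guesses chosen to match the totals; they are not taken from the paper, and other layouts could give the same numbers.

```python
def qkv_parameter_count(d_in: int, d_out: int) -> int:
    """Parameters of three biased linear layers mapping d_in to d_out."""
    weights = d_in * d_out
    biases = d_out
    return 3 * (weights + biases)


if __name__ == "__main__":
    # Hypothetical dimensions that happen to match the table's counts.
    print(qkv_parameter_count(128, 128))     # matches the CWSA and BWSSA rows
    print(qkv_parameter_count(32768, 128))   # matches the PWSA row
```

Under this reading, the roughly 254-fold gap between PWSA and the other two attention types would come entirely from the much larger input dimension of each projection.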
BVI-DVC | AVM-CTC |
---|---|
BoxingPracticeHarmonics | BoxingPractice |
DCrosswalkHarmonics | Crosswalk |
CrowdRunMCLV | CrowdRun |
TunnelFlagS1Harmonics | TunnelFlag |
DrivingPOVHarmonics | DrivingPOV |
Class | Sequence | BD-BR (Y) | BD-BR (Cb) | BD-BR (Cr) |
---|---|---|---|---|
A1 | FoodMarket2 | −12.61% | −19.29% | −21.84% |
A1 | Neon1224 | −16.37% | −23.06% | −25.46% |
A1 | NocturneDance | −19.60% | −24.55% | −20.02% |
A1 | PierSeaSide | −15.52% | −26.94% | −25.54% |
A1 | Tango | −16.64% | −34.52% | −30.50% |
A1 | TimeLapse | −6.60% | −18.85% | −12.40% |
A1 | Average | −14.56% | −24.53% | −22.63% |
A2 | Aerial3200 | −4.54% | −16.19% | −24.99% |
A2 | Boat | −10.03% | −42.92% | −26.80% |
A2 | DinnerSceneCropped | −14.09% | −35.18% | −18.85% |
A2 | FoodMarket | −11.67% | −28.86% | −25.61% |
A2 | GregoryScarf | −11.08% | −37.28% | −22.58% |
A2 | MeridianTalksdr | −12.51% | −33.22% | −21.34% |
A2 | Motorcycle | −10.98% | −21.85% | −23.55% |
A2 | OldTownCross | −13.94% | −45.59% | −21.14% |
A2 | PedestrianArea | −15.98% | −17.45% | −20.61% |
A2 | RitualDance | −16.57% | −26.43% | −34.26% |
A2 | Riverbed | −9.93% | −12.56% | −10.13% |
A2 | RushFieldCuts | −10.73% | −15.35% | −13.92% |
A2 | Skater227 | −15.60% | −19.88% | −12.87% |
A2 | ToddlerFountainCropped | −12.13% | −21.78% | −21.56% |
A2 | TreesAndGrass | −4.39% | −17.25% | −6.69% |
A2 | Verticalbees | −11.45% | −10.59% | −12.68% |
A2 | WorldCup | −18.46% | −23.86% | −22.63% |
A2 | Average | −12.00% | −25.07% | −20.01% |
A3 | ControlledBurn | −7.83% | −26.81% | −21.66% |
A3 | Johnny | −12.36% | −13.77% | −14.01% |
A3 | KristenAndSara | −10.60% | −13.89% | −12.38% |
A3 | RollerCoaster | −17.18% | −18.66% | −27.76% |
A3 | Vidyo3 | −11.03% | −6.71% | −7.97% |
A3 | Vidyo4 | −9.93% | −11.60% | −15.35% |
A3 | WestWindEasy | −12.67% | −38.16% | −25.02% |
A3 | Average | −11.66% | −18.51% | −17.74% |
A4 | BlueSky | −10.21% | −14.54% | −29.31% |
A4 | RedKayak | −6.13% | −16.46% | 14.80% |
A4 | SnowMountain | 4.45% | 1.49% | 2.52% |
A4 | SpeedBag | −9.24% | −8.94% | −11.27% |
A4 | Stockholm | −10.53% | −10.64% | −19.96% |
A4 | TouchdownPass | −7.64% | −21.13% | −16.71% |
A4 | Average | −6.55% | −11.70% | −9.99% |
A5 | FourPeople | −7.66% | −14.09% | −13.12% |
A5 | ParkJoy | −3.92% | −16.67% | −8.84% |
A5 | SparksElevator | −4.11% | −10.28% | −10.62% |
A5 | VerticalBayshore | −8.61% | −17.32% | −12.70% |
A5 | Average | −6.07% | −14.59% | −11.32% |
Overall | Average | −10.17% | −18.88% | −16.34% |
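The effect of dropping the sequences that overlap with the training set can be reproduced for one class from the per-sequence figures. The sketch below uses the class A1 luma (Y) values and the AVM-CTC names from the overlap table; the helper name `class_average` is hypothetical.

```python
# Class A1 luma BD-BR values in percent, copied from the full results table.
A1_LUMA_BDBR = {
    "BoxingPractice": -18.31,
    "Crosswalk": -13.06,
    "FoodMarket2": -12.61,
    "Neon1224": -16.37,
    "NocturneDance": -19.60,
    "PierSeaSide": -15.52,
    "Tango": -16.64,
    "TimeLapse": -6.60,
}

# AVM-CTC sequences listed as overlapping with BVI-DVC training content.
OVERLAPPING = {"BoxingPractice", "Crosswalk", "CrowdRun", "TunnelFlag", "DrivingPOV"}


def class_average(scores, excluded=frozenset()):
    """Mean BD-BR over the sequences that are not in the excluded set."""
    kept = [value for name, value in scores.items() if name not in excluded]
    return round(sum(kept) / len(kept), 2)


if __name__ == "__main__":
    print("all A1 sequences:    ", class_average(A1_LUMA_BDBR))
    print("overlap-free A1 only:", class_average(A1_LUMA_BDBR, OVERLAPPING))
```

For class A1 the two averages come out as −14.84% and −14.56%, which agree with the corresponding rows of the two results tables.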
Class | Filled with Zero (Y) | Filled with Zero (Cb) | Filled with Zero (Cr) | Edge Value Extended (Y) | Edge Value Extended (Cb) | Edge Value Extended (Cr) |
---|---|---|---|---|---|---|
A2 | −7.42% | −19.78% | −21.06% | −11.72% | −18.82% | −21.27% |
A3 | −2.84% | −17.02% | −16.72% | −10.34% | −16.22% | −18.73% |
A4 | 2.71% | −11.63% | −10.88% | −6.07% | −11.75% | −11.49% |
A5 | −2.16% | −12.71% | −12% | −5.01% | −15.55% | −13% |
Average | −2.43% | −15.28% | −15.17% | −8.28% | −15.58% | −16.04% |
Class | Filled with Zero (Y) | Filled with Zero (Cb) | Filled with Zero (Cr) | Edge Value Extended (Y) | Edge Value Extended (Cb) | Edge Value Extended (Cr) |
---|---|---|---|---|---|---|
A2 | −5.17% | −15.38% | −14.66% | −5.36% | −15.44% | −14.74% |
A3 | −6.26% | −13.59% | −14.85% | −6.55% | −14.14% | −14.96% |
A4 | −3.92% | −8.91% | −9.83% | −4.14% | −9.23% | −9.83% |
A5 | −2.71% | −13.88% | −11.58% | −2.90% | −14.17% | −11.70% |
Average | −4.51% | −12.94% | −12.73% | −4.74% | −13.24% | −12.81% |
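The two column groups in the boundary-pixel tables correspond to two generic ways of padding a block that reaches the picture edge. The sketch below illustrates both on a tiny block using plain Python lists. It is a generic illustration under that assumption and not the authors' training pipeline; the function names are hypothetical.

```python
def pad_zero(block, pad):
    """Surround a 2-D block with `pad` rows and columns of zeros."""
    padded_width = len(block[0]) + 2 * pad
    middle = [[0] * pad + list(row) + [0] * pad for row in block]
    top = [[0] * padded_width for _ in range(pad)]
    bottom = [[0] * padded_width for _ in range(pad)]
    return top + middle + bottom


def pad_edge(block, pad):
    """Surround a 2-D block by repeating its outermost sample values."""
    middle = [[row[0]] * pad + list(row) + [row[-1]] * pad for row in block]
    top = [list(middle[0]) for _ in range(pad)]
    bottom = [list(middle[-1]) for _ in range(pad)]
    return top + middle + bottom


if __name__ == "__main__":
    block = [[1, 2],
             [3, 4]]
    for row in pad_zero(block, 1):
        print(row)
    print()
    for row in pad_edge(block, 1):
        print(row)
```

Edge extension keeps the padded samples close in value to the real boundary samples, whereas zero filling introduces an artificial step at the border.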
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Gwun, W.; Choi, K.; Park, G.H. Multi-Type Self-Attention-Based Convolutional-Neural-Network Post-Filtering for AV1 Codec. Mathematics 2024, 12, 2874. https://doi.org/10.3390/math12182874