Rapid CU Partitioning and Joint Intra-Frame Mode Decision Algorithm
Abstract
1. Introduction
2. Background and Related Works
2.1. Rapid CU Partitioning Algorithm
2.2. Method for Intra-Frame Mode Selection
3. Proposed Algorithm
3.1. Rapid CU Partitioning Based on DT Model
3.2. Intra-Frame Decision Method Based on Bidirectional Gradient Search
- Objective Function: Hadamard Cost. The Hadamard cost can be understood here as a trade-off between computational resource consumption and compression efficiency that arises when the Hadamard transform is used in the encoding process in place of other transform methods. To reduce computational complexity further, we adopt the Hadamard cost as the objective function of our algorithm.
- Initial Search Point: Reorder the MPM list according to the Hadamard cost, and select the mode with the lowest cost as the initial mode for the next phase. Following the analysis above, the initial search point should be the mode that the majority of blocks tend to favor. The basis for this choice is shown in Figure 5, where the sequences were randomly selected and encoded at four QP values: 20, 25, 30, and 35. It is easy to see that more blocks choose the directional mode; the proportion of MPM modes among all selected modes is shown in Figure 6. Compared with the simple directional mode, the vast majority of CUs select a mode from the MPM list as the optimum mode; therefore, we take the corresponding mode from the MPM list as the initial search point.
- Search Mode: Bidirectional Search. The search mode is a key factor in finding the global optimum. To avoid becoming trapped in a local optimum, we propose a bidirectional search algorithm. It initiates two search processes at the starting and ending points; one searches forward and the other backward, and both perform gradient descent. The two searches yield two candidate modes, which are compared, and the mode with the lower cost is selected. Because the two searches are independent, this search mode lends itself naturally to parallel processing, which lets the algorithm converge more quickly and further improves search efficiency.
- Search Step Size: Adaptive Step Size Search. In this paper, to achieve better RD performance, our algorithm searches within a range of , where is the result with relatively lower costs obtained when left and right gradient descent are used. Based on the search results, the current best mode, , is determined. After obtaining this mode, a full RDO is performed on DC, Planar, and to arrive at the optimal mode that our algorithm, , ultimately selects.
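
The four components above can be put together in a short sketch. This is only one possible reading of the procedure, not the authors' implementation. The following are assumptions made for illustration: the function names (`satd4x4`, `descend`, `bidirectional_search`), the unnormalized 4x4 Hadamard cost, starting both descents from the lowest-cost angular MPM entry, and a step size that is halved whenever a move fails to lower the cost. The exact search range and step schedule are not recoverable from the text. The cost function is passed in as a callable, so any Hadamard-based cost can be plugged in.

```python
from typing import Callable, List, Optional, Sequence, Tuple

PLANAR, DC = 0, 1                  # VVC non-angular intra modes
ANGULAR_MIN, ANGULAR_MAX = 2, 66   # VVC angular intra mode range

# 4x4 Hadamard matrix (Sylvester construction, symmetric).
H4 = [[1, 1, 1, 1],
      [1, -1, 1, -1],
      [1, 1, -1, -1],
      [1, -1, -1, 1]]


def satd4x4(orig: Sequence[Sequence[int]], pred: Sequence[Sequence[int]]) -> int:
    """Unnormalized sum of absolute Hadamard-transformed differences (4x4)."""
    d = [[orig[i][j] - pred[i][j] for j in range(4)] for i in range(4)]
    # t = H * D
    t = [[sum(H4[i][k] * d[k][j] for k in range(4)) for j in range(4)]
         for i in range(4)]
    # y = (H * D) * H^T
    y = [[sum(t[i][k] * H4[j][k] for k in range(4)) for j in range(4)]
         for i in range(4)]
    return sum(abs(v) for row in y for v in row)


def descend(cost: Callable[[int], float], start: int, direction: int,
            step: int) -> Tuple[int, float]:
    """Greedy descent from `start` towards lower (-1) or higher (+1) modes.

    Assumed step schedule: keep the step while the cost decreases,
    halve it when a move fails, stop when the step reaches zero.
    """
    best, best_cost = start, cost(start)
    while step >= 1:
        cand = best + direction * step
        if ANGULAR_MIN <= cand <= ANGULAR_MAX:
            cand_cost = cost(cand)
            if cand_cost < best_cost:
                best, best_cost = cand, cand_cost
                continue
        step //= 2
    return best, best_cost


def bidirectional_search(cost: Callable[[int], float], mpm_list: Sequence[int],
                         step: int = 4) -> Tuple[Optional[int], List[int]]:
    """Return the best angular mode found and the candidates for full RDO."""
    angular = [m for m in mpm_list if m >= ANGULAR_MIN]
    if not angular:                       # assumption: fall back to Planar/DC
        return None, [PLANAR, DC]
    init = min(angular, key=cost)         # lowest-cost MPM entry
    left = descend(cost, init, -1, step)  # search towards smaller mode indices
    right = descend(cost, init, +1, step) # search towards larger mode indices
    best = min(left, right, key=lambda r: r[1])[0]
    return best, [PLANAR, DC, best]


# Toy cost with a single minimum at mode 34 (purely illustrative).
def toy_cost(mode: int) -> float:
    return 500 if mode < ANGULAR_MIN else 100 + 10 * abs(mode - 34)


best_mode, rdo_candidates = bidirectional_search(toy_cost, [PLANAR, 50, 18, DC])
print(best_mode, rdo_candidates)
```

The two calls to `descend` do not share state, which is what makes the bidirectional scheme easy to run in parallel. Only the three modes returned in `rdo_candidates` would then go through full RDO.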
3.3. Overall Frame
4. Model Training Process and Experimental Setup
4.1. Performance of the Rapid CU Partitioning Algorithm Based on DT
4.2. Performance of the Gradient Descent Algorithm
4.3. Overall Performance
4.4. Discussion of Algorithm Applications
- Task-driven encoding strategy. When the algorithm is applied to different downstream tasks, CU partitioning can be adjusted to the task requirements. For example, object detection may need finer spatial detail, so smaller CUs might be preferred to enhance spatial resolution. In action recognition, by contrast, temporal information may be more critical, so its importance can be taken into account during CU partitioning when selecting the CU size and exploiting inter-frame correlation. The gradient descent-based intra-mode decision algorithm can likewise be tuned to the downstream task. For example, tasks requiring precise edge detection may favor intra-frame modes with strong edge-preserving capabilities, whereas other tasks might prioritize texture information or other features.
- Coordinating encoding decisions with downstream tasks. If a downstream task is particularly sensitive to specific image features (such as texture, motion information, or edges), an encoding strategy can be built on our algorithm that prioritizes the preservation of these features. With the DT-based fast CU partitioning, the partitioning strategy can be adjusted dynamically according to the importance of the task, thereby improving the retention of key features. When multiple downstream tasks must be supported, a multimodal encoding strategy can be considered: during video encoding, different CU partitioning and intra-frame mode decision strategies are applied according to the requirements of each task, generating multiple versions of the encoded video stream. Each downstream task can then select the version that best suits its needs, balancing computational efficiency against task performance.
- Dynamic adjustment and adaptive encoding. If the requirements of downstream tasks change over time (for example, when task priorities are adjusted in real-time video analysis), the encoder can adjust CU partitioning and intra-frame mode selection dynamically based on downstream feedback. For example, the system can tune the decision parameters of the DT algorithm in real time according to this feedback, allowing the encoder to adaptively generate the best encoding results for the current task. In the second stage of the algorithm, gradient descent can be applied to frame-by-frame optimization, continuously refining the encoding strategy based on downstream task feedback. In this case, the encoding of each frame depends not only on the content of the current frame but also on the long-term needs of the task, gradually improving encoding performance while maintaining real-time processing.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Bross, B.; Wang, Y.K.; Ye, Y.; Liu, S.; Chen, J.; Sullivan, G.J.; Ohm, J.R. Overview of the Versatile Video Coding (VVC) Standard and Its Applications. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 3736–3764.
- Fan, Y.; Chen, J.; Sun, H.; Katto, J.; Jing, M. A Fast QTMT Partition Decision Strategy for VVC Intra Prediction. IEEE Access 2020, 8, 107900–107911.
- Chen, J.; Sun, H.; Katto, J.; Zeng, X.; Fan, Y. Fast QTMT Partition Decision Algorithm in VVC Intra Coding Based on Variance and Gradient. In Proceedings of the 2019 IEEE Visual Communications and Image Processing (VCIP), Sydney, Australia, 1–4 December 2019; pp. 1–4.
- Cui, J.; Zhang, T.; Gu, C.; Zhang, X.; Ma, S. Gradient-Based Early Termination of CU Partition in VVC Intra Coding. In Proceedings of the 2020 Data Compression Conference (DCC), Snowbird, UT, USA, 24–27 March 2020; pp. 103–112.
- Amestoy, T.; Mercat, A.; Hamidouche, W.; Menard, D.; Bergeron, C. Tunable VVC Frame Partitioning Based on Lightweight Machine Learning. IEEE Trans. Image Process. 2020, 29, 1313–1328.
- Dong, X.; Shen, L.; Yu, M.; Yang, H. Fast Intra Mode Decision Algorithm for Versatile Video Coding. IEEE Trans. Multimed. 2022, 24, 400–414.
- Fu, T.; Zhang, H.; Mu, F.; Chen, H. Fast CU Partitioning Algorithm for H.266/VVC Intra-Frame Coding. In Proceedings of the 2019 IEEE International Conference on Multimedia and Expo (ICME), Shanghai, China, 8–12 July 2019; pp. 55–60.
- Tissier, A.; Hamidouche, W.; Mdalsi, S.B.D.; Vanne, J.; Galpin, F.; Menard, D. Machine Learning Based Efficient QT-MTT Partitioning Scheme for VVC Intra Encoders. IEEE Trans. Circuits Syst. Video Technol. 2023, 33, 4279–4293.
- Tech, G.; Pfaff, J.; Schwarz, H.; Helle, P.; Wieckowski, A.; Marpe, D.; Wiegand, T. Rate-Distortion-Time Cost Aware CNN Training for Fast VVC Intra-Picture Partitioning Decisions. In Proceedings of the 2021 Picture Coding Symposium (PCS), Bristol, UK, 29 June–2 July 2021; pp. 1–5.
- Park, S.h.; Kang, J.W. Fast Multi-Type Tree Partitioning for Versatile Video Coding Using a Lightweight Neural Network. IEEE Trans. Multimed. 2021, 23, 4388–4399.
- Li, T.; Xu, M.; Tang, R.; Chen, Y.; Xing, Q. DeepQTMT: A Deep Learning Approach for Fast QTMT-Based CU Partition of Intra-Mode VVC. IEEE Trans. Image Process. 2021, 30, 5377–5390.
- Chen, W.; Hong, D.; Qi, Y.; Han, Z.; Wang, S.; Qing, L.; Huang, Q.; Li, G. Multi-Attention Network for Compressed Video Referring Object Segmentation. In Proceedings of the 30th ACM International Conference on Multimedia, Lisboa, Portugal, 10–14 October 2022; pp. 4416–4425.
- Zhang, T.; Sun, M.T.; Zhao, D.; Gao, W. Fast Intra-Mode and CU Size Decision for HEVC. IEEE Trans. Circuits Syst. Video Technol. 2017, 27, 1714–1726.
- Yang, H.; Shen, L.; Dong, X.; Ding, Q.; An, P.; Jiang, G. Low-Complexity CTU Partition Structure Decision and Fast Intra Mode Decision for Versatile Video Coding. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 1668–1682.
- Wu, S.; Shi, J.; Chen, Z. HG-FCN: Hierarchical Grid Fully Convolutional Network for Fast VVC Intra Coding. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 5638–5649.
- Lei, M.; Luo, F.; Zhang, X.; Wang, S.; Ma, S. Look-Ahead Prediction Based Coding Unit Size Pruning for VVC Intra Coding. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 4120–4124.
- Zhang, Q.; Zhao, Y.; Jiang, B.; Wu, Q. Fast CU Partition Decision Method Based on Bayes and Improved De-Blocking Filter for H.266/VVC. IEEE Access 2021, 9, 70382–70391.
- Feng, A.; Liu, K.; Liu, D.; Li, L.; Wu, F. Partition Map Prediction for Fast Block Partitioning in VVC Intra-Frame Coding. IEEE Trans. Image Process. 2023, 32, 2237–2251.
- Zhao, T.; Huang, Y.; Feng, W.; Xu, Y.; Kwong, S. Efficient VVC Intra Prediction Based on Deep Feature Fusion and Probability Estimation. IEEE Trans. Multimed. 2023, 25, 6411–6421.
- Huang, Y.; Yu, J.; Wang, D.; Lu, X.; Dufaux, F.; Guo, H.; Zhu, C. Learning-Based Fast Splitting and Directional Mode Decision for VVC Intra Prediction. IEEE Trans. Broadcast. 2024, 70, 681–692.
- Peng, Z.; Shen, L.; Ding, Q.; Dong, X.; Zheng, L. Block-Dependent Partition Decision for Fast Intra Coding of VVC. IEEE Trans. Consum. Electron. 2024, 70, 277–289.
- Wang, D.; Yu, J.; Lu, X.; Dufaux, F.; Hang, B.; Guo, H.; Zhu, C. Fast Mode and CU Splitting Decision for Intra Prediction in VVC SCC. IEEE Trans. Broadcast. 2024, 1–12.
- Pakdaman, F.; Adelimanesh, M.A.; Hashemi, M.R. BLINC: Lightweight Bimodal Learning for Low-Complexity VVC Intra-Coding. J. Real-Time Image Process. 2022, 19, 791–807.
- Chen, Z.; Shi, J.; Li, W. Learned Fast HEVC Intra Coding. IEEE Trans. Image Process. 2020, 29, 5431–5446.
- Jiang, W.; Ma, H.; Chen, Y. Gradient Based Fast Mode Decision Algorithm for Intra Prediction in HEVC. In Proceedings of the 2012 2nd International Conference on Consumer Electronics, Communications and Networks (CECNet), Yichang, China, 21–23 April 2012; pp. 1836–1840.
- Hu, N.; Yang, E.H. Fast Mode Selection for HEVC Intra-Frame Coding With Entropy Coding Refinement Based on a Transparent Composite Model. IEEE Trans. Circuits Syst. Video Technol. 2015, 25, 1521–1532.
- Zhang, Q.; Wang, Y.; Huang, L.; Jiang, B. Fast CU Partition and Intra Mode Decision Method for H.266/VVC. IEEE Access 2020, 8, 117539–117550.
- Gou, A.; Sun, H.; Liu, C.; Zeng, X.; Fan, Y. A Novel Fast Intra Algorithm for VVC Based on Histogram of Oriented Gradient. J. Vis. Commun. Image Represent. 2023, 95, 103888.
- Li, Y.; He, Z.; Zhang, Q. Fast Decision-Tree-Based Series Partitioning and Mode Prediction Termination Algorithm for H.266/VVC. Electronics 2024, 13, 1250.
- Bjontegaard, G. Calculation of Average PSNR Differences between RD-Curves. ITU SG16 Doc. VCEG-M33. 2001. Available online: https://cir.nii.ac.jp/crid/1571980074917801984 (accessed on 20 August 2024).
- VTM-10.0 · jvet/VVCSoftware_VTM · GitLab. Available online: https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftwareVTM/-/releases/VTM-10.0 (accessed on 20 August 2024).
- Amna, M.; Imen, W.; Fatma Ezahra, S. Fast Multi-Type Tree Partitioning for Versatile Video Coding Using Machine Learning. Signal, Image Video Process. 2023, 17, 67–74.
- Zhao, J.; Wu, A.; Jiang, B.; Zhang, Q. ResNet-Based Fast CU Partition Decision Algorithm for VVC. IEEE Access 2022, 10, 100337–100347.
- Ni, C.T.; Lin, S.H.; Chen, P.Y.; Chu, Y.T. High Efficiency Intra CU Partition and Mode Decision Method for VVC. IEEE Access 2022, 10, 77759–77771.
Feature | Score (BQMall) | Score (Campfire) | Score (Johnny) | Score (RaceHorses) |
---|---|---|---|---|
 | 0.102 | 0.133 | 0.084 | 0.124 |
 | 0.121 | 0.095 | 0.102 | 0.095 |
 | 0.127 | 0.103 | 0.095 | 0.098 |
QP | 0.089 | 0.101 | 0.087 | 0.084 |
 | 0.103 | 0.064 | 0.078 | 0.052 |
 | 0.099 | 0.071 | 0.079 | 0.054 |
DCE | 0.046 | 0.083 | 0.096 | 0.058 |
Class | Test Sequence | Resolution | Frame Count | Frame Rate | Bit Depth |
---|---|---|---|---|---|
A1 | Campfire | 3840 × 2160 | 300 | 30 fps | 10 |
A2 | DaylightRoad2 | 3840 × 2160 | 300 | 60 fps | 10 |
B | BasketballDrive | 1920 × 1080 | 500 | 50 fps | 8 |
C | BQMall | 832 × 480 | 600 | 60 fps | 8 |
D | RaceHorses | 416 × 240 | 300 | 30 fps | 8 |
E | Johnny | 1280 × 720 | 600 | 60 fps | 8 |
Class | Test Sequence | Amna [32]: BDBR (%) | Amna [32]: TS (%) | Zhao [33]: BDBR (%) | Zhao [33]: TS (%) | Ni [34]: BDBR (%) | Ni [34]: TS (%) | Proposed: BDBR (%) | Proposed: TS (%) |
---|---|---|---|---|---|---|---|---|---|
A1 | Campfire | 0.49 | 48.12 | 1.55 | 52.32 | 0.53 | 52.57 | 1.76 | 60.68 |
A1 | Drums | — | — | — | — | — | — | 1.64 | 58.72 |
A1 | FoodMarket4 | 0.16 | 42.85 | 1.53 | 50.08 | 0.44 | 58.89 | 1.77 | 59.21 |
A2 | TrafficFlow | — | — | — | — | — | — | 1.76 | 58.37 |
A2 | DaylightRoad | 0.80 | 48.70 | — | — | — | — | 1.67 | 57.12 |
A2 | ParkRunning3 | 0.54 | 50.28 | 1.42 | 47.62 | 0.21 | 46.07 | 1.69 | 57.93 |
B | BasketballDrive | 0.67 | 47.12 | 1.58 | 43.31 | 0.55 | 47.36 | 1.75 | 54.26 |
B | BQTerrace | 0.89 | 44.65 | 0.84 | 47.58 | 0.46 | 48.03 | 1.31 | 53.49 |
B | Cactus | 0.79 | 47.19 | 1.28 | 44.03 | 0.55 | 46.12 | 1.36 | 56.91 |
B | MarketPlace | — | — | — | — | 0.30 | 46.15 | 0.74 | 54.71 |
B | RitualDance | — | — | — | — | 0.65 | 51.65 | 0.89 | 55.85 |
C | BasketballDrill | 1.31 | 48.34 | 1.27 | 44.15 | 0.96 | 46.37 | 1.58 | 50.24 |
C | BQMall | 1.06 | 45.15 | 1.11 | 47.37 | 0.69 | 50.00 | 0.81 | 52.57 |
C | PartyScene | 0.62 | 45.11 | 0.78 | 46.37 | 0.42 | 46.01 | 0.62 | 47.68 |
C | RaceHorsesC | 0.79 | 47.55 | 0.84 | 47.58 | 0.47 | 48.66 | 0.75 | 51.93 |
D | BasketballPass | 1.01 | 43.69 | 1.34 | 38.31 | 0.70 | 50.50 | 1.46 | 50.32 |
D | BlowingBubbles | 0.61 | 45.7 | 0.93 | 44.29 | 0.51 | 45.16 | 1.23 | 47.64 |
D | BQSquare | 0.53 | 43.72 | 0.82 | 46.65 | 0.67 | 49.10 | 1.41 | 49.91 |
D | RaceHorses | — | — | 1.12 | 39.46 | 0.47 | 48.66 | 1.09 | 50.79 |
E | FourPeople | 1.2 | 47.91 | 1.35 | 48.15 | 0.82 | 49.96 | 1.78 | 56.37 |
E | Johnny | 1.06 | 47.68 | 1.67 | 51.60 | 0.76 | 52.42 | 1.73 | 57.68 |
E | KristenAndSara | 0.98 | 44.98 | 1.49 | 48.42 | 0.70 | 51.42 | 1.65 | 57.26 |
Average | | 0.80 | 46.54 | 1.24 | 46.31 | 0.57 | 49.06 | 1.38 | 54.53 |
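
The TS columns report encoding time saving. The definition is not restated next to the tables, so the form below is given as an assumption: it is the expression commonly used in fast VVC encoding work. The symbols $T_{\mathrm{anchor}}$ (encoding time of the reference encoder, presumably VTM-10.0 [31]) and $T_{\mathrm{test}}$ (encoding time of the method under test) are introduced here only for illustration.

$$
\mathrm{TS} = \frac{T_{\mathrm{anchor}} - T_{\mathrm{test}}}{T_{\mathrm{anchor}}} \times 100\%
$$

Under this reading, hypothetical encoding times of $T_{\mathrm{anchor}} = 100$ s and $T_{\mathrm{test}} = 45.47$ s would give $\mathrm{TS} = 54.53\%$, the same magnitude as the average reported for the proposed method.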
Class | Test Sequence | 1: BDBR (%) | 1: TS (%) | 2: BDBR (%) | 2: TS (%) | Proposed: BDBR (%) | Proposed: TS (%) |
---|---|---|---|---|---|---|---|
A1 | Campfire | 1.67 | 44.29 | 0.38 | 26.82 | 1.76 | 60.68 |
A1 | Drums | 1.51 | 51.96 | 0.57 | 28.51 | 1.64 | 58.72 |
A1 | FoodMarket4 | 1.69 | 46.67 | 0.29 | 21.73 | 1.77 | 59.21 |
A2 | TrafficFlow | 1.42 | 52.1 | 0.42 | 25.96 | 1.76 | 58.37 |
A2 | DaylightRoad | 1.29 | 54.09 | 0.25 | 29.87 | 1.67 | 57.12 |
A2 | ParkRunning3 | 1.35 | 50.78 | 0.37 | 27.25 | 1.69 | 57.93 |
B | BasketballDrive | 1.65 | 50.34 | 0.59 | 24.23 | 1.75 | 54.26 |
B | BQTerrace | 1.29 | 49.88 | 0.47 | 28.55 | 1.31 | 53.49 |
B | Cactus | 1.31 | 50.02 | 0.61 | 27.09 | 1.36 | 56.91 |
B | MarketPlace | 0.41 | 49.86 | 0.56 | 24.62 | 0.74 | 54.71 |
B | RitualDance | 0.72 | 50.96 | 0.39 | 20.17 | 0.89 | 55.85 |
C | BasketballDrill | 1.02 | 47.93 | 0.65 | 29.76 | 1.58 | 50.24 |
C | BQMall | 0.75 | 43.21 | 0.73 | 27.93 | 0.81 | 52.57 |
C | PartyScene | 0.43 | 44.59 | 0.56 | 26.36 | 0.62 | 47.68 |
C | RaceHorsesC | 0.68 | 42.07 | 0.62 | 25.47 | 0.75 | 51.93 |
D | BasketballPass | 1.41 | 41.35 | 0.52 | 25.61 | 1.46 | 50.32 |
D | BlowingBubbles | 1.13 | 40.28 | 0.66 | 29.32 | 1.23 | 47.64 |
D | BQSquare | 1.32 | 43.67 | 0.70 | 27.98 | 1.41 | 49.91 |
D | RaceHorses | 1.03 | 43.91 | 0.64 | 27.53 | 1.09 | 50.79 |
E | FourPeople | 1.59 | 53.28 | 0.49 | 24.02 | 1.78 | 56.37 |
E | Johnny | 1.64 | 50.11 | 0.83 | 29.68 | 1.73 | 57.68 |
E | KristenAndSara | 1.52 | 48.62 | 0.67 | 25.90 | 1.65 | 57.26 |
Average | | 1.22 | 47.73 | 0.54 | 26.56 | 1.38 | 54.53 |
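
As a quick consistency check on the Average row, the short script below recomputes the mean BDBR and TS of the proposed method from the 22 per-sequence values in the table above. It assumes the Average row is a plain arithmetic mean over all listed sequences. The variable and function names are illustrative only.

```python
# Proposed-method columns copied from the table above, in row order
# (Campfire ... KristenAndSara).
bdbr = [1.76, 1.64, 1.77, 1.76, 1.67, 1.69, 1.75, 1.31, 1.36, 0.74, 0.89,
        1.58, 0.81, 0.62, 0.75, 1.46, 1.23, 1.41, 1.09, 1.78, 1.73, 1.65]
ts = [60.68, 58.72, 59.21, 58.37, 57.12, 57.93, 54.26, 53.49, 56.91, 54.71,
      55.85, 50.24, 52.57, 47.68, 51.93, 50.32, 47.64, 49.91, 50.79, 56.37,
      57.68, 57.26]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


avg_bdbr = round(mean(bdbr), 2)
avg_ts = round(mean(ts), 2)
print(avg_bdbr, avg_ts)  # prints: 1.38 54.53
```

Both values agree with the reported averages of 1.38 and 54.53, which supports the reading that the Average row is an unweighted mean over sequences.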
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Song, W.; Li, C.; Zhang, Q. Rapid CU Partitioning and Joint Intra-Frame Mode Decision Algorithm. Electronics 2024, 13, 3465. https://doi.org/10.3390/electronics13173465