Peer-Review Record

Transform-Based Feature Map Compression Method for Video Coding for Machines (VCM)

Electronics 2023, 12(19), 4042; https://doi.org/10.3390/electronics12194042
by Minhun Lee 1, Seungjin Park 1, Seoung-Jun Oh 2, Younhee Kim 3, Se Yoon Jeong 3, Jooyoung Lee 3 and Donggyu Sim 1,*
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 23 August 2023 / Revised: 20 September 2023 / Accepted: 22 September 2023 / Published: 26 September 2023
(This article belongs to the Special Issue Image/Video Processing and Encoding for Contemporary Applications)

Round 1

Reviewer 1 Report

This paper presents a transform-based feature map compression approach for VCM, which achieves a considerable improvement over the anchor in the MPEG-VCM test. The proposed method achieves the same performance in machine vision tasks at a lower bit rate.

1. The paper does not specify how the matrices used in PCA, such as TGB and TGM, were obtained through training.

2. As shown in Figure 9(b), the proposed method is inferior to the benchmark model at high bit rates.

3. The process of combining the low-level feature maps in Figure 4 was not verified by ablation experiments.

4. More related references, such as the following, should be added:

i) Chen, Sien, Jian Jin, Lili Meng, Weisi Lin, Zhuo Chen, Tsui-Shan Chang, Zhengguang Li, and Huaxiang Zhang. "A new image codec paradigm for human and machine uses." arXiv preprint arXiv:2112.10071 (2021).

ii) Chen, Zhuo, Kui Fan, Shiqi Wang, Lingyu Duan, Weisi Lin, and Alex Chichung Kot. "Toward intelligent sensing: Intermediate deep feature compression." IEEE Transactions on Image Processing 29 (2019): 2230-2243.

 

Author Response

Please see the attachment (Response to Reviewer 1 Comments).

Author Response File: Author Response.pdf

Reviewer 2 Report

This paper proposes a transform-based feature map compression method for video coding for machines (VCM) to enhance machine recognition through video information compression. It utilizes a principal component analysis (PCA)-based compression methodology for multi-level feature maps extracted from the feature pyramid network (FPN) structure. The proposed method eliminates the need for a separate PCA process by employing a generalized basis matrix and mean vector derived from channel correlations. It achieves further compression by amalgamating high-dimensional feature maps, taking advantage of spatial redundancy. The proposed VCM encoder does not incur any compression loss and only requires compressing the coefficients for each feature map using versatile video coding (VVC). Experimental results demonstrate superior performance over previous PCA-based feature map compression methods, achieving an 89.3% BD-rate reduction for instance segmentation tasks.

The abstract is informative and well described. The introduction is relevant. The authors describe previous works and their limitations compared with the proposed method. The objectives and the structure of the paper are also described. The second section provides an adequate description of the proposed transform-based feature map compression method. Figures and tables are very useful. Please check that all figures are inserted into the main text close to their first citation. If possible, increase the size of the figures to make them more readable. Experimental results are well analyzed using informative tables and figures. In the conclusions, the paper does not explicitly mention any limitations or drawbacks of the proposed method. It would be useful to mention some limitations. In addition, it would be useful to highlight the importance of the research and briefly mention the implications of the proposed method.

 

In my view, this paper can be published even in its current form.

Author Response

Please see the attachment (Response to Reviewer 2 Comments).

Author Response File: Author Response.pdf

Reviewer 3 Report

The authors propose a principal component analysis (PCA)-based compression methodology for multi-level feature maps extracted from the feature pyramid network (FPN) structure. The paper is interesting and heading in a good direction. However, the manuscript must be proofread by a native speaker. The paper also needs to present the related work and include a discussion that compares the results. As written, it is difficult to understand the improvements offered by the proposed method.

The manuscript contains some grammatical mistakes. It must be proofread by a native speaker.

Author Response

Please see the attachment (Response to Reviewer 3 Comments).

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

The authors addressed the previous comments, and the paper can be accepted.
