A Near-Field Area Object Detection Method for Intelligent Vehicles Based on Multi-Sensor Information Fusion
Round 1
Reviewer 1 Report
My detailed comments are as follows:
1. There is a lot of theoretical discussion in the article, but the actual work carried out on the subject is not reflected.
2. In the second chapter, the explanation of each formula and its relevant parameters is placed within a single paragraph, which makes the presentation of the formulas unclear.
3. Section 2.2 involves multiple mechanisms, but the relationships and connections between these mechanisms are not clearly given, and the links between the subsections are loose.
4. The basis for setting the relevant parameters in the experimental analysis is not given.
5. The experimental verification focuses on presenting the test results but lacks a description of the testing process.
6. Table 2 and Table 3 have the same title.
Author Response
Thank you for your valuable suggestions on our work, which have greatly contributed to the improvement of our paper. We have revised the manuscript according to your suggestions; please see the attachment.
Author Response File: Author Response.docx
Reviewer 2 Report
The article focuses on the design of a method for target detection in intelligent vehicle applications. The authors extensively described the proposed methodology and considered its efficacy compared with other state-of-the-art techniques. The article is interesting and well written. I have only a few remarks for the authors to consider.
1. The title seems too complicated. Firstly, the term near-field area is unclear. The authors evaluated their method on the nuScenes dataset, which is not specially tailored for near-field area detection. What is more, the authors compared their technique with the CenterFusion method, which is also not developed for near-field detection but considers typical road scenarios. Secondly, the Microcirculation Urban Road term is also unclear. As mentioned before, the authors used a general database. Consequently, if the authors want to address the microcirculation or near-field area cases, they should point out how those cases were analysed.
2. The authors compared their method with the original CenterFusion algorithm and presented the obtained improvement. However, the CenterFusion algorithm does not utilise a LiDAR sensor. The authors should stress this fact in their paper.
3. In Table 2, the authors showed a performance comparison of target detection for several methods. Since the authors selected six types of objects in their study, is the performance of the other methods also calculated for those six object types?
Minor remarks
Line 408 – the % character seems to be more suitable;
Line 412 – should be F-PointNet;
Figure 5 – should be HeatMap;
Author Response
Thank you for your valuable suggestions on our work, which have greatly contributed to the improvement of our paper. We have revised the manuscript according to your suggestions; please see the attachment.
Author Response File: Author Response.docx
Reviewer 3 Report
Title: Near-field Area Object Detection Based on Multi-Sensor Information Fusion for Microcirculation Urban Road
Authors: Yanqiu Xiao, Shiao Yin, Guangzhen Cui, Lei Yao, Zhanpeng Fang and Weili Zhang
General comments:
The manuscript entitled “Near-field Area Object Detection Based on Multi-Sensor Information Fusion for Microcirculation Urban Road” presents a method for near-field object detection based on multi-sensor fusion for microcirculation urban roads. Overall, the manuscript highlights one of the important issues related to efficient environment perception for autonomous vehicles and provides enough evidence to prove the significance of the proposed work. The manuscript is well written and each section is clearly organized to convey the scope of the research. All figures and diagrams are clearly presented to depict the intended content.
However, it is not entirely clear why the scope of the proposed method is limited to object detection in the near field. Many methods based on the feature-feature fusion of multi-modal sensor data have proved promising for the perception requirements of autonomous vehicles, yet the proposed approach focuses only on near objects.
Major comments:
1. In the introduction section, it is strongly recommended to highlight the specific contributions of this research, since they are quite difficult to identify in the current manuscript.
2. More importantly, in Figure 2(a,b,c) the authors present existing approaches to fusing data from multiple sensors for object detection, and they highlight the advantages and disadvantages of each approach very clearly. Assuming the proposed method follows the feature-feature fusion approach of Figure 2(c), it is very important to ensure that the installed sensors cover a common field of view. In the scenes shown in Figure 1 and Figure 4, the LiDAR may not be able to acquire the objects nearest to the ego-vehicle because of its blind spot, making it difficult to find feature correspondences between the LiDAR and the camera. Although the proposed approach focuses mainly on near-field object detection, the experimental results and data appear to use information from only a single LiDAR, RADAR, and camera. Detecting objects such as traffic cones close to the vehicle, in particular, requires multiple LiDARs or cameras installed at different orientations. It is also not clear whether the proposed approach works when data from one of the sensors is absent or only partially available.
Hence, I suggest that the authors state the actual (minimum) detectable range that is assumed to be near-field, within which the listed objects can be detected using the proposed approach.
3. In subsection 2.3, Feature Information Fusion, it is mentioned that the point cloud feature map is mapped to image space for geometric alignment. Does this mapping include the LiDAR-to-camera calibration parameters as well? Otherwise, it would be very challenging to define a common ROI for the LiDAR and the camera (an illustrative projection sketch follows this list).
4. Also, in line 330 it is mentioned that “As shown in Figure 10, each square in the square represents a PointPillar feature and has the mean coordinate of the points accumulated into the column”, but in Figure 10 those squares are labelled as camera features. It is therefore quite confusing which features each square actually represents (a sketch of the pillar-mean computation I have in mind also follows this list).
5. It would be nice if a time performance analysis were also included.
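To make comment 3 concrete, the following is a minimal sketch of the geometric alignment I have in mind: projecting LiDAR points into the image plane via the extrinsic and intrinsic calibration. The matrices T_lidar_to_cam and K are illustrative placeholders, not the authors' actual calibration parameters.

```python
import numpy as np

def project_lidar_to_image(points_lidar, T_lidar_to_cam, K):
    """Project LiDAR points (N, 3) into pixel coordinates.

    T_lidar_to_cam: (4, 4) extrinsic transform from the LiDAR to the camera frame.
    K: (3, 3) camera intrinsic matrix.
    Returns pixel coordinates for the points in front of the camera,
    plus the boolean mask selecting those points.
    """
    # Homogeneous coordinates: (N, 4).
    pts_h = np.hstack([points_lidar, np.ones((points_lidar.shape[0], 1))])

    # Transform into the camera frame; keep only points in front of the camera.
    pts_cam = (T_lidar_to_cam @ pts_h.T).T[:, :3]
    in_front = pts_cam[:, 2] > 0

    # Pinhole projection with the intrinsics, then normalise by depth.
    pix = (K @ pts_cam[in_front].T).T
    return pix[:, :2] / pix[:, 2:3], in_front
```

Without such calibration parameters, a common ROI between the LiDAR feature map and the image cannot be defined reliably.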
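Similarly, regarding comment 4, what I understand by “the mean coordinate of the points accumulated into the column” is a PointPillars-style per-pillar mean, sketched below under the assumption of a simple x-y grid quantisation; grid_size and the function name are illustrative, not taken from the manuscript.

```python
import numpy as np

def pillar_means(points, grid_size=0.16):
    """Group points (N, 3) into x-y pillars and return each pillar's mean xyz.

    grid_size: pillar edge length in metres (an illustrative value).
    """
    # Quantise x-y coordinates to integer pillar indices.
    idx = np.floor(points[:, :2] / grid_size).astype(np.int64)

    # Map each point to its pillar; flatten for NumPy-version safety.
    _, inverse = np.unique(idx, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    n_pillars = inverse.max() + 1

    # Accumulate coordinate sums and point counts per pillar, then average.
    sums = np.zeros((n_pillars, 3))
    np.add.at(sums, inverse, points)
    counts = np.bincount(inverse, minlength=n_pillars)[:, None]
    return sums / counts
```

If the squares in Figure 10 instead hold camera features, the figure or the text at line 330 should be corrected accordingly.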
Author Response
Thank you for your valuable suggestions on our work, which have greatly contributed to the improvement of our paper. We have revised the manuscript according to your suggestions; please see the attachment.
Author Response File: Author Response.docx
Round 2
Reviewer 3 Report
Comments for author File: Comments.pdf
Author Response
Thank you for your valuable suggestions on our work, which have greatly contributed to the improvement of our paper. We have revised the manuscript according to your suggestions; please see the attachment.
Author Response File: Author Response.docx