A Millimeter-Wave Radar-Aided Vision Detection Method for Water Surface Small Object Detection
Abstract
1. Introduction
- Extrinsic calibration. To perform decision-level fusion, the spatial relationship between the mmWave radar and the camera must be known; determining it is referred to as extrinsic calibration. Because mmWave radar point clouds are sparse and prone to glinting (specular) artifacts, extrinsic calibration between mmWave radar and cameras typically requires dedicated markers, and the calibration process is usually complex. Current extrinsic calibration is mainly conducted offline with human assistance. However, the positions of the sensors on the platform may shift due to vibrations, shocks, or structural deformations of USVs, causing the extrinsic parameters between the mmWave radar and the camera to drift over time (a projection sketch follows this list).
- Data association. Traditional methods manually craft distance metrics to measure the similarity between vision and mmWave radar data. However, such hand-crafted metrics do not adapt when data from either sensor degrade, and tuning their parameters is challenging.
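The extrinsic calibration issue can be made concrete with a small example. The sketch below shows the standard rigid-transform-plus-pinhole projection that decision-level fusion relies on; the rotation R, translation t, and intrinsic matrix K are placeholder values for illustration, not calibration results from this work. Any drift in R or t shifts where radar detections land in the image, which is why inaccurate extrinsics degrade naive geometric association.

```python
import numpy as np

# Placeholder extrinsic parameters (rotation R, translation t) mapping the
# radar frame to the camera frame, and a pinhole intrinsic matrix K.
# These values are illustrative, not calibration results from the paper.
R = np.eye(3)                          # radar-to-camera rotation
t = np.array([0.0, 0.1, 0.0])          # radar-to-camera translation (m)
K = np.array([[800.0,   0.0, 320.0],   # fx,  0, cx
              [  0.0, 800.0, 240.0],   #  0, fy, cy
              [  0.0,   0.0,   1.0]])

def project_radar_to_image(points_radar: np.ndarray) -> np.ndarray:
    """Project Nx3 radar points (x, y, z in the radar frame) to pixels."""
    points_cam = points_radar @ R.T + t   # rigid transform into camera frame
    uvw = points_cam @ K.T                # apply pinhole intrinsics
    return uvw[:, :2] / uvw[:, 2:3]       # perspective division -> (u, v)

# Example: a radar detection 12 m ahead, slightly left of the optical axis.
print(project_radar_to_image(np.array([[-0.5, 0.2, 12.0]])))
```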
- We propose a new mmWave radar-aided visual small object detection method.
- We propose a new image–radar association model based on metric learning, which achieves robust association between mmWave radar data and images even under moderately inaccurate extrinsic parameters (a minimal association sketch follows this list).
- We test the proposed method on real-world data, and the results show that our method achieves significantly better performance than current vision detection methods.
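As a rough illustration of the metric-learning association idea in the second contribution, the sketch below embeds image detections and radar clusters into a shared space and matches them with the Hungarian algorithm via SciPy's linear_sum_assignment. The layer widths, feature dimensions (256-D box features, 32-D radar cluster features), and embedding size are assumptions for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn
from scipy.optimize import linear_sum_assignment

class EmbeddingHead(nn.Module):
    """Small MLP mapping per-detection features to unit-norm embeddings.
    Hypothetical layer sizes; not the architecture used in the paper."""
    def __init__(self, in_dim: int, emb_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, emb_dim))

    def forward(self, x):
        return nn.functional.normalize(self.mlp(x), dim=-1)

image_head = EmbeddingHead(in_dim=256)   # e.g., pooled image box features
radar_head = EmbeddingHead(in_dim=32)    # e.g., PointNet cluster features

def associate(img_feats, radar_feats):
    """Match image detections to radar clusters by embedding distance."""
    e_img = image_head(img_feats)             # (M, 64)
    e_rad = radar_head(radar_feats)           # (N, 64)
    cost = torch.cdist(e_img, e_rad)          # pairwise Euclidean distances
    rows, cols = linear_sum_assignment(cost.detach().numpy())  # Hungarian
    return list(zip(rows.tolist(), cols.tolist()))

# Example with random features: 4 image boxes, 3 radar clusters.
print(associate(torch.randn(4, 256), torch.randn(3, 32)))

# During training, a triplet loss (e.g., torch.nn.TripletMarginLoss) would
# pull matched image-radar pairs together and push mismatched pairs apart.
```

Because matching happens in a learned embedding space rather than through a hand-set geometric threshold, the association can remain stable under moderate extrinsic drift, which is consistent with the robustness analysis in Section 4.4.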
2. Related Works
2.1. Object Detection on Water Surfaces
2.2. Visual–Radar Fusion Detection
3. Our Method
3.1. Network Overview
3.2. Detection Stage
3.2.1. Vision-Based Detection
3.2.2. Radar-Based Detection
3.3. Fusion Association Stage
3.4. Loss Function
4. Experiment and Evaluation
4.1. Dataset, Evaluation Metric, and Baseline
4.2. Training Details
4.3. Quantitative Evaluation
4.4. Robustness Analysis
4.5. Ablation Analysis
4.6. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition
---|---
CFAR | Constant false alarm rate
DBSCAN | Density-based spatial clustering of applications with noise
DOA | Direction of arrival
FC | Fully connected
FFT | Fast Fourier transform
FMCW | Frequency-modulated continuous wave
FPS | Farthest point sampling
GNSS | Global navigation satellite system
IMU | Inertial measurement unit
mAP | Mean average precision
MLP | Multi-layer perceptron
mmWave | Millimeter wave
RDM | Range–Doppler matrix
USV | Unmanned surface vehicle
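Several of the entries above (FFT, RDM, CFAR, DBSCAN) name the standard FMCW radar detection chain: range/Doppler FFTs build the range–Doppler matrix, CFAR thresholds it adaptively, DOA estimation localizes detections, and DBSCAN clusters them into objects. As a minimal illustration of the CFAR step only, here is a 1-D cell-averaging CFAR sketch in NumPy; the guard/training window sizes and threshold scale are hypothetical, not the detector configuration used in this work.

```python
import numpy as np

def ca_cfar_1d(power: np.ndarray, guard: int = 2, train: int = 8,
               scale: float = 4.0) -> np.ndarray:
    """Cell-averaging CFAR over a 1-D power profile (e.g., one RDM row).

    For each cell, the noise level is estimated from `train` cells on each
    side, excluding `guard` cells around the cell under test; a detection
    is declared when the cell exceeds `scale` times that estimate.
    """
    n = len(power)
    detections = np.zeros(n, dtype=bool)
    for i in range(train + guard, n - train - guard):
        left = power[i - guard - train : i - guard]
        right = power[i + guard + 1 : i + guard + 1 + train]
        noise = np.mean(np.concatenate([left, right]))
        detections[i] = power[i] > scale * noise
    return detections

# Example: a flat noise floor with one strong target at index 50.
profile = np.ones(128)
profile[50] = 20.0
print(np.nonzero(ca_cfar_1d(profile))[0])   # -> [50]
```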
References
- Wang, W.; Gheneti, B.; Mateos, L.A.; Duarte, F.; Ratti, C.; Rus, D. Roboat: An autonomous surface vehicle for urban waterways. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Venetian Macao, Macau, 4–8 November 2019; pp. 6340–6347. [Google Scholar]
- Chang, H.C.; Hsu, Y.L.; Hung, S.S.; Ou, G.R.; Wu, J.R.; Hsu, C. Autonomous Water Quality Monitoring and Water Surface Cleaning for Unmanned Surface Vehicle. Sensors 2021, 21, 1102. [Google Scholar] [CrossRef]
- Zhu, J.; Yang, Y.; Cheng, Y. SMURF: A Fully Autonomous Water Surface Cleaning Robot with A Novel Coverage Path Planning Method. J. Mar. Sci. Eng. 2022, 10, 1620. [Google Scholar] [CrossRef]
- Wu, Y.; Wang, Y.; Zhang, S.; Ogai, H. Deep 3D object detection networks using LiDAR data: A review. IEEE Sens. J. 2020, 21, 1152–1171. [Google Scholar] [CrossRef]
- Carballo, A.; Lambert, J.; Monrroy, A.; Wong, D.; Narksri, P.; Kitsukawa, Y.; Takeuchi, E.; Kato, S.; Takeda, K. LIBRE: The multiple 3D lidar dataset. In Proceedings of the 2020 IEEE Intelligent Vehicles Symposium (IV), Las Vegas, NV, USA, 19 October–13 November 2020; pp. 1094–1101. [Google Scholar]
- Patole, S.M.; Torlak, M.; Wang, D.; Ali, M. Automotive Radars: A Review of Signal Processing Techniques. IEEE Signal Process. Mag. 2017, 34, 22–35. [Google Scholar] [CrossRef]
- Brodeski, D.; Bilik, I.; Giryes, R. Deep radar detector. In Proceedings of the 2019 IEEE Radar Conference (RadarConf), Boston, MA, USA, 22–26 April 2019; pp. 1–6. [Google Scholar]
- Hammedi, W.; Ramirez-Martinez, M.; Brunet, P.; Senouci, S.M.; Messous, M.A. Deep learning-based real-time object detection in inland navigation. In Proceedings of the 2019 IEEE Global Communications Conference (GLOBECOM), Waikoloa, HI, USA, 9–13 December 2019; pp. 1–6. [Google Scholar]
- Moosbauer, S.; Konig, D.; Jakel, J.; Teutsch, M. A benchmark for deep learning based object detection in maritime environments. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019. [Google Scholar]
- Prasad, D.K.; Rajan, D.; Rachmawati, L.; Rajabally, E.; Quek, C. Video processing from electro-optical sensors for object detection and tracking in a maritime environment: A survey. IEEE Trans. Intell. Transp. Syst. 2017, 18, 1993–2016. [Google Scholar] [CrossRef]
- Zhou, Z.; Yu, S.; Liu, K. A Real-time Algorithm for Visual Detection of High-speed Unmanned Surface Vehicle Based on Deep Learning. In Proceedings of the 2019 IEEE International Conference on Signal, Information and Data Processing (ICSIDP), Chongqing, China, 11–13 December 2019; pp. 1–5. [Google Scholar]
- Zhang, W.; Gao, X.Z.; Yang, C.F.; Jiang, F.; Chen, Z.Y. A object detection and tracking method for security in intelligence of unmanned surface vehicles. J. Ambient. Intell. Humaniz. Comput. 2020, 13, 1279–1291. [Google Scholar] [CrossRef]
- Li, Y.; Guo, J.; Guo, X.; Liu, K.; Zhao, W.; Luo, Y.; Wang, Z. A novel target detection method of the unmanned surface vehicle under all-weather conditions with an improved YOLOV3. Sensors 2020, 20, 4885. [Google Scholar] [CrossRef] [PubMed]
- Wu, Y.; Qin, H.; Liu, T.; Liu, H.; Wei, Z. A 3D object detection based on multi-modality sensors of USV. Appl. Sci. 2019, 9, 535. [Google Scholar] [CrossRef]
- Cardillo, E.; Ferro, L. Multi-frequency analysis of microwave and millimeter-wave radars for ship collision avoidance. In Proceedings of the 2022 Microwave Mediterranean Symposium (MMS), Pizzo Calabro, Italy, 9–13 May 2022; pp. 1–4. [Google Scholar]
- Im, S.; Kim, D.; Cheon, H.; Ryu, J. Object Detection and Tracking System with Improved DBSCAN Clustering using Radar on Unmanned Surface Vehicle. In Proceedings of the 2021 21st International Conference on Control, Automation and Systems (ICCAS), Jeju, Republic of Korea, 12–15 October 2021; pp. 868–872. [Google Scholar]
- Ha, J.S.; Im, S.R.; Lee, W.K.; Kim, D.H.; Ryu, J.K. Radar based Obstacle Detection System for Autonomous Unmanned Surface Vehicles. In Proceedings of the 2021 21st International Conference on Control, Automation and Systems (ICCAS), Jeju, Republic of Korea, 12–15 October 2021; pp. 863–867. [Google Scholar]
- Stanislas, L.; Dunbabin, M. Multimodal sensor fusion for robust obstacle detection and classification in the maritime RobotX challenge. IEEE J. Ocean. Eng. 2018, 44, 343–351. [Google Scholar] [CrossRef]
- Long, Y.; Morris, D.; Liu, X.; Castro, M.; Chakravarty, P.; Narayanan, P. Radar-camera pixel depth association for depth completion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 12507–12516. [Google Scholar]
- Nobis, F.; Geisslinger, M.; Weber, M.; Betz, J.; Lienkamp, M. A deep learning-based radar and camera sensor fusion architecture for object detection. In Proceedings of the 2019 Sensor Data Fusion: Trends, Solutions, Applications (SDF), Bonn, Germany, 15–17 October 2019; pp. 1–7. [Google Scholar]
- Nabati, R.; Qi, H. Rrpn: Radar region proposal network for object detection in autonomous vehicles. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 3093–3097. [Google Scholar]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- Cheng, Y.; Xu, H.; Liu, Y. Robust Small Object Detection on the Water Surface Through Fusion of Camera and Millimeter Wave Radar. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 15263–15272. [Google Scholar]
- Chadwick, S.; Maddern, W.; Newman, P. Distant vehicle detection using radar and vision. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–4 June 2019; pp. 8311–8317. [Google Scholar]
- Li, L.Q.; Xie, Y.L. A feature pyramid fusion detection algorithm based on radar and camera sensor. In Proceedings of the 2020 15th IEEE International Conference on Signal Processing (ICSP), Beijing, China, 6–9 December 2020; Volume 1, pp. 366–370. [Google Scholar]
- Nabati, R.; Qi, H. Centerfusion: Center-based radar and camera fusion for 3D object detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Online, 5–9 January 2021; pp. 1527–1536. [Google Scholar]
- Chang, S.; Zhang, Y.; Zhang, F.; Zhao, X.; Huang, S.; Feng, Z.; Wei, Z. Spatial attention fusion for obstacle detection using mmwave radar and vision sensor. Sensors 2020, 20, 956. [Google Scholar] [CrossRef] [PubMed]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Jha, H.; Lodhi, V.; Chakravarty, D. Object detection and identification using vision and radar data fusion system for ground-based navigation. In Proceedings of the 2019 6th International Conference on Signal Processing and Integrated Networks (SPIN), Noida, India, 7–8 March 2019; pp. 590–593. [Google Scholar]
- Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
- Domhof, J.; Kooij, J.F.; Gavrila, D.M. An extrinsic calibration tool for radar, camera and lidar. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 8107–8113. [Google Scholar]
- Ultralytics. YOLO-v5. 2020. Available online: https://github.com/ultralytics/yolov5 (accessed on 20 August 2023).
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
- Bäcklund, H.; Hedblom, A.; Neijman, N. A density-based spatial clustering of application with noise. Data Min. TNM033 2011, 33, 11–30. [Google Scholar]
- Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. Adv. Neural Inf. Process. Syst. 2017, 30, 5105–5114. [Google Scholar]
- Kuhn, H.W. The Hungarian method for the assignment problem. Nav. Res. Logist. Q. 1955, 2, 83–97. [Google Scholar] [CrossRef]
- Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 652–660. [Google Scholar]
- Schroff, F.; Kalenichenko, D.; Philbin, J. Facenet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 815–823. [Google Scholar]
- Cheng, Y.; Zhu, J.; Jiang, M.; Fu, J.; Pang, C.; Wang, P.; Sankaran, K.; Onabola, O.; Liu, Y.; Liu, D.; et al. FloW: A Dataset and Benchmark for Floating Waste Detection in Inland Waters. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 10953–10962. [Google Scholar]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
- Girshick, R. Fast r-cnn. In Proceedings of the IEEE International Conference on Computer Vision, Washington, DC, USA, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
- Cai, Z.; Vasconcelos, N. Cascade r-cnn: Delving into high quality object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6154–6162. [Google Scholar]
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 10012–10022. [Google Scholar]
- Qi, C.R.; Litany, O.; He, K.; Guibas, L.J. Deep hough voting for 3d object detection in point clouds. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9277–9286. [Google Scholar]
- Danzer, A.; Griebel, T.; Bach, M.; Dietmayer, K. 2d car detection in radar data with pointnets. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27–30 October 2019; pp. 61–66. [Google Scholar]
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft coco: Common objects in context. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; Springer: Cham, Switzerland, 2014; pp. 740–755. [Google Scholar]
Comparison with vision-only detection methods:

Method | mAP (IoU = 0.35, %) | FPS
---|---|---
YOLOv5-l [32] | 74.66 | 29
Cascade-RCNN [42] | 78.36 | 17
Faster-RCNN [41] | 74.34 | 19
Swin-Transformer [43] | 77.33 | 15
Ours | 81.41 | 29
Comparison across sensing modalities:

Modality | Method | mAP (IoU = 0.35, %)
---|---|---
Radar | VoteNet [44] | 45.24
Radar | Danzer et al. [45] | 32.65
Vision + Radar | CRF-Net [20] | 74.35
Vision + Radar | Li et al. [25] | 77.23
Vision + Radar | Jha et al. [29] | 77.98
Vision + Radar | RISFNet [23] | 83.25
Vision + Radar | Ours | 81.41
Gains from adding the image–radar association model to vision detectors:

Method (with Image–Radar Association Model) | mAP (IoU = 0.35, %) | FPS
---|---|---
YOLOv5-l [32] | 81.41 (+6.75) | 29
Cascade-RCNN [42] | 83.62 (+5.26) | 15
Faster-RCNN [41] | 79.53 (+5.19) | 17
Swin-Transformer [43] | 82.42 (+5.09) | 19
Robustness to bias in the radar data:

Setting | mAP (IoU = 0.35, %)
---|---
Using original radar data | 81.41
Using radar data with slight bias | 80.83
Using radar data with large bias | 56.29
Ablation results:

Setting | mAP (IoU = 0.35, %)
---|---
Without double prediction heads | 79.85
Without image–radar association model | 78.17
Our method (full) | 81.41