Intelligent Perception System of Robot Visual Servo for Complex Industrial Environment
Abstract
1. Introduction
- The visual constraint conditions in the actual production environment are evaluated, and the various constraints are quickly identified from visual characteristics, which effectively improves detection accuracy while preserving the real-time performance of the computation.
- Using YOLO-v2 as the main network model, combined with an ROI pooling structure, densely connected convolutional networks, and embedded deep-dense modules, the proposed network makes full use of the high-resolution features of an image to realize the multiplexing and fusion of its shallow and deep features.
- Based on the requirement that environmental perception systems process environmental information in real time, this paper designs a joint detection and segmentation architecture: object detection and semantic segmentation share the same feature extraction network through joint training, which reduces inference time and effectively improves the performance of both subtasks (a minimal sketch of this shared-backbone idea follows this list).
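As a concrete illustration of the third point, the sketch below shows the shared-backbone pattern in PyTorch: one feature extractor is computed once per image and feeds both a detection head and a segmentation head. The toy backbone, head widths, class count, and anchor count are placeholders chosen for illustration, not the paper's actual network.

```python
import torch
import torch.nn as nn

class JointPerceptionNet(nn.Module):
    """Minimal sketch of a joint detection/segmentation network: a shared
    feature extractor feeds both a detection head and a semantic-segmentation
    head, so one forward pass serves both subtasks."""

    def __init__(self, num_classes: int = 6, anchors: int = 5):
        super().__init__()
        self.backbone = nn.Sequential(            # shared feature extractor
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # YOLO-style detection head: per-cell boxes + objectness + classes
        self.det_head = nn.Conv2d(64, anchors * (5 + num_classes), 1)
        # Segmentation head: per-pixel class scores, upsampled to input size
        self.seg_head = nn.Sequential(
            nn.Conv2d(64, num_classes, 1),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
        )

    def forward(self, x):
        feats = self.backbone(x)                  # computed once, shared
        return self.det_head(feats), self.seg_head(feats)

det, seg = JointPerceptionNet()(torch.randn(1, 3, 416, 416))
```

Because the backbone is shared, the per-image cost of adding the second subtask is only its small head, which is the source of the inference-time savings claimed above.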
2. Classification of Visual Objects in the Industrial Environment
Basic Interaction Matrix of Visual Servo
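The body of this subsection is not reproduced in this outline. For reference, the classical relation underlying image-based visual servoing (the standard form the title refers to, not necessarily the exact matrix derived in the paper) links the image-feature velocity to the camera velocity through the interaction matrix:

```latex
% Standard IBVS relation: feature velocity = interaction matrix * camera twist
\dot{\mathbf{s}} = \mathbf{L}_{s}\,\mathbf{v}_{c}
% For one normalized image point s = (x, y) observed at depth Z,
% the classical point-feature interaction matrix is
\mathbf{L}_{s} =
\begin{pmatrix}
-\frac{1}{Z} & 0 & \frac{x}{Z} & xy & -\left(1 + x^{2}\right) & y \\[2pt]
0 & -\frac{1}{Z} & \frac{y}{Z} & 1 + y^{2} & -xy & -x
\end{pmatrix}
```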
3. Object Detection Model
3.1. YOLO-v2-DENSE Detection Model
- In the YOLO-v2-DENSE network, the 512-channel feature map of layer 21, $x_{21}$, is used as the input of the first dense unit: after batch normalization and the rectified linear unit (ReLU) activation, it is convolved with 256 1 × 1 convolution kernels into 256 feature maps $x_{22}$; then, through another normalization and ReLU operation, 128 3 × 3 convolution kernels are used to obtain 128 feature maps $x_{23}$; finally, $x_{21}$ and $x_{23}$ are stitched into 640 feature maps and used as the input $x_{24}$ of the next unit.
- $x_{24}$, after normalization and the ReLU activation function, is convolved into 256 feature maps $x_{25}$ with 256 1 × 1 convolution kernels; then, through normalization and a ReLU operation, 128 3 × 3 convolution kernels are used to obtain 128 feature maps $x_{26}$; $x_{24}$ and $x_{26}$ are then merged into 768 feature maps and used as the input $x_{27}$ of the following unit.
- By analogy, progressively deeper feature maps, each with 128 more channels than the last, are obtained. The DENSE net makes the input of a layer directly affect all subsequent layers, and the output of layer $\ell$ is expressed as:

$$x_{\ell} = H_{\ell}\left(\left[x_{0}, x_{1}, \ldots, x_{\ell-1}\right]\right)$$

where $\left[x_{0}, x_{1}, \ldots, x_{\ell-1}\right]$ denotes the channel-wise concatenation of the feature maps of all preceding layers and $H_{\ell}(\cdot)$ is the composite normalization-activation-convolution operation. A code sketch of one such dense unit follows.
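The sketch below implements one dense unit exactly as described above (BN, ReLU, 1 × 1 bottleneck to 256 channels, BN, ReLU, 3 × 3 convolution producing 128 new maps, then concatenation with the unit's input). The class name and the example tensor shapes are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class DenseUnit(nn.Module):
    """One BN -> ReLU -> 1x1 conv (256) -> BN -> ReLU -> 3x3 conv (128) unit
    whose 128 output maps are concatenated with the unit's input, matching
    the channel counts in the text."""

    def __init__(self, in_channels: int, bottleneck: int = 256, growth: int = 128):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(in_channels)
        self.conv1 = nn.Conv2d(in_channels, bottleneck, kernel_size=1, bias=False)
        self.bn2 = nn.BatchNorm2d(bottleneck)
        self.conv2 = nn.Conv2d(bottleneck, growth, kernel_size=3, padding=1, bias=False)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.conv1(self.relu(self.bn1(x)))    # 256 x H x W
        out = self.conv2(self.relu(self.bn2(out)))  # 128 x H x W
        return torch.cat([x, out], dim=1)           # input channels + 128

# Example: a 512-channel layer-21 map grows to 640, then 768 channels.
x21 = torch.randn(1, 512, 13, 13)
x24 = DenseUnit(512)(x21)   # shape (1, 640, 13, 13)
x27 = DenseUnit(640)(x24)   # shape (1, 768, 13, 13)
```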
3.2. YOLO-v2 Algorithm Architecture Integrating ROI
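The subsection body is not reproduced in this outline. As a minimal illustration of the ROI pooling operation named in the title, the standard torchvision op below pools each candidate region of a feature map to a fixed-size grid; the feature-map shape, the region box, and the scale are assumed values for the example, not parameters taken from the paper.

```python
import torch
from torchvision.ops import roi_pool

# Pool a candidate region of a convolutional feature map to a fixed 7x7 grid.
feats = torch.randn(1, 640, 13, 13)             # e.g., a dense-unit output
# Boxes are given as (batch_index, x1, y1, x2, y2) in input-image coordinates.
rois = torch.tensor([[0.0, 32.0, 32.0, 224.0, 224.0]])
pooled = roi_pool(feats, rois, output_size=(7, 7), spatial_scale=13 / 416)
print(pooled.shape)                              # torch.Size([1, 640, 7, 7])
```

Fixed-size pooled features are what let subsequent layers score regions of arbitrary size, which is the point of integrating ROI pooling into the detector.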
4. Experimental Results Analysis
4.1. Experimental Platform Construction
4.2. Training Parameters Configuration
4.3. Experimental Results and Analysis
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- Li, S.; Li, D.; Zhang, C.; Wan, J.; Xie, M. RGB-D Image Processing Algorithm for Target Recognition and Pose Estimation of Visual Servo System. Sensors 2020, 20, 430.
- Ghasemi, A.; Li, P.; Xie, W.F. Adaptive Switch Image-based Visual Servoing for Industrial Robots. Int. J. Control Autom. Syst. 2019, 18, 1324–1334.
- Lee, D.; Kim, G.; Kim, D.; Myung, H.; Choi, H.-T. Vision-based object detection and tracking for autonomous navigation of underwater robots. Ocean Eng. 2012, 48, 59–68.
- Tao, B.; Gong, Z.; Ding, H. Research progress of robot calibration-free visual servo control. Chin. J. Theor. Appl. Mech. 2016, 48, 767–783.
- Sivaraman, S.; Trivedi, M. A General Active-Learning Framework for On-Road Vehicle Recognition and Tracking. IEEE Trans. Intell. Transp. Syst. 2010, 11, 267–276.
- Bay, H.; Ess, A.; Tuytelaars, T.; Van Gool, L. Speeded-Up Robust Features (SURF). Comput. Vis. Image Underst. 2008, 110, 346–359.
- Burges, C.J. A Tutorial on Support Vector Machines for Pattern Recognition. Data Min. Knowl. Discov. 1998, 2, 121–167.
- Sermanet, P.; Eigen, D.; Zhang, X.; Mathieu, M.; Fergus, R.; LeCun, Y. Overfeat: Integrated recognition, localization and detection using convolutional networks. arXiv 2013, arXiv:1312.6229.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Van De Sande, K.E.A.; Uijlings, J.R.R.; Gevers, T.; Smeulders, A.W.M. Segmentation as selective search for object recognition. In Proceedings of the 2011 International Conference on Computer Vision (ICCV), Barcelona, Spain, 6–13 November 2011; pp. 1879–1886.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems (NIPS), Montréal, QC, Canada, 7–12 December 2015; pp. 91–99.
- He, K.; Gkioxari, G.; Dollar, P.; Girshick, R.B. Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 386–397.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Doulamis, A.; Doulamis, N.; Ntalianis, K.; Kollias, S. An efficient fully unsupervised video object segmentation scheme using an adaptive neural-network classifier architecture. IEEE Trans. Neural Netw. 2003, 14, 616–630.
- Martinez-Martin, E.; Del Pobil, A.P. Object Detection and Recognition for Assistive Robots: Experimentation and Implementation. IEEE Robot. Autom. Mag. 2017, 24, 123–138.
- Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
- Ullah, S.; Kim, D.-H. Lightweight Driver Behavior Identification Model with Sparse Learning on In-Vehicle CAN-BUS Sensor Data. Sensors 2020, 20, 5030.
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI), Munich, Germany, 5–9 October 2015.
- Chen, L.-C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848.
- Kim, B.; Ye, J.C. Mumford–Shah Loss Functional for Image Segmentation with Deep Learning. IEEE Trans. Image Process. 2019, 29, 1856–1866.
- Li, T.; Zhang, K.; Li, W.; Huang, Q. Research on ROI Algorithm of Ship Image Based on Improved YOLO. In Proceedings of the 2019 International Conference on Artificial Intelligence and Advanced Manufacturing (AIAM), Dublin, Ireland, 16–18 October 2019; pp. 130–133.
- Li, S.; Tao, F.; Shi, T.; Kuang, J. Improvement of YOLOv3 network based on ROI. In Proceedings of the 2019 IEEE 4th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chengdu, China, 20–22 December 2019; Volume 1, pp. 2590–2596.
- Morera, Á.; Sánchez, Á.; Moreno, A.; Sappa, A.D.; Vélez, J.F. SSD vs. YOLO for Detection of Outdoor Urban Advertising Panels under Multiple Variabilities. Sensors 2020, 20, 4587.
- Chollet, F. Keras. Available online: https://keras.io (accessed on 9 September 2020).
- Tzutalin. LabelImg. Available online: https://github.com/tzutalin/labelImg (accessed on 14 September 2020).
- Singh, S.P.; Wang, L.; Gupta, S.; Goli, H.; Padmanabhan, P.; Gulyás, B. 3D Deep Learning on Medical Images: A Review. Sensors 2020, 20, 5097.
- Tien, K.-Y.; Samani, H.; Lui, J.H. A survey on image processing in noisy environment by fuzzy logic, image fusion, neural network, and non-local means. In Proceedings of the 2017 International Automatic Control Conference (CACS), Pingtung, Taiwan, 12–15 November 2017; pp. 1–6.
- Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Available online: https://www.tensorflow.org (accessed on 14 September 2020).
- Jia, Y.; Shelhamer, E.; Donahue, J.; Karayev, S.; Long, J.; Girshick, R.; Guadarrama, S.; Darrell, T. Caffe. Available online: https://github.com/BVLC/caffe (accessed on 16 September 2020).
Match index of each visual feature for detection with YOLO-v2-ROI (√ = match, × = mismatch):

| Object | Shape | Color | Texture | Edge | Area |
|---|---|---|---|---|---|
| Target object | √ | √ | √ | √ | √ |
| Interfering object 1 | √ | √ | × | √ | √ |
| Interfering object 2 | √ | × | √ | √ | √ |
| Interfering object 3 | × | × | × | √ | √ |
| Interfering object 4 | × | √ | √ | × | × |
| Interfering object 5 | √ | × | √ | √ | √ |
| Interfering object 6 | × | × | × | × | × |
Training parameter configuration of YOLO-v2-ROI:

| Parameter | Value |
|---|---|
| Learning rate | 0.001 |
| Learning-rate decay policy | steps |
| Sample update parameters | 5000 |
| Momentum | 0.76 |
| Weight-decay regularization parameter | 0.0001 |
| Maximum number of iterations | 4500 |
| Learning-rate change ratios during the experiment | 0.1, 0.01, 0.001 |
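The decay policy in the table can be read as a step schedule: the base rate of 0.001 is scaled by the listed ratios as training passes successive iteration boundaries. Below is a minimal sketch in Python; the boundary iterations are hypothetical placeholders, since the table does not specify them.

```python
def step_lr(iteration: int,
            base_lr: float = 0.001,
            scales: tuple = (0.1, 0.01, 0.001),
            boundaries: tuple = (1500, 3000, 4000)) -> float:
    """Step-decay learning-rate schedule matching the table's 'steps' policy.

    base_lr and scales come from the table; the boundary iterations are
    hypothetical placeholders (the table does not list them).
    """
    lr = base_lr
    for boundary, scale in zip(boundaries, scales):
        if iteration >= boundary:
            lr = base_lr * scale
    return lr

# e.g., step_lr(0) -> 0.001, step_lr(2000) -> 0.0001, step_lr(4200) -> 1e-06
```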
| Model | Input Size (Pixels) | Iterations | Accuracy | Detection Speed (Frames/s) |
|---|---|---|---|---|
| YOLO-v2 | 416 × 416 | 4500 | 81.45% | 25 |
| YOLO-v2-DENSE | 416 × 416 | 4500 | 83.51% | 27 |
| YOLO-v2-ROI | 416 × 416 | 4500 | 93.23% | 36 |