Tell Me, What Do You See?—Interpretable Classification of Wiring Harness Branches with Deep Neural Networks
Abstract
1. Introduction
2. Related Work
3. Materials and Methods
3.1. Industrial Context
3.2. Dataset
3.3. Evaluated Models
- RGB—a model that takes as input an RGB image only;
- Depth—a model that takes as input an 8-bit depth image only;
- Depth Jet—a model that takes as input a 24-bit depth image colored with the jet color map;
- RGBD—a pair of RGB and Depth models fused by weighting their logits;
- RGBD Jet—a pair of RGB and Depth Jet models fused by weighting their logits;
- RGBD early fusion—a model that takes as input an RGB image concatenated channel-wise with a depth image (both fusion strategies are sketched below).
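To make the distinction between the two fusion strategies concrete, the following minimal sketch contrasts logit-weighted late fusion (RGBD, RGBD Jet) with channel-wise early fusion. It is illustrative only: the number of classes, the image resolution, and the equal 0.5/0.5 weights are assumptions, not the values used by the evaluated models.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# --- Late fusion (RGBD / RGBD Jet): weight the per-branch logits of two unimodal models ---
rgb_logits = np.array([2.1, 0.3, -1.0])    # hypothetical output of the RGB model
depth_logits = np.array([1.5, 0.9, -0.2])  # hypothetical output of the Depth model
w_rgb, w_depth = 0.5, 0.5                  # illustrative weights; the actual weighting may differ
fused_probs = softmax(w_rgb * rgb_logits + w_depth * depth_logits)
print("late-fusion prediction:", fused_probs.argmax())

# --- Early fusion: concatenate RGB and depth along the channel axis before a single network ---
h, w = 240, 320                                          # assumed resolution
rgb_image = np.zeros((h, w, 3), dtype=np.float32)
depth_image = np.zeros((h, w, 1), dtype=np.float32)
early_input = np.concatenate([rgb_image, depth_image], axis=-1)  # shape (h, w, 4)
print("early-fusion input shape:", early_input.shape)
```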
3.4. Class-Based and Class-Agnostic Saliency Map Generation
4. Experiments
4.1. Model Accuracy Evaluation
4.2. Analysis of the Models' Reliability
4.3. Analysis of the Dataset
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Accuracy [%] on the training and validation sets for each dataset split and the 3-fold average (mean ± standard deviation).

| Model | I, II/III Training | I, II/III Validation | II, III/I Training | II, III/I Validation | I, III/II Training | I, III/II Validation | 3-Fold Avg. Training | 3-Fold Avg. Validation |
|---|---|---|---|---|---|---|---|---|
| RGB | 99.0 | 88.1 | 94.3 | 92.5 | 92.0 | 96.0 | 95.1 ± 3.6 | 92.2 ± 4.0 |
| Depth | 93.3 | 86.6 | 95.8 | 93.0 | 93.8 | 94.5 | 94.3 ± 1.3 | 91.4 ± 4.2 |
| Depth Jet | 96.3 | 90.1 | 97.5 | 94.5 | 93.5 | 81.0 | 95.8 ± 2.0 | 88.5 ± 6.9 |
| RGBD | 91.3 | 84.6 | 95.5 | 94.5 | 99.0 | 99.5 | 95.3 ± 3.9 | 92.9 ± 7.6 |
| RGBD Jet | 96.3 | 88.6 | 95.0 | 92.0 | 81.6 | 67.5 | 90.9 ± 8.2 | 82.7 ± 13.3 |
| RGBD early fusion | 96.5 | 90.1 | 96.8 | 92.5 | 95.0 | 98.0 | 96.1 ± 0.9 | 93.5 ± 4.0 |
| RGB + Depth | 95.5 | 91.5 | 96.5 | 94.0 | 95.8 | 96.5 | 95.9 ± 0.5 | 94.0 ± 2.5 |
| RGB + Depth Jet | 96.3 | 89.6 | 97.5 | 93.5 | 96.5 | 94.0 | 96.8 ± 0.7 | 92.4 ± 2.4 |
| RGB + Depth + IN | 69.0 | 63.7 | 64.8 | 75.0 | 69.8 | 77.5 | 67.9 ± 2.2 | 72.1 ± 6.0 |
| RGB + Depth + DA | 100.0 | 92.5 | 100.0 | 98.0 | 100.0 | 98.0 | 100.0 ± 0.0 | 95.5 ± 2.3 |
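As a sanity check on how the 3-Fold Average columns are read, the snippet below recomputes the RGB row's averages from its three per-split values. Assuming the reported spread is the sample standard deviation (ddof=1), it reproduces the tabulated 95.1 ± 3.6 and 92.2 ± 4.0.

```python
import numpy as np

# Per-split accuracies for the RGB model, copied from the table above.
rgb_train = np.array([99.0, 94.3, 92.0])
rgb_val = np.array([88.1, 92.5, 96.0])

# Assumption: "3-Fold Avg." reports mean ± sample standard deviation over the three splits.
for name, values in [("training", rgb_train), ("validation", rgb_val)]:
    print(f"RGB {name}: {values.mean():.1f} ± {values.std(ddof=1):.1f}")
# Prints:
# RGB training: 95.1 ± 3.6
# RGB validation: 92.2 ± 4.0
```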