Instance Segmentation Method Based on Improved Mask R-CNN for the Stacked Electronic Components
Abstract
1. Introduction
2. Dataset
2.1. Image Collection for the Dataset
2.2. Data Augmentation
2.3. Image Annotation
3. Structure
3.1. Backbone
3.2. RPN
3.3. RoIAlign
3.4. Head Architecture
3.5. Loss Function
4. Experiments and Discussion
4.1. Implementation Details
4.2. Training, Validation, and Test Results
4.3. Testing New Images
4.4. Comparative Study
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
Class | Training Set (Number of Objects) | Validation Set (Number of Objects) | Testing Set (Number of Objects) |
---|---|---|---|
Electrolytic Capacitor | 365 | 120 | 204 |
Resistance | 329 | 168 | 180 |
Tantalum Capacitor | 340 | 130 | 220 |
Potentiometer | 343 | 174 | 210 |
Class | Identity (id) | Description |
---|---|---|
Electrolytic Capacitor | 1 | Capa |
Resistance | 2 | Resis |
Tantalum Capacitor | 3 | Tcapa |
Potentiometer | 4 | Poten |
Stage | Type | Filter Shape | Stride | Stage Output |
---|---|---|---|---|
1 | Conv | 3 × 3 × 3 × 32 | 2 | S1 |
1 | Depthwise Conv | 3 × 3 × 32 | 1 | S1 |
1 | Conv | 1 × 1 × 32 × 64 | 1 | S1 |
2 | Depthwise Conv | 3 × 3 × 64 | 2 | S2 |
2 | Conv | 1 × 1 × 64 × 128 | 1 | S2 |
2 | Depthwise Conv | 3 × 3 × 128 | 1 | S2 |
2 | Conv | 1 × 1 × 128 × 128 | 1 | S2 |
3 | Depthwise Conv | 3 × 3 × 128 | 2 | S3 |
3 | Conv | 1 × 1 × 128 × 256 | 1 | S3 |
3 | Depthwise Conv | 3 × 3 × 256 | 1 | S3 |
3 | Conv | 1 × 1 × 256 × 256 | 1 | S3 |
4 | Depthwise Conv | 3 × 3 × 256 | 2 | S4 |
4 | Conv | 1 × 1 × 256 × 512 | 1 | S4 |
4 | Depthwise Conv | 3 × 3 × 512 | 1 | S4 |
4 | Conv | 1 × 1 × 512 × 512 | 1 | S4 |
5 | Depthwise Conv | 3 × 3 × 512 | 2 | S5 |
5 | Conv | 1 × 1 × 512 × 1024 | 1 | S5 |
5 | Depthwise Conv | 3 × 3 × 1024 | 2 | S5 |
5 | Conv | 1 × 1 × 1024 × 1024 | 1 | S5 |
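The backbone table pairs each 3 × 3 depthwise convolution with a 1 × 1 (pointwise) convolution. As a rough illustration of why this keeps the model small, the sketch below counts the weights in those pairs and compares them with ordinary 3 × 3 convolutions of the same input and output widths. The function names and the `PAIRS` list are illustrative assumptions transcribed from the table, biases and batch-normalization parameters are ignored, and the totals cover only the listed layers, not the full network.

```python
def separable_weights(k, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv (no bias)."""
    return k * k * c_in + c_in * c_out


def standard_weights(k, c_in, c_out):
    """Weights in an ordinary k x k convolution with the same channel widths (no bias)."""
    return k * k * c_in * c_out


# (input channels, output channels) of each depthwise/pointwise pair in the table.
PAIRS = [
    (32, 64), (64, 128), (128, 128), (128, 256), (256, 256),
    (256, 512), (512, 512), (512, 1024), (1024, 1024),
]

separable_total = sum(separable_weights(3, c_in, c_out) for c_in, c_out in PAIRS)
standard_total = sum(standard_weights(3, c_in, c_out) for c_in, c_out in PAIRS)

print(f"separable pairs: {separable_total:,} weights")
print(f"standard 3 x 3 : {standard_total:,} weights")
print(f"ratio          : {separable_total / standard_total:.3f}")
```

For a single pair the ratio is 1/c_out + 1/k², so with 3 × 3 kernels and wide layers the separable form needs roughly one ninth of the weights.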
Method | AP | AP50 | AP75 | Backbone |
---|---|---|---|---|
Mask R-CNN | 60.53 | 82.48 | 64.53 | ResNet-101 |
Cascade Mask R-CNN | 64.74 | 89.65 | 68.47 | ResNet-101 |
Ours | 62.63 | 91.04 | 67.78 | MobileNets-FPN |
Metric | Cascade Mask R-CNN | Mask R-CNN | Ours |
---|---|---|---|
Model size | 615.7 MB | 255.9 MB | 91.1 MB |
Test time per image | 2.61 s | 3.9 s | 1.8 s |
Class | Cascade Mask R-CNN (%) | Mask R-CNN (%) | Ours (%) |
---|---|---|---|
Electrolytic Capacitor | 82.47 | 94.60 | 86.55 |
Resistance | 92.35 | 68.18 | 92.23 |
Tantalum Capacitor | 86.92 | 84.74 | 97.32 |
Potentiometer | 92.62 | 91.09 | 96.36 |
mAP | 88.64 | 84.65 | 93.12 |
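The mAP row in the per-class table can be related to the rows above it with a short calculation. The sketch below assumes mAP is the unweighted mean of the four per-class AP values; that definition, and the names `PER_CLASS_AP` and `mean_ap`, are assumptions made for illustration rather than something the table states.

```python
# Per-class AP values (%) transcribed from the table, in the order:
# Electrolytic Capacitor, Resistance, Tantalum Capacitor, Potentiometer.
PER_CLASS_AP = {
    "Cascade Mask R-CNN": [82.47, 92.35, 86.92, 92.62],
    "Mask R-CNN": [94.60, 68.18, 84.74, 91.09],
    "Ours": [86.55, 92.23, 97.32, 96.36],
}


def mean_ap(per_class):
    """Unweighted mean of per-class AP values (assumed definition of mAP)."""
    return sum(per_class) / len(per_class)


for method, aps in PER_CLASS_AP.items():
    print(f"{method}: mean of per-class AP = {mean_ap(aps):.3f}")
```

Under this assumption the Mask R-CNN and Ours columns agree with the tabulated mAP after rounding, while the Cascade Mask R-CNN column averages to about 88.59 against the tabulated 88.64. The table does not say how mAP was computed, so the cause of that small gap cannot be determined from it.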
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Yang, Z.; Dong, R.; Xu, H.; Gu, J. Instance Segmentation Method Based on Improved Mask R-CNN for the Stacked Electronic Components. Electronics 2020, 9, 886. https://doi.org/10.3390/electronics9060886