An Efficient Multi-Label Classification-Based Municipal Waste Image Identification
Abstract
1. Introduction
- Development of a flexible multi-label image classification framework: We present the Query2Label (Q2L) framework, tailored for the complex task of municipal waste image recognition. The model identifies multiple types of waste within a single image, using self-attention and cross-attention mechanisms to classify each waste type accurately and efficiently.
- Utilization of a novel municipal waste dataset: Our study employs the “Garbage In, Garbage Out” (GIGO) dataset, a newly developed collection of urban waste images. Its diverse, real-world scenes provide a wide array of waste images for training and testing, which helps improve the model’s performance.
- High accuracy with low computational complexity: Compared to existing models, our approach achieves superior precision in identifying various types of waste while maintaining computational efficiency. This ensures the model’s suitability for real-time applications, highlighting its potential for practical deployment in waste management systems.
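The cross-attention idea named in the first contribution can be illustrated with a minimal sketch, assuming a single attention head, no learned key/value projections, and no self-attention between label queries. This is a simplification for illustration, not the authors' implementation; the function names and the toy inputs (`features`, `queries`, `proj_w`, `proj_b`) are hypothetical. Each label owns one query vector that attends over the backbone's spatial features, and the pooled result is mapped to an independent sigmoid probability, so several labels can be active for one image.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention(queries, features):
    """Each label query attends over all spatial features (keys = values = features)."""
    scale = math.sqrt(len(features[0]))
    pooled = []
    for q in queries:
        weights = softmax([dot(q, f) / scale for f in features])
        pooled.append([
            sum(w * f[d] for w, f in zip(weights, features))
            for d in range(len(q))
        ])
    return pooled

def predict(queries, features, proj_w, proj_b):
    """Project each label's pooled feature to one logit, then apply a sigmoid."""
    pooled = cross_attention(queries, features)
    logits = [dot(w, p) + b for w, p, b in zip(proj_w, pooled, proj_b)]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

# Toy example (hypothetical values): 3 spatial positions, feature dimension 2, 2 labels.
features = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
queries = [[1.0, 0.0], [0.0, 1.0]]   # one query vector per label
proj_w = [[1.0, 0.0], [0.0, 1.0]]    # one projection vector per label
proj_b = [-1.0, -1.0]
probs = predict(queries, features, proj_w, proj_b)
print(probs)  # both labels score above 0.5 because each finds a matching region
```

Because every label gets its own sigmoid rather than sharing a softmax, the probabilities do not compete, which is what lets one image carry several waste labels at once.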
2. Related Work
2.1. Multi-Label Image Classification
2.2. Intelligent Waste Identification
3. Method
3.1. Q2L Framework for Intelligent Multi-Label Waste Image Recognition
3.2. Backbone
3.3. Asymmetric Loss Function
4. Dataset and Experimental Settings
4.1. Dataset—GIGO
4.2. Experimental Settings
4.3. Evaluation Metrics
5. Experimental Evaluations
5.1. Comparison of Different Backbone Networks
5.2. Comparison of Different Loss Functions
5.3. Confusion Matrix
5.4. Ablation Experiment
- Baseline model: Utilized the initial backbone network and loss function settings, serving as the performance comparison benchmark.
- Backbone modification: Altered only the backbone network to ViT-B/16, assessing its impact on model performance.
- Loss function modification: Maintained the backbone network while changing the loss function to asymmetric loss, exploring the performance improvement due to this modification.
- Combined modification: Simultaneously changed the backbone network to ViT-B/16 and the loss function to asymmetric loss, examining the model’s performance under the combined effect of these optimizations.
6. Conclusions and Future Work
- Dataset expansion and diversification: To enhance the model’s generalization capabilities across a broader spectrum of waste types and scenarios, it is imperative to expand and diversify the training dataset. This expansion could include a variety of waste materials and configurations, as well as a more extensive range of environmental conditions. Additionally, incorporating data from multiple cities can mitigate the influence of specific urban aesthetics and municipal characteristics, which will further enhance the model’s adaptability and performance across diverse urban settings.
- Integration of multiple sensory inputs: Incorporating data from additional modalities, such as infrared imaging, depth sensing, and perhaps even acoustic sensors, could significantly enhance the model’s ability to distinguish between different types of waste in visually complex scenes. This multi-modal approach might reveal characteristics of materials that are not apparent in visual-spectrum photographs alone.
- Development of lightweight models: Investigating and developing more efficient model architectures that maintain high accuracy while being computationally less demanding is essential. This could facilitate the deployment of advanced waste classification systems on mobile or embedded devices, enabling real-time processing and decision making at the point of waste collection or sorting.
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Al-Antari, M.A. Artificial intelligence for medical diagnostics—Existing and future AI technology! Diagnostics 2023, 13, 688.
- Grigorescu, S.; Trasnea, B.; Cocias, T.; Macesanu, G. A survey of deep learning techniques for autonomous driving. J. Field Robot. 2020, 37, 362–386.
- Kolhar, M.; Alameen, A. Artificial Intelligence Based Language Translation Platform. Intell. Autom. Soft Comput. 2021, 28.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
- Ruiz, V.; Sánchez, Á.; Vélez, J.F.; Raducanu, B. Automatic image-based waste classification. In Proceedings of the From Bioinspired Systems and Biomedical Applications to Machine Learning: 8th International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC 2019, Almería, Spain, 3–7 June 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 422–431.
- Dada, M.A.; Obaigbena, A.; Majemite, M.T.; Oliha, J.S.; Biu, P.W. Innovative approaches to waste resource management: Implications for environmental sustainability and policy. Eng. Sci. Technol. J. 2024, 5, 115–127.
- Smith, Y.R.; Nagel, J.R.; Rajamani, R.K. Eddy current separation for recovery of non-ferrous metallic particles: A comprehensive review. Miner. Eng. 2019, 133, 149–159.
- Zurbrugg, C. Urban solid waste management in low-income countries of Asia how to cope with the garbage crisis. Present. Sci. Comm. Probl. Environ. (SCOPE) Urban Solid Waste Manag. Rev. Sess. Durban S. Afr. 2002, 6, 1–13.
- Choi, J.; Lim, B.; Yoo, Y. Advancing Plastic Waste Classification and Recycling Efficiency: Integrating Image Sensors and Deep Learning Algorithms. Appl. Sci. 2023, 13, 10224.
- Malik, M.; Sharma, S.; Uddin, M.; Chen, C.L.; Wu, C.M.; Soni, P.; Chaudhary, S. Waste classification for sustainable development using image recognition with deep learning neural network models. Sustainability 2022, 14, 7222.
- Wang, C.; Qin, J.; Qu, C.; Ran, X.; Liu, C.; Chen, B. A smart municipal waste management system based on deep-learning and Internet of Things. Waste Manag. 2021, 135, 20–29.
- Das, S.; Lee, S.H.; Kumar, P.; Kim, K.H.; Lee, S.S.; Bhattacharya, S.S. Solid waste management: Scope and the challenge of sustainability. J. Clean. Prod. 2019, 228, 658–678.
- Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A survey of convolutional neural networks: Analysis, applications, and prospects. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6999–7019.
- Wang, J.; Yang, Y.; Mao, J.; Huang, Z.; Huang, C.; Xu, W. CNN-RNN: A unified framework for multi-label image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2285–2294.
- Chen, Z.M.; Wei, X.S.; Wang, P.; Guo, Y. Multi-label image recognition with graph convolutional networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 5177–5186.
- Van Horn, G.; Mac Aodha, O.; Song, Y.; Cui, Y.; Sun, C.; Shepard, A.; Adam, H.; Perona, P.; Belongie, S. The iNaturalist species classification and detection dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8769–8778.
- Wei, Y.; Xia, W.; Lin, M.; Huang, J.; Ni, B.; Dong, J.; Zhao, Y.; Yan, S. HCP: A flexible CNN framework for multi-label image classification. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 1901–1907.
- Gao, Q.; Long, T.; Zhou, Z. Mineral identification based on natural feature-oriented image processing and multi-label image classification. Expert Syst. Appl. 2024, 238, 122111.
- Wang, X.; Peng, Y.; Lu, L.; Lu, Z.; Bagheri, M.; Summers, R.M. ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2097–2106.
- Arbeláez-Estrada, J.C.; Vallejo, P.; Aguilar, J.; Tabares-Betancur, M.S.; Ríos-Zapata, D.; Ruiz-Arenas, S.; Rendón-Vélez, E. A Systematic Literature Review of Waste Identification in Automatic Separation Systems. Recycling 2023, 8, 86.
- Sinthiya, N.J.; Chowdhury, T.A.; Haque, A.B. Artificial intelligence based Smart Waste Management—A systematic review. In Computational Intelligence Techniques for Green Smart Cities; Springer: Berlin/Heidelberg, Germany, 2022; pp. 67–92.
- Aral, R.A.; Keskin, Ş.R.; Kaya, M.; Hacıömeroğlu, M. Classification of TrashNet dataset based on deep learning models. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018; pp. 2058–2062.
- Proença, P.F.; Simoes, P. TACO: Trash annotations in context for litter detection. arXiv 2020, arXiv:2003.06975.
- Singh, S.; Gautam, J.; Rawat, S.; Gupta, V.; Kumar, G.; Verma, L.P. Evaluation of transfer learning based deep learning architectures for waste classification. In Proceedings of the 2021 4th International Symposium on Advanced Electrical and Communication Technologies (ISAECT), Alkhobar, Saudi Arabia, 6–8 December 2021; pp. 1–7.
- Funch, O.I.; Marhaug, R.; Kohtala, S.; Steinert, M. Detecting glass and metal in consumer trash bags during waste collection using convolutional neural networks. Waste Manag. 2021, 119, 30–38.
- Lu, G.; Wang, Y.; Xu, H.; Yang, H.; Zou, J. Deep multimodal learning for municipal solid waste sorting. Sci. China Technol. Sci. 2022, 65, 324–335.
- Chen, Y.; Sun, J.; Bi, S.; Meng, C.; Guo, F. Multi-objective solid waste classification and identification model based on transfer learning method. J. Mater. Cycles Waste Manag. 2021, 23, 2179–2191.
- Feng, B.; Ren, K.; Tao, Q.; Gao, X. A robust waste detection method based on cascade adversarial spatial dropout detection network. In Proceedings of the Optoelectronic Imaging and Multimedia Technology VII, Online, 11–16 October 2020; SPIE: Bellingham, WA, USA, 2020; Volume 11550, pp. 179–188.
- Cai, H.; Cao, X.; Huang, L.; Zou, L.; Yang, S. Research on Computer Vision-Based Waste Sorting System. In Proceedings of the 2020 5th International Conference on Control, Robotics and Cybernetics (CRC), Wuhan, China, 16–18 October 2020; pp. 117–122.
- Liu, S.; Zhang, L.; Yang, X.; Su, H.; Zhu, J. Query2Label: A simple transformer way to multi-label classification. arXiv 2021, arXiv:2107.10834.
- Ruby, U.; Yendapalli, V. Binary cross entropy with deep learning technique for image classification. Int. J. Adv. Trends Comput. Sci. Eng. 2020, 9, 5393–5397.
- Ridnik, T.; Ben-Baruch, E.; Zamir, N.; Noy, A.; Friedman, I.; Protter, M.; Zelnik-Manor, L. Asymmetric loss for multi-label classification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 82–91.
- Sukel, M.; Rudinac, S.; Worring, M. GIGO, Garbage In, Garbage Out: An Urban Garbage Classification Dataset. In Proceedings of the International Conference on Multimedia Modeling, Bergen, Norway, 9–12 January 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 527–538.
- Zou, F.; Shen, L.; Jie, Z.; Zhang, W.; Liu, W. A sufficient condition for convergences of Adam and RMSProp. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 11127–11135.
- Li, Y.; Chen, Y.; Dai, X.; Chen, D.; Liu, M.; Yuan, L.; Liu, Z.; Zhang, L.; Vasconcelos, N. MicroNet: Improving image recognition with extremely low FLOPs. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 468–477.
- Tripathi, M. Analysis of convolutional neural network based image classification techniques. J. Innov. Image Process. (JIIP) 2021, 3, 100–117.
- Wu, Z.; Shen, C.; Van Den Hengel, A. Wider or deeper: Revisiting the ResNet model for visual recognition. Pattern Recognit. 2019, 90, 119–133.
- Koonce, B.; Koonce, B. MobileNetV3. In Convolutional Neural Networks with Swift for Tensorflow: Image Recognition and Dataset Categorization; Springer: Berlin/Heidelberg, Germany, 2021; pp. 125–144.
- Li, X.; Wang, W.; Wu, L.; Chen, S.; Hu, X.; Li, J.; Tang, J.; Yang, J. Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection. Adv. Neural Inf. Process. Syst. 2020, 33, 21002–21012.
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 7464–7475.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
Class Name | Number of Images |
---|---|
Not Garbage | 15,647 |
Garbage | 9351 |
Garbage Bag | 1957 |
Cardboard | 4391 |
Bulky Waste | 5055 |
Litter | 4863 |
Number of Garbage Types per Image | Number of Images |
---|---|
0 | 15,647 |
1 | 4676 |
2 | 2802 |
3 | 1496 |
4 | 378 |
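
The label distribution in the table above can be summarized with a few lines of arithmetic. The sketch below uses only the counts from that table; the variable names are illustrative. It derives the total number of images, the number of images with at least one garbage label, and the average number of garbage labels per image. The count of images with at least one label comes out as 9352, one more than the 9351 listed for the Garbage class in the class table, a small discrepancy present in the reported figures.

```python
# Image counts keyed by the number of garbage-type labels, copied from the table above.
images_by_label_count = {0: 15647, 1: 4676, 2: 2802, 3: 1496, 4: 378}

total_images = sum(images_by_label_count.values())
images_with_garbage = total_images - images_by_label_count[0]
total_labels = sum(k * n for k, n in images_by_label_count.items())

# Average number of garbage-type labels per image (label cardinality).
label_cardinality = total_labels / total_images

# Share of garbage images that carry two or more labels.
multi_label_images = sum(n for k, n in images_by_label_count.items() if k >= 2)
multi_label_share = multi_label_images / images_with_garbage

print(total_images)                 # 24999
print(images_with_garbage)          # 9352
print(total_labels)                 # 16280
print(round(label_cardinality, 3))  # 0.651
print(multi_label_share)            # 0.5
```

Half of the images that contain garbage carry more than one label, which is the case a single-label classifier cannot represent.
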
Parameter | Value |
---|---|
Batch size | 24 |
Optimizer | RMSprop with Momentum [34] |
Initial learning rate | |
Decay | 0.95 |
Decay steps | 10,000 |
Momentum | 0.9 |
Final learning rate | |
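
One plausible reading of the Decay and Decay steps rows is an exponential learning-rate schedule that multiplies the rate by 0.95 every 10,000 steps. The sketch below shows that reading only; it is an assumption, since the table does not state whether the decay is applied in discrete steps or continuously, and the initial learning rate is missing from the table. The value `1e-4` used in the example is a placeholder, not the paper's setting, and `decayed_lr` is a hypothetical helper name.

```python
def decayed_lr(step, initial_lr, decay=0.95, decay_steps=10_000, staircase=True):
    """Exponential learning-rate decay.

    staircase=True  -> rate drops by `decay` once every `decay_steps` steps.
    staircase=False -> rate decays smoothly at the same average pace.
    """
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial_lr * decay ** exponent

# Placeholder initial rate (assumption; the source table leaves this cell empty).
initial_lr = 1e-4
for step in (0, 9_999, 10_000, 50_000):
    print(step, decayed_lr(step, initial_lr))
```

Under the staircase reading the rate is unchanged for the first 9,999 steps and drops to 95% of its initial value at step 10,000.
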
Backbone Network | Number of Parameters | FLOPs | mAP (%) |
---|---|---|---|
ResNet-101 | 44.5 M | 45.8 G | 87.61 |
MobileNetV3 | 5.4 M | 12.3 G | 82.54 |
ViT-B/16 | 86.0 M | 35.8 G | 92.36 |
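
The mAP column above is a mean of per-class average precision (AP) values. As a minimal sketch, the code below computes the non-interpolated AP for one class (the mean of the precision values at each rank where a positive image appears) and averages it over classes. The exact AP variant used in the experiments is not stated here, so this definition is an assumption, and the function names and toy scores are illustrative.

```python
def average_precision(scores, labels):
    """Non-interpolated AP for one class: mean precision at each positive's rank."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(score_matrix, label_matrix):
    """Rows are images, columns are classes; returns the mean AP over classes."""
    n_classes = len(score_matrix[0])
    aps = [
        average_precision(
            [row[c] for row in score_matrix],
            [row[c] for row in label_matrix],
        )
        for c in range(n_classes)
    ]
    return sum(aps) / n_classes

# Toy example: four images, two classes (hypothetical scores and labels).
scores = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.1]]
labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
print(mean_average_precision(scores, labels))
```

In the toy example the first class has positives at ranks 1 and 3, giving AP = (1/1 + 2/3) / 2, while the second class ranks both positives first and gets AP = 1.
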
Loss Function | Parameter Setting | mAP (%) |
---|---|---|
Binary Cross-Entropy | — | 89.97 |
Focal Loss [39] | , | 91.08 |
Asymmetric Loss | , | 92.36 |
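
For readers comparing the three losses above, the following sketch implements the asymmetric loss in the form given by Ridnik et al.: positives are weighted by (1 - p)^γ+, while negatives use a shifted probability p_m = max(p - m, 0) and are weighted by p_m^γ-. The parameter values in the table were lost, so the defaults below (`gamma_pos=0`, `gamma_neg=4`, `margin=0.05`) are illustrative assumptions rather than the settings used in the experiments.

```python
import math

def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0, margin=0.05, eps=1e-8):
    """Sum of per-label asymmetric losses for one image.

    probs   -- predicted probabilities, one per label
    targets -- 0/1 ground-truth values, one per label
    """
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive term: focal-style down-weighting of easy positives.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative term: shift the probability, then down-weight easy negatives.
            p_m = max(p - margin, 0.0)
            total += -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, eps))
    return total

# With both focusing parameters and the margin set to zero, the loss is plain BCE.
bce = asymmetric_loss([0.9, 0.2], [1, 0], gamma_pos=0.0, gamma_neg=0.0, margin=0.0)
print(bce)

# An easy negative below the margin contributes nothing to the loss.
print(asymmetric_loss([0.04], [0]))
```

Setting `gamma_pos` equal to `gamma_neg` with a zero margin gives the focal loss without its class-balancing weight, so the three rows of the table can be viewed as one family with progressively more asymmetry between positives and negatives.
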
Configuration | mAP (%) |
---|---|
Baseline model | 88.62 |
Backbone modification (ViT-B/16) | 89.97 |
Loss function modification (asymmetric loss) | 90.20 |
Combined modification | 92.36 |
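
The ablation figures above can be broken down with simple arithmetic. The sketch below uses only the mAP values from the table (the dictionary keys are illustrative names) and computes each configuration's gain over the baseline, plus how much the combined gain exceeds the sum of the two individual gains.

```python
# mAP (%) values copied from the ablation table above.
ablation_map = {
    "baseline": 88.62,
    "backbone_only": 89.97,
    "loss_only": 90.20,
    "combined": 92.36,
}

base = ablation_map["baseline"]
gains = {
    name: round(value - base, 2)
    for name, value in ablation_map.items()
    if name != "baseline"
}

# Gain expected if the two changes were purely additive, and the excess beyond it.
sum_individual = round(gains["backbone_only"] + gains["loss_only"], 2)
interaction = round(gains["combined"] - sum_individual, 2)

print(gains)           # {'backbone_only': 1.35, 'loss_only': 1.58, 'combined': 3.74}
print(sum_individual)  # 2.93
print(interaction)     # 0.81
```

The combined configuration gains 3.74 points, 0.81 more than the 2.93 points obtained by adding the two single-change gains, which suggests the two modifications reinforce each other on this dataset.
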
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wu, R.; Liu, X.; Zhang, T.; Xia, J.; Li, J.; Zhu, M.; Gu, G. An Efficient Multi-Label Classification-Based Municipal Waste Image Identification. Processes 2024, 12, 1075. https://doi.org/10.3390/pr12061075