LMFRNet: A Lightweight Convolutional Neural Network Model for Image Analysis
Abstract
1. Introduction
- First, we propose a lightweight model that attains an accuracy of 94.6% on the CIFAR-10 dataset with a mere 0.52 million parameters. Notably, at comparable parameter budgets, the model surpasses existing state-of-the-art (SOTA) models, striking a strong balance between performance and parameter efficiency.
- Second, we conducted an extensive set of experiments to tune common hyperparameters, including the number of training epochs, optimizer choice, data augmentation strategy, and learning rate. By comparing performance across hyperparameter settings, we provide experimental evidence that helps practitioners understand and apply these critical parameters for more effective model training.
2. Method
2.1. Basic CNN Components
2.2. The Proposed Model
3. Dataset
4. Experiments
4.1. Experimental Environment
4.2. Experimental Results
4.2.1. The Influence of Epochs
- During roughly the first 25 epochs, the loss decreases rapidly and the accuracy improves markedly.
- From about epoch 25 to epoch 175, the loss continues to fall and the accuracy rises steadily.
- From about epoch 175 to epoch 200, both the loss and the accuracy gradually converge. (A minimal training-loop sketch for reproducing such curves follows this list.)
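The paper does not publish its training script, so the following is a minimal PyTorch-style sketch of tracking per-epoch loss and accuracy over a 200-epoch run like the one described above. The stand-in model (torchvision's ResNet-18), batch size, and SGD settings are illustrative assumptions, not the paper's configuration.

```python
# Hedged sketch: per-epoch loss/accuracy tracking on CIFAR-10 (assumed setup).
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

device = "cuda" if torch.cuda.is_available() else "cpu"

train_set = torchvision.datasets.CIFAR10(
    root="data", train=True, download=True, transform=T.ToTensor())
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

model = torchvision.models.resnet18(num_classes=10).to(device)  # stand-in; LMFRNet is not public
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)

history = []  # (epoch, mean training loss, training accuracy)
for epoch in range(200):  # 200 epochs, matching the convergence window described above
    model.train()
    loss_sum, correct, total = 0.0, 0, 0
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        logits = model(x)
        loss = criterion(logits, y)
        loss.backward()
        optimizer.step()
        loss_sum += loss.item() * y.size(0)
        correct += (logits.argmax(dim=1) == y).sum().item()
        total += y.size(0)
    history.append((epoch, loss_sum / total, correct / total))
```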
4.2.2. The Influence of Learning Rate
- When the learning rate is set to a higher value (e.g., 0.1 in Figure 4), the model learns noticeably faster and accuracy rises quickly. However, an excessively high learning rate can cause training to oscillate or fail to converge, and the model may overshoot the global optimum.
- Conversely, when the learning rate is set to a lower value (e.g., 0.0001 in Figure 4), learning slows down and training becomes more stable, but convergence can be significantly delayed. A learning rate that is too small may also leave the model stuck in a local optimum. (A toy numerical illustration follows this list.)
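As a toy illustration of this trade-off (the quadratic objective is an assumption for exposition; the paper's experiments use the full network), plain gradient descent on f(w) = (w − 3)² behaves very differently under the two learning rates quoted above:

```python
# Gradient descent on f(w) = (w - 3)^2 with the two learning rates from Figure 4.
def descend(lr, steps=100, w=0.0):
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # derivative of (w - 3)^2
        w -= lr * grad
    return w

print(descend(0.1))     # ~3.0: converges quickly toward the minimum at w = 3
print(descend(0.0001))  # ~0.06: barely moves in 100 steps; stable but very slow
```

With lr = 0.1 the iterate contracts toward w = 3 by a factor of 0.8 per step, while lr = 0.0001 contracts by only 0.9998 per step, mirroring the fast-but-risky versus stable-but-slow behavior described above.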
4.2.3. The Influence of Optimizer
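This section compares optimizers; SGD [33] and Adam [34] are the candidates cited in the references. Below is a minimal PyTorch sketch of swapping optimizers under otherwise identical settings (the hyperparameter values are common defaults, not the paper's reported configuration):

```python
# Hedged sketch: constructing the two optimizers compared in this section.
import torch

model = torch.nn.Linear(32, 10)  # stand-in model

optimizers = {
    "SGD":  torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4),
    "Adam": torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999)),
}
# The same training loop can then be run once per optimizer to compare curves.
```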
4.2.4. The Influence of Data Augmentation
- Baseline: Original data without any augmentation.
- Norm: Pixel values in training images are normalized to a standard distribution.
- HFlip: Randomly flip images horizontally.
- Crop: Randomly crop images after padding.
- Cutout: Randomly mask out a rectangular region in input images.
- All: Combines all of the above strategies (a minimal transform sketch follows this list).
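Assuming the pipeline is implemented with torchvision (the paper does not publish its code), the listed strategies map roughly onto standard transforms as follows; the normalization statistics and the use of RandomErasing as a stand-in for Cutout [36] are assumptions, not values reported by the paper:

```python
# Hedged sketch of the "All" augmentation pipeline described above.
import torchvision.transforms as T

train_transform = T.Compose([
    T.RandomCrop(32, padding=4),              # Crop: pad, then randomly crop back to 32x32
    T.RandomHorizontalFlip(),                 # HFlip: random horizontal flip
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465),     # Norm: common CIFAR-10 channel means
                (0.2470, 0.2435, 0.2616)),    #       and standard deviations
    T.RandomErasing(value=0),                 # stand-in for Cutout: masks a random rectangle
])
```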
4.3. Comparison between the MF Block and the Plain Block
4.4. Comparative Analysis with Other Models
4.4.1. Performance on the CIFAR-10 Dataset
4.4.2. Performance on the CIFAR-100 Dataset
4.4.3. Performance on the MNIST Dataset
4.4.4. Performance on the Fashion-MNIST Dataset
5. Conclusions
In future work, we plan to explore the following directions:
- Extending our versatile classification model to a wider range of resource-constrained scenarios, such as Internet of Things (IoT) applications and embedded systems with limited computational resources.
- Investigating the use of the model as a backbone network for real-time object detection and other vision tasks, such as in autonomous vehicles, real-time crowd counting, and augmented reality environments.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Rawat, W.; Wang, Z. Deep Convolutional Neural Networks for Image Classification: A Comprehensive Review. Neural Comput. 2017, 29, 2352–2449.
2. Dhillon, A.; Verma, G.K. Convolutional Neural Network: A Review of Models, Methodologies and Applications to Object Detection. Prog. Artif. Intell. 2020, 9, 85–112.
3. Wang, Y.; Tian, Y. Exploring Zero-Shot Semantic Segmentation with No Supervision Leakage. Electronics 2023, 12, 3452.
4. Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6999–7019.
5. Savelli, B.; Bria, A.; Molinara, M.; Marrocco, C.; Tortorella, F. A Multi-Context CNN Ensemble for Small Lesion Detection. Artif. Intell. Med. 2020, 103, 101749.
6. Frid-Adar, M.; Diamant, I.; Klang, E.; Amitai, M.; Goldberger, J.; Greenspan, H. Modeling the Intra-class Variability for Liver Lesion Detection Using a Multi-class Patch-Based CNN. In Patch-Based Techniques in Medical Imaging; Wu, G., Munsell, B.C., Zhan, Y., Bai, W., Sanroma, G., Coupé, P., Eds.; Springer International Publishing: Cham, Switzerland, 2017; Volume 10530, pp. 129–137.
7. Bojarski, M.; Choromanska, A.; Choromanski, K.; Firner, B.; Ackel, L.J.; Muller, U.; Yeres, P.; Zieba, K. VisualBackProp: Efficient Visualization of CNNs for Autonomous Driving. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia, 21–25 May 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 4701–4708.
8. Coşkun, M.; Uçar, A.; Yildirim, Ö.; Demir, Y. Face Recognition Based on Convolutional Neural Network. In Proceedings of the 2017 International Conference on Modern Electrical and Energy Systems (MEES), Kremenchuk, Ukraine, 15–17 November 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 376–379.
9. Krizhevsky, A.; Sutskever, I.; Hinton, G. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; Volume 2, pp. 1097–1105.
10. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2015, arXiv:1409.1556.
11. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9.
12. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
13. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2261–2269.
14. Bhuiyan, M.A.B.; Abdullah, H.M.; Arman, S.E.; Rahman, S.S.; Al Mahmud, K. BananaSqueezeNet: A Very Fast, Lightweight Convolutional Neural Network for the Diagnosis of Three Prominent Banana Leaf Diseases. Smart Agric. Technol. 2023, 4, 100214.
15. Gu, M.; Zhang, Y.; Wen, Y.; Ai, G.; Zhang, H.; Wang, P.; Wang, G. A Lightweight Convolutional Neural Network Hardware Implementation for Wearable Heart Rate Anomaly Detection. Comput. Biol. Med. 2023, 155, 106623.
16. Ma, X.; Li, Y.; Wan, L.; Xu, Z.; Song, J.; Huang, J. Classification of Seed Corn Ears Based on Custom Lightweight Convolutional Neural Network and Improved Training Strategies. Eng. Appl. Artif. Intell. 2023, 120, 105936.
17. Zhang, D.; Hao, X.; Wang, D.; Qin, C.; Zhao, B.; Liang, L.; Liu, W. An Efficient Lightweight Convolutional Neural Network for Industrial Surface Defect Detection. Artif. Intell. Rev. 2023, 56, 10651–10677.
18. Iandola, F.N.; Han, S.; Moskewicz, M.W.; Ashraf, K.; Dally, W.J.; Keutzer, K. SqueezeNet: AlexNet-level Accuracy with 50x Fewer Parameters and <0.5 MB Model Size. arXiv 2016, arXiv:1602.07360.
19. Ma, N.; Zhang, X.; Zheng, H.T.; Sun, J. ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design. arXiv 2018, arXiv:1807.11164.
20. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. arXiv 2019, arXiv:1801.04381.
21. Tan, M.; Le, Q.V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv 2020, arXiv:1905.11946.
22. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
23. Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv 2015, arXiv:1502.03167.
24. Xu, B.; Wang, N.; Chen, T.; Li, M. Empirical Evaluation of Rectified Activations in Convolutional Network. arXiv 2015, arXiv:1505.00853.
25. He, K.; Zhang, X.; Ren, S.; Sun, J. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. arXiv 2015, arXiv:1502.01852.
26. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
27. Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. arXiv 2016, arXiv:1602.07261.
28. Wang, R.J.; Li, X.; Ling, C.X. Pelee: A Real-Time Object Detection System on Mobile Devices. arXiv 2019, arXiv:1804.06882.
29. Ren, F.; Liu, W.; Wu, G. Feature Reuse Residual Networks for Insect Pest Recognition. IEEE Access 2019, 7, 122758–122768.
30. Krizhevsky, A.; Hinton, G. Learning Multiple Layers of Features from Tiny Images; Technical Report; University of Toronto: Toronto, ON, Canada, 2009. Available online: https://www.cs.toronto.edu/~kriz/cifar.html (accessed on 24 December 2023).
31. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-Based Learning Applied to Document Recognition. Proc. IEEE 1998, 86, 2278–2324.
32. Xiao, H.; Rasul, K.; Vollgraf, R. Fashion-MNIST: A Novel Image Dataset for Benchmarking Machine Learning Algorithms. arXiv 2017, arXiv:1708.07747.
33. Robbins, H.; Monro, S. A Stochastic Approximation Method. Ann. Math. Stat. 1951, 22, 400–407.
34. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2017, arXiv:1412.6980.
35. Choi, H.; Park, J.; Yang, Y.M. A Novel Quick-Response Eigenface Analysis Scheme for Brain–Computer Interfaces. Sensors 2022, 22, 5860.
36. DeVries, T.; Taylor, G.W. Improved Regularization of Convolutional Neural Networks with Cutout. arXiv 2017, arXiv:1708.04552.
37. Zhang, X.; Zhou, X.; Lin, M.; Sun, J. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6848–6856.
38. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017, arXiv:1704.04861.
39. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. arXiv 2021, arXiv:2010.11929.
40. Nocentini, O.; Kim, J.; Bashir, M.Z.; Cavallo, F. Image Classification Using Multiple Convolutional Neural Networks on the Fashion-MNIST Dataset. Sensors 2022, 22, 9544.
Overall architecture of the proposed model (input 3 × 32 × 32; output sizes are channels × height × width):

| Stages | Layers | Patch Size | Stride | Output Size |
| --- | --- | --- | --- | --- |
| Stage 1 | Convolution layer | 3 × 3 | 1 | 32 × 32 × 32 |
| Stage 2 | MF Block | - | - | 104 × 32 × 32 |
| | Convolution layer | 1 × 1 | 1 | 104 × 32 × 32 |
| | Average pooling layer | 2 × 2 | 2 | 104 × 16 × 16 |
| Stage 3 | MF Block | - | - | 200 × 16 × 16 |
| | Convolution layer | 1 × 1 | 1 | 200 × 16 × 16 |
| | Average pooling layer | 2 × 2 | 2 | 200 × 8 × 8 |
| Stage 4 | MF Block | - | - | 392 × 8 × 8 |
| | Convolution layer | 1 × 1 | 1 | 392 × 8 × 8 |
| | Average pooling layer | 2 × 2 | 2 | 392 × 4 × 4 |
| Stage 5 | MF Block | - | - | 464 × 4 × 4 |
| | Convolution layer | 1 × 1 | 1 | 464 × 4 × 4 |
| Stage 6 | Global average pooling | - | - | 464 |
| | Fully connected layer | - | - | 10 |
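To make the stage layout above concrete, here is a hedged PyTorch skeleton that reproduces only the channel and spatial dimensions listed in the table; MFBlockStub is a hypothetical placeholder, since the MF block's feature-reuse internals are not reproduced here:

```python
# Skeleton of the stage layout in the table above (assumed: PyTorch).
import torch
import torch.nn as nn

class MFBlockStub(nn.Module):
    """Hypothetical placeholder for the paper's MF block (internals omitted)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)

class LMFRNetSkeleton(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        def stage(in_ch, out_ch, pool=True):
            layers = [MFBlockStub(in_ch, out_ch),
                      nn.Conv2d(out_ch, out_ch, 1, bias=False)]  # 1x1 transition conv
            if pool:
                layers.append(nn.AvgPool2d(2, stride=2))         # halve spatial size
            return nn.Sequential(*layers)

        self.stem = nn.Conv2d(3, 32, 3, stride=1, padding=1)     # Stage 1: 32 x 32 x 32
        self.stage2 = stage(32, 104)                             # -> 104 x 16 x 16
        self.stage3 = stage(104, 200)                            # -> 200 x 8 x 8
        self.stage4 = stage(200, 392)                            # -> 392 x 4 x 4
        self.stage5 = stage(392, 464, pool=False)                # -> 464 x 4 x 4
        self.head = nn.Linear(464, num_classes)                  # Stage 6: GAP + FC

    def forward(self, x):
        x = self.stage5(self.stage4(self.stage3(self.stage2(self.stem(x)))))
        x = x.mean(dim=(2, 3))                                   # global average pooling
        return self.head(x)

print(LMFRNetSkeleton()(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 10])
```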
Composition of the CIFAR-10 dataset (10 classes; 5000 training and 1000 test images per class):

Category | Training Set | Test Set | Total |
---|---|---|---|
Airplanes | 5000 | 1000 | 6000 |
Automobiles | 5000 | 1000 | 6000 |
Birds | 5000 | 1000 | 6000 |
Cats | 5000 | 1000 | 6000 |
Deer | 5000 | 1000 | 6000 |
Dogs | 5000 | 1000 | 6000 |
Frogs | 5000 | 1000 | 6000 |
Horses | 5000 | 1000 | 6000 |
Ships | 5000 | 1000 | 6000 |
Trucks | 5000 | 1000 | 6000 |
Comparison with other models on the CIFAR-10 dataset (Section 4.4.1):

No. | Model | Accuracy (Top-1) | Parameters |
---|---|---|---|
1 | Ours | 94.60% | 0.52 M |
2 | SqueezeNet [18] | 92.83% | 0.73 M |
3 | ShuffleNet [37] | 92.31% | 0.93 M |
4 | ShuffleNetV2 [19] | 92.86% | 1.26 M |
5 | MobileNetV2 [20] | 94.19% | 2.30 M |
6 | MobileNet [38] | 92.76% | 3.22 M |
7 | GoogLeNet [11] | 95.02% | 6.17 M |
8 | DenseNet121 [13] | 95.04% | 6.96 M |
9 | ResNet50 [12] | 93.62% | 23.52 M |
10 | VGG13 [10] | 94.06% | 28.33 M |
Comparison with other models on the CIFAR-100 dataset (Section 4.4.2):

No. | Model | Accuracy (Top-1) | Parameters |
---|---|---|---|
1 | Ours | 75.82% | 0.56 M |
2 | SqueezeNet [18] | 69.41% | 0.78 M |
3 | ShuffleNet [37] | 70.06% | 1.00 M |
4 | ShuffleNetV2 [19] | 69.51% | 1.30 M |
5 | MobileNetV2 [20] | 68.08% | 2.36 M |
6 | MobileNet [38] | 65.98% | 3.30 M |
7 | GoogLeNet [11] | 78.03% | 6.20 M |
8 | DenseNet121 [13] | 77.01% | 7.00 M |
9 | ResNet50 [12] | 77.39% | 23.70 M |
10 | VGG13 [10] | 72.00% | 28.70 M |