Comparison of Different Methods for Building Ensembles of Convolutional Neural Networks
Abstract
1. Introduction
- An in-depth comparison and evaluation of three methods for building CNN ensembles, both standalone and in combination, verified across several different datasets;
- The introduction of several new methods for perturbing network weights;
- Free access to all resources, including the MATLAB source code, used in our experiments.
2. Related Work
3. Methods
3.1. Activation Functions
3.2. Ensemble through Stochastic Approach
3.3. Data Augmentation
- APP1: This augmentation generates three new images from a given image. It reflects the image vertically and horizontally, producing two new images. The third is obtained by linearly scaling the original image along both axes by two factors drawn independently from a uniform distribution on [1, 2] (a MATLAB sketch of this approach is given after this list).
- APP2: Building upon APP1, this augmentation generates six new images. It includes the transformations of APP1 and adds three more: rotation by a random angle drawn from the range of −10 to 10 degrees; translation by shifts along both axes randomly sampled from the interval of 0 to 5 pixels; and shearing, with vertical and horizontal angles randomly selected from the range of 0 to 30 degrees.
- APP3: This augmentation replicates APP2 but excludes the shear and the scale transformations, resulting in four new images.
- APP4: This augmentation approach generates three new images by applying a transform based on Principal Component Analysis (PCA). The PCA coefficients extracted from a given image are subjected to three perturbations that generate three new images. For the first image, each element of the feature vector has a 50% probability of being randomly set to zero. For the second, noise is added to each component based on the standard deviation of the projected image. For the third, five images from the same class as the original image are selected, and their PCA vectors are computed. With a 5% probability, components from the original PCA vector are swapped with corresponding components from the other five PCA vectors. The three perturbed PCA vectors are then transformed back using the inverse PCA transform to produce the augmented images.
- APP5: Similar to APP4, this augmentation generates three new images using the perturbation method described above. However, instead of using PCA, the Discrete Cosine Transform (DCT) is applied. It should be noted that the DC coefficient is never changed during this transformation. The basic idea of DCT- and PCA-based approaches is similar: both methods allow us to project the image into a subspace and then return to the original space; by inserting noise into the backprojection, we can create new images. An example of the outcome of the third DCT-based approach is depicted in Figure 5.
- APP6: This augmentation is designed specifically for color images. It creates three new images by color shifting and by altering contrast and sharpness. Contrast is altered by linearly scaling the original image’s intensities between the lowest value (a) and the highest value (b) allowed for the augmented image; any pixel outside this range is mapped to 0 if it is lower than a or to 255 if it is greater than b. Sharpness is modified by blurring the original image with a Gaussian filter (variance = 1) and subtracting the blurred image from the original. Color shifting is performed by sampling an integer shift for each of the three RGB channels and adding it to the corresponding channel of the original image.
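To make the augmentation pipeline concrete, below is a minimal MATLAB sketch of APP1 and of the noise-based DCT perturbation used in APP5, reconstructed from the descriptions above. The function names (app1_sketch, app5_noise_sketch) and the exact noise scale are our illustrative choices, not the authors’ released code; imresize requires the Image Processing Toolbox.

```matlab
% Illustrative sketches only: reconstructed from the textual descriptions
% of APP1 and APP5; function names and noise scale are our assumptions.

function imgs = app1_sketch(I)
    % APP1: three new images from one input image I
    imgs{1} = flipud(I);                          % reflection along the vertical axis
    imgs{2} = fliplr(I);                          % reflection along the horizontal axis
    s = 1 + rand(1, 2);                           % scale factors drawn uniformly from [1,2]
    imgs{3} = imresize(I, round([size(I,1)*s(1), size(I,2)*s(2)]));
end

function J = app5_noise_sketch(I)
    % APP5 (noise-based perturbation): inject noise in DCT space, keep the DC term
    C  = dct2(im2double(I));                      % forward DCT (grayscale image assumed)
    Cn = C + (rand(size(C)) - 0.5) .* std(C(:));  % small zero-mean uniform noise (scale is ours)
    Cn(1,1) = C(1,1);                             % the DC coefficient is never changed
    J = idct2(Cn);                                % back-projection to image space
end
```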
3.4. Parameter Ensembling via Perturbation (PEP)
- Dout: similar to dropout: 2% of the weights are zeroed out. This approach is described in Listing 1;
- DCTa: each set of weights is projected onto the Discrete Cosine Transform (DCT) space, 3.33% of the DCT coefficients are randomly chosen and set to zero (the DC component is never zeroed out), and the inverse DCT is then applied. This approach is described in Listing 2;
- DCTb: each set of weights is projected onto a Discrete Cosine Transform (DCT) space where a small amount of random noise is injected (the DC component is never perturbed), after which the inverse DCT is applied. This approach is described in Listing 3;
- PEPa: a method similar to the original PEP, in which a small amount of random noise is added to the weights. This approach is described in Listing 4;
- PEPb: the same idea as PEPa, but the noise is injected multiplicatively rather than additively. This approach is described in Listing 5. A sketch of how these perturbations can be turned into an ensemble follows the listings.
Listing 1. Dout: a perturbation similar to the dropout filter.

```matlab
Perturbation = rand(size(Weights)); % Weights are the weights of the given net;
                                    % Perturbation is a tensor of the same size as the
                                    % set of weights of the net, randomly initialized in [0,1]
Perturbation = Perturbation < 0.98; % 2% of the values are set to zero
Weights = Weights .* Perturbation;  % some weights are zeroed out
```
Listing 2. DCTa: DCT-based perturbation approach.

```matlab
for each layer
    for each channel
        IMG = Weights(layer, channel);   % weights of a given channel-layer are stored
        dctProj = dct2(IMG);             % DCT projection
        dctProj_reset = dctProj;
        % reset some random DCT coefficients
        dctProj_reset("random indexes") = 0;
        % the DC component is never zeroed out
        dctProj_reset(1,1) = dctProj(1,1);
        Weights(layer, channel) = idct2(dctProj_reset);  % back-projection
    end
end
```
Listing 3. DCTb: DCT-based perturbation approach.

```matlab
for each layer
    for each channel
        IMG = Weights(layer, channel);   % weights of a given channel-layer are stored
        dctProj = dct2(IMG);             % DCT projection
        noise = std(dctProj)/4;          % noise scale from the standard deviation of the projection
        dctProjNew = dctProj + (rand-0.5) .* noise;  % rand is random between 0 and 1
        dctProjNew(1,1) = dctProj(1,1);  % the DC component is never perturbed
        Weights(layer, channel) = idct2(dctProjNew);  % back-projection
    end
end
```
Listing 4. PEPa: additive noise perturbation.

```matlab
sigma = 0.002;
Weights = Weights + rand(size(Weights)) .* sigma; % Weights are the weights of the given net
```
Listing 5. PEPb: multiplicative noise perturbation.

```matlab
sigma = 0.2;
Weights = Weights .* (1 + rand(size(Weights)) .* sigma); % Weights are the weights of the given net
```
3.5. Wilcoxon Signed-Rank Test
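In MATLAB, the paired comparison underlying this test can be run with signrank from the Statistics and Machine Learning Toolbox. A minimal example, using the RE(18)-DA and SE(18)-DA accuracies on the first five datasets from the tables below:

```matlab
% Per-dataset accuracies of two methods (paired samples: HE, MA, BG, LAR, POR)
accA = [96.33 98.33 94.33 95.61 88.13];   % RE(18)-DA
accB = [96.51 98.33 95.00 96.06 88.56];   % SE(18)-DA
p = signrank(accA, accB);                 % Wilcoxon signed-rank test p-value
```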
4. Experimental Results
4.1. Datasets
- HE (2D HeLa dataset [69]): This contains fluorescence microscopy images of HeLa cells stained with different fluorescent dyes specific to various organelles. The dataset is well balanced and divided into ten classes representing different organelles, including DNA (Nuclei), ER (Endoplasmic reticulum), Giantin (cis/medial Golgi), GPP130 (cis Golgi), Lamp2 (Lysosomes), Nucleolin (Nucleoli), Actin, TfR (Endosomes), Mitochondria, and Tubulin.
- MA (C. elegans Muscle Age dataset [70]): This dataset focuses on classifying the age of C. elegans nematodes. It has 257 images of C. elegans muscles collected at four different ages, representing distinct classes based on age. A 5-fold cross-validation is applied.
- BG (Breast Grading Carcinoma [71]): This dataset, obtained from Zenodo (record 834910), contains 300 annotated histological images of breast tissue from patients diagnosed with invasive ductal carcinoma. The images are categorized into three classes representing different grades (1–3) of carcinoma. A 5-fold cross-validation is applied.
- LAR (Laryngeal dataset [72]): Obtained from Zenodo (record 1003200), this dataset contains 1320 images of laryngeal tissue. It includes both healthy and early-stage cancerous tissue, for a total of four tissue classes. The dataset is split into three folds by the original authors.
- POR (Portrait dataset [73]): This dataset focuses on portrait images of humans. It is designed to evaluate segmentation performance in the context of portrait photography, considering factors such as facial features, skin tones, and background elements. It includes 1447 images for training and 289 images for validation. POR can be accessed at https://github.com/HYOJINPARK/ExtPortraitSeg (accessed on 23 October 2023).
- PEST [74]: A dataset of 563 images of pests commonly found on plants, divided into ten classes. We use the training-test split suggested by the original authors.
- InfLAR [75]: A dataset of 720 images, divided into four classes, extracted from laryngoscopic videos. We use the training-test splits (three different folds) suggested by the original authors.
- TRIZ [76]: A dataset of 574 images of gastric lesions, divided into four classes; as suggested by the original authors, we apply a 10-fold cross-validation.
4.2. Results
- noDA: no data augmentation has been applied during training;
- DA: APP3 data augmentation has been applied during training;
- RE(x): a combination by sum rule of x standard ResNet50 networks, where each network is simply re-trained on the training set (sum-rule fusion is illustrated after this list);
- SE(x): a combination by sum rule of x networks coupled with the stochastic approach for replacing the activation function layers;
- StocDA_PEP(18): for each DA method, we train three networks, for a total of eighteen; each network is then coupled with one of the five PEP variants (randomly chosen);
- StocDA(18): for each DA method, we train three SE networks, for a total of eighteen;
- A different topology, MobileNetV2 [77], is used to assess the performance of the tested ensembles;
- To assess bagging for building the ensemble, we coupled bagging with RE(18)-DA, naming the result Bag_RE(18)-DA.
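As a reference for the naming above, the sum rule simply adds the class scores produced by the individual networks and takes the argmax. A self-contained MATLAB illustration with toy scores (our example values, not results from the paper):

```matlab
% scores{i} holds the softmax outputs of the i-th network:
% one row per test sample, one column per class (toy values).
scores = {[0.7 0.2 0.1; 0.1 0.6 0.3], ...   % network 1
          [0.6 0.3 0.1; 0.2 0.5 0.3]};      % network 2
fused = zeros(size(scores{1}));
for i = 1:numel(scores)
    fused = fused + scores{i};              % sum rule
end
[~, yPred] = max(fused, [], 2);             % predicted class per sample
```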
- RE(18)-DA outperforms RE(1)-DA in all the datasets;
- SE(18)-DA outperforms RE(18)-DA in all the datasets but PEST;
- StocDA(18) outperforms SE(18)-DA in all the datasets but HE;
- StocDA_PEP(18) outperforms StocDA(18) in all the datasets but PEST when accuracy is used as the performance indicator;
- Bagging does not lead to improvement; the performance of RE(18)-DA and Bag_RE(18)-DA is similar.
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Wei, W.; Khan, A.; Huerta, E.; Huang, X.; Tian, M. Deep learning ensemble for real-time gravitational wave detection of spinning binary black hole mergers. Phys. Lett. B 2021, 812, 136029.
2. Nanni, L.; Brahnam, S.; Lumini, A.; Loreggia, A. Coupling RetinaFace and Depth Information to Filter False Positives. Appl. Sci. 2023, 13, 2987.
3. Shehab, M.; Abualigah, L.; Shambour, Q.; Abu-Hashem, M.A.; Shambour, M.K.Y.; Alsalibi, A.I.; Gandomi, A.H. Machine learning in medical applications: A review of state-of-the-art methods. Comput. Biol. Med. 2022, 145, 105458.
4. Dutta, P.; Sathi, K.A.; Hossain, M.A.; Dewan, M.A.A. Conv-ViT: A Convolution and Vision Transformer-Based Hybrid Feature Extraction Method for Retinal Disease Detection. J. Imaging 2023, 9, 140.
5. Wu, Z.; Tang, Y.; Hong, B.; Liang, B.; Liu, Y. Enhanced Precision in Dam Crack Width Measurement: Leveraging Advanced Lightweight Network Identification for Pixel-Level Accuracy. Int. J. Intell. Syst. 2023, 2023, 9940881.
6. Deng, G.; Huang, T.; Lin, B.; Liu, H.; Yang, R.; Jing, W. Automatic meter reading from UAV inspection photos in the substation by combining YOLOv5s and DeepLabv3+. Sensors 2022, 22, 7090.
7. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25.
8. Haggenmüller, S.; Maron, R.C.; Hekler, A.; Utikal, J.S.; Barata, C.; Barnhill, R.L.; Beltraminelli, H.; Berking, C.; Betz-Stablein, B.; Blum, A.; et al. Skin cancer classification via convolutional neural networks: Systematic review of studies involving human experts. Eur. J. Cancer 2021, 156, 202–216.
9. Esteva, A.; Kuprel, B.; Novoa, R.A.; Ko, J.; Swetter, S.M.; Blau, H.M.; Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017, 542, 115–118.
10. Horie, Y.; Yoshio, T.; Aoyama, K.; Yoshimizu, S.; Horiuchi, Y.; Ishiyama, A.; Hirasawa, T.; Tsuchida, T.; Ozawa, T.; Ishihara, S.; et al. Diagnostic outcomes of esophageal cancer by artificial intelligence using convolutional neural networks. Gastrointest. Endosc. 2019, 89, 25–32.
11. Qummar, S.; Khan, F.G.; Shah, S.; Khan, A.; Shamshirband, S.; Rehman, Z.U.; Khan, I.A.; Jadoon, W. A deep learning ensemble approach for diabetic retinopathy detection. IEEE Access 2019, 7, 150530–150539.
12. Pan, D.; Zeng, A.; Jia, L.; Huang, Y.; Frizzell, T.; Song, X. Early detection of Alzheimer’s disease using magnetic resonance imaging: A novel approach combining convolutional neural networks and ensemble learning. Front. Neurosci. 2020, 14, 259.
13. Nanni, L.; Loreggia, A.; Lumini, A.; Dorizza, A. A Standardized Approach for Skin Detection: Analysis of the Literature and Case Studies. J. Imaging 2023, 9, 35.
14. Nagaraj, P.; Subhashini, S. A Review on Detection of Lung Cancer Using Ensemble of Classifiers with CNN. In Proceedings of the 2023 2nd International Conference on Edge Computing and Applications (ICECAA), Namakkal, India, 19–21 July 2023; pp. 815–820.
15. Shah, A.; Shah, M.; Pandya, A.; Sushra, R.; Sushra, R.; Mehta, M.; Patel, K.; Patel, K. A Comprehensive Study on Skin Cancer Detection using Artificial Neural Network (ANN) and Convolutional Neural Network (CNN). Clin. eHealth 2023, 6, 76–84.
16. Thanapol, P.; Lavangnananda, K.; Bouvry, P.; Pinel, F.; Leprévost, F. Reducing overfitting and improving generalization in training convolutional neural network (CNN) under limited sample sizes in image recognition. In Proceedings of the 2020 5th International Conference on Information Technology (InCIT), Chonburi, Thailand, 21–22 October 2020; pp. 300–305.
17. Campagner, A.; Ciucci, D.; Svensson, C.M.; Figge, M.T.; Cabitza, F. Ground truthing from multi-rater labeling with three-way decision and possibility theory. Inf. Sci. 2021, 545, 771–790.
18. Panch, T.; Mattie, H.; Celi, L.A. The “inconvenient truth” about AI in healthcare. NPJ Digit. Med. 2019, 2, 77.
19. Bravin, R.; Nanni, L.; Loreggia, A.; Brahnam, S.; Paci, M. Varied Image Data Augmentation Methods for Building Ensemble. IEEE Access 2023, 11, 8810–8823.
20. Claro, M.L.; de M.S. Veras, R.; Santana, A.M.; Vogado, L.H.S.; Junior, G.B.; de Medeiros, F.N.; Tavares, J.M.R. Assessing the Impact of Data Augmentation and a Combination of CNNs on Leukemia Classification. Inf. Sci. 2022, 609, 1010–1029.
21. Nanni, L.; Fantozzi, C.; Loreggia, A.; Lumini, A. Ensembles of Convolutional Neural Networks and Transformers for Polyp Segmentation. Sensors 2023, 23, 4688.
22. Nanni, L.; Lumini, A.; Loreggia, A.; Brahnam, S.; Cuza, D. Deep ensembles and data augmentation for semantic segmentation. In Diagnostic Biomedical Signal and Image Processing Applications with Deep Learning Methods; Elsevier: Amsterdam, The Netherlands, 2023; pp. 215–234.
23. Cornelio, C.; Donini, M.; Loreggia, A.; Pini, M.S.; Rossi, F. Voting with random classifiers (VORACE): Theoretical and experimental analysis. Auton. Agents Multi-Agent Syst. 2021, 35, 2.
24. Yao, X.; Liu, Y. Making use of population information in evolutionary artificial neural networks. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 1998, 28, 417–425.
25. Opitz, D.; Shavlik, J. Generating accurate and diverse members of a neural-network ensemble. Adv. Neural Inf. Process. Syst. 1995, 8.
26. Liu, Y.; Yao, X.; Higuchi, T. Evolutionary ensembles with negative correlation learning. IEEE Trans. Evol. Comput. 2000, 4, 380–387.
27. Rosen, B.E. Ensemble learning using decorrelated neural networks. Connect. Sci. 1996, 8, 373–384.
28. Liu, Y.; Yao, X. Ensemble learning via negative correlation. Neural Netw. 1999, 12, 1399–1404.
29. Papanastasopoulos, Z.; Samala, R.K.; Chan, H.P.; Hadjiiski, L.; Paramagul, C.; Helvie, M.A.; Neal, C.H. Explainable AI for medical imaging: Deep-learning CNN ensemble for classification of estrogen receptor status from breast MRI. In Proceedings of the Medical Imaging 2020: Computer-Aided Diagnosis, Houston, TX, USA, 16–19 February 2020; Volume 11314, pp. 228–235.
30. He, X.; Zhou, Y.; Wang, B.; Cui, S.; Shao, L. DME-Net: Diabetic macular edema grading by auxiliary task learning. In International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Cham, Switzerland, 2019; pp. 788–796.
31. Coupé, P.; Mansencal, B.; Clément, M.; Giraud, R.; de Senneville, B.D.; Ta, V.T.; Lepetit, V.; Manjon, J.V. AssemblyNet: A large ensemble of CNNs for 3D whole brain MRI segmentation. NeuroImage 2020, 219, 117026.
32. Savelli, B.; Bria, A.; Molinara, M.; Marrocco, C.; Tortorella, F. A multi-context CNN ensemble for small lesion detection. Artif. Intell. Med. 2020, 103, 101749.
33. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778.
34. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520.
35. Matloob, F.; Ghazal, T.M.; Taleb, N.; Aftab, S.; Ahmad, M.; Khan, M.A.; Abbas, S.; Soomro, T.R. Software defect prediction using ensemble learning: A systematic literature review. IEEE Access 2021, 9, 98754–98771.
36. Roshan, S.E.; Asadi, S. Improvement of Bagging performance for classification of imbalanced datasets using evolutionary multi-objective optimization. Eng. Appl. Artif. Intell. 2020, 87, 103319.
37. Kassani, S.H.; Kassani, P.H.; Wesolowski, M.J.; Schneider, K.A.; Deters, R. Classification of histopathological biopsy images using ensemble of deep learning networks. arXiv 2019, arXiv:1909.11870.
38. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
39. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
40. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 2818–2826.
41. Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258.
42. Liu, K.; Zhang, M.; Pan, Z. Facial expression recognition with CNN ensemble. In Proceedings of the 2016 International Conference on Cyberworlds (CW), Chongqing, China, 28–30 September 2016; pp. 163–166.
43. Goodfellow, I.J.; Erhan, D.; Carrier, P.L.; Courville, A.; Mirza, M.; Hamner, B.; Cukierski, W.; Tang, Y.; Thaler, D.; Lee, D.H.; et al. Challenges in representation learning: A report on three machine learning contests. In Proceedings of the Neural Information Processing: 20th International Conference, ICONIP 2013, Daegu, Republic of Korea, 3–4 November 2013; Proceedings, Part III 20; Springer: Cham, Switzerland, 2013; pp. 117–124.
44. Kumar, A.; Kim, J.; Lyndon, D.; Fulham, M.; Feng, D. An ensemble of fine-tuned convolutional neural networks for medical image classification. IEEE J. Biomed. Health Inform. 2016, 21, 31–40.
45. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9.
46. Gilbert, A.; Piras, L.; Wang, J.; Yan, F.; Ramisa, A.; Dellandrea, E.; Gaizauskas, R.J.; Villegas, M.; Mikolajczyk, K. Overview of the ImageCLEF 2016 Scalable Concept Image Annotation Task. In Proceedings of the CLEF (Working Notes), Évora, Portugal, 5–8 September 2016; pp. 254–278.
47. Pandey, P.; Deepthi, A.; Mandal, B.; Puhan, N.B. FoodNet: Recognizing foods using ensemble of deep networks. IEEE Signal Process. Lett. 2017, 24, 1758–1762.
48. Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140.
49. Wolpert, D.H.; Macready, W.G. An efficient method to estimate bagging’s generalization error. Mach. Learn. 1999, 35, 41–55.
50. Bauer, E.; Kohavi, R. An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Mach. Learn. 1999, 36, 105–139.
51. Kim, P.K.; Lim, K.T. Vehicle type classification using bagging and convolutional neural network on multi view surveillance image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 41–46.
52. Dong, X.; Qian, L.; Huang, L. A CNN-based bagging learning approach to short-term load forecasting in smart grid. In Proceedings of the 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), San Francisco, CA, USA, 4–8 August 2017; pp. 1–6.
53. Guo, J.; Gould, S. Deep CNN ensemble with data augmentation for object detection. arXiv 2015, arXiv:1506.07224.
54. Gan, Y.; Chen, J.; Xu, L. Facial expression recognition boosted by soft label with a diverse ensemble. Pattern Recognit. Lett. 2019, 125, 105–112.
55. Antipov, G.; Berrani, S.A.; Dugelay, J.L. Minimalistic CNN-based ensemble model for gender prediction from face images. Pattern Recognit. Lett. 2016, 70, 59–65.
56. Zhang, H.; Zhou, T.; Xu, T.; Hu, H. Remote interference discrimination testbed employing AI ensemble algorithms for 6G TDD networks. Sensors 2023, 23, 2264.
57. Nanni, L.; Lumini, A.; Ghidoni, S.; Maguolo, G. Stochastic selection of activation layers for convolutional neural networks. Sensors 2020, 20, 1626.
58. Ju, C.; Bibaut, A.; van der Laan, M. The relative performance of ensemble methods with deep convolutional neural networks for image classification. J. Appl. Stat. 2018, 45, 2800–2818.
59. Harangi, B. Skin lesion classification with ensembles of deep convolutional neural networks. J. Biomed. Inform. 2018, 86, 25–32.
60. Lyksborg, M.; Puonti, O.; Agn, M.; Larsen, R. An ensemble of 2D convolutional neural networks for tumor segmentation. In Proceedings of the Image Analysis: 19th Scandinavian Conference, SCIA 2015, Copenhagen, Denmark, 15–17 June 2015; Proceedings 19; Springer: Cham, Switzerland, 2015; pp. 201–211.
61. Minetto, R.; Segundo, M.P.; Sarkar, S. Hydra: An ensemble of convolutional neural networks for geospatial land classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 6530–6541.
62. Dong, X.; Yu, Z.; Cao, W.; Shi, Y.; Ma, Q. A survey on ensemble learning. Front. Comput. Sci. 2020, 14, 241–258.
63. Brown, G.; Wyatt, J.; Harris, R.; Yao, X. Diversity creation methods: A survey and categorisation. Inf. Fusion 2005, 6, 5–20.
64. Sagi, O.; Rokach, L. Ensemble learning: A survey. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2018, 8, e1249.
65. Duch, W.; Jankowski, N. Survey of neural transfer functions. Neural Comput. Surv. 1999, 2, 163–212.
66. Goceri, E. Medical Image Data Augmentation: Techniques, Comparisons and Interpretations. Artif. Intell. Rev. 2023, 7, 1–45.
67. Mehrtash, A.; Abolmaesumi, P.; Golland, P.; Kapur, T.; Wassermann, D.; Wells, W. PEP: Parameter ensembling by perturbation. Adv. Neural Inf. Process. Syst. 2020, 33, 8895–8906.
68. Demšar, J. Statistical Comparisons of Classifiers over Multiple Datasets. J. Mach. Learn. Res. 2006, 7, 1–30.
69. Boland, M.V.; Murphy, R.F. A neural network classifier capable of recognizing the patterns of all major subcellular structures in fluorescence microscope images of HeLa cells. Bioinformatics 2001, 17, 1213–1223.
70. Shamir, L.; Orlov, N.; Mark Eckley, D.; Macura, T.J.; Goldberg, I.G. IICBU 2008: A proposed benchmark suite for biological image analysis. Med. Biol. Eng. Comput. 2008, 46, 943–947.
71. Dimitropoulos, K.; Barmpoutis, P.; Zioga, C.; Kamas, A.; Patsiaoura, K.; Grammalidis, N. Grading of invasive breast carcinoma through Grassmannian VLAD encoding. PLoS ONE 2017, 12, e0185110.
72. Moccia, S.; De Momi, E.; Guarnaschelli, M.; Savazzi, M.; Laborai, A.; Guastini, L.; Peretti, G.; Mattos, L.S. Confident texture-based laryngeal tissue classification for early stage diagnosis support. J. Med. Imaging 2017, 4, 034502.
73. Kim, Y.W.; Byun, Y.C.; Krishna, A.V.N. Portrait Segmentation Using Ensemble of Heterogeneous Deep-Learning Models. Entropy 2021, 23, 197.
74. Deng, L.; Wang, Y.; Han, Z.; Yu, R. Research on insect pest image detection and recognition based on bio-inspired methods. Biosyst. Eng. 2018, 169, 139–148.
75. Patrini, I.; Ruperti, M.; Moccia, S.; Mattos, L.S.; Frontoni, E.; De Momi, E. Transfer learning for informative-frame selection in laryngoscopic videos through learned features. Med. Biol. Eng. Comput. 2020, 58, 1225–1238.
76. Zhao, R.; Zhang, R.; Tang, T.; Feng, X.; Li, J.; Liu, Y.; Zhu, R.; Wang, G.; Li, K.; Zhou, W.; et al. TriZ-a rotation-tolerant image feature and its application in endoscope-based disease diagnosis. Comput. Biol. Med. 2018, 99, 182–190.
77. Sandler, M.; Howard, A.G.; Zhu, M.; Zhmoginov, A.; Chen, L. Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation. arXiv 2018, arXiv:1801.04381.
78. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.

Short Name | # Classes | # Samples | Image Type | Protocol | Ref.
---|---|---|---|---|---
HE | 10 | 862 | grayscale | 5CV | [69]
MA | 4 | 257 | grayscale | 5CV | [70]
BG | 3 | 300 | RGB | 5CV | [71]
LAR | 4 | 1320 | RGB | 3CV | [72]
POR | 3 | 1736 | RGB | 5CV | [73]
PEST | 10 | 563 | RGB | 5CV | [74]
InfLAR | 4 | 720 | RGB | 10CV | [75]
TRIZ | 4 | 574 | RGB | 10CV | [76]

Method | HE | MA | BG | LAR | POR | Average
---|---|---|---|---|---|---
RE(1)-noDA | 94.65 | 92.50 | 91.67 | 90.98 | 85.74 | 91.11
RE(14)-noDA | 96.05 | 95.00 | 90.33 | 94.02 | 87.15 | 92.51
RE(30)-noDA | 95.81 | 94.58 | 90.67 | 94.02 | 87.15 | 92.44
RE(1)-DA | 95.93 | 95.83 | 92.67 | 94.77 | 86.29 | 93.10
RE(14)-DA | 96.63 | 97.50 | 94.33 | 95.76 | 88.24 | 94.49
RE(30)-DA | 96.33 | 98.33 | 94.00 | 95.83 | 88.56 | 94.61
SE(14)-noDA | 95.47 | 95.42 | 92.67 | 94.62 | 88.02 | 93.24
SE(30)-noDA | 95.58 | 96.25 | 92.67 | 95.00 | 88.77 | 93.65
SE(14)-DA | 96.63 | 98.33 | 94.67 | 95.98 | 88.67 | 94.86
SE(30)-DA | 96.33 | 98.33 | 95.00 | 96.21 | 89.00 | 94.97

DataAUG | HE | MA | BG | LAR | POR | Average
---|---|---|---|---|---|---
DA1 | 95.12 | 95.00 | 93.00 | 92.95 | 87.05 | 92.62
DA2 | 96.63 | 95.83 | 94.00 | 95.08 | 85.97 | 93.50
DA3 | 95.93 | 95.83 | 92.67 | 94.77 | 86.29 | 93.10
DA4 | 95.23 | 93.33 | 92.33 | 94.62 | 84.90 | 92.08
DA5 | 95.35 | 91.25 | 91.33 | 95.45 | 86.41 | 91.95
DA6 | 92.44 | 91.25 | 92.33 | 94.39 | 87.37 | 91.55
ALL | 96.74 | 97.50 | 94.00 | 96.06 | 89.00 | 94.66
RE(6)-DA | 96.40 | 97.08 | 93.67 | 95.98 | 88.45 | 94.31

Method | HE | MA | BG | LAR | POR | Average
---|---|---|---|---|---|---
DropOut | 94.53 | 95.00 | 88.33 | 92.65 | 84.57 | 91.10
DCTa | 94.88 | 93.33 | 92.00 | 94.09 | 85.11 | 91.88
DCTb | 93.95 | 94.17 | 90.00 | 92.35 | 84.79 | 91.05
PEPa | 95.58 | 93.33 | 89.67 | 92.20 | 85.22 | 91.20
PEPb | 94.77 | 92.08 | 89.33 | 92.58 | 85.11 | 90.77
PEPa(5) | 95.93 | 96.25 | 90.33 | 94.02 | 86.94 | 92.69
ALL | 96.05 | 97.08 | 90.67 | 94.24 | 86.95 | 93.00
RE(5)-noDA | 95.47 | 94.58 | 91.33 | 93.48 | 86.82 | 92.33

Method | HE | MA | BG | LAR | POR | PEST | InfLAR | TRIZ | Average
---|---|---|---|---|---|---|---|---|---
RE(1)-DA | 95.93 | 95.83 | 92.67 | 94.77 | 86.29 | 93.70 | 95.56 | 98.78 | 94.19
RE(18)-DA | 96.33 | 98.33 | 94.33 | 95.61 | 88.13 | 93.87 | 96.30 | 98.78 | 95.21
SE(18)-DA | 96.51 | 98.33 | 95.00 | 96.06 | 88.56 | 94.36 | 96.67 | 98.95 | 95.55
StocDA(18) | 96.10 | 96.67 | 94.33 | 96.81 | 89.96 | 94.48 | 96.53 | 98.95 | 95.47
StocDA_PEP(18) | 96.40 | 97.50 | 94.00 | 96.82 | 91.68 | 94.14 | 97.08 | 99.13 | 95.84

Method | HE | MA | BG | LAR | POR | PEST | InfLAR | TRIZ | Average
---|---|---|---|---|---|---|---|---|---
RE(1)-DA | 0.40 | 0.79 | 2.74 | 0.41 | 2.69 | 0.75 | 0.54 | 0.10 | 1.05
RE(18)-DA | 0.22 | 0.16 | 2.32 | 0.18 | 2.05 | 0.71 | 0.49 | 0.13 | 0.78
SE(18)-DA | 0.14 | 0.06 | 2.72 | 0.14 | 1.88 | 0.57 | 0.49 | 0.05 | 0.75
StocDA(18) | 0.15 | 0.10 | 2.96 | 0.09 | 1.36 | 0.53 | 0.41 | 0.04 | 0.70
StocDA_PEP(18) | 0.10 | 0.07 | 1.67 | 0.07 | 1.31 | 0.52 | 0.40 | 0.03 | 0.52

GPU | GPU Year | Single ResNet50 | Ensemble of 15 ResNet50
---|---|---|---
GTX 1080 | 2016 | 0.36 s | 5.58 s
Titan Xp | 2017 | 0.31 s | 4.12 s
Titan RTX | 2018 | 0.22 s | 2.71 s
Titan V100 | 2018 | 0.20 s | 2.42 s

Method | HE | LAR | PEST | POR
---|---|---|---|---
RE(1)-DA | 95.93 | 94.77 | 92.54 | 84.79
RE(18)-DA | 95.93 | 95.91 | 93.26 | 86.72
SE(18)-DA | 96.40 | 95.91 | 93.04 | 89.09
StocDA(18) | 95.81 | 96.06 | 93.31 | 89.31
StocDA_PEP(18) | 96.51 | 96.29 | 92.65 | 90.17

Method | HE | LAR | PEST | POR
---|---|---|---|---
RE(1)-DA | 0.25 | 0.43 | 0.89 | 3.38
RE(18)-DA | 0.17 | 0.25 | 0.67 | 2.57
SE(18)-DA | 0.17 | 0.17 | 0.73 | 1.76
StocDA(18) | 0.21 | 0.16 | 0.65 | 1.46
StocDA_PEP(18) | 0.20 | 0.13 | 0.62 | 1.23

Method | HE | LAR | PEST | POR
---|---|---|---|---
RE(18)-DA | 96.33 | 95.61 | 93.87 | 88.13
Bag_RE(18)-DA | 96.40 | 95.08 | 93.48 | 88.02

Method | HE | LAR | PEST | POR
---|---|---|---|---
RE(18)-DA | 0.22 | 0.18 | 0.75 | 2.05
Bag_RE(18)-DA | 0.31 | 0.19 | 0.47 | 2.30