Hyperspectral Classification of Blood-Like Substances Using Machine Learning Methods Combined with Genetic Algorithms in Transductive and Inductive Scenarios
Abstract
1. Introduction
2. State of the Art
2.1. Hyperspectral Classification
2.2. Evolutionary Computation and Genetic Algorithms
2.3. Hyperspectral Classification and Band Selection with GAs
3. Materials and Methods
3.1. Dataset
3.2. Data Preprocessing
- Median filter: Images were smoothed with a spatial median filter with a window size of one pixel. This operation was intended to reduce the noise in spectra, using the fact that classes were spatially significantly larger than a single pixel.
- Spectra normalisation: As suggested in [14], the spectrum of each pixel was divided by its median. The purpose of this normalisation was to compensate for uneven lighting in the image.
- Removal of noisy bands: Following [14], noisy bands (0–4), (48–50) and (122–128) were removed, leaving 113 bands.
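The three preprocessing steps above can be sketched as follows. This is a minimal illustration in Python with NumPy and SciPy, not the authors' implementation. The function name `preprocess`, the 3 × 3 spatial window (one pixel on each side of the centre, an assumed reading of "a window size of one pixel") and the treatment of the band ranges as inclusive zero-based indices clipped to the valid range are all assumptions. Under that reading, the number of bands retained by the sketch need not equal the 113 reported above.

```python
import numpy as np
from scipy.ndimage import median_filter


def preprocess(cube, noisy_ranges, window=3):
    """Preprocess a hyperspectral cube of shape (rows, cols, bands).

    `window` and the inclusive reading of `noisy_ranges` are assumptions
    made for this sketch, not values confirmed by the text.
    """
    # Spatial median filter: smooth each band separately, leaving the
    # spectral axis untouched (window of 1 along the band axis).
    smoothed = median_filter(cube, size=(window, window, 1), mode="nearest")

    # Normalisation: divide every pixel spectrum by its own median.
    medians = np.median(smoothed, axis=2, keepdims=True)
    normalised = smoothed / medians

    # Band removal: collect the noisy band indices, clip them to the
    # valid range and keep everything else.
    n_bands = cube.shape[2]
    noisy = set()
    for start, stop in noisy_ranges:
        noisy.update(range(start, min(stop, n_bands - 1) + 1))
    keep = [band for band in range(n_bands) if band not in noisy]
    return normalised[:, :, keep]


# Synthetic stand-in for one image: strictly positive values, 128 bands.
rng = np.random.default_rng(0)
cube = rng.uniform(0.1, 1.0, size=(8, 8, 128))
clean = preprocess(cube, noisy_ranges=[(0, 4), (48, 50), (122, 128)])
print(clean.shape)
```

The normalisation is applied before the noisy bands are dropped, following the order of the list; if the median should instead be computed over the retained bands only, the last two steps would need to be swapped.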
3.3. Feature Extraction
3.4. Classification Algorithms
3.4.1. Support Vector Machines
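- Gaussian radial basis function (RBF) kernel $K(x, y) = \exp(-\gamma \lVert x - y \rVert^2)$, parameterised with $\gamma$,
- sigmoid kernel $K(x, y) = \tanh(\gamma\, x^\top y + c)$, parameterised with $\gamma$ and $c$,
- polynomial kernel $K(x, y) = (\gamma\, x^\top y + c)^d$, parameterised with $\gamma$, $c$ and $d$, which simplifies to the linear kernel when $\gamma = 1$, $c = 0$ and $d = 1$.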
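For illustration, the three kernels can be written out in their standard textbook forms. The sketch below assumes the usual definitions (RBF: exp(-gamma * ||x - y||^2); sigmoid: tanh(gamma * x·y + c); polynomial: (gamma * x·y + c)^d); the symbol names `gamma`, `c` and `d` are assumptions and may differ from the notation of the original article.

```python
import numpy as np


def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))


def sigmoid_kernel(x, y, gamma, c):
    """Sigmoid kernel: tanh(gamma * <x, y> + c)."""
    return np.tanh(gamma * np.dot(x, y) + c)


def polynomial_kernel(x, y, gamma, c, d):
    """Polynomial kernel: (gamma * <x, y> + c) ** d."""
    return (gamma * np.dot(x, y) + c) ** d


x = np.array([0.2, 0.5, 0.1])
y = np.array([0.4, 0.3, 0.9])

# With gamma = 1, c = 0 and d = 1 the polynomial kernel is the plain
# dot product, i.e. the linear kernel.
linear_value = polynomial_kernel(x, y, gamma=1.0, c=0.0, d=1)
print(np.isclose(linear_value, np.dot(x, y)))
```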
3.4.2. K-Nearest Neighbour (KNN)
3.4.3. Multilayer Perceptron
3.5. Model and Feature Selection with Genetic Algorithms
3.5.1. Model Selection with Grid Search
3.5.2. Implementation
3.5.3. Model Performance Metric
4. Experiments
- Hyperspectral transductive classification (HTC): training and test examples were selected uniformly at random from a single hyperspectral image.
- Hyperspectral inductive classification (HIC): training and test examples were selected from different images. Typically, training examples came from the “Frame” images and test examples came from the “Comparison” images.
- Hyperspectral inductive classification with a validation set (HICVS): this scenario was similar to the HIC scenario, in that training examples came from the “Frame” images and test examples came from the “Comparison” images. However, model selection was performed using a separate validation set that was sampled uniformly at random from the “Comparison” scene. This scenario was designed to test the capabilities of GA optimisation under conditions different from those in the HIC scenario, as discussed in detail in Section 6.
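The difference between the three scenarios lies in where the training, validation and test pixels come from. The sketch below makes this explicit with index arrays only. All function names and sample counts are hypothetical, and the choice to exclude the validation pixels from the HICVS test set is an assumption that the text does not confirm.

```python
import numpy as np


def split_indices(n_samples, n_first, rng):
    """Shuffle 0..n_samples-1 and cut it into two disjoint index arrays."""
    order = rng.permutation(n_samples)
    return order[:n_first], order[n_first:]


def htc_split(n_pixels, n_train, rng):
    """HTC: training and test pixels are drawn from the same image."""
    return split_indices(n_pixels, n_train, rng)


def hic_split(n_frame, n_comparison):
    """HIC: train on a "Frame" image, test on a "Comparison" image."""
    return np.arange(n_frame), np.arange(n_comparison)


def hicvs_split(n_frame, n_comparison, n_validation, rng):
    """HICVS: as HIC, plus a validation set sampled from "Comparison"."""
    validation, test = split_indices(n_comparison, n_validation, rng)
    return np.arange(n_frame), validation, test


rng = np.random.default_rng(0)
htc_train, htc_test = htc_split(n_pixels=1000, n_train=100, rng=rng)
hic_train, hic_test = hic_split(n_frame=1000, n_comparison=800)
vs_train, vs_val, vs_test = hicvs_split(1000, 800, n_validation=80, rng=rng)
print(len(htc_train), len(htc_test), len(vs_val), len(vs_test))
```

In the HTC case the two index arrays refer to the same image, whereas in the HIC and HICVS cases the training indices refer to a "Frame" image and the remaining indices to a "Comparison" image.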
4.1. The Scheme of Experiments
- Raw data: The data consisted of seven hyperspectral images from the data set described in Section 3.1. Every image had 128 hyperspectral bands. The images represented two scenes: the “Frame” scene and the “Comparison” scene. Four of the seven images showed the “Frame” scene, captured on days , where the value represents the afternoon of the first day. The three “Comparison” images were captured on days .
- Data preprocessing: Data were transformed in accordance with the methodology described in Section 3.2. To reduce the effect of noise and uneven lighting, spectra were smoothed with the median filter and normalised, and noisy bands were removed. Background (unannotated) pixels were removed, as were pixels from the class “beetroot juice” (class 4), which was not present in all images. Finally, the problem was posed as a six-class classification with classes .
- Feature extraction: A derivative transformation was used, as described in Section 3.3.
- Data split: Data were divided into training and test sets. A detailed description of this stage is given in Sections 4.2, 4.3 and 4.4.
- Model optimisation: Model and feature selection were performed as described in detail in Section 3.5. The reference method used for comparison was a grid search. In both cases, accuracy was chosen as the evaluation criterion. The settings and details of the cross-validation varied depending on the scenario of the experiment; they are given in the descriptions of the individual scenarios.
- Model evaluation: The final results were expressed in terms of classification accuracy. After the best model had been found in the model optimisation stage, this model was trained on the entire training set and tested on the test sets. The test sets were created from both scenes: “Frame” and “Comparison”. The training and testing process was repeated five times, and the average accuracy and its standard deviation were calculated.
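The feature extraction and model evaluation stages can be illustrated with a small NumPy sketch. It is not the authors' pipeline: the first-order finite difference is an assumed form of the derivative transformation, a 1-nearest-neighbour rule stands in for the optimised classifier, the data are synthetic, and redrawing the training set on each of the five repetitions is an assumption about what varied between runs.

```python
import numpy as np


def spectral_derivative(spectra):
    """First-order finite difference along the band axis (assumed form)."""
    return np.diff(spectra, axis=1)


def nearest_neighbour_predict(train_x, train_y, test_x):
    """Label each test sample with the class of its closest training sample."""
    sq_dist = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    return train_y[np.argmin(sq_dist, axis=1)]


def repeated_accuracy(x, y, n_repeats=5, train_fraction=0.5, seed=0):
    """Train and test n_repeats times; return mean and std of accuracy."""
    rng = np.random.default_rng(seed)
    n_train = int(train_fraction * len(y))
    scores = []
    for _ in range(n_repeats):
        order = rng.permutation(len(y))
        train, test = order[:n_train], order[n_train:]
        predicted = nearest_neighbour_predict(x[train], y[train], x[test])
        scores.append(np.mean(predicted == y[test]))
    return float(np.mean(scores)), float(np.std(scores))


# Synthetic two-class data: rising versus falling spectra with small noise.
rng = np.random.default_rng(0)
bands = np.arange(20, dtype=float)
rising = bands + rng.normal(0.0, 0.01, size=(30, 20))
falling = -bands + rng.normal(0.0, 0.01, size=(30, 20))
spectra = np.vstack([rising, falling])
labels = np.array([0] * 30 + [1] * 30)

features = spectral_derivative(spectra)
mean_acc, std_acc = repeated_accuracy(features, labels)
print(f"accuracy: {mean_acc:.3f} +/- {std_acc:.3f}")
```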
4.2. Hyperspectral Transductive Classification (HTC)
4.3. Hyperspectral Inductive Classification (HIC)
4.4. Hyperspectral Inductive Classification with a Validation Set (HICVS)
5. Results
5.1. The HTC Scenario
5.2. The HIC Scenario
5.3. HIC Scenario with a Validation Set
5.4. Computation Time
6. Discussion
6.1. The Impact of Preprocessing
6.2. Model Optimisation with GA in Hyperspectral Classification
7. Conclusions and Future Works
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Ghamisi, P.; Plaza, J.; Chen, Y.; Li, J.; Plaza, A.J. Advanced spectral classifiers for hyperspectral images: A review. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–32.
- Bioucas-Dias, J.M.; Plaza, A.; Camps-Valls, G.; Scheunders, P.; Nasrabadi, N.M.; Chanussot, J. Hyperspectral remote sensing data analysis and future challenges. IEEE Geosci. Remote Sens. Mag. 2013, 1, 6–36.
- Bioucas-Dias, J.M.; Plaza, A.; Dobigeon, N.; Parente, M.; Du, Q.; Gader, P.; Chanussot, J. Hyperspectral unmixing overview: Geometrical, statistical, and sparse regression-based approaches. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2012, 5, 354–379.
- Landgrebe, D.A. Signal Theory Methods in Multispectral Remote Sensing; John Wiley & Sons: Hoboken, NJ, USA, 2005; Volume 29.
- Romaszewski, M.; Głomb, P.; Cholewa, M. Semi-supervised hyperspectral classification from a small number of training samples using a co-training approach. ISPRS J. Photogramm. Remote Sens. 2016, 121, 60–76.
- Vapnik, V.; Sterin, A. On structural risk minimization or overall risk in a problem of pattern recognition. Autom. Remote Control 1977, 10, 1495–1503.
- Manolakis, D.; Marden, D.; Shaw, G.A. Hyperspectral image processing for automatic target detection applications. Linc. Lab. J. 2003, 14, 79–116.
- Holland, J.H. Adaptation in Natural and Artificial Systems; MIT Press: Cambridge, MA, USA, 1992.
- Rutkowski, L. Computational Intelligence: Methods and Techniques; Springer: Berlin/Heidelberg, Germany, 1992.
- Ma, J.P.; Zheng, Z.B.; Tong, Q.X.; Zheng, L.F. An application of genetic algorithms on band selection for hyperspectral image classification. In Proceedings of the 2003 International Conference on Machine Learning and Cybernetics (IEEE Cat. No. 03EX693), Xi’an, China, 5 November 2003; Volume 5, pp. 2810–2813.
- Sukawattanavijit, C.; Chen, J.; Zhang, H. GA-SVM algorithm for improving land-cover classification using SAR and optical remote sensing data. IEEE Geosci. Remote Sens. Lett. 2017, 14, 284–288.
- Kumar, R.K.; Saichandana, B.; Srinivas, K. Dimensionality reduction and classification of hyperspectral images using genetic algorithm. Indones. J. Electr. Eng. Comput. Sci. 2016, 3, 503–511.
- Zhuo, L.; Zheng, J.; Li, X.; Wang, F.; Ai, B.; Qian, J. A genetic algorithm based wrapper feature selection method for classification of hyperspectral images using support vector machine. In Geoinformatics 2008 and Joint Conference on GIS and Built Environment: Classification of Remote Sensing Images; International Society for Optics and Photonics: Bellingham, WA, USA, 2008; Volume 7147, p. 71471J.
- Romaszewski, M.; Głomb, P.; Sochan, A.; Cholewa, M. A dataset for evaluating blood detection in hyperspectral images. Forensic Sci. Int. 2021, 320, 110701.
- Tadeusiewicz, R. Automatic Understanding of Medical Images (Opening Lecture) The 2nd International Conference “Innovative Technologies in Biomedicine”; The Cracovian Association for Heart and Lung Health PULMO-CAR: Kraków, Poland, 2015; pp. 10–11.
- Kłeczek, P.; Lech, M.; Jaworek-Korjakowska, J.; Dyduch, G.; Tadeusiewicz, R. Segmentation of black ink and melanin in skin histopathological images. In Medical Imaging 2018: Digital Pathology; International Society for Optics and Photonics: Bellingham, WA, USA, 2018; p. 105811A.
- Kłeczek, P.; Dyduch, G.; Jaworek-Korjakowska, J.; Tadeusiewicz, R. Automated epidermis segmentation in histopathological images of human skin stained with hematoxylin and eosin. In Medical Imaging 2017: Digital Pathology; International Society for Optics and Photonics: Bellingham, WA, USA, 2017; p. 101400M.
- Jaworek-Korjakowska, J.; Kłeczek, P.; Tadeusiewicz, R. Detection and classification of pigment network in dermoscopic color images as one of the 7-point checklist criteria. In Recent Developments and Achievements in Biocybernetics and Biomedical Engineering, Proceedings of the 20th Polish Conference on Biocybernetics and Biomedical Engineering, Kraków, Poland, 20–22 September 2017; Advances in Intelligent Systems and Computing; Springer: Berlin/Heidelberg, Germany, 2017; Volume 647, pp. 174–181.
- Lu, G.; Fei, B. Medical hyperspectral imaging: A review. J. Biomed. Opt. 2014, 19, 010901.
- Edelman, G.; Manti, V.; van Ruth, S.M.; van Leeuwen, T.; Aalders, M. Identification and age estimation of blood stains on colored backgrounds by near infrared spectroscopy. Forensic Sci. Int. 2012, 220, 239–244.
- Melgani, F.; Bruzzone, L. Classification of hyperspectral remote sensing images with support vector machines. IEEE Trans. Geosci. Remote Sens. 2004, 42, 1778–1790.
- Pal, M.; Maxwell, A.E.; Warner, T.A. Kernel-based extreme learning machine for remote-sensing image classification. Remote Sens. Lett. 2013, 4, 853–862.
- Khodadadzadeh, M.; Li, J.; Plaza, A.; Bioucas-Dias, J.M. A subspace-based multinomial logistic regression for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 2014, 11, 2105–2109.
- Ghamisi, P.; Maggiori, E.; Li, S.; Souza, R.; Tarabalka, Y.; Moser, G.; De Giorgi, A.; Fang, L.; Chen, Y.; Chi, M.; et al. New frontiers in spectral-spatial hyperspectral image classification: The latest advances based on mathematical morphology, Markov random fields, segmentation, sparse representation, and deep learning. IEEE Geosci. Remote Sens. Mag. 2018, 6, 10–43.
- Li, S.; Song, W.; Fang, L.; Chen, Y.; Ghamisi, P.; Benediktsson, J.A. Deep learning for hyperspectral image classification: An overview. IEEE Trans. Geosci. Remote Sens. 2019, 57, 6690–6709.
- Fang, B.; Li, Y.; Zhang, H.; Chan, J.C.W. Semi-supervised deep learning classification for hyperspectral image based on dual-strategy sample selection. Remote Sens. 2018, 10, 574.
- Engelbrecht, A.P. Computational Intelligence: An Introduction, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2007; pp. 1–597.
- Tadeusiewicz, R. Neural networks as a tool for modeling of biological systems. Bio-Algorithms Med-Syst. 2015, 11, 135–144.
- Back, T.; Hammel, U.; Schwefel, H.P. Evolutionary computation: Comments on the history and current state. IEEE Trans. Evol. Comput. 1997, 1, 3–17.
- Nguyen, H.T.; Sugeno, M. Fuzzy Systems, Modeling and Control; Springer: Boston, MA, USA, 1998.
- Sivanandam, S.; Deepa, S. Introduction to Genetic Algorithms; Springer: Berlin/Heidelberg, Germany, 2008; pp. 1–442.
- Park, H.; Son, D.; Koo, B.; Jeong, B. Waiting strategy for the vehicle routing problem with simultaneous pickup and delivery using genetic algorithm. Expert Syst. Appl. 2021, 165, 113959.
- Zhou, Y.; Zhang, W.; Kang, J.; Zhang, X.; Wang, X. A problem-specific non-dominated sorting genetic algorithm for supervised feature selection. Inf. Sci. 2021, 547, 841–859.
- D’Angelo, G.; Palmieri, F. GGA: A modified genetic algorithm with gradient-based local search for solving constrained optimization problems. Inf. Sci. 2021, 547, 136–162.
- Alonso-Arévalo, M.A.; Cruz-Gutiérrez, A.; Ibarra, R.; García-Canseco, E.; Conte-Galván, R. Robust heart sound segmentation based on spectral change detection and genetic algorithms. Biomed. Signal Process. Control 2021, 63, 102208.
- Dong, X.; Zhang, H.; Xu, M.; Shen, F. Hybrid genetic algorithm with variable neighborhood search for multi-scale multiple bottleneck traveling salesmen problem. Future Gener. Comput. Syst. 2021, 114, 229–242.
- Pławiak, P.; Acharya, U.R. Novel Deep Genetic Ensemble of Classifiers for Arrhythmia Detection Using ECG Signals. Neural Comput. Appl. 2020, 32, 11137–11161.
- Pławiak, P. Novel Genetic Ensembles of Classifiers Applied to Myocardium Dysfunction Recognition Based on ECG Signals. Swarm Evol. Comput. 2018, 39C, 192–208.
- Książek, W.; Abdar, M.; Acharya, U.R.; Pławiak, P. A Novel Machine Learning Approach for Early Detection of Hepatocellular Carcinoma Patients. Cogn. Syst. Res. 2019, 54, 116–127.
- Pławiak, P.; Abdar, M.; Pławiak, J.; Makarenkov, V.; Acharya, U.R. DGHNL: A New Deep Genetic Hierarchical Network of Learners for Prediction of Credit Scoring. Inf. Sci. 2020, 516, 401–418.
- Pedergnana, M.; Marpu, P.R.; Dalla Mura, M.; Benediktsson, J.A.; Bruzzone, L. A novel technique for optimal feature selection in attribute profiles based on genetic algorithms. IEEE Trans. Geosci. Remote Sens. 2013, 51, 3514–3528.
- Nagasubramanian, K.; Jones, S.; Sarkar, S.; Singh, A.K.; Singh, A.; Ganapathysubramanian, B. Hyperspectral band selection using genetic algorithm and support vector machines for early identification of charcoal rot disease in soybean stems. Plant Methods 2018, 14, 86.
- Tsai, F.; Philpot, W. Derivative analysis of hyperspectral data. Remote Sens. Environ. 1998, 66, 41–51.
- Majda, A.; Wietecha-Posłuszny, R.; Mendys, A.; Wójtowicz, A.; Łydżba-Kopczyńska, B. Hyperspectral imaging and multivariate analysis in the dried blood spots investigations. Appl. Phys. A 2018, 124, 312.
- Schölkopf, B.; Smola, A.J. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond; Adaptive Computation and Machine Learning Series; MIT Press: Cambridge, MA, USA, 2018.
- Głomb, P.; Romaszewski, M.; Cholewa, M.; Domino, K. Application of hyperspectral imaging and machine learning methods for the detection of gunshot residue patterns. Forensic Sci. Int. 2018, 290, 227–237.
- Schölkopf, B.; Smola, A.J.; Williamson, R.C.; Bartlett, P.L. New support vector algorithms. Neural Comput. 2000, 12, 1207–1245.
- Zhang, Z. Introduction to machine learning: K-nearest neighbors. Ann. Transl. Med. 2016, 4, 218.
- Ramchoun, H.; Amine, M.; Janati Idrissi, M.A.; Ghanou, Y.; Ettaouil, M. Multilayer Perceptron: Architecture Optimization and Training. Int. J. Interact. Multimed. Artif. Intel. 2016, 4, 26–30.
- Grefenstette, J. Genetic algorithms for changing environments. In PPSN; Citeseer: State College, PA, USA, 1992; Volume 2, pp. 137–144.
- Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics; Proceedings of Machine Learning Research; Teh, Y.W., Titterington, M., Eds.; PMLR: Chia Laguna Resort, Sardinia, Italy, 2010; Volume 9, pp. 249–256.
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
- Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32; Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, Inc.: New York, NY, USA, 2019; pp. 8024–8035.
- Fortin, F.A.; De Rainville, F.M.; Gardner, M.A.; Parizeau, M.; Gagné, C. DEAP: Evolutionary Algorithms Made Easy. J. Mach. Learn. Res. 2012, 13, 2171–2175.
- Kandaswamy, C.; Silva, L.M.; Alexandre, L.A.; Santos, J.M.; de Sá, J.M. Improving Deep Neural Network Performance by Reusing Features Trained with Transductive Transference. In Artificial Neural Networks and Machine Learning—ICANN 2014; Springer International Publishing: Cham, Switzerland, 2014; pp. 265–272.
Parameter | Range of Values |
---|---|
K a | {RBF, polynomial, sigmoid} |
d b | |
c | |
d | |
band | |
Parameter | Value |
---|---|
Size of the population | 200 |
Number of epochs | 100 |
Fitness function | Accuracy |
Selection algorithm | Tournament selection, size 3 |
Crossover method | Uniform crossover |
Mutation method | One-point mutation a |
Probability of crossover | 0.8 |
Probability of mutation | 0.8 |
Elitist strategy | 1 individual |
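The settings in the table above can be read as the configuration of a fairly standard generational GA. The sketch below is a plain-Python illustration of such a loop, not the authors' implementation (which may rely on a library such as DEAP [54]). The binary band-mask chromosome, the interpretation of one-point mutation as flipping a single randomly chosen gene, and the toy fitness function are assumptions; in the article the fitness is classification accuracy.

```python
import random


def tournament(population, fitnesses, size, rng):
    """Return a copy of the fittest of `size` randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), size)
    winner = max(contenders, key=lambda i: fitnesses[i])
    return population[winner][:]


def uniform_crossover(a, b, rng):
    """Swap each gene between the two parents with probability 0.5."""
    for i in range(len(a)):
        if rng.random() < 0.5:
            a[i], b[i] = b[i], a[i]


def one_point_mutation(individual, rng):
    """Flip one randomly chosen gene (assumed meaning of one-point mutation)."""
    i = rng.randrange(len(individual))
    individual[i] = 1 - individual[i]


def run_ga(fitness, n_genes, pop_size=200, n_epochs=100, tournament_size=3,
           p_crossover=0.8, p_mutation=0.8, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(n_epochs):
        fitnesses = [fitness(ind) for ind in population]
        # Elitist strategy: the single best individual survives unchanged.
        best_index = max(range(pop_size), key=lambda i: fitnesses[i])
        offspring = [population[best_index][:]]
        while len(offspring) < pop_size:
            a = tournament(population, fitnesses, tournament_size, rng)
            b = tournament(population, fitnesses, tournament_size, rng)
            if rng.random() < p_crossover:
                uniform_crossover(a, b, rng)
            for child in (a, b):
                if rng.random() < p_mutation:
                    one_point_mutation(child, rng)
            offspring.extend([a, b])
        population = offspring[:pop_size]
    fitnesses = [fitness(ind) for ind in population]
    best_index = max(range(pop_size), key=lambda i: fitnesses[i])
    return population[best_index], fitnesses[best_index]


# Toy fitness: number of genes that agree with a hidden target band mask.
target = [1, 0] * 10


def matches(individual):
    return sum(int(g == t) for g, t in zip(individual, target))


# A smaller population and fewer epochs than in the table keep the demo fast.
best, best_fit = run_ga(matches, n_genes=20, pop_size=30, n_epochs=40)
print(best_fit, "of", len(target), "genes match")
```

Because the elite individual is copied into the next generation before any crossover or mutation, the best fitness in the population can never decrease from one epoch to the next.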
Classifier | Parameter | Values |
---|---|---|
SVM | K a | {RBF, polynomial, sigmoid} |
SVM | C | |
SVM | d b | |
SVM | c | |
SVM | d | |
LSVM e | loss | {hinge, squared} |
LSVM e | C | |
ν-SVM | K a | {RBF, polynomial, sigmoid} |
ν-SVM | d b | |
ν-SVM | c | |
ν-SVM | d | |
KNN | Dist. metric | {Euclidean, Manhattan, Chebyshev} |
KNN | Weights | {Uniform, distance} |
KNN | No. neighbors | |
MLP | No. hidden layers | |
MLP | Number of neurons on consecutive layers | |
MLP | Dropout | |
MLP | Learning rate | |
MLP | Batch size | |
MLP | Number of iterations | |
MLP | Weights initialisation | Glorot method [51] with normal distribution |
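As an illustration of how a grid search walks through such a table, the sketch below enumerates every combination of the three KNN parameters and keeps the one with the highest accuracy. It is a minimal NumPy example under stated assumptions: the candidate numbers of neighbours are hypothetical (the values are not recoverable from the table), the data are synthetic, and a single held-out validation split replaces the cross-validation used in the article.

```python
import itertools
import numpy as np


def pairwise_distance(a, b, metric):
    """Distance between every row of `a` and every row of `b`."""
    diff = np.abs(a[:, None, :] - b[None, :, :])
    if metric == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=2))
    if metric == "manhattan":
        return diff.sum(axis=2)
    if metric == "chebyshev":
        return diff.max(axis=2)
    raise ValueError(f"unknown metric: {metric}")


def knn_predict(train_x, train_y, test_x, metric, weights, n_neighbors):
    """K-nearest-neighbour vote with uniform or inverse-distance weights."""
    dist = pairwise_distance(test_x, train_x, metric)
    nearest = np.argsort(dist, axis=1)[:, :n_neighbors]
    classes = np.unique(train_y)
    predictions = []
    for row, idx in enumerate(nearest):
        if weights == "uniform":
            w = np.ones(len(idx))
        else:  # "distance": closer neighbours count more
            w = 1.0 / (dist[row, idx] + 1e-12)
        votes = [w[train_y[idx] == c].sum() for c in classes]
        predictions.append(classes[int(np.argmax(votes))])
    return np.array(predictions)


def grid_search(param_grid, score):
    """Evaluate every parameter combination and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, -np.inf
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        current = score(**params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


# Synthetic two-class data with a held-out validation set.
rng = np.random.default_rng(0)
train_x = np.vstack([rng.normal(0, 0.3, (20, 4)), rng.normal(5, 0.3, (20, 4))])
train_y = np.array([0] * 20 + [1] * 20)
val_x = np.vstack([rng.normal(0, 0.3, (10, 4)), rng.normal(5, 0.3, (10, 4))])
val_y = np.array([0] * 10 + [1] * 10)


def validation_accuracy(metric, weights, n_neighbors):
    predicted = knn_predict(train_x, train_y, val_x, metric, weights, n_neighbors)
    return float(np.mean(predicted == val_y))


param_grid = {
    "metric": ["euclidean", "manhattan", "chebyshev"],
    "weights": ["uniform", "distance"],
    "n_neighbors": [1, 3, 5],  # hypothetical values
}
best_params, best_score = grid_search(param_grid, validation_accuracy)
print(best_params, best_score)
```

The cost of this exhaustive approach grows with the product of the list lengths (here 3 × 2 × 3 = 18 evaluations), which is one reason a GA can be attractive when band selection is added to the search space.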
Model Optimisation | Classifier | Accuracy/Day 1 | Accuracy/Day 7 | Accuracy/Day 21 | Accuracy/All b |
---|---|---|---|---|---|
GS | SVC | | | | |
GS | LSVC a | | | | |
GS | ν-SVM | | | | |
GS | KNN | | | | |
GS | MLP | | | | |
GA | ν-SVM | | | | |
Model Optimisation | Classifier | Accuracy/Day 1 | Accuracy/Day 7 | Accuracy/Day 21 | Accuracy/All b |
---|---|---|---|---|---|
GS | SVC | | | | |
GS | LSVC a | | | | |
GS | ν-SVM | | | | |
GS | KNN | | | | |
GS | MLP | | | | |
GA | ν-SVM | | | | |
Model Optimisation | Classifier | Accuracy/Day 1 | Accuracy/Day 7 | Accuracy/Day 21 | Accuracy/All b |
---|---|---|---|---|---|
GS | SVC | | | | |
GS | LSVC a | | | | |
GS | ν-SVM | | | | |
GS | KNN | | | | |
GS | MLP | | | | |
GA | ν-SVM | | | | |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Pałka, F.; Książek, W.; Pławiak, P.; Romaszewski, M.; Książek, K. Hyperspectral Classification of Blood-Like Substances Using Machine Learning Methods Combined with Genetic Algorithms in Transductive and Inductive Scenarios. Sensors 2021, 21, 2293. https://doi.org/10.3390/s21072293