Machine Learning Identification of Saline-Alkali-Tolerant Japonica Rice Varieties Based on Raman Spectroscopy and Python Visual Analysis
Abstract
1. Introduction
2. Materials and Methods
2.1. Sample Preparation
2.2. Obtaining Spectral Information
2.3. Reduce Interference and Extract Crest
2.4. Reduced Feature Dimension
2.4.1. Dimension Reduction by Scipy.Signal.Find_Peaks (SSFP)
2.4.2. Dimension Reduction by SelectKBest (SKB)
2.4.3. Dimension Reduction by Recursive Feature Elimination (RFE)
2.4.4. Dataset Partitioning by K-Fold Cross-Validation (K-Fold CV)
2.5. Identification Models Evaluate Feature Selection Methods
2.5.1. Typical Linear LR Identification Model
2.5.2. Typical Nonlinear SVM Identification Model
3. Results and Analysis
3.1. Analysis of Characteristic Information Extraction of Crest
3.2. Analysis of Selection of Features
3.2.1. Features Selected by SKB
3.2.2. Features Selected by RFE
3.3. Performance Analysis of Models
3.3.1. Performance of Typical Linear LR Classification Model
3.3.2. Performance of Typical Nonlinear SVM Classification Model
4. Discussion
5. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Number of Varieties | Sample | Variety of Sample | Number of CS | Number of TS | Total |
---|---|---|---|---|---|
1 | QJ10 | 1 | 36 | 36 | 72 |
2 | BD6 | 1 | 36 | 36 | 72 |
3 | DF132 | 1 | 36 | 36 | 72 |
4 | LJ12 | 0 | 36 | 36 | 72 |
5 | KD42 | 0 | 36 | 36 | 72 |
6 | LD107 | 0 | 36 | 36 | 72 |
Laser Power | Integration Time | Number of Spectrum | Display | Save Spectrum | Resolution |
---|---|---|---|---|---|
High | 4 | 3 | Average | ASCII | Low |
Method | Raman Shift Position for Peak Extraction (Raman Shift/cm−1) | | | | | | | Total |
---|---|---|---|---|---|---|---|---|
 | 480 | 865 | 941 | 1129 | 1339 | 1461 | 2910 | |
SSFP | p\w\wh\pd | p\w\wh\pd | p\w\wh\pd | p\w\wh\pd | p\w\wh\pd | p\w\wh\pd | p\w\wh\pd | 28 |
SKB | p\pd | p\pd | pd | pd | pd | wh | p\w | 10 |
RFE | w\pd | pd | p\wh\pd | pd | p\wh\pd | p\w\pd | wh | 14 |
Total | 8 | 7 | 8 | 6 | 8 | 8 | 7 | 52 |
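The SSFP row lists four entries at each of seven Raman shifts, for 28 features in total. The sketch below shows one way such a feature vector could be assembled with `scipy.signal.find_peaks`. It rests on several assumptions: the spectrum is synthetic (Gaussian bands placed at the seven tabulated shifts), the thresholds are arbitrary, and the abbreviations p, w, wh, and pd are read as the four per-peak properties that `find_peaks` can return (`peak_heights`, `widths`, `width_heights`, `prominences`). That reading is a guess, since the abbreviations are not defined here.

```python
import numpy as np
from scipy.signal import find_peaks

# Hypothetical synthetic spectrum: one Gaussian band at each tabulated Raman shift.
centers = [480, 865, 941, 1129, 1339, 1461, 2910]
amplitudes = np.linspace(1.0, 2.0, len(centers))
shift = np.arange(300.0, 3100.0, 1.0)  # Raman shift axis in cm^-1 (assumed grid)
intensity = sum(
    a * np.exp(-0.5 * ((shift - c) / 10.0) ** 2)
    for a, c in zip(amplitudes, centers)
)

# Passing height, prominence and width makes find_peaks return the
# corresponding per-peak properties. Threshold values are illustrative only.
peaks, props = find_peaks(intensity, height=0.1, prominence=0.1, width=1)

# Four properties per peak, stacked as one row per peak.
features = np.column_stack([
    props["peak_heights"],
    props["widths"],
    props["width_heights"],
    props["prominences"],
])
feature_vector = features.ravel()  # 7 peaks x 4 properties = 28 values

print(shift[peaks])
print(feature_vector.shape)
```

On a real spectrum the thresholds would need tuning so that only the bands of interest survive, and baseline correction and smoothing would normally come first.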
Feature Selection | Test Data Subset | LR Classification Model | | | | | |
---|---|---|---|---|---|---|---|
 | | TN (0,0) | FP (0,1) | FN (1,0) | TP (1,1) | Accuracy | Precision |
SSFP | 1 | 32 | 4 | 2 | 34 | 0.9167 | 0.8947 |
SSFP | 2 | 32 | 4 | 3 | 33 | 0.9028 | 0.8919 |
SSFP | 3 | 33 | 3 | 4 | 32 | 0.9028 | 0.9143 |
SSFP | 4 | 34 | 2 | 2 | 34 | 0.9444 | 0.9444 |
SSFP | 5 | 30 | 6 | 4 | 32 | 0.8611 | 0.8421 |
SSFP | 6 | 34 | 2 | 1 | 35 | 0.9583 | 0.9459 |
SSFP | Average | | | | | 0.9144 | 0.9056 |
SKB | 1 | 33 | 3 | 1 | 35 | 0.9444 | 0.9211 |
SKB | 2 | 31 | 5 | 2 | 34 | 0.9028 | 0.8718 |
SKB | 3 | 32 | 4 | 3 | 33 | 0.9028 | 0.8919 |
SKB | 4 | 34 | 2 | 4 | 32 | 0.9167 | 0.9412 |
SKB | 5 | 31 | 5 | 5 | 31 | 0.8611 | 0.8611 |
SKB | 6 | 36 | 0 | 0 | 36 | 1 | 1 |
SKB | Average | | | | | 0.9213 | 0.9145 |
RFE | 1 | 33 | 3 | 3 | 33 | 0.9167 | 0.9167 |
RFE | 2 | 31 | 5 | 1 | 35 | 0.9167 | 0.8750 |
RFE | 3 | 30 | 6 | 2 | 34 | 0.8889 | 0.8500 |
RFE | 4 | 33 | 3 | 0 | 36 | 0.9583 | 0.9231 |
RFE | 5 | 29 | 7 | 3 | 33 | 0.8611 | 0.8250 |
RFE | 6 | 35 | 1 | 2 | 34 | 0.9583 | 0.9714 |
RFE | Average | | | | | 0.9167 | 0.8935 |
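The accuracy and precision columns follow directly from the four confusion-matrix counts in each row. As a minimal worked example, the snippet below recomputes the first SSFP row of the LR table (TN = 32, FP = 4, FN = 2, TP = 34). The function names are illustrative and do not come from the article.

```python
def accuracy(tn: int, fp: int, fn: int, tp: int) -> float:
    """Fraction of all test samples that were classified correctly."""
    return (tn + tp) / (tn + fp + fn + tp)


def precision(fp: int, tp: int) -> float:
    """Fraction of samples predicted as class 1 that truly are class 1."""
    return tp / (tp + fp)


# First SSFP row of the LR table: TN=32, FP=4, FN=2, TP=34.
acc = accuracy(tn=32, fp=4, fn=2, tp=34)
prec = precision(fp=4, tp=34)
print(round(acc, 4), round(prec, 4))  # prints 0.9167 0.8947
```

Both values are fractions between 0 and 1, which matches how the table reports them.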
Feature Selection | Test Data Subset | SVM Classification Model | | | | | |
---|---|---|---|---|---|---|---|
 | | TN (0,0) | FP (0,1) | FN (1,0) | TP (1,1) | Accuracy | Precision |
SSFP | 1 | 34 | 2 | 2 | 34 | 0.9444 | 0.9444 |
SSFP | 2 | 34 | 2 | 4 | 32 | 0.9167 | 0.9412 |
SSFP | 3 | 30 | 6 | 5 | 31 | 0.8472 | 0.8378 |
SSFP | 4 | 35 | 1 | 0 | 36 | 0.9861 | 0.9730 |
SSFP | 5 | 35 | 1 | 3 | 32 | 0.9437 | 0.9697 |
SSFP | 6 | 34 | 2 | 1 | 35 | 0.9583 | 0.9459 |
SSFP | Average | | | | | 0.9327 | 0.9353 |
SKB | 1 | 33 | 3 | 3 | 33 | 0.9167 | 0.9167 |
SKB | 2 | 34 | 2 | 4 | 32 | 0.9167 | 0.9412 |
SKB | 3 | 32 | 4 | 6 | 30 | 0.8611 | 0.8824 |
SKB | 4 | 36 | 0 | 1 | 35 | 0.9861 | 1 |
SKB | 5 | 34 | 2 | 1 | 35 | 0.9583 | 0.9459 |
SKB | 6 | 34 | 2 | 2 | 34 | 0.9444 | 0.9444 |
SKB | Average | | | | | 0.9306 | 0.9384 |
RFE | 1 | 35 | 1 | 5 | 31 | 0.9167 | 0.9688 |
RFE | 2 | 34 | 2 | 3 | 33 | 0.9306 | 0.9429 |
RFE | 3 | 34 | 2 | 5 | 31 | 0.9028 | 0.9394 |
RFE | 4 | 35 | 1 | 0 | 36 | 0.9861 | 0.9730 |
RFE | 5 | 35 | 1 | 2 | 34 | 0.9583 | 0.9714 |
RFE | 6 | 34 | 2 | 2 | 34 | 0.9444 | 0.9444 |
RFE | Average | | | | | 0.9398 | 0.9566 |
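The tables report confusion-matrix counts on six test subsets for each pairing of a feature-selection method (SKB keeping 10 features, RFE keeping 14) with an LR or SVM classifier. The sketch below shows how such an evaluation could be set up with scikit-learn. It is not the authors' code. The data are synthetic (432 samples by 28 features, balanced classes), the `f_classif` score function for `SelectKBest`, the LR estimator inside `RFE`, the default RBF kernel for `SVC`, the standardization step, and the stratified shuffling are all assumptions, and `evaluate` is a hypothetical helper.

```python
import numpy as np
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 28 peak features: 216 samples per class.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 216)
X = rng.normal(size=(432, 28))
X[y == 1, :5] += 1.0  # make the first five features informative


def evaluate(selector, classifier, n_splits=6):
    """Return (TN, FP, FN, TP) for each fold of a stratified K-fold split."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    rows = []
    for train_idx, test_idx in cv.split(X, y):
        # Scaling and feature selection are fitted on the training fold only.
        model = make_pipeline(StandardScaler(), selector, classifier)
        model.fit(X[train_idx], y[train_idx])
        y_pred = model.predict(X[test_idx])
        tn, fp, fn, tp = confusion_matrix(y[test_idx], y_pred, labels=[0, 1]).ravel()
        rows.append((int(tn), int(fp), int(fn), int(tp)))
    return rows


skb = SelectKBest(score_func=f_classif, k=10)
rfe = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=14)

skb_lr_rows = evaluate(skb, LogisticRegression(max_iter=1000))
rfe_svm_rows = evaluate(rfe, SVC())

for name, rows in [("SKB + LR", skb_lr_rows), ("RFE + SVM", rfe_svm_rows)]:
    for fold, (tn, fp, fn, tp) in enumerate(rows, start=1):
        print(name, fold, tn, fp, fn, tp)
```

Fitting the selector inside the pipeline keeps the test fold out of feature selection, so the per-fold counts are not inflated by information leakage.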
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, R.; Tan, F.; Wang, Y.; Ma, B.; Yuan, M.; Wang, L.; Zhao, X. Machine Learning Identification of Saline-Alkali-Tolerant Japonica Rice Varieties Based on Raman Spectroscopy and Python Visual Analysis. Agriculture 2022, 12, 1048. https://doi.org/10.3390/agriculture12071048