A Study on Origin Traceability of White Tea (White Peony) Based on Near-Infrared Spectroscopy and Machine Learning Algorithms
Abstract
1. Introduction
2. Materials and Methods
2.1. Set of White Tea (White Peony) Samples and Preliminary Treatment
2.2. Spectra Acquisition
2.3. Spectral Pretreatment
2.4. Extraction of Characteristics
2.5. Establishment and Evaluation of Models
2.6. Data Analysis
3. Results and Discussion
3.1. Spectral Analysis
3.2. Spectral Pretreatment
3.3. Extraction of Characteristics
3.3.1. PCA
3.3.2. LDA
3.3.3. SPA
3.4. Models Evaluation and Optimization
3.4.1. Models Evaluation of White Tea Origins Classified by DPC
3.4.2. Models Evaluation of White Tea Origins Classified by DDFP and AFWT
3.4.3. Models Optimization
3.4.4. Performance Analysis of Optimal Models
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
NIRS Data Matrix | Number of Principal Components | OS Eigenvalue | OS Cumulative Contribution | CWT Eigenvalue | CWT Cumulative Contribution | Minmax Eigenvalue | Minmax Cumulative Contribution | SNV Eigenvalue | SNV Cumulative Contribution | MSC Eigenvalue | MSC Cumulative Contribution |
---|---|---|---|---|---|---|---|---|---|---|---|
White tea origins classified by DPC | 1 | 806.91 | 97.10% | 470.66 | 56.64% | 573.28 | 69.07% | 614.11 | 73.90% | 614.35 | 73.93% |
White tea origins classified by DPC | 2 | 15.94 | 99.02% | 161.16 | 76.03% | 187.72 | 91.69% | 146.26 | 91.50% | 145.78 | 91.47% |
White tea origins classified by DPC | 3 | 5.48 | 99.68% | 117.55 | 90.18% | 49.09 | 97.60% | 44.73 | 96.88% | 44.35 | 96.81% |
White tea origins classified by DPC | 4 | 1.72 | 99.89% | 38.38 | 94.80% | 10.90 | 98.91% | 10.87 | 98.19% | 10.85 | 98.11% |
White tea origins classified by DPC | 5 | 0.47 | 99.94% | 21.34 | 97.36% | 3.77 | 99.37% | 8.64 | 99.23% | 9.23 | 99.23% |
White tea origins classified by DPC | 6 | 0.21 | 99.97% | 6.00 | 98.09% | 1.51 | 99.55% | 2.20 | 99.50% | 2.38 | 99.51% |
White tea origins classified by DPC | 7 | 0.16 | 99.99% | 4.39 | 98.61% | 1.13 | 99.69% | 1.70 | 99.70% | 1.63 | 99.71% |
White tea origins classified by DPC | 8 | 0.04 | 99.99% | 3.53 | 99.04% | 0.87 | 99.79% | 0.66 | 99.78% | 0.67 | 99.79% |
White tea origins classified by DPC | 9 | 0.03 | 99.99% | 2.80 | 99.38% | 0.55 | 99.86% | 0.53 | 99.84% | 0.61 | 99.86% |
White tea origins classified by DPC | 10 | 0.02 | 100.00% | 1.48 | 99.55% | 0.36 | 99.90% | 0.39 | 99.89% | 0.35 | 99.90% |
White tea origins classified by DPC | 11 | 0.01 | 100.00% | 1.27 | 99.71% | 0.23 | 99.93% | 0.21 | 99.92% | 0.17 | 99.92% |
White tea origins classified by DPC | 12 | 0.01 | 100.00% | 0.68 | 99.79% | 0.14 | 99.94% | 0.15 | 99.93% | 0.15 | 99.94% |
White tea origins classified by DPC | 13 | 0.00 | 100.00% | 0.48 | 99.85% | 0.10 | 99.96% | 0.13 | 99.95% | 0.12 | 99.96% |
White tea origins classified by DPC | 14 | 0.00 | 100.00% | 0.35 | 99.89% | 0.08 | 99.97% | 0.10 | 99.96% | 0.08 | 99.97% |
White tea origins classified by DPC | 15 | 0.00 | 100.00% | 0.26 | 99.92% | 0.05 | 99.97% | 0.07 | 99.97% | 0.07 | 99.97% |
White tea origins classified by DDFP or AFWT | 1 | 808.20 | 97.26% | 436.20 | 52.49% | 563.59 | 67.90% | 625.87 | 75.32% | 626.54 | 75.40% |
White tea origins classified by DDFP or AFWT | 2 | 13.37 | 98.87% | 209.90 | 77.75% | 204.63 | 92.56% | 141.17 | 92.30% | 140.95 | 92.36% |
White tea origins classified by DDFP or AFWT | 3 | 7.53 | 99.77% | 92.95 | 88.94% | 41.72 | 97.58% | 34.41 | 96.44% | 34.28 | 96.48% |
White tea origins classified by DDFP or AFWT | 4 | 1.04 | 99.90% | 58.57 | 95.98% | 10.56 | 98.86% | 13.20 | 98.03% | 12.71 | 98.01% |
White tea origins classified by DDFP or AFWT | 5 | 0.53 | 99.96% | 11.19 | 97.33% | 4.11 | 99.35% | 10.58 | 99.31% | 10.93 | 99.33% |
White tea origins classified by DDFP or AFWT | 6 | 0.16 | 99.98% | 6.75 | 98.14% | 1.98 | 99.59% | 1.88 | 99.53% | 1.75 | 99.54% |
White tea origins classified by DDFP or AFWT | 7 | 0.07 | 99.99% | 5.16 | 98.76% | 0.92 | 99.70% | 1.38 | 99.70% | 1.39 | 99.70% |
White tea origins classified by DDFP or AFWT | 8 | 0.05 | 99.99% | 3.39 | 99.17% | 0.72 | 99.78% | 0.74 | 99.79% | 0.84 | 99.81% |
White tea origins classified by DDFP or AFWT | 9 | 0.02 | 100.00% | 2.05 | 99.42% | 0.58 | 99.85% | 0.55 | 99.85% | 0.56 | 99.87% |
White tea origins classified by DDFP or AFWT | 10 | 0.01 | 100.00% | 1.36 | 99.58% | 0.34 | 99.89% | 0.30 | 99.89% | 0.26 | 99.90% |
White tea origins classified by DDFP or AFWT | 11 | 0.01 | 100.00% | 0.99 | 99.70% | 0.21 | 99.92% | 0.20 | 99.91% | 0.20 | 99.93% |
White tea origins classified by DDFP or AFWT | 12 | 0.00 | 100.00% | 0.66 | 99.78% | 0.16 | 99.94% | 0.19 | 99.94% | 0.14 | 99.95% |
White tea origins classified by DDFP or AFWT | 13 | 0.00 | 100.00% | 0.57 | 99.85% | 0.12 | 99.95% | 0.13 | 99.95% | 0.13 | 99.96% |
White tea origins classified by DDFP or AFWT | 14 | 0.00 | 100.00% | 0.31 | 99.89% | 0.08 | 99.96% | 0.10 | 99.97% | 0.07 | 99.97% |
White tea origins classified by DDFP or AFWT | 15 | 0.00 | 100.00% | 0.26 | 99.92% | 0.06 | 99.97% | 0.07 | 99.97% | 0.05 | 99.98% |
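The cumulative contribution columns in the table above can be reproduced from the eigenvalues with a running sum. The following is a minimal sketch, not the authors' code. It assumes the total variance equals 831, the number of spectral variables reported in the model tables, which would hold for standardized data; the function name `cumulative_contribution` is hypothetical. Under that assumption the OS, CWT, SNV and MSC columns match the reported percentages closely, while the Minmax column differs slightly.

```python
def cumulative_contribution(eigenvalues, total_variance):
    """Return cumulative contribution rates (%) for eigenvalues given in descending order."""
    running = 0.0
    rates = []
    for value in eigenvalues:
        running += value
        rates.append(round(100.0 * running / total_variance, 2))
    return rates


# First three OS eigenvalues from the DPC block of the table.
os_eigenvalues = [806.91, 15.94, 5.48]

# Assumption: total variance is 831 (one unit of variance per standardized variable).
rates = cumulative_contribution(os_eigenvalues, total_variance=831)
print(rates)
```

The printed values can be compared with the 97.10%, 99.02% and 99.68% entries reported for the first three OS components.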
Classification Model | Number of Samples Trained | Number of Samples Predicted | Number of Characteristics | RA (%) | AUC | Average RA (%) | Average AUC |
---|---|---|---|---|---|---|---|
DPC-OS-KNN | 434 | 145 | 831 | 73.10 | 0.62 | 80.10 | 0.72 |
DPC-CWT-KNN | 434 | 145 | 831 | 76.55 | 0.67 | | |
DPC-Minmax-KNN | 434 | 145 | 831 | 81.38 | 0.70 | | |
DPC-MSC-KNN | 434 | 145 | 831 | 81.38 | 0.71 | | |
DPC-SNV-KNN | 434 | 145 | 831 | 81.38 | 0.71 | | |
DPC-OS-PCA-KNN | 434 | 145 | 4 | 73.10 | 0.62 | | |
DPC-CWT-PCA-KNN | 434 | 145 | 11 | 76.55 | 0.67 | | |
DPC-Minmax-PCA-KNN | 434 | 145 | 7 | 82.07 | 0.71 | | |
DPC-MSC-PCA-KNN | 434 | 145 | 7 | 81.38 | 0.71 | | |
DPC-SNV-PCA-KNN | 434 | 145 | 7 | 81.38 | 0.71 | | |
DPC-OS-LDA-KNN | 434 | 145 | 6 | 84.14 | 0.82 | | |
DPC-CWT-LDA-KNN | 434 | 145 | 6 | 86.90 | 0.83 | | |
DPC-Minmax-LDA-KNN | 434 | 145 | 6 | 84.14 | 0.83 | | |
DPC-MSC-LDA-KNN | 434 | 145 | 6 | 85.52 | 0.81 | | |
DPC-SNV-LDA-KNN | 434 | 145 | 6 | 85.52 | 0.83 | | |
DPC-OS-SPA-KNN | 434 | 145 | 15 | 73.10 | 0.62 | | |
DPC-CWT-SPA-KNN | 434 | 145 | 13 | 75.86 | 0.64 | | |
DPC-Minmax-SPA-KNN | 434 | 145 | 13 | 77.24 | 0.68 | | |
DPC-MSC-SPA-KNN | 434 | 145 | 15 | 80.69 | 0.74 | | |
DPC-SNV-SPA-KNN | 434 | 145 | 11 | 80.69 | 0.71 | | |
DPC-OS-RF | 434 | 145 | 831 | 75.17 | 0.65 | 80.00 | 0.73 |
DPC-CWT-RF | 434 | 145 | 831 | 83.45 | 0.71 | | |
DPC-Minmax-RF | 434 | 145 | 831 | 80.00 | 0.68 | | |
DPC-MSC-RF | 434 | 145 | 831 | 83.45 | 0.78 | | |
DPC-SNV-RF | 434 | 145 | 831 | 82.76 | 0.77 | | |
DPC-OS-PCA-RF | 434 | 145 | 4 | 79.31 | 0.70 | | |
DPC-CWT-PCA-RF | 434 | 145 | 11 | 84.83 | 0.76 | | |
DPC-Minmax-PCA-RF | 434 | 145 | 7 | 86.90 | 0.82 | | |
DPC-MSC-PCA-RF | 434 | 145 | 7 | 83.45 | 0.80 | | |
DPC-SNV-PCA-RF | 434 | 145 | 7 | 84.83 | 0.81 | | |
DPC-OS-LDA-RF | 434 | 145 | 6 | 73.10 | 0.67 | | |
DPC-CWT-LDA-RF | 434 | 145 | 6 | 81.38 | 0.74 | | |
DPC-Minmax-LDA-RF | 434 | 145 | 6 | 68.97 | 0.70 | | |
DPC-MSC-LDA-RF | 434 | 145 | 6 | 77.24 | 0.70 | | |
DPC-SNV-LDA-RF | 434 | 145 | 6 | 71.72 | 0.68 | | |
DPC-OS-SPA-RF | 434 | 145 | 15 | 74.48 | 0.63 | | |
DPC-CWT-SPA-RF | 434 | 145 | 13 | 82.07 | 0.77 | | |
DPC-Minmax-SPA-RF | 434 | 145 | 13 | 80.00 | 0.70 | | |
DPC-MSC-SPA-RF | 434 | 145 | 15 | 82.07 | 0.74 | | |
DPC-SNV-SPA-RF | 434 | 145 | 11 | 84.83 | 0.76 | | |
DPC-OS-SVM | 434 | 145 | 831 | 75.86 | 0.63 | 74.38 | 0.61 |
DPC-CWT-SVM | 434 | 145 | 831 | 77.24 | 0.64 | | |
DPC-Minmax-SVM | 434 | 145 | 831 | 75.17 | 0.64 | | |
DPC-MSC-SVM | 434 | 145 | 831 | 73.10 | 0.59 | | |
DPC-SNV-SVM | 434 | 145 | 831 | 82.07 | 0.74 | | |
DPC-OS-PCA-SVM | 434 | 145 | 4 | 75.86 | 0.63 | | |
DPC-CWT-PCA-SVM | 434 | 145 | 11 | 77.24 | 0.64 | | |
DPC-Minmax-PCA-SVM | 434 | 145 | 7 | 75.17 | 0.64 | | |
DPC-MSC-PCA-SVM | 434 | 145 | 7 | 73.10 | 0.59 | | |
DPC-SNV-PCA-SVM | 434 | 145 | 7 | 82.07 | 0.73 | | |
DPC-OS-LDA-SVM | 434 | 145 | 6 | 71.72 | 0.57 | | |
DPC-CWT-LDA-SVM | 434 | 145 | 6 | 77.93 | 0.67 | | |
DPC-Minmax-LDA-SVM | 434 | 145 | 6 | 73.10 | 0.58 | | |
DPC-MSC-LDA-SVM | 434 | 145 | 6 | 75.17 | 0.63 | | |
DPC-SNV-LDA-SVM | 434 | 145 | 6 | 72.41 | 0.61 | | |
DPC-OS-SPA-SVM | 434 | 145 | 15 | 70.34 | 0.56 | | |
DPC-CWT-SPA-SVM | 434 | 145 | 13 | 68.97 | 0.54 | | |
DPC-Minmax-SPA-SVM | 434 | 145 | 13 | 71.72 | 0.58 | | |
DPC-MSC-SPA-SVM | 434 | 145 | 15 | 66.90 | 0.50 | | |
DPC-SNV-SPA-SVM | 434 | 145 | 11 | 72.41 | 0.58 | | |
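The Average RA (%) column in the DPC table above appears to be the unweighted mean of the 20 RA values that share one classifier. The short check below illustrates this for the KNN group. It is only a sketch under that assumption; the variable names are illustrative and the values are copied from the table.

```python
from statistics import mean

# RA (%) of the 20 DPC KNN models, in table order:
# raw spectra, then PCA, LDA and SPA features (OS, CWT, Minmax, MSC, SNV each).
knn_ra = [
    73.10, 76.55, 81.38, 81.38, 81.38,
    73.10, 76.55, 82.07, 81.38, 81.38,
    84.14, 86.90, 84.14, 85.52, 85.52,
    73.10, 75.86, 77.24, 80.69, 80.69,
]

# Assumption: the group average is a plain arithmetic mean over all 20 models.
average_ra = round(mean(knn_ra), 2)
print(average_ra)
```

The result can be compared with the 80.10 reported for the KNN group.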
Classification Model | Number of Samples Trained | Number of Samples Predicted | Number of Characteristics | RA (%) | AUC | Average RA (%) | Average AUC |
---|---|---|---|---|---|---|---|
DDFP-OS-KNN | 291 | 98 | 831 | 54.08 | 0.62 | 71.48 | 0.74 |
DDFP-CWT-KNN | 291 | 98 | 831 | 65.31 | 0.68 | | |
DDFP-Minmax-KNN | 291 | 98 | 831 | 70.41 | 0.73 | | |
DDFP-MSC-KNN | 291 | 98 | 831 | 66.33 | 0.69 | | |
DDFP-SNV-KNN | 291 | 98 | 831 | 66.33 | 0.69 | | |
DDFP-OS-PCA-KNN | 291 | 98 | 4 | 53.06 | 0.61 | | |
DDFP-CWT-PCA-KNN | 291 | 98 | 10 | 65.31 | 0.68 | | |
DDFP-Minmax-PCA-KNN | 291 | 98 | 6 | 69.39 | 0.73 | | |
DDFP-MSC-PCA-KNN | 291 | 98 | 7 | 65.31 | 0.68 | | |
DDFP-SNV-PCA-KNN | 291 | 98 | 7 | 65.31 | 0.68 | | |
DDFP-OS-LDA-KNN | 291 | 98 | 5 | 92.86 | 0.92 | | |
DDFP-CWT-LDA-KNN | 291 | 98 | 5 | 82.65 | 0.85 | | |
DDFP-Minmax-LDA-KNN | 291 | 98 | 5 | 88.78 | 0.89 | | |
DDFP-MSC-LDA-KNN | 291 | 98 | 5 | 90.82 | 0.91 | | |
DDFP-SNV-LDA-KNN | 291 | 98 | 5 | 90.82 | 0.91 | | |
DDFP-OS-SPA-KNN | 291 | 98 | 13 | 54.08 | 0.62 | | |
DDFP-CWT-SPA-KNN | 291 | 98 | 19 | 75.51 | 0.77 | | |
DDFP-Minmax-SPA-KNN | 291 | 98 | 12 | 67.35 | 0.70 | | |
DDFP-MSC-SPA-KNN | 291 | 98 | 14 | 74.49 | 0.75 | | |
DDFP-SNV-SPA-KNN | 291 | 98 | 13 | 71.43 | 0.74 | | |
DDFP-OS-RF | 291 | 98 | 831 | 59.18 | 0.64 | 75.56 | 0.78 |
DDFP-CWT-RF | 291 | 98 | 831 | 72.45 | 0.75 | | |
DDFP-Minmax-RF | 291 | 98 | 831 | 75.51 | 0.80 | | |
DDFP-MSC-RF | 291 | 98 | 831 | 76.53 | 0.80 | | |
DDFP-SNV-RF | 291 | 98 | 831 | 76.53 | 0.80 | | |
DDFP-OS-PCA-RF | 291 | 98 | 4 | 66.33 | 0.68 | | |
DDFP-CWT-PCA-RF | 291 | 98 | 10 | 75.51 | 0.78 | | |
DDFP-Minmax-PCA-RF | 291 | 98 | 6 | 77.55 | 0.79 | | |
DDFP-MSC-PCA-RF | 291 | 98 | 7 | 76.53 | 0.79 | | |
DDFP-SNV-PCA-RF | 291 | 98 | 7 | 76.53 | 0.80 | | |
DDFP-OS-LDA-RF | 291 | 98 | 5 | 85.71 | 0.89 | | |
DDFP-CWT-LDA-RF | 291 | 98 | 5 | 80.61 | 0.81 | | |
DDFP-Minmax-LDA-RF | 291 | 98 | 5 | 85.71 | 0.86 | | |
DDFP-MSC-LDA-RF | 291 | 98 | 5 | 86.73 | 0.88 | | |
DDFP-SNV-LDA-RF | 291 | 98 | 5 | 81.63 | 0.86 | | |
DDFP-OS-SPA-RF | 291 | 98 | 13 | 54.08 | 0.60 | | |
DDFP-CWT-SPA-RF | 291 | 98 | 19 | 76.53 | 0.80 | | |
DDFP-Minmax-SPA-RF | 291 | 98 | 12 | 74.49 | 0.77 | | |
DDFP-MSC-SPA-RF | 291 | 98 | 14 | 77.55 | 0.79 | | |
DDFP-SNV-SPA-RF | 291 | 98 | 13 | 75.51 | 0.77 | | |
DDFP-OS-SVM | 291 | 98 | 831 | 61.22 | 0.61 | 70.34 | 0.68 |
DDFP-CWT-SVM | 291 | 98 | 831 | 69.39 | 0.69 | | |
DDFP-Minmax-SVM | 291 | 98 | 831 | 73.47 | 0.77 | | |
DDFP-MSC-SVM | 291 | 98 | 831 | 58.16 | 0.58 | | |
DDFP-SNV-SVM | 291 | 98 | 831 | 76.53 | 0.79 | | |
DDFP-OS-PCA-SVM | 291 | 98 | 4 | 75.86 | 0.63 | | |
DDFP-CWT-PCA-SVM | 291 | 98 | 10 | 77.24 | 0.64 | | |
DDFP-Minmax-PCA-SVM | 291 | 98 | 6 | 75.17 | 0.64 | | |
DDFP-MSC-PCA-SVM | 291 | 98 | 7 | 73.10 | 0.59 | | |
DDFP-SNV-PCA-SVM | 291 | 98 | 7 | 82.07 | 0.73 | | |
DDFP-OS-LDA-SVM | 291 | 98 | 5 | 89.80 | 0.90 | | |
DDFP-CWT-LDA-SVM | 291 | 98 | 5 | 80.61 | 0.82 | | |
DDFP-Minmax-LDA-SVM | 291 | 98 | 5 | 81.63 | 0.83 | | |
DDFP-MSC-LDA-SVM | 291 | 98 | 5 | 86.73 | 0.86 | | |
DDFP-SNV-LDA-SVM | 291 | 98 | 5 | 86.73 | 0.86 | | |
DDFP-OS-SPA-SVM | 291 | 98 | 13 | 50.00 | 0.50 | | |
DDFP-CWT-SPA-SVM | 291 | 98 | 19 | 50.00 | 0.50 | | |
DDFP-Minmax-SPA-SVM | 291 | 98 | 12 | 54.08 | 0.53 | | |
DDFP-MSC-SPA-SVM | 291 | 98 | 14 | 50.00 | 0.50 | | |
DDFP-SNV-SPA-SVM | 291 | 98 | 13 | 55.10 | 0.55 | | |
Classification Model | Number of Samples Trained | Number of Samples Predicted | Number of Characteristics | RA (%) | AUC | Average RA (%) | Average AUC |
---|---|---|---|---|---|---|---|
AFWT-OS-KNN | 291 | 98 | 831 | 75.51 | 0.76 | 85.10 | 0.83 |
AFWT-CWT-KNN | 291 | 98 | 831 | 77.55 | 0.78 | | |
AFWT-Minmax-KNN | 291 | 98 | 831 | 88.78 | 0.89 | | |
AFWT-MSC-KNN | 291 | 98 | 831 | 89.80 | 0.90 | | |
AFWT-SNV-KNN | 291 | 98 | 831 | 89.80 | 0.90 | | |
AFWT-OS-PCA-KNN | 291 | 98 | 4 | 73.47 | 0.74 | | |
AFWT-CWT-PCA-KNN | 291 | 98 | 10 | 77.55 | 0.78 | | |
AFWT-Minmax-PCA-KNN | 291 | 98 | 6 | 88.78 | 0.89 | | |
AFWT-MSC-PCA-KNN | 291 | 98 | 7 | 88.78 | 0.89 | | |
AFWT-SNV-PCA-KNN | 291 | 98 | 7 | 88.78 | 0.89 | | |
AFWT-OS-LDA-KNN | 291 | 98 | 1 | 97.96 | 0.98 | | |
AFWT-CWT-LDA-KNN | 291 | 98 | 1 | 94.90 | 0.95 | | |
AFWT-Minmax-LDA-KNN | 291 | 98 | 1 | 92.86 | 0.93 | | |
AFWT-MSC-LDA-KNN | 291 | 98 | 1 | 94.90 | 0.95 | | |
AFWT-SNV-LDA-KNN | 291 | 98 | 1 | 94.90 | 0.95 | | |
AFWT-OS-SPA-KNN | 291 | 98 | 11 | 73.10 | 0.62 | | |
AFWT-CWT-SPA-KNN | 291 | 98 | 13 | 75.86 | 0.64 | | |
AFWT-Minmax-SPA-KNN | 291 | 98 | 10 | 77.24 | 0.68 | | |
AFWT-MSC-SPA-KNN | 291 | 98 | 12 | 80.69 | 0.74 | | |
AFWT-SNV-SPA-KNN | 291 | 98 | 13 | 80.69 | 0.71 | | |
AFWT-OS-RF | 291 | 98 | 831 | 76.53 | 0.77 | 89.21 | 0.87 |
AFWT-CWT-RF | 291 | 98 | 831 | 90.82 | 0.91 | | |
AFWT-Minmax-RF | 291 | 98 | 831 | 91.84 | 0.92 | | |
AFWT-MSC-RF | 291 | 98 | 831 | 93.88 | 0.94 | | |
AFWT-SNV-RF | 291 | 98 | 831 | 95.92 | 0.96 | | |
AFWT-OS-PCA-RF | 291 | 98 | 4 | 82.65 | 0.83 | | |
AFWT-CWT-PCA-RF | 291 | 98 | 10 | 92.86 | 0.93 | | |
AFWT-Minmax-PCA-RF | 291 | 98 | 6 | 91.84 | 0.92 | | |
AFWT-MSC-PCA-RF | 291 | 98 | 7 | 94.90 | 0.95 | | |
AFWT-SNV-PCA-RF | 291 | 98 | 7 | 93.88 | 0.94 | | |
AFWT-OS-LDA-RF | 291 | 98 | 1 | 97.96 | 0.98 | | |
AFWT-CWT-LDA-RF | 291 | 98 | 1 | 94.90 | 0.95 | | |
AFWT-Minmax-LDA-RF | 291 | 98 | 1 | 92.86 | 0.93 | | |
AFWT-MSC-LDA-RF | 291 | 98 | 1 | 94.90 | 0.95 | | |
AFWT-SNV-LDA-RF | 291 | 98 | 1 | 94.90 | 0.95 | | |
AFWT-OS-SPA-RF | 291 | 98 | 11 | 74.49 | 0.63 | | |
AFWT-CWT-SPA-RF | 291 | 98 | 13 | 82.07 | 0.77 | | |
AFWT-Minmax-SPA-RF | 291 | 98 | 10 | 80.00 | 0.70 | | |
AFWT-MSC-SPA-RF | 291 | 98 | 12 | 82.07 | 0.74 | | |
AFWT-SNV-SPA-RF | 291 | 98 | 13 | 84.83 | 0.76 | | |
AFWT-OS-SVM | 291 | 98 | 831 | 84.69 | 0.85 | 82.85 | 0.80 |
AFWT-CWT-SVM | 291 | 98 | 831 | 88.78 | 0.89 | | |
AFWT-Minmax-SVM | 291 | 98 | 831 | 90.82 | 0.91 | | |
AFWT-MSC-SVM | 291 | 98 | 831 | 89.80 | 0.90 | | |
AFWT-SNV-SVM | 291 | 98 | 831 | 94.90 | 0.95 | | |
AFWT-OS-PCA-SVM | 291 | 98 | 4 | 75.86 | 0.63 | | |
AFWT-CWT-PCA-SVM | 291 | 98 | 10 | 77.24 | 0.64 | | |
AFWT-Minmax-PCA-SVM | 291 | 98 | 6 | 75.17 | 0.64 | | |
AFWT-MSC-PCA-SVM | 291 | 98 | 7 | 73.10 | 0.59 | | |
AFWT-SNV-PCA-SVM | 291 | 98 | 7 | 82.07 | 0.73 | | |
AFWT-OS-LDA-SVM | 291 | 98 | 1 | 97.96 | 0.98 | | |
AFWT-CWT-LDA-SVM | 291 | 98 | 1 | 94.90 | 0.95 | | |
AFWT-Minmax-LDA-SVM | 291 | 98 | 1 | 92.86 | 0.93 | | |
AFWT-MSC-LDA-SVM | 291 | 98 | 1 | 94.90 | 0.95 | | |
AFWT-SNV-LDA-SVM | 291 | 98 | 1 | 94.90 | 0.95 | | |
AFWT-OS-SPA-SVM | 291 | 98 | 11 | 66.33 | 0.66 | | |
AFWT-CWT-SPA-SVM | 291 | 98 | 13 | 65.31 | 0.66 | | |
AFWT-Minmax-SPA-SVM | 291 | 98 | 10 | 81.63 | 0.82 | | |
AFWT-MSC-SPA-SVM | 291 | 98 | 12 | 51.02 | 0.50 | | |
AFWT-SNV-SPA-SVM | 291 | 98 | 13 | 84.70 | 0.85 | | |
Classification Model | Training Set RA (%) | Validation Set RA (%) | Four Times Cross-Validation RA (%) |
---|---|---|---|
AFWT-OS-LDA-KNN | 100.00 | 97.96 | 97.96 |
AFWT-OS-LDA-RF | 100.00 | 97.96 | 95.87 |
AFWT-OS-LDA-SVM | 100.00 | 97.96 | 96.96 |
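The cross-validation column above is labelled "four times cross-validation". One plausible reading is a four-fold scheme, where each fold serves once as the validation part while the other three are used for training. The sketch below shows only how such index splits could be generated. The four-fold interpretation, the contiguous (unshuffled) folds and the function names are all assumptions, not details taken from the study.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validation_splits(n_samples, k):
    """Yield (train_indices, validation_indices), using each fold once for validation."""
    folds = k_fold_indices(n_samples, k)
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Toy example with 10 samples and k = 4 (hypothetical sizes, chosen for readability).
splits = list(cross_validation_splits(n_samples=10, k=4))
for train, validation in splits:
    print(len(train), validation)
```

In practice the sample order would usually be shuffled, or stratified by class, before splitting. The cross-validated RA would then be the mean accuracy over the four validation folds.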
Optimal Models | Parameters | Number of Characteristics | RA (%) | AUC |
---|---|---|---|---|
DPC-CWT-LDA-KNN | k = 8 | 6 | 88.97 | 0.85 |
DDFP-OS-LDA-KNN | k = 1 | 5 | 93.88 | 0.93 |
AFWT-OS-LDA-KNN | k = 3 | 1 | 97.96 | 0.98 |
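The AFWT-OS-LDA-KNN row above combines a single LDA-derived characteristic with k = 3. As an illustration of what KNN does in that one-dimensional setting, here is a minimal sketch of majority voting among the three nearest training points. The function `knn_predict`, the scores and the class labels "A" and "B" are hypothetical; none of the numbers are data from the study.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Classify a 1-D query by majority vote among its k nearest training points."""
    by_distance = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 1-D discriminant scores for two origin classes (not data from the study).
train_x = [-2.1, -1.8, -1.5, -0.2, 1.4, 1.9, 2.2]
train_y = ["A", "A", "A", "A", "B", "B", "B"]

print(knn_predict(train_x, train_y, query=-1.0, k=3))
print(knn_predict(train_x, train_y, query=1.5, k=3))
```

With an odd k and two classes, the vote cannot tie, which is one common reason to prefer odd values of k in binary problems.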
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, L.; Dai, H.; Zhang, J.; Zheng, Z.; Song, B.; Chen, J.; Lin, G.; Chen, L.; Sun, W.; Huang, Y. A Study on Origin Traceability of White Tea (White Peony) Based on Near-Infrared Spectroscopy and Machine Learning Algorithms. Foods 2023, 12, 499. https://doi.org/10.3390/foods12030499