Consensual Regression of Lasso-Sparse PLS models for Near-Infrared Spectra of Food
Abstract
1. Introduction
2. Theory and Algorithm
2.1. Theory of Lasso
2.2. Deviation Weight Fusion (DW-F)
2.3. Estimation of Model’s Performance
2.4. Framework of Proposed Fusion Strategy
- (1) Samples with spectral and attribute data are randomly divided into calibration and prediction subsets at a ratio of 2:1.
- (2) On the calibration set, a series of PLS models is built between the spectra and each attribute, one model per number of latent variables (LVs), with the number of LVs increasing up to a maximum of N. N is set to 20 in this work.
- (3) The N PLS models are taken as candidate member models, and the Lasso is used to screen them. The K PLS models it retains serve as the member models in the following steps.
- (4) The K member models are fused with each of three fusion strategies, giving three corresponding fusion models.
- (5) The performance parameters of these fusion models are evaluated at the cross-validation and prediction stages so that the strategies can be compared.
2.5. Software
3. Experimental Data
4. Results and Discussions
4.1. Performance of General PLS
4.2. Sparse Member Models by Lasso
4.3. Fusion of Member Models and Comparison
4.4. Comparison and Discussions
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Nobari Moghaddam, H.; Tamiji, Z.; Akbari Lakeh, M.; Khoshayand, M.R.; Haji Mahmoodi, M. Multivariate analysis of food fraud: A review of NIR based instruments in tandem with chemometrics. J. Food Compos. Anal. 2022, 107, 104343.
- Nagy, M.M.; Wang, S.; Farag, M.A. Quality analysis and authentication of nutraceuticals using near IR (NIR) spectroscopy: A comprehensive review of novel trends and applications. Trends Food Sci. Technol. 2022, 123, 290–309.
- Zareef, M.; Chen, Q.; Hassan, M.M.; Arslan, M.; Hashim, M.M.; Ahmad, W.; Kutsanedzie, F.Y.H.; Agyekum, A.A. An Overview on the Applications of Typical Non-linear Algorithms Coupled With NIR Spectroscopy in Food Analysis. Food Eng. Rev. 2020, 12, 173–190.
- Walsh, K.B.; McGlone, V.A.; Han, D.H. The uses of near infra-red spectroscopy in postharvest decision support: A review. Postharvest Biol. Technol. 2020, 163, 111139.
- Ding, L.; Yuan, L.-m.; Sun, Y.; Zhang, X.; Li, J.; Yan, Z. Rapid Assessment of Exercise State through Athlete’s Urine Using Temperature-Dependent NIRS Technology. J. Anal. Methods Chem. 2020, 2020, 8828213.
- Baiano, A. Applications of hyperspectral imaging for quality assessment of liquid based and semi-liquid food products: A review. J. Food Eng. 2017, 214, 10–15.
- Lohumi, S.; Lee, S.; Lee, H.; Cho, B.-K. A review of vibrational spectroscopic techniques for the detection of food authenticity and adulteration. Trends Food Sci. Technol. 2015, 46, 85–98.
- Nicolai, B.M.; Lotze, E.; Peirs, A.; Scheerlinck, N.; Theron, K.I. Non-destructive measurement of bitter pit in apple fruit using NIR hyperspectral imaging. Postharvest Biol. Technol. 2006, 40, 1–6.
- Monnier, G.F. A review of infrared spectroscopy in microarchaeology: Methods, applications, and recent trends. J. Archaeol. Sci. Rep. 2018, 18, 806–823.
- Xiaobo, Z.; Jiewen, Z.; Povey, M.J.W.; Holmes, M.; Hanpin, M. Variables selection methods in near-infrared spectroscopy. Anal. Chim. Acta 2010, 667, 14–32.
- Pasquini, C. Near infrared spectroscopy: A mature analytical technique with new perspectives—A review. Anal. Chim. Acta 2018, 1026, 8–36.
- Yun, Y.-H.; Li, H.-D.; Deng, B.-C.; Cao, D.-S. An overview of variable selection methods in multivariate analysis of near-infrared spectra. TrAC Trends Anal. Chem. 2019, 113, 102–115.
- Wang, H.-P.; Chen, P.; Dai, J.-W.; Liu, D.; Li, J.-Y.; Xu, Y.-P.; Chu, X.-L. Recent advances of chemometric calibration methods in modern spectroscopy: Algorithms, strategy, and related issues. TrAC Trends Anal. Chem. 2022, 153, 116648.
- Nicolai, B.M.; Beullens, K.; Bobelyn, E.; Peirs, A.; Saeys, W.; Theron, K.I.; Lammertyn, J. Nondestructive measurement of fruit and vegetable quality by means of NIR spectroscopy: A review. Postharvest Biol. Technol. 2007, 46, 99–118.
- Yuan, L.-m.; Mao, F.; Huang, G.; Chen, X.; Wu, D.; Li, S.; Zhou, X.; Jiang, Q.; Lin, D.; He, R. Models fused with successive CARS-PLS for measurement of the soluble solids content of Chinese bayberry by vis-NIRS technology. Postharvest Biol. Technol. 2020, 169, 111308.
- Beć, K.B.; Grabska, J.; Huck, C.W. In silico NIR spectroscopy—A review. Molecular fingerprint, interpretation of calibration models, understanding of matrix effects and instrumental difference. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2022, 279, 121438.
- Liu, K.; Chen, X.; Li, L.; Chen, H.; Ruan, X.; Liu, W. A consensus successive projections algorithm—Multiple linear regression method for analyzing near infrared spectra. Anal. Chim. Acta 2015, 858, 16–23.
- Singh, M.; Singh, R.; Ross, A. A comprehensive overview of biometric fusion. Inf. Fusion 2019, 52, 187–205.
- Modak, S.K.S.; Jha, V.K. Multibiometric fusion strategy and its applications: A review. Inf. Fusion 2019, 49, 174–204.
- Li, Y.; Xiong, Y.; Min, S. Data fusion strategy in quantitative analysis of spectroscopy relevant to olive oil adulteration. Vib. Spectrosc. 2019, 101, 20–27.
- Barbosa, C.D.; Baqueta, M.R.; Rodrigues Santos, W.C.; Gomes, D.; Alvarenga, V.O.; Teixeira, P.; Albano, H.; Rosa, C.A.; Valderrama, P.; Lacerda, I.C.A. Data fusion of UPLC data, NIR spectra and physicochemical parameters with chemometrics as an alternative to evaluating kombucha fermentation. LWT 2020, 133, 109875.
- Wang, X.; Feng, H.; Chen, T.; Zhao, S.; Zhang, J.; Zhang, X. Gas sensor technologies and mathematical modelling for quality sensing in fruit and vegetable cold chains: A review. Trends Food Sci. Technol. 2021, 110, 483–492.
- Ye, P.; Ji, G.; Yuan, L.-M.; Li, L.; Chen, X.; Karimidehcheshmeh, F.; Chen, X.; Huang, G. A Sparse Classification Based on a Linear Regression Method for Spectral Recognition. Appl. Sci. 2019, 9, 2053.
- Friedman, J.; Hastie, T.; Tibshirani, R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J. Stat. Softw. 2010, 33, 1–22.
- Tibshirani, R. Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. 1996, 58, 267–288.
- Norgaard, L.; Saudland, A.; Wagner, J.; Nielsen, J.P.; Munck, L.; Engelsen, S.B. Interval partial least-squares regression (iPLS): A comparative chemometric study with an example from near-infrared spectroscopy. Appl. Spectrosc. 2000, 54, 413–419.
- Yuan, L.-m.; Cai, J.-r.; Sun, L.; Han, E.; Ernest, T. Nondestructive Measurement of Soluble Solids Content in Apples by a Portable Fruit Analyzer. Food Anal. Methods 2016, 9, 785–794.
- Christensen, J.; Nørgaard, L.; Heimdal, H.; Pedersen, J.G.; Engelsen, S.B. Rapid Spectroscopic Analysis of Marzipan—Comparative Instrumentation. J. Near Infrared Spectrosc. 2004, 12, 63–75.
- Poerio, D.V.; Brown, S.D. Stacked interval sparse partial least squares regression analysis. Chemom. Intell. Lab. Syst. 2017, 166, 49–60.
- Yuan, L.-M.; Mao, F.; Chen, X.; Li, L.; Huang, G. Non-invasive measurements of ‘Yunhe’ pears by vis-NIRS technology coupled with deviation fusion modeling approach. Postharvest Biol. Technol. 2020, 160, 111067.

Spectral Data | Attribute | Number of Calibration/Prediction | Spectral Range/nm | Bands | Range | Mean ± SD | CV |
---|---|---|---|---|---|---|---|
Corn | moisture | 53/27 | 1100~2498 | 700 | 9.377~10.993 | 10.234 ± 0.38 | 0.0372 |
Corn | oil | 53/27 | 1100~2498 | 700 | 3.088~3.832 | 3.498 ± 0.177 | 0.0506 |
Corn | protein | 53/27 | 1100~2498 | 700 | 7.654~9.711 | 8.668 ± 0.497 | 0.0575 |
Corn | starch | 53/27 | 1100~2498 | 700 | 62.826~66.472 | 64.695 ± 0.821 | 0.0127 |
Apple [27] | SSC | 90/44 | 550~985 | 2040 | 9.75~15.45 | 12.24 ± 1.141 | 0.093 |
Marzipan [28] | moisture | 22/10 | 450~2448 | 1000 | 6.8~18.6 | 13.57 ± 3.664 | 0.270 |
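
The CV column in the table above appears to be the coefficient of variation, i.e. the standard deviation divided by the mean. That reading is an assumption inferred from the printed numbers, not a definition given in this outline. The short check below recomputes it from the Mean ± SD column; the function name `coefficient_of_variation` is hypothetical.

```python
def coefficient_of_variation(mean, sd):
    """Unitless ratio of standard deviation to mean."""
    return sd / mean


# (mean, SD) pairs copied from the Mean ± SD column of the table.
rows = {
    "corn oil": (3.498, 0.177),
    "corn starch": (64.695, 0.821),
    "apple SSC": (12.24, 1.141),
    "marzipan moisture": (13.57, 3.664),
}

if __name__ == "__main__":
    for name, (mean, sd) in rows.items():
        print(f"{name}: CV = {coefficient_of_variation(mean, sd):.4f}")
```

For corn moisture and protein the recomputed ratio differs from the printed CV in the last digit, which would be consistent with the printed SD values being rounded.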

Modeling Method | Parameter | Corn: Moisture | Corn: Oil | Corn: Protein | Corn: Starch | Apple: SSC | Marzipan: Moisture |
---|---|---|---|---|---|---|---|
Optimal PLS | Best LVs | 6 | 8 | 14 | 15 | 15 | 8 |
Optimal PLS | RMSECV | 0.0371 | 0.0675 | 0.1078 | 0.2198 | 0.5153 | 0.6613 |
Optimal PLS | Rcv | 0.968 | 0.916 | 0.968 | 0.954 | 0.892 | 0.972 |
Optimal PLS | RMSEP | 0.0375 | 0.0760 | 0.0977 | 0.1716 | 0.5651 | 1.1169 |
Optimal PLS | Rp | 0.978 | 0.899 | 0.983 | 0.974 | 0.873 | 0.928 |
Number after Lasso sparse | | 15 | 2 | 12 | 7 | 9 | 6 |
PLS-F | RMSECV | 0.0085 | 0.0334 | 0.1065 | 0.1718 | 0.4908 | 0.1278 |
PLS-F | Rcv | 0.991 | 0.965 | 0.978 | 0.973 | 0.903 | 0.996 |
PLS-F | RMSEP | 0.0076 | 0.0404 | 0.0846 | 0.1443 | 0.4863 | 1.134 |
PLS-F | Rp | 0.989 | 0.952 | 0.992 | 0.986 | 0.916 | 0.916 |
RR-F | RMSECV | 0.0131 | 0.0357 | 0.0991 | 0.1697 | 0.4865 | 0.122 |
RR-F | Rcv | 0.978 | 0.959 | 0.984 | 0.977 | 0.918 | 0.996 |
RR-F | RMSEP | 0.0141 | 0.0404 | 0.0789 | 0.1482 | 0.481 | 0.9712 |
RR-F | Rp | 0.971 | 0.951 | 0.989 | 0.984 | 0.922 | 0.938 |
DW-F | RMSECV | 0.008 | 0.0419 | 0.1015 | 0.1711 | 0.4842 | 0.2667 |
DW-F | Rcv | 0.992 | 0.946 | 0.981 | 0.974 | 0.918 | 0.989 |
DW-F | RMSEP | 0.0065 | 0.0369 | 0.0867 | 0.146 | 0.4663 | 0.7948 |
DW-F | Rp | 0.993 | 0.962 | 0.988 | 0.986 | 0.937 | 0.957 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yuan, L.-M.; Yang, X.; Fu, X.; Yang, J.; Chen, X.; Huang, G.; Chen, X.; Li, L.; Shi, W. Consensual Regression of Lasso-Sparse PLS models for Near-Infrared Spectra of Food. Agriculture 2022, 12, 1804. https://doi.org/10.3390/agriculture12111804