Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil
Abstract
1. Introduction
2. Results
2.1. Determination of Titanium
2.2. Determination of Iron
2.3. Determination of Aluminum, Beryllium, Gadolinium and Yttrium
3. Materials and Methods
3.1. Study Area
3.2. Soil Analysis and Parameters of Interest
3.3. Spectral Analysis and Database
3.4. Chemometric Methods
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Sample Availability
Element | Minimum (mg kg⁻¹) | Maximum (mg kg⁻¹) | Mean (mg kg⁻¹) | SD (mg kg⁻¹) | RSD (%) |
---|---|---|---|---|---|
Ti | 1.60 × 10³ | 10.4 × 10³ | 4.66 × 10³ | 2.01 × 10³ | 43.08 |
Fe | 9.3 × 10³ | 69.0 × 10³ | 30.6 × 10³ | 13.0 × 10³ | 42.41 |
Al | 47.1 × 10³ | 157.8 × 10³ | 91.2 × 10³ | 27.7 × 10³ | 30.42 |
Be | 0.35 | 3.55 | 2.02 | 0.62 | 30.69 |
Gd | 2.44 | 15.24 | 5.60 | 1.62 | 28.97 |
Y | 6.82 | 35.80 | 14.77 | 4.03 | 27.29 |
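The RSD column can be reproduced from the SD and mean columns. The following is a minimal Python sketch, assuming RSD is the usual relative standard deviation, 100 × SD / mean; the function name `rsd_percent` is illustrative and does not come from the article.

```python
def rsd_percent(mean_value: float, sd: float) -> float:
    """Relative standard deviation in percent: 100 * SD / mean."""
    if mean_value == 0:
        raise ValueError("RSD is undefined for a zero mean")
    return 100.0 * sd / mean_value


# Mean and SD as tabulated above (mg kg^-1); these inputs are already rounded.
tabulated = {
    "Be": (2.02, 0.62),
    "Gd": (5.60, 1.62),
    "Y": (14.77, 4.03),
}

for element, (mean_value, sd) in tabulated.items():
    print(f"{element}: RSD = {rsd_percent(mean_value, sd):.2f} %")
```

Because the tabulated mean and SD are rounded, recomputed values can differ from the listed RSD in the last digits (for Ti, 2.01 / 4.66 gives about 43.1 % against the tabulated 43.08 %).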
Preprocessing | MSC | SNV | Baseline (Linear) | Baseline (Offset) | Baseline (Offset + Linear) | |
---|---|---|---|---|---|---|
Model | iSPA-PLS | FFiPLS | FFiPLS | FFiPLS | PLS | FFiPLS |
LV | 21 | 10 | 11 | 6 | 17 | 12 |
NV | 2550 | 750 | 1500 | 1350 | 3001 | 1350 |
RMSEC (mg kg⁻¹) | 0.36 × 10³ | 0.92 × 10³ | 0.83 × 10³ | 1.02 × 10³ | 0.61 × 10³ | 0.80 × 10³ |
R²_cal | 0.9792 | 0.8381 | 0.8743 | 0.7876 | 0.9353 | 0.8802 |
RMSEP (mg kg⁻¹) | 0.73 × 10³ | 0.62 × 10³ | 0.87 × 10³ | 0.77 × 10³ | 0.79 × 10³ | 0.80 × 10³ |
R²_pred | 0.7097 | 0.7862 | 0.5655 | 0.6725 | 0.6881 | 0.7055 |
Bias_pred (mg kg⁻¹) | 0.28 × 10³ | 0.27 × 10³ | 0.34 × 10³ | 0.19 × 10³ | 0.05 × 10³ | 0.16 × 10³ |
REP (%) | 18.19 | 15.6 | 21.25 | 19.76 | 19.99 | 19.96 |
RPD_pred | 1.85 | 2.16 | 1.52 | 1.75 | 1.79 | 1.84 |
SDV | 0.89 × 10³ | 0.79 × 10³ | 1.07 × 10³ | 0.85 × 10³ | 0.81 × 10³ | 0.86 × 10³ |
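The prediction figures of merit in this table can be computed from reference and predicted concentrations. The sketch below uses definitions that are common in the NIR chemometrics literature; it is an assumption that the article uses exactly these (in particular the sign convention for bias, and the sample standard deviation in RPD). The function name and the toy data are illustrative only.

```python
import math
from statistics import mean, stdev


def figures_of_merit(y_ref, y_pred):
    """Common prediction-set statistics for a calibration model.

    Assumed definitions (not confirmed by the article):
      RMSEP = sqrt(mean((y_pred - y_ref)^2))
      bias  = mean(y_pred - y_ref)
      R2    = 1 - SS_res / SS_tot
      REP   = 100 * RMSEP / mean(y_ref)
      RPD   = sample SD of y_ref / RMSEP
    """
    if len(y_ref) != len(y_pred) or len(y_ref) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    residuals = [p - r for r, p in zip(y_ref, y_pred)]
    ss_res = sum(e * e for e in residuals)
    y_bar = mean(y_ref)
    ss_tot = sum((r - y_bar) ** 2 for r in y_ref)
    rmsep = math.sqrt(ss_res / len(y_ref))
    return {
        "rmsep": rmsep,
        "bias": sum(residuals) / len(residuals),
        "r2": 1.0 - ss_res / ss_tot,
        "rep_percent": 100.0 * rmsep / y_bar,
        "rpd": stdev(y_ref) / rmsep,
    }


# Toy values, purely illustrative (not data from the study).
metrics = figures_of_merit([2.0, 4.0, 6.0, 8.0], [3.0, 3.0, 7.0, 7.0])
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

As a plausibility check on the REP definition, dividing RMSEP by REP/100 in each column of the table gives roughly 3.9 to 4.1 × 10³ mg kg⁻¹, which is consistent with a single prediction-set mean shared by all models.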
Preprocessing | SNV | Moving Average | Baseline (Linear) | Baseline (Offset) | Baseline (Offset + Linear) | ||
---|---|---|---|---|---|---|---|
Model | iPLS | FFiPLS | PLS | iSPA-PLS | iPLS | FFiPLS | FFiPLS |
LV | 13 | 12 | 16 | 13 | 11 | 11 | 14 |
NV | 150 | 1350 | 3001 | 150 | 150 | 750 | 150 |
RMSEC (mg kg⁻¹) | 2.16 × 10³ | 6.23 × 10³ | 4.68 × 10³ | 1.63 × 10³ | 3.41 × 10³ | 6.38 × 10³ | 1.00 × 10³ |
R²_cal | 0.9791 | 0.8237 | 0.9099 | 0.9882 | 0.9472 | 0.8152 | 0.9958 |
RMSEP (mg kg⁻¹) | 5.81 × 10³ | 4.58 × 10³ | 8.09 × 10³ | 6.21 × 10³ | 5.88 × 10³ | 6.22 × 10³ | 6.16 × 10³ |
R²_pred | 0.6701 | 0.7947 | 0.0135 | 0.4910 | 0.5439 | 0.4888 | 0.4464 |
Bias_pred (mg kg⁻¹) | 0.46 × 10³ | 0.46 × 10³ | 1.70 × 10³ | 0.37 × 10³ | 0.60 × 10³ | 0.52 × 10³ | 1.53 × 10³ |
REP (%) | 21.61 | 17.04 | 31.78 | 24.42 | 23.12 | 24.47 | 23.26 |
RPD_pred | 1.74 | 2.21 | 1.01 | 1.4 | 1.48 | 1.4 | 1.34 |
SDV | 5.91 × 10³ | 4.65 × 10³ | 8.79 × 10³ | 6.37 × 10³ | 6.09 × 10³ | 6.33 × 10³ | 6.09 × 10³ |
Analyte | Al | Al | Al | Al | Be | Be | Gd | Y |
---|---|---|---|---|---|---|---|---|
Preprocessing | Raw Data | SNV | MSC | Baseline (Offset + Linear) | MSC | SNV | SG Smoothing | SG Smoothing |
LV | 9 | 7 | 7 | 6 | 4 | 5 | 5 | 5 |
NV | 1200 | 1800 | 1650 | 1350 | 1800 | 750 | 1650 | 1050 |
RMSEC (mg kg⁻¹) | 12.77 × 10³ | 13.09 × 10³ | 12.82 × 10³ | 13.31 × 10³ | 0.55 | 0.55 | 1.40 | 3.37 |
R²_cal | 0.8203 | 0.8048 | 0.8165 | 0.8008 | 0.3812 | 0.4059 | 0.4276 | 0.4489 |
RMSEP (mg kg⁻¹) | 12.16 × 10³ | 11.61 × 10³ | 8.80 × 10³ | 9.50 × 10³ | 0.29 | 0.34 | 0.85 | 1.98 |
R²_pred | 0.7729 | 0.8023 | 0.872 | 0.8533 | 0.3354 | 0.0488 | 0.2029 | 0.4437 |
Bias_pred (mg kg⁻¹) | 3.89 × 10³ | 0.25 × 10³ | 0.79 × 10³ | 0.72 × 10³ | 0.02 | 0.02 | 0.25 | 0.65 |
REP (%) | 14.06 | 13.01 | 10.2 | 10.65 | 14.81 | 17.19 | 14.55 | 13.09 |
RPD_pred | 2.1 | 2.25 | 2.79 | 2.61 | 1.23 | 1.02 | 1.12 | 1.34 |
SDV | 14.19 × 10³ | 11.85 × 10³ | 9.09 × 10³ | 9.67 × 10³ | 0.3 | 0.35 | 0.98 | 2.32 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Abrantes, G.; Almeida, V.; Maia, A.J.; Nascimento, R.; Nascimento, C.; Silva, Y.; Silva, Y.; Veras, G. Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil. Molecules 2023, 28, 6959. https://doi.org/10.3390/molecules28196959