Combining Partial Least Squares and the Gradient-Boosting Method for Soil Property Retrieval Using Visible Near-Infrared Shortwave Infrared Spectra
Abstract
1. Introduction
2. Materials and Methods
2.1. The LUCAS Soil Spectral Library
2.2. Partial Least Squares Algorithm
2.3. Gradient-Boosted Decision Trees (GBDT)
2.4. Calculation of Relative Variable Importance
2.5. Assessment
3. Results
3.1. Overview of the Spectral Measurement of Soil Samples
3.2. Results of PLS Regression for the Estimation of Soil Properties
3.3. Results of PLS-GBDT for the Estimation of Soil Properties
3.4. Relative Variable Importance Derived from PLS Regression and the Gradient-Boosting Method
4. Discussion
4.1. Dimension Reduction for High-Dimensional Soil Spectra
4.2. GBDT for Quantitative Soil Spectroscopic Modelling
5. Conclusions
Acknowledgments
Author Contributions
Conflicts of Interest
Category | Property | N | Mean | SD | Min | Q25 | Q50 | Q75 | Max |
---|---|---|---|---|---|---|---|---|---|
Woodland | OC (g/kg) | 4182 | 37.3 | 24.1 | 0.0 | 18.8 | 31.4 | 50.8 | 125.8 |
Woodland | N (g/kg) | 4182 | 2.0 | 1.3 | 0.0 | 1.0 | 1.7 | 2.6 | 9.1 |
Woodland | Clay (%) | 4182 | 11.3 | 10.4 | 0.0 | 4.0 | 7.0 | 16.0 | 65.0 |
Cropland | OC (g/kg) | 8341 | 17.1 | 10.9 | 0.0 | 10.4 | 14.4 | 20.5 | 160.3 |
Cropland | N (g/kg) | 8341 | 1.6 | 0.79 | 0.0 | 1.1 | 1.5 | 1.9 | 9.5 |
Cropland | Clay (%) | 8341 | 22.1 | 12.7 | 1.0 | 13.0 | 21.0 | 30.0 | 79.0 |
Grassland | OC (g/kg) | 3957 | 30.2 | 19.0 | 0.0 | 15.7 | 25.9 | 39.2 | 165.7 |
Grassland | N (g/kg) | 3957 | 2.7 | 1.5 | 0.0 | 1.5 | 2.3 | 3.4 | 13.6 |
Grassland | Clay (%) | 3957 | 19.9 | 12.4 | 0.0 | 11.0 | 18.0 | 27.0 | 79.0 |
Category | Property | PLS Components | Number of Trees | Maximum Depth |
---|---|---|---|---|
Woodland | OC (g/kg) | 42 | 300 | 3 |
Woodland | N (g/kg) | 78 | 1100 | 4 |
Woodland | Clay (%) | 50 | 500 | 4 |
Cropland | OC (g/kg) | 64 | 1950 | 4 |
Cropland | N (g/kg) | 86 | 2000 | 4 |
Cropland | Clay (%) | 82 | 2000 | 3 |
Grassland | OC (g/kg) | 60 | 700 | 3 |
Grassland | N (g/kg) | 72 | 900 | 3 |
Grassland | Clay (%) | 60 | 1450 | 3 |
© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Liu, L.; Ji, M.; Buchroithner, M. Combining Partial Least Squares and the Gradient-Boosting Method for Soil Property Retrieval Using Visible Near-Infrared Shortwave Infrared Spectra. Remote Sens. 2017, 9, 1299. https://doi.org/10.3390/rs9121299