A Study on Developing a Model for Predicting the Compression Index of the South Coast Clay of Korea Using Statistical Analysis and Machine Learning Techniques
Abstract
1. Introduction
2. Target Area and Data
2.1. Target Area
2.2. Data
3. Selecting the Influencing Factors
4. Statistical Analysis
4.1. VIF
4.2. Simple Regression Analysis
4.3. Multiple Regression Analysis
5. Machine Learning
5.1. Machine Learning Algorithms
5.2. Model Evaluation Metrics
5.3. Results
6. Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Water System | Number of Data Points |
---|---|
Yeongsan River | 1759 |
Seomjin River | 1778 |
Nakdong River | 1331 |
Total | 4868 |

Geotechnical Properties | Range | Mean |
---|---|---|
Average Depth (m) | 0.050~66.000 | 10.652 |
Natural Water Content ($w_n$, %) | 7.400~147.100 | 58.996 |
Specific Gravity (Gs) | 2.530~2.900 | 2.692 |
Liquid Limit (LL, %) | 22.200~142.500 | 58.554 |
Plasticity Index (PI, %) | 1.400~118.90 | 24.707 |
Plastic Limit (PL, %) | 1.40~91.80 | 25.866 |
Initial Void Ratio ($e_0$) | 0.607~3.825 | 1.675 |
Saturated Unit Weight ($\gamma_{sat}$, tf/m³) | 1.063~4.155 | 1.648 |
Uniaxial Compressive Strength ($q_u$, kgf/cm²) | 0.012~2.323 | 0.346 |
Compression Index ($C_c$) | 0.119~2.614 | 0.711 |
Pre-consolidation Pressure ($P_c$, kgf/cm²) | 0.000~4.900 | 0.732 |

Category | Correlation with Compression Index |
---|---|
Average Depth | −0.049 |
Natural Water Content | 0.838 *** |
Specific Gravity | −0.077 |
Liquid Limit | 0.632 *** |
Plasticity Index | 0.617 *** |
Plastic Limit | 0.473 *** |
Initial Void Ratio | 0.869 *** |
Category | VIF |
---|---|
Natural Water Content | 15.025 |
Liquid Limit | 93.882 |
Plasticity Index | 60.218 |
Plastic Limit | 9.899 |
Initial Void Ratio | 14.976 |
Category | VIF |
---|---|
Natural Water Content | 2.149 |
Liquid Limit | 3.262 |
Plastic Limit | 2.101 |
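
The variance inflation factor of a predictor is conventionally defined as VIF = 1 / (1 − R²), where R² comes from regressing that predictor on the remaining ones. As a minimal sketch under that standard definition (the helper name `vif_to_r2` is hypothetical; the numbers are the VIF values reported in the two tables above), the reported VIFs can be converted back into the share of each predictor's variance explained by the other predictors:

```python
def vif_to_r2(vif: float) -> float:
    """Invert VIF = 1 / (1 - R^2) to recover R^2 (standard definition, VIF >= 1)."""
    if vif < 1.0:
        raise ValueError("VIF is never below 1 under the standard definition")
    return 1.0 - 1.0 / vif


# VIF values copied from the tables: the five-predictor set, then the reduced set.
vif_full = {
    "Natural Water Content": 15.025,
    "Liquid Limit": 93.882,
    "Plasticity Index": 60.218,
    "Plastic Limit": 9.899,
    "Initial Void Ratio": 14.976,
}
vif_reduced = {
    "Natural Water Content": 2.149,
    "Liquid Limit": 3.262,
    "Plastic Limit": 2.101,
}

r2_full = {name: vif_to_r2(v) for name, v in vif_full.items()}
r2_reduced = {name: vif_to_r2(v) for name, v in vif_reduced.items()}

for name, r2 in r2_reduced.items():
    print(f"{name}: R^2 on other predictors = {r2:.3f} (was {r2_full[name]:.3f})")
```

Read this way, a VIF of 93.882 for the liquid limit corresponds to about 99% of its variance being reproducible from the other four predictors, while the reduced set brings that share down to about 69%.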

Category | Regression Equation | $R^2$ |
---|---|---|
Natural Water Content | $C_c = 0.0167\,w_n - 0.2913$ | 0.687 |
Liquid Limit | $C_c = 0.0125\,\mathrm{LL} - 0.0146$ | 0.425 |
Plastic Limit | $C_c = 0.0284\,\mathrm{PL} - 0.0019$ | 0.223 |
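
The simple regression equations above can be applied directly. The sketch below is illustrative only: it assumes each row predicts the compression index from the property named in that row (so the first row is read as 0.0167 times the natural water content, minus 0.2913), and the names `SIMPLE_MODELS` and `predict_cc` are hypothetical. The inputs are the dataset means from the property summary table, not new measurements.

```python
# (slope, intercept) pairs copied from the simple-regression table.
# Assumption: the predictor in each equation is the property named in its row.
SIMPLE_MODELS = {
    "w_n": (0.0167, -0.2913),  # natural water content, %
    "LL": (0.0125, -0.0146),   # liquid limit, %
    "PL": (0.0284, -0.0019),   # plastic limit, %
}


def predict_cc(predictor: str, value: float) -> float:
    """Estimate the compression index from a single soil property."""
    slope, intercept = SIMPLE_MODELS[predictor]
    return slope * value + intercept


# Illustrative inputs: mean values reported in the property summary table.
means = {"w_n": 58.996, "LL": 58.554, "PL": 25.866}
estimates = {name: predict_cc(name, x) for name, x in means.items()}

for name, cc in estimates.items():
    print(f"Cc from {name} = {cc:.3f}")
```

Evaluated at the mean inputs, the three equations give roughly 0.694, 0.717 and 0.733, all close to the reported mean compression index of 0.711.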

Category | Variable | Linear Regression Model (OLS) |
---|---|---|
Constant | | −0.3165 *** |
Geotechnical Properties | Natural Water Content | 0.0152 *** |
Geotechnical Properties | Liquid Limit | 0.0018 *** |
Geotechnical Properties | Plastic Limit | 0.0003 |
Explanatory Power | | 0.691 |
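
Taken together, the OLS coefficients above define a three-variable linear predictor. As a hedged sketch (the names `OLS_COEFFS` and `predict_cc_ols` are hypothetical, and the inputs are assumed to be percentages, matching the units in the property summary table), it can be written as:

```python
# Coefficients copied from the multiple-regression (OLS) table.
OLS_COEFFS = {"const": -0.3165, "w_n": 0.0152, "LL": 0.0018, "PL": 0.0003}


def predict_cc_ols(w_n: float, ll: float, pl: float) -> float:
    """Compression index from natural water content, liquid limit and plastic limit (all in %)."""
    return (
        OLS_COEFFS["const"]
        + OLS_COEFFS["w_n"] * w_n
        + OLS_COEFFS["LL"] * ll
        + OLS_COEFFS["PL"] * pl
    )


# Illustrative check at the dataset means from the property summary table.
cc_at_means = predict_cc_ols(w_n=58.996, ll=58.554, pl=25.866)
print(f"Cc at mean inputs = {cc_at_means:.3f}")
```

At the mean inputs the natural water content term contributes about 0.897, against about 0.105 for the liquid limit and under 0.01 for the plastic limit, which is consistent with the plastic limit coefficient carrying no significance marker in the table.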

Model | RMSE (Train) | RMSE (Test) | $R^2$ (Train) | $R^2$ (Test) |
---|---|---|---|---|
RF | 0.1341 | 0.1669 | 0.82 | 0.72 |
XGB | 0.1403 | 0.1715 | 0.80 | 0.70 |
LGBM | 0.1401 | 0.1702 | 0.80 | 0.71 |
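
For reference, the sketch below shows how the two scores in the results table are usually computed, assuming the second pair of columns reports the coefficient of determination (R²). The function names and the three sample values are hypothetical and are not taken from the study's dataset.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical compression-index values, for illustration only.
y_true = [0.5, 0.7, 0.9]
y_pred = [0.6, 0.7, 0.8]

print(f"RMSE = {rmse(y_true, y_pred):.4f}")
print(f"R^2  = {r2_score(y_true, y_pred):.2f}")
```

RMSE is expressed in the same units as the compression index, so lower is better, whereas R² closer to 1 indicates a better fit.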
Model | Hyperparameter |
---|---|
RF | N_estimators = 500, Max_depth = 6 |
XGB | N_estimators = 200, Max_depth = 3, Learning_rate = 0.05 |
LGBM | N_estimators = 300, Max_depth = 5, Learning_rate = 0.02 |
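
The hyperparameters above map naturally onto keyword arguments of common Python implementations. The following is a sketch only: the table does not state which libraries were used, so scikit-learn's `RandomForestRegressor`, xgboost's `XGBRegressor` and lightgbm's `LGBMRegressor` are assumptions, as are the names `HYPERPARAMS` and `build_models` and the `random_state` argument.

```python
# Hyperparameters copied from the table, written as lower-case keyword names.
HYPERPARAMS = {
    "RF": {"n_estimators": 500, "max_depth": 6},
    "XGB": {"n_estimators": 200, "max_depth": 3, "learning_rate": 0.05},
    "LGBM": {"n_estimators": 300, "max_depth": 5, "learning_rate": 0.02},
}


def build_models(random_state: int = 0):
    """Instantiate the three regressors.

    Assumption: scikit-learn, xgboost and lightgbm are installed; the imports
    are deferred so the configuration above can be inspected without them.
    """
    from sklearn.ensemble import RandomForestRegressor
    from xgboost import XGBRegressor
    from lightgbm import LGBMRegressor

    return {
        "RF": RandomForestRegressor(random_state=random_state, **HYPERPARAMS["RF"]),
        "XGB": XGBRegressor(random_state=random_state, **HYPERPARAMS["XGB"]),
        "LGBM": LGBMRegressor(random_state=random_state, **HYPERPARAMS["LGBM"]),
    }
```

Each returned model exposes the usual `fit(X, y)` and `predict(X)` methods, so the three can be trained and scored in a single loop over the dictionary.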
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lee, S.; Kang, J.; Kim, J.; Baek, W.; Yoon, H. A Study on Developing a Model for Predicting the Compression Index of the South Coast Clay of Korea Using Statistical Analysis and Machine Learning Techniques. Appl. Sci. 2024, 14, 952. https://doi.org/10.3390/app14030952