Advanced Machine Learning and Water Quality Index (WQI) Assessment: Evaluating Groundwater Quality at the Yopurga Landfill
Abstract
1. Introduction
2. Study Area
3. Materials and Methods
3.1. Data Collection
3.2. Machine Learning Algorithms
3.2.1. Input Data
3.2.2. Data Preprocessing
3.2.3. Model Training
3.2.4. Hyperparameter Optimization
3.2.5. Model Performance Comparison
3.3. Water Quality Index Model
3.3.1. Indicator Selection
3.3.2. Sub-Index Functions
3.3.3. Calculation of Weights
3.3.4. Aggregation Functions
3.4. Evaluation of Water Quality Index Model Scores
4. Results and Discussion
4.1. Model Validation
4.2. Machine Learning Algorithms
4.2.1. The Relationship between Actual Values and Predicted Values
4.2.2. Advantages and Limitations of Machine Learning Algorithms
4.2.3. Scientific and Industrial Background of Machine Learning Algorithms
4.2.4. The Practical Application of Machine Learning-Based Approach in Water Resource Management
4.2.5. Scalability and Transferability of the Machine Learning Framework
4.2.6. Future Applications and Studies
4.3. Water Quality Index Model Component Analysis
4.3.1. Results of Indicator Selection
4.3.2. Sub-Index Functions
4.3.3. Weight Function
4.3.4. Aggregation Functions
4.4. Comparison of Different Aggregation Functions
4.5. Eclipsing Error Analysis in Water Quality Models
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Parameter | Unit | Standard for Groundwater Quality (GB/T 14848-93), Class IV | Standard for Groundwater Quality (GB/T 14848-2017), Class IV |
---|---|---|---|
pH | - | 5.5–6.5 / 8.5–9 | 5.5 ≤ pH < 6.5 / 8.5 < pH ≤ 9.0 |
Ammonium Nitrogen | mg/L | 0.5 | 1.5 |
Manganese | mg/L | 1 | 1.5 |
Nickel | mg/L | 0.1 | 0.1 |
Boron | mg/L | - | 2 |
Lead | mg/L | 0.1 | 0.1 |
Zinc | mg/L | 5 | 5 |
Fluoride | mg/L | 2 | 2 |
Chemical Oxygen Demand | mg/L | 10 | - |
Iron | mg/L | 1.5 | 2 |
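
As a minimal sketch of how the limits tabulated above might be used, the snippet below screens a sample against the GB/T 14848-2017 Class IV values. The dictionary name, the function name, and the sample values are hypothetical and are not taken from the study; pH is left out because its limit is a range rather than a single maximum.

```python
# Class IV limits (mg/L) copied from the GB/T 14848-2017 column of the table.
# Chemical Oxygen Demand has no 2017 entry in the table, so it is omitted.
CLASS_IV_LIMITS_2017 = {
    "Ammonium Nitrogen": 1.5,
    "Manganese": 1.5,
    "Nickel": 0.1,
    "Boron": 2.0,
    "Lead": 0.1,
    "Zinc": 5.0,
    "Fluoride": 2.0,
    "Iron": 2.0,
}


def exceedances(sample: dict[str, float]) -> dict[str, float]:
    """Return {parameter: measured / limit} for parameters above their limit."""
    return {
        name: value / CLASS_IV_LIMITS_2017[name]
        for name, value in sample.items()
        if name in CLASS_IV_LIMITS_2017 and value > CLASS_IV_LIMITS_2017[name]
    }


# Hypothetical concentrations (mg/L), for illustration only.
example = {"Manganese": 3.0, "Nickel": 0.05, "Iron": 2.0}
print(exceedances(example))
```

A value exactly at the limit (iron in the example) is treated as compliant here; that boundary convention is an assumption of the sketch.
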
Indicators | Conditions | Sub-Index Functions |
---|---|---|
Ammonium Nitrogen | - | Equation (8) |
Manganese | - | Equation (8) |
Nickel | - | Equation (8) |
Boron | - | Equation (8) |
Lead | - | Equation (8) |
Zinc | - | Equation (8) |
Fluoride | - | Equation (8) |
Chemical Oxygen Demand | - | Equation (8) |
Iron | - | Equation (8) |
pH | (i) If pH ≥ 5.5 and pH < 6.5 | Equation (9) |
pH | (ii) If pH > 8.5 and pH ≤ 9.0 | Equation (10) |
pH | (iii) If pH ≥ 6.5 and pH ≤ 8.5 | 100 |
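
The three pH conditions above amount to a piecewise dispatch, which can be sketched as follows. Equations (9) and (10) are not reproduced here, so the sketch takes them as caller-supplied functions; the function and parameter names are hypothetical, and the demonstration branches are placeholders, not the article's equations. Raising an error for pH outside 5.5–9.0 is also an assumption, since the table does not cover that case.

```python
from typing import Callable


def ph_sub_index(
    ph: float,
    acidic_branch: Callable[[float], float],    # stands in for Equation (9)
    alkaline_branch: Callable[[float], float],  # stands in for Equation (10)
) -> float:
    """Select the pH sub-index rule according to the tabulated conditions."""
    if 6.5 <= ph <= 8.5:
        return 100.0                 # condition (iii)
    if 5.5 <= ph < 6.5:
        return acidic_branch(ph)     # condition (i)
    if 8.5 < ph <= 9.0:
        return alkaline_branch(ph)   # condition (ii)
    raise ValueError(f"pH {ph} is outside the tabulated range 5.5-9.0")


# Placeholder branches for demonstration only; NOT Equations (9) and (10).
def demo_acidic(ph: float) -> float:
    return 50.0


def demo_alkaline(ph: float) -> float:
    return 60.0


print(ph_sub_index(7.2, demo_acidic, demo_alkaline))
print(ph_sub_index(6.0, demo_acidic, demo_alkaline))
```
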
Aggregate Function | Calculation Formula |
---|---|
NSF index (Weighted Arithmetic Mean (WAM)) | |
SRDD index (Modified Additive Function) | |
West Java WQI | |
Weighted Quadratic Mean (WQM) | |
Log-weighted Quadratic Mean (LQM) | |
Sinusoidal Weighted Mean (SWM) |
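
The formulas for these aggregation functions are not reproduced in the table above. The sketch below uses the forms commonly reported in the WQI literature for four of them, which is an assumption rather than a transcription of the article: weighted arithmetic mean (NSF), squared weighted sum divided by 100 (SRDD), weighted geometric mean (West Java WQI), and weighted quadratic mean. LQM and SWM are omitted because their definitions cannot be recovered from the text. Sub-indices are assumed to lie in 0–100 and weights to sum to 1; all function names are hypothetical.

```python
import math
from typing import Sequence


def _check(s: Sequence[float], w: Sequence[float]) -> None:
    """Validate that sub-indices and weights match and weights sum to 1."""
    if len(s) != len(w):
        raise ValueError("sub-indices and weights must have the same length")
    if not math.isclose(sum(w), 1.0, abs_tol=1e-9):
        raise ValueError("weights are assumed to sum to 1")


def weighted_arithmetic_mean(s: Sequence[float], w: Sequence[float]) -> float:
    """Assumed NSF form: sum of w_i * s_i."""
    _check(s, w)
    return sum(wi * si for si, wi in zip(s, w))


def srdd_modified_additive(s: Sequence[float], w: Sequence[float]) -> float:
    """Assumed SRDD form: (sum of w_i * s_i) squared, divided by 100."""
    _check(s, w)
    return sum(wi * si for si, wi in zip(s, w)) ** 2 / 100.0


def weighted_geometric_mean(s: Sequence[float], w: Sequence[float]) -> float:
    """Assumed West Java form: product of s_i raised to w_i."""
    _check(s, w)
    return math.prod(si ** wi for si, wi in zip(s, w))


def weighted_quadratic_mean(s: Sequence[float], w: Sequence[float]) -> float:
    """Assumed WQM form: square root of the sum of w_i * s_i squared."""
    _check(s, w)
    return math.sqrt(sum(wi * si ** 2 for si, wi in zip(s, w)))


# Hypothetical sub-indices and equal weights, for illustration only.
sub_indices = [90.0, 10.0]
weights = [0.5, 0.5]
for fn in (weighted_arithmetic_mean, srdd_modified_additive,
           weighted_geometric_mean, weighted_quadratic_mean):
    print(fn.__name__, round(fn(sub_indices, weights), 2))
```

Running the four functions on the same inputs shows how differently each responds to a single low sub-index, which is the behaviour that an eclipsing analysis examines.
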
Aggregate Function | Classification Scheme |
---|---|
NSF index | Five categories: ① Excellent (90–100); ② Good (70–89); ③ Medium (50–69); ④ Bad (25–49); ⑤ Very bad (0–24) |
SRDD index | Seven categories: ① Clean (90–100); ② Good (80–89); ③ Good with treatment (70–79); ④ Tolerable (40–69); ⑤ Polluted (30–39); ⑥ Severely polluted (20–29); ⑦ Piggery waste (0–19) |
West Java WQI | Five categories: ① Excellent (90–100); ② Good (75–90); ③ Fair (50–75); ④ Marginal (25–50); ⑤ Poor (5–25) |
Weighted Quadratic Mean | Four categories: ① Good (80–100); ② Fair (50–79); ③ Marginal (30–49); ④ Poor (0–29) |
Log-weighted Quadratic Mean | Five categories: ① Clean (90–100); ② Slightly polluted (75–90); ③ Moderately polluted (50–75); ④ Heavily polluted (25–50); ⑤ Seriously polluted (0–25) |
Sinusoidal Weighted Mean | Five categories: ① Clean (90–100); ② Slightly polluted (75–90); ③ Moderately polluted (50–75); ④ Heavily polluted (25–50); ⑤ Seriously polluted (0–25) |
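
Mapping a score to a class under one of these schemes is a simple threshold lookup. The sketch below does this for the NSF index bands listed above. The names are hypothetical, and treating each band's lower bound as an inclusive cut-off (so that a non-integer score such as 89.5 falls in "Good") is an assumption, since the table only lists integer ranges.

```python
# Lower bounds and labels of the NSF index bands, highest band first.
NSF_CLASSES = [
    (90.0, "Excellent"),
    (70.0, "Good"),
    (50.0, "Medium"),
    (25.0, "Bad"),
    (0.0, "Very bad"),
]


def classify_nsf(score: float) -> str:
    """Return the NSF class label for a WQI score in the range 0-100."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("WQI score must lie between 0 and 100")
    for lower_bound, label in NSF_CLASSES:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: the last band starts at 0")


# Hypothetical scores, for illustration only.
for score in (95.0, 89.5, 42.0):
    print(score, classify_nsf(score))
```

The other schemes can be handled the same way by swapping in their own list of lower bounds and labels.
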
Study | Region | Evaluation Method | WQI Range | Key Indicators | Remarks |
---|---|---|---|---|---|
This study | Yopurga Landfill | Machine learning-optimized WQI | Moderately polluted to slightly polluted | pH, Mn, Ni | XGBoost algorithm used to determine indicator weights |
Use of Principal Component Analysis for parameter selection for development of a novel Water Quality Index: A case study of river Ganga India | Ganges Basin, India | PCA | - | Dissolved Oxygen (DO), pH, Conductivity, Biochemical Oxygen Demand (BOD), Total Coliform (TC), Chlorides, Magnesium, Sulfates, and Total Dissolved Solids (TDS) | PCA reduced the number of parameters from 28 to 9 |
Assessment of groundwater quality in a highly urbanized coastal city using water quality index model and Bayesian model averaging | Shenzhen | Machine learning-optimized WQI | Marginal to good | NH3-N, Mn, pH | XGBoost algorithm and ROC weight method used to determine indicator weights |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zheng, H.; Hou, S.; Liu, J.; Xiong, Y.; Wang, Y. Advanced Machine Learning and Water Quality Index (WQI) Assessment: Evaluating Groundwater Quality at the Yopurga Landfill. Water 2024, 16, 1666. https://doi.org/10.3390/w16121666