Article

Using Random Forest Regression to Model the Spatial Distribution of Concentrations of Selected Metals in Groundwater in Forested Areas of the Wielkopolska National Park, Poland

Department of Soil Science, Land Reclamation and Geodesy, Poznań University of Life Sciences, Piątkowska 94E, 60-649 Poznań, Poland
Forests 2024, 15(12), 2191; https://doi.org/10.3390/f15122191
Submission received: 6 November 2024 / Revised: 3 December 2024 / Accepted: 9 December 2024 / Published: 12 December 2024
(This article belongs to the Special Issue Soil Pollution and Remediation of Forests Soil)

Abstract

Monitoring groundwater pollution is an important issue in terms of analyzing threats to protected, environmentally valuable areas. The topographical and environmental characteristics of a given area are often mentioned among the factors affecting the dynamics and chemistry of groundwater. In this study, the random forest regression (RFR) model was used to determine the spatial distribution of selected metals, such as aluminum, calcium, iron, potassium, magnesium, manganese, sodium, and zinc. In the role of indicators describing terrain variability, derivatives of the digital elevation model (DEM) were employed, with a spatial resolution of 5 m, describing the topography of the terrain on a local scale, such as, among others, slopes, the aspect and curvatures of slopes, the topographic position index, and the SAGA wetness index, as well as generalized values determined for each sampling point of the areas contributing their runoff. In addition, environmental parameters were taken into consideration: forest habitat types, the structure of soil cover, and the seasons when samples were collected. This study used samples collected from 15 wells located in forested areas of the Wielkopolska National Park on seven dates. The results obtained show that random forest can be used with very good results to model the spatial variability of the concentrations of aluminum, potassium, magnesium, manganese, and sodium in groundwater. However, in the case of calcium and zinc, no correlations were found between the adopted indicators describing the spatial variability of the area and their concentrations in groundwater. In addition, the degree of importance of each predictor was determined in order to rank their importance in modeling the concentration of each of the metals in groundwater. 
The summary ranking of predictors indicates that the strongest influence on the predicted concentration of metals in groundwater is exhibited by profile curvatures, planar curvatures, multiscale TPI, and then the habitat type of the forest. On the other hand, curvature classifications, soil composition, and seasonality exhibit the smallest generalized impact on the results of modeling.

1. Introduction

Groundwater is one of the most important resources of the natural environment, constituting the largest source of fresh water [1]. The quality of groundwater is influenced by a number of factors, both environmental and anthropogenic [2,3]. They can significantly limit the availability of this water, and consequently affect the surrounding environment. Nowadays, an ever-growing number of soluble chemicals resulting from urban and industrial activities and modern agricultural practices pose a threat [4]. At the same time, groundwater, being a component of the critical zone, affects the development of plants [5].
The chemical composition of groundwater in a given place depends on the soil and rock substrates, the time of contact with the soil or rock substrates, biological processes, and the mixing of water coming from the surface and from under the surface [6,7,8]. This affects the spatial variability of groundwater chemistry. Depending on the location in the relief, a greater role may be played by biogeochemical processes [9] or processes related to the accumulation of organic matter [10]. Research conducted in the United States has shown that geochemical processes occurring in the unsaturated zone can significantly affect the composition of groundwater [11]. Knowledge of the spatial variability of the occurrence of metals in groundwater, combined with their location in the landscape, is necessary to identify possible sources of pollution in order to develop appropriate management strategies aimed at remediating or regulating these pollutants [12].
Monitoring groundwater quality provides important information that can be used for activities aimed at maintaining ecological systems, including those in environmentally valuable areas. Assessing temporal and spatial changes in water quality parameters is of fundamental importance to controlling and preventing water pollution [13,14,15]. Traditional geostatistical and numerical models are typically very complex, expensive, time-consuming, and require extensive data [16]; they are typically unable to reflect the complex biological, chemical, and physical characteristics used to describe water quality [17,18]. The use of remote sensing data describing the terrain can help to more easily and quickly assess the spatial variability of groundwater contamination by metals. This is especially true for areas where access for systematic sampling is difficult or even impossible [19].
Machine learning (ML) algorithms are increasingly used to analyze the variability of water quality parameters. ML models use the inductive hypothesis to analyze and “learn the rules” based on data without relying on a specific set of equations. Haggerty et al. [20] identified four main groups of ML models used in modeling groundwater quality. The first two groups are partially supervised models and ensemble models. The group of models with unsupervised learning included self-organizing map models, grouping, and multiple frameworks. However, the most numerous group of models consists of supervised models, which include decision tree models and random forests, artificial neural networks, support vector machine models, adaptive neuro-fuzzy inference systems, deep learning, regression models, comparative studies, and optimization techniques.
The random forest (RF) model used in this study is an ensemble algorithm consisting of many decision trees that can be applied to both classification and regression. RF randomly selects observations and features to build each individual tree and, consequently, creates an uncorrelated forest of trees [21,22]. Because random forest regression (RFR) is non-parametric, the data do not have to meet any distributional assumptions and may also be collinear. The RFR method also proves its worth when a large number of predictors is used [23]. A random forest can be treated as a form of a “grey box” [24,25]. Although individual trees cannot be inspected separately, some measures are available for interpreting the results. One of these is variable importance, which describes how much worse the prediction would be if the data for a given predictor were randomly permuted [25]. Random forests have the additional advantage of dealing with “small n, large p” problems, a common scenario in studies on groundwater quality where observational data are scarce compared to the number of potentially influential variables [26].
Numerous studies indicate that RF performs better than other methods when modeling pollutants in groundwater. In the case of nitrates, RF achieved better results than, e.g., multiple linear regression (MLR), kriging, SVM, naïve Bayes, a generalized additive model (GAM), and a boosted regression tree (BRT) on the scale of both a single well [27] and a continent [28]. RF was also used to determine the spatial distribution of nitrate concentrations on a large scale using only spatial predictors describing the environment [29].
The data used to train the models have a specific spatial resolution, which may affect the quality of the results obtained from the model [30]. Depending on the size of the area analyzed, the type of data used to describe the terrain and climate, and the availability of these data, the spatial resolution of the data used for modeling may range from several dozen meters to dozens of kilometers [28,31,32,33].
The aim of this study was to determine the possibility of describing the spatial variability of selected metal concentrations in the groundwater within the forested areas of the Wielkopolska National Park using parameters describing the terrain topography determined on the basis of remote sensing data. The analysis took into account both parameters determined locally and the impact of areas neighboring the sampling sites. The final aim of this study was also to determine the possibility of using spatial data with a fairly high spatial resolution of 5 m in modeling the spatial variability of metal concentrations in groundwater using RFR. This is particularly important in valuable, protected areas where access for collecting water samples is difficult.

2. Materials and Methods

2.1. Study Site

This study used data describing concentrations of metals in groundwater samples collected in the Wielkopolska National Park, located in the western part of Poland (52.27° N, 16.79° E) (Figure 1). The area of the Park is 76 km². The Wielkopolska National Park is located in the immediate vicinity of the city of Poznań, which has over 500,000 inhabitants. In total, the agglomeration of the city of Poznań has over 1 million inhabitants and it is one of the few areas in Poland with a continuous increase in the number of people living there. According to the Köppen–Geiger classification, the climate in this area exhibits characteristics of an oceanic climate (Cfb) characterized by warm summers and mild winters [34]. The average annual air temperature is 8.5 °C. The coldest month is January (−1.0 °C) and the warmest is July (18.1 °C). The average annual precipitation is 507 mm, which is one of the lowest in Poland. The highest precipitation is observed in July (76.0 mm) and the lowest in February (22.9 mm) [35]. The average annual groundwater table depths measured in observation wells for the period from 2015 to 2022 ranged from 0.88 m to 17.93 m [36].

2.2. Flowchart

Figure 2 shows a workflow diagram for an example of one metal. Part A, aimed at determining the topographic and environmental parameters of the catchment and then determining the area for which modeling of metal concentrations dissolved in groundwater is possible, is common to all metals and was performed only once. The remaining part, covering the construction of the RFR model and the production of maps of the spatial distribution of metal concentrations in groundwater, was performed for each of the metals.

2.3. Groundwater Samples

The results of the analyses of Al, Ca, Fe, K, Mg, Mn, Na, and Zn concentrations were made available by the Wielkopolska National Park. The data used in the analysis are the only data regarding metal concentrations in groundwater in the Wielkopolska National Park. Water samples were taken from shallow groundwater, the depth of which ranged from 0.5 to 4.6 m. Groundwater samples were collected at 15 sites located in the forested areas of this national park on 7 dates: March, July, and November 2016, and February, May, August, and November 2017. This gives a total of 105 samples for each of the metals analyzed. These sites are marked with numbers 1, 2, 4, 5, and 6–17 (Figure 1). Water samples were collected after pumping water out of the well at least three times. Concentrations of elements dissolved in water were determined using atomic absorption spectrometry in certified laboratories in accordance with the W-METAXFL1 methodology.

2.4. Terrain Topography

In order to analyze the topography of the terrain, the author’s original digital elevation model (DEM) was employed, with a resolution of 5 m, covering the entire National Park area (Figure 1). Airborne LiDAR survey data, made available by the Wielkopolska National Park in the form of LAS files, were used as source data. The DEM was constructed and the terrain parameters were determined using the SAGA GIS 9.3.1 program [37] (SAGA User Group Association, Hamburg, Germany).
The topography of the terrain influences groundwater recharge from water infiltrating from the surface but also determines groundwater flow patterns [38,39] and the movement of substances dissolved in it [40,41]. In the case of using topographic parameters obtained from remote sensing sources to describe a phenomenon, many parameters can be used [42,43]. It should be remembered that the determination of topographic parameters using DEM is subject to uncertainties and errors contained in the data and in the algorithms used to determine these parameters [44,45,46]. Therefore, the use of multiparametric models can reduce the impact of a single parameter error on the final result.
The topographic parameters used in this work allow for the assessment of water movement conditions near the points where water samples are collected for analysis. The description of the terrain topography for each of the raster cells included terrain slopes, aspect, general curvature, profile and planar curvature, curvature classification, the topographic position index (TPI), TPI-based landform classification, multiscale TPI, and the SAGA wetness index.
In order to describe the water sample collection sites, locally defined indicators were used, such as profile curvature, planar curvature, general curvature, curvature classification, the topographic position index (TPI), TPI-based landform classification, multiscale TPI, and the SAGA wetness index. Slope and aspect are among the basic factors describing the terrain [47]. Slope is important for assessing the rate of infiltration: gently sloping areas are characterized by high infiltration and low surface runoff, whereas steep areas show the opposite relationship [48]. Aspect, in turn, can be used to assess insolation, which affects, among other things, evapotranspiration. On south-facing slopes, stronger insolation leads to greater water uptake by vegetation, as well as greater water loss from the warmer soil supplied by capillary rise.
Profile curvature measures the topographic curvature along a flow line, i.e., the steepest descent path. Planar curvature measures the curvature of contour lines on topographic maps and is directly related to the convergence and divergence of flow lines [49]. Terrain slopes and the general, profile, and planar curvatures were determined using the nine-parameter model presented by Zevenbergen and Thorne [50]. These indicators allowed the local moisture conditions to be determined.
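As a rough illustration of how slope is derived from a 3 × 3 DEM window, the sketch below uses central differences, which is what the first-order terms of a nine-parameter surface fit reduce to. The function name, the window layout, and the default 5 m cell size are assumptions made for illustration; in the study itself the computation was carried out in SAGA GIS.

```python
import math

def slope_percent(window, cell_size=5.0):
    """Slope (%) at the centre of a 3x3 elevation window.

    window[row][col], with row 0 being the northern row.
    Hypothetical helper, not the SAGA GIS implementation.
    """
    # Central differences in the east-west and north-south directions.
    dz_dx = (window[1][2] - window[1][0]) / (2.0 * cell_size)
    dz_dy = (window[0][1] - window[2][1]) / (2.0 * cell_size)
    return 100.0 * math.hypot(dz_dx, dz_dy)

# A plane rising 1 m per 5 m cell towards the east.
plane = [[100.0, 101.0, 102.0],
         [100.0, 101.0, 102.0],
         [100.0, 101.0, 102.0]]
print(slope_percent(plane))
```

Curvatures follow from the second-order terms of the same surface fit; they are omitted here because sign conventions differ between software packages.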
The TPI and the derived multiscale TPI and TPI landform classification indicators allowed the position of cells in the relief to be determined at different levels of generalization [51,52]. TPI compares the elevation of each cell with the average elevation of cells located in a specific neighborhood. Positive TPI values indicate cells that are positioned higher than the neighborhood average, which may indicate hill ridges. Negative TPI values correspond to cells lying below the neighborhood average (valleys). TPI values close to zero indicate flat areas or areas characterized by a constant slope of the terrain. TPI landform classification, on the other hand, additionally uses TPI standard deviation values to delineate discrete slope position classes [51]. A description of terrain forms was also carried out using a fuzzy classification based on terrain slopes and a number of different descriptions of terrain curvature [53].
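The TPI definition above can be sketched in a few lines. This is a minimal illustration that assumes a square neighborhood and unweighted averaging; the function name and the toy grids are hypothetical, and the SAGA GIS tool used in the study offers other neighborhood shapes and weighting options.

```python
def tpi(dem, row, col, radius=1):
    """Elevation of a cell minus the mean elevation of its neighbours."""
    neighbours = []
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            inside = 0 <= r < len(dem) and 0 <= c < len(dem[0])
            if inside and (r, c) != (row, col):
                neighbours.append(dem[r][c])
    return dem[row][col] - sum(neighbours) / len(neighbours)

ridge = [[10, 10, 10], [10, 12, 10], [10, 10, 10]]
valley = [[10, 10, 10], [10, 8, 10], [10, 10, 10]]
flat = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
print(tpi(ridge, 1, 1), tpi(valley, 1, 1), tpi(flat, 1, 1))
```

Repeating the calculation with increasing values of `radius` gives the kind of scale dependence that a multiscale TPI summarizes.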
The SAGA wetness index (SWI), a modification of the Topographic wetness index, was used to describe water flow conditions. This index employs a multi-directional water flow algorithm and iteratively modifies the contributing area based on the local slope of the terrain and the accumulation of flow in cells adjacent to the cell being analyzed, in accordance with the formula [54,55,56]:
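$$SWI = \ln\left(\frac{SCA_M}{\tan\beta}\right)$$

$$SCA_M = SCA_{max}\left(\frac{1}{t}\right)^{\beta \exp\left(t^{\beta}\right)} \quad \text{for} \quad SCA < SCA_{max}\left(\frac{1}{t}\right)^{\beta \exp\left(t^{\beta}\right)}$$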
where $SCA_M$ is the modified specific catchment area of each DEM grid cell (m), $SCA_{max}$ is the maximum stable value for $SCA_M$ obtained through the iterations, $\beta$ is the slope angle (in radians), and $t$ is described as a suction factor. A high SWI value indicates that the region has a greater potential for soil water saturation [44].
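To make the two formulas concrete, the following sketch evaluates a single update of the modified catchment area and the resulting index for one cell. It is not the full iterative SAGA algorithm: the function names are hypothetical, the slope angle is assumed to be in radians and greater than zero, and the numeric inputs are made up.

```python
import math

def modified_sca(sca, sca_max, beta, t):
    """One update of the modified specific catchment area (SCA_M)."""
    threshold = sca_max * (1.0 / t) ** (beta * math.exp(t ** beta))
    # The cell value is raised to the threshold only when it falls below it.
    return threshold if sca < threshold else sca

def saga_wetness_index(sca_m, beta):
    """SWI = ln(SCA_M / tan(beta)); beta in radians and greater than 0."""
    return math.log(sca_m / math.tan(beta))

# Made-up values: a cell with a small catchment next to a large one.
sca_m = modified_sca(sca=1.0, sca_max=100.0, beta=0.1, t=10.0)
print(sca_m, saga_wetness_index(sca_m, beta=0.1))
```

Because the exponent grows with the slope angle, the correction mainly raises the catchment area of gently sloping cells located close to cells with large contributing areas, such as valley floors.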
The chemical pattern of groundwater is related to groundwater flow, and hydrochemical changes are controlled primarily by geochemical processes. The chemistry of shallow local groundwater may change systematically along the flow path [57]. In order to determine the possible impact of the areas surrounding sample collection sites on the concentrations of metals in the water, areas contributing their surface runoff were delineated for points 1, 3, and 8–17. These areas were determined using the Upslope Area tool with the multiple flow direction method. For points 4, 5, and 7, located in alluvial areas, a buffer zone with a radius of 50 m surrounding each point was adopted as the area that may influence the concentrations of metals in groundwater. The descriptions of the areas defined in this way included determining each contributing area’s maximum and average slopes, the standard deviation of slopes for the contributing area, and the average aspect. The standard deviation of the slope and mean gradient can be used to describe the conditions for runoff formation and infiltration in the contributing area.
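The per-area descriptors listed above amount to simple zonal statistics over the raster cells of each contributing area. The sketch below shows one possible way to compute them. The function name and the input values are hypothetical, the use of a population standard deviation is an assumption, and so is the circular averaging of aspect, since the text only states that an average aspect was determined.

```python
import math
import statistics

def contributing_area_summary(slopes, aspects_deg):
    """Summarise slope (%) and aspect (degrees) cells of one contributing area."""
    # Aspect is an angle, so it is averaged as unit vectors rather than
    # arithmetically (an assumption; the source only says "average aspect").
    sin_sum = sum(math.sin(math.radians(a)) for a in aspects_deg)
    cos_sum = sum(math.cos(math.radians(a)) for a in aspects_deg)
    mean_aspect = math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0
    return {
        "max_slope": max(slopes),
        "mean_slope": statistics.mean(slopes),
        "sd_slope": statistics.pstdev(slopes),
        "mean_aspect": mean_aspect,
    }

summary = contributing_area_summary([2.0, 4.0, 6.0], [350.0, 10.0])
print(summary)
```

An arithmetic mean of 350° and 10° would give 180° (south), whereas the vector mean correctly points north, which is why angular data are usually averaged this way.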

2.5. Environmental Parameters

The distributions of forest habitat types (FHT) and topsoil cover were made available by the Wielkopolska National Park in the form of shapefiles [58]. The data were rasterized with a resolution of 5 m, in a grid compatible with the DEM raster grid. Then, the dominant FHT and soil cover type were determined for the previously described contributing area of each point. In Poland, the FHT is the basic unit in the forest habitat classification system. It includes forest areas with similar site conditions resulting from soil moisture and fertility, terrain shape, and geological structure [59,60]. In the case of shallow groundwater levels, forests can affect the water–salt balance of soils by increasing salt concentrations [61]. Different tree species affect the salinity of groundwater in different ways [62].
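Determining the dominant class of a categorical raster within a contributing area is a majority count over its cells. A minimal sketch, with a hypothetical function name and made-up cell labels:

```python
from collections import Counter

def dominant_class(cell_values):
    """Most frequent category among the raster cells of one area."""
    return Counter(cell_values).most_common(1)[0][0]

# Hypothetical class labels for the 5 m cells inside one contributing area.
cells = ["fresh broadleaf forest", "fresh mixed forest",
         "fresh broadleaf forest", "riparian forest",
         "fresh broadleaf forest"]
print(dominant_class(cells))
```

The same function applies unchanged to the soil cover raster.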
In addition, the analysis took into account the dates of measurements of metal concentrations in groundwater, classifying the months of sampling as spring: March, April, and May; summer: June, July, and August; autumn: September, October, and November, and winter: December, January, and February. This may allow the assessment of the seasonal variability of metal concentrations in groundwater.
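The seasonal classification described above is a direct month-to-season lookup. The sketch below applies it to the seven sampling dates listed in Section 2.3; the variable names are hypothetical.

```python
SEASON_BY_MONTH = {
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
    12: "winter", 1: "winter", 2: "winter",
}

# Sampling dates listed in Section 2.3 as (year, month).
sampling_dates = [(2016, 3), (2016, 7), (2016, 11),
                  (2017, 2), (2017, 5), (2017, 8), (2017, 11)]
seasons = [SEASON_BY_MONTH[month] for _, month in sampling_dates]
print(seasons)
```

With these dates, winter is represented by a single sampling campaign, while each of the other seasons is represented by two.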

2.6. Random Forest Regression

All statistical analyses were performed using the program R, version 4.4 [63] (R Foundation for Statistical Computing, Vienna, Austria).
The random forest regression (RFR) method was used to determine the relationships between the concentrations of metals in groundwater and the geomorphometric and environmental parameters described above. RFR is a machine learning method based on the construction of many decision trees [22]. The rationale behind this ensemble approach is that a combination of predictions is more accurate than any single component model [28]. RFR can be used to model non-linear dependencies between variables. It allows complex relationships between variables to be determined and is not affected by the collinearity of variables [64]. RFR requires two parameters to be defined: the number of predictors considered in each tree-building process (mtry) and the number of trees created in the forest (ntree).
The analysis employed the “randomForest” package in version 4.7-1.1 [22], which is part of the R program.
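The roles of the two parameters can be illustrated with a deliberately simplified forest. The sketch below is not the randomForest package and not the model used in the study: each "tree" is a single-split stump, all function names are hypothetical, and the data are synthetic. It only shows how bootstrap sampling, a random subset of `mtry` predictors, and averaging over `ntree` trees fit together.

```python
import random
import statistics

def fit_stump(rows, targets, features):
    """Best single split (feature, threshold) by residual sum of squares."""
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in rows})[:-1]:
            left = [y for row, y in zip(rows, targets) if row[f] <= threshold]
            right = [y for row, y in zip(rows, targets) if row[f] > threshold]
            ml, mr = statistics.mean(left), statistics.mean(right)
            rss = (sum((y - ml) ** 2 for y in left)
                   + sum((y - mr) ** 2 for y in right))
            if best is None or rss < best[0]:
                best = (rss, f, threshold, ml, mr)
    return best

def fit_forest(rows, targets, ntree=50, mtry=1, seed=0):
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    forest = []
    while len(forest) < ntree:
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        features = rng.sample(range(p), mtry)        # random predictor subset
        stump = fit_stump([rows[i] for i in idx],
                          [targets[i] for i in idx], features)
        if stump is not None:  # None when the sample offers no valid split
            forest.append(stump)
    return forest

def predict(forest, row):
    """Average the predictions of all stumps."""
    return statistics.mean(
        left if row[f] <= thr else right for _, f, thr, left, right in forest)

# Synthetic data: the target depends on predictor 0 only.
rows = [[float(i), float(i % 2)] for i in range(20)]
targets = [0.0 if i < 10 else 10.0 for i in range(20)]
forest = fit_forest(rows, targets, ntree=50, mtry=1)
print(predict(forest, [2.0, 0.0]), predict(forest, [17.0, 1.0]))
```

In a full random forest the trees are grown much deeper and the predictor subset is redrawn at every split, but the averaging step is the same.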
The dataset for each metal comprised 105 measurements. For each of the metals, a subset of 75% of the data (i.e., 79 measurements) was randomly separated from the dataset and used to train the models. The remaining 25% of the data (i.e., 26 measurements) constituted a test dataset that was used to assess the quality of the models. The accuracy of each model was then assessed by comparing the predicted values with the actual values of the test data using the Pearson correlation coefficient.
RF modeling allows the importance of all predictive variables in relation to the response variable to be assessed. The increase in node purity resulting from splits of the decision trees on the analyzed variable (IncNodePurity) is used to measure the importance of variables [65]. This importance metric was used to determine the overall importance of the topographic and environmental variables in the model.
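The split and the evaluation metric described above can be sketched as follows. This is an illustration only: the function names and the random seed are hypothetical, and the analysis itself was carried out in R.

```python
import math
import random

def train_test_split(items, train_fraction=0.75, seed=42):
    """Randomly split a list into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# 105 sample indices stand in for the 105 measurements of one metal.
train, test = train_test_split(range(105))
print(len(train), len(test))
```

Note that the Pearson coefficient measures linear association between predicted and measured values; it does not by itself reveal a systematic bias in the predictions.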
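For regression, node impurity is measured by the residual sum of squares (RSS), so the purity gained at one split is the drop in RSS between the parent node and its two children. The sketch below computes this for a single split; the function names and values are hypothetical. The importance of a predictor then accumulates such gains over all splits made on that predictor and is averaged over the trees.

```python
def rss(values):
    """Residual sum of squares around the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def node_purity_increase(parent, left, right):
    """Drop in RSS achieved by splitting `parent` into `left` and `right`."""
    return rss(parent) - (rss(left) + rss(right))

parent = [1.0, 1.0, 5.0, 5.0]
print(node_purity_increase(parent, [1.0, 1.0], [5.0, 5.0]))  # 16.0
print(node_purity_increase(parent, [1.0, 5.0], [1.0, 5.0]))  # 0.0
```

A split that separates low from high concentrations removes all residual variance in this toy case, whereas a split that leaves both children as mixed as the parent contributes nothing.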

3. Results and Discussion

3.1. Topography of the Area

Figure 3, Figure 4 and Figure 5 show the topographical and environmental parameters of the WNP area. These values were used to determine the conditions describing the sites of water sampling, as presented in Table 1 and Table 2. The smallest topographic catchment area, covering 2875 m², belongs to point 13, while the largest, amounting to 57,275 m², belongs to point 8. The average slope of the catchment area ranges from 1.6% for catchment area 8 to 25.9% for catchment area 16. The highest variability of slopes is shown by the catchment area of point 17, where the standard deviation is 19.8%, and the lowest by that of point 8, with 0.7%.
The top layers of the soil cover of the analyzed areas are dominated by formations characterized by excellent water permeability. The soil cover of most catchments of the measurement points consisted of slightly loamy sands (six catchments) and loose sands (five catchments). The rest predominantly consisted of loamy sands, silty sands, silt, and sandy loams (Table 2).
In terms of forest habitat types, four catchments are dominated by fresh broadleaf forests, fresh mixed forests, and fresh coniferous forests. In three catchments, riparian forests dominate, and in one catchment, moist broadleaf forests are the most prevalent.
The water sampling sites were situated in various landforms. The wells were located on concave slopes (negative curvature values) and convex slopes (positive curvature values), as well as on flat land (curvature values ≈ 0).

3.2. Description of Metal Concentrations in Groundwater

Table 3 presents descriptive statistics of the concentrations of eight metals in groundwater. The average concentrations of Al, Ca, Fe, K, Mg, Mn, Na, and Zn in groundwater were 0.034, 133, 0.543, 3.80, 17.5, 0.31, 29.0, and 0.0063 mg dm⁻³, respectively. These values arrange the concentrations of metals in the following series: Ca > Na > Mg > K > Fe > Mn > Al > Zn. The values of the coefficient of variation (CV) show that Fe and Ca concentrations exhibit the greatest variability, for which the CV is, respectively, 210% and 190%. Ca and Mg concentrations, however, are characterized by the lowest variability, for which the CV values are 36% and 37%, respectively.
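The coefficient of variation expresses the standard deviation as a percentage of the mean, which makes the variability of metals with very different concentration levels comparable. A minimal sketch, with a hypothetical function name and made-up values that are not data from the study (the use of the sample standard deviation is an assumption):

```python
import statistics

def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Made-up concentrations (mg dm^-3), not data from the study.
stable = [10.0, 11.0, 9.0, 10.0]
erratic = [0.1, 0.2, 5.0, 0.1]
print(coefficient_of_variation(stable), coefficient_of_variation(erratic))
```

A CV above 100%, as in the second series, typically reflects a skewed distribution in which a few high values dominate the spread.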

3.3. RFR Modeling Results

The models created from the training data were assessed by comparing the metal concentrations calculated for the test data with the measured concentrations (Figure 6). The best agreement between modeled and measured results was obtained for manganese (Mn), for which the Pearson correlation coefficient was r = 0.96. Slightly lower correlations were obtained for potassium (K; r = 0.91), aluminum (Al; r = 0.86), magnesium (Mg; r = 0.84), and sodium (Na; r = 0.82). For these elements, the level of significance is p < 0.001. Agreement was weaker for iron (Fe): the correlation coefficient of r = 0.44, with a significance level of p = 0.03, indicates that the model is of fairly low quality. The RFR models generated for calcium and zinc, in turn, are not of sufficient quality to be used in modeling spatial variability. It should be emphasized that in the analysis, all topographic and environmental parameters were determined for a resolution of 5 m. The research conducted by Wu et al. [31] indicates that the optimal resolution of topographic parameters for determining the concentrations of heavy metals in the soil may vary not only for individual parameters but also across different metals. For example, the optimal spatial resolution they determined for terrain aspect ranged from 90 m for Zn to 3000 m for Cu and Pb. For zinc, the optimal resolution varied from 90 m for aspect to 250 and 1500 m for valley depth, 1500 m for TWI, and 3000 m for terrain slope.
In addition, the quality of the results obtained in the modeling process may be influenced by both the number of variables and the nature of the terrain features they describe. The study uses 15 parameters describing the topography of the terrain; the remaining predictors are the top layers of the soil cover, the date of sampling, and the forest habitat type. These features provide a very good description of the spatial distributions of K, Al, Mg, and Na. The results for Fe might be improved, and the modeling of Ca and Zn made possible, by supplementing the range of predictors with variables describing geology, soil properties, or groundwater table depths [66]. However, increasing the range of predictors may be challenging due to the lack of detailed information on their spatial variability. The effect of groundwater depth on dissolved metal concentrations was not considered in this study because modeling the spatial distribution of concentrations, which is the final output of this work, would require knowledge of the groundwater depth for each raster cell.

3.4. Importance of Predictors

The random forests algorithm allows for determining the contribution of individual parameters describing the terrain to the results obtained from the model. As shown in Figure 7, for Ca and Zn, it was not possible to identify topographic parameters that would allow the spatial variability of these metals to be modeled. On the other hand, for other metals, the variability of concentrations is dependent to varying degrees on various parameters. In the case of Al, the greatest impact on the concentration is caused by the profile curvature and the maximum slope in the watershed. The profile curvature influences the erosion and sedimentation processes taking place on the slope [67] and also determines the moisture conditions of the topsoil. Studies conducted in Brazil have shown lower aluminum contents on concave slopes than on convex ones [68].
In the case of Fe, the influencing factors are seasonality and the average slope of the contributing area. Strong seasonal variability of Fe concentrations in groundwater, unlike other metals, was also observed in earlier studies [69]. This variability can be explained by the influence of the high variability of redox dynamics in soils related to oxygen availability and changes in groundwater depth, as well as by seasonal changes in anaerobic microbial activity [70,71].
The second metal whose concentration in groundwater is directly related to redox changes is manganese. Its solubility, and consequently its concentration in groundwater, changes depending on the oxidation state [72]. However, analyses of data available for the WNP area indicate that for Mn, the most important parameters are the multiscale TPI and TPI.
For potassium, these factors are the general curvature, the SAGA WI, and the TPI landform. As in the case of aluminum, the studies conducted by dos Santos [68] showed a relationship between K content and terrain curvature. However, in this case the opposite relationship was observed, i.e., a higher potassium content occurred on concave slopes.
In the case of Mg, the strongest impact is exerted by the planar curvature and the multiscale TPI. In the case of sodium, the average slope of the catchment and the fuzzy landform classification are key (Figure 7). These results point to the influence of parameters describing water infiltration not only in the immediate vicinity of the sampling site but also in the contributing area. Sodium concentrations are reported to be strongly influenced by the processes of percolation and dissolution of salts contained in the soil [73].
As presented in Figure 8, the ranking of predictors indicates that the strongest influence on the predicted concentrations of metals in groundwater is exhibited by profile curvature, planar curvature, the multiscale TPI, and then the forest habitat type (FHT). It should be noted that the FHT is most commonly associated with local moisture conditions. Di Stefano et al. [67] indicate that the relationship between different forms of terrain curvature and soil chemical properties, and consequently groundwater chemistry, can be linked to the dynamics of erosion and sedimentation processes and the distribution of vegetation in a given area. Conversely, curvature classifications, soil composition, and the season of sampling have the smallest generalized impact on the modeling results. The low impact of soils may result from the low diversity of the surface layers of the soil cover of the analyzed areas; in 11 of the 15 places where groundwater was sampled, highly permeable materials (loose sands and slightly loamy sands) predominated. In addition, studies conducted by Kendall et al. [74] and Walker et al. [75] confirm that the spatial variability of groundwater chemistry is much greater than its temporal variability.

3.5. Result Maps

The models obtained during the analysis, which depict the relationships between metal concentrations and indicators describing the terrain, were used to create maps of the spatial variability of Al, Fe, K, Mg, Mn, and Na concentrations for the forested areas of the Wielkopolska National Park (Figure A1 and Figure A2). The maps were created at a resolution of 5 m, for areas in which the ranges of the predictors were consistent with the ranges used during random forest model development, as shown in Figure 2. This restriction stems from the limitations of RF when it is applied to data outside the range used for training: results obtained by extrapolation may be completely wrong in such cases [76]. The maps allow the spatial distribution of metal concentrations in groundwater to be assessed in areas where detailed field studies are very difficult. This applies, for example, to particularly valuable protected natural areas in a national park, access to which may be difficult or impossible due to legal restrictions.
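Restricting the maps to cells whose predictor values fall within the training ranges can be expressed as a simple mask. The sketch below covers numeric predictors only; the function names, the two example predictors, and their values are hypothetical, and categorical predictors such as FHT would instead need a check that the class occurred in the training data.

```python
def predictor_ranges(training_rows):
    """Minimum and maximum of each predictor over the training rows."""
    columns = list(zip(*training_rows))
    return [(min(col), max(col)) for col in columns]

def within_training_range(cell, ranges):
    """True if every predictor value of the cell lies inside its range."""
    return all(lo <= value <= hi for value, (lo, hi) in zip(cell, ranges))

# Hypothetical predictors per row: slope (%), TPI.
training = [(1.6, -0.4), (12.0, 0.1), (25.9, 0.8)]
ranges = predictor_ranges(training)
print(within_training_range((10.0, 0.0), ranges))   # True
print(within_training_range((40.0, 0.0), ranges))   # False
```

Cells that fail the check are left unmapped rather than predicted, because a tree-based model returns values from its nearest training leaf and cannot extrapolate a trend beyond the observed range.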

4. Conclusions

This study assesses the possibility of using random forest regression (RFR) techniques to model the spatial variability of concentrations of aluminum, calcium, iron, potassium, magnesium, manganese, sodium, and zinc in groundwater. The results of the analysis of data from samples collected in forested areas of the Wielkopolska National Park indicate that for the adopted data resolution of 5 m, RFR allows for modeling the variability of Al, Fe, K, Mg, Mn, and Na concentrations in the groundwater. However, the model did not detect a relationship between the adopted predictors and Ca and Zn concentrations. Among the 18 indicators describing the variability of the terrain morphology and environmental variables, the most important factors in the creation of models are local parameters: profile and planar curvature (expressed in absolute values) and the multiscale TPI, as well as the FHT. In contrast, curvature classifications, soil composition, and sampling season have the least impact.
It should be noted that all sampling points are located in forested, protected areas of the National Park, which minimizes the risk of the occurrence of point sources of metal contamination. The results obtained indicate that modeling the spatial variability of metal concentrations in groundwater provides information about the state of pollution of these waters in the studied areas. The maps of the spatial distribution of metal concentrations in groundwater prepared as the final result of this work can be a valuable source of information, allowing protective measures to be planned in the Wielkopolska National Park.

Funding

This publication was financed by the Polish Minister of Science and Higher Education as part of the Strategy of the Poznan University of Life Sciences for 2024–2026 in the field of improving scientific research and development work in priority research areas.

Data Availability Statement

This article used data provided by the Wielkopolska National Park: Woda w środowisku przyrodniczym. Portal informacyjno-edukacyjny Wielkopolskiego Parku Narodowego (Water in the natural environment. Information and educational portal of the Wielkopolska National Park), https://ebd.web.amu.edu.pl (accessed on 10 December 2024).

Conflicts of Interest

The author declares no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A

Figure A1. Predicted concentrations of Al, Fe, and K in the groundwater.
Figure A2. Predicted concentrations of Mn, Na, and Mg in the groundwater.

References

  1. Gleeson, T.; Befus, K.M.; Jasechko, S.; Luijendijk, E.; Cardenas, M.B. The Global Volume and Distribution of Modern Groundwater. Nat. Geosci. 2016, 9, 161–167.
  2. Alagha, J.S.; Said, M.A.M.; Mogheir, Y. Modeling of Nitrate Concentration in Groundwater Using Artificial Intelligence Approach—A Case Study of Gaza Coastal Aquifer. Environ. Monit. Assess 2014, 186, 35–45.
  3. Luczaj, J. Groundwater Quantity and Quality. Resources 2016, 5, 10.
  4. Babiker, I.S.; Mohamed, M.A.A.; Hiyama, T. Assessing Groundwater Quality Using GIS. Water Resour. Manag. 2007, 21, 699–715.
  5. Singha, K.; Navarre-Sitchler, A. The Importance of Groundwater in Critical Zone Science. Groundwater 2022, 60, 27–34.
  6. Kiewiet, L.; Von Freyberg, J.; Van Meerveld, H.J. Spatiotemporal Variability in Hydrochemistry of Shallow Groundwater in a Small Pre-alpine Catchment: The Importance of Landscape Elements. Hydrol. Process. 2019, 33, 2502–2522.
  7. Harter, T. Groundwater Quality and Groundwater Pollution; University of California, Agriculture and Natural Resources: St. Davis, CA, USA, 2003; ISBN 978-1-60107-259-7.
  8. Schilling, K.E.; Jacobson, P. Spatial Relations of Topography, Lithology and Water Quality in a Large River Floodplain. River Res. Apps. 2012, 28, 1417–1427.
  9. Cirmo, C.P.; McDonnell, J.J. Linking the Hydrologic and Biogeochemical Controls of Nitrogen Transport in Near-Stream Zones of Temperate-Forested Catchments: A Review. J. Hydrol. 1997, 199, 88–120.
  10. Lidman, F.; Boily, Å.; Laudon, H.; Köhler, S.J. From Soil Water to Surface Water—How the Riparian Zone Controls Element Transport from a Boreal Forest to a Stream. Biogeosciences 2017, 14, 3001–3014.
  11. Fisher, R.S.; Mullican, W.F., III. Hydrochemical Evolution of Sodium-Sulfate and Sodium-Chloride Groundwater Beneath the Northern Chihuahuan Desert, Trans-Pecos, Texas, USA. Hydrogeol. J. 1997, 5, 4–16.
  12. Moradpour, S.; Entezari, M.; Ayoubi, S.; Karimi, A.; Naimi, S. Digital Exploration of Selected Heavy Metals Using Random Forest and a Set of Environmental Covariates at the Watershed Scale. J. Hazard. Mater. 2023, 455, 131609.
  13. Zavareh, M.; Maggioni, V.; Zhang, X. Assessing the Efficiency of a Random Forest Regression Model for Estimating Water Quality Indicators. Meteorol. Hydrol. Water Manag. 2024, 11, 52–69.
  14. Motlagh, A.M.; Yang, Z.; Saba, H. Groundwater Quality. Water Environ. Res. 2020, 92, 1649–1658.
  15. Tesoriero, A.J.; Wherry, S.A.; Dupuy, D.I.; Johnson, T.D. Predicting Redox Conditions in Groundwater at a National Scale Using Random Forest Classification. Environ. Sci. Technol. 2024, 58, 5079–5092.
  16. Dankoub, Z.; Ayoubi, S.; Khademi, H.; Lu, S.-G. Spatial Distribution of Magnetic Properties and Selected Heavy Metals in Calcareous Soils as Affected by Land Use in the Isfahan Region, Central Iran. Pedosphere 2012, 22, 33–47.
  17. Chen, S.; Fang, G.; Huang, X.; Zhang, Y. Water Quality Prediction Model of a Water Diversion Project Based on the Improved Artificial Bee Colony–Backpropagation Neural Network. Water 2018, 10, 806.
  18. Jadhav, M.S.; Khare, K.C.; Warke, A.S. Water Quality Prediction of Gangapur Reservoir (India) Using LS-SVM and Genetic Programming. Lakes Reserv. 2015, 20, 275–284.
  19. Srivastava, P.K.; Gupta, M.; Mukherjee, S. Mapping Spatial Distribution of Pollutants in Groundwater of a Tropical Area of India Using Remote Sensing and GIS. Appl. Geomat. 2012, 4, 21–32.
  20. Haggerty, R.; Sun, J.; Yu, H.; Li, Y. Application of Machine Learning in Groundwater Quality Modeling—A Comprehensive Review. Water Res. 2023, 233, 119745.
  21. Biau, G.; Scornet, E. A Random Forest Guided Tour. Test 2016, 25, 197–227.
  22. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  23. Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J. Random Forests for Classification in Ecology. Ecology 2007, 88, 2783–2792.
  24. Wang, B.; Hipsey, M.R.; Ahmed, S.; Oldham, C. The Impact of Landscape Characteristics on Groundwater Dissolved Organic Nitrogen: Insights from Machine Learning Methods and Sensitivity Analysis. Water Resour. Res. 2018, 54, 4785–4804.
  25. Prasad, A.M.; Iverson, L.R.; Liaw, A. Newer Classification and Regression Tree Techniques: Bagging and Random Forests for Ecological Prediction. Ecosystems 2006, 9, 181–199.
  26. Strobl, C.; Boulesteix, A.-L.; Zeileis, A.; Hothorn, T. Bias in Random Forest Variable Importance Measures: Illustrations, Sources and a Solution. BMC Bioinform. 2007, 8, 25.
  27. Wheeler, D.C.; Nolan, B.T.; Flory, A.R.; DellaValle, C.T.; Ward, M.H. Modeling Groundwater Nitrate Concentrations in Private Wells in Iowa. Sci. Total Environ. 2015, 536, 481–488.
  28. Ouedraogo, I.; Defourny, P.; Vanclooster, M. Application of Random Forest Regression and Comparison of Its Performance to Multiple Linear Regression in Modeling Groundwater Nitrate Concentration at the African Continent Scale. Hydrogeol. J. 2019, 27, 1081–1098.
  29. Knoll, L.; Breuer, L.; Bach, M. Large Scale Prediction of Groundwater Nitrate Concentrations from Spatial Data Using Machine Learning. Sci. Total Environ. 2019, 668, 1317–1327.
  30. Reinecke, R.; Wachholz, A.; Mehl, S.; Foglia, L.; Niemann, C.; Döll, P. Importance of Spatial Resolution in Global Groundwater Modeling. Groundwater 2020, 58, 363–376.
  31. Wu, Y.; Zhou, L.; Meng, Y.; Lin, Q.; Fei, Y. Influential Topographic Factor Identification of Soil Heavy Metals Using GeoDetector: The Effects of DEM Resolution and Pollution Sources. Remote Sens. 2023, 15, 4067.
  32. Wilson, S.R.; Close, M.E.; Abraham, P.; Sarris, T.S.; Banasiak, L.; Stenger, R.; Hadfield, J. Achieving Unbiased Predictions of National-Scale Groundwater Redox Conditions via Data Oversampling and Statistical Learning. Sci. Total Environ. 2020, 705, 135877.
  33. Khan, Q.; Liaqat, M.U.; Mohamed, M.M. A Comparative Assessment of Modeling Groundwater Vulnerability Using DRASTIC Method from GIS and a Novel Classification Method Using Machine Learning Classifiers. Geocarto Int. 2022, 37, 5832–5850.
  34. Kottek, M.; Grieser, J.; Beck, C.; Rudolf, B.; Rubel, F. World Map of the Köppen-Geiger Climate Classification Updated. Meteorol. Z. 2006, 15, 259–263.
  35. Meteorological Yearbook 2020. Instytut Meteorologii i Gospodarki Wodnej—Państwowy Instytut Badawczy. Available online: https://danepubliczne.imgw.pl/data/dane_pomiarowo_obserwacyjne/Roczniki/Rocznik%20meteorologiczny/Rocznik%20Meteorologiczny%202020.pdf (accessed on 20 May 2024).
  36. Fiedler, M.; Zydroń, A. Changes in Groundwater Levels in the Wielkopolski National Park. Sylwan 2024, 168, 184–197.
  37. Conrad, O.; Bechtel, B.; Bock, M.; Dietrich, H.; Fischer, E.; Gerlitz, L.; Wehberg, J.; Wichmann, V.; Böhner, J. System for Automated Geoscientific Analyses (SAGA) v. 2.1.4. Geosci. Model Dev. 2015, 8, 1991–2007.
  38. Zhang, X.; Jiao, J.J.; Guo, W. How Does Topography Control Topography-Driven Groundwater Flow? Geophys. Res. Lett. 2022, 49, e2022GL101005.
  39. Goderniaux, P.; Davy, P.; Bresciani, E.; De Dreuzy, J.; Le Borgne, T. Partitioning a Regional Groundwater Flow System into Shallow Local and Deep Regional Flow Compartments. Water Resour. Res. 2013, 49, 2274–2286.
  40. De Graaf, I.E.M.; Gleeson, T.; Van Beek, L.P.H.; Sutanudjaja, E.H.; Bierkens, M.F.P. Environmental Flow Limits to Global Groundwater Pumping. Nature 2019, 574, 90–94.
  41. Cardenas, M.B.; Jiang, X. Groundwater Flow, Transport, and Residence Times through Topography-driven Basins with Exponentially Decreasing Permeability and Porosity. Water Resour. Res. 2010, 46, 2010WR009370.
  42. Benjmel, K.; Amraoui, F.; Boutaleb, S.; Ouchchen, M.; Tahiri, A.; Touab, A. Mapping of Groundwater Potential Zones in Crystalline Terrain Using Remote Sensing, GIS Techniques, and Multicriteria Data Analysis (Case of the Ighrem Region, Western Anti-Atlas, Morocco). Water 2020, 12, 471.
  43. Apogba, J.N.; Anornu, G.K.; Koon, A.B.; Dekongmen, B.W.; Sunkari, E.D.; Fynn, O.F.; Kpiebaya, P. Application of Machine Learning Techniques to Predict Groundwater Quality in the Nabogo Basin, Northern Ghana. Heliyon 2024, 10, e28527.
  44. Raaflaub, L.D.; Collins, M.J. The Effect of Error in Gridded Digital Elevation Models on the Estimation of Topographic Parameters. Environ. Model. Softw. 2006, 21, 710–732.
  45. Walker, J.P.; Willgoose, G.R. On the Effect of Digital Elevation Model Accuracy on Hydrology and Geomorphology. Water Resour. Res. 1999, 35, 2259–2268.
  46. Zhou, Q.; Liu, X. Analysis of Errors of Derived Slope and Aspect Related to DEM Data Properties. Comput. Geosci. 2004, 30, 369–378.
  47. Tang, Y.; Zhang, D.; Xu, H.; Dai, L.; Xu, Q.; Zhang, Z.; Jing, X. The Role of Topography Feedbacks in Enrichment of Heavy Metal Elements in Terrace Type Region. Front. Environ. Sci. 2024, 12, 1291917.
  48. Fox, D.M.; Bryan, R.B.; Price, A.G. The Influence of Slope Angle on Final Infiltration Rate for Interrill Conditions. Geoderma 1997, 80, 181–194.
  49. Bogaart, P.W.; Troch, P.A. Curvature Distribution within Hillslopes and Catchments and Its Effect on the Hydrological Response. Hydrol. Earth Syst. Sci. 2006, 10, 925–936.
  50. Zevenbergen, L.W.; Thorne, C.R. Quantitative Analysis of Land Surface Topography. Earth Surf. Process. Landf. 1987, 12, 47–56.
  51. Weiss, A.D. Topographic Position and Landforms Analysis. In Proceedings of the Poster Presentation, ESRI Users Conference, San Diego, CA, USA, 9–13 July 2001.
  52. Guisan, A.; Weiss, S.B.; Weiss, A.D. GLM versus CCA Spatial Modeling of Plant Species Distribution. Plant Ecol. 1999, 143, 107–122.
  53. Schmidt, J.; Hewitt, A. Fuzzy Land Element Classification from DTMs Based on Geometry and Terrain Position. Geoderma 2004, 121, 243–256.
  54. Böhner, J.; Selige, T. Spatial Prediction of Soil Attributes Using Terrain Analysis and Climate Regionalization. Gott. Geographische Abh. 2002, 115, 13–28.
  55. Winzeler, H.E.; Owens, P.R.; Read, Q.D.; Libohova, Z.; Ashworth, A.; Sauer, T. Topographic Wetness Index as a Proxy for Soil Moisture in a Hillslope Catena: Flow Algorithms and Map Generalization. Land 2022, 11, 2018.
  56. Riihimäki, H.; Kemppinen, J.; Kopecký, M.; Luoto, M. Topographic Wetness Index as a Proxy for Soil Moisture: The Importance of Flow-Routing Algorithm and Grid Resolution. Water Resour. Res. 2021, 57, e2021WR029871.
  57. Lyu, M.; Pang, Z.; Yin, L.; Zhang, J.; Huang, T.; Yang, S.; Li, Z.; Wang, X.; Gulbostan, T. The Control of Groundwater Flow Systems and Geochemical Processes on Groundwater Chemistry: A Case Study in Wushenzhao Basin, NW China. Water 2019, 11, 790.
  58. Wielkopolska National Park Water in the Natural Environment of the Wielkopolska National Park. Geoportal. Available online: http://77.65.27.118:8080/project (accessed on 20 January 2024).
  59. Święcicki, Z. (Ed.) Instrukcja Urządzania Lasu. Cz. 2: Instrukcja Wyróżniania i Kartowania w Lasach Państwowych Typów Siedliskowych Lasu Oraz Zbiorowisk Roślinnych; Centrum Informacyjne Lasów Państwowych, Na zlec. Dyrekcji Generalnej Lasów Państwowych: Warszawa, Poland, 2012; ISBN 978-83-61633-66-2. (In Polish)
  60. Wieruszewski, M.; Mydlarz, K. The Influence of Habitat Conditions on the Properties of Pinewood. Forests 2021, 12, 1311.
  61. Nosetto, M.D.; Jobbágy, E.G.; Tóth, T.; Jackson, R.B. Regional Patterns and Controls of Ecosystem Salinization with Grassland Afforestation along a Rainfall Gradient. Glob. Biogeochem. Cycles 2008, 22, 2007GB003000.
  62. Gribovszki, Z.; Kalicz, P.; Balog, K.; Szabó, A.; Tóth, T.; Csáfordi, P.; Metwaly, M.; Szalai, S. Groundwater Uptake of Different Surface Cover and Its Consequences in Great Hungarian Plain. Ecol. Process. 2017, 6, 39.
  63. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2024.
  64. Breiman, L. Statistical Modeling: The Two Cultures (with Comments and a Rejoinder by the Author). Statist. Sci. 2001, 16, 199–231.
  65. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer Series in Statistics; Springer: New York, NY, USA, 2009; ISBN 978-0-387-84857-0.
  66. Tesoriero, A.J.; Gronberg, J.A.; Juckem, P.F.; Miller, M.P.; Austin, B.P. Predicting Redox-sensitive Contaminant Concentrations in Groundwater Using Random Forest Classification. Water Resour. Res. 2017, 53, 7316–7331.
  67. Stefano, C.D.; Ferro, V.; Porto, P.; Tusa, G. Slope Curvature Influence on Soil Erosion and Deposition Processes. Water Resour. Res. 2000, 36, 607–617.
  68. Santos, G.L.D.; Pereira, M.G.; Lima, S.S.D.; Ceddia, M.B.; Mendonça, V.M.M.; Delgado, R.C. Landform Curvature and Its Effect on the Spatial Variability of Soil Attributes, Pinheiral—RJ/BR. Cerne 2016, 22, 431–438.
  69. Szczucińska, A.M.; Siepak, M.; Zioła-Frankowska, A.; Marciniak, M. Seasonal and Spatial Changes of Metal Concentrations in Groundwater Outflows from Porous Sediments in the Gryżyna-Grabin Tunnel Valley in Western Poland. Environ. Earth Sci. 2010, 61, 921–930.
  70. Conrad, S.; Löfgren, S.; Bauer, S.; Ingri, J. Seasonal Variations of Redox State in Hemiboreal Soils Indicated by Changes of δ56 Fe, Sulfate, and Nitrate in Headwater Streams. ACS Earth Space Chem. 2019, 3, 2816–2823.
  71. Ekström, S.M.; Regnell, O.; Reader, H.E.; Nilsson, P.A.; Löfgren, S.; Kritzberg, E.S. Increasing Concentrations of Iron in Surface Waters as a Consequence of Reducing Conditions in the Catchment Area. JGR Biogeosciences 2016, 121, 479–493.
  72. Hamer, K.; Gudenschwager, I.; Pichler, T. Manganese (Mn) Concentrations and the Mn-Fe Relationship in Shallow Groundwater: Implications for Groundwater Monitoring. Soil Syst. 2020, 4, 49.
  73. El Ghandour, M.F.M.; Khalil, J.B.; Atta, S.A. Distribution of Sodium and Potassium in the Groundwater of the Nile Delta Region (Egypt). CATENA 1983, 10, 175–187.
  74. Kendall, C.; McDonnell, J.J.; Gu, W. A Look inside ‘Black Box’ Hydrograph Separation Models: A Study at the Hydrohill Catchment. Hydrol. Process. 2001, 15, 1877–1902.
  75. Walker, J.F.; Hunt, R.J.; Bullen, T.D.; Krabbenhoft, D.P.; Kendall, C. Variability of Isotope and Major Ion Chemistry in the Allequash Basin, Wisconsin. Groundwater 2003, 41, 883–894.
  76. Hengl, T.; Nussbaum, M.; Wright, M.N.; Heuvelink, G.B.M.; Gräler, B. Random Forest as a Generic Framework for Predictive Modeling of Spatial and Spatio-Temporal Variables. PeerJ 2018, 6, e5518.
Figure 1. Location and DEM of the Wielkopolski Park Narodowy.
Figure 2. Workflow for analysis of a single metal, using aluminum as an example. Part A is common to all metals.
Figure 3. Topography of WPN.
Figure 4. Topography of WPN. TPI Landforms: 1—streams, 2—midslope drainages, 3—upland drainages, 4—valleys, 5—plains, 6—open slopes, 7—upper slopes, 8—local ridges, 9—midslope ridges, 10—high ridges. Curvature class: 0—concave, 1—flat, 2—convex.
Figure 5. Forest habitat types (FHT) and soil types. FHT: BMsw—fresh mixed coniferous forest, Bsw—fresh coniferous forest, LMsw—fresh mixed broadleaved forest, LMw—moist mixed broadleaved forest, LMb—swamp mixed broadleaved forest, Lsw—fresh broadleaved forest, Lw—moist broadleaved forest, Lł—riparian forest, Ol—alder and alder–ash forest; soil: gs—medium clay, gp—sandy loam, pg—clay sand, pl—loose sand, ps—light clay sand, płp—sandy silt, m—alluvia, tn—turf.
Figure 6. Correlation between measured and predicted values of metal concentrations in the groundwater.
Figure 7. Plot showing the importance of variables.
Figure 8. Summarized importance of variables.
Table 1. Topography of sampling points.
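| Sampling Point | General Curvature | Profile Curvature | Planar Curvature | TPI | Multiscale TPI | TPI Landform Class | Profile Curv. Class | Tangential Curv. Class |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.0139 | 0.0093 | −0.0155 | −3.92 | −2.24 | 2 | 1 | 2 |
| 2 | −0.0036 | −0.0021 | 0.0042 | −1.09 | −0.74 | 4 | 0 | 1 |
| 4 | 0.0031 | 0.0009 | 0.0363 | −0.33 | 0.29 | 5 | 2 | 2 |
| 6 | −0.003 | 0.0016 | −0.0406 | −0.15 | −0.15 | 4 | 0 | 1 |
| 7 | 0.0014 | 0.0008 | −0.0155 | −0.32 | 0.1 | 3 | 1 | 2 |
| 8 | 0.0043 | 0.0004 | 0.0567 | 0.41 | 0.49 | 5 | 2 | 2 |
| 9 | −0.0076 | −0.0058 | 0.0324 | −1.37 | −0.95 | 1 | 0 | 1 |
| 10 | 0.0074 | 0.0024 | 0.0197 | −2.76 | 0.77 | 1 | 2 | 2 |
| 11 | −0.0014 | −0.0006 | −0.0002 | 1.64 | 1.79 | 8 | 1 | 2 |
| 12 | 0.0076 | 0.0007 | 0.0284 | −3.81 | 0.6 | 0 | 2 | 2 |
| 13 | −0.0089 | 0.0038 | −0.073 | −0.53 | −2.38 | 3 | 2 | 0 |
| 14 | −0.0035 | −0.0028 | 0.0126 | 0.22 | −0.18 | 4 | 0 | 2 |
| 15 | 0.0015 | −0.0006 | 0.0422 | −1.24 | −0.05 | 1 | 2 | 2 |
| 16 | −0.0039 | −0.0015 | −0.0209 | −1.83 | −0.84 | 1 | 0 | 2 |
| 17 | −0.0096 | −0.0048 | −0.0062 | −2.07 | −3.19 | 1 | 0 | 1 |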
Table 2. Topography of point catchments.
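| Sampling Point | Area (m²) | Max Slope (%) | Mean Slope (%) | Slope SD (%) | Average Aspect (°) | Mean SAGA WI | Forest Habitat Type | Soil Type |
|---|---|---|---|---|---|---|---|---|
| 1 | 5350 | 37.5 | 16.4 | 7.7 | 121 | 5.25 | 2 | 1 |
| 2 | 14,125 | 39.5 | 15.3 | 7.8 | 196 | 5.85 | 2 | 3 |
| 4 | 8850 | 18.7 | 4.9 | 3.7 | 178 | 8.5 | 1 | 1 |
| 6 | 8850 | 11.3 | 4.1 | 2.3 | 172 | 8.43 | 5 | 5 |
| 7 | 8850 | 19.6 | 3.7 | 4.2 | 223 | 9.39 | 5 | 2 |
| 8 | 57,275 | 51.6 | 0.7 | 7 | 3 | 11.19 | 3 | 2 |
| 9 | 6550 | 18.5 | 6.8 | 4.6 | 188 | 7.37 | 4 | 3 |
| 10 | 6850 | 41.5 | 13.1 | 10.8 | 143 | 7.24 | 2 | 2 |
| 11 | 5400 | 18.6 | 6.4 | 4.6 | 179 | 7.65 | 1 | 1 |
| 12 | 47,400 | 51 | 10.2 | 10.3 | 130 | 8.17 | 3 | 2 |
| 13 | 2875 | 19.9 | 8.5 | 4.6 | 208 | 6.39 | 3 | 6 |
| 14 | 12,650 | 14.4 | 4.2 | 2.8 | 265 | 8.29 | 1 | 1 |
| 15 | 47,025 | 23.8 | 7.2 | 5.2 | 243 | 8.13 | 2 | 2 |
| 16 | 8375 | 48.5 | 25.9 | 11.6 | 136 | 5.32 | 1 | 1 |
| 17 | 30,950 | 90 | 20.7 | 19.8 | 149 | 5.99 | 3 | 2 |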
SD—standard deviation; FHT: 1—fresh mixed coniferous forest, 2—fresh mixed broadleaved forest, 3—fresh broadleaved forest, 4—moist broadleaved forest, 5—riparian forest; soil: 1—loose sand, 2—light clay sand, 3—loamy sand, 5—sandy silt, 6—sandy loam.
Table 3. The statistical description of metal concentrations in the groundwater (mg dm−3).
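| Metal | Mean | Median | Min | Max | SD | Skewness | CV (%) |
|---|---|---|---|---|---|---|---|
| Al | 0.034 | 0.021 | 0.005 | 0.208 | 0.043 | 2.77 | 126 |
| Ca | 133 | 129 | 52.6 | 260 | 47.7 | 0.69 | 36 |
| Fe | 0.543 | 0.023 | 0.001 | 7.2 | 1.14 | 3.34 | 210 |
| K | 3.8 | 2.52 | 0.664 | 23.2 | 3.09 | 1.84 | 81 |
| Mg | 17.5 | 17.3 | 7.95 | 38.1 | 6.41 | 0.69 | 37 |
| Mn | 0.31 | 0.19 | 0.002 | 1.61 | 0.36 | 1.85 | 116 |
| Na | 29 | 23.2 | 6.9 | 97.4 | 19.7 | 1.3 | 68 |
| Zn | 0.0063 | 0.0033 | 0.001 | 0.093 | 0.012 | 5.4 | 190 |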
SD—standard deviation; CV—coefficient of variation.
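
The CV column of Table 3 can be checked against the mean and SD columns. As a hedged illustration, assuming that CV is the standard deviation divided by the mean and expressed in percent (the table note does not state the formula), the short Python sketch below recomputes it for four of the metals. The helper name `coefficient_of_variation` is hypothetical.

```python
def coefficient_of_variation(mean: float, sd: float) -> float:
    """CV in percent, assuming CV = SD / mean * 100."""
    return sd / mean * 100.0


# (mean, SD, tabulated CV) taken from Table 3; mean and SD are in mg dm^-3.
table3 = {
    "Al": (0.034, 0.043, 126),
    "Ca": (133.0, 47.7, 36),
    "Fe": (0.543, 1.14, 210),
    "Na": (29.0, 19.7, 68),
}

for metal, (mean, sd, cv_table) in table3.items():
    cv = coefficient_of_variation(mean, sd)
    # The rounded means and SDs reproduce the tabulated CV to within one percentage point.
    print(f"{metal}: computed CV = {cv:.0f}%, tabulated CV = {cv_table}%")
```

A CV above 100%, as for Al and Fe, means that the standard deviation exceeds the mean, which is consistent with the strong positive skewness reported for these metals in the same table.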
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Fiedler, M. Using Random Forest Regression to Model the Spatial Distribution of Concentrations of Selected Metals in Groundwater in Forested Areas of the Wielkopolska National Park, Poland. Forests 2024, 15, 2191. https://doi.org/10.3390/f15122191
