Article

Innovative Approaches to Industrial Odour Monitoring: From Chemical Analysis to Predictive Models

1
RSE-Ricerca sul Sistema Energetico, Via Rubattino 54, 20134 Milan, Italy
2
Department of Earth and Environmental Sciences, University of Milano-Bicocca, Piazza della Scienza 1, 20126 Milan, Italy
3
ACEA Infrastructure S.p.A., Gruppo ACEA S.p.A., Via Vitorchiano 165, 00189 Rome, Italy
*
Author to whom correspondence should be addressed.
Atmosphere 2024, 15(12), 1401; https://doi.org/10.3390/atmos15121401
Submission received: 11 September 2024 / Revised: 15 November 2024 / Accepted: 15 November 2024 / Published: 21 November 2024
(This article belongs to the Special Issue Environmental Odour (2nd Edition))

Abstract

This study evaluated the reliability of an electronic nose in monitoring odour concentration near a wastewater treatment plant and examined the correlation between four sensor readings and odour intensity. The chemical sensors of the electronic nose respond to the following chemical species: VOCs, recorded as two separate values via the PID sensor (VPID) and the EC sensor (VEC), hydrogen sulphide (VH2S), and benzene (VC6H6). Using Random Forest and least squares regression analysis, the study identified VH2S and VC6H6 as key contributors to odour concentration (CcOD). Three Random Forest models (RF0, RF1, RF2), which differ in how the data are split between the training set and the test set, were tested, with RF1 showing superior predictive performance due to its training approach. All models highlighted VH2S and VC6H6 as significant predictors, while VPID and VEC had less influence. A significant correlation between odour concentration and specific chemical sensor readings was found, particularly for VH2S and VC6H6. However, predicting odour concentrations below 1000 ouE/m3 proved challenging. Linear regression further confirmed the importance of VH2S and VC6H6, with a moderate R-squared value of 0.70, explaining 70% of the variability in odour concentration. The study demonstrated the effectiveness of combining Random Forest and least squares regression for robust and interpretable results. Future research should focus on expanding the dataset and incorporating additional variables to enhance model accuracy. The findings underscore the necessity of specific sensor training and standardised procedures for accurate odour monitoring and characterisation.

1. Introduction

Industrial emission gases, a crucial output component of manufacturing processes [1,2], are often rich in odourous substances [3,4], leading to concerns about air quality due to their atmospheric mobility and potential impact on the environment and receptor population [1,5]. Recent studies have highlighted the connection between atmospheric emissions and the chemical structure of odourous molecules, emphasising the need for effective odour management strategies [6,7]. With a growing public concern about air pollution, odours act as an early indicator of environmental pollution, prompting increased attention to monitoring and control strategies [8,9,10].
The population’s capacity to perceive a specific odour is highly subjective, as the same stimulus can evoke significantly different sensations among different individuals or even within the same individual across various circumstances [11,12], or in the presence of different environmental factors, such as the presence of high humidity or high CO2 concentration [13,14,15]. It is crucial to bear in mind that olfactory discomfort emerges from multiple factors: odour does not directly correspond to the odour-causing molecule, as it is not an inherent feature of the molecule itself. Instead, it corresponds to a sensation triggered by the substance once interpreted by the olfactory system [6,16]. Even when not toxic, offensive odour compounds can induce symptoms such as irritation of the respiratory tract, chest tightness, palpitations, drowsiness, and mood swings [17,18]. Furthermore, certain malodourous chemicals, like ethylbenzene, toluene, and benzene, can have severe health effects [19].
Due to the complexity of industrial odourous emissions-related challenges, adopting a comprehensive approach to olfactory nuisance management is crucial [1,7,20]. Regulation for controlling olfactory nuisances spans various hierarchical levels, from European legislation to local regulations, reflecting the need to address odour emissions at both supranational and local levels [6,21]. Nowadays the European Union mandates control over the emission of odour-causing substances during plant operations to safeguard people’s life quality [22]: European directives require technical standards published by CEN, with UNI EN 13725:2022 being the reference standard for odour determination by dynamic olfactometry [11].
Dynamic olfactometry, as outlined in EN 13725:2022, employs human noses as sensors to assess odours by their impact on trained individuals, who undergo a screening process to determine the concentration of odourants in a gaseous sample [7,11,23]. Olfactometry is widely acknowledged as the most sensitive method for assessing odour quality [24]. However, it has limitations such as time and effort requirements, the need for specialised laboratories separate from sampling sites, the inability for on-site or real-time measurements, high uncertainty levels, potential exposure of panelists to hazardous substances, and a lack of precise implementation guidance, resulting in inconsistent approaches [23]. Furthermore, sensorial analysis is inherently unreproducible [25].
Over the past 15 years, Instrumental Odour Monitoring Systems (IOMS) have gained increasing popularity as tools for assessing odour impact [26,27]. Sensor-based instrument methodologies employ artificial olfactory systems that replicate the capabilities of human smell, allowing for continuous operation directly in ambient air [28,29,30]. Among these systems, e-Noses (electronic olfactory systems) have emerged as the most widely used [26,27,31]: they are devices with chemical sensors and a pattern recognition system used to detect and identify various odours [30,32]. It is crucial to clarify that e-Noses do not conduct a chemical analysis of the analysed mixture but instead provide an olfactory fingerprint [33,34]. Despite promising results, practical applications of e-Noses in real-life scenarios remain limited due to:
(1)
Technical challenges, such as sensor insensitivity or lack of selectivity [35,36] and susceptibility to interference from temperature and humidity [27,37], make it difficult to establish a direct correlation between the chemical fingerprint and human perception of odour.
(2)
Standardisation, which represents a significant barrier to the widespread adoption of e-Noses on an industrial scale [31,38].
(3)
The total uncertainty of the measured data is not defined in terms of reproducibility and reliability, which are necessary parameters for making instrument comparisons [39].
Nevertheless, the use of IOMS for continuous olfactory nuisance monitoring has proved particularly beneficial in industrial process control, enabling prompt intervention in plant management by activating operational measures when critical odour thresholds are exceeded [40].
The theoretical conversion from chemical concentration to odour concentration could be feasible, but the mechanism of human odour recognition is not yet fully understood [41]. Hence, experimental investigations are required. The simplest method involves directly using the concentration of a single substance, which can work well for individual substances or constant composition groups [42,43]. In this study, the experimental method for converting chemical concentration to odour concentration will be addressed.
After the e-Nose training, a mathematical relationship with an odour level can be derived; it is based on the recorded sensor concentration values and facilitates the estimation of odour concentration. However, it is important to clarify that this mathematical relationship is not absolute but applies only to the specific type of odour being detected [6,27]. Internationally, several standard methods have been developed to implement this knowledge, including Draft P2520.1 [44] and TC264 WG41 on electronic sensors for odorant monitoring [45]. The first method focuses on creating chemically standardised mixtures, while the second method relies on comparing electronic noses with reference values obtained through dynamic olfactometry.
This study aimed to evaluate the reliability of data collected by an e-Nose near a wastewater treatment plant and to explore the correlation between the chemical concentration sensor readings of the electronic nose and odour concentration values. A wastewater treatment plant is classified as a passive diffuse odourous source, where certain surfaces or openings release unpleasant odours that disperse at an unpredictable airflow rate [46]. Area sources release emissions from large surfaces and are categorised as active or passive. Active sources have outward airflow, like biofilters or aerated heaps, while passive sources, such as landfill surfaces and wastewater tanks, rely on processes like equilibrium or convection for mass transfer to the air [47]. The sedimentation phase of the treatment process is the most odourous, with an odour emission factor (OEF) of 190,000 ouE/m3 of treated effluent [47,48]. This phase generates odourous substances due to anaerobic and anoxic conditions. The primary odourous compounds produced during wastewater treatment include hydrogen sulphide, ammonia, organic sulfur compounds, reduced organic sulfur compounds, and amines [49,50]. Sulfur compounds, mainly hydrogen sulphide, are released during the anaerobic breakdown of organic matter, while ammonia is produced from protein decomposition [6]. Indeed, it is possible to identify alterations in air quality attributable to a plant by analysing measurements taken in ambient air, even at a distance from the facility.
Random Forest models can assess how different chemical concentrations can predict odour data, which is crucial for odour management in industries. These models quantify the dependence of odour data on chemical concentrations and identify the importance of each variable. Selecting the most relevant chemical species for analysis is essential to accurately reflect real-world conditions, as some species significantly influence odour concentration. Additionally, the linear correlation model utilises theoretical formulas and linear regression to determine if the key chemical variables align with the electronic nose’s data. Collectively, this dual approach strengthens the electronic nose’s reliability for continuous odour monitoring, helping address odour-related issues and improve community relations. This approach serves to validate odour concentration measurements conducted during environmental monitoring using an e-Nose, aiding in the development of an accurate characterisation strategy. Such activities are preparatory steps for future field validation with the assistance of dynamic olfactometry.

2. Materials and Methods

2.1. Site and Equipment

The study analysed data from the ETL3000 e-Nose located at a sewage treatment plant dedicated to wastewater purification (Figure 1). The e-Nose is placed at the red spot near the sedimentation basins, which are highlighted with green points. The plant is located near a residential area, hence the need for precise monitoring to identify and potentially prevent spikes in odour concentration that could affect that area in the future.
The electronic nose is the ETL3000 (ORION, Veggiano (PD), Italy), which represents an air quality monitoring instrument for indicative measurement (as outlined in European Directive 2008/50/EC [51]). The electronic nose has been trained following the guidelines proposed by the Lombardy Region [52]: reference odours, which are representative of the emissions from the specific activities being studied, were chosen. The sensors of the ETL3000 were calibrated using controlled concentrations of the selected reference odours. Calibration entailed generating response curves for each sensor by exposing them to known concentrations of odours.
During the training phase, the ETL3000 was exposed multiple times to the reference odours under controlled laboratory conditions. Sensor responses were recorded, establishing a comprehensive database of odour patterns. The accuracy of the ETL3000 was validated by testing it with the reference odours and comparing its identifications with those obtained from standard olfactometric methods. Cross-validation techniques were employed to ensure reliability and precision in odour detection and quantification. Subsequently, the ETL3000 was deployed in real emission scenarios. Field testing involved comparing its readings with traditional olfactometric measurements to fine-tune its sensitivity and accuracy. Adjustments were made as necessary to enhance its performance.
The data collection period spanned from 18 October 2023 to 13 November 2023, a period of 27 days. In addition to measuring odour concentration (ouE/m3), the electronic nose recorded various descriptors of the analysed site. These included data from a weather station capturing physical parameters such as wind speed (m/s), wind direction (deg), temperature (°C), atmospheric pressure (Pa), precipitation (mm), relative humidity (%), and UV index, along with readings from four chemical sensors. These chemical sensors monitored levels of volatile organic compounds (VOCs, in ppm) with a SENS-IT PID sensor and a SENS-IT Electrochemical sensor, hydrogen sulphide (H2S, in ppb) with a SENS-IT Electrochemical sensor, and benzene (C6H6, in ppb) with a SENS-IT Thick Film Metal Oxide Semiconductor sensor, contributing to the overall assessment of odour.
The system features a variety of sensors integrated into modules. These include metal-oxide semiconductors (MOS) that can detect both organic and inorganic molecules, specialised electrochemical sensors (EC), and selective photoionisation detectors (PID). Sensors designed for C6H6 detection (from 0 ppb to 30 ppb) employ Thick Film Metal Oxide Semiconductor (TF-MOS) technology. The sensor’s active surface consists of a specific nanostructured semiconductor metal oxide. Initially, atmospheric oxygen is adsorbed onto the sensor’s surface, leading to charge transfer from the semiconductor to oxygen molecules. Subsequently, a specific gas reacts with the adsorbed oxygen (via Red-Ox reactions), releasing electrons into the semiconductor’s conduction band. By utilising current signals from the sensors during these reactions, the concentration of the specific gas can be directly measured. Additionally, monitoring of H2S (from 0 ppb to 3000 ppb) and VOCs (from 0 ppm to 25 ppm) is carried out by employing traditional Electrochemical Technology (EC). In the EC sensor, the chemical target interacts with the electrode surface and causes a change in electrical properties, such as voltage or current, which can be measured and correlated with the concentration of the target chemical. A photoionisation detector (PID) uses ultraviolet (UV) light to ionise VOC molecules in an air sample. When VOCs enter the ionisation chamber, the UV light knocks electrons off the VOC molecules, creating positive ions and free electrons. These ionised particles generate an electrical current when attracted to electrodes within the chamber. The strength of this current is proportional to the concentration of VOCs, providing real-time measurement. PIDs are commonly used for monitoring air quality in industrial and environmental settings.

2.2. Data Analysis

Odour data are processed every ten minutes using an algorithm that correlates the descriptors of the electronic nose. A preliminary investigation was conducted by building a correlation matrix. The data for this matrix were first smoothed, which reduced variability and improved the quality of the analysis. The matrix was calculated by constructing a linear model for each variable, treating each one as the dependent (or outcome) variable in relation to all other variables in the dataset, which served as independent (or predictor) variables. For these models, the coefficient of determination (R²) was calculated to understand how effectively the models could explain the variance of each individual variable. A deeper investigation was conducted using the R software. The analysis was performed with the Random Forest algorithm [53,54] and then with least squares regression [55].
As G. Biau and E. Scornet report in 2016 in ‘A Random Forest guided tour’ [54], Random Forest (RF) is a supervised learning algorithm that represents an evolution of decision trees. RF was used for a more accurate development of the correlation between the data, as it is a machine learning algorithm used to solve classification problems and predict continuous variables based on a set of predictors. Decision trees are a tree structure where attributes are evaluated at nodes, leading to the final classification in the leaves. The evaluation criterion for attributes is the Information Gain (IG), which measures the variation in information entropy from a previous state to a subsequent one. It is commonly used when deciding which feature (or attribute) to use to split the data into decision tree nodes during the training process. The higher the entropy, the lower the predictability of the event.
RF addresses the overfitting issue by being an ensemble method that generates k trees through bootstrap sampling of the dataset—statistical technique used to estimate a sample distribution without making parametric assumptions about the underlying population—selecting attributes randomly at each split (node) and evaluating Information Gain, described in Equation (1), to obtain indications of model robustness as follows:
$$IG(T, a) = H(T) - H(T \mid a) \qquad (1)$$
where ‘T’ represents the training dataset, ‘a’ is the value of an attribute, and ‘H’ represents information entropy defined in Equation (2):
$$H = -\sum_{k=1}^{N} P_x \log_2 P_x \qquad (2)$$
where ‘Px’ is the probability that event ‘x’ occurs, and ‘k’ represents the number of attributes.
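The two definitions above can be illustrated with a short sketch. It is written in Python for illustration only (the study itself used R), and the toy labels `odour_class` and `h2s_level` are hypothetical values, not data from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a sequence of class labels, as in Equation (2)."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(labels, attribute_values):
    """IG(T, a) = H(T) - H(T|a) for a single attribute, as in Equation (1)."""
    n = len(labels)
    groups = {}
    for label, value in zip(labels, attribute_values):
        groups.setdefault(value, []).append(label)
    # H(T|a): entropy of each subset, weighted by the subset's share of the data.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: an odour class and a binarised H2S reading.
odour_class = ["high", "high", "low", "low"]
h2s_level = ["above", "above", "below", "below"]
print(entropy(odour_class))
print(information_gain(odour_class, h2s_level))
```

In this toy case the attribute separates the two classes perfectly, so the information gain equals the full entropy of the labels.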
RF combines predictions from all the trees, thereby reducing the risk of overfitting. Its hyperparameters—parameters external to machine learning models that affect the training process but are not directly learned from the data—include the number of trees (ntree), the number of randomly sampled variables (mtry), the number of samples for training (sampsize), the minimum size of terminal nodes (nodesize), and the maximum number of final nodes (maxnodes).
For the data analysis, three distinct random forest models were utilised: RF0, RF1, and RF2. Each model exhibits different characteristics regarding the division between the training (sampsize) and test sets, which are discussed in detail in the Results and Discussion section. For the development of all three models, the following parameters were considered: the number of trees (ntree) was set to 2000; the number of variables sampled (mtry) was set to 2; and the minimum node size (nodesize) was set to 1. Additionally, the variable Ncores was set to 8, indicating the number of processors used to expedite the calculations, and set.seed(12345) was called to ensure that random processes yield consistent results, allowing for reproducibility in statistical analyses and machine learning.
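The two sources of randomness controlled by these settings, bootstrap sampling of rows and random selection of mtry predictors at each split, can be sketched as follows. This is a minimal Python illustration, not the R code used in the study; the helper names are hypothetical, and Python's seeded generator does not reproduce the numbers that R's set.seed(12345) would give.

```python
import random

# Predictor names taken from the text; mtry = 2 as reported above.
PREDICTORS = ["VPID", "VEC", "VH2S", "VC6H6"]
MTRY = 2


def bootstrap_indices(n_rows, rng):
    """Draw n_rows row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def candidate_predictors(predictors, mtry, rng):
    """Pick mtry distinct predictors to be evaluated at one split."""
    return rng.sample(predictors, mtry)


# A fixed seed makes the random draws repeatable from run to run.
rng = random.Random(12345)
rows = bootstrap_indices(10, rng)
split_candidates = candidate_predictors(PREDICTORS, MTRY, rng)
# Rows never drawn are "out of bag" for this tree.
out_of_bag = sorted(set(range(10)) - set(rows))
print(rows, split_candidates, out_of_bag)
```

Each tree in the forest would repeat these draws with its own bootstrap sample, which is what decorrelates the trees.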
In the RF method, variables play a significant role in determining the accuracy and effectiveness of the model. RF assigns an importance score to each chemical variable, reflecting its contribution to the model’s ability to make accurate predictions. The impurity-based importance metric provided by the model’s importance attribute offers a comprehensive measure of each variable’s contribution to reducing uncertainty in the predictions. By summing the reduction in node impurity each variable achieves across all trees, the model identifies which variables are the most influential in shaping predictions. This measure is crucial for understanding the underlying structure of the data and prioritising key features. The use of the Random Forest model in odour analysis is particularly effective due to its accuracy and robustness in handling complex data. This approach allows for predicting odour concentrations by simultaneously analysing various chemical substances and identifying those most influential on odour perception. Additionally, Random Forest captures nonlinear interactions between variables and provides insights into the importance of each factor, enhancing the understanding of the elements contributing to odours. Its ability to manage missing data makes it a reliable method for environmental monitoring. The predictor variables used in this study are the measured concentrations of H2S and C6H6 and the two VOC measurements. Further information on RF can be found in the literature [56,57,58].
Since the objective is to find the relationship that best reflects the available data, the most important chemical variables identified through the RF model can be incorporated into the logarithmic model described in the literature and outlined below.
Typically, the odour concentration (CcOD) is evaluated from the chemical concentrations (Ci), as described in Equation (3). A specific coefficient (kc) multiplies the sum of the quotients obtained by dividing each measured chemical concentration (Ci) by the corresponding odour threshold value (OTVi), which is the minimum concentration of an odourous substance in the air required for that compound to be detected [31,59,60]. An additional parameter complementing odour concentration is odour intensity (OI). Intensity measures the perceived strength of an odour, which is influenced by both the odourant and the individual perceiving it [61]. In contrast, concentration quantifies the actual amount of odour present in the air. Although they are often considered interchangeable, concentration represents the objective quantity of the odour, while intensity reflects the subjective perception of its strength [62]. The OI mixture formula is given in Equation (4) and is derived from the Weber–Fechner law [63].
Together, these parameters offer distinct but complementary insights into the olfactory experience. In literature, concentrations corresponding to the olfactory thresholds of many compounds have been experimentally determined. These values are applicable only when referring to pure substances. However, in the presence of mixtures, effects such as independence, additivity, synergy, and antagonism can occur [64]. Determining the proportionality constant (kc) for this calculation typically involves olfactometric measurements and a linear regression analysis, as demonstrated in studies like [43,63].
$$C_{cOD} = k_c \sum_i \frac{C_i}{OTV_i} \qquad (3)$$
$$OI = \log C_{cOD} + 0.5 \qquad (4)$$
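As a worked illustration of Equations (3) and (4), the sketch below computes an odour concentration and the corresponding intensity in Python. The concentrations, the odour threshold values, and kc = 1 are hypothetical placeholders rather than values from the study or the literature, and the logarithm is assumed to be base 10.

```python
import math


def odour_concentration(concentrations, thresholds, k_c):
    """Equation (3): C_cOD = k_c * sum(C_i / OTV_i) over the measured compounds."""
    return k_c * sum(concentrations[name] / thresholds[name] for name in concentrations)


def odour_intensity(c_cod):
    """Equation (4): OI = log(C_cOD) + 0.5, with log taken as base 10 (assumption)."""
    return math.log10(c_cod) + 0.5


# Hypothetical inputs: concentrations and odour thresholds in the same unit (ppb).
measured = {"H2S": 10.0, "C6H6": 20.0}
otv = {"H2S": 0.5, "C6H6": 2000.0}

c_cod = odour_concentration(measured, otv, k_c=1.0)
print(c_cod, odour_intensity(c_cod))
```

With these placeholder numbers almost all of the sum comes from the compound with the low threshold, which is why a compound present at a low concentration can still dominate the odour.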
An additional investigation into the mathematical model was carried out using the least squares method. The confidence intervals (Equation (5)) for the parameter estimates in the linear models were calculated using the following formula:
$$CI = \beta \pm t_{\alpha/2,\,df} \cdot SE(\beta) \qquad (5)$$
Here, β represents the estimated coefficient; tα/2,df is the critical value from Student’s t-distribution for the chosen confidence level (95%) and the degrees of freedom (df = n − k), where ‘n’ is the number of observations and ‘k’ is the number of estimated parameters, including the intercept; SE(β) is the standard error associated with the estimated coefficient. This formula gives the range within which the true parameter values are expected to fall, with a given level of confidence. A detailed discussion of the linear regression model is included in the Supplementary Materials. This approach leverages the linear relationships between the selected variables and the target to gain a more detailed understanding of the factors influencing the model’s output. This allows for the integration of the nonlinear information captured by RF with a more interpretable structure provided by the linear model.
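The interval in Equation (5) reduces to simple arithmetic once the critical value is known. In the Python sketch below the coefficient, the standard error, and the counts of observations and parameters are hypothetical, and the critical value 1.96 is the large-sample approximation for a 95% level rather than a value looked up for a specific df.

```python
def degrees_of_freedom(n_obs, n_params):
    """df = n - k, where k counts the estimated parameters including the intercept."""
    return n_obs - n_params


def confidence_interval(beta, std_error, t_crit):
    """Equation (5): beta +/- t_crit * SE(beta), returned as (lower, upper)."""
    half_width = t_crit * std_error
    return beta - half_width, beta + half_width


# Hypothetical inputs: 3000 observations, intercept plus two predictors.
df = degrees_of_freedom(n_obs=3000, n_params=3)
# 1.96 approximates the 95% critical value when df is large (assumption).
lower, upper = confidence_interval(beta=2.0, std_error=0.5, t_crit=1.96)
print(df, lower, upper)
```

For small df the critical value is larger than 1.96 and should be taken from a t-table or a statistics library.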

3. Results and Discussion

Within the analysis algorithm and in the generated graphs, the various descriptive variables representing the site are denoted as follows: ‘wint’ for wind intensity, ‘wdir’ for wind direction, ‘temp’ for temperature, ‘pres’ for atmospheric pressure, ‘prec’ for precipitation, ‘rhum’ for relative humidity, ‘uvidx’ for UV index, ‘VEC’ for VOCs concentration determined with the EC sensor, ‘VPID’ for VOCs concentration determined with the PID sensor, ‘VH2S’ for H2S concentration, and ‘VC6H6’ for C6H6 concentration. Additionally, the concentration of odourous substances is indicated by ‘CcOD’.
Because the data are experimental, consisting of signals of environmental origin recorded by the electronic nose, a smoothing model is needed to filter out unwanted noise. Smoothing makes it easier to analyse or visualise underlying trends or patterns. This technique is particularly useful when raw data are affected by random fluctuations or irregularities that can hinder interpretation. One of the most common techniques is the so-called “moving average”, which involves calculating the average of values in the data series over a specified time interval. In this case, the preceding five odour concentrations were used as a moving average around the current odour concentration value.
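A moving average of this kind can be sketched in a few lines of Python. The text does not specify the exact window, so the version below assumes one possible reading: each smoothed value is the mean of the current reading and up to five preceding readings. The sample readings are hypothetical.

```python
def trailing_moving_average(values, window=5):
    """Mean of the current value and up to `window` preceding values (assumed window)."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical ten-minute odour readings (ouE/m3) with one isolated spike.
readings = [100.0, 120.0, 5000.0, 110.0, 90.0, 105.0, 95.0]
print(trailing_moving_average(readings))
```

The isolated spike is spread over the following readings instead of appearing as a single extreme point, which is the trade-off of any trailing window.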
The correlation matrix (Table 1) provides a comprehensive summary of the relationships between various environmental variables and odourous concentration ‘CcOD’ within the dataset. The matrix presents the correlation coefficients, which quantify the linear relationships between pairs of variables. Coefficients range from −1 to 1: the value 1 represents a perfect positive relationship, the value 0 indicates the absence of any relationship, and the value −1 signifies a perfect negative relationship.
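For reference, the coefficient described above can be computed as in the following Python sketch of the Pearson correlation. The input series are hypothetical and only show the two extreme cases.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical series: one rises with x, the other falls as x rises.
print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```

The function is undefined when either series is constant, because the denominator is then zero.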
The odourous concentration ‘CcOD’ ranges from 2.2 ouE/m3 to 18,972.2 ouE/m3, with a mean value of 519.76 ouE/m3. The data at the 98th percentile reaches 4868 ouE/m3.
An analysis of the intensity and direction of the wind in relation to the concentration of odour was subsequently carried out.
Table 1 shows that VEC and VPID have no significant relationship with CcOD, as indicated by their lack of correlation with it. The data for VC6H6 exhibit a significant correlation with CcOD, and VH2S shows an even stronger one. The detailed numerical correlations in Table 1 make it easier to interpret the linear relationships among the variables, particularly in relation to the odourous concentration CcOD. A brief analysis of the weather variables ‘temp’, ‘wint’, ‘wdir’, ‘pres’, ‘prec’, ‘rhum’, and ‘uvidx’ shows no significant correlation with the odour concentration. For temperature in particular, this could be due to the limited duration of the dataset (27 days), as there is generally no significant temperature variation during October–November: the mean temperature was 15.7 °C with a standard deviation of 4.5 °C.

3.1. Analysis of Wind Direction and Intensity and Implications for Odour

The wind rose chart in Figure 2 illustrates the distribution of wind frequencies in terms of direction and intensity. The directions are represented by cardinal and intercardinal points: N (North), NE (North-East), E (East), SE (South-East), S (South), SW (South-West), W (West), NW (North-West). The bars extending from the centre of the chart indicate the direction from which the wind is blowing. The length of the bars represents the frequency with which the wind blows from a certain direction. The concentric circles represent frequency percentages, with increments of 5% (5%, 10%, 15%, 20%). The colours of the bars indicate the wind speed (in m/s), according to the legend at the bottom: ‘blue’ for 0 m/s–2 m/s, ‘green’ for 2 m/s–4 m/s, ‘yellow’ for 4 m/s–6 m/s, and ‘red’ for 6 m/s–20.78 m/s. The average wind speed is reported in the chart as “mean = 0.89512 m/s”. The percentage of time the wind is calm, with very low wind speed, is indicated as “calm = 41.1%”. Most of the wind comes from directions varying between N and NE and between S and SE, with a predominance of moderate winds (blue and green) and some instances of stronger winds (yellow and red). Compared to the annual wind rose, this analysis shows a higher frequency of winds coming from the north. This is due to the seasonal presence of the Tramontana wind, which is common during the late autumn period. The N direction seems to have the highest frequency of calm wind. The percentage of time the wind is calm is quite high (41.1%), suggesting that in this dataset there are many periods of very weak or absent wind.
An in-depth analysis is conducted by observing the distribution of odour concentration in relation to wind direction and frequency (Figure 3). Each segment represents the concentration levels of odour as influenced by the wind coming from different directions. The directions are represented by cardinal points: N, NE, E, SE, S, SW, W, and NW. The segments extending from the centre indicate the direction from which the wind is blowing.
The segment colours represent the odour concentration, according to the legend on the right. The length of the segments represents the frequency of wind from each direction. The concentric circles represent frequency percentages, with increments of 5% and 10%.
In Figure 3, the mean odour concentration is reported as “mean = 521.08 ouE/m3”. The percentage of calm conditions is indicated as “calm = 31.9%”. Higher odour concentrations (red and orange segments) are notably present in the N, SE, and S directions, indicating that these directions have higher levels of odour when the wind blows from them. The chart also shows moderate odour concentrations (blue and green segments) spread across various directions. The calm condition percentage (31.9%) suggests that nearly a third of the time, wind conditions are calm, leading to lower odour dispersion. The different calm period percentages in Figure 2 and Figure 3 arise because Figure 2 reflects low or absent wind speeds, while Figure 3, focusing on odour transport, shows fewer calm periods as low wind is less effective for dispersing odour. This wind–odour association highlights wind’s role in odour movement. High calm readings may be influenced by elevated areas nearby, and calm periods were excluded from the analysis dataset.
Considering the relative positions of the primary sedimentation tanks and the electronic nose shown in Figure 1, it was essential to narrow the analysis to wind directions specifically originating from the 225° (SW) to 325° (NW) range. Since these angles correspond to the directions from which the wind is approaching, analysing this specific range enhances the likelihood of detecting odours originating directly from the tanks.
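Restricting the dataset to this sector amounts to a simple angular filter. The Python sketch below assumes wind direction is stored in degrees as the direction the wind blows from; the function name and the sample records are hypothetical.

```python
def in_sector(direction_deg, start_deg=225.0, end_deg=325.0):
    """True if a wind direction (degrees, 'blowing from') lies within the sector."""
    d = direction_deg % 360.0
    if start_deg <= end_deg:
        return start_deg <= d <= end_deg
    # Sector that wraps through north, e.g. 315 degrees to 45 degrees.
    return d >= start_deg or d <= end_deg


# Hypothetical records: (wind direction in degrees, odour concentration in ouE/m3).
records = [(270.0, 850.0), (90.0, 40.0), (230.0, 1200.0), (10.0, 300.0)]
from_tanks = [conc for direction, conc in records if in_sector(direction)]
print(from_tanks)
```

The wrap-around branch is not needed for the 225° to 325° range but keeps the helper usable for sectors that cross north.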
Figure 4 illustrates the box plot of odour concentrations for the wind directions NW, W, and SW. Table 2 reports values of odour concentration for various percentiles. The box plot shows the distribution of odour concentrations for the wind directions Northwest (NW), West (W), and Southwest (SW), with the y-axis on a logarithmic scale to better visualise central data and reduce the visual impact of outliers, making it easier to compare both typical and extreme values in a balanced way. This choice is helpful as the concentrations vary over a wide range, from 10 to over 10,000 ouE/m3. The plot reveals that the median odour concentrations across the three directions are similar, around 100 ouE/m3. The central part of the distribution (the 25th to 75th percentiles) shows values concentrated within a limited range, while the numerous outliers indicate occasional high peaks. Overall, the plot highlights that, although moderate values are most frequent, there are also episodes of high concentration in all three directions.
Odour concentrations tend to increase significantly from lower percentiles (25th and 50th) to higher percentiles (75th and 98th). This indicates that while most odour concentrations remain relatively low, there are some very high peaks. At the maximum concentrations (98th percentile), the highest concentrations occur with winds from the SW, peaking at 4547.4 ouE/m3; odour concentrations for NW reach 3766.11 ouE/m3 at the 98th percentile; for winds from W, the highest recorded concentration is 2904.0 ouE/m3.
The SW direction shows the highest odour concentrations overall, suggesting a significant odour source upwind of the measurement point in that direction. NW and W show slightly lower but still significant concentrations compared to SW.
For odorous emissions, Italy has no national law that sets clear and uniform limits for odour concentration, leaving the matter to regional authorities. For instance, the only limit set by the Lombardy Region is 300 ouE/m3 for compost production plants, which is used as a reference to prevent negative impacts on air quality [65,66].
The concentration limit of 1 ouE/m3 pertains to the odour concentration perceived in areas with sensitive receptors, such as residential areas or public places, and is used in olfactory impact studies based on dispersion simulation [67]. This more restrictive measure ensures that the odour does not reach levels that are annoying or harmful to people near the facility. In general, olfactory nuisance is defined as occurring when the odour exceeds the limit for more than 2% of the monitoring time (i.e., about 175 h in a monitoring period of one year) [52].
As shown in Figure 1, the e-Nose is positioned near the plant boundary line; the number of times it recorded concentrations greater than 300 ouE/m3 was nevertheless examined, and this occurred in 26.36% of cases, as reported in Figure 5. The scale on the x-axis is adjusted to allow observation of the values contained within the 98th percentile. The graph related to the entire dataset is included in the Supplementary Materials (Figure S1).
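The exceedance frequency discussed above can be illustrated with a minimal sketch. The readings below are hypothetical values, not data from the study, and the function name is an assumption introduced only for illustration; applying the 2% criterion to the 300 ouE/m3 threshold is likewise only an example of the arithmetic.

```python
def exceedance_fraction(concentrations, limit):
    """Return the fraction of readings strictly above `limit`."""
    if not concentrations:
        raise ValueError("no readings supplied")
    above = sum(1 for c in concentrations if c > limit)
    return above / len(concentrations)


# Hypothetical ten-minute readings in ouE/m3 (illustrative values only).
readings = [80.0, 120.0, 450.0, 95.0, 310.0, 60.0, 1500.0, 200.0]

frac_300 = exceedance_fraction(readings, 300.0)

# Nuisance criterion described in the text: limit exceeded for more
# than 2% of the monitoring time.
nuisance = frac_300 > 0.02

print(f"{frac_300:.1%} of readings exceed 300 ouE/m3; nuisance: {nuisance}")
```

With real data, the same function could be applied to the full ten-minute series to obtain the kind of percentage reported for Figure 5.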
A more specific analysis has been carried out regarding the weather variables of interest. It was possible to analyse data from both chemical detectors and odour concentrations under different meteorological conditions.
The smoothed data of CcOD, VEC, VPID, VH2S, and VC6H6 were analysed considering the following conditions:
  • Wind direction angles between 225° and 325°, i.e., the range from which the wind originates and which covers the area where odour from the sedimentation tank can potentially be intercepted by the electronic nose. This directional range helps to filter odour data specifically related to the water treatment plant, leveraging the position of the electronic nose relative to the sedimentation tank.
  • A temperature greater than 15 °C and a relative humidity greater than 70%, since the presence of water vapour in the ambient air can significantly increase the perception of odour.
The data selected in this way were compared with the entire dataset. No correlation is found between CcOD and the variables VPID and VEC, either in the entire dataset or in the data selected under the meteorological conditions above; a more significant correlation is observed between CcOD and the variables VH2S and VC6H6. For VH2S, the R² value increases from 0.60 to 0.68, while for VC6H6 it rises from 0.44 to 0.60 when the data are selected according to these meteorological conditions. The relevant figures are provided in the Supplementary Materials (Figures S2–S5).
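The meteorological selection described above can be sketched as a simple filter. The record layout, field names, and sample values are assumptions made for illustration; only the thresholds (225° to 325°, 15 °C, 70%) come from the text.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    wind_dir_deg: float   # direction the wind comes from, in degrees
    temp_c: float         # air temperature in degrees Celsius
    rel_hum_pct: float    # relative humidity in percent
    cc_od: float          # odour concentration in ouE/m3


def in_selection_mask(r, dir_min=225.0, dir_max=325.0,
                      temp_min=15.0, rh_min=70.0):
    """True if the reading satisfies the wind, temperature and humidity conditions."""
    return (dir_min <= r.wind_dir_deg <= dir_max
            and r.temp_c > temp_min
            and r.rel_hum_pct > rh_min)


# Hypothetical readings (illustrative values only).
readings = [
    Reading(270.0, 18.0, 85.0, 640.0),   # satisfies all conditions
    Reading(90.0, 20.0, 90.0, 120.0),    # wind from the wrong sector
    Reading(300.0, 12.0, 80.0, 300.0),   # too cold
    Reading(240.0, 22.0, 55.0, 210.0),   # too dry
]

selected = [r for r in readings if in_selection_mask(r)]
print(len(selected), "of", len(readings), "readings selected")
```

Keeping the thresholds as keyword arguments makes it easy to repeat the comparison with and without the mask.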

3.2. Data Pre-Processing and Machine Learning with Random Forest Models

Data are pre-processed by the logarithmic function before using a Random Forest algorithm. This is advantageous for several reasons. It helps stabilise variance and reduce the impact of outliers, making the model more robust. Additionally, it transforms highly skewed data into a more normal distribution, which improves the algorithm’s performance. The transformation also compresses the range of values, bringing them onto a similar scale, and simplifies relationships in the data, allowing the Random Forest to detect and model patterns more effectively [68].
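A minimal sketch of such a transformation is given below. The text does not state the base of the logarithm, so base 10 is an assumption here, as are the function names and the sample values.

```python
import math


def log_transform(values):
    """Base-10 logarithm of strictly positive values (base is an assumption)."""
    if any(v <= 0 for v in values):
        raise ValueError("logarithm requires strictly positive values")
    return [math.log10(v) for v in values]


def back_transform(log_values):
    """Invert log_transform to return to the original scale."""
    return [10.0 ** v for v in log_values]


# Hypothetical odour concentrations in ouE/m3 spanning several orders of magnitude.
cc_od = [10.0, 100.0, 1000.0, 10000.0]
cc_od_log = log_transform(cc_od)
print(cc_od_log)
```

Predictions made on the logarithmic scale would need `back_transform` before being compared with concentrations in ouE/m3.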
Given the limited dataset, which consists of less than one month of data with readings taken every ten minutes, three different approaches have been followed:
  • The first 30% of the data were used to train the algorithm, and the predictive model was applied to the remaining 70% of the dataset. This model will be referred to in the text as RF0 (Figure 6).
  • A random sample (30% of the data) was used to train the algorithm, and the predictive model was applied to the remaining 70% of the dataset. This model will be referred to in the text as RF1 (Figure 7).
  • A random sample (30% of the data) was used to train the algorithm, and the predictive model was then applied to the complete dataset, so that the same 30% used for training was also predicted. This model will be referred to in the text as RF2 (Figure 8).
A preliminary check is conducted using all the data available in the dataset to observe the model’s performance. RF1 is preferable because its predictions are not influenced by data used for training the model: the prediction set excludes the training data, unlike RF2, where the data used for training are also part of the prediction set. Additionally, RF1 is not tied to a specific period, unlike RF0, which uses the first 30% of the data for training.
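The three splitting strategies can be sketched in terms of row indices, as below. The function names, the fixed random seed, and the ten-row example are assumptions for illustration; the source does not describe its implementation.

```python
import random


def split_rf0(n, train_frac=0.3):
    """RF0: first 30% of the time series for training, the rest for prediction."""
    n_train = round(n * train_frac)
    return list(range(n_train)), list(range(n_train, n))


def split_rf1(n, train_frac=0.3, seed=0):
    """RF1: random 30% for training, the remaining 70% for prediction."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(n * train_frac)
    return sorted(idx[:n_train]), sorted(idx[n_train:])


def split_rf2(n, train_frac=0.3, seed=0):
    """RF2: random 30% for training, prediction on the complete dataset."""
    train, _ = split_rf1(n, train_frac, seed)
    return train, list(range(n))  # prediction set includes the training rows


n_rows = 10
rf0_train, rf0_test = split_rf0(n_rows)
rf1_train, rf1_test = split_rf1(n_rows)
rf2_train, rf2_test = split_rf2(n_rows)

print("RF0:", rf0_train, rf0_test)
print("RF1:", rf1_train, rf1_test)
print("RF2:", rf2_train, rf2_test)
```

Only the RF1 split is both random and free of overlap between training and prediction rows, which is the property the text relies on when preferring it.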
Each set of graphs corresponding to a random forest model comprises three distinct plots. The plots in Panel A serve to illustrate the correlation between the observed variable and its predicted counterpart generated by the model. The ideal scenario is represented by the purple line, which follows the y = x trajectory, indicating perfect alignment between observed and predicted CcOD values.
The plots in panel A present a heatmap scatter plot, where the colour intensity indicates regions of higher concentration of values (the red region), offering insights into the data distribution. In the plots in panel B, an error plot is depicted, providing a visual representation of the variance between observed and predicted values across the dataset. Finally, the plots in Panel C feature a bar chart that showcases the distribution of errors, offering a comprehensive view of the model’s predictive performance across different segments of the dataset.
In the histograms of errors, a Gaussian distribution centred on zero error is observed in Figure 7 and Figure 8. In Figure 6, the error histogram (panel C) does not exhibit a Gaussian shape, indicating that errors are larger and more widespread. In Figure 7, noticeable and occasionally significant error spikes are evident. In Figure 8, which corresponds to model RF2, error spikes persist but are less pronounced than those observed in model RF0, while the Gaussian error distribution remains narrow.
The following table (Table 3) summarises the coefficients related to the trend line and the R-squared values of the three RF models extended to the entire dataset.
For RF0, the positive intercept and the slope of less than 1 indicate that the predictions are underestimated compared to the measured values. The R-squared value of 0.31 suggests a relatively weak correlation between predicted and measured data, and the relatively high RMSE (Root Mean Squared Error) of 1.125 indicates low overall accuracy for this model. In RF1, the intercept is close to zero and the slope is nearly 1, indicating better alignment between predicted and measured data; the R-squared value of 0.7 shows a good correlation, and the RMSE is reduced to 0.87. For RF2, the intercept is close to zero, and the slope slightly above 1 indicates a slight overestimation in the predictions; it has the highest R-squared value (0.75) and the lowest RMSE (0.75).
The discrepancy in R-squared values among the models can be attributed to how the training data are used. Model RF2, whose prediction set includes the training values, achieves a higher R-squared value, whereas RF0, whose training set is not randomly sampled, yields a lower one. However, RF2 is not the preferable choice despite its higher R-squared value: data used for training should not also be used in the prediction set, which makes RF1 the more favourable option.
For the three models, the importance of the variables VEC, VPID, VH2S, and VC6H6 has been calculated, as summarised in Table 4.
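The trend-line coefficients, R-squared, and RMSE used to compare the models can be computed along the following lines. This is a sketch under stated assumptions: the trend line is taken as the ordinary least squares regression of predicted on observed values, and R-squared as the squared Pearson correlation; the source does not specify either choice, and the sample values are hypothetical.

```python
import math


def trend_line(observed, predicted):
    """Intercept and slope of the least squares line predicted = a + b * observed."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    slope = s_op / s_oo
    return mean_p - slope * mean_o, slope


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    return s_op ** 2 / (s_oo * s_pp)


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical values: every prediction is 0.5 above the observation.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]

intercept, slope = trend_line(observed, predicted)
print(intercept, slope, r_squared(observed, predicted), rmse(observed, predicted))
```

The example shows why R-squared alone is not sufficient: a constant offset leaves the correlation perfect while the intercept and RMSE reveal the bias.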
In the evaluation of variable importance within the Random Forest models, VH2S was identified as the most significant variable, particularly in RF1 and RF2, where it accounted for 49.5% of importance. In RF0, it held an importance of 40.8%. VC6H6 also demonstrated substantial influence, contributing 36% in RF0 and 30% in RF1 and RF2. The VPID variable had moderate importance, ranging from 19.5% to 22.4%, while VEC was the least impactful, with contributions of only 0.8% in RF0 and 1% in the other two models. All models agree in identifying VH2S and VC6H6 as the most important variables for model structuring. VPID has a minor influence, while VEC has practically no influence on model construction.
The variables VPID and VEC probably show little importance in the RF models because they are measured with PID and electrochemical sensors, both of which detect thousands of compounds, potentially dominated by those with lesser olfactory impact. Therefore, to use these types of sensors effectively in odour-related applications in the future, it may be necessary to perform compound speciation. PID sensors are highly sensitive and can detect VOCs at very low concentrations, but they lack selectivity and can be influenced by numerous interfering compounds present in the environment. This complexity makes it challenging to use these sensors in applications where distinguishing between different VOCs is necessary [69].
On the other hand, electrochemical sensors can still be susceptible to interference from other gases in the environment. Additionally, their sensitivity can be affected by environmental conditions such as temperature and humidity, which can compromise measurement accuracy [70].

3.3. Analysis on Data < 1000 ouE/m3

An additional analysis was conducted with the Random Forest models RF0, RF1, and RF2, considering only odour concentration data below 1000 ouE/m3. The investigation focuses on CcOD values lower than 1000 ouE/m3 because of the higher density of data below this threshold.
In Figure 9, Figure 10 and Figure 11, the wider spread of the Gaussian curve in the error histogram indicates that a larger share of the data is affected by errors. In panel C of Figure 9 and Figure 10, which shows the error between observed and predicted data, the peak frequency of errors is noticeably shifted and no longer centred on 0: the highest frequency corresponds to errors lower than 0. The most frequent errors are of approximately 100 ouE/m3 in magnitude in RF0, around 60 ouE/m3 in RF1, and approximately 30 ouE/m3 in RF2 (Figure 11). These observations align with the earlier discussions regarding the significance of the data. An increased number of error spikes is observed for all three models.
Table 5 summarises the coefficients of the trend line and the R-squared values of the three RF models for odour concentrations below 1000 ouE/m3.
In RF0, the high intercept and very low slope coefficient suggest almost no relationship between predicted and measured values. The very low R-squared (0.01) confirms this, and the RMSE of 1.17 indicates poor overall accuracy, meaning that this model does not capture the underlying pattern of the data. In the RF1 model, the trend line is closely aligned with the reference line: it has the lowest intercept and the highest slope (above 1) among the three models, suggesting a slight overestimation of the predictions. The R-squared value (0.27) indicates a weak correlation, and the RMSE of 0.83 shows moderate accuracy. In RF2, an R-squared value of 0.47 indicates a moderate correlation, and the RMSE of 0.71 reflects improved accuracy. Despite its better metrics, the RF2 model cannot be considered because the training set is included in the test set. Consequently, no reliable relationship can be established between the observed and predicted odour concentration values below 1000 ouE/m3. The Random Forest models developed on the entire dataset and for concentrations below 1000 ouE/m3 were validated using 5-fold cross-validation; the related considerations are presented in the Supplementary Materials in Tables S1 and S2.
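The idea of 5-fold cross-validation can be sketched in terms of row indices as follows. How the folds were actually formed in the study is not described, so the shuffling, the seed, and the function name are assumptions for illustration only.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f_i, f in enumerate(folds) if f_i != i for j in f)
        yield train, test


# Each row is used for testing exactly once across the five folds.
for fold_no, (train, test) in enumerate(k_fold_indices(23), start=1):
    print(f"fold {fold_no}: {len(train)} training rows, {len(test)} test rows")
```

A model would be refitted on each `train` list and scored on the matching `test` list, and the fold scores averaged.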
The e-Nose system, despite certain limitations, proves valuable for managing industrial odour nuisance emergencies. This system is particularly suited for situations requiring continuous air quality monitoring during odour-related events. Notably, the e-Nose has been shown to deliver more reliable measurements when odour concentrations exceed 1000 ouE/m3, making these data instrumental for the timely management of olfactory nuisance emergencies. By leveraging this information, environmental authorities can communicate effectively with the public and swiftly intervene when odour levels surpass critical thresholds. However, one major challenge lies in the lack of a defined uncertainty associated with e-Nose measurements. Currently, no specific regulation ensures that these measurements have the repeatability, reproducibility, and overall uncertainty characteristics needed for full reliability. Following the use of Random Forest to generate predictions and assess variable importance, further exploration of the relationships between predictor variables and the response variable can provide clearer insights. This analysis helps illustrate how each chemical predictor might directly influence odour responses, enabling industries to act on the most odoriferous chemical compounds. The aim is to deploy the e-Nose in a manner similar to that set out in the EN 14181 standard [71], enabling it to differentiate between various odour types in public areas and to use this feedback to modify the chemical configuration of facilities with odour-related challenges.

3.4. Relationship Between Chemical Concentration and Odour Concentration

The first investigative method to establish a relationship between chemical concentration data and odour-related data relies on the formula previously described in Equation (3). Because of the lack of importance of the variable VEC and the lesser impact of the variable VPID in the RF models (both related to VOC measurements), these variables have not been used. The odour threshold value used for VH2S (OTVH2S), which corresponds to H2S, is 0.41 ppb, and the value used for VC6H6 (OTVC6H6), which corresponds to benzene, is 2700 ppb. Below is Equation (6) with the variables of interest, VH2S and VC6H6; ‘k0’ and ‘k1’ represent the experimental coefficients related to the study data.
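$$\mathrm{CcOD} = k_0 + k_1\left(\frac{V_{\mathrm{H_2S}}}{\mathrm{OTV}_{\mathrm{H_2S}}} + \frac{V_{\mathrm{C_6H_6}}}{\mathrm{OTV}_{\mathrm{C_6H_6}}}\right) \qquad (6)$$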
Table 6 reports the values of the logarithmic model.
The coefficient of determination (R-squared) is 0.6, meaning that 60% of the variation in the dependent variable is explained by the independent variables in the model. The standard error is 2.34 × 10² for k0 and 2.5 × 10⁻² for k1; both are low relative to the coefficient estimates, indicating good precision in the estimation. The reported t-values are −5.05 × 10⁻⁵ for k0 and 8.77 × 10 for k1, and both coefficients are highly significant (p-value < 0.001). The p-value associated with the F-test is less than 2 × 10⁻¹⁶, indicating that the model as a whole is statistically significant.
The confidence intervals give the range of plausible values for the model parameters at the 95% confidence level. The interval for the intercept ranges from −1.168 × 10³ to −1.078 × 10³, while the interval for the coefficient of the composite variable lies between 3.26 and 3.41. The latter does not include zero and is entirely positive, confirming that the composite variable has a significant, positive effect on CcOD. Both intervals are relatively narrow, implying that the parameter estimates are precise and have low uncertainty.
Overall, the model provides a moderate fit, with a significant portion of the variability (40%) left unexplained. A good visual agreement can be observed for average odour concentrations, while the model is much less accurate at very high and very low odour concentrations. The results are visible in Figure 12.
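The structure of Equation (6) can be illustrated with a short sketch. The odour threshold values are those given in the text; the function names are assumptions, sensor readings are assumed to be expressed in ppb, and the coefficients passed in the example are placeholders rather than the fitted values of Table 6.

```python
# Odour threshold values quoted in the text, in ppb.
OTV_H2S_PPB = 0.41
OTV_C6H6_PPB = 2700.0


def composite_variable(v_h2s_ppb, v_c6h6_ppb):
    """Sum of concentration-to-threshold ratios for H2S and benzene."""
    return v_h2s_ppb / OTV_H2S_PPB + v_c6h6_ppb / OTV_C6H6_PPB


def literature_model(v_h2s_ppb, v_c6h6_ppb, k0, k1):
    """Odour concentration estimated as k0 + k1 * composite variable."""
    return k0 + k1 * composite_variable(v_h2s_ppb, v_c6h6_ppb)


# Hypothetical readings: 4.1 ppb H2S and 27 ppb benzene.
x = composite_variable(4.1, 27.0)
print(x)  # the H2S term (about 10) dominates the benzene term (0.01)

# Placeholder coefficients (k0 = 0, k1 = 1), not the values of Table 6.
print(literature_model(4.1, 27.0, k0=0.0, k1=1.0))
```

Because the threshold for H2S is several orders of magnitude lower than that for benzene, equal concentrations in ppb contribute very differently to the composite variable.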
Since the literature model showed only a moderate correlation, a linear model was also investigated. Using the least squares method, the potential correlation between the value of CcOD and the sum of the variables VH2S and VC6H6, weighted by the respective experimental coefficients k3 and k4 and with intercept k0, is identified. The linear model is given by Equation (7):
$$\mathrm{CcOD} = k_0 + k_3\, V_{\mathrm{H_2S}} + k_4\, V_{\mathrm{C_6H_6}} \qquad (7)$$
In Table 7, coefficient values of the least squares model are reported. The model indicates that both VH2S and VC6H6 variables are statistically significant predictors of the dependent variable CcOD.
The coefficient k0 has a standard error of 2.19 × 10, k3 has a standard error of 4.31 × 10⁻², and k4 has a higher standard error of 5.83 × 10³. Although the standard error of k4 is larger in absolute terms, it remains small relative to the estimate, indicating good precision across coefficients. All coefficients have very high absolute t-values (k0 = −63.3, k3 = 61.2, k4 = 34.1), suggesting that each is highly significant, with k0 and k3 standing out for their particularly large t-values. The p-values associated with the coefficients are very small (<2 × 10⁻¹⁶), indicating strong evidence against the null hypothesis that the coefficients are equal to zero. The confidence intervals for the model parameters provide insight into the precision of the coefficient estimates at a 95% confidence level. For the intercept, the interval ranges from −1.37 × 10³ to −1.29 × 10³, a relatively narrow range that indicates a high level of precision. The coefficient k3 is similarly precise, with a confidence interval between 2.46 and 2.63. In contrast, the coefficient k4 shows a wider confidence interval, spanning from 1.8 × 10⁵ to 2.1 × 10⁵, suggesting greater variability in this estimate; despite this broader interval, the estimate remains significantly positive. These confidence intervals underline the robustness of the estimates, especially for k3, while highlighting some degree of uncertainty for k4.
The R-squared value is 0.7, indicating that approximately 70% of the variability in the dependent variable is explained by the independent variables in the model, and this is evident in Figure 13.
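A least squares fit of the form used in Equation (7) can be sketched with the normal equations, as below. The data are synthetic and generated from known coefficients purely to check the procedure; the function names are assumptions, and nothing here reproduces the values of Table 7.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(v_h2s, v_c6h6, cc_od):
    """Return [k0, k3, k4] minimising the squared error of k0 + k3*VH2S + k4*VC6H6."""
    rows = [[1.0, h, b] for h, b in zip(v_h2s, v_c6h6)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, cc_od)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Synthetic data generated from k0 = -10, k3 = 2, k4 = 3 (illustrative only).
v_h2s = [1.0, 2.0, 3.0, 4.0, 5.0]
v_c6h6 = [2.0, 1.0, 4.0, 3.0, 6.0]
cc_od = [-10.0 + 2.0 * h + 3.0 * b for h, b in zip(v_h2s, v_c6h6)]

k0, k3, k4 = fit_least_squares(v_h2s, v_c6h6, cc_od)
print(k0, k3, k4)
```

When predictors differ by several orders of magnitude, the normal equations can become ill-conditioned, so a library routine such as `numpy.linalg.lstsq` would usually be a safer choice in practice.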
In summary, the model suggests that both the VH2S and VC6H6 variables have a significant impact on the dependent variable, and the least squares model predicts the outcome variable better than the literature model (Table 8). The Literature Model has an R² of 0.6, indicating that it explains 60% of the variance in the data, which suggests a moderate fit. Its Mean Squared Error (MSE) is 750.8 × 10³, and the Root Mean Squared Error (RMSE) is 866.5, showing significant deviations between predicted and actual values. The Mean Absolute Error (MAE) is 590.9, reflecting limited precision. In contrast, the Least Squares Model performs better, with an R² of 0.7, explaining 70% of the variance. It has a lower MSE of 612.8 × 10³ and an RMSE of 782.8, indicating improved accuracy. The MAE of 515.0 also suggests greater reliability. However, there is still potential for further improvement, such as incorporating additional variables and refining the calibration of the electronic nose.

4. Conclusions

The analysis aimed to assess the reliability of data collected by an electronic nose near a wastewater treatment plant and to investigate the correlation between chemical concentration sensor readings and odour concentration values. This approach validates odour concentration measurements conducted during environmental monitoring using an electronic nose, aiding in the development of an accurate characterisation strategy. A good strategy for selecting relevant odour data is to focus exclusively on measurements taken under wind conditions suited to the relative positions of the sedimentation tank and the electronic nose.
The analysis established that the variables VH2S and VC6H6 significantly influence odour concentration, underscoring their crucial role in odour perception in this environment. Despite the challenges associated with overfitting in machine learning, the Random Forest approach with three different models (RF0, RF1, RF2) provided useful predictive performance, particularly with the RF1 model, which minimised data overlap and maximised accuracy: the R² for RF1 is 0.7, a correlation that can be considered acceptable for environmental data. The concentration of odours was found to be significantly influenced by specific chemical substances detected by dedicated sensors: all RF models consistently identified VH2S and VC6H6 as the most important variables for predicting odour concentration, highlighting the significance of H2S and benzene levels in determining odour intensity. VPID had a lesser impact, while VEC had negligible importance in model construction.
The analysis revealed a significant correlation between CcOD and the readings of certain chemical sensors, particularly VH2S and VC6H6. These variables exhibited a more pronounced influence on odour concentration compared to VEC and VPID, indicating the significance of specific compounds contributing to the overall odour perception.
Further supporting the significant influence of VH2S and VC6H6 on odour concentration, linear regression analysis identified both variables as statistically significant predictors of odour intensity, with strong evidence against the null hypothesis. The best correlation was achieved not with the literature model but with the sum of the concentrations, each multiplied by a coefficient. The multiple linear regression model gave a moderate coefficient of determination of 0.70, indicating that approximately 70% of the variability in odour concentration can be explained by the independent variables. The statistically significant F-statistic and the relatively small standard errors of the model coefficients underscore the overall significance of the regression model. The hybrid approach, combining Random Forest analysis with the least squares model, can offer a valuable balance between model complexity and interpretability. This approach leverages the predictive capabilities of Random Forest while also providing a clearer understanding of the factors influencing the outcome through the linear model.
The study has several limitations, including relatively high prediction errors and difficulties in correlating with very high and very low odour concentration values. This highlights the inherent difficulty in training the electronic nose and the need for a closer correlation between odour concentration and odour intensity. Odour intensity is closely related to the sensory experience generated by the odour stimulus in the olfactory system and follows a logarithmic pattern, and human odour perception is extremely complex and not yet fully understood. Furthermore, correlation proved challenging for odour concentrations below 1000 ouE/m3, reflecting difficulties in prediction, reduced reliability of the data provided by the electronic nose, and the complexity of correlating chemical concentrations with perceived odour intensity. Despite these challenges, the electronic nose system plays an important role in the effective management of emergencies associated with industrial odour nuisance, as it can help recognise which of the detected compounds are primarily responsible for the nuisance event. Future research should aim to extend the dataset duration and include additional variables to improve the model’s accuracy and reliability. It will be necessary to investigate the possibility of employing specific chemical sensors tailored to the type of odour nuisance and to conduct a comparison with dynamic olfactometry for thorough field validation. Overall, this study demonstrates the effectiveness of electronic noses in environmental monitoring and the need for specific standardised procedures for training the system on odour-causing compounds.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/atmos15121401/s1. Figure S1: Frequency of odour concentration of plant odour emissions; Figure S2: Comparison between the correlation present between CcOD and the variable VEC after smoothing with a selection mask with 45° < wint < 135°, temp > 15 °C, rhum > 70% (Panel A) and without a selection mask (Panel B); Figure S3: Comparison between the correlation present between CcOD and the variable VPID after smoothing with a selection mask with 45° < wint < 135°, temp > 15 °C, rhum > 70% (Panel A) and without a selection mask (Panel B); Figure S4: Comparison between the correlation present between CcOD and the variable VH2S after smoothing with a selection mask with 45° < wint < 135°, temp > 15 °C, rhum > 70% (Panel A) and without a selection mask (Panel B); Figure S5: Comparison between the correlation present between CcOD and the variable VC6H6 after smoothing with a selection mask with 45° < wint < 135°, temp > 15 °C, rhum > 70% (Panel A) and without a selection mask (Panel B); Table S1: Metrics of RF models for the cross-validation on the entire dataset; Table S2: Metrics of RF models for the cross-validation on the dataset below 1000 ouE/m3.

Author Contributions

Conceptualisation, D.C. and C.F.; methodology, D.C., C.F., A.M.C. and M.G.; software, C.F. and D.R.; validation, D.C., C.F., D.R., A.M.C. and M.G.; investigation, D.C., C.F., D.R., A.M.C. and M.G.; data curation, D.C., C.F. and D.R.; writing – original draft preparation, D.C., C.F., A.M.C. and M.G.; writing – review and editing, L.F., E.B., A.F., C.C. and G.D.P.; supervision, D.C., E.B. and L.F.; project administration, D.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been financed by the Research Fund for the Italian Electrical System under the Three-Year Research Plan 2022–2024 (DM MITE n.337, 15.09.2022), in compliance with the Decree of 16 April 2018; Project Number: 3922-210-023—SFE LA.2.10-023.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Schlegelmilch, M.; Streese, J.; Stegmann, R. Odour Management and Treatment Technologies: An Overview. Waste Manag. 2005, 25, 928–939. [Google Scholar] [CrossRef] [PubMed]
  2. Wing, S.; Horton, R.A.; Marshall, S.W.; Thu, K.; Tajik, M.; Schinasi, L.; Schiffman, S.S. Air Pollution and Odor in Communities near Industrial Swine Operations. Environ. Health Perspect. 2008, 116, 1362–1368. [Google Scholar] [CrossRef] [PubMed]
  3. Shen, Y.; Chen, T.B.; Gao, D.; Zheng, G.; Liu, H.; Yang, Q. Online Monitoring of Volatile Organic Compound Production and Emission during Sewage Sludge Composting. Bioresour. Technol. 2012, 123, 463–470. [Google Scholar] [CrossRef]
  4. D’Imporzano, G.; Crivelli, F.; Adani, F. Biological Compost Stability Influences Odor Molecules Production Measured by Electronic Nose during Food-Waste High-Rate Composting. Sci. Total Environ. 2008, 402, 278–284. [Google Scholar] [CrossRef]
  5. Seinfeld, J.H.; Pandis, S.N.; Πανδής, Σ.Ν. Atmospheric Chemistry and Physics: From Air Pollution to Climate Change; J. Wiley: Hoboken, NJ, USA, 2006; ISBN 9780471720171. [Google Scholar]
  6. Bonasoni, P.; Gilardoni, S.; Barbieri, P.; Moraca, S.; De Gennaro, G. Molestie Olfattive-Studi Metodi e Strumenti per Il Controllo; Edizioni ETS: Pisa, Italy, 2022. [Google Scholar]
  7. Cipriano, D.; Cefalì, A.M.; Allegrini, M. Experimenting with Odour Proficiency Tests Implementation Using Synthetic Bench Loops. Atmosphere 2021, 12, 761. [Google Scholar] [CrossRef]
  8. Bax, C.; Voti, M.L.; Sironi, S.; Capelli, L. Application and Performance Verification of Electronic Noses for Landfill Odour Monitoring; Politecnico di Milano: Cagliari, Italy, 2019. [Google Scholar]
  9. Angelini, P.; Soracase, M. Documento Guida di Comunicazione del Rischio Ambientale per la Salute; I Quaderni di Arpae: Bologna, Italy, 2018.
  10. Mauro, F.; Borghesi, R. Using Citizen Science to Manage Odour Emissions in National IED Plants: A Systematic Review of the Scientific Literature. Atmosphere 2024, 15, 302.
  11. CEN TC264 EN 13725; Stationary Source Emissions—Determination of Odour Concentration by Dynamic Olfactometry and Odour Emission Rate. CEN: Brussels, Belgium, 2022.
  12. Delplanque, S.; Chrea, C.; Grandjean, D.; Ferdenzi, C.; Cayeux, I.; Porcherot, C.; Le Calvé, B.; Sander, D.; Scherer, K.R. How to Map the Affective Semantic Space of Scents. Cogn. Emot. 2012, 26, 885–898.
  13. Blanes-Vidal, V. Air Pollution from Biodegradable Wastes and Non-Specific Health Symptoms among Residents: Direct or Annoyance-Mediated Associations? Chemosphere 2015, 120, 371–377.
  14. Cameron, E.L. Pregnancy and Olfaction: A Review. Front. Psychol. 2014, 5, 67.
  15. Simsek, G.; Bayar Muluk, N.; Arikan, O.K.; Ozcan Dag, Z.; Simsek, Y.; Dag, E. Marked Changes in Olfactory Perception during Early Pregnancy: A Prospective Case–Control Study. Eur. Arch. Oto-Rhino-Laryngol. 2015, 272, 627–630.
  16. Guadalupe-Fernandez, V.; De Sario, M.; Vecchi, S.; Bauleo, L.; Michelozzi, P.; Davoli, M.; Ancona, C. Industrial Odour Pollution and Human Health: A Systematic Review and Meta-Analysis. Environ. Health A Glob. Access Sci. Source 2021, 20, 108.
  17. Schiffman, S.S.; Williams, C.M. Science of Odor as a Potential Health Issue. J. Environ. Qual. 2005, 34, 129–138.
  18. Nimmermark, S. Odour Influence on Well-Being and Health with Specific Focus on Animal Production Emissions. Ann. Agric. Environ. Med. 2004, 11, 163–173.
  19. Durmusoglu, E.; Taspinar, F.; Karademir, A. Health Risk Assessment of BTEX Emissions in the Landfill Environment. J. Hazard. Mater. 2010, 176, 870–877.
  20. Conti, C.; Guarino, M.; Bacenetti, J. Measurements Techniques and Models to Assess Odor Annoyance: A Review. Environ. Int. 2020, 134, 105261.
  21. Brancher, M.; Griffiths, K.D.; Franco, D.; de Melo Lisboa, H. A Review of Odour Impact Criteria in Selected Countries around the World. Chemosphere 2017, 168, 1531–1570.
  22. Radon, K.; Peters, A.; Praml, G.; Ehrenstein, V.; Schulze, A.; Hehl, O.; Nowak, D. Livestock Odours and Quality of Life of Neighbouring Residents. Ann. Agric. Environ. Med. 2004, 11, 59–62.
  23. Greatorex, J.M. A Review of Methods for Measuring Methane, Nitrous Oxide and Odour Emissions from Animal Production Activities; JTI—Institutet för jordbruks-och miljöteknik: Uppsala, Sweden, 2000.
  24. Brattoli, M.; de Gennaro, G.; de Pinto, V.; Loiotile, A.D.; Lovascio, S.; Penza, M. Odour Detection Methods: Olfactometry and Chemical Sensors. Sensors 2011, 11, 5290–5322.
  25. Hove, N.C.Y.; Demeyer, P.; Van der Heyden, C.; Van Weyenberg, S.; Van Langenhove, H. Improving the Repeatability of Dynamic Olfactometry According to EN 13725: A Case Study for Pig Odour. Biosyst. Eng. 2017, 161, 70–79.
  26. Cipriano, D.; Capelli, L. Evolution of Electronic Noses from Research Objects to Engineered Environmental Odour Monitoring Systems: A Review of Standardization Approaches. Biosensors 2019, 9, 75.
  27. Capelli, L.; Sironi, S.; Del Rosso, R. Electronic Noses for Environmental Monitoring Applications. Sensors 2014, 14, 19979–20007.
  28. Gardner, J.W.; Bartlett, P.N. A Brief History of Electronic Noses. Sens. Actuators B Chem. 1994, 18, 211–220.
  29. Pearce, T.C. Computational Parallels between the Biological Olfactory Pathway and Its Analogue “The Electronic Nose”: Part II. Sensor-Based Machine Olfaction. BioSystems 1997, 41, 69–90.
  30. Nagle, H.T.; Schiffman, S.S.; Gutierrez-Osuna, R. The How and Why of Electronic Noses. IEEE Spectrum 1998, 35, 22–34.
  31. Capelli, L.; Sironi, S.; Del Rosso, R.; Guillot, J.M. Measuring Odours in the Environment vs. Dispersion Modelling: A Review. Atmos. Environ. 2013, 79, 731–743.
  32. Scott, S.M.; James, D.; Ali, Z. Data Analysis for Electronic Nose Systems. Microchim. Acta 2006, 156, 183–207.
  33. Pearce, T.C.; Schiffman, S.S.; Nagle, H.T.; Gardner, J.W. Handbook of Machine Olfaction Electronic Nose Technology; Wiley-VCH: Weinheim, Germany, 2001; ISBN 3527295577.
  34. Boeker, P. On “Electronic Nose” Methodology. Sens. Actuators B Chem. 2014, 204, 2–17.
  35. Nicolas, J.; Romain, A.C. Establishing the Limit of Detection and the Resolution Limits of Odorous Sources in the Environment for an Array of Metal Oxide Gas Sensors. Sens. Actuators B Chem. 2004, 99, 384–392.
  36. Nakamoto, T.; Sumitimo, E. Study of Robust Odor Sensing System with Auto-Sensitivity Control. Sens. Actuators B Chem. 2003, 89, 285–291.
  37. Dentoni, L.; Capelli, L.; Sironi, S.; Del Rosso, R.; Zanetti, S.; Della Torre, M. Development of an Electronic Nose for Environmental Odour Monitoring. Sensors 2012, 12, 14363–14381.
  38. Eusebio, L.; Capelli, L.; Sironi, S. Electronic Nose Testing Procedure for the Definition of Minimum Performance Requirements for Environmental Odor Monitoring. Sensors 2016, 16, 1548.
  39. JCGM Member Organizations. Evaluation of Measurement Data-Guide to the Expression of Uncertainty in Measurement; JCGM Member Organizations: Paris, France, 2008.
  40. Oliva, G.; Zarra, T.; Pittoni, G.; Senatore, V.; Galang, M.G.; Castellani, M.; Belgiorno, V.; Naddeo, V. Next-Generation of Instrumental Odour Monitoring System (IOMS) for the Gaseous Emissions Control in Complex Industrial Plants. Chemosphere 2021, 271, 129768.
  41. Sharma, A.; Kumar, R.; Aier, I.; Semwal, R.; Tyagi, P.; Varadwaj, P. Sense of Smell: Structural, Functional, Mechanistic Advancements and Challenges in Human Olfactory Research. Curr. Neuropharmacol. 2018, 17, 891–911.
  42. Gostelow, M.P.; Parsons, S.A.; Stuetz, R.M. Odour Measurements for Sewage Treatment Works. Water Res. 2001, 35, 579–597.
  43. Dincer, F.; Muezzinoglu, A. Odor Determination at Wastewater Collection Systems: Olfactometry versus H2S Analyses. Clean Soil Air Water 2007, 35, 565–570.
  44. P2520.1 Working Group of the IEEE Sensors Council. P2520.1™/D19.0 Std for Baseline Performance of Machine Olfaction Devices and Systems; P2520.1 Working Group of the IEEE Sensors Council: New York, NY, USA, 2024.
  45. UNI 11761:2019; Emissioni e Qualità Dell’aria—Determinazione Degli Odori Tramite IOMS (Instrumental Odour Monitoring Systems). UNI: Milano, Italy, 2019.
  46. Comitato Tecnico Provinciale Valutazione Impatto Ambientale—Comitato V.I.A. Provincia di Vicenza. Orientamento Operativo per la Valutazione Dell’impatto Odorigeno nelle Istruttorie di Valutazione Impatto Ambientale e Assoggettabilità; ARPAV: Vicenza, Italy, 2020.
  47. Capelli, L.; Sironi, S.; Del Rosso, R.; Céntola, P. Predicting Odour Emissions from Wastewater Treatment Plants by Means of Odour Emission Factors. Water Res. 2009, 43, 1977–1985.
  48. Stellacci, P.; Liberti, L.; Notarnicola, M.; Haas, C.N. Hygienic Sustainability of Site Location of Wastewater Treatment Plants. A Case Study. I. Estimating Odour Emission Impact. Desalination 2010, 253, 51–56.
  49. Li, R.; Han, Z.; Shen, H.; Qi, F.; Ding, M.; Song, C.; Sun, D. Emission Characteristics of Odorous Volatile Sulfur Compound from a Full-Scale Sequencing Batch Reactor Wastewater Treatment Plant. Sci. Total Environ. 2021, 776, 145991.
  50. Zarra, T.; Reiser, M.; Naddeo, V.; Belgiorno, V.; Kranert, M. Odour Emissions Characterization from Wastewater Treatment Plants by Different Measurement Methods. Chem. Eng. Trans. 2014, 40, 37–42.
  51. Parlamento Europeo e Consiglio dell’Unione Europea. Direttiva 2008/50/CE Del Parlamento Europeo Del Consiglio, Del 21 Maggio 2008, Relativa Alla Qualità Dell’aria Ambiente e per Un’aria Più Pulita in Europa; Parlamento Europeo e Consiglio dell’Unione Europea: Brussels, Belgium, 2008.
  52. Regione Lombardia. D.g.r. 15 Febbraio 2012—n. IX/3018 Generali in Merito Alla Caratterizzazione Delle Emissioni Gassose in Atmosfera Derivanti Da Attività a Forte Impatto Odorigeno; Regione Lombardia: Milan, Italy, 2012.
  53. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  54. Biau, G.; Scornet, E. A Random Forest Guided Tour. Test 2016, 25, 197–227.
  55. Ferrero, A. Esposizione del Metodo dei Minimi Quadrati; Pranava Books: Firenze, Italy, 1876.
  56. Boulesteix, A.L.; Janitza, S.; Kruppa, J.; König, I.R. Overview of Random Forest Methodology and Practical Guidance with Emphasis on Computational Biology and Bioinformatics. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2012, 2, 493–507.
  57. Breiman, L. Consistency for a Sample Model of Random Forests; University of California at Berkeley: Berkeley, CA, USA, 2004.
  58. Breiman, L.; Friedman, J.; Stone, C.J.; Olshen, R.A. Classification and Regression Trees, 1st ed.; Chapman & Hall/CRC: New York, NY, USA, 1984.
  59. Capelli, L.; Sironi, S.; Del Rosso, R.; Céntola, P.; Il Grande, M. A Comparative and Critical Evaluation of Odour Assessment Methods on a Landfill Site. Atmos. Environ. 2008, 42, 7050–7058.
  60. Kubícková, J.; Grosch, K.W. Quantification of Potent Odorants in Camembert Cheese and Calculation of Their Odour Activity Values. Int. Dairy J. 1998, 8, 17–23.
  61. McGinley, C.M.; McGinley, M.A. Odor Testing Biosolids for Decision Making. Proc. Water Environ. Fed. 2002, 1055–1072.
  62. Chen, Y.; Bundy, D.S.; Hoff, S.J. Using Olfactometry to Measure Intensity and Threshold Dilution Ratio for Evaluating Swine Odor. J. Air Waste Manag. Assoc. 1999, 49, 847–853.
  63. Wu, C.; Liu, J.; Zhao, P.; Piringer, M.; Schauberger, G. Conversion of the Chemical Concentration of Odorous Mixtures into Odour Concentration and Odour Intensity: A Comparison of Methods. Atmos. Environ. 2016, 127, 283–292.
  64. Centola, P.; Sironi, S.; Capelli, L.; Del Rosso, R. Valutazione di Impatto Odorigeno di Una Realtà Industriale; AIDIC Servizi S.r.l.: Milano, Italy, 2004.
  65. Bokowa, A.; Diaz, C.; Koziel, J.A.; McGinley, M.; Barclay, J.; Schauberger, G.; Guillot, J.M.; Sneath, R.; Capelli, L.; Zorich, V.; et al. Summary and Overview of the Odour Regulations Worldwide. Atmosphere 2021, 12, 206.
  66. Regione Lombardia. Deliberazione Giunta Regionale 16 Aprile 2003—n. 7/12764 Linee Guida Relative Alla Costruzione e All’esercizio Degli Impianti di Produzione di Compost; Regione Lombardia: Milano, Italy, 2003.
  67. Direzione Generale Valutazioni Ambientali. Decreto Direttorale Di Approvazione Degli Indirizzi per l’applicazione Dell’articolo 272-Bis Del Dlgs/2006 in Materia Di Emissioni Odorigene Di Impianti e Attività Elaborato Dal “Coordinamento”; Ministero dell’Ambiente e della Sicurezza Energetica: Roma, Italy, 2023.
  68. El Morr, C.; Jammal, M.; Ali-Hassan, H.; El-Hallak, W. Data Preprocessing. In Machine Learning for Practical Decision Making. International Series in Operations Research & Management Science; Springer: Cham, Switzerland, 2022; Volume 334.
  69. Epping, R.; Koch, M. On-Site Detection of Volatile Organic Compounds (VOCs). Molecules 2023, 28, 1598.
  70. Schütze, A.; Baur, T.; Leidinger, M.; Reimringer, W.; Jung, R.; Conrad, T.; Sauerwald, T. Highly Sensitive and Selective VOC Sensor Systems Based on Semiconductor Gas Sensors: How To? Environments 2017, 4, 20.
  71. EN 14181; Stationary Source Emissions-Quality Assurance of Automated Measuring Systems. CEN: Brussels, Belgium, 2014.
Figure 1. Aerial view of the sewage treatment plant. The red spot indicates the location of the e-Nose and the green spots indicate the positions of the sedimentation tanks.
Figure 2. Wind rose of the study site during the 27-day measurement campaign.
Figure 3. Illustration of the distribution of odour concentration during the 27-day campaign in relation to wind direction and frequency.
Figure 4. Box plot of the odour concentration percentiles for the W, SW, and NW wind directions.
Figure 5. Frequency of odour concentration of plant odour emissions.
Figure 6. Random Forest 0 (RF0): results for the model trained on the first 30% of the data and applied to the remaining 70% of the dataset. (Panel A) heat scatter plot, in which colour marks the areas where values are highly concentrated (red) or sparse (blue); (Panel B) error plot showing the differences between observed and predicted values; (Panel C) bar chart of the distribution of these errors.
Figure 7. Random Forest 1 (RF1): results for the model trained on a random 30% of the data and applied to the remaining 70% of the dataset. (Panel A) heat scatter plot, in which colour marks the areas where values are highly concentrated (red) or sparse (blue); (Panel B) error plot showing the differences between observed and predicted values; (Panel C) bar chart of the distribution of these errors.
Figure 8. Random Forest 2 (RF2): results for the model trained on a random 30% of the data and applied to the complete dataset. (Panel A) heat scatter plot, in which colour marks the areas where values are highly concentrated (red) or sparse (blue); (Panel B) error plot showing the differences between observed and predicted values; (Panel C) bar chart of the distribution of these errors.
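The three split schemes named in the captions of Figures 6 to 8 can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names, the `seed` parameter, and the ten-record example are hypothetical, and only the 30%/70% proportions and the "first" versus "random" selection come from the captions.

```python
import random


def split_first_fraction(n_samples, train_fraction=0.3):
    """RF0-style split: the first block of records trains, the rest tests."""
    n_train = int(n_samples * train_fraction)
    indices = list(range(n_samples))
    return indices[:n_train], indices[n_train:]


def split_random_fraction(n_samples, train_fraction=0.3, seed=0):
    """RF1-style split: a random subset trains, the remaining records test."""
    n_train = int(n_samples * train_fraction)
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is a hypothetical choice
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def split_random_train_full_eval(n_samples, train_fraction=0.3, seed=0):
    """RF2-style split: random training subset, evaluation on every record."""
    train, _ = split_random_fraction(n_samples, train_fraction, seed)
    return train, list(range(n_samples))


if __name__ == "__main__":
    n_records = 10  # hypothetical dataset size
    for name, splitter in [
        ("RF0", split_first_fraction),
        ("RF1", split_random_fraction),
        ("RF2", split_random_train_full_eval),
    ]:
        train_idx, eval_idx = splitter(n_records)
        print(name, "train:", train_idx, "evaluate:", eval_idx)
```

In the RF2-style scheme the evaluation set contains the training records, so its scores are not a purely out-of-sample estimate and should be read with that in mind when compared with the RF0 and RF1 schemes.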
Figure 9. Random Forest 0 (RF0): results for the model trained on the first 30% of the data and applied to the remaining 70% of the dataset. The dataset includes odour data below 1000 ouE/m3. (Panel A) heat scatter plot, in which colour marks the areas where values are highly concentrated (red) or sparse (blue); (Panel B) error plot showing the differences between observed and predicted values; (Panel C) bar chart of the distribution of these errors.
Figure 10. Random Forest 1 (RF1): results for the model trained on a random 30% of the data and applied to the remaining 70% of the dataset. The dataset includes odour data below 1000 ouE/m3. (Panel A) heat scatter plot, in which colour marks the areas where values are highly concentrated (red) or sparse (blue); (Panel B) error plot showing the differences between observed and predicted values; (Panel C) bar chart of the distribution of these errors.
Figure 11. Random Forest 2 (RF2): results for the model trained on a random 30% of the data and applied to the complete dataset. The dataset includes odour data below 1000 ouE/m3. (Panel A) heat scatter plot, in which colour marks the areas where values are highly concentrated (red) or sparse (blue); (Panel B) error plot showing the differences between observed and predicted values; (Panel C) bar chart of the distribution of these errors.
Figure 12. Odour concentration obtained from the measurements with the e-Nose and odour concentration calculated using the literature model (Equation (6)).
Figure 13. Odour concentration obtained from the measurements with the e-Nose and odour concentration calculated using the Least Squares Model (Equation (7)).
Table 1. Correlation matrix of odour concentration (CcOD), VEC, VPID, VH2S, VC6H6, wind intensity (wint), wind direction (wdir), temperature (temp), pressure (pres), precipitation (prec), relative humidity (rhum), and UV index (uvidx).

| | CcOD | VEC | VPID | VH2S | VC6H6 | wint | wdir | temp | pres | prec | rhum | uvidx |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CcOD | / | −0.005 | 0.337 | 0.780 | 0.663 | −0.143 | −0.110 | 0.047 | 0.004 | −0.009 | 0.058 | −0.096 |
| VEC | −0.005 | / | −0.038 | −0.043 | 0.053 | 0.021 | −0.050 | 0.083 | 0.023 | −0.015 | −0.076 | 0.060 |
| VPID | 0.337 | −0.038 | / | 0.329 | 0.558 | −0.042 | −0.270 | 0.228 | −0.058 | 0.003 | 0.174 | −0.086 |
| VH2S | 0.780 | −0.043 | 0.329 | / | 0.563 | −0.313 | −0.129 | −0.147 | −0.003 | 0.036 | 0.326 | −0.315 |
| VC6H6 | 0.663 | 0.053 | 0.558 | 0.563 | / | −0.242 | −0.287 | 0.232 | −0.021 | −0.055 | 0.095 | −0.037 |
| wint | −0.143 | 0.021 | −0.042 | −0.313 | −0.242 | / | 0.169 | 0.263 | −0.037 | 0.252 | −0.408 | 0.149 |
| wdir | −0.110 | −0.050 | −0.270 | −0.129 | −0.287 | 0.169 | / | −0.237 | 0.001 | 0.092 | 0.032 | −0.015 |
| temp | 0.047 | 0.083 | 0.228 | −0.147 | 0.232 | 0.263 | −0.237 | / | −0.051 | −0.051 | −0.574 | 0.231 |
| pres | 0.004 | 0.023 | −0.058 | −0.003 | −0.021 | −0.037 | 0.001 | −0.051 | / | −0.027 | 0.002 | −0.163 |
| prec | −0.009 | −0.015 | 0.003 | 0.036 | −0.055 | 0.252 | 0.092 | −0.051 | −0.027 | / | 0.085 | −0.050 |
| rhum | 0.058 | −0.076 | 0.174 | 0.326 | 0.095 | −0.408 | 0.032 | −0.574 | 0.002 | 0.085 | / | −0.360 |
| uvidx | −0.096 | 0.060 | −0.086 | −0.315 | −0.037 | 0.149 | −0.015 | 0.231 | −0.163 | −0.050 | −0.360 | / |
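A symmetric matrix like Table 1 can be built by applying a pairwise correlation coefficient to every pair of columns. The sketch below assumes the Pearson coefficient, which this part of the article does not state explicitly; the helper names and the toy readings are hypothetical and are not the study's data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


if __name__ == "__main__":
    # Made-up readings, for illustration only.
    toy = {
        "CcOD": [120.0, 340.0, 90.0, 800.0, 410.0],
        "VH2S": [0.8, 1.9, 0.5, 3.7, 2.2],
        "wint": [3.1, 2.0, 4.2, 0.9, 1.5],
    }
    for row_name, row in correlation_matrix(toy).items():
        print(row_name, {k: round(v, 3) for k, v in row.items()})
```

The diagonal of such a matrix is always 1, which is why Table 1 marks it with "/" rather than a number.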
Table 2. Odour concentration values at the 25th, 50th, 75th, and 98th percentiles for the W, NW, and SW wind directions.

| Percentile | W (ouE/m3) | NW (ouE/m3) | SW (ouE/m3) |
|---|---|---|---|
| 25th | 44.9 | 52.55 | 80.2 |
| 50th | 114.4 | 119.4 | 163.6 |
| 75th | 269.0 | 257.0 | 281.6 |
| 98th | 2904.0 | 3766.11 | 4547.4 |
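Percentiles per wind sector, as in Table 2, amount to grouping the records by direction and ranking each group. The sketch below is an assumption-laden illustration: it uses linear interpolation between ranks (one of several common percentile conventions, and not necessarily the one used in the study), and the function names and sample records are hypothetical.

```python
def percentile(values, q):
    """Percentile q (0 to 100) using linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sequence is undefined")
    rank = (len(ordered) - 1) * q / 100
    lower = int(rank)  # rank is non-negative, so int() acts as floor
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def percentiles_by_sector(records, qs=(25, 50, 75, 98)):
    """Group (sector, concentration) pairs and compute percentiles per sector."""
    groups = {}
    for sector, concentration in records:
        groups.setdefault(sector, []).append(concentration)
    return {
        sector: {q: percentile(values, q) for q in qs}
        for sector, values in groups.items()
    }


if __name__ == "__main__":
    # Made-up (sector, ouE/m3) records, for illustration only.
    sample = [("W", 40.0), ("W", 120.0), ("W", 300.0),
              ("SW", 80.0), ("SW", 160.0), ("SW", 290.0)]
    print(percentiles_by_sector(sample))
```

Different percentile conventions give slightly different values on small groups, so the convention should be fixed before comparing sectors.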
Table 3. Intercepts, slope coefficients, R-squared values, and RMSE of the three RF models.

| Models | Intercept | Slope Coefficient | R-Squared | RMSE |
|---|---|---|---|---|
| RF0 | 1.037 | 0.792 | 0.31 | 1.125 |
| RF1 | 0.297 | 0.975 | 0.7 | 0.87 |
| RF2 | −0.0837 | 1.0393 | 0.75 | 0.75 |
Table 4. Relative importance of the variables VEC, VPID, VH2S, and VC6H6 in the three Random Forest models.

| Models | VEC Importance | VPID Importance | VH2S Importance | VC6H6 Importance |
|---|---|---|---|---|
| RF0 | 0.8% | 22.4% | 40.8% | 36% |
| RF1 | 1% | 19.5% | 49.5% | 30% |
| RF2 | 1% | 19.5% | 49.5% | 30% |
Table 5. Data from the RF models for odour concentrations below 1000 ouE/m3.

| Models | Intercept | Slope Coefficient | R-Squared | RMSE |
|---|---|---|---|---|
| RF0 | 4.62 | 0.028 | 0.01 | 1.17 |
| RF1 | −0.758 | 1.132 | 0.27 | 0.83 |
| RF2 | −1.11 | 1.21 | 0.47 | 0.71 |
Table 6. Coefficients of the literature model of Equation (6).

| Coefficients | Estimate | Std. Error | t-Value | Pr(>\|t\|) | Confidence Ranges |
|---|---|---|---|---|---|
| k0 | −1.18 × 10³ | 2.34 × 10² | −5.05 × 10 | <2 × 10⁻¹⁶ | 1.168 × 10³; 1.078 × 10³ |
| k1 | 3.47 | 2.5 × 10⁻² | 8.77 × 10 | <2 × 10⁻¹⁶ | 3.26; 3.41 |
Table 7. Coefficients of the least squares model of Equation (7).

| Coefficients | Estimate | Std. Error | t-Value | Pr(>\|t\|) | Confidence Range |
|---|---|---|---|---|---|
| k0 | −1.39 × 10³ | 2.19 × 10 | −63.3 | <2 × 10⁻¹⁶ | −1.37 × 10³; −1.29 × 10³ |
| k3 | 2.64 | 4.31 × 10⁻² | 61.2 | <2 × 10⁻¹⁶ | 2.46; 2.63 |
| k4 | 1.99 × 10⁵ | 5.83 × 10³ | 34.1 | <2 × 10⁻¹⁶ | 1.8 × 10⁵; 2.1 × 10⁵ |
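In a least squares summary, the t-value of a coefficient is its estimate divided by its standard error. The short check below applies that relation to the Table 7 entries; it assumes that the standard error of k0 reads 2.19 × 10 and its t-value −63.3, which is an interpretation of the flattened table row, and the variable names are hypothetical.

```python
# (estimate, standard error, reported t-value) as read from Table 7.
TABLE_7 = {
    "k0": (-1.39e3, 2.19e1, -63.3),  # std. error read as 2.19 x 10 (assumption)
    "k3": (2.64, 4.31e-2, 61.2),
    "k4": (1.99e5, 5.83e3, 34.1),
}


def t_value(estimate, std_error):
    """t statistic of a regression coefficient: estimate / standard error."""
    return estimate / std_error


if __name__ == "__main__":
    for name, (estimate, std_error, reported) in TABLE_7.items():
        computed = t_value(estimate, std_error)
        print(f"{name}: computed t = {computed:.2f}, reported t = {reported}")
```

Small differences between computed and reported values are expected, because the table lists estimates and standard errors rounded to three significant figures.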
Table 8. Comparison of metrics between the two models.

| Metrics | Literature Model | Least Squares Model |
|---|---|---|
| R² | 0.6 | 0.7 |
| MSE | 750.8 | 612.8 |
| RMSE | 866.5 | 782.8 |
| MAE | 590.9 | 515.0 |
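The four metrics compared in Table 8 have standard textbook definitions, sketched below for a pair of observed and predicted series. This is an illustration under assumptions: R² is taken as one minus the ratio of residual to total sum of squares, which may differ from the exact variant used in the study, and the function name and sample values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, MSE, RMSE, and MAE for two equally long sequences."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    mse = ss_res / n
    return {
        "R2": 1 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(r) for r in residuals) / n,
    }


if __name__ == "__main__":
    # Made-up odour concentrations in ouE/m3, for illustration only.
    observed = [150.0, 900.0, 2400.0, 600.0]
    predicted = [300.0, 700.0, 2000.0, 800.0]
    for metric, value in regression_metrics(observed, predicted).items():
        print(metric, round(value, 3))
```

RMSE and MAE are expressed in the units of the target variable, while MSE is in squared units, so the three are not directly comparable in magnitude.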
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Franchina, C.; Cefalì, A.M.; Gianotti, M.; Frugis, A.; Corradi, C.; De Prosperis, G.; Ronzio, D.; Ferrero, L.; Bolzacchini, E.; Cipriano, D. Innovative Approaches to Industrial Odour Monitoring: From Chemical Analysis to Predictive Models. Atmosphere 2024, 15, 1401. https://doi.org/10.3390/atmos15121401
