Use of Association Algorithms in Air Quality Monitoring
Round 1
Reviewer 1 Report (New Reviewer)
- Sources of CO2 and PM are essentially the same (vehicle flow)?
- Development of PM also depends on the concentration of SO2 and NO2, along with CO2. As PM concentration in a city comprises both primary and secondary PM, it would be an exaggeration to say that.
- As mentioned in the manuscript also, effect of meteorology should be considered.
- The manuscript focuses only on the vehicular pollution of the city, but information about the road structure (paved/unpaved) and the number of vehicles (if available) was not taken into consideration.
- Dust resuspension contributes a large proportion of PM pollution. There is a lack of information in the manuscript on this aspect.
- Since the manuscript only targets the interpretation of PM through the concentration of CO2, or mentions the correlation analysis between them, it cannot be generalized to any random city, because sources of pollution are city-specific.
General comments:
Line 35: population instead of pollution
Line 43: The numbering of references is unsystematic. References need to be numbered in the order in which they first appear in the text.
Line 84: The statement cannot be generalized; at least 3-4 pollutants are required to assess the status of air quality. Although CO2 is a good indicator, it only captures vehicular and industrial pollution and bypasses the other important sources contributing to air pollution in a particular area. The text should therefore indicate in what type of geographical area CO2 acts as an excellent indicator.
Figure 1: No specific difference is visible between Sensors 1, 2, and 3.
Author Response
Dear reviewer, we appreciate the suggestions for improving our work. We have tried to address all the comments, and we are available for further clarifications and revisions. We rely on your experience to help us adapt our work to the journal's standards.
Changes made to the text are marked in red.
- Are CO2 and PM sources essentially the same (vehicle flow)?
The city used to test the monitoring platform has few industries, despite having a population of 500 thousand inhabitants. However, it is one of the cities with the highest number of cars per capita in Brazil.
Due to the large number of vehicles, many parts of the city are highly congested at certain times. We selected some of these critical points to make measurements, and test whether the monitoring system is capable of detecting these pollution peaks.
We have modified the text in the introduction, aiming to make it clearer about the main polluting sources.
- The development of PM also depends on the concentration of SO2, NO2 along with CO2.
We completely agree with the reviewer. PM concentration is dependent on several other pollutants.
However, an important point to be reinforced is that the association algorithm does not determine cause, that is, it is not our objective to conclude that the increase in the concentration of particulate matter is caused by the increase in the concentration of carbon dioxide. The system does not determine causes or effects, it only finds patterns between variables, so that it is possible to predict the standard behavior of a variable, based on the value of another.
Thus, during the monitoring period, our system demonstrated that, by default, when the CO2 concentration is higher (above the ideal limit), there is a tendency for the particulate concentration to also be at higher values.
But we fully agree with the reviewer that the more pollutants the platform measures (generates more parameters), the greater the degree of reliability of the results. However, to measure each of the pollutants, a specific type of sensor is required. When a large number of sensors work simultaneously, it generates a serious problem of energy consumption, reducing the durability of the batteries, in addition to increasing the cost of building the system.
However, we concluded after the tests that it would be very important for the platform to also have a humidity sensor, as this variable has a strong influence on particulate matter. In the next version of the platform, we will certainly add more sensors in order to optimize the results.
To try to make this idea clearer in the text, we modified in section 3, a paragraph that describes the objective of the association algorithm, which is basically to find statistical patterns, without determining causes and effects.
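The distinction drawn above between finding a pattern and establishing a cause can be illustrated with a minimal sketch. It assumes readings have already been discretized into "high" or "not high" for each pollutant; the `readings` list and the `rule_metrics` helper are hypothetical illustrations, not data or code from the manuscript.

```python
# Hypothetical discretized readings, one pair per measurement:
# (co2_is_high, pm_is_high). These values are invented for illustration.
readings = [
    (True, True), (True, True), (True, False), (False, False),
    (False, False), (True, True), (False, True), (False, False),
]


def rule_metrics(pairs):
    """Support, confidence and lift for the rule 'CO2 high -> PM high'."""
    n = len(pairs)
    both = sum(1 for co2, pm in pairs if co2 and pm)
    antecedent = sum(1 for co2, _ in pairs if co2)
    consequent = sum(1 for _, pm in pairs if pm)
    support = both / n                      # how often both are high together
    confidence = both / antecedent if antecedent else 0.0
    lift = confidence / (consequent / n) if consequent else 0.0
    return support, confidence, lift


support, confidence, lift = rule_metrics(readings)
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
```

These quantities are computed purely from co-occurrence counts. A lift above 1 says that high PM readings are more frequent when CO2 is high than overall, but nothing in the calculation indicates which variable, if either, drives the other.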
- As mentioned in the manuscript too, the effect of meteorology must be considered.
We completely agree with the reviewer. We observed that on rainy days the error percentage of the algorithm was higher than on sunny days. If the platform had a humidity sensor, the accuracy of the algorithm would certainly be improved.
However, the insertion of new sensors needs to be carefully planned, as it increases battery and data transmission demands.
But we are already developing a new version of the platform, which will be made up of more sensors, including a humidity sensor.
In order to make this meteorological issue more detailed, we modified paragraph x, showing the importance of considering humidity in PM measurement.
- The manuscript is focused only on vehicular pollution in the city, but it lacks information about road structures, such as paved/unpaved, number of vehicles (if available), it was not taken into account. Dust resuspension is a major contributor to PM pollution. There is a lack of information in the manuscript on this aspect.
We fully agree with the reviewer on the importance of considering the road structure of the monitored site. However, our main objective in this work is to present the architecture and technical details of the developed monitoring system.
The 30-day period in which we carried out the measurements served only to test the behavior and efficiency of the project. But we failed to make this information clearer in the text.
To solve this problem, we added two paragraphs in the introduction (6th and 7th) to clarify our main objective, which is to describe the improvements made in the third version of the monitoring system.
- As the manuscript only aims at the interpretation of PM with CO2 concentration or mentions the correlation analysis between them, it cannot be generalized to any random city due to specific sources of pollution.
We completely agree with the reviewer. The association algorithm does not establish the cause and effect relationship. It only determines a pattern of behavior.
In this way, each city needs to be monitored to have its air quality standards determined.
To make this information clearer, we inserted in the text that it is not possible to generalize to other cities.
Once again, we thank the reviewer for all the observations and suggestions, and we are available to make further improvements to the work.
Author Response File: Author Response.docx
Reviewer 2 Report (New Reviewer)
The authors have chosen the right topic to present the association of algorithms to couple the air pollutant data.
In this work, the authors have worked on CO and PM concentrations at different sites.
But the results are not well established, and the analysis was not conducted extensively.
I suggest and recommend that the authors carry out an extensive analysis to establish a strong association between CO and PM concentrations through AI and ML algorithms.
I also encourage the authors to work on other pollutant levels, such as O3 and PM, or O3, NO2 and PM.
An elaborate discussion based on the results should be presented.
The novelty is clear, but the importance and significance of the work from the perspective of marketing the sensor is not clear at this moment.
The authors should address and define the problem statement and explain how the problem will be resolved by developing association algorithms between air pollutants.
In several instances the English grammar should be corrected, and many lines need to be reconstructed.
Author Response
# Reviewer 2
Dear reviewer, we appreciate the suggestions for improving our work. We have tried to address all the comments, and we are available for further clarifications and revisions. We rely on your experience to help us adapt our work to the journal's standards.
Changes made to the text are marked in red.
- The authors chose the right topic to present the association of algorithms to couple air pollutant data. In this work, the authors worked for CO and PM concentrations in different locations. I suggest and recommend authors to work on extensive analyzes to know the strong association between CO and PM concentrations through AI and ML algorithms.
We very much appreciate this observation made by the reviewer. Before specifically addressing the reviewer's observations, I'd like to put the project in context a little better.
Our research group developed and published, in 2018, the first version of the air quality monitoring platform. It consisted of a network of wireless sensors that measured the concentration of pollutants. Unlike most similar works at the time, we added a data analysis and mining system, with the aim of generating useful knowledge so that engineers and scholars could make the best decisions.
Despite the good results, during the tests, we noticed that the project had a series of limitations, generated mainly by problems in the communication of the sensors, which were directly affected by factors such as distance, physical barriers and interference. In this way, many environments could not be monitored, such as forests, agricultural regions and places with very high buildings.
In early 2022 we developed a second version of the project, with a series of improvements, and published it in volume 13 of the Atmosphere journal. In this second version of the air quality monitoring platform, more modern techniques were used, capable of resolving the limitations of the previous project. This time, instead of a wireless sensor network, we used the concept of autonomous sensors, which record the data directly on the central server right after taking the measurements and, if no internet signal is available, store the data in their local database to sync later.
And now, in 2023, we are presenting a third version of the project for evaluation. The objective was to increase the possibilities of air quality analysis. In addition to the carbon dioxide sensors, equipment was added to measure the concentration of particulate matter.
In addition, an artificial intelligence algorithm known as Apriori has also been added. As we commented in the work, it is widely used in the business environment to make associations between variables, that is, to estimate the value of one variable from the values of others. We have adapted this algorithm so that, based on the carbon dioxide concentration, it can estimate the level of particulate matter concentration in the monitored location.
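For readers unfamiliar with Apriori, the following is a minimal sketch of the textbook frequent-itemset search, offered only as an illustration. It is not the authors' adapted implementation, and the item labels (such as `"CO2=high"`), the sample `transactions`, and the `min_support` value are assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every frequent itemset."""
    n = len(transactions)
    sets = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(1 for t in sets if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in sets for item in t}
    current = {frozenset([i]) for i in items}
    current = {c for c in current if support(c) >= min_support}

    frequent = {}
    k = 1
    while current:
        for c in current:
            frequent[c] = support(c)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical discretized measurements, one set of labels per time window.
transactions = [
    {"CO2=high", "PM10=high", "PM2.5=high"},
    {"CO2=high", "PM10=high"},
    {"CO2=low", "PM10=low", "PM2.5=low"},
    {"CO2=high", "PM10=high", "PM2.5=high"},
    {"CO2=low", "PM10=low", "PM2.5=high"},
]

frequent = apriori(transactions, min_support=0.4)
for itemset, supp in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), supp)
```

Rules such as "CO2=high implies PM10=high" would then be read off the frequent itemsets by comparing the support of the pair with the support of the antecedent alone.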
With this context in mind, we would like to say that we fully agree with the reviewer that for a more effective monitoring of air quality, it would be interesting for measurements to take place over a period longer than 30 days. Some references even suggest years of monitoring for a good result.
However, our main objective in this work is to present the architecture and technical details of the monitoring system. The 30-day period in which we carried out the measurements served only to test the behavior and efficiency of the project when put into operation in an uncontrolled environment. But we failed to make this information clearer in the text.
To solve this problem, we added two paragraphs in the introduction (6th and 7th) to clarify our main objective, which is to describe the improvements made in the third version of the monitoring system.
- I also encourage authors to work on other levels of pollutants such as O3 and PM, O3, NO2 and PM
We appreciate the reviewer's suggestion.
We completely agree on the importance of measuring other gases such as ozone (O3), sulfur dioxide (SO2), among many others. However, each of these pollutants requires a specific type of sensor. When a large number of sensors work simultaneously, it generates a serious energy consumption problem, greatly reducing battery life.
But we are developing a new version of the monitoring platform that will have resources to monitor at least 3 more types of pollutants and also the percentage of humidity (which has a strong influence on PM).
- Authors must address and define the problem statement and problem definition and how the problem will be solved by developing association algorithms between air pollutants
Thanks for the reviewer's suggestion.
Each type of pollutant needs a specific type of sensor. This fact makes the quality monitoring process a costly activity.
Our system aims to facilitate the air quality monitoring process. For this, the system measures few pollutants (in principle CO2 and PM), and through machine learning algorithms, it is able to estimate the concentration of other toxic gases.
To make this information more evident, we modified the 4th and 5th paragraphs of the introduction to clarify the difficulties in carrying out air quality monitoring, and how the use of AI can help in this process.
- In many cases the English grammar must be corrected and many lines need to be reconstructed.
We agree with the reviewer. We carried out an extensive revision of the text (by hand and using the Grammarly software), with the aim of correcting the language problems. If the work is accepted, we will also hire the English language revision service offered by the journal, to adapt the text to the standards.
Once again, we thank the reviewer for all the observations and suggestions, and we are available to make further improvements to the work.
Author Response File: Author Response.docx
Reviewer 3 Report (New Reviewer)
1) There is not enough explanation for simulation results. You need to clearly explain the simulation results
2) The problem statement has been discussed accordingly. However, the literature review supporting the proposed method is lacking. The limitations of previous methods, and how they led to the proposed method being chosen and enhanced in this study, are not discussed rigorously.
3) In the results and discussion, the performance of the proposed method should be compared to several baseline methods in order to prove the effectiveness of the proposed method.
4) For the performance evaluation, it would be better to add objective performance evaluations alongside the subjective performance evaluation, so as to confirm the robustness of the new system.
5) The authors should add an analysis of the results. How good is your method?
Author Response
# Reviewer 3
Dear reviewer, we appreciate the suggestions for improving our work. We have tried to address all the comments, and we are available for further clarifications and revisions. We rely on your experience to help us adapt our work to the journal's standards.
Changes made to the text are marked in red.
- There is not enough explanation for the simulation results. You need to clearly explain the simulation results.
We very much appreciate this observation made by the reviewer. Before specifically addressing the reviewer's observations, I'd like to put the project in context a little better.
Our research group developed and published, in 2018, the first version of the air quality monitoring platform. It consisted of a network of wireless sensors that measured the concentration of pollutants. Unlike most similar works at the time, we added a data analysis and mining system, with the aim of generating useful knowledge so that engineers and scholars could make the best decisions.
Despite the good results, during the tests, we noticed that the project had a series of limitations, generated mainly by problems in the communication of the sensors, which were directly affected by factors such as distance, physical barriers and interference. In this way, many environments could not be monitored, such as forests, agricultural regions and places with very high buildings.
In early 2022 we developed a second version of the project, with a series of improvements, and published it in volume 13 of the Atmosphere journal. In this second version of the air quality monitoring platform, more modern techniques were used, capable of resolving the limitations of the previous project. This time, instead of a wireless sensor network, we used the concept of autonomous sensors, which record the data directly on the central server right after taking the measurements and, if no internet signal is available, store the data in their local database to sync later.
And now, in 2023, we are presenting a third version of the project for evaluation. The objective was to increase the possibilities of air quality analysis. In addition to the carbon dioxide sensors, equipment was added to measure the concentration of particulate matter.
In addition, an artificial intelligence algorithm known as Apriori has also been added. As we commented in the work, it is widely used in the business environment to make associations between variables, that is, to estimate the value of one variable from the values of others. We have adapted this algorithm so that, based on the carbon dioxide concentration, it can estimate the level of particulate matter concentration in the monitored location.
With this contextualization in mind, we would like to reinforce that our objective in this work is not to generate a report with the air quality profile in the city. This would require more time and monitoring.
Our main objective in this work is to present the architecture and technical details of the developed monitoring system. The 30-day period in which we carried out the measurements served only to test the behavior and efficiency of the project when put into operation in an uncontrolled environment. But we failed to make this information clearer in the text.
To solve this problem, we added two paragraphs in the introduction (6th and 7th) to clarify our main objective, which is to describe the improvements made in the third version of the monitoring system.
- The problem statement was discussed accordingly. However, there is a lack of literature reviews that support the proposed method. Limitations of previous methods are not discussed rigorously on how the proposed method is then chosen and improved in this study.
We appreciate the reviewer's suggestion. The machine learning techniques present in the project are used very frequently in commercial environments, even on well-known sites such as YouTube, which prepares a list of suggested new videos by associating the videos watched by users.
However, there are some works where these algorithms are used for environmental research. References 22 and 23 present in the work are excellent examples.
In response to the reviewer's suggestion, we modified the second and third paragraphs of the introduction, describing these works that apply machine learning algorithms to detect pollution patterns.
- In the results and discussion, the performance of the proposed method should be compared to various baseline methods to prove the effectiveness of the proposed method.
We appreciate the observation made by the reviewer. We modified the 4th and 5th paragraphs of the discussion to clarify the performance and effectiveness of the method. Using a cross-validation technique, it is possible to determine the percentage of success of the algorithm.
In the 4th and 5th paragraphs of the discussion we show that the machine learning algorithm correctly predicted the association between CO2 and PM10 in 77.5% of cases, and between CO2 and PM2.5 in 71.9% of cases.
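How a cross-validated success percentage of this kind can be obtained is sketched below. This is only an illustration under stated assumptions: the predictor is a simple stand-in that learns the most frequent PM level for each CO2 level, the fold assignment is a plain interleaved split, and the `samples` are invented, so the result is unrelated to the 77.5% and 71.9% figures reported above.

```python
from collections import Counter


def k_fold_accuracy(samples, k):
    """Mean hold-out accuracy of a 'most frequent PM level per CO2 level' rule."""
    # Interleaved split: sample i goes to fold i % k.
    folds = [samples[i::k] for i in range(k)]
    accuracies = []
    for i, test_fold in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        # Learn the majority PM label for each CO2 label from the training folds.
        counts = {}
        for co2, pm in train:
            counts.setdefault(co2, Counter())[pm] += 1
        rule = {co2: c.most_common(1)[0][0] for co2, c in counts.items()}
        # Score the learned rule on the fold that was held out.
        hits = sum(1 for co2, pm in test_fold if rule.get(co2) == pm)
        accuracies.append(hits / len(test_fold))
    return sum(accuracies) / k


# Hypothetical (co2_level, pm_level) pairs; not measured data.
samples = (
    [("high", "high")] * 7 + [("high", "low")] * 3
    + [("low", "low")] * 8 + [("low", "high")] * 2
)

accuracy = k_fold_accuracy(samples, k=5)
print(f"cross-validated accuracy: {accuracy:.1%}")
```

Because each fold is scored only by a rule learned from the other folds, the averaged figure estimates how often the association would hold on readings the algorithm has not seen.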
- The author must add result analysis.
We appreciate the reviewer's comment. We modified the discussion section, clarifying the results obtained during the research.
We also reinforce in the text that our main objective in this work is to present the architecture of the monitoring system developed by our research group, as well as the types of data analysis it is capable of performing. The 30 days of monitoring carried out served to verify the behavior of the project in a real environment.
Once again, we thank the reviewer for all the observations and suggestions, and we are available to make further improvements to the work.
Author Response File: Author Response.docx
Round 2
Reviewer 2 Report (New Reviewer)
The authors have revised the work considering all the comments and suggestions given by the referees. The current version of the paper has improved a lot, and a significant and extensive revision has been undertaken. Hence the current version of the paper is readily accepted for publication, and I recommend it and have no hesitation about its acceptance. Thank you for the opportunity to read and review the work.
This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.
Round 1
Reviewer 1 Report
Explanation can be improved related to results and conclusions.
Reviewer 2 Report
Introduction:
1. The introduction on PM 2.5 is pretty clear. The authors referred to their previous work, in which CO2 measurement was used for air quality measurement. Is there any other literature associating single or multiple parameters to estimate PM? A few more past works should be added.
Methods
1. The mathematics of the added algorithm should be given. How the algorithm improves the desired performance should be mentioned.
2. Any hyperparameter tuning done to optimize the output should be discussed.
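For reference, the standard measures behind association rules of the kind Apriori produces can be stated as follows. This is the general textbook formulation, not necessarily the exact form used in the manuscript; the symbols are notation introduced here, with T the set of N recorded transactions and X, Y disjoint itemsets.

```latex
\[
\operatorname{supp}(X) = \frac{\lvert \{\, t \in T : X \subseteq t \,\} \rvert}{N},
\qquad
\operatorname{conf}(X \Rightarrow Y) = \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)},
\qquad
\operatorname{lift}(X \Rightarrow Y) = \frac{\operatorname{conf}(X \Rightarrow Y)}{\operatorname{supp}(Y)}
\]
```

In this formulation the minimum support and minimum confidence thresholds are the natural candidates for the hyperparameter tuning mentioned in point 2.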
Results and discussion:
1. Claiming that CO2 is associated with PM by just comparing data collected for 2 parameters seems inadequate. Perhaps a statistical/correlation analysis that validates the claim can be attempted.
2. State the association and classification margin error values. What is the acceptable range?
3. Figure 5: The air quality, does it refer to only CO2 and PM data or overall air quality?
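The statistical check suggested in point 1 of the results comments could, for example, start from a Pearson correlation coefficient between paired readings. The sketch below is one possible form of such an analysis, not something taken from the manuscript; the `co2` and `pm25` values are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired readings (CO2 in ppm, PM2.5 in ug/m3); not measured data.
co2 = [410, 450, 520, 600, 480, 700]
pm25 = [12, 15, 22, 30, 18, 35]

r = pearson(co2, pm25)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 or -1 would support a linear association between the two series. Like the association rules themselves, it would not establish a causal link.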
Reviewer 3 Report
I read the article entitled "USE OF ASSOCIATION ALGORITHMS IN AIR QUALITY MONITORING". Unfortunately, the article did not follow the basic principles of a scientific article and is more like a report. There are only 14 references in the article and in some cases the YouTube site is mentioned! So I reject it. My comments are as follows:
* The language should be improved.
* Referencing should be appropriate. In line 40 and 62, two references should be integrated together.
* The introduction section is very weak. In the introduction, it includes the introduction and review of the literature. However, in this section of the introduction of two paragraphs, only the effects of pollutants and PM are defined. It should include several sections in the introduction and discuss one aspect of the study in each section. There is no literature review in the introduction. It is not clear what the purpose of the authors is and what is the background of the method considered in their study! Only 8 references are used in the introduction section, which indicates the weakness of the article.
* The state of pollution in Brazil and the background of its studies are not mentioned in the introduction or in the methodology. The position of the present study among previous studies is not clear.
* The methodology is very poorly expressed. In the beginning, the history is stated, which is not related to the topic. In the second part, non-related material is also presented! It is better to explain the method used in the current study and to specify what kind of artificial intelligence algorithm is used than to refer to YouTube website!
* The results are very superficial and are just a report. The basis of the method under consideration is very small. In the study, only 3 stations were examined for a period of one month! Regardless of the fact that such studies are usually based on more stations over a long period of time (for example, 20 years), how can the evaluated results be extended to different months of the year? Only CO2 and PM parameters have been evaluated, while other parameters such as ozone (O3), sulfur dioxide (SO2), carbon monoxide (CO), nitrogen dioxide (NO2) are also very important. What is the difference between your article and the following study published in Brazil?
10.5327/Z21769478782