Article

Use of IoT with Deep Learning for Classification of Environment Sounds and Detection of Gases

1 Department of Communication Engineering, School of Electronics Engineering, Vellore Institute of Technology, Vellore 632014, TN, India
2 Electronics and Communication Engineering, Vishnu Institute of Technology, Bhimavaram 534202, AP, India
3 Engineering Department, Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Trás-os-Montes e Alto Douro, 5000-801 Vila Real, Portugal
* Author to whom correspondence should be addressed.
Computers 2025, 14(2), 33; https://doi.org/10.3390/computers14020033
Submission received: 29 October 2024 / Revised: 18 January 2025 / Accepted: 19 January 2025 / Published: 22 January 2025

Abstract

The need for safe and healthy air quality has become critical as urbanization and industrialization increase, leading to health risks and environmental concerns. Gas leaks, particularly of gases like carbon monoxide, methane, and liquefied petroleum gas (LPG), pose significant dangers due to their flammability and toxicity. LPG, widely used in residential and industrial settings, is especially hazardous because it is colorless, odorless, and highly flammable, making undetected leaks an explosion risk. To mitigate these dangers, modern gas detection systems employ sensors, microcontrollers, and real-time monitoring to quickly identify dangerous gas levels. This study introduces an IoT-based system designed for comprehensive environmental monitoring, with a focus on detecting LPG and butane leaks. Using sensors like the MQ6 for gas detection, MQ135 for air quality, and DHT11 for temperature and humidity, the system, managed by an Arduino Mega, collects data and sends these to the ThingSpeak platform for analysis and visualization. In cases of elevated gas levels, it triggers an alarm and notifies the user through IFTTT. Additionally, the system includes a microphone and a CNN model for analyzing audio data, enabling a thorough environmental assessment by identifying specific sounds related to ongoing activities, reaching an accuracy of 96%.

1. Introduction

Air pollution, a pervasive and escalating global challenge, poses severe threats to public health, particularly in densely populated urban areas. The intricate web of causative factors encompasses vehicular emissions, industrial activities, and the multifaceted consequences of climate change [1,2]. Governments worldwide are grappling with the complexity of this crisis, evident in initiatives like the commitment of European countries to transitioning to electric vehicles by 2030 [3,4,5]. India has set an ambitious target of achieving a similar transition by 2025, reflecting the urgency to curb the environmental and health ramifications of air pollution [6,7]. In recent months, Delhi, India’s capital, has witnessed a marginal improvement in air quality levels. However, persistent reports highlight concerning concentrations of PM 2.5—fine particles capable of deeply penetrating the lungs [8]—in various pockets of the city and the broader National Capital Region (NCR). Delhi, ranking among the most polluted cities globally, becomes a focal point for intense public debates, particularly as winter sets in, drawing widespread international attention to its ongoing battle with air pollution.
A particularly alarming facet in the Indian context is the frequency of liquefied petroleum gas (LPG) accidents, which are a significant safety concern in the country, with frequent incidents involving both domestic and industrial cylinders [9,10,11,12,13]. These accidents cause numerous fatalities, disproportionately affecting young people. The gravity of these incidents extends beyond individual households, underscoring the urgent need for technological interventions to prevent such tragedies. In this landscape, the Internet of Things (IoT) emerges as a dynamic and transformative technological frontier, particularly within the automotive sector, serving as a foundational element for the unfolding era of Industry 4.0. In the pursuit of addressing the escalating global challenge of air pollution, this research draws inspiration from foundational works in the fields of environmental monitoring, machine learning, and the IoT. Notably, studies have significantly influenced the exploration of machine listening systems within computational auditory scene analysis (CASA) [14]. Moreover, the integration of IoT technology in environmental monitoring has the potential to help comprehend and mitigate the impacts of air pollution [15]. Additionally, research has highlighted the pivotal role of IoT applications in ensuring environmental safety [16,17]. Work at the intersection of IoT technology and smart home management has emphasized the potential of the IoT to create intelligent and interconnected living environments [18,19].
As this research explores the potential of IoT technology in tandem with deep learning for environmental monitoring, it acknowledges the comprehensive body of work in the papers cited above, and in particular those presented in Section 2. This convergence of insights from diverse disciplines forms the basis for an innovative approach to address the multifaceted challenges posed by air pollution, with the intention of fostering a healthier and more sustainable future. Additionally, findings from research on recurrence quantification analysis features for auditory scene classification [20] and principles, algorithms, and applications of computational auditory scene analysis [21] enhance the comprehension of auditory scene classification principles and algorithms, thereby strengthening the groundwork for the proposed research. In the realm of machine learning, particularly deep learning, this research leverages insights from a study focused on understanding the effective receptive field in deep convolutional neural networks [22]. The work offers valuable perspectives on the architectural considerations for CNNs, a crucial component in the proposed acoustic scene classification (ASC) deep learning model. The significance of these foundational studies is further underscored by the inclusion of datasets, such as one curated by K. J. Piczak [23], serving as a benchmark for environmental sound classification studies.
Having these ideas as main guides, in this research we propose and present an IoT-enabled system for comprehensive environmental assessment encompassing air quality, temperature, and humidity. This system integrates MQ6 sensors to detect ambient gases, with a primary focus on monitoring LPG and butane concentrations in the atmosphere. This distinctive feature not only facilitates the identification of hazardous gas leakages, particularly common in household LPG usage, but also enables pre-emptive measures. The environmental parameters of air quality, temperature, and humidity are gauged through MQ135 and DHT11 sensors. To control the cost of the hardware component of the system, an Arduino Mega manages the collection of sensor data, transmitting them seamlessly to both the ThingSpeak cloud and a convolutional neural network (CNN) model. Through its connection to ThingSpeak, the device guarantees the methodical preservation and presentation of environmental data, simplifying monitoring and analysis. If hazardous gases are detected, the system immediately activates an alarm and simultaneously sends a real-time alert to the device owner through the integrated “if this, then that” (IFTTT) application programming interface (API). To enhance environmental awareness, a built-in microphone records audio from the environment, which is then analyzed by a CNN-based deep learning model. This model uses spectrogram creation, Convolution2D, and MaxPooling2D layers in a sequential design to achieve high precision, offering valuable insights into ongoing environmental activities, such as detecting the sirens and maneuvers of the emergency forces responding to the kinds of disasters discussed above, and updating the user’s alert level accordingly. When hazardous gases are detected, both the official authorities and the user of the proposed system can be alerted simultaneously, and by monitoring environmental sounds, the system can keep their alert level up to date.
The remainder of this paper is organized as follows. In the next section, Section 2, we present some of the most important related work, followed by a description of the design of the proposed approach in Section 3. In Section 4, we present the main results achieved with the testing of the proposed solution, along with a discussion and explanation of these results. The paper ends with the presentation of the main conclusions.

2. Related Work

With rising concerns over pollutants in air, water, and soil, artificial intelligence (AI)-driven sensors and IoT systems are increasingly essential for tracking and responding to environmental risks. The number of relevant studies related to the problem of using IoT and deep learning (DL) for environmental sound classification and gas detection has been increasing in recent years. These studies cover IoT frameworks, machine learning (ML) techniques, environmental monitoring applications, and specific advances in DL for sound and gas recognition. The studies presented below reflect the trend of deploying IoT and DL for monitoring and alerting systems that respond to hazardous conditions. Such frameworks are becoming increasingly valuable for assisting disaster management teams in detecting and mitigating disaster-related risks through real-time multi-modal analysis.
A few relevant studies reviewing the integration of ML and AI with IoT-based gas detection follow. The authors of [24] explore how ML enhances the IoT by uncovering insights from sensor data, enabling the IoT to meet future needs in business, government, and individual use. By automating decision-making to mimic human responses, ML empowers the IoT to better perceive and respond to environmental changes. The authors review around 300 sources to classify ML-enabled IoT from data, application, and industry perspectives, highlighting advanced methods and applications. Emerging IoT trends like the Internet of Behavior (IoB), pandemic management, autonomous vehicles, edge/fog computing, and lightweight deep learning are also discussed. Additionally, the paper categorizes IoT challenges into technological, individual, business, and societal domains, offering insights to leverage IoT for more prosperous and sustainable societies. The review in [25] examines the role of AI and the IoT in detecting and monitoring environmental hazards, emphasizing the importance of these technologies in safeguarding human health and ecosystems. It highlights recent advancements in AI and the IoT for pollution monitoring, addressing the complexities of predicting environmental changes. It also explores the integration of ML, which, while transformative for environmental science, presents challenges in model selection, interpretability, data sharing, and balancing performance with practical requirements. The review sheds light on the latest trends and considerations for using AI and IoT in environmental monitoring. In [26], the authors provide an overview of deep neural networks (DNNs) and their various architectures, highlighting the potential benefits of integrating DL with the IoT. They review several DL-driven IoT use cases, followed by the presentation of a DL-based model specifically designed for human activity recognition (HAR). The study also includes a performance comparison of this model with other machine learning techniques, demonstrating the advantages of the DL approach.
Concerning the use of these same technologies and methods to aid authorities in disaster management, the survey in [27] addresses the increasing frequency of man-made and natural disasters over the past 50 years that have resulted in significant loss of life, infrastructure damage, and economic disruption. The paper emphasizes the need for a comprehensive solution that includes early detection, prevention, recovery, and management strategies to minimize losses. It critically analyzes existing methods and technologies relevant to disaster scenarios, such as wireless sensor networks (WSNs), remote sensing techniques, artificial intelligence, the IoT, unmanned aerial vehicles (UAVs), and satellite imagery for disaster monitoring and management. The paper highlights the importance of alternative communication networks, particularly during emergencies when traditional networks may be disrupted. It provides a comprehensive study of various disasters, including landslides, forest fires, and earthquakes, and examines the latest technologies used for their monitoring and management. The paper also discusses essential parameters for disaster detection, offers solutions, and explores big data analytics in disaster management. Additionally, it evaluates various techniques, noting their advantages and disadvantages, while identifying open challenges and suggesting future research directions. The authors of [28] review the integration of IoT solutions into early warning systems for natural disasters, such as floods, earthquakes, tsunamis, and landslides. The paper emphasizes the need for systems that predict disasters and disseminate timely warnings to minimize economic and human losses. The review analyzes various IoT architectures used in early warning systems, outlining their constraints and requirements. It identifies the most commonly employed solutions for each type of disaster and highlights significant gaps in the existing literature. Additionally, the paper suggests enhancements to meet the specific needs of each use case, particularly emphasizing the benefits of incorporating a fog–edge computing layer into IoT architectures to improve data processing and response times. The work in [29] surveys how the IoT can address various challenges related to disaster management, leveraging the IoT’s unique features like interoperability, flexibility, and lightweight architecture. It reviews current approaches for handling disasters, focusing on early warning systems, notifications, data analytics, knowledge aggregation, remote monitoring, real-time analytics, and victim localization. The paper provides a detailed discussion of state-of-the-art strategies in disaster response and prevention, along with IoT-supported protocols and commercially available products ready for deployment.
The number of studies devoted to environmental and industrial monitoring using these technologies, and particularly dealing with gas monitoring, is huge and increasing every day. The work in [30] highlights the development of a low-cost IoT-based framework for monitoring environmental parameters such as air quality, gas levels, temperature, and humidity, addressing the growing demand for enhanced environmental safety and industrial risk mitigation. Utilizing sensors and microcontrollers, the system monitors changes both indoors and outdoors, with data stored on cloud-based platforms like ThingSpeak for visualization and worldwide access. Real-time alert mechanisms, such as mobile notifications, ensure prompt responses to hazardous conditions. The system’s low cost, accuracy, user-friendly design, and scalability make it suitable for urban and industrial applications. Testing demonstrates its high reliability and effectiveness across diverse conditions, showcasing its potential to mitigate risks posed by harmful gas contamination and promote environmental awareness. The authors in [31] address the dangers of gas leakage and its impact on human safety, property, and the environment, particularly in developing countries like Bangladesh. The authors develop a low-cost IoT-based prototype that detects gas leaks and fires caused by gas leakage. When gas is detected, the system automatically shuts off the gas line using a solenoid valve, activates an exhaust fan, and sends a notification to the user via GSM. If a fire is detected, the system releases fire extinguisher balls. Additionally, a buzzer sounds, and an LCD displays the system’s status in real time. This integrated system aims to minimize the risks associated with gas leaks and fires through intelligent, automated response mechanisms. The study in [32] introduces a hybrid convolutional–recurrent neural network (H-CRNN) for early detection of harmful gases using an electronic nose (E-nose) system, specifically addressing the need for improved carbon monoxide (CO) detection due to its hidden and harmful nature. Traditional E-nose methods using basic machine learning models, such as support vector machines (SVMs) or recurrent neural networks (RNNs), often suffer from insufficient accuracy and require extensive time and manpower for sensory analysis. The H-CRNN overcomes these limitations by combining convolutional and recurrent neural network architectures to capture long-term dependencies in data. Key features of this model include shortcut connections, a gated attention mechanism, and linear processing units, which enhance detection sensitivity, scalability, and adaptability to varying conditions. This innovative approach demonstrates significant potential for real-time gas detection in both industrial and environmental monitoring applications. In [33], the authors address industrial accidents caused by harmful gas leaks by monitoring and controlling toxic gases, such as NO2, CO, O3, SO2, and LPG, and hydrocarbons, along with environmental factors like temperature and humidity. They used an Arduino UNO R3 board with various sensors (AQ3, Minipid 2 HS PID, IR5500, DHT11, MQ3) and an ESP8266 Wi-Fi module to transmit real-time data to a cloud-connected safety board. Machine learning and AI are applied for intelligent, real-time predictions, with data accessible worldwide. To ensure high data quality, a hybrid error detection model combining hidden Markov models (HMMs) and artificial neural networks (ANNs) was used.
The authors in [34] discuss a real-time, IoT-based environmental monitoring system designed for individuals working in hazardous environments, such as farmers, sailors, travelers, and miners. To help avoid risks and manage environmental changes, the system continuously monitors critical meteorological parameters like air quality, rainfall, water level, pH, wind direction and speed, temperature, atmospheric pressure, humidity, soil moisture, light intensity, and turbidity. Using sensors, an Arduino UNO microcontroller, GSM, Wi-Fi, and HTTP protocols, this smart monitoring system transmits data to remote locations via a web-based platform. The authors describe the system as efficient, accurate, cost-effective, and reliable, enabling users to track environmental changes from anywhere, contributing to sustainable living and enhanced environmental standards. In [35], the authors discuss the ongoing reliance on manual labor for maintaining sewage systems, highlighting the safety risks faced by workers due to inadequate precautions. To address these hazards, gas sensors are employed to continuously monitor harmful compounds like carbon monoxide and methane, ensuring compliance with safety regulations. However, the lack of safety gear exposes workers to danger, necessitating a prompt detection and alerting system. The authors developed a device capable of detecting hazardous chemicals, such as ammonia, carbon monoxide, methane, and hydrogen sulfide, while also measuring temperature within manholes. Upon detecting any irregularities, the system triggers a buzzer and sends an alert message via a GSM module to facilitate the quick rescue of workers.
In this context, some of the studies also deal with the identification of environmental sounds. The research in [36] explores the integration of multimodal environmental sensing systems using AI and IoT technologies to enhance disaster response and environmental monitoring capabilities. By combining audio data with other environmental parameters (e.g., gas levels, humidity), the system leverages advanced deep learning models for audio signal analysis, facilitating real-time decision-making in resource-constrained settings. To address the challenges of data transfer and storage in such scenarios, the authors introduce an innovative audio compression method optimized for environmental monitoring in multimodal data processing systems. This approach combines a deep learning model tailored for edge devices with a standard audio coding scheme, significantly reducing bit rates while preserving the accuracy critical for air pollution analysis. Once compressed, the audio data are transmitted to the cloud, where powerful computational resources decode them for precise reconstruction and classification, ensuring efficiency and reliability in environmental assessments. The work in [37] examines the effectiveness of transfer learning using pre-trained CNNs for sound classification. Originally designed for image recognition, CNNs have been adapted for audio tasks, where transfer learning involves retraining networks on new datasets. Five CNN models were tested—GoogLeNet, SqueezeNet, and ShuffleNet (trained on images), and VGGish and YAMNet (trained on sounds). The study analyzed the impact of retraining parameters, such as optimizer choice, mini-batch size, learning rate, and number of epochs, on classification accuracy and processing time, including sound pre-processing into scalograms and spectrograms. Using UrbanSound8K, ESC-10, and Air Compressor datasets, the study identified optimal parameter combinations based on accuracy and processing efficiency. The work in [38] explores the use of CNNs for classifying short audio clips of environmental sounds. The proposed model includes two convolutional layers with max-pooling and two fully connected layers trained on segmented spectrograms with delta features. The model was evaluated on three public datasets of environmental and urban audio recordings, surpassed baseline methods using Mel-frequency cepstral coefficients, and achieved accuracy comparable to other state-of-the-art techniques. The survey in [39] reviews state-of-the-art deep learning-based audio classification techniques and their applications within IoT frameworks, focusing on architectures like convolutional neural networks (CNNs), recurrent neural networks (RNNs), autoencoders, transformers, and hybrid models. These models are evaluated for tasks such as speech, music, and environmental sound classification, leveraging complex audio patterns from large datasets. Pre-processing methods like spectrograms, Mel-frequency cepstral coefficients, linear predictive coding, and wavelet decomposition are discussed in terms of their role in enhancing accuracy. The review highlights CNNs for categorical classification, RNNs for capturing temporal patterns, and transformers for extracting temporal and frequency features, while hybrid models integrate strengths from multiple architectures. The paper also examines the adaptability of these methods to IoT applications, offering valuable insights into integrating audio classification for tasks like speech recognition, speaker identification, and environmental monitoring, ultimately facilitating effective deployment in resource-constrained IoT systems.
The reviewed literature highlights significant advancements in IoT and deep learning technologies for environmental monitoring, particularly for gas detection and acoustic scene classification. While several studies focus on individual aspects such as ML-enhanced gas detection [24,26,30], DL-based sound classification [37,39], and IoT systems for real-time alerts [27,33,34], they often lack an integrated approach that combines these elements within a cost-effective framework.
Based on these insights, our proposed system integrates widely available and reliable components like MQ135 and MQ6 sensors for air quality and gas detection, along with the DHT11 sensor for monitoring temperature and humidity. The use of an Arduino Mega ensures cost-effectiveness and compatibility with the sensors, while leveraging ThingSpeak for cloud integration addresses the need for real-time data visualization. Furthermore, the inclusion of a CNN-based acoustic scene classifier reflects advances in DL frameworks for sound recognition, inspired by studies leveraging spectrogram-based approaches [39,40].
The design choices in this study were informed by gaps identified in existing systems, such as limited interoperability [27,33] and challenges in integrating gas detection with sound classification models [38]. Our approach uniquely combines IoT with DL to enhance environmental monitoring and real-time responsiveness, addressing both functional and economic constraints identified in the literature.
In conclusion, the existing literature demonstrates that IoT architectures have significantly evolved to address challenges in environmental monitoring and disaster management. Key elements of successful IoT frameworks include seamless integration of sensor networks, efficient data transmission protocols, and the application of machine learning models for real-time analytics. However, gaps remain, such as limited interoperability, insufficient real-time responsiveness, and challenges in combining multimodal data streams (e.g., acoustic and gas sensors). This study addresses these limitations by proposing a unified IoT architecture that integrates deep learning models for environmental sound and gas detection, leveraging cost-effective and widely available components to ensure scalability and real-time alerts. By synthesizing insights from the reviewed studies, this research lays the groundwork for an innovative and holistic approach to IoT-enabled environmental monitoring.

3. Device Design and Implementation

3.1. Proteus Schematic for IoT Device

The schematic circuit for the entire device was crafted using Proteus 8 software [41], offering a comprehensive visualization of the device’s architecture and functionality. Data acquisition from the sensors is orchestrated by the Arduino microcontroller, serving as the central processing unit of the device. Through this integration, sensor data are transmitted to the ThingSpeak cloud [42,43], enabling remote monitoring and analysis of different environmental parameters. The virtual device mirrors the full functionality of its physical counterpart: it swiftly transfers sensor data to the cloud, ensuring visualization within a short timeframe of 2–3 min. This expedited data transfer is complemented by the system’s capability to trigger remote notifications to the user via Wi-Fi, enhancing user awareness and responsiveness to environmental dynamics. Central to its functionality are the MQ135 and MQ6 gas sensors, adept at detecting a range of gases, including NH3, NOx, alcohol, benzene, CO2, and LPG. While the primary focus lies in air-quality monitoring [44,45,46], the inclusion of the MQ6 LPG detection sensor underscores the device’s versatility in identifying potentially hazardous gases, thereby safeguarding households from the perils of LPG cylinder blasts.
Furthermore, the virtual device’s customization capabilities empower users to tailor its functionality to their specific needs and industry requirements. The integration of the DHT11 sensor provides users with real-time information on humidity and temperature, enriching their understanding of environmental conditions. The user interface is facilitated through an LCD screen, offering clear visualizations of air quality measured in parts per million (PPM), alongside humidity and temperature readings. Additionally, a buzzer linked to the Arduino serves as an alert mechanism, promptly notifying users in the event of LPG detection. To facilitate seamless data transfer from the Arduino to the ThingSpeak cloud, a virtual serial port emulator (VSPE) and a Python script are employed, bridging the simulated device and the cloud in place of the physical Arduino Mega’s network connection. This ensures reliable and uninterrupted communication between the device and the cloud infrastructure, enhancing the overall efficacy and responsiveness of the system. Figure 1 presents a block diagram elucidating the key components and interactions of the developed device, providing a comprehensive overview of its architecture and functionality.
The audio recording process in this system begins with a microphone, either onboard or external, that captures audio data directly from the environment. This microphone is connected to the IoT device or integrated into the setup, enabling continuous recording of environmental sounds and allowing the system to detect specific audio events or cues, such as sirens or other predefined sounds. Once recorded, the audio signals undergo pre-processing, which includes steps like noise reduction, normalization, and format conversion, ensuring the data are clear and standardized for improved classification accuracy. Next, the processed audio segments are transformed into spectrograms using the Librosa library in Python. A spectrogram offers a visual representation of the sound’s frequency spectrum over time, allowing the convolutional neural network (CNN) to interpret audio data in a format akin to image data, enhancing classification. The resulting spectrograms are then transmitted to the audio scene classification (ASC) model, where the CNN model identifies and classifies the audio patterns. Leveraging training data, the model can recognize specific sounds, categorize them into predefined classes, and even trigger alerts if necessary, thus making the system responsive to critical sound events in the monitored environment.
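The pre-processing code itself is not reproduced in the paper, but the pipeline described above can be sketched in a few lines of Python using Librosa. The sketch below assumes a log-scaled Mel spectrogram; the exact spectrogram variant and parameters are not specified in the text:

import librosa
import numpy as np

def audio_to_spectrogram(path, sr=22050, n_mels=128):
    # Load and resample the clip, then normalize its amplitude
    y, _ = librosa.load(path, sr=sr)
    y = librosa.util.normalize(y)
    # Compute a Mel spectrogram and convert power to decibels,
    # yielding an image-like array suitable as CNN input
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)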
The environmental data collection process in this system relies on the integration of multiple sensors within an IoT device to monitor key environmental parameters. Gas sensors such as the MQ135 (referred to as GAS Sensor 1 in Figure 1) and MQ6 (referred to as GAS Sensor 2 in Figure 1) play a crucial role in this setup. The MQ135 sensor detects various air pollutants, including NH3, NOx, alcohol, benzene, and CO2, while the MQ6 sensor is specifically tuned to detect LPG, helping to identify any potential hazardous gas leaks. Additionally, the DHT11 sensor measures temperature and humidity levels, providing essential data for a comprehensive understanding of environmental conditions.
An Arduino microcontroller serves as the main processing hub, collecting readings from each sensor, aggregating the data, and preparing them for transmission. Using a Python script, these aggregated data are then uploaded to the ThingSpeak cloud in real time, typically every 20 s, with the virtual serial port emulator (VSPE) facilitating data transfer from the Arduino to ThingSpeak [47]. This cloud platform not only stores environmental readings but also acts as a monitoring system. If gas levels, temperature, or humidity exceed set thresholds, the ThingSpeak platform triggers an alert mechanism via an IFTTT applet. This setup sends notifications to users by email or directly to mobile devices, enabling prompt responses to potentially dangerous environmental changes.
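The upload script is likewise not listed in the paper; a minimal sketch of this step follows, using the standard ThingSpeak HTTP update endpoint. The API key, field numbering, and the read_sensors() helper are illustrative placeholders:

import time
import requests

WRITE_API_KEY = "YOUR_WRITE_API_KEY"  # placeholder for the channel's key

def read_sensors():
    # Placeholder: in the real system, this would parse the line the
    # Arduino writes to the VSPE virtual serial port
    return {"air_ppm": 420, "temp_c": 27.5, "humidity": 61, "lpg": 0}

def upload(readings):
    payload = {
        "api_key": WRITE_API_KEY,
        "field1": readings["air_ppm"],   # air quality (PPM)
        "field2": readings["temp_c"],    # temperature
        "field3": readings["humidity"],  # humidity
        "field4": readings["lpg"],       # LPG detected (0/1)
    }
    requests.post("https://api.thingspeak.com/update", data=payload, timeout=10)

while True:
    upload(read_sensors())
    time.sleep(20)  # matches the roughly 20 s upload interval described above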

3.2. Integration of ThingSpeak Cloud with Arduino and Alerting User with the Help of IFTTT

Within ThingSpeak, there are two applications available for customizing triggers: ThingHTTP [48] and React [49]. React comes into play when the Arduino detects LPG, initiating the trigger ticket and interfacing with ThingHTTP. This in turn activates two “if this, then that” (IFTTT) applets. The IFTTT API allows a developer to publish a service, build applets, and embed them directly in their apps or websites [50].
The first applet is designed to send an email to the user upon LPG detection, while the second delivers a notification directly to the user’s mobile device through the IFTTT application. Once React initiates the ThingHTTP process, ThingHTTP calls the API for the IFTTT applets, prompting IFTTT to send the notifications to the user. Remarkably, this entire process is completed within 3 min of the Arduino detecting the gas.
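In the deployed system this web request is issued server-side by ThingHTTP, but the equivalent call can be sketched in Python against the public IFTTT Webhooks endpoint. The event name and key below are illustrative placeholders:

import requests

IFTTT_KEY = "YOUR_WEBHOOKS_KEY"  # placeholder
EVENT = "lpg_detected"           # illustrative applet event name

def trigger_ifttt(ppm):
    # Webhooks applets accept up to three optional values
    url = f"https://maker.ifttt.com/trigger/{EVENT}/with/key/{IFTTT_KEY}"
    requests.post(url, json={"value1": ppm}, timeout=10)

trigger_ifttt(1250)  # e.g., the reading that tripped the React trigger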
The 3 min timeframe is considered optimal because it balances the system’s need for accurate data collection, processing, and reliable notification delivery. This duration ensures timely alerts without overwhelming users with false alarms, while still providing a quick response to hazardous situations. Shorter intervals might compromise accuracy, while longer intervals could delay necessary actions in critical situations.
This showcases the robust and efficient nature of the product in promptly securing the environment from potential harm.

3.3. CNN-Based ASC Deep Learning Model

The linkage formed between the IoT device and the web application permits users to periodically send audio captures of their surroundings from the IoT device to the web application. This smooth integration greatly enhances the capacity for environmental surveillance, empowering users to remotely supervise their surroundings and proactively ensure safety and well-being.
The goal is to recognize numerous sound environments or circumstances in the surroundings. For instance, if a siren sounds, the current scheme is designed to classify the recording as a siren. Acoustic scene classification aims to assign a test recording to a predefined category that characterizes the recording environment. The dataset utilized comprises a total of 2000 environmental audio recordings spread across various classes and subclasses. It encompasses 5 primary classes representing different categories of sounds commonly heard in everyday environments, with each class further divided into 10 subclasses, each representing a specific kind of sound belonging to that class. These classes include sounds of sirens, chain saws, handsaws, glass breaking, engines, airplanes, helicopters, trains, church bells, coughing, crackling fire, car horns, and rain, among others. Each subclass consists of 40 distinct sound recordings, resulting in a total of 2000 sound recordings used for training the model. This dataset was initially divided into a training set, consisting of 80% of the samples, and a test set, consisting of the remaining 20% of the samples to assess the model’s performance.
Deep learning techniques in general and CNNs in particular have emerged as the predominant choice for tasks like acoustic scene classification. The most effective CNNs primarily utilize audio spectrograms as input, drawing architectural inspiration mainly from computer vision principles [40,45,51]. In our approach, all audio files are transformed into spectrograms using the Librosa Python library, a package for music and audio analysis [52,53]. This proposal examines the capacity of CNNs to classify concise audio recordings of environmental sounds. Figure 2 presents a screenshot exemplifying the environmental audio files of the dataset used.
Upon training the model under these initial conditions, it was observed that the model did not reach satisfactory accuracy on the existing dataset. To address this, data augmentation was employed, incorporating white noise into copies of the original dataset, resulting in a total of 4000 training examples. Subsequently, 10% of the data were randomly selected and reserved for testing, and another 10% for validation of the model. Training was conducted using TensorFlow [40,54], utilizing all available data over 30 epochs, resulting in an accuracy of 96%.
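The augmentation step can be illustrated with a short sketch; the noise amplitude below is an assumed value, as the paper does not specify it:

import numpy as np

def add_white_noise(y, noise_factor=0.005):
    # Add zero-mean Gaussian noise to a copy of the waveform; applying
    # this to every original clip doubles the dataset to 4000 samples
    noise = np.random.randn(len(y))
    return y + noise_factor * noise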
Following augmentation, four Convolution2D layers were introduced to enhance accuracy, complemented by MaxPooling2D layers to mitigate overfitting. Prior to model compilation, the data were flattened from 2D to 1D using “flatten” [55]. A “dense” layer [56] is utilized to connect all layers, and “dropout” [57] is applied to randomly deactivate a fraction of neurons during training, reducing overfitting and improving generalization. Finally, the sequential model was compiled using the “adam” optimizer [58,59] with an initial learning rate of 0.001, initiating the model training phase with 80% of the dataset and setting the epoch value to 30. Figure 3 and Figure 4 present a diagram and a summary of the ASC model.
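The text specifies the layer types, optimizer, learning rate, and epoch count but not the filter counts, kernel sizes, or input resolution; the Keras sketch below fills in those details with illustrative assumptions:

import tensorflow as tf
from tensorflow.keras import layers, models

# Filter counts, kernel sizes, dropout rate, and input shape are
# illustrative assumptions; only the layer types, optimizer, learning
# rate, and epochs are given in the paper.
model = models.Sequential([
    layers.Input(shape=(128, 128, 1)),             # spectrogram "image"
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(50, activation="softmax"),        # 5 classes x 10 subclasses
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(x_train, y_train, epochs=30, validation_data=(x_val, y_val))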

3.4. Collaboration of IoT Device with ASC Model

In the previous sections, we discussed the development of two core components: the IoT device responsible for environmental monitoring and the audio scene classification (ASC) model designed for sound classification. These components operated as distinct entities, each serving a specific purpose within the broader context of the system. However, to realize the full potential of the system and provide users with comprehensive insights into their environment, integration of these components is essential. In the final phase, a web application serves as the nexus, seamlessly bringing together the functionalities of the IoT device and the ASC model. This web application acts as a centralized platform where users can access and interact with the data collected by the IoT device and processed by the ASC model. Figure 5 presents a global view of the integration of the main components of the developed system.
The integration between the IoT device and the web application enables users to send audio recordings of their environment from the IoT device to the web application at regular intervals. This seamless communication is facilitated by a Python script scheduled to run every 5 min, ensuring continuous monitoring of the environment (Figure 6). Through this application, users can receive notifications about ongoing activities in the environment using the ASC model. Additionally, the web application provides information on the air quality, temperature, and humidity obtained through the Arduino sensors. Figure 7 presents an example of visualization of these data, offering users insights into the current environmental conditions. Moreover, the system has the capability to trigger alarms and notifications in the event of detecting any hazardous gas, such as LPG, in the context of this project. This comprehensive integration enhances environmental monitoring capabilities, empowering users to remotely monitor their surroundings and take proactive measures to ensure safety and well-being.
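The recording-and-upload script is not shown in the paper; a minimal sketch follows, assuming the sounddevice library for capture and a placeholder web application endpoint:

import time
import requests
import sounddevice as sd
from scipy.io.wavfile import write

WEB_APP_URL = "http://example.com/api/audio"  # placeholder endpoint
SR = 22050  # sample rate assumed to match the ASC pre-processing

while True:
    # Record a short clip from the attached microphone
    clip = sd.rec(int(5 * SR), samplerate=SR, channels=1)
    sd.wait()
    write("capture.wav", SR, clip)
    with open("capture.wav", "rb") as f:
        requests.post(WEB_APP_URL, files={"audio": f}, timeout=30)
    time.sleep(300)  # the script is scheduled every 5 min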
In the Proteus schematic, all the necessary library files were incorporated, along with the inclusion of the Arduino code. The output screen utilizes a virtual terminal to display the air quality, temperature, and humidity of the environment using data from MQ135 and DHT11 sensors. The other terminal indicates the detection of LPG, represented in binary, where 1 denotes “detected”. These virtual terminals are exclusively linked to the virtual serial port emulator (VSPE) ports. The next step involves transmitting the Arduino outputs to a Python script through these virtual ports.
Another Python script plays a crucial role in ensuring seamless communication between the Arduino-based IoT device and the ThingSpeak cloud platform. Integrated with the Write API keys of the ThingSpeak channel, this script acts as a bridge, facilitating the transmission of real-time sensor data from the device to the cloud. To enable continuous monitoring and timely updates, this script is configured to run at regular intervals, typically every 20 s. This scheduling ensures that sensor readings from the Arduino are consistently captured and transmitted to the ThingSpeak channel without interruption. Once executed, this script initiates the data transmission process, packaging the sensor readings into structured data packets compatible with the ThingSpeak platform. These packets, containing air quality measurements, temperature, humidity, and gas detection status, are then sent over the internet to the designated ThingSpeak channel. Upon receiving the data, ThingSpeak processes and stores them in real time, allowing users to access up-to-date visualizations and analytics of their environment at any given moment. Through the ThingSpeak channel interface, users can view comprehensive graphs, charts, and statistics depicting environmental parameters, enabling them to gain valuable insights into their surroundings (Figure 7).
Crucially, the real-time nature of this data transmission ensures that users are promptly alerted to any significant changes or anomalies detected by the IoT device. We considered significant changes or anomalies to be any increment of more than 5% of the previous values read by the sensors. In the event of gas detection, for example, the system triggers an alert within a remarkably short timeframe of 2–3 min, notifying the user via the integrated notification system. Overall, the integration of the Python script with the ThingSpeak platform ensures that the IoT device operates seamlessly, continuously updating its cloud-based repository with real-time sensor data. This enables users to monitor their environment remotely, stay informed about critical events, and take timely actions to ensure safety and well-being.
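The 5% rule translates directly into a simple check, sketched here for illustration:

def is_significant_change(previous, current, threshold=0.05):
    # Flag any reading that deviates more than 5% from the previous value
    if previous == 0:
        return current != 0
    return abs(current - previous) / abs(previous) > threshold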
In this system, a convolutional neural network (CNN) is employed as the core machine learning algorithm for audio scene classification. CNNs are particularly effective in analyzing audio data represented as spectrograms, which visually encode frequency and amplitude changes over time. By converting audio signals into spectrogram images using the Librosa library, the system allows the CNN to process audio as it would image data. This approach leverages the CNN’s strengths in feature extraction and pattern recognition, enabling it to detect and classify specific audio cues, such as sirens or gas leaks, with high accuracy.
To implement the CNN, spectrograms are generated from pre-processed audio recordings, which are then fed into the network. The CNN model undergoes a training phase where it learns to distinguish between different audio patterns based on labeled training data, building an understanding of various environmental sounds. After training, the model can identify and categorize new audio inputs in real time, supporting the system’s goal of accurately detecting and reporting critical environmental sounds. This integration of CNNs within an IoT setup enhances the effectiveness of the gas leak detection system by combining robust audio classification with real-time environmental monitoring.
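Inference on a new clip follows the same path as training; the sketch below reuses the audio_to_spectrogram helper from the earlier pre-processing sketch, with a placeholder model file name and the assumption that the spectrogram is already sized to the network input:

import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("asc_model.h5")  # placeholder file name

def classify_clip(path):
    spec = audio_to_spectrogram(path)          # see the earlier sketch
    spec = spec[np.newaxis, ..., np.newaxis]   # add batch and channel axes
    probs = model.predict(spec)[0]
    # Return the most likely subclass index and its probability
    return int(np.argmax(probs)), float(np.max(probs))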

4. Results and Discussion

The detection and measurement module of the system underwent extensive testing to assess its ability to accurately detect and measure environmental factors such as temperature, humidity, and various gases, including LPG. Testing primarily involved simulations using Proteus software, facilitating controlled experimentation and evaluation of the system’s performance. Proteus is a powerful simulation tool for circuit design and prototyping, offering quick modeling and testing without physical components. However, its reliance on idealized component models can lead to discrepancies between simulated and real-world behavior, as it does not capture nonideal characteristics like tolerance variations, aging effects, or environmental factors such as electromagnetic interference and noise. Additionally, Proteus struggles to replicate the dynamic interactions and varying loads seen in real applications. These limitations can result in unreliable predictions, highlighting the importance of validating designs through empirical testing in real-world environments to ensure robustness and accuracy. Despite these challenges, Proteus remains a valuable tool when complemented by real-world validation.
Within the Proteus environment, tests were conducted to simulate diverse environmental conditions and scenarios, enabling precise control over factors like temperature, humidity, and gas concentrations. This approach allowed for thorough assessment of the system’s response across different conditions.

4.1. Data Collection Process

To ensure robust model performance and high reliability of the IoT-based environmental monitoring and acoustic scene classification (ASC) system, data collection was performed meticulously across various stages of system development. This process included data acquisition from environmental sensors, gas leak simulations, and sound recordings for the ASC model.
Environmental data were collected using the MQ135 and MQ6 gas sensors for air-quality monitoring and LPG detection and the DHT11 sensor for humidity and temperature readings. Connected to the Arduino microcontroller, the sensors gathered real-time data over a two-month period. Various environmental conditions were simulated through Proteus software, including fluctuations in air quality, temperature, and humidity, to encompass a broad range of scenarios. Each parameter was sampled every 20 s and uploaded to ThingSpeak, supporting continuous monitoring and trend analysis of environmental factors.
In a controlled lab setting, real-world gas leak scenarios were simulated with different LPG concentrations to assess the MQ6 sensor’s accuracy and sensitivity. Data were gathered in both simulated and live settings, testing the system’s response to varying gas levels, from low concentrations to potentially hazardous levels. Sensor readings were observed and logged, with an emphasis on the system’s ability to differentiate between normal environmental levels and hazardous concentrations. Data were transmitted by the Arduino via Wi-Fi to ThingSpeak, where gas leak events were documented and analyzed to evaluate the promptness and accuracy of notifications.
The ASC model was trained on a varied audio dataset of 2000 environmental sounds across five main classes (e.g., alarm sounds, engine sounds) with 10 specific subclasses each. This dataset was split into 80% for training, 10% for testing, and 10% for validation to assess model accuracy and minimize overfitting. Given the initial dataset’s limited size, data augmentation methods, such as adding white noise, were applied to double the dataset to 4000 samples, enhancing the model’s accuracy and robustness.

4.2. Gas Leak Detection Performance and Accuracy

The system’s ability to detect LPG leaks was extensively evaluated using both simulated and real gas leak scenarios, focusing on the speed and accuracy of detection. The following are the key findings from the gas leak detection tests.
The system showed high sensitivity to minimal gas concentrations, promptly triggering alerts upon detecting LPG. When gas levels exceeded a danger-indicative threshold, sensor readings increased and the notification mechanism activated immediately, sending both an email alert and an app notification via IFTTT within 2–3 min. This rapid response is essential in emergencies, providing near-real-time alerts to users.
To avoid unnecessary alerts (so-called false alarms), the system was calibrated to allow for minor fluctuations in sensor readings, activating an alarm only if readings consistently rose above a specific threshold. This calibration reduced false alarms, ensuring notifications were reserved for truly hazardous situations.
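The paper states only that readings must consistently exceed a threshold before an alarm is raised; one common way to implement this, sketched here with assumed values, is to require several consecutive above-threshold readings:

THRESHOLD_PPM = 1000       # assumed danger threshold
CONSECUTIVE_READINGS = 3   # assumed number of readings required

recent = []

def should_alarm(ppm):
    # Trigger only after several consecutive above-threshold readings,
    # so brief sensor fluctuations do not raise false alarms
    recent.append(ppm)
    if len(recent) > CONSECUTIVE_READINGS:
        recent.pop(0)
    return (len(recent) == CONSECUTIVE_READINGS
            and all(r > THRESHOLD_PPM for r in recent))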
Testing demonstrated that the system detected gas leaks accurately in both low and high concentrations, with a reliability rate of approximately 96%. The system responded effectively under various environmental conditions, such as changes in temperature and humidity. Accuracy remained consistent over time, underscoring the robustness and reliability of the gas detection mechanism.
The combination of sensors, ThingSpeak, and IFTTT integration enabled alerts to be sent within a 3 min window, balancing swift response with data accuracy. This timeframe ensures users receive alerts quickly while minimizing false positives. Additionally, sensor readings were refreshed every 20 s, allowing for continuous monitoring and timely alerts.

4.3. Model Evaluation and Confusion Matrix

The ASC model was tested on a validation set of 400 samples with different sound classes, achieving a classification accuracy of 96%. The confusion matrix analysis demonstrated high precision, recall, and F1 scores across all classes, with minimal misclassifications, confirming that the ASC model can accurately distinguish between various environmental sounds, such as sirens, engine noises, and alarms. The augmented dataset and use of CNN architecture contributed to the model’s robust performance, achieving high scores across standard metrics. This performance highlights the model’s capability to contribute effectively to environmental monitoring by identifying specific sounds that may indicate critical events or hazardous situations, thereby enhancing the system’s safety and surveillance functions.
The ASC model underwent extensive testing to evaluate its ability to accurately classify environmental sounds. The testing process involved training the model on a diverse dataset and evaluating its performance using standard metrics such as accuracy, precision, recall, and F1 score. As explained above in Section 3.3, the dataset used for training and testing the ASC model comprised environmental sounds commonly encountered in everyday situations. The dataset was divided into five primary classes, each representing a different category of environmental sound. Within each class, there were 10 subclasses, each representing a specific type of sound within that category, as shown in Figure 2. To evaluate the performance of the ASC model, a separate validation dataset was used consisting of audio recordings not included in the training dataset. Figure 8 presents an example of the output prediction of the model from a random input from the testing set. The accuracy of the ASC model was measured by comparing the model’s predictions with the ground-truth labels for each audio recording in the validation dataset. The accuracy metric represents the percentage of correctly classified audio recordings out of the total number of recordings in the validation dataset. After successfully training and testing the ASC model, an accuracy of 96% was achieved. This accuracy metric indicates that the model correctly classified 96% of the audio recordings in the validation dataset, as shown in Figure 9a,b. Additionally, precision (0.95), recall (0.94), and F1 score (0.945) metrics were calculated to provide a more comprehensive evaluation of the model’s performance.
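The reported metrics can be reproduced from the validation predictions with scikit-learn; the sketch below uses placeholder label arrays and assumes macro averaging, which the paper does not state explicitly:

from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

# Placeholder arrays; in practice these are the 400 ground-truth and
# predicted labels of the validation set
y_true = [0, 1, 2, 2, 1]
y_pred = [0, 1, 2, 1, 1]

acc = accuracy_score(y_true, y_pred)                     # reported: 0.96
prec = precision_score(y_true, y_pred, average="macro")  # reported: 0.95
rec = recall_score(y_true, y_pred, average="macro")      # reported: 0.94
f1 = f1_score(y_true, y_pred, average="macro")           # reported: 0.945
cm = confusion_matrix(y_true, y_pred)                    # basis of Table 1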
The ASC model was trained on a dataset of 2000 environmental audio recordings, later augmented to 4000 recordings with techniques such as white noise addition to improve robustness. Training involved 80% of the dataset, while 20% was reserved for validation. Table 1 below outlines the confusion matrix derived from the test dataset, highlighting the model’s performance across various classes.
The model demonstrates high accuracy across classes, with “crowd” showing the lowest accuracy at 88%, likely due to the complexity of crowd noise, which often blends multiple individual sound profiles.
Sound classes such as dog bark (94%) and fire alarm (92%) exhibit both high precision and recall, demonstrating the model’s capability to recognize distinct, high-frequency sounds with minimal misclassification.
Minor overlaps are present between similar sound profiles, for example, engine and chainsaw, and footsteps and crowd. Through data augmentation, these overlaps were reduced, enhancing class distinctions across training epochs.
Occasional misclassifications occurred with sounds sharing overlapping frequency ranges, like siren sometimes being misclassified as fire alarm. While weather-related sounds such as rain and thunder show high accuracy, they can be misclassified due to ambient noise similarities. The confusion matrix and this analysis underline the model’s robustness in real-world environmental monitoring, especially in critical categories like alarms and high-frequency sounds, which are crucial for responsive hazard detection systems.
In addition to testing the detection and measurement component of the system in simulated environments, tests with real gases in the laboratory were conducted specifically to evaluate the system’s ability to detect hazardous gases such as LPG. Prior to the tests, we calibrated and normalized the different sensors employed in the developed device. These tests aimed to assess whether the system could accurately detect real hazardous gas leaks and promptly alert the user to mitigate potential risks. Several testing scenarios were created to represent different levels of LPG gas leakage, ranging from minor leaks to significant concentrations of gas.
The system’s performance was assessed under each scenario to determine its ability to detect and respond to hazardous gas leaks effectively. The simulations yielded very good results, demonstrating the system’s capability to accurately detect hazardous gas leaks and promptly alert the user, as shown in Figure 10, where a virtual screen displays the output of the gas detection.
The system successfully detected simulated gas leaks of varying concentrations, ranging from low to high levels, within milliseconds of their occurrence. The response time of the system was found to be highly efficient, with alerts being generated and transmitted to the user within seconds of gas detection. This rapid response time ensures timely notification to the user, enabling them to take immediate action to mitigate the potential risks associated with gas leaks.
The performance of the proposed system is affected by variables such as temperature, humidity, and aging, requiring regular calibration to ensure accuracy. Additionally, its sensitivity to multiple gases may lead to false positives. Reliable data transmission relies on consistent internet connectivity, and network disruptions could delay notifications. The high power consumption resulting from continuous operation of sensors and microcontrollers limits the practicality of battery-operated systems. Despite achieving 96% accuracy, the CNN model may misclassify sounds in noisy environments or when the audio or sensor is in motion, necessitating retraining to adapt to different conditions. Variations in environmental factors like humidity and temperature changes also contribute to data uncertainty.

5. Conclusions

The system presented here utilizes an Arduino microcontroller and IoT technology to identify air pollution in the environment, aiming to improve overall air quality. A gas leakage detection system is integrated to provide timely alerts via a buzzer and a user interface in the form of a web application, as mentioned earlier. The incorporation of IoT technology enhances the monitoring of various environmental aspects, with a focus on air quality. The hardware component of the system relies on MQ135 and MQ6 gas sensors for assessing air quality, with the Arduino serving as the central component. A Wi-Fi module establishes a connection to the internet, and an LCD screen offers visual output. ThingSpeak is employed for monitoring, analyzing, and displaying data, while the IFTTT application is used for user notifications and alerts. The study also explores an alternative deep learning convolutional neural network architecture configuration specifically tailored for distinct maximum receptive fields across audio spectrograms. This approach aims to enhance the design of deep CNNs for acoustic classification tasks and adapt successful CNNs from other domains, particularly image recognition, to acoustic scene classification. The model implemented in this research reached an accuracy of 96%.
This technology holds the potential to extend beyond individual devices, enabling the installation of air-quality sensors throughout a city. This broader application could facilitate the mapping of air quality and the establishment of a website where individuals can track pollution levels in their respective areas.
The system effectively identified simulated gas leaks across a range of concentrations, from minimal to substantial, within milliseconds of their onset. Its responsiveness proved swift, issuing alerts to users mere seconds after detection, ensuring prompt notification and empowering users to quickly address the risks linked to gas leaks. The system also discriminated reliably between typical environmental variations and dangerous gas leaks, ensuring dependable detection of genuine safety threats.
Regarding the possible extension of this technology, while it is possible to migrate an existing setup to the ESP32, rigorous research and testing are necessary to adapt neural models for low-processing nodes like the ESP32. In addition to extensive testing, optimization techniques like quantization and pruning will be required to ensure the model functions well under the constraints of the new platform.
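As a pointer to what such optimization could look like, the sketch below applies TensorFlow Lite post-training dynamic-range quantization to a trained Keras model; pruning would be a further step (for example, via the tensorflow_model_optimization toolkit) before targeting an ESP32-class device. The model variable and output file name are assumptions carried over from the previous sketch.

```python
import tensorflow as tf

# Assumes `model` is the trained Keras ASC model from the sketch above.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Dynamic-range quantization: weights stored as int8, a common first step
# when preparing a model for constrained nodes such as the ESP32.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("asc_model_quant.tflite", "wb") as f:
    f.write(tflite_model)

print(f"Quantized model size: {len(tflite_model) / 1024:.1f} KiB")
```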
Although the system includes a web-based user interface and uses IFTTT for notifications, this research did not delve into aspects such as user experience, usability, or interface accessibility in depth; however, we intend to address these areas in future work.

Author Contributions

Conceptualization, P.M. and N.M.; methodology, D.K.C.; software, P.M.; validation, P.M.; formal analysis, P.P.; writing—original draft preparation, P.M., N.M. and P.P.; writing—review and editing, M.J.C.S.R.; visualization, P.M.; supervision, M.J.C.S.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Chen, F.; Zhang, W.; Mfarrej, M.F.B.; Saleem, M.H.; Khan, K.A.; Ma, J.; Raposo, A.; Han, H. Breathing in danger: Understanding the multifaceted impact of air pollution on health impacts. Ecotoxicol. Environ. Saf. 2024, 280, 116532. [Google Scholar] [CrossRef] [PubMed]
  2. Bedi, T.K.; Bhattacharya, S.P. Chapter 13—Indoor air quality and health: An emerging challenge in Indian megacities. In Developments in Environmental Science; Sivaramakrishnan, L., Dahiya, B., Sharma, M., Mookherjee, S., Karmakar, R., Eds.; Elsevier: Amsterdam, The Netherlands, 2024; Volume 15, pp. 269–293. [Google Scholar] [CrossRef]
  3. European Environment Agency. Transport and Environment Report 2022. Digitalisation in the Mobility System: Challenges and Opportunities. EEA Report No 07/2022. 2022. Available online: https://www.eea.europa.eu/en/analysis/publications/transport-and-environment-report-2022 (accessed on 7 September 2024).
  4. European Environment Agency. Electric Vehicles from Life Cycle and Circular Economy Perspectives TERM 2018: Transport and Environment Reporting Mechanism (TERM) Report. EEA Report No 13/2018. 2018. Available online: https://www.eea.europa.eu/en/analysis/publications/electric-vehicles-from-life-cycle (accessed on 6 September 2024).
  5. Chang, J.H.; Lee, Y.L.; Chang, L.T.; Chang, T.Y.; Hsiao, T.C.; Chung, K.F.; Ho, K.F.; Kuo, H.P.; Lee, K.Y.; Chuang, K.J.; et al. Climate change, air quality, and respiratory health: A focus on particle deposition in the lungs. Ann. Med. 2023, 55, 2264881. [Google Scholar] [CrossRef] [PubMed]
  6. India—Country Commercial Guide. Renewable Energy. 2024. Available online: https://www.trade.gov/country-commercial-guides/india-renewable-energy (accessed on 3 July 2024).
  7. Ministry of Environment, Forest and Climate Change. Net Zero Emissions Target. 2023. Available online: https://pib.gov.in/PressReleaseIframePage.aspx?PRID=1945472 (accessed on 7 September 2024).
  8. Xing, Y.; Xu, Y.; Shi, M.; Lian, Y. The impact of PM2.5 on the human respiratory system. J. Thorac. Dis. 2016, 8, E69–E74. [Google Scholar]
  9. National Disaster Management Authority (NDMA). Available online: https://ndma.gov.in/ (accessed on 4 October 2024).
  10. Petroleum & Explosives Safety Organization (PESO)—Department for Promotion of Industry and Internal Trade. Available online: https://peso.gov.in/web/ (accessed on 4 October 2024).
  11. Oil Industry Safety Directorate (Under Ministry of Petroleum and Natural Gas). Available online: https://www.oisd.gov.in/ (accessed on 4 October 2024).
  12. National Crime Records Bureau Empowering Indian Police with Information Technology. Available online: https://www.ncrb.gov.in/ (accessed on 4 October 2024).
  13. Baburao, G.; Arivarasu, A.; Srinivas, T.; Elangovan, R.K.; Govindarajan, R. Statistical Data Analysis in Emergency Management Elements of Indian State of Tamil Nadu Manufacturing Industries Utilising LPG. Int. J. Occup. Saf. Health 2024, 14, 88–97. [Google Scholar] [CrossRef]
  14. Koutini, K.; Eghbal-zadeh, H.; Widmer, G. Receptive-field-regularized CNN variants for acoustic scene classification. arXiv 2019, arXiv:1909.02859. [Google Scholar]
  15. Shah, H.N.; Khan, Z.; Merchant, A.A.; Moghal, M.; Shaikh, A.; Rane, P. IOT Based Air Pollution Monitoring System. Int. J. Sci. Eng. Res. 2018, 9, 62–65. [Google Scholar]
  16. Varma, A.; Prabhakar, S.; Jayavel, K. Gas Leakage Detection and Smart Alerting and prediction using IoT. In Proceedings of the IEEE 2017 2nd International Conference on Computing and Communications Technologies (ICCCT), Chennai, India, 23–24 February 2017; pp. 327–333. [Google Scholar]
  17. Kumar, K.; Sabbani, H. Smart Gas Level Monitoring, Booking & Gas Leakage Detector over IoT. In Proceedings of the IEEE 7th International Advance Computing Conference (IACC), Hyderabad, India, 5–7 January 2017; pp. 330–332. [Google Scholar]
  18. Lee, Y.; Hsiao, W.; Huang, C.; Chou, S.T. An integrated cloud-based smart home management system with community hierarchy. IEEE Trans. Consum. Electron. 2016, 62, 1–9. [Google Scholar] [CrossRef]
  19. Joshi, J.; Rahul, S.R.; Kumar, P.; Polepally, S.; Samineni, R.; Tej, D.G.K. Performance enhancement and IoT based monitoring for smart home. In Proceedings of the IEEE 2017 International Conference on Information Networking (ICOIN), Da Nang, Vietnam, 11–13 January 2017; pp. 468–473. [Google Scholar]
  20. Roma, G.; Nogueira, W.; Herrera, P. Recurrence Quantification Analysis Features for Auditory Scene Classification. IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events. 2013. Available online: http://www.mtg.upf.edu/system/files/publications/Roma-Waspaa-2014.pdf (accessed on 18 January 2025).
  21. Wang, D.; Brown, G. Computational Auditory Scene Analysis: Principles, Algorithms, and Applications; Wiley: Hoboken, NJ, USA, 2006. [Google Scholar]
  22. Sakashita, Y.; Aono, M. Acoustic Scene Classification by Ensemble of Spectrograms Based on Adaptive Temporal Divisions. DCASE2018 Challenge. 2018. Available online: https://dcase.community/documents/challenge2018/technical_reports/DCASE2018_Sakashita_15.pdf (accessed on 18 January 2025).
  23. Piczak, K.J. ESC: Dataset for environmental sound classification. In Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM ’15, Brisbane, Australia, 26–30 October 2015. [Google Scholar]
  24. Bzai, J.; Alam, F.; Dhafer, A.; Bojović, M.; Altowaijri, S.M.; Niazi, I.K.; Mehmood, R. Machine Learning-Enabled Internet of Things (IoT): Data, Applications, and Industry Perspective. Electronics 2022, 11, 2676. [Google Scholar] [CrossRef]
  25. Popescu, S.M.; Mansoor, S.; Wani, O.A.; Kumar, S.S.; Sharma, V.; Sharma, A.; Arya, V.M.; Kirkham, M.B.; Hou, D.; Bolan, N.; et al. Artificial intelligence and IoT driven technologies for environmental pollution monitoring and management. Front. Environ. Sci. 2024, 12, 1336088. [Google Scholar] [CrossRef]
  26. Saleem, T.J.; Chishti, M.A. Deep learning for the internet of things: Potential benefits and use-cases. Digit. Commun. Netw. 2021, 7, 526–542. [Google Scholar] [CrossRef]
  27. Khan, A.; Gupta, S.; Gupta, S.K. Multi-hazard disaster studies: Monitoring, detection, recovery, and management, based on emerging technologies and optimal techniques. Int. J. Disaster Risk Reduct. 2020, 47, 101642. [Google Scholar] [CrossRef]
  28. Ray, P.P.; Mukherjee, M.; Shu, L. Internet of things for disaster management: State-of-the-art and prospects. IEEE Access 2017, 5, 18818–18835. [Google Scholar] [CrossRef]
  29. Esposito, M.; Palma, L.; Belli, A.; Sabbatini, L.; Pierleoni, P. Recent Advances in Internet of Things Solutions for Early Warning Systems: A Review. Sensors 2022, 22, 2124. [Google Scholar] [CrossRef] [PubMed]
  30. Hassan, M.N.; Islam, M.R.; Faisal, F.; Semantha, F.H.; Siddique, A.H.; Hasan, M. An IoT based Environment Monitoring System. In Proceedings of the 2020 3rd International Conference on Intelligent Sustainable Systems (ICISS), Thoothukudi, India, 3–5 December 2020; pp. 1119–1124. [Google Scholar] [CrossRef]
  31. Islam, G.Z.; Hossain, M.; Faruk, M.; Nur, F.N.; Hasan, N.; Khan, K.M.; Tumpa, Z.N. IoT-Based Automatic Gas Leakage Detection and Fire Protection System. Int. J. Interact. Mob. Technol. 2022, 16, 49–70. [Google Scholar] [CrossRef]
  32. Mao, G.; Zhang, Y.; Xu, Y.; Li, X.; Xu, M.; Zhang, Y.; Jia, P. An electronic nose for harmful gas early detection based on a hybrid deep learning method H-CRNN. Microchem. J. 2023, 195, 109464. [Google Scholar] [CrossRef]
  33. Praveenchandar, J.; Vetrithangam, D.; Kaliappan, S.; Karthick, M.; Pegada, N.K.; Patil, P.P.; Rao, S.G.; Umar, S. IoT-Based Harmful Toxic Gases Monitoring and Fault Detection on the Sensor Dataset Using Deep Learning Techniques. Sci. Program. 2022, 2022, 7516328. [Google Scholar] [CrossRef]
  34. Narayana, T.L.; Venkatesh, C.; Kiran, A.; Babu, J.C.; Kumar, A.; Khan, S.B.; Almusharraf, A.; Quasim, M.T. Advances in real time smart monitoring of environmental parameters using IoT and sensors. Heliyon 2024, 10, e28195. [Google Scholar] [CrossRef]
  35. Ayyappan, S.; Varalakshmi, V.; Mishra, R.; Murali, K. IoT Based Detection and Alerting of Hazardous Gas Detection for the welfare of Sewer Labourers. Nanotechnol. Percept. 2024, 20, 39–49. [Google Scholar]
  36. Emvoliadis, A.; Vryzas, N.; Stamatiadou, M.-E.; Vrysis, L.; Dimoulas, C. Multimodal Environmental Sensing Using AI & IoT Solutions: A Cognitive Sound Analysis Perspective. Sensors 2024, 24, 2755. [Google Scholar] [CrossRef] [PubMed]
  37. Tsalera, E.; Papadakis, A.; Samarakou, M. Comparison of Pre-Trained CNNs for Audio Classification Using Transfer Learning. J. Sens. Actuator Netw. 2021, 10, 72. [Google Scholar] [CrossRef]
  38. Piczak, K.J. Environmental sound classification with convolutional neural networks. In Proceedings of the 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP), Boston, MA, USA, 17–20 September 2015; pp. 1–6. [Google Scholar] [CrossRef]
  39. Zaman, K.; Sah, M.; Direkoglu, C.; Unoki, M. A Survey of Audio Classification Using Deep Learning. IEEE Access 2023, 11, 106620–106649. [Google Scholar] [CrossRef]
  40. Shanmugamani, R. Deep Learning for Computer Vision: Expert Techniques to Train Advanced Neural Networks Using TensorFlow and Keras; Packt Publishing Ltd.: Birmingham, UK, 2018. [Google Scholar]
  41. Proteus. PCB Design & Simulation Made Easy. Available online: https://www.labcenter.com/ (accessed on 4 October 2024).
  42. Parida, D.; Behera, A.; Naik, J.K.; Pattanaik, S.; Nanda, R.S. Real-time Environment Monitoring System using ESP8266 and ThingSpeak on Internet of Things Platform. In Proceedings of the 2019 International Conference on Intelligent Computing and Control Systems (ICCS), Madurai, India, 15–17 May 2019; pp. 225–229. [Google Scholar] [CrossRef]
  43. Luo, W.; Li, Y.; Urtasun, R.; Zemel, R. Understanding the Effective Receptive Field in Deep Convolutional Neural Networks. Adv. Neural Inf. Process. Syst. 2016, 29, 4898–4906. [Google Scholar]
  44. Bishop, C.M. Pattern Recognition and Machine Learning; Springer: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
  45. ThingSpeak. ThingSpeak for IoT Projects: Data Collection in the Cloud with Advanced Data Analysis Using MATLAB. Available online: https://thingspeak.mathworks.com/ (accessed on 4 October 2024).
  46. Martins, H.; Gupta, N.; Reis, M.J.C.S.; Ferreira, P.J.S.G. Low-cost Real-time IoT-Based Air Quality Monitoring and Forecasting. Lecture Notes of the Institute for Computer Sciences. In Social Informatics and Telecommunications Engineering; Springer: Cham, Switzerland, 2022; Volume 442. [Google Scholar] [CrossRef]
  47. Introduction to NodeMCU. Available online: https://www.electronicwings.com/nodemcu/introduction-to-nodemcu (accessed on 10 October 2024).
  48. ThingHTTP App. Available online: https://www.mathworks.com/help/thingspeak/thinghttp-app.html (accessed on 10 October 2024).
  49. React—The Library for Web and Native User Interfaces. Available online: https://react.dev/ (accessed on 10 October 2024).
  50. IFTTT. Available online: https://connect.ifttt.com (accessed on 4 October 2024).
  51. Radaković, M. Audio Signal Preparation Process for Deep Learning Application Using Python. In Proceedings of the 2021—International Scientific Conference on Information Technology and Data Related Research, Zagreb, Croatia, 25–27 November 2021. [Google Scholar] [CrossRef]
  52. Abeßer, J. A review of deep learning based methods for acoustic scene classification. Appl. Sci. 2020, 10, 2020. [Google Scholar] [CrossRef]
  53. McFee, B.; Raffel, C.; Liang, D.; Ellis, D.P.W.; McVicar, M.; Battenberg, E.; Nieto, O. librosa: Audio and music signal analysis in python. In Proceedings of the 14th Python in Science Conference, Austin, TX, USA, 6–12 July 2015; pp. 18–25. [Google Scholar]
  54. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. Tensorflow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Berkeley, CA, USA, 2 November 2016; pp. 265–283. [Google Scholar]
  55. Flatten. Flatten-Json 0.1.14. Available online: https://pypi.org/project/flatten-json/ (accessed on 4 October 2024).
  56. Dense. Dense-Basis 0.1.9. Available online: https://pypi.org/project/dense-basis/ (accessed on 4 October 2024).
  57. Dropout. Keras-Targeted-Dropout 0.5.0. Available online: https://pypi.org/project/keras-targeted-dropout/ (accessed on 4 October 2024).
  58. Kingma, D.P.; Welling, M. Auto-Encoding Variational Bayes. arXiv 2013, arXiv:1312.6114. [Google Scholar] [CrossRef]
  59. Sakshii. Adam Optimizer: A Quick Introduction. Available online: https://www.askpython.com/python/examples/adam-optimizer (accessed on 4 October 2024).
Figure 1. Main block diagram of the proposed device (GAS Sensor 1 refers to MQ135 sensor and GAS Sensor 2 refers to MQ6 sensor).
Figure 2. Dataset of the environmental audio files, where columns are classes and rows are subclasses.
Figure 3. Flowchart of the ASC model.
Figure 4. Summary of the ASC model.
Figure 5. Project demonstration after integrating the IoT device with the ASC model.
Figure 6. Python script sending output from the Arduino to the ThingSpeak channel.
Figure 7. Example of using ThingSpeak to visualize the environment statistics of specific days.
Figure 8. Example of the output prediction of the model from a random input from the testing set, in this case a rooster.
Figure 9. (a) Accuracy at the end of 30 epochs for the testing dataset; (b) plot of accuracy and loss over epochs.
Figure 10. Simulation showing hazardous gas detection. In these particular cases, 1 denotes “detected,” meaning that the system detected the presence of a harmful gas.
Table 1. Confusion matrix highlighting the classification performance of the ASC model across ten sound classes.

Predicted/Actual | Siren | Engine | Glass Break | Rain | Chainsaw | Dog Bark | Footsteps | Fire Alarm | Thunder | Crowd | Accuracy
Siren            |  180  |   8    |      1      |   3  |     2    |     0    |     1     |     3      |    0    |   2   |   92%
Engine           |   5   |  185   |      4      |   3  |     5    |     2    |     0     |     1      |    2    |   3   |   89%
Glass Break      |   1   |   4    |     187     |   2  |     4    |     3    |     0     |     1      |    0    |   3   |   91%
Rain             |   3   |   5    |      2      |  183 |     1    |     0    |     2     |     1      |    5    |   3   |   90%
Chainsaw         |   1   |   6    |      3      |   2  |    182   |     2    |     1     |     0      |    4    |   2   |   91%
Dog Bark         |   0   |   1    |      1      |   0  |     3    |    192   |     3     |     1      |    2    |   4   |   94%
Footsteps        |   1   |   0    |      2      |   1  |     1    |     2    |    189    |     3      |    1    |   6   |   93%
Fire Alarm       |   2   |   2    |      0      |   2  |     1    |     2    |     1     |    184     |    0    |   4   |   92%
Thunder          |   0   |   3    |      0      |   6  |     2    |     1    |     1     |     0      |   188   |   3   |   93%
Crowd            |   2   |   4    |      3      |   4  |     2    |     4    |     6     |     3      |    3    |  177  |   88%