Article

AI-Aided Robotic Wide-Range Water Quality Monitoring System

by Ameen Awwad 1,*, Ghaleb A. Husseini 2 and Lutfi Albasha 1
1 Department of Electrical Engineering, American University of Sharjah, University City, Sharjah P.O. Box 26666, United Arab Emirates
2 Department of Chemical and Biological Engineering, American University of Sharjah, University City, Sharjah P.O. Box 26666, United Arab Emirates
* Author to whom correspondence should be addressed.
Sustainability 2024, 16(21), 9499; https://doi.org/10.3390/su16219499
Submission received: 7 September 2024 / Revised: 23 October 2024 / Accepted: 28 October 2024 / Published: 31 October 2024

Abstract:
Waterborne illnesses lead to millions of fatalities worldwide each year, particularly in developing nations. In this paper, we introduce a comprehensive system designed for the autonomous early detection of viral outbreaks transmitted through water to ensure sustainable access to healthy water resources, especially in remote areas. The system utilizes an autonomous water quality monitoring setup consisting of an airborne water sample collector, an autonomous sample processor, and an artificial intelligence-aided microscopic detector for risk assessment. The proposed system replaces the time-consuming conventional monitoring protocol by automating sample collection, sample processing, and pathogen detection. Furthermore, it provides a safer processing method against the spillage of contaminated liquids and potential resultant aerosols during the heat fixation of specimens. A morphological image processing technique of light microscopic images is used to segment images, assisting in selecting a unified appropriate input segment size based on individual blob areas of different bacterial cultures. The dataset included harmful pathogenic bacteria (A. baumannii, E. coli, and P. aeruginosa) and harmless ones found in drinking water and wastewater (E. faecium, L. paracasei, and Micrococcus spp.). The segmented labeled dataset was used to train deep convolutional neural networks to automatically detect pathogens in microscopic images. To minimize prediction error, Bayesian optimization was applied to tune the hyperparameters of the networks’ architecture and training settings. Different convolutional networks were tested in accordance with different required output labels. The neural network used to classify bacterial cultures as harmful or harmless achieved an accuracy of 99.7%. The neural network used to identify the specific types of bacteria achieved a cumulative accuracy of 93.65%.

1. Introduction

Waterborne diseases pose a significant danger to the sustainability of public health, causing millions of deaths around the world [1]. Transmission pathways include wastewater aquifers exposed to biological communities and surface drinking or clean water susceptible to sewage spillage and other pollutants. Numerous fatalities are attributed to diarrheal illnesses, including those caused by the bacteria E. coli, which can be identified using standard light microscopy [2]. Since underserved areas lack access to experts and laboratories for conducting wide surveys, an autonomous surveying technique with an integrated system of guided airborne sample collectors, microscopic imaging and processing, and visual recognition classifiers for detecting waterborne diseases in collected water samples can significantly help in countering sudden outbreaks. Marking the sources of waterborne diseases aids underserved communities in identifying aquifers contaminated with waterborne diseases, either to provide necessary sanitation or to use alternative water resources when available.
The proposed system, depicted in Figure 1, rapidly surveys various areas, aided by an automatic airborne sample collection system. The proposed fixed-wing airborne sample collector method has advantages over current hovering drones and other fixed-wing drones that can land on water bodies, since the former collects the sample without stopping, thus minimizing power consumption and making our technique more suitable for wide-range risk assessment. Moreover, a robotic sample processor, microscopic imaging, and visual recognition algorithms are used to recognize the bacterial cultures found in polluted water samples. This enables the automated categorization of a large volume of samples quickly and does not require prior knowledge from non-expert users, which is important for sustainable application in remote areas. It also provides a safer processing environment against the spillage of contaminated liquids and the possible release of resultant aerosols during the heat fixation of specimens.
The convolutional neural classification approach, distinct from earlier AI-based methods, incorporates morphological image processing to prepare microscopic images. This integration aids in distinguishing various types of blobs in microscopic images and assists in selecting the suitable size of input segments. The bacterial cultures considered in this study are harmful pathogenic bacterial cultures that can cause waterborne disease (specifically, A. baumannii, E. coli, and P. aeruginosa) and harmless ones that can be found in drinking water and wastewater (E. faecium, L. paracasei, and Micrococcus spp.). The microscopic images of the considered bacteria were retrieved from the DIBaS dataset, which is publicly available for researchers [3]. Morphology is used because bacterial cultures on slides come in various shapes, which makes it challenging to distinguish them from other bodies in microscopic images. Unlike AI-based classifiers of images of large objects, such as those built with YOLO and GoogLeNet, a classifier of microscopic images must cope with microorganisms that share similar simple shapes, which increases identification difficulty. Examples of the bacterial cultures’ morphotypes from the DIBaS dataset used are shown in Figure 2.
The proposed integrated system offers a novel combination of robotics and AI-based classifiers, delivering a rapid and energy-efficient protocol for detecting waterborne diseases across a wide range. Its aerial sampling and analysis of scattered water bodies serve as an early countermeasure against outbreaks. The aerial sample collection mechanism, along with automated sample processing and pathogen detection through microscopic imaging, significantly improves upon conventional sample-based analysis in terms of time efficiency, logistical demands, and manpower requirements.

2. Literature Review

2.1. AI-Biodetection of Diseases

AI-based approaches for detecting diseases or outbreaks can be categorized based on their data and algorithm structures. Datasets can include statistics of the infected population, such as the number of infection cases and their locations, or laboratory methods such as microscopic images or other metrics of the contaminated material read by various sensors, which may be integrated with the Internet of Things. These metrics can include physicochemical parameters, such as temperature, pH, turbidity, and conductivity, measured by sensors that share data and trigger alarms through an IoT network, or offline on-site visible indicators and SMS messages when applied in rural areas lacking reliable internet connectivity [4]. Machine learning (ML) algorithms can either be integrated within a cloud network for continuous updates or installed onboard as offline devices using pre-trained models. This approach is particularly beneficial for rural applications, such as the system proposed in [4]. Based on several physicochemical parameters, this system’s ML model was trained to predict the concentration levels of V. cholerae, the pathogenic bacteria responsible for cholera. These parameters include temperature, turbidity, pH, conductivity, and salinity, and they are measured using suitable sensors connected to a microprocessor located in the monitored area. This setup is capable of sending alerts to users in the vicinity.
Whereas that method uses water quality parameters to provide the current concentration levels of pathogens, other methods can apply predictive AI models, such as artificial neural networks (ANN), support vector machines (SVM), long short-term memory (LSTM), and bootstrapped wavelet neural networks (BWNN), to forecast future variations in pollution levels and physicochemical parameters [5]. They can thus be used to foresee future outbreaks and take appropriate countermeasures. Predictive algorithms analyze changes recorded by various sensors and correlate them with confirmed infection cases, enabling the prediction of potential shifts in infection rates in areas with low or no reported cases. This foresight allows for proactive measures to be taken in anticipation of outbreaks when anomalies in sensor data are predicted. The main challenges in IoT monitoring methods for water quality are the high cost and difficulty of securing reliable power sources for widely scattered physicochemical and biosensors, as well as communication difficulties, especially in rural areas and variable terrain.
Furthermore, optical imaging can be employed with AI to monitor water quality, whether through visual recognition, using cameras integrated into an IoT network [6], or more closely through microscopic imaging of water samples acquired from the area of interest [1]. Wu et al. [6] utilized a convolutional neural network (CNN) to determine the multilayer properties of water images captured from various locations through a network of surveillance cameras and classify them into two main categories: clean and polluted water. Through a supervised training process with a pre-labeled dataset, the CNN optimizes the parameters of its kernels in the hidden layers so that the resultant feature maps encompass both low- and high-level visual features. Because of the need to classify a large number of images with varying illumination, contrast, backgrounds, and resolutions, feature maps resulting from consecutive convolutional layers are followed by attention modules to allow additional relative weighting of both low- and high-level feature maps. Moreover, CNNs were found to be suitable for microscopic image classification. Deep CNNs demonstrated the highest identification accuracy of pathogens compared to other artificial neural networks [1]. A CNN can also work in parallel with the Dense Scale Invariant Feature Transform (DSIFT) to extract feature vectors from microscopic images to be eventually classified with SVM or Random Forest [7]. Therefore, CNN will be used in this study to detect and identify waterborne pathogenic bacteria after the general morphological image processing, as mentioned earlier in the introduction, to avoid the need for extracting many features during network dataset preparation.

2.2. Airborne Water Samplers

To the knowledge of the authors, all of the drones used for water sample collection in the literature are rotary-wing drones [8,9,10], where the collection devices, which extend from the drone into the water, differ in design. These systems are suggested as alternatives to manual sampling from shore or by boat, which is time-consuming and requires logistical support. Drone sample collection is considered more efficient in terms of energy consumption compared to boat sample collection and poses less of a biosecurity risk [8]. Koparan et al. [9] reported that a water collector carried by a copter provided a regulated water sampling mechanism based on onboard measurement of immersion depth and in situ water quality parameters, such as turbidity. This method was used to avoid unnecessary sample collection. Our device does not use this feature, as the system is designed to perform unconditional regular laboratory testing. The figures of that system showed horizontally balanced water-capturing cylindrical cartridges that could be conditionally closed by a plastic cap attached by a string to a servomotor controlled by a microcontroller.
Singh et al. [10] mounted a pump on the drone, activated by an ultrasonic proximity sensor, to fill a small reservoir through a flexible pipe that was folded onto the drone during flight. We find this method to be the most secure, as other methods that rely on trapping water risk leaking some of the samples due to aircraft vibrations. However, it is recommended that a more efficient and durable design be adopted than the flexible, warped pipe mentioned [10]. This type of pipe faces challenges during insertion and requires an additional motor for extension during sampling, which demands extra power.
Water samples are expected to be collected from areas with a large number of water bodies, such as south and central Africa, which has the highest rates of death due to unsafe water resources [11], with hundreds of water bodies to be monitored. In that region, most of the lakes lie in regions close to each other. For instance, when examining the distribution of lakes across Uganda, it becomes apparent that they can be categorized into clusters, each spanning an area with an approximate radius of 75 km. Consequently, in the event of a waterborne disease outbreak, it is recommended to deploy survey drones from a ground station strategically located at the center of each cluster of water bodies. This arrangement would enable each drone to complete a maximum round trip of 145 km for every sampling operation. This range can be covered by medium-range surveillance drones that run on a single rotary engine, which has higher efficiency at moderate speeds compared to jet engines. Because of their application in surveillance, these aircraft are all twin-boom designs, which provide aerial stability [12]. Although other rotary-wing UAVs (such as quadcopters) have the advantage of stationary hovering for sample collection compared to fixed-wing drones, they still suffer from higher power consumption, as they run multiple propellers constantly. Thus, they are less suitable for reaching remote areas.
Fixed-wing drones are expected to have a higher range than previous rotary-wing solutions [8,9,10], with fixed-wing drones offering double the range on the same battery capacity, as indicated in the survey of the flight time of commercial UAVs as a function of battery capacity in the graph in [13]. The longer range is critical for collecting water samples from a wide area with many water bodies or ponds. The long-range robotic sample collection approach provides an alternative to the high-cost installation of a large number of distributed IoT sensors, as seen in multisensory monitoring systems in [4,5] and camera networks in [6], which require communication infrastructure and a constant power supply, especially when the monitored water bodies are located in remote areas.

3. Materials & Methods

3.1. Automated Aerial Sample Collection Mechanism

The drone is controlled through an automatic flight control system (FCS) from the base station to the water body. The guidance algorithm uses a global positioning system (GPS) integrated with a pure pursuit controller to navigate to the intended water body for automatic flight. The GPS helps determine the optimal moment for descent to collect samples. However, this approach limits sample collection to water bodies with a radius larger than the GPS’s margin of error to avoid collisions with the edges of the water body. An alternative, less constrained method would involve a more sophisticated system, using computer vision to recognize the dimensions of water bodies in real-time during flight, thus aiding in planning descent and collection actions.
The required drone altitude and attitude (flight direction) between preset waypoints are calculated and stored in the flight plan using a simple, pure pursuit guidance algorithm. The flight control system (FCS) consists of two independent components: a total energy control system (TECS) and an attitude controller. The TECS manages the engine’s throttle (or thrust) and pitch angle by balancing the aircraft’s total energy. In contrast, the attitude controller adjusts the roll angle by setting the angles of the servomotors for the ailerons, elevators, and rudder. The flight management unit (FMU) switches between modes for takeoff, cruising, sample extraction, and landing, providing setpoints to the FCS based on the stored flight plan. During cruising, altitude is read using the GPS module, whereas, in near-surface sample collection, an ultrasonic proximity sensor with better accuracy and resolution is necessary.
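The pure-pursuit step described above can be sketched in a few lines. This is a minimal 2D illustration, not the paper's implementation: the function name, lookahead distance, and waypoint geometry are assumptions.

```python
import math

def pure_pursuit_heading(pos, waypoint, lookahead=50.0):
    """Heading command (rad) toward a 'carrot' point placed at the
    lookahead distance along the line to the next waypoint.
    Illustrative sketch; units are metres and radians."""
    dx, dy = waypoint[0] - pos[0], waypoint[1] - pos[1]
    dist = math.hypot(dx, dy)
    # Clamp the carrot to the waypoint when it is closer than the lookahead.
    scale = min(1.0, lookahead / dist) if dist > 0 else 0.0
    carrot = (pos[0] + dx * scale, pos[1] + dy * scale)
    return math.atan2(carrot[1] - pos[1], carrot[0] - pos[0])

# Drone at the origin, waypoint to the north-east: command is 45 degrees.
cmd = pure_pursuit_heading((0.0, 0.0), (100.0, 100.0))
```

In a full FCS, this heading command would feed the attitude controller's roll setpoint while the TECS handles throttle and pitch.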
When the drone approaches the water body at the point of sample collection, it descends to the appropriate height so that a pipe can reach the water surface. The drone relies on its motion to pump water into an onboard container, as indicated in Figure 3a,b, and then climbs back to cruising elevation. During this process, an ultrasonic sensor measures the high-resolution distance from the water surface to adjust the pipe’s depth for a certain time until the drone returns to cruising altitude. The ultrasonic sensor is preferred for its superior accuracy in measuring short distances compared to a barometer sensor, which is more suitable for use during cruising. Ultrasonic sensors have proven capabilities in close-range water-level detection applications, such as water-level indicators in closed reservoirs [14,15].
An ultrasonic proximity sensor measures the distance of a targeted surface (Ds) by measuring the delay (Δts) between the transmitted and echo ultrasonic pulses for a normally incident wave:
Dₛ = vₛΔtₛ/2
where vₛ is the speed of sound in air (343 m/s).
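As a quick numerical check of the echo-ranging relation (distance equals sound speed times half the round-trip delay):

```python
def echo_distance(delay_s, v_sound=343.0):
    """Distance to the reflecting surface from the round-trip echo
    delay; the pulse travels out and back, hence the factor of 2."""
    return v_sound * delay_s / 2.0

# A round-trip delay of about 5.83 ms corresponds to roughly 1 m of clearance.
d = echo_distance(5.83e-3)
```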
Both ground and water surfaces provide sufficient reflection of a normally incident ultrasonic wave for a detectable echo at the ultrasonic receiver. This is because of the high acoustic impedances of soil and water relative to air (1.48 × 10⁶ kg/m²·s for water and 1–3 × 10⁶ kg/m²·s for soil, compared to roughly 4.1 × 10² kg/m²·s for air [16]). Thus, there is high-intensity reflectance at the boundaries between both air/water and air/soil, where the reflection coefficient (Γ) is directly proportional to the difference between the two impedance values of the boundary surfaces:
Γ = (Z₂ − Z₁)/(Z₁ + Z₂)
where Z₁ and Z₂ are the acoustic impedances of the materials on the two sides of the boundary (1 for air and 2 for soil or water).
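Evaluating the reflection coefficient with representative characteristic impedances (water ≈ 1.48 × 10⁶ kg/m²·s, air taken as the standard ρc of air, ≈ 4.1 × 10² kg/m²·s) confirms the near-total reflection at an air/water boundary:

```python
def reflection_coefficient(z1, z2):
    """Pressure reflection coefficient at a boundary between media with
    acoustic impedances z1 (incident side) and z2."""
    return (z2 - z1) / (z1 + z2)

Z_AIR = 4.1e2      # kg/(m^2*s), characteristic impedance of air (rho * c)
Z_WATER = 1.48e6   # kg/(m^2*s)

# Close to +1: almost all of the incident pulse is reflected back.
gamma = reflection_coefficient(Z_AIR, Z_WATER)
```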
Although ultrasonic sensors are known for their accuracy, they have significant directivity, which requires precise alignment with the target surface to avoid erroneous readings caused by the deflection of the ultrasonic wave echo. This issue can be mitigated by using a differential filter to eliminate incorrect readings, which are identifiable by significant spikes in the rate of change of distance measurements (further detailed in the experimental testing section to follow). Alternatively, integrating these sensors with other, less precise sensors already present on the drone could help identify and correct substantial errors.
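The differential-filter idea can be sketched as follows; the sampling interval, rate threshold, and hold-last-value policy are illustrative assumptions rather than the paper's tuned values.

```python
def filter_echo_spikes(readings, dt=0.05, max_rate=10.0):
    """Reject distance readings whose implied vertical rate exceeds
    max_rate (m/s), holding the last accepted value instead.
    dt is the sampling interval in seconds (assumed)."""
    cleaned = [readings[0]]
    for r in readings[1:]:
        if abs(r - cleaned[-1]) / dt > max_rate:
            cleaned.append(cleaned[-1])  # spike from a deflected echo: hold
        else:
            cleaned.append(r)
    return cleaned

# An isolated deflected echo (7.5 m) is held out of the descent profile.
clean = filter_echo_spikes([2.0, 1.9, 7.5, 1.8, 1.7])
```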
The sample collector operates based on the drone’s state via distance measurement using condition-based control. It performs three main actions: taking off, collecting the sample, and landing. In the first state, the drone is positioned on the surface for takeoff. After takeoff, the collector’s controller shifts to the next state when the distance from the surface measurements exceeds a preset value. In the second state, the controller waits for the second contact with the surface to collect the sample. The execution condition for the second state is based on the distance between the drone’s bottom surface and the pipe tip when unfolded, which needs to be immersed in water. Folding and unfolding the water inlet pipe is managed by a servomotor, which lowers the pipe to allow water to flow into the container for a certain period before it returns to cruising; at this point, the pipe is folded back to the drone’s base. A prototype of the sample collector and the flowchart of the condition-based controller are shown in Figure 3c,d. In the sample collector prototype, an HC-SR04 ultrasonic sensor, fixed on the UAV, as shown in Figure 3c, is used, where the transmitter and receiver are connected to two digital I/O ports to transmit a rectangular pulse and record its echo. A sample-collection state is included in the flight plan within the microcontroller, which is triggered when the first near-surface flight is detected after takeoff using the proximity sensor. This state unfolds the sample collector into the water for a short delay before folding it back into the drone’s body, allowing it to land when it returns to the base station. As the sample collector’s container is folded back into the drone’s body, a pressure relief hole is closed to trap the water inside and prevent loss during the flight back.
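The condition-based controller described above can be summarized as a small state machine; the state names and distance thresholds below are illustrative, not taken from the prototype firmware.

```python
# States of the sample-collector controller (illustrative sketch of the
# flowchart in Figure 3d): takeoff, wait for second surface contact,
# sampling with the pipe unfolded, then return to cruise.
TAKEOFF, AWAIT_WATER, SAMPLING, RETURN = range(4)

def next_state(state, distance_m, climb_threshold=5.0, contact_threshold=0.5):
    if state == TAKEOFF and distance_m > climb_threshold:
        return AWAIT_WATER        # drone has left the ground
    if state == AWAIT_WATER and distance_m < contact_threshold:
        return SAMPLING           # second near-surface event: unfold pipe
    if state == SAMPLING and distance_m > climb_threshold:
        return RETURN             # back at cruise: fold pipe, seal container
    return state

# Feed a sequence of proximity readings through the controller.
s = TAKEOFF
for d in [0.2, 6.0, 6.0, 0.4, 0.4, 6.5]:
    s = next_state(s, d)
```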
The delay time for the sample collection mode to fill the required volume of water depends on the final velocity at the storage tank opening, or equivalently the discharge rate of water flowing from the water surface through the pipe fixed on the drone. For example, assuming the aircraft’s speed is 70.7 km/h, and water is entering a circular pipe with a length of 1 m and a diameter of 2 cm, the flow rate can be approximated using the Bernoulli principle to be around 6 L/s, ignoring minor kinetic losses. Thus, the sample collection time can be limited to a few seconds. We can vary the assumed final velocity, pipe length, and diameter to achieve the maximum flow rate, but disturbances to flight and turbulence at the inlet should also be considered.
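The 6 L/s figure can be reproduced with the stated assumptions (inlet velocity equal to the airspeed, losses ignored):

```python
import math

def ram_flow_rate(speed_kmh, diameter_m):
    """Ideal volumetric flow through a forward-facing pipe inlet,
    taking the inlet velocity as the airspeed (Bernoulli, kinetic
    losses ignored). Returns litres per second."""
    v = speed_kmh / 3.6                      # km/h -> m/s
    area = math.pi * (diameter_m / 2) ** 2   # circular cross-section
    return v * area * 1000.0                 # m^3/s -> L/s

# 70.7 km/h through a 2 cm pipe: about 6 L/s, matching the estimate above.
q = ram_flow_rate(70.7, 0.02)
```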

3.2. The Automated Sample Processor

Bacterial colonies must be processed and stained to be detectable under the microscope. An automated sample processor runs the sample through multiple phases to prepare it for microscopic imaging and classification. Electronic dispensers and staining solutions are used to place the sample drops on a slide. After air drying the dropped sample, heat fixation must be applied by passing the slide’s back over a flame a few times to prevent washing off specimens during staining. The clip holding the slide consists of a servomotor pressing it against a base, which is fixed on another servomotor that rotates the slide through the processing steps under multiple dispensers, until it reaches the microscope, as shown in the built prototype in Figure 4.
Assuming Gram staining is performed, four dispensers are needed for four solutions: crystal violet, iodine, ethyl alcohol, and safranin, consecutively, along with a distilled water dispenser for washing the staining solutions. The dispensers’ operation must be synchronized with the motor moving the slide’s holder. A fan then air dries the slide for two minutes after the ethyl alcohol wash. Delays of 60 s are set after applying crystal violet, iodine, and safranin, while a 10 s delay is set after washing with ethyl alcohol, bringing the entire processing time to around 5.28 min. The sample processor can produce 11 slides per hour, which is sufficient to acquire images for building a complete subset for a single pathogen category and an entire training dataset within a few hours. The sample processor is small enough to be placed in a fume hood for protection from possible infectious aerosols resulting from heat fixation, while a camera placed on the microscope’s eyepiece captures images remotely for image processing and pathogen detection.
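A back-of-the-envelope check of the cycle time and throughput quoted above; the few seconds of slide-transfer overhead between stations is an assumption used to reconcile the step durations with the 5.28 min total.

```python
# Step durations (seconds) for the Gram-staining sequence described above.
steps = {
    "crystal violet delay": 60,
    "iodine delay": 60,
    "ethyl alcohol wash delay": 10,
    "safranin delay": 60,
    "fan air drying": 120,
}
transfer_s = 7  # assumed servomotor slide-transfer overhead per cycle
total_s = sum(steps.values()) + transfer_s  # about 317 s, i.e. ~5.3 min
slides_per_hour = 3600 // total_s           # 11 slides per hour
```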

3.3. Ethical and Environmental Concerns

In general, drones are considered a low-cost alternative to conventional mechanical harvesting in wildlife surveying and offer rapid access to ecosystems, supporting biosecurity programs [17]. However, their use in sensitive ecosystems must be approached with caution due to their potential impact on wildlife, surrounding inhabitants, and the environment. The noise from drone propellers can disrupt animal behavior from a distance [18]. Moreover, the sound of propellers can disturb both human and animal inhabitants. Drones can have disruptive effects on birds, reptiles, and mammals [17]. This effect can be minimized by planning flight paths through isolated areas whenever possible and by reducing the number of propellers on the drone. Rotary-wing drones, such as quadcopters, tend to produce louder noise than fixed-wing drones.
Moreover, bird strikes, insect swarms, or sudden wind gusts can cause turbulence or even propeller failure, increasing the risk of a crash and the potential spillage of hazardous samples. This risk is particularly high during the sample collection stage, when flying just a few meters above the ground or near water surfaces, where species or obstacles could cause a crash. This highlights the need for an obstacle detection system and a tracking mechanism to give the monitoring team time to address these risks. A fixed-wing drone has the advantage of being able to maintain lift after engine failure, allowing for an emergency landing in a safe area, unlike rotary-engine drones.
Since the guidance system does not rely on visual navigation, concerns about invading people’s privacy are mitigated. Imaging is used only at the point of sample collection for obstacle avoidance. If the water body of interest is located on private or public property, the sample collection must be coordinated with the owner to prevent damage or injury. For this reason, an obstacle detection mechanism is recommended, particularly for near-surface flights. This system does not require high-resolution imagery, reducing privacy concerns, especially when activated only at low altitudes.
These concerns can be addressed through careful management of sample collection processes, in collaboration with local or indigenous communities. A successful example to consider is the Rainforest Foundation US (RFUS), which established a rainforest alert system along the Amazon River in Peru while working closely with various indigenous organizations [19]. Preventing outbreaks that could pose hazards may outweigh some of these concerns. Technical issues can be minimized or avoided through careful design and by regulating the number of sample collection processes, limiting them to seasons when outbreaks are most likely to occur.

3.4. Microscopic Images Dataset Preprocessing

The microscopic images of the considered bacteria were retrieved from the DIBaS dataset, which is publicly available for researchers [3]. The first step in segmenting microscopic images into appropriately sized inputs for the CNN model is to estimate a segment size that accommodates bacterial cultures. The segment size was set according to the smallest possible morphotype, so that classification accounts for small details in bacterial morphotypes. To measure the size of a bacterial culture in a microscopic image that contains many boundaries, the boundaries must be roughly detected, requiring an image processing procedure, as shown in the example in Figure 5. The 3-layer RGB microscopic image is converted to a single-layer, reversed grey image. The grey-scale values are inverted, since the background in microscopic images is usually light, unless filtering layers are used beneath the slides. To correct the illumination and textural noise of the background, a morphological top-hat transform was applied with a 15-pixel disk as the morphological structuring element.
A binary version of each image is used to detect the pixels of blobs, where the objects with fewer than 30 pixels are removed. In instances where bacterial colonies are in close proximity within an image, it is necessary to identify each colony individually. Each shape should be differentiated in mixed images, unless it shares a common area of at least 8 pixels with another shape. Separating blobs allows the individual measurement of the lengths of the sides of the rectangular area around them by measuring the vertical distance between the uppermost and lowermost indexes and the horizontal distance between the rightmost and leftmost indexes of pixels in each blob. The minimum of the larger sides for each blob across all images in the dataset divides each complete microscopic image into equal squares. Segments in which the bacterial culture represents less than 20% of the total area in the binarized image are filtered out. A total of 20,333 microscopic image segments resulted from this process for all samples.
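Under the stated parameters (15-pixel disk, 30-pixel minimum blob size), the inversion, top-hat, blob labeling, and segment-size estimation can be sketched with SciPy. The simple global threshold and the helper name are assumptions; the paper does not specify its binarization rule.

```python
import numpy as np
from scipy import ndimage

def segment_size_from_blobs(gray, disk=15, min_pixels=30):
    """Estimate a square segment side from blob bounding boxes, along the
    lines described above (a sketch, not the paper's exact pipeline)."""
    inv = gray.max() - gray  # reverse the grey image: light background -> dark
    # White top-hat: subtracting the morphological opening removes slow
    # background illumination and texture variations.
    tophat = inv - ndimage.grey_opening(inv, size=(disk, disk))
    binary = tophat > tophat.mean()          # global threshold (assumed)
    labels, n = ndimage.label(binary)        # separate touching-free blobs
    counts = ndimage.sum(binary, labels, index=range(1, n + 1))
    sides = []
    for cnt, sl in zip(counts, ndimage.find_objects(labels)):
        if cnt < min_pixels:
            continue                         # drop objects under 30 pixels
        h = sl[0].stop - sl[0].start         # vertical bounding-box side
        w = sl[1].stop - sl[1].start         # horizontal bounding-box side
        sides.append(max(h, w))              # larger side of each blob
    # The minimum of the larger sides sets the common square segment size.
    return min(sides) if sides else 0
```

Running this over every image in the dataset and taking the overall minimum would yield the unified segment size used to tile the full microscopic images.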

4. AI-Based Risk Assessment

4.1. CNN Architecture

Resultant individual estimates of blob areas in single microscopic images are presented in the histograms in Figure 6 for the six aforementioned pathogenic and harmless bacteria. These histograms show that the microscopic images are expected to have similar small blobs, with areas between 0 and 100 pixels. These distinctions can be used as a parameter to check the output of the CNN classification, which is described in the next subsection. Classification is based on cropped segments from microscopic images. While binary-converted images are used only for blob dimension measurement and segmentation, classification is based on colored images, since blobs’ responses to staining before imaging vary from one category to another, thus contributing to classification. Segments have a common size, and each segment is input to the CNN individually. The CNN input layer is a 3-channel layer with a size of 102 × 102 pixels per channel. The size of the input layer is set according to the smallest size of color images of individual bacterial cultures among the set of interest, as explained in the previous section. As shown in the diagram in Figure 7, the input layer is followed by the convolutional layers, whose weights are randomly initialized. All networks contain three convolutional blocks, each containing a number of consecutive hidden convolutional layers that defines the depth of that block. Convolutional layers are followed by normalization and activation layers, and the optimal number of hidden layers, along with the other hyperparameters, is found through a Bayesian optimization algorithm. The Rectified Linear Unit (ReLU) is chosen as the activation function after convolutional layers because it is widely used in colored image classifiers [20,21].
The kernel size used in convolutions is 3 × 3, while the number of kernels in each layer from the first to the last is set to rounded values of 16, 32, and 64 multiplied by the reciprocal of the square root of the number of the hidden layers in each block. Padding is used after each convolutional layer to ensure consistent dimensions for feature maps. The first and second convolutional blocks are followed by pooling and normalization layers, while the last is connected directly to the output fully connected layer.
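The kernel-count rule above (16, 32, and 64 scaled by the reciprocal square root of the block depth, then rounded) works out as follows; the helper name is illustrative.

```python
import math

def kernels_per_block(depth, base=(16, 32, 64)):
    """Kernels for the three convolutional blocks, scaled by
    1/sqrt(depth) and rounded, per the rule described above."""
    return [round(b / math.sqrt(depth)) for b in base]

k1 = kernels_per_block(1)  # single-layer blocks keep the base counts
k2 = kernels_per_block(2)  # deeper blocks use fewer kernels per layer
```

This scaling keeps the total parameter budget roughly stable as block depth grows during the Bayesian search.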
Different convolutional networks were tested in accordance with the dataset’s different organizations. The dataset was organized into two forms. The first general form had the images of all infectious and harmless bacterial cultures pooled into two files corresponding to these two categories. The second organization of the same dataset had the images of each species in an independent file, forming a total of six categories, namely for Acinetobacter baumannii (A. baumannii), Escherichia coli (E. coli), and Pseudomonas aeruginosa (P. aeruginosa), which are pathogenic bacteria, and Enterococcus faecium (E. faecium), Lacticaseibacillus paracasei (L. paracasei), and Micrococcus spp., which are harmless bacteria. Fully connected layers follow the last convolutional layer with a number of outputs equal to the number of categories in each version of the dataset, accordingly either two for the general classification or six for the specific classification. These outputs are fed into the final Softmax layer, which outputs the probability of each category label, with the label of the highest probability being output as the predicted label.
During the Bayesian optimization trials, the objective function trains the CNN models and returns the mean error, calculated as the fraction of incorrect classifications on a validation subset consisting of 10% of the training dataset. This return value allows the optimization algorithm to adjust the number of convolutional blocks, the initial learning rate, the optimizer momentum for the weights, and the weight decay rate for the next trial so as to minimize the mean error. During the supervised training iterations, Stochastic Gradient Descent (SGD) with momentum was preferred over other optimizers, such as the Adam optimizer, for updating the networks’ weights because it offered faster convergence to the optimal model.
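The scalar returned by the objective function — the mean misclassification rate on the validation subset — can be sketched as follows. This is a minimal stand-in for the full training loop, which would be driven by a Bayesian optimizer (such as a Gaussian-process library or MATLAB’s bayesopt); the function name is an assumption:

```python
def mean_classification_error(predicted_labels, true_labels):
    """Fraction of validation samples whose predicted label is wrong.
    This is the scalar the Bayesian optimizer seeks to minimize."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("label sequences must have equal length")
    wrong = sum(p != t for p, t in zip(predicted_labels, true_labels))
    return wrong / len(true_labels)
```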

4.2. Results

Thirty trials were executed in the Bayesian global optimization, where, in each trial, the corresponding network was trained through 63 iterations. The two versions of the dataset (each with a total of 20,333 microscopic image segments), with two and six labels, were divided into training and validation subsets representing 90% of the total (with 10% of this portion used for validation) and a testing subset of the remaining 10% (2033 segments). The optimal depth was determined using the dataset labeled into two categories (harmful and harmless bacteria) and was found to be a single hidden layer per block across a total of three cascaded convolutional blocks. With a learning rate of 0.010026, this CNN achieved a validated classification accuracy of 100% (a mean error of 0) in distinguishing harmful (infectious) bacteria from harmless ones, as indicated in the training progress graph in Figure 8a. During testing, the accuracy dropped slightly to 99.7%. The confusion chart in Figure 8b for the testing subset shows that all harmful bacteria images were classified correctly, while only six harmless bacteria images out of a total of 1005 (0.6%) were misclassified as harmful. Although such false alarms are undesirable, they do not pose a danger compared to the reverse scenario, where harmful bacteria are classified as harmless.
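As a quick arithmetic check of the split sizes quoted above (the helper function is illustrative):

```python
def split_counts(total_segments, test_frac=0.10, val_frac_of_remainder=0.10):
    """Return (train, validation, test) counts for the split described:
    10% held out for testing, then 10% of the remainder for validation."""
    test = round(total_segments * test_frac)
    remainder = total_segments - test
    val = round(remainder * val_frac_of_remainder)
    return remainder - val, val, test
```

For the 20,333 segments, this gives 2033 test segments, matching the text, with 1830 validation and 16,470 training segments under the stated fractions.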
Furthermore, to identify the specific kind of each bacterial culture, a different optimal CNN with six possible output labels (A. baumanii, E. coli, P. aeruginosa, E. faecium, L. paracasei, and Micrococcus spp.) was found through 30 Bayesian trials, with an optimal learning rate of 0.010017 and a depth of one hidden layer per block for the three convolutional blocks. It achieved a validated classification accuracy of 91.96% and a mean error of 0.082596 during training, as indicated in the training progress graph in Figure 8c. For the harmful bacteria A. baumanii and P. aeruginosa, the individual classification accuracy was high: 100% and 95.1%, respectively, as shown in the summary table to the right of the confusion matrix in Figure 8d. However, for E. coli, which is also pathogenic, the classification accuracy was low at 65.4%: 34.6% of the testing E. coli images were misclassified as either A. baumanii or P. aeruginosa, both of which, like E. coli, belong to the pathogenic group.
The fact that most harmful bacteria were either classified correctly into their specific category or misclassified as another member of the general pathogenic category agrees with the results of the first, general CNN, which had only two labels and achieved an accuracy of 100% in detecting harmful bacteria. The second CNN achieved high specific classification accuracies for all harmless bacteria during testing: 89% for E. faecium, 100% for L. paracasei, and 100% for Micrococcus spp. The classification accuracy for most harmful bacteria images was also high: 100% for A. baumannii and 95.1% for P. aeruginosa, although it was lower for E. coli at 65.4%. Since the primary objective of the proposed system is to serve as an early warning system, the two output labels may be sufficient to alert against an outbreak, given their near-perfect detection accuracy for harmful pathogens. However, multiple output labels are necessary to specify the appropriate treatment for combating the spread of a pathogen, a point discussed in the next section.
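The agreement between the six-label CNN and the two-label CNN can be checked by collapsing specific predictions into the harmful/harmless grouping — a small hedged sketch with illustrative names:

```python
# The three pathogenic species from the dataset; the rest are harmless.
HARMFUL = {"A. baumanii", "E. coli", "P. aeruginosa"}

def to_group(label):
    """Map a specific species label to the general two-label scheme."""
    return "harmful" if label in HARMFUL else "harmless"

def group_level_accuracy(predicted, actual):
    """Fraction of predictions landing in the correct harmful/harmless group,
    even when the specific species is wrong."""
    hits = sum(to_group(p) == to_group(a) for p, a in zip(predicted, actual))
    return hits / len(actual)
```

An E. coli segment misclassified as A. baumanii still counts as a correct harmful detection at the group level, which is why the two-label alarm remains reliable despite the 65.4% species-level accuracy for E. coli.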

5. Future Work

Pathogen identification can be further aided by recognizing the interactions of waterborne pathogens with other species found in unclean, unfiltered water samples, such as parasites, algae, and insects, since a bacterial infection can change the appearance of these species; color-based recognition algorithms and multilabel classifiers could exploit such cues. Higher-quality, higher-magnification microscopic images, along with additional species, can enhance the classifiers’ ability to extract these features effectively. Additional pollutants that can be considered include other microorganisms and chemical contaminants, such as the widespread microplastic debris. The approach can also be applied to multispectral photos of the water surface, which help deduce the water temperature and pH level through visual inspection of water color and surface reflectance maps [22]. Water temperature and pH level are among the most important features for predicting dissolved oxygen levels with neural networks, a major indicator of the existence of microorganisms in water [5].
For future development, the scheduling of sample collection processes throughout the year can be optimized using a deep reinforcement learning approach. This method would leverage data on infection locations, weather conditions, and statistics on natural events such as droughts and wildfires from previous years. For the control center that designs flight trajectories for each drone, a K-Nearest Neighbors (KNN) algorithm could be employed, as it is well suited to environments with limited data. The learning environment can be constructed using a grid system or a Markov map for each location.

6. Conclusions

This paper proposes an integrated system of robotic solutions and AI models for the early detection of viral waterborne diseases. The prototype integrated an onboard ultrasonic sensor and servomotor to automatically operate a water sample collector during a near-surface flight, trapping water in an onboard container. Another robotic prototype was built to execute water sample processing and capture microscopic images, using an automatic slide holder and multiple electronic solution dispensers to perform the stages of Gram staining of the sample drops. The automated sample processor has a production rate of 11 slides per hour, which is sufficient to build a training dataset for the classifiers within a few hours. To detect waterborne pathogens in water samples using CNNs, microscopic images of potential pathogens were segmented using a morphological image processing technique; the dataset included images of Acinetobacter baumannii (A. baumanii), Escherichia coli (E. coli), Pseudomonas aeruginosa (P. aeruginosa), Enterococcus faecium (E. faecium), Lacticaseibacillus paracasei (L. paracasei), and Micrococcus spp. The optimized CNN model used to classify the bacterial cultures into harmful and harmless achieved an accuracy of 99.7%. The model used to identify specific kinds of bacteria achieved a cumulative accuracy of 93.65% over the six considered species. The individual per-species accuracies in the confusion matrices varied between 89% and 100% for both harmful and harmless bacteria, except for E. coli, which had an identification accuracy of 65.4%. However, all E. coli segments were successfully classified as harmful bacteria by the first, two-output CNN model, which helps maintain the validity of the alarm system.

Author Contributions

Conceptualization, methodology, formal analysis, and software, A.A.; writing—original draft, A.A.; validation G.A.H.; writing—review and editing, G.A.H. and L.A.; supervision, G.A.H. and L.A.; project administration, L.A.; funding acquisition, L.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the American University of Sharjah, grant number FRG23-C-E08.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The training dataset consists of processed images from the DIBaS dataset [3], which is open for research, as mentioned in the paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Joy, C.; Sundar, G.N.; Narmadha, D. Artificial intelligence-based test systems to resist waterborne diseases by early and rapid identification of pathogens: A Review. SN Comput. Sci. 2023, 4, 180. [Google Scholar] [CrossRef]
  2. Hall-Clifford, R.; Arzu, A.; Contreras, S.; Croissert Muguercia, M.G.; de Leon Figueroa, D.X.; Ochoa Elias, M.V.; Soto Fernández, A.Y.; Tariq, A.; Banerjee, I.; Pennington, P. Toward co-design of an AI solution for detection of diarrheal pathogens in drinking water within resource-constrained contexts. PLoS Glob. Public Health 2022, 2, e0000918. [Google Scholar] [CrossRef] [PubMed]
  3. Zielinski, B.; Plichta, A.; Misztal, K.; Spurek, P.; Brzychczy-Wloch, M.; Ochonska, D. “Digital Image of Bacterial Species (DIBaS),” Distributed by Jagiellonian University. Available online: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0184554 (accessed on 1 March 2022).
  4. Ogore, M.M.; Nkurikiyeyezu, K.; Nsenga, J. Offline prediction of cholera in rural communal tap waters using edge AI inference. In Proceedings of the 2021 IEEE Globecom Workshops (GC Workshops), Madrid, Spain, 7–11 December 2021. [Google Scholar]
  5. Zhu, M.; Wang, J.; Yang, X.; Zhang, Y.; Zhang, L.; Ren, H.; Wu, B.; Ye, L. A review of the application of machine learning in water quality evaluation. Eco-Environ. Health 2022, 1, 107–116. [Google Scholar] [CrossRef] [PubMed]
  6. Wu, Y.; Zhang, X.; Xiao, Y.; Feng, J. Attention neural network for water image classification under IOT Environment. Appl. Sci. 2020, 10, 909. [Google Scholar] [CrossRef]
  7. Zieliński, B.; Plichta, A.; Misztal, K.; Spurek, P.; Brzychczy-Włoch, M.; Ochońska, D. Deep learning approach to bacterial colony classification. PLoS ONE 2017, 12, e0184554. [Google Scholar] [CrossRef] [PubMed]
  8. Graham, C.T.; O’Connor, I.; Broderick, L.; Broderick, M.; Jensen, O.; Lally, H.T. Drones can reliably, accurately and with high levels of precision, collect large volume water samples and physio-chemical data from Lakes. Sci. Total Environ. 2022, 824, 153875. [Google Scholar] [CrossRef] [PubMed]
  9. Koparan, C.; Koc, A.B.; Privette, C.V.; Sawyer, C.B. Adaptive water sampling device for aerial robots. Drones 2020, 4, 5. [Google Scholar] [CrossRef]
  10. Singh, D.; Singh, R.; Ajmeria, R.; Gupta, M.; Ponnalagu, R.N. DAWSSM: A plug-and-play drone assisted water sampling and sensing module. In Proceedings of the IECON 2021—47th Annual Conference of the IEEE Industrial Electronics Society, Toronto, ON, Canada, 13–16 October 2021. [Google Scholar]
  11. Ritchie, H.; Roser, M. Clean Water, Our World in Data. Available online: https://ourworldindata.org/water-access (accessed on 4 June 2023).
  12. Septiyana, A.; Ramadiansyah, M.L.; Jayanti, E.B.; Hidayat, K.; Rizaldi, A.; Atmasari, N.; Suseno, P.A. Static stability analysis on twin tail boom UAV using numerical method. In Proceedings of the 8th International Seminar on Aerospace Science and Technology—ISAST 2020, Bogor, Indonesia, 17 November 2020. [Google Scholar]
  13. Stewart, M.P.; Martin, S.T. Unmanned Aerial Vehicles: Fundamentals, Components, Mechanics, and Regulations; Nova Science Publishers: New York, NY, USA, 2021; Chapter 1; p. 53. [Google Scholar]
  14. Hussen Hajjaj, S.S.; Hameed Sultan, M.T.; Moktar, M.H.; Lee, S.H. Utilizing the internet of things (IOT) to develop a remotely monitored autonomous floodgate for water management and Control. Water 2020, 12, 502. [Google Scholar] [CrossRef]
  15. Djalilov, A.; Sobirov, E.; Nazarov, O.; Urolov, S.; Gayipov, I. Study on automatic water level detection process using ultrasonic sensor. IOP Conf. Ser. Earth Environ. Sci. 2023, 1142, 012020. [Google Scholar] [CrossRef]
  16. Jiang, Z.; Ponniah, J.; Cascante, G. Innovative nondestructive test method for condition assessment of longitudinal joints in asphalt pavements. In Proceedings of the 14th Pan-American Conference on Soil Mechanics and Geotechnical Engineering, Toronto, ON, Canada, 2–6 October 2011. [Google Scholar]
  17. Jiménez López, J.; Mulero-Pázmány, M. Drones for Conservation in Protected Areas: Present and Future. Drones 2019, 3, 10. [Google Scholar] [CrossRef]
  18. Pomeroy, P.; O’Connor, L.; Davies, P. Assessing use of and reaction to unmanned aerial systems in gray and harbor seals during breeding and molt in the UK. J. Unmanned Veh. Syst. 2015, 3, 102–113. [Google Scholar] [CrossRef]
  19. Sauls, L.A.; Paneque-Gálvez, J.; Amador-Jiménez, M.; Vargas-Ramírez, N.; Laumonier, Y. Drones, Communities and Nature: Pitfalls and Possibilities for Conservation and Territorial Rights. Glob. Soc. Chall. J. 2023, 2, 24–46. [Google Scholar] [CrossRef]
  20. Flachot, A.; Gegenfurtner, K.R. Color for object recognition: Hue and chroma sensitivity in the deep features of Convolutional Neural Networks. Vis. Res. 2021, 182, 89–100. [Google Scholar] [CrossRef] [PubMed]
  21. Zhuang, Y.; Guo, C. City architectural color recognition based on deep learning and pattern recognition. Appl. Sci. 2023, 13, 11575. [Google Scholar] [CrossRef]
  22. Isgró, M.A.; Basallote, M.D.; Caballero, I.; Barbero, L. Comparison of UAS and sentinel-2 multispectral imagery for water quality monitoring: A case study for acid mine drainage affected areas (SW Spain). Remote Sens. 2022, 14, 4053. [Google Scholar] [CrossRef]
Figure 1. (a) An airborne water sample collector, (b) water drip kit, (c) staining drip kit, (d) sample slide, (e) base moving the slide, (f) microscope, (g) camera, (h) image processing, and (i) deep neural network.
Figure 2. From left to right: examples of segments of Gram-stained microscopic images that are used as inputs for the CNN from the DIBas dataset. In the upper row are the pathogenic ones: A. baumanii, E. coli, and P. aeruginosa, whereas in the lower row are the harmless ones: E. faecium, L. paracasei, and Micrococcus spp.
Figure 3. (a) A simplified example of a flight profile between the waypoints and (b) a sample collector fixed on an aircraft. A miniature prototype of the device on a small drone with an HC-SR04 ultrasonic sensor to automate the water trap movement fixed to a servomotor through an arm in (c). (d) A sample collector controller’s program flowchart.
Figure 4. Parts of the automated sample processor prototype. The servomotors’ arms of the dispensers are attached to a rubber piece to open and close the pressure relief hole on the top of each container, as required for solution flow from the hole in the bottom. Five dispensers are used for crystal violet, iodine, ethyl alcohol, safranin, and distilled water for washing between staining stages.
Figure 5. Stages of the processing of a microscopic image of Micrococcus spp. morphotypes, starting with the original image in (a), converted to grey-scale and reversed to produce the image in (b), on which a top-hat transform was applied to produce the version shown in (c), which was binarized to create (d). Blobs, shown in white in the binarized version in (d), are isolated and measured individually, as illustrated by the example shown in the red rectangle.
Figure 6. Histograms of the areas of blobs in microscopic images of different bacteria as per the title on the top of each graph, where the y-axis represents the number of blobs in each area range on the horizontal axis.
Figure 7. A diagram of the building blocks of the CNN, where N is the number of cascaded hidden layers in each convolutional block, and M is the number of outputs of the fully connected layer (2 in the general classification network and 6 in the specific classification network). The kernel sizes of the convolutional (conv.) and pooling layers are written in parentheses.
Figure 8. Training progress graphs for the CNN with two output labels for the input of microscopic images: harmful and harmless in (a) and six output labels for specific species in (c). The blue and lighter blue lines represent the real and smoothed accuracy lines, respectively, while the dots along the dashed line represent the validated accuracy values. Charts in (b,d) show the confusion matrices for each CNN resulting from the training progress graphs above. Each bacterium is listed by the first part of its name in the confusion matrix in (d), namely Acinetobacter for (A. baumanii), Enterococcus for (E. faecium), Escherichia for (E. coli), Lacticaseibacillus for (L. paracasei), Pseudomonas for (P. aeruginosa), and Micrococcus.