1. Introduction
Cloud observation forms part of the core duties of meteorologists as it helps them to make informed decisions in weather forecasting. Clouds are masses of water droplets or ice particles floating in the sky. The type of cloud, the total amount of cloud (cloud cover), and the height of the cloud are the three main factors that meteorologists consider while observing clouds. Essentially, there are ten main types of clouds. They are classified as high clouds, medium clouds, or low clouds by the World Meteorological Organization (WMO). High clouds include cirrus (Ci), cirrocumulus (Cc), and cirrostratus (Cs), whereas medium clouds include altocumulus (Ac), altostratus (As), and nimbostratus (Ns). Stratus (St), stratocumulus (Sc), cumulus (Cu), and cumulonimbus (Cb) are some of the low clouds.
More than 60 percent of the global surface is covered by clouds, and they play a vital role in the hydrological cycle, climate change, and radiation budgets by modifying shortwave and longwave radiation [1]. Each cloud type, as well as each combination of two or more cloud types, is associated with its own meteorological phenomena. For instance, in the aviation industry, although aircraft can experience turbulence in clear air, clouds account for the greater share of the turbulence that aircraft experience. Although most commercial airlines fly above much of the cloud, they still fly through clouds during landing and takeoff at airports. A typical commercial jet has a cruising altitude of around six to seven miles (9 to 11 km) above sea level; hence, on a long-distance flight, a plane will generally be above most clouds, except for cirrus and the towering cumulonimbus.
Air passengers experience a bumpy flight when the aircraft enters clouds ranging from puffy cumulus clouds, which are also known as fair-weather clouds, to monstrous cumulonimbus clouds with their characteristic anvil-shaped tops, billowing sides, and ominously dark bases. When clouds are cooler than the surrounding air, the contrast in density between the clouds and the surrounding air creates a sort of “pothole” in the sky, resulting in a less-smooth flight. Storm clouds, such as cumulonimbus, are the types of clouds that most pilots want to avoid. Cumulonimbus clouds generally contain heavy rain, lightning, hail, strong winds, and occasionally tornadoes. Pilots and air traffic control pay close attention to the weather and route flights around these types of storm clouds. Similarly, clouds can block light and heat from the Sun, making Earth’s temperature cooler.
In meteorology, cloud cover is measured in oktas, or eighths of the sky. Observers mentally divide the sky into eight boxes as they look up at it, and then picture all the clouds they can see being crammed into these boxes. Then, they count how many boxes the cloud fills. The number of filled boxes corresponds to the number of oktas of cloud. Zero oktas represents a complete absence of cloud; 1 okta represents a cloud amount of one-eighth or less, but not zero; 7 oktas represents a cloud amount of seven-eighths or more, but not full cloud cover; and 8 oktas represents full cloud cover with no breaks.
Meteorologists also use terminology to convey generally how cloudy it is; for instance, “few clouds” refers to 1 to 2 oktas, “scattered cloud” refers to 3 to 4 oktas, where about half the sky is covered, “broken cloud” refers to 5 to 7 oktas, where much of the sky is covered, and “overcast” refers to 8 oktas of cloud with no breaks. When there are no clouds in the sky, the cloud amount is reported as “nil”.
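The okta rules and reporting terms above can be sketched in a few lines of Python. This is a minimal illustration, not a procedure from the study: the function names are hypothetical, and rounding a fractional sky cover to the nearest eighth is an assumption, since the text describes only a visual estimate.

```python
def fraction_to_oktas(fraction: float) -> int:
    """Convert a sky-cover fraction in [0, 1] to oktas.

    Assumption: intermediate values are rounded to the nearest eighth.
    The text fixes only the edge rules (0 = no cloud, 8 = no breaks).
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if fraction == 0.0:
        return 0  # complete absence of cloud
    if fraction == 1.0:
        return 8  # full cloud cover with no breaks
    # Any trace of cloud counts as at least 1 okta; any break caps the value at 7.
    return min(7, max(1, round(fraction * 8)))


def oktas_to_term(oktas: int) -> str:
    """Map an okta value to the reporting term used in the text."""
    if not 0 <= oktas <= 8:
        raise ValueError("oktas must be between 0 and 8")
    if oktas == 0:
        return "nil"
    if oktas <= 2:
        return "few"
    if oktas <= 4:
        return "scattered"
    if oktas <= 7:
        return "broken"
    return "overcast"
```

For example, a sky that is about half covered gives `oktas_to_term(fraction_to_oktas(0.5))`, which is `"scattered"`.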
At most synoptic observing stations, cloud base (height) is readily measured by an instrument to a reasonable level of accuracy. The cloud base recorder employs pulsed diode laser LIDAR (light detection and ranging) technology [2], whereby short, eye-safe laser pulses are sent out in a vertical or near-vertical direction. The backscatter caused by reflection from the surface of the cloud, precipitation, or other particles is analyzed to determine the height of the cloud base. Many modern cloud base recorders are capable of detecting up to three cloud layers simultaneously.
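The text does not spell out how the backscatter is turned into a height. A common principle for pulsed LIDAR is time of flight: the range is half the distance light travels between emitting the pulse and receiving its echo. The sketch below assumes that principle. The function name and the tilt parameter are hypothetical and are not taken from any specific instrument.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def cloud_base_height_m(round_trip_time_s: float,
                        tilt_from_vertical_deg: float = 0.0) -> float:
    """Estimate cloud base height from the round-trip time of a laser pulse.

    Assumption: simple time-of-flight ranging, ignoring atmospheric refraction.
    """
    if round_trip_time_s < 0:
        raise ValueError("round-trip time cannot be negative")
    # The pulse travels to the cloud and back, so the range is half the path.
    slant_range_m = SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0
    # Project the slant range onto the vertical for a near-vertical beam.
    return slant_range_m * math.cos(math.radians(tilt_from_vertical_deg))
```

Under these assumptions, an echo received 10 microseconds after the pulse corresponds to a cloud base roughly 1.5 km above the instrument.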
However, some meteorologists at weather observation stations still rely heavily on manual estimation of cloud height. The observers determine the types of clouds present in the sky and use the generally accepted height range of each of the 10 cloud types to estimate the height of the cloud observed. For example, low clouds, which include cumulus clouds, can form anywhere from near the surface up to 2000 m (6500 ft). Middle clouds form at altitudes of 2000–4000 m (6500–13,000 ft) above ground near the poles, 2000–7000 m (6500–23,000 ft) at mid-latitudes, and 2000–7600 m (6500–25,000 ft) in the tropics. High clouds have base heights of 3000–7600 m (10,000–25,000 ft) in polar regions, 5000–12,200 m (16,500–40,000 ft) in temperate regions, and 6100–18,300 m (20,000–60,000 ft) in tropical regions.
The most important aspect of cloud observation is cloud identification, as opposed to cloud base (height) and cloud amount (cloud cover). It is quite challenging for meteorologists to determine the type of cloud present at any particular time due to the clouds’ similarity in shape, color, form, and texture. It depends on the observer’s knowledge, experience, and color vision to identify clouds accurately.
Weather forecast reports are critical to areas such as transportation (air, sea, and land), agriculture, energy, the environment, and the general public. Given the potential for misclassification, clouds often cause weather forecasts to be inaccurate, thereby exposing lives and property to extreme weather disasters.
To forestall such occurrences, a variety of measurement equipment, including satellite-based and ground-based remote sensors, is used to collect the necessary cloud data for classification tasks. Ceilometers are standalone devices that use laser-based light detection and ranging (LIDAR) technology to measure the height profiles of clouds and aerosols [2]. However, there are difficulties involved in using ceilometers to gauge cloud height. They do not display the type of cloud observed, are extremely expensive and, when used as a standalone instrument, are large and heavy, difficult to use, and require specialist knowledge. Furthermore, they are fragile and unreliable due to their susceptibility to weather conditions. Examples of ceilometers are the Eliason CBME80B and the Vaisala CL31. Additionally, clouds may be observed from above by satellite-based weather equipment across wide regions. However, the spatial resolution of such equipment is too low to represent small-scale cloud features across large regions [3].
Several studies have been undertaken in an attempt to find a better method of cloud observation. Zhuo et al. [4] developed a color-census transformation to extract texture and structural information from a color sky image, although the accuracy achieved was not sufficient owing to dataset constraints and the approach used. Refs. [4,5,6] have advocated using hand-crafted features, including color, texture, and structure, to classify ground-based cloud images. He et al. and Labati et al. [7,8] also explored the use of machine-learning-based convolutional neural networks to classify ground-based clouds. Furthermore, Shi et al. [9] represented each ground-based cloud image with deep convolutional activation-based features, obtained by pooling the convolutional activations of each feature map, and ran tests on two datasets with 784 cloud images in five classes and 1500 images in seven classes.
In another experiment to classify 1231 cloud images into nine groups, Ye et al. [10] retrieved features from various convolutional layers, from which discriminative local patterns are selected and subsequently represented via the Fisher vector. The learning-group-patterns method extracts cloud properties using wireless sensor networks [11]. The combined accuracy on the two databases, however, is 81%. The task-based graph convolutional network (TGCN) model, which combines graph computation with a deep network to classify ground-based clouds, was also suggested by [12]. CNNs, which are a type of deep-learning architecture, have previously excelled in a variety of fields, including pattern recognition and computer vision [13,14].
Conventional methods fail to describe and extract the characteristics of clouds because cloud textures and patterns are convoluted, but CNNs can learn increasingly intricate patterns and discriminative textures from vast pre-trained and labeled datasets [15]. Additionally, convolutional neural networks typically feature hierarchical feature-extraction frameworks. On the whole, the shallow layers of a CNN capture fine textures, such as edges and shapes, whereas the deeper layers represent high-level semantic data based primarily on pixels. Previous research findings indicate that both semantic features and textures are vital for cloud characterization [4].
Despite the image-classification prowess of CNNs, few researchers have evaluated their accuracy and effectiveness in cloud classification. Cloud images differ in how they represent clouds, particularly for contrails, since these are distinctive texture images with odd forms. Given the incredibly varied and difficult-to-classify characteristics of clouds, the adaptive-learning aspect of neural network classification provides a high-accuracy and computationally efficient alternative for cloud classification [16]. Fabel et al. [17] used about 300,000 semantic segmentations of ground-based all-sky images (ASIs) in two different pretext tasks for pretraining. One of them pursues an image reconstruction approach, while the other is based on the deep cluster model, an iterative procedure of clustering and classifying the neural network output. Li et al. [18] classified ground-based cloud images by using contrastive self-supervised learning to pre-train the deep model with a contrastive loss and momentum update-based optimization. Liu et al. [12] classified ground-based cloud images (GCIs) by using a transformer-based GCI classification method that combines the advantages of the CNN and transformer models. Toğaçar et al. [19] also classified clouds by using super-resolution, semantic segmentation approaches, and binary sailfish optimization methods with deep learning models.
From cloud representation to cloud classification, the spatial format of fully connected (FC) features, as well as the local texture data of shallower convolutional layers, are crucial. Although these studies helped with cloud classification, a reliable cloud classification is not yet possible. As a result, work on an automated method that can accurately classify ground-based cloud images is continuing.
Therefore, we present Cloud-MobiNet, a robust CNN model for ground-based cloud classification that not only provides excellent classification accuracy but is also compact, efficient, portable, and can be used on smartphones. This study focuses on finding a more reliable way of classifying clouds in real-time to curtail the problems associated with employing ceilometers and other traditional methods, such as human cloud observation, which depends on the observer’s training, expertise, and color vision.
The remainder of this study is arranged as follows. Section 2 explains the Cloud-MobiNet model, describes the dataset and the data preprocessing used in the experiment, and then explains the model-training procedure. The experimental results, model-performance assessment, classification report, confusion matrix, and model deployment on a smartphone are then covered in depth in Section 3. The results are discussed in Section 4. Finally, Section 5 provides a summary of the findings.
3. Results
After 4 h and 45 min of training and validation of our Cloud-MobiNet model, the model gave an accuracy of 97.45% and a loss of 0.07624, representing 0.76% of the total.
Figure 5a,b shows a graph of training and validation loss, and training and validation accuracy, respectively.
As shown in Supplementary Document SD1_Cloud-MobiNet_Codes, we tested the model on the test dataset to assess its performance. Table 2 shows the classification report of the classes predicted by the Cloud-MobiNet model. The classification report was generated automatically with the machine-learning library Scikit-Learn, using the simple syntax “from sklearn.metrics import classification_report”. Nevertheless, we explain the mathematical principles behind this classification report, which rest on four outcomes that determine whether a prediction is correct or incorrect.
True negative (tn): the case was negative and was predicted to be negative. True positive (tp): the case was positive and was predicted to be positive. False negative (fn): the case was positive but was predicted to be negative. False positive (fp): the case was negative but was predicted to be positive.
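For a multi-class problem such as cloud classification, these four outcomes are usually counted per class by treating that class as positive and every other class as negative. The sketch below assumes this one-vs-rest convention. The function name and the toy labels are hypothetical and are not taken from the study's results.

```python
def one_vs_rest_counts(y_true, y_pred, label):
    """Count tp, fp, fn, and tn for one class, treating all other classes as negative."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == label and actual == label:
            tp += 1  # positive case predicted as positive
        elif predicted == label:
            fp += 1  # negative case predicted as positive
        elif actual == label:
            fn += 1  # positive case predicted as negative
        else:
            tn += 1  # negative case predicted as negative
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


# Toy labels for illustration only.
toy_true = ["Ns", "St", "Ns", "Cu", "St"]
toy_pred = ["Ns", "Ns", "St", "Cu", "St"]
ns_counts = one_vs_rest_counts(toy_true, toy_pred, "Ns")
```

In this toy case, one Ns image is recognized, one St image is mistaken for Ns, and one Ns image is mistaken for St, so `ns_counts` holds one tp, one fp, one fn, and two tn.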
The precision metric indicates how accurate the positive predictions are. It reflects how rarely the Cloud-MobiNet model labels a negative instance as positive. For each class, it is determined as “the ratio of true positives to the sum of true positives and false positives”, and it is calculated as follows:

$$\text{Precision} = \frac{tp}{tp + fp}$$

Recall is the ability of the Cloud-MobiNet model to find all positive instances. For each class, it is the ratio of true positives to the sum of true positives and false negatives, and it is calculated as follows:

$$\text{Recall} = \frac{tp}{tp + fn}$$
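The F1 score is a weighted harmonic mean of precision and recall, with 1.0 being the best score and 0 the worst. The F1 score is determined as follows:

$$F_{\beta} = \frac{(1 + \beta^{2}) \cdot \text{Precision} \cdot \text{Recall}}{\beta^{2} \cdot \text{Precision} + \text{Recall}}$$

where $\beta$ is taken to be 1.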
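The precision, recall, and F1 definitions above can be checked by hand with a small sketch. The function names are hypothetical, the parameter name `beta` is an assumption for the weighting term that is set to 1, and returning 0.0 when a denominator is zero is a convention chosen here rather than something the text specifies.

```python
def precision(tp: int, fp: int) -> float:
    """Ratio of true positives to all positive predictions."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Ratio of true positives to all actual positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_beta(p: float, r: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision p and recall r; beta = 1 gives the F1 score."""
    denominator = beta ** 2 * p + r
    return (1 + beta ** 2) * p * r / denominator if denominator else 0.0
```

With illustrative counts of 8 true positives, 2 false positives, and 8 false negatives, the precision is 0.8, the recall is 0.5, and the F1 score is 2 × 0.8 × 0.5 / (0.8 + 0.5), or about 0.615.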
Accuracy is calculated as:

$$\text{Accuracy} = \frac{1}{N} \sum_{c=1}^{C} tp_{c}$$

where $tp_{c}$ is the number of tp for class $c$, $C$ is the number of classes, and $N$ is the total number of instances in the dataset.
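The macro average is the simple mean of the scores of all classes, and it is calculated as follows:

$$B_{\text{macro}} = \frac{1}{|L|} \sum_{\lambda \in L} B(tp_{\lambda}, tn_{\lambda}, fp_{\lambda}, fn_{\lambda})$$

The weighted average is the sum of the scores of all classes, each multiplied by its class proportion, and it is calculated as follows:

$$B_{\text{weighted}} = \sum_{\lambda \in L} w_{\lambda} \, B(tp_{\lambda}, tn_{\lambda}, fp_{\lambda}, fn_{\lambda})$$

Here, $L$ is the set of labels, $w_{\lambda}$ is the proportion of instances that belong to label $\lambda$, and $B(tp, tn, fp, fn)$ is a metric evaluated on the numbers of tp, tn, fp, and fn, respectively.
$tp_{\lambda}$, $fp_{\lambda}$, $tn_{\lambda}$, and $fn_{\lambda}$ denote the numbers of tp, fp, tn, and fn after a binary evaluation for a label $\lambda$.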
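The difference between the two averages is easiest to see on a small example. The sketch below assumes per-class scores have already been computed and that each class proportion is the class's share of all instances. The function names, the scores, and the instance counts are hypothetical.

```python
def macro_average(scores: dict) -> float:
    """Simple mean of the per-class scores; every class counts equally."""
    return sum(scores.values()) / len(scores)


def weighted_average(scores: dict, support: dict) -> float:
    """Sum of per-class scores, each multiplied by its class proportion."""
    total = sum(support.values())
    return sum(scores[label] * support[label] / total for label in scores)


# Illustrative per-class scores and instance counts, not results from the study.
toy_scores = {"Ns": 0.5, "St": 1.0}
toy_support = {"Ns": 30, "St": 10}
toy_macro = macro_average(toy_scores)
toy_weighted = weighted_average(toy_scores, toy_support)
```

Here the macro average is 0.75, while the weighted average is 0.625 because the weaker class holds three quarters of the instances. The two agree only when the classes are balanced or the scores are equal.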
Figure 6 shows the confusion matrix of the classes predicted by the Cloud-MobiNet model. The confusion matrix in Figure 6 clearly shows where our model was confused in classifying the clouds. For instance, confusing the cloud types Ns and St is understandable because they are nearly identical in shape, structure, transparency, and arrangement. The only difference that meteorological experts can sometimes use to distinguish them is their height, since Ns is a medium cloud, while St is a low cloud. The non-zero off-diagonal elements (0.1, 0.1, and 0.2) in the confusion matrix represent the model's few misclassifications.
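A confusion matrix of this kind can be built directly from the true and predicted labels. The sketch below is a plain-Python illustration with hypothetical function names and toy labels. It assumes rows index the actual class and columns the predicted class, and it does not reproduce Figure 6.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


def row_normalize(matrix):
    """Divide each row by its total so entries become per-class fractions."""
    return [[count / sum(row) if sum(row) else 0.0 for count in row]
            for row in matrix]


def accuracy(matrix):
    """Sum of the diagonal (correct predictions) over all instances."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


# Toy labels for illustration only.
cm_labels = ["Cu", "Ns", "St"]
cm = confusion_matrix(["Ns", "St", "Ns", "Cu", "St"],
                      ["Ns", "Ns", "St", "Cu", "St"], cm_labels)
```

In the toy matrix, the off-diagonal entries in the Ns and St rows record the Ns/St confusion, and the accuracy is 3 correct predictions out of 5.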
3.1. Cloud-MobiNet Model’s Predictions and Interpretations
A prediction is an array of N numbers, where N represents the number of classes (labels or categories) in the dataset. Each element represents the model's confidence that the image belongs to the corresponding class.
Figure 7 shows the predicted cloud, cloud class, and array.
Based on 11 classes (0 to 10) and the reference sample picture in Figure 7, the model predicted index 0, that is, that the image class was 0, since this index had the highest confidence level of 99.99%. The model's percentage confidence for each class on the image sample provided in Figure 7 is shown in Table 3.
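Reading a prediction therefore amounts to finding the index with the largest confidence. The sketch below illustrates this with a hypothetical 11-element array. The values are invented so that index 0 dominates and are not the values reported in Table 3.

```python
def interpret_prediction(confidences):
    """Return (class_index, confidence_percent) for the most confident class."""
    if not confidences:
        raise ValueError("prediction array is empty")
    best_index = max(range(len(confidences)), key=lambda i: confidences[i])
    return best_index, confidences[best_index] * 100.0


# Hypothetical prediction over 11 classes (indices 0 to 10).
example_prediction = [0.9999] + [0.00001] * 10
predicted_class, confidence_percent = interpret_prediction(example_prediction)
```

The remaining ten entries share the leftover 0.01% of confidence, which is why an image can still show small bars for other classes.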
We randomly reserved 110 samples of the initial 2543 images in the CCSN cloud dataset for testing and then performed the augmentation processes on the remaining 2433 cloud samples, so that the test dataset was unique and not used for modeling. Of the 110 cloud images given to the model, it predicted the classes of 106 correctly.
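Holding out the test images before augmentation keeps augmented copies of a test image out of the training data. A minimal sketch of such a split over image indices is shown below. The function name and the fixed seed are assumptions for illustration, since the text does not describe how the random selection was implemented.

```python
import random


def hold_out_split(n_samples: int, n_test: int, seed: int = 0):
    """Randomly reserve n_test indices for testing before any augmentation."""
    if not 0 <= n_test <= n_samples:
        raise ValueError("n_test must be between 0 and n_samples")
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    indices = list(range(n_samples))
    rng.shuffle(indices)
    test_indices = sorted(indices[:n_test])
    train_indices = sorted(indices[n_test:])
    return train_indices, test_indices


train_indices, test_indices = hold_out_split(2543, 110)

# Test accuracy implied by the counts reported in the text.
test_accuracy = 106 / 110
```

The split leaves 2433 indices for augmentation and training, and 106 correct predictions out of 110 corresponds to a test accuracy of about 96.4%.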
Figure 8 shows some of the images of the clouds predicted by the Cloud-MobiNet model.
Each image has the name of the cloud (class) predicted by the model, the percentage confidence that the model attributed to each of the classes, and the actual cloud's name. The graph beside each image shows how the model predicted the cloud image. A correctly predicted cloud image has a green bar, a green predicted cloud name, and a green actual cloud name. A red bar, a red predicted cloud name, and a red actual cloud name mark the images that the model failed to predict correctly. More than one bar beside an image indicates that the model also assigned some confidence to other classes whose features the image shares.
3.2. Implementation of Cloud-MobiNet Model on Smartphone
Several Android application-development tools can be used to build and publish an Android app, such as the Cloud Prediction app. A software engineer's or developer's choice of a specific Android development tool is determined by their expertise and familiarity with it, as well as how comfortable they are using it. Because the model is not software-specific, it can run on any preferred platform.
The most basic interface-design requirement is to provide users with the option of using the camera of their smartphone to capture real-time images of clouds or to select a cloud image from their storage space.
Figures 9 and 10 show the smartphone cloud-prediction app's progressions and the basic flowchart for implementing a cloud-prediction app on a smartphone using the Cloud-MobiNet model, respectively.
5. Conclusions
Several deep-learning models have now emerged as a result of continued research in the area of artificial intelligence and neural networks. Based on MobileNet’s convolutional neural network model, we propose a highly efficient deep-learning model called Cloud-MobiNet for the classification of ground-based cloud observation images. Cloud-MobiNet promises to be a significant model in the short term, since automated ground-based cloud classification is anticipated to be a preferred means of cloud observation, not only in meteorological analysis and forecasting but also in the aeronautical and aviation industries.
In addition to its compact and portable properties, the Cloud-MobiNet model's strength is that it can be used on mobile phones for real-time cloud classification. Contrary to the claim made by [21] that the CCSN dataset is complicated, Cloud-MobiNet has proven its strength in classifying the CCSN cloud dataset, with a training and validation accuracy of 97.45 percent and an average testing accuracy of 96 percent, which is an improvement on CloudNet. Its misclassifications of Ns and St, as well as of Cc and Cs, do not undermine the model's suitability, since the clouds in each of these pairs are almost indistinguishable in terms of texture, structure, and shape. We believe that with continued training, the model may attain an optimal accuracy of around 99 percent.
Even though the model was run on a regular laptop computer, the study was unusual in that it was not conducted in a controlled setting, nor was it based on camera sensors, nor was it altered or boosted by lighting conditions, as in previous studies undertaken by many experts. All of the resources needed, including energy (power), memory, speed, and disc space, are within the capabilities of any normal smartphone. Because the model is small enough, it was run on a laptop with standard hardware rather than a sophisticated server, making it easier to reproduce on any standard smartphone.
To implement this, the model must be compiled and executed as a mobile app. The cloud image must be accepted as input by the app’s UI. The user may either load the image from a memory source, or utilize the phone’s camera to capture the cloud image in real-time using the mobile app. The app with the model at the backend will process the image and then predict the type of cloud that is captured or loaded from the memory in real-time. We anticipate that after the model has been trained and tested on a variety of cloud images in various lighting conditions and has delivered extremely accurate results, the type of camera utilized by the cell phone will have little or no influence on the predictions.
Although the Cloud-MobiNet model predicts only the type of cloud, the identification of cloud types is the most difficult process in cloud observation for meteorologists. In this process, an incorrect determination of the type of cloud can have a strong influence on weather forecasting. However, meteorologists know the average height of each of the 11 types of clouds and, for this reason, obtaining the correct cloud type helps observers to determine the cloud height. In weather forecasting, the cloud type is examined more critically than the height. It is also easy for meteorologists to determine the cloud cover in the sky at any time. Future work will concentrate on whether the model can feasibly be used to determine both the cloud height and the cloud amount.