1. Introduction
Blowing snow is an extreme weather phenomenon that can reduce road visibility and occasionally cause traffic accidents. The redistribution of snow by the wind increases snow accumulation on leeward slopes and may even raise the risk of avalanches [1,2]. At high latitudes, blowing snow can significantly affect the thermodynamic properties of the atmosphere and the motion of particles in the boundary layer. Blowing snow increases the frictional resistance between particles, which influences the vertical wind speed [3]. Thermodynamically, it can increase the sublimation rate of snow particles. Blowing snow plays an important role in the atmospheric water vapor budget and the ice sheet mass balance through sublimation and transport [4,5]. In Antarctica, strong katabatic winds carry a huge amount of snow from the interior to the coast [6,7]. When the wind speed exceeds the threshold, snow particles begin to saltate and become suspended in the atmosphere, where they eventually sublimate [8]. The sublimation of blowing snow is an important source of water vapor in Antarctica [9]. Blowing snow changes the nature of surface snowpacks through sublimation and transport [10], affecting the surface energy balance. According to statistics, these processes remove 50% to 80% of the snow cover in coastal Antarctica [11]. Importantly, blowing snow can locally lead to the formation of blue ice areas, which have a lower albedo and promote melting. This is likely to affect the stability of ice shelves and may even cause collapse [12].
Blowing snow is currently observed and detected by ground-based and satellite-based remote sensing. Many researchers use a minimum wind speed threshold, which depends on the nature of the snow surface, as the condition for blowing snow. However, the wind speed threshold may differ between seasons. Palm et al. used the backscattering signal of the CALIPSO (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation) satellite together with a wind speed threshold to observe blowing snow in Antarctica [13]. Although satellite data provide large spatial coverage, detection is limited to clear skies and to blowing snow layers thicker than 30 m [13,14]. Moreover, about 90% of blowing snow events occur in cloudy scenes [15]. It is, therefore, vital to conduct ground-based studies with instruments such as lidars and ceilometers [16]. Gossart et al. [17] used the backscattering profile of a ceilometer to detect blowing snow in the polar region but could not distinguish between blowing snow and fog. Loeb and Kennedy pointed out that a wind speed of 3 m/s can be used as a condition for fog dissipation. However, advection fog with wind speeds above 3 m/s is common at McMurdo Station in summer. For this reason, Loeb and Kennedy suggested that an additional 90% relative humidity threshold can be used to separate fog from blowing snow [10]. They therefore added surface meteorology data to the method of Gossart et al. to roughly distinguish blowing snow from fog by setting wind speed, humidity and visibility thresholds. In addition, blowing snow correlates well with surface observation elements such as wind speed and temperature. Baggaley and Hanesiak [18] used wind speed, air temperature and time since the last snowfall as factors to detect 66% of blowing snow events. However, their approach did not consider the influence of fog on blowing snow detection.
This paper uses machine learning to establish a fast, robust and accurate blowing snow and fog detection algorithm based on ceilometer data and the observation elements of surface meteorology systems. We take the results of human observation as ‘true’ and divide the weather phenomena into three categories: blowing snow, fog and other situations (clear sky or snowfall). The ceilometer and surface meteorology data are then used to construct a mapping to the weather phenomenon based on the AdaBoost algorithm. The paper is organized as follows. Section 2 briefly describes the instruments and data used in this paper. Section 3 formulates the model of blowing snow and fog detection. Section 4 evaluates the accuracy of the algorithm under different wind speeds and different relative humidities. Finally, the summary and conclusions are presented in Section 5.
2. Instruments and Data
The U.S. Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) West Antarctic Radiation Experiment (AWARE) field campaign took place in 2016. The full ARM Mobile Facility (AMF2) was deployed at McMurdo Station on Ross Island (77°51′S, 166°40′E) for all of 2016 [19,20]. The site hosted various instruments, including a Vaisala CL31 ceilometer, surface meteorology systems and a particle size and velocity disdrometer (Parsivel2; manufactured by Ott Hydromet GmbH, Kempten, Germany). In addition, the Antarctic Meteorological Research Center (AMRC) at the University of Wisconsin-Madison has provided three-hourly observations made by human observers at McMurdo Station since 1999 [10].
2.1. AMF2 Instruments
2.1.1. Vaisala CL31 Ceilometer
The Vaisala CL31 ceilometer is an all-weather, automatic remote-sensing device operating at a single wavelength (910 ± 10 nm at 25 °C) with a pulse width of 100 ns. The ceilometer receives the signal scattered by cloud, aerosol and blowing snow with an avalanche photodiode receiver. It obtains a vertical backscattering cross-section profile up to 7700 m with 10 m range gates every 16 s [21].
2.1.2. Surface Meteorology Systems
Surface meteorology systems consist mainly of conventional in situ sensors that measure 1 min averages of basic meteorological elements, including vector-averaged wind speed and wind direction at 10 m (Vaisala WS425), temperature and relative humidity at 2 m (Vaisala HMP155) and vapor pressure. In addition, the present weather detector (Vaisala PWD22) provides visibility and precipitation rate. The units of the meteorological elements are given in Table 1 [22].
2.1.3. Parsivel2
Parsivel2 is an optical disdrometer with a laser light source and a photodiode detector [23]. The laser light at 650 nm is spread across a 27 mm by 180 mm horizontal sheet between the light source and the detector. When no particles are present, the voltage signal of the photodiode remains high. When particles pass through the field of view, they cast a shadow that reduces the photodiode output, and this reduction is used to measure the particles [10]. Parsivel2 directly measures particle size and particle number and provides derived parameters such as radar reflectivity and particle number density.
2.2. Human Observation
Human observers report the 3 h average wind speed and direction, temperature, visibility and barometric pressure at an altitude of 10 m. They also report the duration of blowing snow and fog within each 3 h period. The observer on duty each day files these observations into a spreadsheet.
3. Algorithm
In this section, we use the AdaBoost algorithm to construct the mapping relationship between observation elements and weather phenomena. Section 3.1 identifies the training set and the testing set. Section 3.2 selects the best classification features. Section 3.3 summarizes the process of the proposed algorithm. Section 3.4 briefly introduces the blowing snow detection method based on ceilometer and surface meteorology systems data proposed by Loeb and Kennedy.
3.1. Training Set and Testing Set
Human observers record the duration of phenomena such as blowing snow and fog within each 3 h period in Antarctica. However, if blowing snow or fog occurs during only part of a 3 h period, it is impossible to judge whether it was present at the measuring time of the instruments. Figure 1 shows the human observations from 0000 Coordinated Universal Time (UTC) to 1200 UTC on 14 March 2016. Only 1.5 h of blowing snow occurred between 0600 UTC and 0900 UTC, and only 1.8 h of fog occurred between 0900 UTC and 1200 UTC. Therefore, we select only the data for which a specific event lasted the whole 3 h period, for example, the data of blowing snow from 0900 UTC to 1200 UTC and the data of other situations from 0000 UTC to 1200 UTC. It is worth noting that fog may also appear when blowing snow occurs. During training, we classify such data as blowing snow events; only fog without blowing snow is classified as a fog event. Loeb and Kennedy averaged the ceilometer backscattering profiles over 5 min to determine whether blowing snow occurred. To stay consistent with the time resolution of their method, we also average the ceilometer and surface meteorology observations every 5 min. In this paper, we randomly select 75% of the blowing snow, fog and other-situation data from each month as the training set and use the rest as the testing set, as displayed in Table 2. The training set contains 47,073 groups of other situations, 1091 groups of fog and 2646 groups of blowing snow. To balance the data volume, we randomly select 3000 groups from the other-situation data and keep only these in the training set. The fog events are oversampled with SMOTE [24] to double the number of samples (the processed data are shown in parentheses).
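The balancing step can be illustrated with a minimal sketch. It assumes that each sample is a list of floats and uses a simplified, pure-Python variant of SMOTE that interpolates between a fog sample and one of its nearest fog neighbours. The function names `undersample` and `smote_like`, the neighbour count `k`, the toy data and the random seed are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def undersample(samples, n_keep, rng):
    """Randomly keep n_keep samples of the majority class (no replacement)."""
    return rng.sample(samples, n_keep)


def smote_like(samples, n_new, k, rng):
    """Create n_new synthetic samples by interpolating between a sample
    and one of its k nearest neighbours from the same class."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Rank the other samples by Euclidean distance to the base sample.
        neighbours = sorted(
            (s for s in samples if s is not base),
            key=lambda s: math.dist(base, s),
        )[:k]
        partner = rng.choice(neighbours)
        gap = rng.random()  # position on the segment base -> partner
        synthetic.append([b + gap * (p - b) for b, p in zip(base, partner)])
    return synthetic


rng = random.Random(0)
# Toy data with two features per sample (hypothetical values).
other = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(500)]
fog = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(20)]

other_balanced = undersample(other, 60, rng)                  # shrink majority
fog_balanced = fog + smote_like(fog, len(fog), k=5, rng=rng)  # double fog
print(len(other_balanced), len(fog_balanced))
```

Because every synthetic sample lies on a segment between two real fog samples, oversampling does not create values outside the observed range of the fog class.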
3.2. Feature Screening
To further improve the accuracy of the algorithm, reduce the risk of overfitting and shorten the computation time of the model, the inputs need to undergo feature screening [25]. This section calculates the correlation between the backscattered signals of the ceilometer (unit: 10⁻⁴ km⁻¹ sr⁻¹) at 5 m (C1), 15 m (C2), 25 m (C3), 35 m (C4), 45 m (C5), 55 m (C6), 65 m (C7) and 75 m (C8) and the wind speed (W_speed), relative humidity (Rh), visibility (Vb), precipitation rate (Pr), temperature (Temp), vapor pressure (V_pres) and wind direction (W_dir) observed by surface meteorology systems in the three cases, as shown in Figure 2. In all three cases, the backscattering at each height is strongly positively correlated with that at the other heights. In other words, if the backscattering at one height is known, the backscattering at another height can be easily derived, so using all heights adds redundancy to the model. Therefore, we select one low-level (C2) and one high-level (C6) backscattering signal as training features. In addition, the temperature and the water vapor pressure are highly positively correlated, and visibility and precipitation rate are highly negatively correlated. To keep features that are largely independent of each other, the water vapor pressure and the precipitation rate are excluded. Finally, we use C2, C6, W_speed, Rh, Vb, Temp and W_dir as inputs.
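As a rough illustration of correlation-based screening, the sketch below computes Pearson correlations and drops any feature that is strongly correlated with a feature already kept. The greedy rule, the 0.9 threshold and the toy data are assumptions made for this example; in the paper the final features were chosen by inspecting Figure 2, not by an automatic rule.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def screen_features(columns, threshold):
    """Greedy screening: keep a feature only if its |r| with every
    already-kept feature is below the threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


rng = random.Random(1)
temp = [rng.uniform(-30.0, 0.0) for _ in range(200)]
columns = {
    "Temp": temp,
    # Toy vapor pressure that follows temperature closely (hypothetical relation).
    "V_pres": [0.1 * t + 5.0 + rng.gauss(0, 0.05) for t in temp],
    "W_speed": [rng.uniform(0.0, 20.0) for _ in range(200)],
}
print(screen_features(columns, threshold=0.9))
```

In this toy case the vapor pressure column is removed because it carries almost the same information as temperature, which mirrors the reasoning used to exclude V_pres and Pr.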
3.3. Blowing Snow and Fog Detection Algorithm
The main steps of the blowing snow and fog detection algorithm, which is built on the AdaBoost algorithm, are as follows.
(1) Take 1/4 of the data of other situations, fog and blowing snow in the training set as the validation set.
(2) Feed the training data into the AdaBoost algorithm to construct the mapping relationship between the selected elements and weather phenomena. The AdaBoost algorithm uses 50 weak classifiers. Then, classify the validation set with the resulting blowing snow and fog detection model.
(3) Repeat steps 1 and 2 100 times.
(4) Calculate the accuracy on the validation set for each of the 100 models. The accuracies range from 90.17% to 92.61%, which reveals the potential of using machine learning to detect blowing snow and fog.
(5) Select the model with the maximum accuracy as the final blowing snow and fog detection model. The accuracy of the final model on the validation set is 92.61%. The exponential loss function used to evaluate the training loss can be written as [26]:
$$L = \frac{1}{n}\sum_{i=1}^{n}\exp\left(-y_i f(x_i)\right)$$
where y is the human observation and f(x) is the classification result of the model, each taking the value −1 or 1, L denotes the loss value and n represents the number of samples of the corresponding categories in the training set. Figure 3 shows how the loss value changes as the number of iterations increases. In the first five iterations, the loss value decreases rapidly. After about 10 iterations, the loss value remains almost unchanged. In the three pairwise comparisons, the final average training loss is about 0.07.
Figure 3. Variation of the loss value with the number of iterations for other situations (Other) vs. blowing snow (BS), other situations vs. fog and blowing snow vs. fog in the training set.
(6) Use the blowing snow and fog detection algorithm to classify the testing set.
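The procedure above can be sketched in plain Python. The sketch assumes a two-class problem with labels −1 and 1, single-feature threshold stumps as weak classifiers, and toy data; the paper does not state which weak learner or multi-class strategy was used, so these choices and all function names are illustrative only. The numbers of rounds and repetitions are reduced to keep the example fast.

```python
import math
import random


def fit_stump(X, y, w):
    """Return (error, feature, threshold, sign) of the threshold stump with
    the lowest weighted error. It predicts sign if x[feature] > threshold."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (1, -1):
                err = sum(
                    wi for x, yi, wi in zip(X, y, w)
                    if (sign if x[j] > thr else -sign) != yi
                )
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def fit_adaboost(X, y, n_rounds):
    """Discrete AdaBoost for labels in {-1, +1}; returns weighted stumps."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(n_rounds):
        err, j, thr, sign = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, sign))
        # Raise the weight of misclassified samples, then renormalize.
        w = [
            wi * math.exp(-alpha * yi * (sign if x[j] > thr else -sign))
            for x, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, x):
    """Sign of the weighted vote of all stumps."""
    score = sum(a * (s if x[j] > t else -s) for a, j, t, s in model)
    return 1 if score >= 0 else -1


def accuracy(model, X, y):
    return sum(predict(model, x) == yi for x, yi in zip(X, y)) / len(y)


def exp_loss(model, X, y):
    """Mean exponential loss (1/n) * sum(exp(-y * f(x))), f(x) in {-1, 1}."""
    return sum(math.exp(-yi * predict(model, x)) for x, yi in zip(X, y)) / len(y)


rng = random.Random(0)
# Toy data (hypothetical): feature 0 mimics wind speed, feature 1 backscatter.
X = [[rng.gauss(5, 2), rng.gauss(1, 1)] for _ in range(60)] + \
    [[rng.gauss(12, 2), rng.gauss(2, 1)] for _ in range(60)]
y = [-1] * 60 + [1] * 60

best_acc, best_model = -1.0, None
for _ in range(5):  # the paper repeats the split 100 times
    idx = list(range(len(X)))
    rng.shuffle(idx)
    n_val = len(idx) // 4  # 1/4 of the training data for validation
    val, trn = idx[:n_val], idx[n_val:]
    model = fit_adaboost([X[i] for i in trn], [y[i] for i in trn], n_rounds=10)
    acc = accuracy(model, [X[i] for i in val], [y[i] for i in val])
    if acc > best_acc:  # keep the model with the best validation accuracy
        best_acc, best_model = acc, model
print(best_acc, exp_loss(best_model, X, y))
```

Selecting the best of several random splits favours a model that generalizes to held-out data, but the reported validation accuracy is then slightly optimistic, which is why the separate testing set is still needed.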
3.4. Blowing Snow Detection Algorithm Proposed by Loeb and Kennedy (Loeb Method)
First, the original 16 s backscattering profiles are averaged over 5 min to reduce noise. Then, Loeb and Kennedy check whether the ceilometer backscattering signal between 10 and 20 m is greater than the clear sky threshold, which is usually set to 21 × 10⁻⁴ km⁻¹ sr⁻¹ at the AWARE site. The method further compares the backscattered signal at 15 m with the average backscatter within 30–80 m. If the former is larger, the profile is judged as blowing snow. To distinguish between blowing snow and fog, meteorological thresholds are then introduced: provided that the visibility is less than 10 km, the blowing snow event is reclassified as a fog event if the wind speed is less than 3 m/s or the relative humidity is greater than 90%. When the average backscatter within 30–80 m is larger, the profile is judged as cloud or snowfall, but the method cannot distinguish between the two.
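The threshold rules of the Loeb method can be written as a short decision function. This is a sketch based only on the description above, not the original implementation: the function name `loeb_classify`, the dictionary layout of the profile (gate centre height in m mapped to 5 min averaged backscatter in 10⁻⁴ km⁻¹ sr⁻¹), the "clear sky" label for profiles below the threshold and the handling of ties are assumptions.

```python
def loeb_classify(profile, wind_speed, rel_humidity, visibility_km,
                  clear_sky_threshold=21.0):
    """Classify one 5 min averaged ceilometer profile.

    profile: dict {gate centre height in m: backscatter}, where the gate
    centred at 15 m represents the 10-20 m layer.
    wind_speed in m/s, rel_humidity in %, visibility_km in km.
    """
    b15 = profile[15]
    aloft = [b for h, b in profile.items() if 30 <= h <= 80]
    mean_aloft = sum(aloft) / len(aloft)

    if b15 <= clear_sky_threshold:
        return "clear sky"  # assumed label for a signal below the threshold
    if b15 > mean_aloft:
        # Near-surface signal dominates: blowing snow unless the
        # meteorological thresholds indicate fog.
        if visibility_km < 10 and (wind_speed < 3 or rel_humidity > 90):
            return "fog"
        return "blowing snow"
    return "cloud or snowfall"


# Toy profile with a strong near-surface signal (hypothetical values).
profile = {5: 900.0, 15: 400.0, 25: 150.0, 35: 60.0,
           45: 40.0, 55: 30.0, 65: 25.0, 75: 20.0}
print(loeb_classify(profile, wind_speed=12.0, rel_humidity=70.0, visibility_km=2.0))
print(loeb_classify(profile, wind_speed=2.0, rel_humidity=95.0, visibility_km=0.5))
print(loeb_classify(profile, wind_speed=12.0, rel_humidity=95.0, visibility_km=2.0))
```

The third call shows a consequence of the fixed humidity threshold: a strong near-surface signal at 12 m/s is still labelled as fog once the relative humidity exceeds 90%.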
4. Discussion
This section uses the blowing snow and fog detection algorithm derived in Section 3.3 to classify the data in the testing set. We also evaluate the accuracy of the proposed algorithm under different meteorological conditions.
Figure 4 displays the recognition performance of the proposed method and the Loeb method when human observers record fog, blowing snow and other situations. On the testing set, the overall accuracy of both algorithms is about 94%. However, the two algorithms perform very differently for other situations, blowing snow and fog. Compared with the Loeb method, the proposed method greatly improves the recognition of blowing snow and fog, especially fog (the accuracy is improved by nearly 45%). However, when recognizing clear sky or snowfall, the accuracy of the proposed algorithm is reduced slightly, by less than 4%. To better understand the performance of the algorithm under different meteorological conditions, we use the random forest algorithm to calculate the “Gini importance” of each input in the training set, which gives the weights of the different inputs in the different scenes shown in Figure 5 [27]. The ceilometer backscattering plays a vital role in distinguishing blowing snow from other situations and fog from other situations. However, it is difficult to distinguish between blowing snow and fog based on backscattered signals alone. Adding wind speed markedly improves the ability to distinguish between blowing snow and fog events. Furthermore, the Loeb method assumes that fog occurs when the relative humidity is greater than 90%, which may cause fog events with a relative humidity below 90% to go unrecognized. In the following sections, we evaluate the accuracy of the proposed algorithm under different wind speed and relative humidity conditions, respectively.
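Two classifiers can reach a similar overall accuracy while differing widely per class when one class dominates the testing set. The sketch below makes this concrete with invented labels; the class counts and predictions are hypothetical and are not the values behind Figure 4.

```python
from collections import Counter


def overall_accuracy(truth, pred):
    """Fraction of all samples that are classified correctly."""
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


def per_class_rate(truth, pred):
    """Fraction of the samples of each true class that are recognized."""
    totals = Counter(truth)
    hits = Counter(t for t, p in zip(truth, pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


# Hypothetical testing set dominated by "other" situations.
truth = ["other"] * 90 + ["bs"] * 6 + ["fog"] * 4
# Classifier A: a few false alarms in "other", good on the rare classes.
pred_a = (["other"] * 88 + ["fog"] * 2 + ["bs"] * 5 + ["other"]
          + ["fog"] * 3 + ["bs"])
# Classifier B: perfect on "other", weak on the rare classes.
pred_b = ["other"] * 90 + ["bs"] * 4 + ["other"] * 2 + ["fog"] + ["other"] * 3

for name, pred in (("A", pred_a), ("B", pred_b)):
    print(name, overall_accuracy(truth, pred), per_class_rate(truth, pred))
```

Here both overall accuracies are within one percentage point of each other, yet classifier B recognizes only a quarter of the fog samples, which is why the per-class comparison is the more informative one.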
4.1. Evaluation under Different Wind Speeds
The occurrence of blowing snow is closely related to the wind. Figure 6a shows the distribution of other situations, blowing snow and fog under different wind speeds. Blowing snow mostly occurs when the wind speed is greater than 6 m/s, and its frequency is highest at 11 m/s. Fog and other situations have similar distributions across wind speeds, and almost all of them occur below 10 m/s. Figure 6b–d shows the accuracy of the two algorithms in classifying the three scenes under different wind speeds.
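An evaluation of this kind amounts to grouping the testing samples into wind speed bins and computing the accuracy within each bin. The following sketch shows one way to do this; the bin edges, the function name `accuracy_by_bin` and the sample values are assumptions chosen for illustration.

```python
def accuracy_by_bin(values, truth, pred, edges):
    """Accuracy within each half-open bin [edges[i], edges[i+1]).

    Returns None for bins that contain no samples."""
    n_bins = len(edges) - 1
    hits = [0] * n_bins
    counts = [0] * n_bins
    for v, t, p in zip(values, truth, pred):
        for i in range(n_bins):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                hits[i] += (t == p)
                break
    return [h / c if c else None for h, c in zip(hits, counts)]


# Hypothetical wind speeds (m/s), observer labels and predicted labels.
wind = [2, 4, 7, 9, 11, 13, 15]
truth = ["fog", "other", "bs", "bs", "bs", "bs", "other"]
pred = ["fog", "fog", "other", "bs", "bs", "bs", "bs"]

print(accuracy_by_bin(wind, truth, pred, edges=[0, 6, 12, 18]))
```

The same function can be applied to relative humidity by passing humidity values and suitable bin edges in place of the wind speeds.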
The recognition ability of the proposed algorithm for other situations decreases significantly at high wind speed, especially above 12 m/s. One possible reason is that wind speed contributes more to the proposed algorithm than the ceilometer backscattering, which leads the algorithm to misclassify these cases. Another reason is that the probability of other situations with wind speed above 12 m/s is only 1%, far lower than that of blowing snow. Blowing snow samples dominate the training set at high wind speed, which leads to misjudgment of other situations. Meanwhile, the accuracy of the proposed algorithm is slightly lower than that of the Loeb method in the classification of other situations when the wind speed is below 10 m/s. Among the misjudged data, 86% are misjudged as fog events and 14% as blowing snow. The distributions of fog and other situations are similar across wind speeds, which makes the two harder to separate. In addition, we analyze how heavy snow under low wind speed interferes with the proposed algorithm. A case study is displayed in Figure 7. From 0600 UTC to 0900 UTC on 6 December, human observers recorded snowfall, and the wind speed during this period was about 5 m/s. The low-level backscattering signal of the ceilometer in Figure 7a is very strong, and Parsivel2 in Figure 7c shows a very high number density, indicating intense snowfall at this time. Based on the shape of the ceilometer backscattering profile at 0712 UTC, both the proposed algorithm and the Loeb method judge the heavy snow as blowing snow. Hence, the accuracy of the ceilometer in detecting blowing snow is affected by heavy snow.
The proposed algorithm can recognize almost all blowing snow events with wind speed above 10 m/s, whereas the recognition rate of the Loeb method drops significantly for such events. In addition, Figure 6a shows that blowing snow with wind speed above 10 m/s accounts for about 60% of all blowing snow events. The proposed algorithm therefore performs well for blowing snow events accompanied by strong winds, which shows that wind speed plays an important auxiliary role in detecting blowing snow with a ceilometer. However, the proposed algorithm loses accuracy for blowing snow at low wind speed. One reason is a limitation of the ceilometer backscattering profile itself, as illustrated in Figure 8. Between 1912 UTC and 1924 UTC on 1 September, the AWARE site experienced blowing snow with wind speeds of about 5 m/s. Parsivel2 shows that the particle number density exceeds 2⁸ m⁻³ mm⁻¹, so the blowing snow is intense. Although the ceilometer shows a large scattering signal at low levels, the backscattering signal at 15 m is lower than the average backscattering signal between 30 and 80 m (Figure 8b). This leads both the proposed algorithm and the Loeb method to misjudge blowing snow as other situations. Another reason is that low wind speed reduces the discrimination between blowing snow and fog. Figure 5c shows that wind speed is an important feature for distinguishing between blowing snow and fog. Among the misjudged samples under low wind speed, 56% are misjudged as fog events. As a result, the accuracy of the proposed algorithm for blowing snow detection at low wind speed is slightly lower than that of the Loeb method. However, blowing snow with low wind speed is not common in Antarctica.
It is worth noting that, although the proposed algorithm is less able to recognize fog under high wind speed, its fog recognition ability is much greater than that of the Loeb method at any wind speed, which is a major advantage of the proposed algorithm.
4.2. Evaluation under Different Relative Humidities
As shown in Figure 9a, only about 30% of fog events occur when the relative humidity is above 90%. Consequently, the 90% relative humidity threshold of the Loeb method misses a large number of fog events. The relative humidity has little effect on the recognition of other situations, and the accuracy of the two algorithms is similar. However, for detecting blowing snow or fog, the two algorithms differ greatly. Figure 9c,d shows that the detection ability of the proposed algorithm for blowing snow and fog is significantly higher than that of the Loeb method in every relative humidity range. Compared with simple thresholds, machine learning distinguishes between blowing snow and fog more accurately. Notably, the Loeb method can hardly recognize blowing snow events when the relative humidity is above 90%, whereas the accuracy of the proposed algorithm is close to 100%. The Loeb method uses a relative humidity of 90% as the criterion to distinguish between advection fog and blowing snow, so it judges a blowing snow event with relative humidity above 90% as a fog event. Therefore, the Loeb method is not applicable to the detection of blowing snow with relative humidity above 90%. The proposed algorithm, by contrast, is not limited by high humidity. This is due to the significant difference in wind speed between blowing snow and fog events under high humidity (Figure 10). At McMurdo Station, the wind speed of blowing snow under high humidity is almost always greater than 7 m/s, while fog most frequently occurs at wind speeds below 7 m/s. As a result, the proposed algorithm can accurately identify blowing snow and fog events under high humidity.
5. Summary and Conclusions
This paper uses the AdaBoost algorithm to construct a detection algorithm for blowing snow and fog based on ceilometer and surface meteorology data. Introducing meteorological parameters addresses the difficulty of distinguishing between blowing snow and fog with a ceilometer alone. Compared with the Loeb method, the proposed algorithm greatly improves the detection accuracy for blowing snow and fog: it detects 89.12% of blowing snow events and 76.10% of fog events, while the Loeb method identifies only 64.29% of blowing snow events and 31.87% of fog events. The proposed algorithm can recognize almost all blowing snow events under high wind speed. Moreover, unlike the Loeb method, which sets a humidity threshold to distinguish between blowing snow and fog, the proposed algorithm is not affected by high humidity, because the wind speed differs significantly between blowing snow and fog events. In addition, the fog detection performance of the proposed algorithm is better than that of the Loeb method at any wind speed or relative humidity. These are important advantages of the proposed algorithm and reveal the potential of machine learning for blowing snow and fog detection.
Two limitations remain. The first is the reduced ability to detect other situations under high wind speed and blowing snow events under low wind speed. These situations are uncommon in Antarctica, so the training set contains few corresponding samples. Furthermore, heavy snow also degrades the accuracy of the proposed algorithm. The second limitation is that the proposed algorithm only detects fog and blowing snow and cannot calculate the height of the blowing snow layer or the thickness of the fog. Future work will therefore use ceilometer and surface meteorology data to derive the height of the blowing snow layer and the thickness of the fog.