Article

A Method for Retrieving Cloud-Top Height Based on a Machine Learning Model Using the Himawari-8 Combined with Near Infrared Data

College of Meteorology and Oceanography, National University of Defense Technology, Changsha 410073, China
Remote Sens. 2022, 14(24), 6367; https://doi.org/10.3390/rs14246367
Submission received: 10 November 2022 / Revised: 14 December 2022 / Accepted: 14 December 2022 / Published: 16 December 2022

Abstract

Clouds with different cloud-top heights (CTHs) heat the atmosphere to different degrees, and CTH is an important factor for weather forecasting and aviation safety. The Advanced Himawari Imager (AHI) on the Himawari-8 satellite is a new-generation visible and infrared imaging spectrometer characterized by a wide observation range and a high temporal resolution. In this paper, a CTH retrieval algorithm based on XGBoost is proposed. The algorithm comprehensively utilizes AHI L1 multi-channel radiance data and calculates the model input parameters according to the cloud phase, the cloud texture, and the local brightness temperature change of the cloud. In addition, the latitude, longitude, solar zenith angle and satellite zenith angle are input into the model to further constrain the influence of geographical and spatial factors, such as the sea and land location, on CTH. Compared with the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) cloud-top height data (CTHCAL), the cloud-top height retrieved by the algorithm (CTHXGB) has a mean error (ME) of 0.3 km, a standard deviation (Std) of 1.72 km, and a root mean square error (RMSE) of 1.74 km. The algorithm also reduces the large systematic deviation in the cloud-top height products released by the Japan Meteorological Agency (CTHJMA), especially for ice clouds and multi-layer clouds with ice clouds on the top layer. For water clouds below 2 km and multi-layer clouds with water clouds at the top, the algorithm likewise mitigates the serious systematic bias of CTHJMA. XGBoost can effectively distinguish between different cloud scenarios within the model, which makes it robust and suitable for CTH retrieval.

1. Introduction

Clouds play important roles in balancing the radiation of the Earth atmosphere system and in climate change. They are regulators that affect the radiation budget at the top of the atmosphere and the Earth’s surface [1,2]. The radiative effect of clouds is one of the biggest uncertainties when evaluating future climate change [3]. There are differences in the degree of atmospheric heating caused by clouds at different heights [4]. In addition, cloud-top height (CTH) is also an important parameter in the fields of weather forecasting, weather modification and aviation safety [5]. The accurate measuring of CTH is very important for climate change predictions [6,7].
At present, the measurement of CTH is mainly based on ground observations and satellite observations [8,9,10]. Ground-based remote sensing can continuously observe CTH, but it can only be observed at fixed points, and the observation space is small [11]. Meteorological satellites have extensive observation coverage, which can generate large-scale CTH observations and provide observation data over the ocean and polar regions without the need for ground stations. Therefore, satellite CTH measurements have become an important method. The remote sensing of CTH using satellites mainly includes active remote sensing and passive remote sensing. Active sensors such as Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) mounted on CALIPSO and Cloud Profiling Radar (CPR) mounted on CloudSat can obtain the vertical distribution of clouds [12,13]. Both systems have global observation capabilities, but the time resolution is low, and the field of view is small, making it difficult to continuously observe a certain area for a long time. In contrast, CTH remote sensing using geostationary meteorological satellite imagers enables continuous observations over a large, fixed area.
Geostationary meteorological satellites of the new generation, such as FY-4A, Himawari-8, GOES-R, and MSG, provide CTH products, but their accuracy needs to be improved. The CTH algorithm released for FY-4A uses a combination of brightness temperatures in the 10.8 µm, 12.0 µm and 13.5 µm bands to retrieve the CTH [14]. This method is similar to the CTH retrieval method published for GOES-R, which uses the CO2 slice method combined with the optimal estimation method to retrieve CTH [15,16,17]. Compared with CloudSat and CALIPSO, the standard deviation (Std) of this method is about 2 km [18]. The CTH retrieval algorithm released for Himawari-8 uses the 11 µm, 12 µm, and 13.5 µm channel data to obtain the CTH with the Integrated Cloud Analysis System (ICAS), for which the root mean square error (RMSE) of single-layer clouds is 2.1 km [19]. Multi-channel imagers can retrieve CTH using thermal infrared imaging, CO2 slices, and one-dimensional variational (1DVAR) methods [20,21,22]. Most CTH retrieval methods for passive sensors involve radiative transfer models (RTMs), which usually suffer from large uncertainties in cloudy skies [23,24], and their CTH accuracy is limited for optically thin or broken clouds [25]. Several previous studies pointed to significant bias in CTH measured by passive sensors [26,27,28]. Some scholars have worked on reducing the deviation of CTH products. Although good results have been achieved, systematic deviations still exist [29,30].
To date, the CTH retrieval algorithms of the above-mentioned multi-channel imagers have not taken full advantage of the multiple channels and the high spatial resolution, and use only a few thermal infrared channels. In addition to the thermal infrared band, the near-infrared band can also be used to retrieve cloud parameters [31,32]. Palmer analyzed the optical properties of water clouds in the near-infrared band in 1974 [33]. Pilewskie used the reflectivity of the 1.65 µm channel for cloud-phase identification and achieved good results [34]. This shows that the 1.65 µm channel can effectively separate the information of water clouds and ice clouds, and CTH is closely related to the phase state of the cloud. Using near-infrared channels such as 1.65 µm to distinguish the characteristics of water clouds and ice clouds can therefore help improve the accuracy of CTH retrieval. The CO2 slice method retrieves each pixel observed by the imager independently, without considering the cloud information contained in the adjacent pixels. The comprehensive use of neighboring pixels can reflect the heterogeneity and scale of clouds. This paper builds a multi-channel radiance CTH retrieval algorithm based on XGBoost. The algorithm uses seven channels of AHI data in total. Among them, the 1.6 µm channel reflectivity is used as one of the input parameters of the model in order to establish the relationship between the cloud-phase state and the CTH within the model. At the same time, this paper extracts information from the neighboring pixels of the cloud as model input variables, which makes it possible to better describe cloud uniformity, scale, and other characteristics. These features have a positive effect on CTH retrieval.
In recent years, the continuous development of retrieval algorithms based on machine learning has provided a new solution for the retrieval and prediction of meteorological elements [35,36,37]. For example, Min proposed a machine-learning-based CTH retrieval algorithm; four machine learning models were trained and compared with traditional physical algorithms, and all four models retrieved CTH with improved accuracy [28]. Recently, Wang proposed an algorithm to retrieve CTH using a deep neural network (DNN) model, which takes as input not only the brightness temperature data of the 8.6 µm, 10.4 µm, 12.4 µm and 13.3 µm channels of AHI, but also the vertical temperature profile, surface elevation, and other geographic information parameters. Compared with the joint CALIPSO/CloudSat data, the mean error (ME) of the retrieved CTH is −0.13 km, and the RMSE is 3.37 km [38]. The phase state of clouds is closely linked to CTH, but current neural network and machine learning algorithms for CTH retrieval select only the thermal infrared channels of AHI, although NIR can distinguish the different phase states of clouds more effectively.
In this paper, we propose a CTH retrieval method that combines NIR channels with the XGBoost model and can be applied to the observation data of the Himawari-8 satellite. In this method, the multi-channel radiance data of AHI, the spatially inhomogeneous cloud feature parameters, and the corresponding geographic information parameters are used as the input variables of the model. The CTH extracted from the CALIPSO profile data matched in time and space is used as the output variable to train the XGBoost model, which yields the XGBoost-based CTH retrieval algorithm. The method uses feature parameters derived from NIR and from neighboring pixels to characterize clouds, which allows the model to differentiate between cloud types effectively and has a positive effect on CTH retrieval.

2. Method

2.1. Data

This paper uses the AHI L1 full-disk radiance data of the Himawari-8 satellite, L2 CTH product data and CALIOP 5 km cloud layer product data (Level 2 Clayer) of the CALIPSO satellite.
The Himawari-8 satellite, located near 140.7°E above the equator, is a new-generation Japanese geostationary meteorological satellite. The satellite was launched on 7 October 2014, and has been operational since 7 July 2015. The Advanced Himawari Imager (AHI) payload it carries has a total of 16 channels, including 3 channels for visible light (VIS), 3 channels for near-infrared (NIR), and 10 channels for infrared (IR). The AHI obtains full-disk observations in all 16 channels every 10 min. The Himawari-8 AHI L1 full-disk data have a spatial resolution of 5 km × 5 km and cover the regions from 60°N to 60°S and 80°E to 160°W [39]. The AHI L2 CTH data are included in the CLP (Cloud Parameters) product data file provided by the Japan Meteorological Agency (JMA) with a spatial resolution of 5 km × 5 km. The AHI L2 CTH retrieval algorithm is roughly the same as the Integrated Cloud Analysis System (ICAS) algorithm, using AHI’s 11 µm, 12 µm, and 13.5 µm channel data. The ICAS algorithm, developed by Iwabuchi in 2016, comprehensively utilizes the CO2 slicing method, the infrared split window method, and the optimal estimation (OE) algorithm [40]. In 2018, Iwabuchi applied the ICAS algorithm to AHI and demonstrated its estimation accuracy for CTH [19]. At present, the Himawari-8 satellite only provides CTH data in the daytime, so this paper only verifies the retrieval algorithm results in the daytime.
The Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) satellite was launched in 2006 by the National Aeronautics and Space Administration (NASA) and the Centre National d’Etudes Spatiales (CNES) [41]. It belongs to the A-Train satellite constellation [42]. The main payload of the CALIPSO satellite is the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP). CALIOP has three channels: a 1064 nm channel and two 532 nm channels that measure the orthogonally polarized components, one parallel and the other perpendicular. CALIOP products mainly include Level 1B, Level 2 Profile, Level 2 VFM, and Level 2 Clayer/Alayer data, which provide cloud and aerosol type and location information. This paper uses the CTH in the CALIOP 5 km cloud layer product data (Level 2 Clayer) as the ground truth for model training and validation. Considering that the temporal resolution of the AHI full-disk image is 10 min and the spatial resolution of L2 CTH is 5 km, a CALIOP profile and an AHI pixel are considered a space-time match if their observation times differ by no more than 10 min and their spatial distance does not exceed 5 km. The green line shown in Figure 1 marks the space-time matching points between CALIOP and AHI. The background in the figure is the brightness temperature of the 11th channel (BT8.6 µm) of AHI at 05:10 on 1 January 2019, and the green line is the ground track of CALIOP during the period of 05:00~05:10 on that day.
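The space-time matching rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the record layout (dicts with `time_min`, `lat`, `lon`), the function names, and the use of the haversine great-circle distance with a mean Earth radius are all assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_space_time_match(profile, pixel, max_minutes=10.0, max_km=5.0):
    """Return True if a CALIOP profile and an AHI pixel satisfy both limits.

    `profile` and `pixel` are hypothetical dicts with the keys 'time_min'
    (minutes since a common epoch), 'lat' and 'lon' (degrees).
    """
    dt = abs(profile["time_min"] - pixel["time_min"])
    dist = haversine_km(profile["lat"], profile["lon"],
                        pixel["lat"], pixel["lon"])
    return dt <= max_minutes and dist <= max_km
```

In practice the test would be applied between each CALIOP profile and the nearest AHI pixel of the closest full-disk scan; how the nearest pixel is located is not specified in the text and is left out here.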

2.2. Retrieval Algorithm

This paper designs a CTH retrieval algorithm based on the eXtreme Gradient Boosting (XGBoost) model. XGBoost is an ensemble learning algorithm based on gradient boosting. Its principle is to achieve an accurate fit through the iterative calculation of weak estimators. The model is suitable for nonlinear fitting [43], and CTH retrieval is essentially a multi-parameter nonlinear fitting problem. XGBoost consists of an ensemble algorithm, weak estimators, and an application module, of which the ensemble algorithm and the weak estimators are the core [44]. The ensemble algorithm builds multiple weak estimators and aggregates their results to obtain better regression or classification performance than a single model. Unlike random forest (RF), which builds multiple independent weak estimators in parallel, XGBoost builds weak estimators one by one and accumulates them over multiple iterations [45]. A certain number of samples are randomly selected from the total sample set to form a training set, which is used to train a weak estimator based on the tree model. Then, the results of this weak estimator are evaluated, and the samples with large deviations are marked. After that, random sampling with replacement from the total sample set continues, with the marked samples having a higher probability of being selected, to form a new training set from which the second weak estimator is generated. After many iterations, an ensemble of multiple weak estimators is obtained, thereby generating the final XGBoost model.
The XGBoost model is developed from the Gradient Boosting Decision Tree (GBDT) and improves on the GBDT model in several ways. Firstly, XGBoost’s weak estimator is not limited to a tree model but can also be a linear model. Secondly, while GBDT uses only first-order derivative information in its optimization, XGBoost performs a second-order Taylor expansion of the cost function, which not only speeds up the convergence of the model during training but also retains more information about the objective function, which is useful for improving accuracy. Thirdly, XGBoost is more robust because it adds a strategy to handle features with missing values automatically. Finally, XGBoost supports the parallel processing of features, which is more computationally efficient. Therefore, compared to GBDT and the random forest model, XGBoost is more suitable for retrieving CTH using imagers.
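To make the sequential, second-order idea concrete, the following toy sketch builds one-split trees (stumps) one after another. It is not the xgboost library and not the model trained in this paper: it assumes a squared-error loss, a single input feature, and an L2 penalty `lam` on the leaf weights, and it fits each stump to the first derivatives (`grad`) and second derivatives (`hess`) of the loss rather than resampling the training set. All function and parameter names are hypothetical.

```python
def fit_stump(x, grad, hess, lam=1.0):
    """Pick the threshold on x that maximizes the second-order gain.

    Returns (threshold, left_leaf_weight, right_leaf_weight).
    """
    def weight(g, h):
        return -g / (h + lam)      # leaf weight from a Newton step

    def score(g, h):
        return g * g / (h + lam)   # loss reduction credited to a leaf

    order = sorted(range(len(x)), key=lambda i: x[i])
    g_all, h_all = sum(grad), sum(hess)
    best_gain, best = float("-inf"), None
    g_left = h_left = 0.0
    for pos in range(len(order) - 1):
        i, j = order[pos], order[pos + 1]
        g_left += grad[i]
        h_left += hess[i]
        if x[i] == x[j]:
            continue               # no threshold fits between equal values
        g_right, h_right = g_all - g_left, h_all - h_left
        gain = (score(g_left, h_left) + score(g_right, h_right)
                - score(g_all, h_all))
        if gain > best_gain:
            best_gain = gain
            best = (0.5 * (x[i] + x[j]),
                    weight(g_left, h_left), weight(g_right, h_right))
    if best is None:               # all x identical: one leaf for everything
        w = weight(g_all, h_all)
        best = (x[0], w, w)
    return best


def boost(x, y, n_rounds=50, learning_rate=0.3, lam=1.0):
    """Add stumps one by one; each corrects the current prediction."""
    pred = [0.0] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # Squared-error loss 0.5 * (pred - y)^2: gradient pred - y, hessian 1.
        grad = [p - t for p, t in zip(pred, y)]
        hess = [1.0] * len(y)
        thr, w_left, w_right = fit_stump(x, grad, hess, lam)
        stumps.append((thr, w_left, w_right))
        pred = [p + learning_rate * (w_left if xi < thr else w_right)
                for p, xi in zip(pred, x)]
    return stumps, pred
```

For example, `boost([1, 2, 3, 4, 5, 6], [1, 1, 1, 5, 5, 5])` first splits at 3.5, and the accumulated prediction approaches the targets as rounds are added. A production model would use multi-feature trees of greater depth with tuned hyperparameters.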

2.3. Model Input Parameters

AHI has a total of 16 channels. In this paper, the radiance data of the 5th, 7th, 10th, 11th, 14th, 15th, and 16th channels of AHI are selected as the basic parameters. Table 1 shows each channel band and its corresponding physical characteristics. The near-infrared band of 1.6 µm is a weak absorption band of water vapor, and this channel can be used to identify cirrus clouds [34]. The absorption difference between ice clouds and water clouds is obvious in the vicinity of the 3.9 µm band [32]. The 7.3 µm and 8.6 µm bands are the water vapor absorption channels. These channels are useful for distinguishing between ice and water clouds. The brightness temperature at 11 µm or 12 µm (BT11 or BT12) is the basic variable for retrieving the CTH of thick clouds, because it is close to the cloud-top temperature of thick clouds [46]. In the presence of optically thin clouds, the brightness temperature difference between 11 µm and 12 µm reflects the transparency of the cloud [47]. The 13.3 µm band is the CO2 absorption channel, which helps to improve the CTH retrieval accuracy of high clouds.
The input parameters include the single-channel reflectance/brightness temperature (R1.6, BT7.3, BT11.2, BT13.3); the two-channel brightness temperature differences (BT11.2-BT12.3, BT8.6-BT12.3, BT7.3-BT12.3, BT13.3-BT12.3); the Std of the reflectivity/brightness temperature (R1.6, BT3.9, BT11.2) and the Std of the brightness temperature differences (BT11.2-BT12.3, BT11.2-BT3.9) within a 5 × 5 pixel grid; the difference between the brightness temperature of the 11.2 µm channel and the warmest/coldest brightness temperature of that channel within the 5 × 5 pixel range (BT11.2-BT11.2 W, BT11.2-BT11.2 C); and the differences between the warmest/coldest brightness temperatures within the 5 × 5 pixel range (BT12.3 W-BT11.2 W, BT12.3 C-BT11.2 C, BT11.2 W-BT3.9 W, BT11.2 C-BT3.9 C). These parameters are listed in Table 2. Brightness temperature differences can not only distinguish optically thin clouds but also identify the pixels at the edge of the cloud. The Std of the brightness temperature and that of the brightness temperature difference reflect the texture information of clouds. The difference between the warmest and the coldest brightness temperature among the neighboring pixels represents the gradient of the brightness temperature change in the cloud area, which in turn distinguishes the uniformity and the scale of the cloud. In addition, Hamada argued that constraints such as latitude, satellite zenith angle, and season could reduce the error of the imager in retrieving CTH [48]. Håkansson analyzed the relationship between the CTH bias retrieved by different imagers and the satellite observation zenith angle and found that the CTH bias was proportional to the satellite observation zenith angle [37]. Different seasons affect the height intervals of clouds with different phases in the atmosphere at different latitudes, which can make the correlation between CTH and cloud phase more ambiguous. Different models need to be trained for different seasons to exclude this interference, which is why only winter samples are selected for the experiments in this paper. Latitude can reflect the seasonal changes in the subsolar point and, combined with longitude, it can indirectly reflect the influence of sea and land differences on CTH. In this paper, the solar zenith angle (SZA), the satellite observation zenith angle (viewing zenith angle, VZA), and the longitude and latitude are also used as input parameters of the model.
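The neighborhood features described above can be illustrated with a short sketch for one channel pair. It is an assumption-laden example rather than the authors' code: the images are plain nested lists of brightness temperatures, the window is clipped at the image edges, the population standard deviation is used, and the "warmest/coldest" terms are taken as the maximum/minimum of each channel within the window. The dictionary keys are hypothetical names that mirror the Table 2 notation.

```python
import statistics


def window(grid, row, col, half=2):
    """Values in the (2*half+1) x (2*half+1) window centred on (row, col).

    The window is clipped at the image edges (an assumption of this sketch).
    """
    rows = range(max(0, row - half), min(len(grid), row + half + 1))
    cols = range(max(0, col - half), min(len(grid[0]), col + half + 1))
    return [grid[r][c] for r in rows for c in cols]


def neighbourhood_features(bt11, bt12, row, col):
    """Texture and gradient features for one pixel from two BT images."""
    w11 = window(bt11, row, col)
    w12 = window(bt12, row, col)
    diff = [a - b for a, b in zip(w11, w12)]
    centre = bt11[row][col]
    return {
        "std_bt11": statistics.pstdev(w11),               # texture of BT11.2
        "std_bt11_minus_bt12": statistics.pstdev(diff),   # texture of the BTD
        "bt11_minus_warmest": centre - max(w11),          # BT11.2 - BT11.2 W
        "bt11_minus_coldest": centre - min(w11),          # BT11.2 - BT11.2 C
        "warmest_bt12_minus_warmest_bt11": max(w12) - max(w11),
        "coldest_bt12_minus_coldest_bt11": min(w12) - min(w11),
    }
```

The same pattern extends to the other channel pairs (for example 11.2 µm and 3.9 µm) by passing different images.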
The XGBoost algorithm can generate an importance score for each input feature and rank the features accordingly. As shown in Figure 2, (BT3.9)text and BT11.2-BT11.2 C are the two most important input variables. (BT3.9)text can effectively distinguish the phase types of clouds and their textural characteristics in the 5 × 5 pixel range. BT11.2-BT11.2 C responds to changes in cloud optical thickness in the 5 × 5 pixel range. First, the importance scores are greater than 5% for every input variable involving NIR, which indicates the importance of using NIR to improve the accuracy of CTH retrieval in this paper. Second, the sum of the contributions of the variables that use neighboring pixel information is greater than that of the variables that do not. This indicates that information from individual pixels alone is not sufficient for CTH retrieval. Finally, the importance scores of the input features do not differ much from each other, and each input feature has a certain level of importance.

2.4. Model Training Method

The method of CTH retrieval based on the XGBoost model in this paper uses AHI L1 radiance data and CALIOP L2 cloud data for training and testing. The specific training scheme is shown in Figure 3. We selected the AHI L1 and CALIOP L2 cloud-layer data from January, February, and December 2019 as the data set for model training. The aim of building a winter-only training set is to constrain the season and obtain more accurate results. The CTHs of the CALIOP cloud data are used as the training target. If the cloud thickness of the top layer is less than 20 m, the matching point is eliminated from the training set. A total of 87,315 matching points were used to train the XGBoost model. The parameters of the training model were tuned using 5-fold cross-validation. In this method, all the training set samples are split equally and randomly into 5 parts; in each round, 4 parts are used to train a model and the remaining part is used to test it, which gives a set of prediction results. This scheme yields 5 models and 5 sets of predictions. The average of these 5 sets of prediction results reflects the effect of the model parameters. The XGBoost model for the AHI retrieval of CTH was obtained by training with the tuned parameters.
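The 5-fold scheme can be sketched in a few lines. This is a generic illustration under stated assumptions, not the authors' tuning code: the `fit`, `predict`, and `score` callables are hypothetical placeholders for whatever model and metric are being tuned, and the random seed is arbitrary.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for k random, near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def cross_val_score(x, y, fit, predict, score, k=5, seed=0):
    """Average the test score over k folds for one parameter setting."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(y), k, seed):
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict(model, [x[i] for i in test_idx])
        scores.append(score([y[i] for i in test_idx], preds))
    return sum(scores) / len(scores)
```

With the 87,315 matching points mentioned above, each of the 5 folds would hold 17,463 samples. Candidate parameter settings are then compared by their averaged score.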

3. Result

3.1. Case Analysis

Figure 4 shows the Himawari-8 L2 CTH (CTHJMA) and the CTH retrieved in this paper (CTHXGB) at 05:50 (UTC) on 1 January 2021. It can be seen from the figure that the distributions of CTHJMA and CTHXGB are broadly similar. High clouds are found in areas south of Taiwan and over Xinjiang and Tibet (left in the picture), and low clouds are generally predominant in China’s coastal areas and central and southern areas.
Figure 5 shows the comparison results of the CTH products from 05:00 to 06:00 (UTC) on 1 January 2021. In the figure, the pink dots represent the CTH observed by CALIOP (CTHCAL), the green dots represent CTHJMA, and the blue dots represent CTHXGB. Both CTHJMA and CTHXGB are highly consistent with CTHCAL. When the CTH is greater than 10 km, CTHJMA is generally too low, whereas CTHXGB does not show this behavior. The reason is that CTHXGB is based on an XGBoost model trained with CTHCAL data, in which different types of clouds can be effectively distinguished, thus improving the retrieval accuracy of the CTH of high-level clouds. There is an overestimation of CTHJMA in areas above 20° latitude.

3.2. Statistic Analysis

Using the established XGBoost model, the cloud-top heights for 2021.01.08~2021.01.10, 2021.02.05~2021.02.07, and 2021.12.02~2021.12.04 were retrieved and matched temporally and spatially with CALIOP, giving a total of 18,259 matching data points. Figure 6 is a scatter plot comparing CTHJMA/CTHXGB with CTHCAL. Figure 8 shows a comparison chart of the ME, Std, RMSE, and correlation coefficients. It can be seen from Figure 6a that the slope of the fitted line is significantly smaller than that of the reference line, which indicates that above about 2.5 km, the systematic underestimation of CTHJMA becomes more obvious as the CTH increases. Below about 2.5 km, CTHJMA is partly overestimated. The reason is that the brightness temperature observed by the CO2 channel is the radiation brightness temperature of the entire atmospheric column. When the CTH reaches more than 10 km, the clouds are mainly ice clouds. Most ice clouds have smaller cloud geometric thicknesses (CGTs) and higher transmittance, so they are more susceptible to radiation contamination from within and below the cloud. As the CTH continues to rise, the CGT of the ice cloud decreases further and the radiation contamination from below the cloud becomes more serious, resulting in a more obvious underestimation of the CTH. Some CTHJMA samples are too high when the cloud is low. Iwabuchi suggested that the temperature inversion near the CTH affects the CO2 channel, resulting in the overestimation of low clouds [19]. It can be seen from Figure 6b that CTHXGB achieves a good fit, with the fitted line close to the reference line. The reason is that the model also uses the short-wave infrared channels, which helps to exclude the under-cloud radiation contamination observed by the CO2 channel. At the same time, the geographical distribution parameters are used, which constrain, to a certain extent, the low-cloud overestimation caused by the temperature inversion near the CTH.
Figure 7 shows the probability density of the deviation of CTHJMA/CTHXGB from CTHCAL (ΔCTHJMA/ΔCTHXGB). The curves are drawn with a 0.2 km interval, and the dotted lines represent the medians. ΔCTHXGB is closer to a normal distribution than ΔCTHJMA, and the median of ΔCTHXGB is about 0.25 km. The probability density curve of ΔCTHXGB is also steeper, which indicates fewer systematic errors. The medians of the two products are close to their MEs, indicating that both deviations follow a roughly symmetrical distribution. CTHXGB corrects the systematic bias of CTHJMA, and its deviation is closer to a normal distribution.
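A deviation probability density of this kind can be computed as a normalized histogram. The sketch below is only an illustration under assumptions: the function name is hypothetical, deviations are taken as retrieved minus CALIOP CTH in km, and each bin is represented by its centre.

```python
import math
import statistics


def deviation_density(deviations, bin_width=0.2):
    """Return sorted (bin_centre_km, density) pairs for a list of deviations.

    The density is normalized so that it integrates to 1 over all bins.
    """
    counts = {}
    for d in deviations:
        k = math.floor(d / bin_width)          # index of the bin holding d
        counts[k] = counts.get(k, 0) + 1
    n = len(deviations)
    return [((k + 0.5) * bin_width, c / (n * bin_width))
            for k, c in sorted(counts.items())]
```

The median plotted alongside the curve is simply `statistics.median(deviations)`; a median close to the ME is what suggests a roughly symmetrical distribution.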
It can be seen from Figure 8 that the ME of CTHXGB is 0.3 km and the ME of CTHJMA is −1.27 km, so the systematic error is reduced by 76.32%, which corrects the overall underestimation of CTHJMA. The Std and RMSE of CTHXGB are also lower than those of CTHJMA, but the correlation coefficient is not improved.
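The three error statistics used here can be written out explicitly. The sketch below assumes that deviations are defined as retrieved minus CALIOP CTH (consistent with a negative ME indicating underestimation) and that Std is the population standard deviation of the deviations; the text does not state which Std convention was used, and the function name is hypothetical.

```python
import math


def error_stats(retrieved, reference):
    """Return (ME, Std, RMSE) of retrieved minus reference, in input units."""
    diffs = [r - t for r, t in zip(retrieved, reference)]
    n = len(diffs)
    me = sum(diffs) / n
    std = math.sqrt(sum((d - me) ** 2 for d in diffs) / n)
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    return me, std, rmse
```

Under these definitions RMSE² = ME² + Std², so the three numbers are not independent. With the rounded overall values for CTHXGB (ME 0.3 km, Std 1.72 km), this gives an RMSE of about 1.75 km, close to the reported 1.74 km. Likewise, (1.27 − 0.3)/1.27 is about 76.4% from the rounded MEs, close to the quoted 76.32%, which was presumably computed from unrounded values.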
Figure 9 compares the CTH in five cloud scenarios: the top layer observed by CALIOP is an ice cloud, a water cloud, or a mixed cloud, and the cloud is a single-layer or a multi-layer cloud. Here, a multi-layer cloud is a pixel for which CALIOP observes two or more cloud layers. Figure 10 shows the absolute value of ME, Std, RMSE, and correlation coefficient under these cloud scenarios. From Figure 9(a1,b1), it can be seen that the CTHJMA of ice clouds is lower than CTHCAL, and most of the scattered points are below the reference line (CTHJMA = CTHCAL), indicating that the CTH products of Himawari-8 are systematically low for ice clouds and for multi-layer cloud scenarios with ice clouds on the top layer. In contrast, the scattered points of CTHXGB against CTHCAL for ice clouds are distributed around the reference line CTHXGB = CTHCAL, and the fitted trend line basically coincides with the reference line. This shows that the constructed retrieval model retrieves the CTH of ice clouds better and largely removes the systematic underestimation of the Himawari-8 CTH for ice clouds. From Figure 9(a2,b2), it can be seen that, for water clouds, most of the CTHJMA scatter points between 3 km and 7 km lie below the reference line, indicating that the Himawari-8 CTH product for water clouds is systematically low in this height range. In contrast, the CTHXGB scatter points for water clouds are distributed around the reference line, and the fitted trend line basically overlaps the reference line, indicating that the constructed retrieval model retrieves the CTH of water clouds better and largely removes the systematic underestimation of the Himawari-8 CTH for water clouds.
Secondly, for low clouds with a CTH of less than 2 km, the area where the CTHJMA scatter points are concentrated is roughly circular, which indicates a partial overestimation of CTHJMA below 2 km, while this bias is not evident for CTHXGB. It can be seen from Figure 9(a3,b3) that the systematic deviation of the fitted line of CTHJMA basically disappears below 2.5 km, and the fitted line falls further below the reference line as the CTH increases. This is likely to reflect a systematic difference in the ice–water content of mixed clouds. When a mixed cloud’s CTH is less than 2.5 km, it may contain ice crystals, but their content is low. When a mixed cloud’s CTH is greater than 15 km, it may contain liquid particles, but their content is also low. CTHJMA’s retrieval of ice-phase clouds has obvious deviations. Compared with the fitted line of CTHJMA against CTHCAL, that of CTHXGB against CTHCAL for mixed clouds is clearly improved. By combining Figure 9(a4,b4,a5,b5), it can be seen that for both single-layer and multi-layer clouds, the fitted line of CTHXGB against CTHCAL is improved relative to that of CTHJMA against CTHCAL at different levels, especially for high clouds. For CTHJMA, single-layer clouds show no systematic bias at low cloud heights, while multi-layer clouds do. This may be due to radiation contamination caused by the transmittance of the upper ice cloud or thin upper cloud, which still needs to be explored further. The scatter-fitting results of CTHJMA against CTHCAL are better for single-layer clouds than for multi-layer clouds, which is similar to the conclusion drawn by Tan in 2018 [18]. In summary, the fitted line of CTHXGB is closer to the reference line for all cloud types, which indicates that the XGBoost model is more robust.
The systematic underestimation of CTHJMA is mainly due to the small cloud optical thickness of ice clouds.
Figure 10 shows the probability density distributions of ΔCTHJMA and ΔCTHXGB in different cloud scenarios, where the solid lines are drawn with an interval of 0.2 km, and the dashed lines are the medians of the corresponding cloud scenarios. In Figure 10a, ΔCTHJMA has obvious normal distribution characteristics in the water cloud and single-layer cloud scenarios. The median ΔCTHJMA of water clouds is about 0.2 km, but there is a secondary probability density peak at around 1.7 km, which indicates that CTHJMA is partially overestimated for water clouds. The median ΔCTHJMA of single-layer clouds is about −0.6 km, that of mixed clouds is about −1.2 km, and that of multi-layer clouds and ice clouds is as low as −2.5 km. CTHJMA has the best retrieval results for water clouds, followed by single-layer clouds, while the other cloud scenarios show large systematic underestimations. ΔCTHXGB has an obvious normal distribution in each cloud scenario, and its medians are distributed near the reference line, indicating that ΔCTHXGB has a symmetrical distribution. Compared with Figure 10a, the normal distribution of ΔCTHXGB in the different cloud scenarios is more pronounced, which shows that the model is more robust. In addition, the better symmetry of ΔCTHXGB indicates a reduced systematic bias.
Table 3 shows the statistical comparison results between CTHXGB/CTHJMA and CTHCAL under different cloud scenarios. Here, water clouds, ice clouds, and mixed clouds all refer to the cloud phase of the top layer. For CTHJMA, ice clouds and multi-layer clouds have an obvious systematic underestimation, followed by mixed clouds, and the systematic deviation of water clouds and single-layer clouds is relatively small. The ME of CTHXGB in the different cloud scenarios is closer to 0, which indicates that the model is more robust for CTH retrieval across cloud scenarios.
The model makes full use of the short-wave infrared channel, which helps the decision tree model to account for the radiation contamination from the layers beneath ice clouds and multi-layer clouds. The RMSE of CTHJMA in the water cloud scenario is 1.67 km, while the RMSE of CTHXGB is 1.44 km. The Std of CTHJMA in the water cloud scenario is 1.51 km, and that of CTHXGB is 1.43 km, indicating that both CTHJMA and CTHXGB have relatively small errors and low dispersion for water clouds. The results of the two algorithms for water clouds do not differ much. Likewise, for single-layer clouds, the CTHJMA and CTHXGB results do not differ much, but CTHXGB significantly improves the ME and RMSE compared to CTHJMA in the case of multi-layer clouds. For multi-layer clouds, the results of the scheme adopted in this paper show no obvious systematic deviation, and the proposed scheme has higher retrieval accuracy.
Figure 11 shows the mean deviation profiles of CTHXGB and CTHJMA. Figure 11a shows the ice cloud results. The solid line is a single-layer ice cloud, and the dashed line is a multi-layer cloud with an ice cloud on the top layer. The ΔCTHJMA of ice clouds becomes increasingly negative as CTH increases, and the underestimation for multi-layer ice clouds is larger than that for single-layer ice clouds by nearly 1 km at all heights. For ice clouds, the CTHJMA retrieval bias arises mainly because the brightness temperature observed by the CO2 channel and the infrared split-window channels is contaminated by radiation transmitted from the atmosphere below the ice cloud. The CTHXGB retrieved by this model, whether for single-layer or multi-layer ice clouds, has deviations of around 0 km at all heights, and the deviations for single-layer and multi-layer ice clouds differ little. This shows that the accuracy of CTHXGB for ice clouds is greatly improved.
Figure 11b shows the results for water clouds. The solid line is a single-layer water cloud, and the dashed line represents a multi-layer cloud with a water cloud on the top layer. CTHJMA is overestimated for water clouds below 2 km, and this systematic overestimation decreases as CTH increases. Iwabuchi suggested that this might be due to errors caused by the temperature inversion near the cloud top, and that other constraints, such as geographic distribution, diurnal cycles, and seasonal changes, are not utilized [19]. It is also possible that CTHJMA uses only the split-window channels and the CO2 channel and ignores the absorption of water vapor in these bands. The lower the CTH, the higher the relative humidity, and the more likely the long-wave radiation is to be absorbed by near-surface water vapor. The bias of ΔCTHXGB below 2 km is closer to 0 km, which suggests that CTHXGB has better accuracy for water clouds below 2 km. The reason is that CTHXGB uses the latitude and longitude, solar zenith angle, and satellite zenith angle as input parameters of the model. Combined with the very small or even negligible water vapor absorption in the 1.6 µm channel, this allows the deviation caused by water vapor absorption in the split-window channels and the CO2 channel to be corrected more effectively.
For multi-layer water clouds, CTHJMA is underestimated above 2 km: the underestimation is about 2 km near 5 km and is reduced near 7.5 km. This retrieval bias also arises because the brightness temperature observed by the CO2 channel and the infrared split-window channels is contaminated by radiation transmitted from the atmosphere below the top-layer water cloud. The bias is reduced near 7.5 km because, when the CTH is above 7.5 km, the cloud optical thickness (COT) of the top water cloud is high, and the radiation from below the cloud cannot be observed by AHI. The cloud-top temperature is then close to the brightness temperature of the split-window channels, so the deviation is alleviated as height increases. The deviation in ΔCTHXGB at different heights is basically within 1 km, whether for single-layer or multi-layer water clouds.
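A mean deviation profile of the kind discussed for Figure 11 can be sketched as below. The 1 km bin width, the choice to bin by the reference (CALIPSO) height, and the function name are assumptions made for illustration; the data are synthetic.

```python
import numpy as np


def mean_deviation_profile(cth_retrieved, cth_reference, bin_edges):
    """Mean deviation (retrieved - reference, km) in each reference-height bin."""
    retrieved = np.asarray(cth_retrieved, dtype=float)
    reference = np.asarray(cth_reference, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)
    delta = retrieved - reference
    # Index of the height bin that each reference CTH falls into
    index = np.digitize(reference, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    profile = np.full(n_bins, np.nan)  # NaN marks bins without samples
    for k in range(n_bins):
        in_bin = index == k
        if in_bin.any():
            profile[k] = delta[in_bin].mean()
    centres = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    return centres, profile


# Synthetic example: an underestimation that grows with height
reference = np.linspace(0.05, 14.95, 1500)
retrieved = 0.8 * reference
centres, profile = mean_deviation_profile(retrieved, reference, np.arange(0.0, 16.0, 1.0))
for height, bias in zip(centres, profile):
    print(f"{height:4.1f} km: {bias:+.2f} km")
```

Running the function separately on single-layer and multi-layer samples would give the solid and dashed curves respectively.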

4. Discussion

The current Himawari-8 CTH retrieval method, developed by Iwabuchi in 2016, uses the 11 µm, 12 µm, and 13.5 µm channel data of AHI. The algorithm combines the CO2 slicing method, the split-window method, and the optimal estimation (OE) algorithm, but it suffers from a systematic bias. Using active satellite data as the reference, this paper proposes a new CTH retrieval method based on the XGBoost model and Himawari-8/AHI data. Only winter data were chosen as the experimental sample, in order to better constrain the bias caused by seasonality. The method not only considers the effect of thermal infrared on CTH retrieval but also takes the radiance values of the 1.6 µm and 3.9 µm channels as model inputs, which helps to distinguish the different cloud phases. The solar zenith angle, satellite zenith angle, longitude, and latitude are also set as input variables, and these can indirectly reflect land, sea, and seasonal influences. In addition, information from adjacent pixel values plays a very important role in improving retrieval accuracy. The selected satellite data were processed and divided into five categories of model input: channel reflectance/brightness temperature, brightness temperature differences between channels, texture parameters, differences to the adjacent warmest/coldest pixel, and geographic and spatial information. The results show that:
The statistical results of the proposed scheme differ little between cloud scenarios, which indicates that XGBoost can distinguish different cloud scenarios and is robust enough to be suitable for CTH retrieval. Although this paper uses winter data as the experimental sample, the scheme trains a separate model for each season and is equally applicable to other seasons.
CTHXGB improves significantly for ice clouds and for multi-layer clouds with ice clouds at the top. In this scenario, the systematic underestimation of CTHJMA is substantially reduced, and the degree of improvement is proportional to the CTH value. At the same time, CTHXGB significantly improves the results for water clouds and for multi-layer clouds with water clouds on the top layer below 2 km, where CTHJMA is prone to systematic overestimation; here the degree of improvement is inversely proportional to the CTH value.
For multi-layer clouds with water clouds on the top layer, CTHJMA is underestimated above 2 km; the underestimation is about 2 km near 5 km and is reduced near 7.5 km. The reason is similar to that for the error in multi-layer clouds with ice clouds on the top. In contrast, ΔCTHXGB is small and varies little with height. For multi-layer clouds whose top layer is a water cloud, CTHXGB therefore gives a relatively obvious improvement at heights of about 2–7.5 km.
In general, compared with CTHJMA, CTHXGB improves ME by 76.32%, which shows that the systematic deviation of the CTH retrieval scheme proposed in this paper is significantly reduced. The RMSE is improved (reduced) by 24.65%, showing that the scheme has higher accuracy, and the Std is improved by 11.9%, indicating that the dispersion of the deviation of this scheme is lower.
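As a rough cross-check of how such percentages can be obtained, the sketch below applies a relative-reduction formula to the rounded "All" row of Table 3. The formula itself is an assumption, since the text does not state how the improvements were calculated. With the rounded table values it gives roughly 76%, 25%, and 11%; small departures from the percentages quoted above could arise if those were computed from unrounded statistics.

```python
# "All" row of Table 3: (CTHJMA, CTHXGB) statistics in km
table3_all = {"ME": (-1.27, 0.30), "RMSE": (2.31, 1.74), "Std": (1.93, 1.72)}


def relative_improvement(jma_value, xgb_value):
    """Percentage reduction in magnitude from the CTHJMA to the CTHXGB statistic."""
    # Assumed definition: (|JMA| - |XGB|) / |JMA|, expressed in percent
    return 100.0 * (abs(jma_value) - abs(xgb_value)) / abs(jma_value)


improvements = {name: relative_improvement(jma, xgb) for name, (jma, xgb) in table3_all.items()}
for name, percent in improvements.items():
    print(f"{name}: {percent:.1f}% reduction")
```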

5. Conclusions

The algorithm in this paper better resolves the systematic deviation problem of the Himawari L2 CTH products, which is of great significance to Himawari-8 satellite remote sensing retrieval of CTHs. However, the improvement in Std across the different cloud scenarios is small, so the dispersion of the CTH deviation is not noticeably reduced. Further improving Std will be the focus of the next step.

Author Contributions

Y.D.: Data processing, Formal analysis, Investigation, Methodology, Software, Visualization, Validation, Writing—review & editing. X.S.: Conceptualization, Funding acquisition, Supervision, Validation. Q.L.: Data processing, Software, Writing—review & editing. In addition, the above three authors are responsible for the writing and revision of the corresponding parts of the manuscript. Y.D. is responsible for the overall reviewing and editing of the manuscript and other matters not mentioned here. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by a National Natural Science Foundation of China grant funded by the Chinese government (41575020).

Data Availability Statement

The data that support the findings of this study are available in https://search.earthdata.nasa.gov/ (accessed on 10 December 2022) for CALIPSO and http://www.ptree.jaxa.jp (accessed on 10 December 2022) for Himawari-8.

Acknowledgments

We sincerely appreciate the support of the National Natural Science Foundation of China (41575020), the Himawari-8 data provided by the Japan Meteorological Agency, and the A-train satellite data downloaded from the Earth data website of the National Aeronautics and Space Administration (NASA).

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Stephens, G.L.; Webster, P.J. Clouds and climate: Sensitivity of simple systems. J. Atmos. Sci. 1981, 38, 235–247.
  2. Sassen, K.; Wang, Z.; Liu, D. Global distribution of cirrus clouds from CloudSat/Cloud-Aerosol lidar and infrared pathfinder satellite observations (CALIPSO) measurements. J. Geophys. Res. Atmos. 2008, 113.
  3. Wang, H.; Su, W. Evaluating and understanding top of the atmosphere cloud radiative effects in Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5) Coupled Model Intercomparison Project Phase 5 (CMIP5) models using satellite observations. J. Geophys. Res. Atmos. 2013, 118, 683–699.
  4. Li, J.; Yi, Y.; Minnis, P.; Huang, J.; Yan, H.; Ma, Y.; Wang, W.; Ayers, J.K. Radiative Effect Differences between Multi-layered and Single-layer Clouds Derived from CERES, CALIPSO, and CloudSat Data. J. Quant. Spectrosc. Radiat. Transf. 2011, 112, 361–375.
  5. Boucher, O.; Randall, D.; Artaxo, P.; Bretherton, C.; Feingold, G.; Forster, P.; Zhang, X.Y. Clouds and Aerosols. In Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change; Cambridge University Press: Cambridge, UK, 2013.
  6. Holz, R.E.; Ackerman, S.A.; Nagle, F.W.; Frey, R.; Dutcher, S.; Kuehn, R.E.; Vaughan, M.; Baum, B.A. Global Moderate resolution Imaging Spectroradiometer (MODIS) cloud detection and height evaluation using CALIOP. J. Geophys. Res. Atmos. 2008, 113, 1–17.
  7. Miller, S.D.; Forsythe, J.M.; Partain, P.T.; Haynes, J.M.; Bankert, R.L.; Sengupta, M.; Mitrescu, C.; Hawkins, J.D.; Vonder, H.; Thomas, H. Estimating Three-dimensional Cloud Structure via Statistically Blended Satellite Observations. J. Appl. Meteorol. Clim. 2014, 53, 437–455.
  8. Hollars, S.; Qiang, F.; Comstock, J.; Ackerman, T. Comparison of cloud-top height retrievals from ground-based 35 GHz MMCR and GMS-5 satellite observations at ARM TWP Manus site. Atmos. Res. 2004, 72, 169–186.
  9. Platnick, S.; King, M.D.; Ackerman, S.A.; Menzel, W.P.; Baum, B.A.; Riédi, J.C.; Frey, R.A. The MODIS cloud products: Algorithms and examples from Terra. IEEE Trans. Geosci. Remote Sens. 2003, 41, 459–473.
  10. Minnis, P.; Sun-Mack, S.; Young, D.F.; Heck, P.W.; Garber, D.P.; Chen, Y.; Spangenberg, D.A.; Arduini, R.F.; Trepte, Q.Z.; Smith, W.L.; et al. CERES Edition-2 Cloud Property Retrievals Using TRMM VIRS and Terra and Aqua MODIS Data—Part I: Algorithms. IEEE Trans. Geosci. Remote Sens. 2011, 49, 4374–4400.
  11. Campbell, J.R.; Dolinar, E.K.; Lolli, S.; Fochesatto, G.J.; Gu, Y.; Lewis, J.R.; Marquis, J.W.; McHardy, T.M.; Ryglicki, D.R.; Welton, E.J. Cirrus Cloud Top-of-the-Atmosphere Net Daytime Forcing in the Alaskan Subarctic from Ground-Based MPLNET Monitoring. J. Appl. Meteorol. Clim. 2020, 60, 51–63.
  12. Da, C. Preliminary assessment of the Advanced Himawari Imager (AHI) measurement onboard Himawari-8 geostationary satellite. Remote Sens. Lett. 2015, 6, 637–646.
  13. Liu, Q.; Li, Y.; Yu, M.; Long, S.C.; Yang, C. Daytime Rainy Cloud Detection and Convective Precipitation Delineation Based on a Deep Neural Network Method Using GOES-16 ABI Images. Remote Sens. 2019, 11, 2555.
  14. Min, M.; Wu, C.; Li, C.; Liu, H.; Xu, N.; Wu, X.; Chen, L.; Wang, F.; Sun, F.; Qin, D.; et al. Developing the Science Product Algorithm Testbed for Chinese Next-Generation Geostationary Meteorological Satellites: Fengyun-4 Series. J. Meteorol. Res. 2017, 31, 708–719.
  15. Heidinger, A.K.; Bearson, N.; Foster, M.J.; Li, Y.; Wanzong, S.; Ackerman, S.; Holz, R.E.; Platnick, S.; Meyer, K. Using Sounder Data to Improve Cirrus Cloud Height Estimation from Satellite Imagers. J. Atmos. Ocean. Technol. 2019, 36, 1331–1342.
  16. Heidinger, A.K.; Pavolonis, M.J. Gazing at Cirrus Clouds for 25 Years through a Split Window. Part I: Methodology. J. Appl. Meteorol. Clim. 2009, 48, 6.
  17. Li, J.; Menzel, W.P.; Schreiner, A.J. Variational Retrieval of Cloud Parameters from GOES Sounder Longwave Cloudy Radiance Measurements. J. Appl. Meteorol. 2001, 40, 312–330.
  18. Tan, Z.; Ma, S.; Zhao, X.; Yan, W.; Lu, W. Evaluation of Cloud Top Height Retrievals from China’s Next-Generation Geostationary Meteorological Satellite FY-4A. J. Meteorol. Res. 2019, 33, 553–562.
  19. Iwabuchi, H.; Putri, N.S.; Saito, M.; Tokoro, Y.; Sekiguchi, M.; Yang, P.; Baum, B.A. Cloud property retrieval from multiband infrared measurements by Himawari-8. J. Meteorol. Soc. Jpn. 2018, 96B, 27–42.
  20. Heidinger, A. ABI cloud height. In NOAA/NESDIS/STAR, GOES-R Algorithm Theoretical Basis Document (ATBD); NOAA NESDIS Center for Satellite Applications and Research: College Park, MD, USA, 2012; pp. 1–77.
  21. Schmit, T.J.; Gunshor, M.M.; Menzel, W.P.; Gurka, J.J.; Li, J.; Bachmeier, A.S. Introducing the next generation Advanced Baseline Imager on GOES-R. Bull. Am. Meteorol. Soc. 2005, 86, 1079–1096.
  22. Menzel, W.P.; Frey, R.A.; Zhang, H.; Wylie, D.P.; Moeller, C.C.; Holz, R.; Maddux, B.; Baum, B.A.; Strabala, K.I.; Gumley, L.E. MODIS global cloud-top pressure and amount estimation: Algorithm description and results. J. Appl. Meteorol. Clim. 2008, 47, 1175–1198.
  23. Li, J.; Yi, Y.H.; Stamnes, K.; Ding, X.D.; Wang, T.H.; Jin, H.C.; Wang, S.S. A new approach to retrieve cloud base height of marine boundary layer clouds. Geophys. Res. Lett. 2013, 40, 4448–4453.
  24. Li, J.; Li, Z.; Wang, P.; Schmit, T.J.; Bai, W.; Atlas, R. An efficient radiative transfer model for hyperspectral IR radiance simulation and applications under cloudy sky conditions. J. Geophys. Res. Atmos. 2017, 122, 7600–7613.
  25. Baum, B.; Menzel, W.P.; Frey, R.; Tobin, D.; Holz, R.; Ackerman, S. MODIS cloud-top property refinements for Collection 6. J. Appl. Meteorol. Clim. 2012, 51, 1145–1163.
  26. Weisz, E.; Li, J.; Menzel, W.P.; Heidinger, A.K.; Kahn, B.H.; Liu, C.Y. Comparison of AIRS, MODIS, CloudSat and CALIPSO cloud top height retrievals. Geophys. Res. Lett. 2007, 34, 1–5.
  27. Sherwood, S.C.; Chae, J.-H.; Minnis, P.; McGill, M. Underestimation of deep convective cloud tops by thermal imagery. Geophys. Res. Lett. 2004, 31, 11.
  28. Min, M.; Li, J.; Wang, F.; Liu, Z.J.; Menzel, W.P. Retrieval of cloud top properties from advanced geostationary satellite imager measurements based on machine learning algorithms. Remote Sens. Environ. 2019, 239, 111616.
  29. Chang, F.L.; Minnis, P.; Ayers, J.K.; McGill, M.J.; Palikonda, R.; Spangenberg, D.A.; Smith, W.L., Jr.; Yost, C.R. Evaluation of satellite-based upper troposphere cloud top height retrievals in multilayer cloud conditions during TC4. J. Geophys. Res. Atmos. 2010, 10, 11–15.
  30. Chang, F.L.; Minnis, P.; Bing, L.; Khaiyer, M.M.; Palikonda, R.; Spangenberg, D.A. A modified method for inferring upper troposphere cloud top height using the GOES 12 imager 10.7 and 13.3 μm data. J. Geophys. Res. Atmos. 2010, 115, 1–13.
  31. Key, J.R.; Intrieri, J.M. Cloud Particle Phase Determination with the AVHRR. J. Appl. Meteorol. 2000, 39, 1797–1804.
  32. Daniel, J.S. Cloud liquid water and ice measurements from spectrally resolved near-infrared observations: A new technique. J. Geophys. Res. Atmos. 2002, 107, 1–16.
  33. Palmer, K.F.; Williams, D. Optical properties of water in the near infrared. J. Opt. Soc. Am. B (1917-1983) 1974, 64, 1107–1110.
  34. Pilewskie, P.; Twomey, S. Cloud Phase Discrimination by Reflectance Measurements near 1.6 and 2.2 µm. J. Atmos. Sci. 1987, 44, 3419–3420.
  35. Min, M.; Bai, C.; Guo, J.; Sun, F.; Liu, C.; Wang, F.; Xu, H.; Tang, S.; Li, B.; Di, D.; et al. Estimating summertime precipitation from Himawari-8 and global forecast system based on machine learning. IEEE Trans. Geosci. Remote Sens. 2019, 57, 2557–2570.
  36. Tan, Z.; Huo, J.; Shuo, M.; Han, D.; Wang, X.; Hu, S.; Yan, W. Estimating cloud base height from Himawari-8 based on a random forest algorithm. Int. J. Remote Sens. 2021, 42, 2485–2501.
  37. Håkansson, N.; Adok, C.; Thoss, A.; Scheirer, R.; Hörnquist, S. Neural network cloud top pressure and height for MODIS. Atmos. Meas. Tech. 2018, 11, 3177–3196.
  38. Wang, X.; Iwabuchi, H.; Takaya, Y. Cloud identification and property retrieval from Himawari-8 infrared measurements via a deep neural network. Remote Sens. Environ. 2022, 275, 113026.
  39. Husi, L.; Nagao, T.M.; Nakajima, T.Y.; Riedi, J.; Ishimoto, H.; Baran, A.J.; Shang, H.; Sekiguchi, M.; Kikuchi, M. Ice cloud properties from Himawari-8/AHI next-generation geostationary satellite: Capability of the AHI to monitor the DC cloud generation process. IEEE Trans. Geosci. Remote Sens. 2019, 57, 3229–3239.
  40. Iwabuchi, H.; Saito, M.; Tokoro, Y.; Putri, N.S.; Sekiguchi, M. Retrieval of radiative and microphysical properties of clouds from multispectral infrared measurements. Prog. Earth Planet. Sci. 2016, 3, 32.
  41. Hostetler, C.A.; Liu, Z.; Reagan, J.; Vaughan, M.; Winker, D.; Osborn, M.; Hunt, W.H.; Powell, K.A.; Trepte, C. CALIOP Algorithm Theoretical Basis Document, Calibration and Level 1 Data Products. Available online: https://www-calipso.larc.nasa.gov/resources/pdfs/PC-SCI-201v1.0.pdf (accessed on 10 January 2022).
  42. Winker, D.M.; Vaughan, M.A.; Omar, A.; Hu, Y.; Powell, K.A.; Liu, Z.; Hunt, W.H.; Young, S.A. Overview of the CALIPSO mission and CALIOP data processing algorithms. J. Atmos. Ocean. Technol. 2009, 26, 2310–2323.
  43. Friedman, J.H. Stochastic gradient boosting. Comput. Stat. Data Anal. 2002, 38, 367–378.
  44. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. ACM 2016, 785–794.
  45. Romeo, L.; Frontoni, E. A Unified Hierarchical XGBoost Model for Classifying Priorities for COVID-19 Vaccination Campaign. Pattern Recogn. 2021, 121, 108197.
  46. Inoue, T. On the Temperature and Effective Emissivity Determination of Semi-Transparent Cirrus Clouds by Bi-Spectral Measurements in the 10 µm Window Region. J. Meteorol. Soc. Jpn. 1985, 63, 88–99.
  47. Derrien, M.; Lavanant, L.; Le Gleau, H. Retrieval of the cloud top temperature of semi-transparent clouds with AVHRR. In Proceedings of the IRS’88, Deepak Publ., Hampton, Lille, France, 8–24 August 1988; pp. 199–202.
  48. Hamada, A.; Nishi, N. Development of a Cloud-Top Height Estimation Method by Geostationary Satellite Split-Window Measurements Trained with CloudSat Data. J. Appl. Meteorol. Clim. 2010, 49, 2035–2049.
Figure 1. AHI and CALIOP data match.
Figure 2. Out-of-bag importance of input variables in the XGBoost training process.
Figure 3. Model training process based on XGBoost.
Figure 4. Distribution comparison of CTH products. (a) Himawari-8 L2 CTH; (b) XGBoost CTH.
Figure 5. Comparison of CTH Products.
Figure 6. Comparison of CTH scatter. (a) Himawari-8 L2 CTH; (b) XGBoost CTH.
Figure 7. CTH Bias Probability Density Distribution vs. CALIOP.
Figure 8. Statistical results compared with CTHCAL.
Figure 9. Scatter plots of CTH results compared with CALIOP in different cloud scenarios: (a) Himawari-8 L2 CTH; (b) XGBoost CTH; 1 ice cloud, 2 water cloud, 3 mixed cloud, 4 single-layer cloud, 5 multi-layer cloud.
Figure 10. Probability density distribution of CTH products compared with CTHCAL under different cloud scenarios. (a) Himawari-8 L2 CTH; (b) XGBoost CTH.
Figure 11. The mean error of CTH product at different heights.
Table 1. Selected AHI channels.
| Channel | Center Wavelength | Channel Characteristics |
|---|---|---|
| Channel 05 | 1.6 µm | Low water vapor absorption channel |
| Channel 07 | 3.9 µm | Different cloud phase states have absorption differences |
| Channel 10 | 7.3 µm | Water vapor absorption channel |
| Channel 11 | 8.6 µm | |
| Channel 14 | 11.2 µm | Split window channel |
| Channel 15 | 12.3 µm | |
| Channel 16 | 13.3 µm | CO2 absorption channel |
Table 2. Model input parameters.
| Variable Type | Variable | Note |
|---|---|---|
| Reflectivity | R1.6 | Sensitive to the phase state of cloud |
| Brightness Temperature | BT11.2 | Temperatures close to opaque cloud |
| | BT7.3 | Important for identifying high, optically thin cloud |
| | BT13.3 | Important for identifying high, optically thin cloud |
| BT difference between channels | BT11.2-BT12.3, BT8.6-BT12.3, BT7.3-BT12.3, BT13.3-BT12.3 | Holds information about whether the cloud is opaque and how transparent it is |
| Texture parameters | (BT11.2)text, (BT3.9)text, (R1.6)text, (BT11.2-BT12.3)text, (BT11.2-BT3.9)text | Holds information about the opacity, translucency, or edges of cloud |
| BT differences to warmest/coldest neighbor | BT11.2-BT11.2 W, BT11.2-BT11.2 C, BT12.3 W-BT11.2 W, BT12.3 C-BT11.2 C, BT11.2 W-BT3.9 W, BT11.2 C-BT3.9 C | Cloud Optical Thickness |
| Geographic and spatial information | Latitude, Longitude, SZA, VZA | Eliminates some uncertainties caused by geographic location and space |
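To make the derived predictors in Table 2 more concrete, the sketch below computes a few of them from two brightness temperature fields. Everything specific here is an assumption made for illustration: the 3 × 3 window, texture taken as the local standard deviation, the warmest/coldest neighbor located by the 11.2 µm brightness temperature, and the function names. The arrays are synthetic, not AHI data.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_windows(field, size=3):
    """Return the size x size neighbourhood of every pixel (edges replicated)."""
    padded = np.pad(np.asarray(field, dtype=float), size // 2, mode="edge")
    return sliding_window_view(padded, (size, size))  # shape (ny, nx, size, size)


def texture(field, size=3):
    """Texture taken here as the local standard deviation (an assumption)."""
    return local_windows(field, size).std(axis=(-2, -1))


def build_features(bt112, bt123, size=3):
    """Sketch of a few Table 2 style predictors from two brightness temperature fields."""
    bt112 = np.asarray(bt112, dtype=float)
    bt123 = np.asarray(bt123, dtype=float)
    win112 = local_windows(bt112, size).reshape(*bt112.shape, -1)
    win123 = local_windows(bt123, size).reshape(*bt123.shape, -1)
    # Warmest/coldest neighbour located by the 11.2 um brightness temperature
    warm = win112.argmax(axis=-1)[..., None]
    cold = win112.argmin(axis=-1)[..., None]
    bt112_w = np.take_along_axis(win112, warm, axis=-1)[..., 0]
    bt112_c = np.take_along_axis(win112, cold, axis=-1)[..., 0]
    bt123_w = np.take_along_axis(win123, warm, axis=-1)[..., 0]
    bt123_c = np.take_along_axis(win123, cold, axis=-1)[..., 0]
    return {
        "BT11.2-BT12.3": bt112 - bt123,
        "(BT11.2)text": texture(bt112, size),
        "(BT11.2-BT12.3)text": texture(bt112 - bt123, size),
        "BT11.2-BT11.2W": bt112 - bt112_w,
        "BT11.2-BT11.2C": bt112 - bt112_c,
        "BT12.3W-BT11.2W": bt123_w - bt112_w,
        "BT12.3C-BT11.2C": bt123_c - bt112_c,
    }


# Synthetic brightness temperatures (K), not AHI data
rng = np.random.default_rng(1)
bt112 = 260.0 + 10.0 * rng.standard_normal((6, 8))
bt123 = bt112 - np.abs(rng.standard_normal((6, 8)))
features = build_features(bt112, bt123)
print({name: values.shape for name, values in features.items()})
```

Each returned array has the shape of the input image, so the predictors can be flattened and stacked column-wise, together with latitude, longitude, SZA, and VZA, to form the input matrix of a tree-based regressor.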
Table 3. Statistics of differences between CTHJMA and CTHXGB.
| Cloud Scene | ME/km (CTHJMA) | ME/km (CTHXGB) | RMSE/km (CTHJMA) | RMSE/km (CTHXGB) | Std/km (CTHJMA) | Std/km (CTHXGB) |
|---|---|---|---|---|---|---|
| Ice | −2.34 | −0.05 | 2.80 | 1.75 | 1.70 | 1.54 |
| Water | 0.82 | 0.73 | 1.67 | 1.44 | 1.51 | 1.43 |
| Mix | −1.23 | 0.34 | 2.46 | 2.00 | 2.13 | 1.97 |
| Single-layer | −0.73 | 0.56 | 1.95 | 1.72 | 1.80 | 1.62 |
| Multi-layer | −2.22 | −0.17 | 2.84 | 1.78 | 1.78 | 1.67 |
| All | −1.27 | 0.30 | 2.31 | 1.74 | 1.93 | 1.72 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Dong, Y.; Sun, X.; Li, Q. A Method for Retrieving Cloud-Top Height Based on a Machine Learning Model Using the Himawari-8 Combined with Near Infrared Data. Remote Sens. 2022, 14, 6367. https://doi.org/10.3390/rs14246367
