1. Introduction
Since the Industrial Revolution, the concentration of atmospheric carbon dioxide (CO2) has continued to rise due to the massive combustion of fossil fuels [1,2]. Research has shown that part of the CO2 generated by human activities has inexplicably disappeared, which indicates that some unknown carbon sinks may exist [3]. To determine the distribution of carbon sources and sinks, many CO2 observation methods have been presented [4,5,6,7,8,9]. Differential absorption lidar (DIAL) is considered the most promising method for observing CO2 concentration [10]. In contrast to passive remote sensing, DIAL can provide better precision to retrieve atmospheric CO2 concentration [11]. Moreover, a ground-based DIAL can provide vertical profiles of CO2, which can be of great significance to the research of the carbon cycle [12,13,14].
The echo signal of a ground-based CO2-DIAL mainly relies on scattering by aerosols and atmospheric molecules [15]. Therefore, the signal intensity is relatively weak. Because a lidar signal is susceptible to noise such as background light, dark current, and thermal noise, even a small perturbation can degrade the signal inversion [16]. Moreover, because the signal-to-noise ratio (SNR) decreases rapidly with increasing range, useful information is overwhelmed by noise over long distances [17]. De-noising the CO2-DIAL signal can therefore extend the effective detection range and improve the precision of the signal inversion, so a proper de-noising method must be selected.
Several methods have been developed, and others are still under development, for lidar signal de-noising [18,19,20,21]. The multi-pulse average method, which calculates the mean of multiple sets of profiles, has been widely used. However, averaging over multiple pulses smooths out the fast-changing components of aerosols. The wavelet transform (WT) has proven quite useful in extracting signals from data generated in noisy, nonlinear, and nonstationary processes. Zhou et al. applied it to reduce lidar echo signal noise and improve the system SNR [22]. However, WTs rely on predefined basis functions, which limits their adaptability. Mao et al. put forward an alternative approach that obtains a de-noised signal as a by-product of combining the ensemble Kalman filter and the Fernald method [23].
Wu et al. and Tian et al. applied empirical mode decomposition (EMD) to lidar signal de-noising [24,25]. The EMD developed by Huang et al. decomposes the signal on the basis of the local characteristic time scale of the signal itself and dispenses with any basis functions [26]. As discussed by Huang et al., a major drawback of the original EMD is the frequent appearance of mode mixing, which makes the physical meaning of the individual intrinsic mode functions (IMFs) unclear. To overcome this problem, a noise-assisted data analysis (NADA) method was proposed, namely the ensemble EMD (EEMD), which adds white noise of finite amplitude to the signal and then performs EMD [27,28].
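The ensemble step of EEMD described above can be sketched as follows. This is a minimal illustration, not the implementation used in this study: the function name `eemd`, its parameters, and the placeholder decomposer `toy_split` are all hypothetical. The EMD routine itself is passed in by the caller, and the sketch assumes it returns the same number of components on every call, which a real EMD does not guarantee.

```python
import random


def eemd(signal, emd, trials=100, noise_std=0.2, seed=0):
    """Average the components returned by `emd` over noise-perturbed copies.

    `emd` is a caller-supplied function that maps a list of floats to a list
    of components (each a list of the same length). It stands in for a real
    EMD routine. Assumption: it returns the same number of components on
    every call.
    """
    rng = random.Random(seed)
    n = len(signal)
    total = None
    for _ in range(trials):
        # Add white noise of finite amplitude, then decompose this copy.
        noisy = [x + rng.gauss(0.0, noise_std) for x in signal]
        comps = emd(noisy)
        if total is None:
            total = [[0.0] * n for _ in comps]
        for acc, comp in zip(total, comps):
            for i, v in enumerate(comp):
                acc[i] += v
    # The ensemble mean of each component is taken as the final result.
    return [[v / trials for v in acc] for acc in total]


def toy_split(x, half_width=2):
    """Placeholder decomposer (NOT EMD): [detail, trend] via a moving average."""
    n = len(x)
    trend = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        trend.append(sum(x[lo:hi]) / (hi - lo))
    detail = [a - b for a, b in zip(x, trend)]
    return [detail, trend]
```

Passing the decomposer as an argument keeps the sketch independent of any particular EMD library; in practice `toy_split` would be replaced by a genuine sifting-based EMD.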
In the present study, we assume that, in stable atmospheric conditions, signals acquired within a short time carry the same useful information and that their differences are caused only by noise. We apply the EEMD method to decompose the signals into IMFs with frequencies ranging from high to low. We consider the noise to reside mainly in the high-frequency IMFs and de-noise the signal by removing them. The correlation coefficients of the IMFs with the same temporal scale serve as the criterion for determining which components to remove. Finally, we use the linear fitting parameters of the differential absorption optical depth (DAOD) as the criterion for evaluating the de-noising effect.
In the next section, the system and theoretical foundation of the CO2-DIAL and the EEMD method are presented in detail. Further, we study the feasibility of the method by processing simulated signals. Two parts were selected for the analysis, and an obvious improvement in the result was obtained. Moreover, we apply the above-mentioned method to the observed signal, and the r-square increased from 0.444 to 0.755, which is doubtless beneficial to the inversion precision. In conclusion, the method turns out to be applicable to simulated and observed signals.
3. Simulated Signal Analysis
To verify the feasibility of the method, we analyzed the simulated CO2-DIAL signal. Figure 3a shows the pure signal of the on-line and off-line wavelengths. The initial CO2 concentration was assumed to be 400 parts per million, and the temperature and pressure profiles were obtained by interpolating the sounding data of the temperature and pressure at Wuhan in March 2015. Compared with the off-line wavelength, the on-line wavelength signal fell more rapidly due to CO2 absorption. We added white noise to the simulated pure signal and considered their combination as the observed signal (Figure 3b). Because of the influence of noise, the intensity of the on-line wavelength echo signal may appear stronger than that of the off-line wavelength, which masks the absorption feature and thus degrades the CO2 retrieval precision. Hence, we applied the above-mentioned method to process the echo signal.
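A simulation of this kind can be sketched as below. This is a simplified illustration under stated assumptions, not the simulation used in this study: the model P(R) = scale · exp(−2αR)/R² lumps all extinction at each wavelength into one constant α, and every name and numerical value (`simulate_pair`, `alpha_on`, `alpha_off`, `scale`, `noise_std`) is hypothetical.

```python
import math
import random


def simulate_pair(ranges_m, alpha_on=4e-4, alpha_off=1e-4, scale=1e9,
                  noise_std=5.0, seed=0):
    """Return (pure_on, pure_off, noisy_on, noisy_off) for the given ranges.

    Simplified model: P(R) = scale * exp(-2 * alpha * R) / R**2, where alpha
    (in 1/m) is a constant extinction coefficient at that wavelength.
    All default values are illustrative and are not taken from the paper.
    """
    rng = random.Random(seed)

    def pure(alpha):
        return [scale * math.exp(-2.0 * alpha * r) / r ** 2 for r in ranges_m]

    pure_on, pure_off = pure(alpha_on), pure(alpha_off)
    # Additive white Gaussian noise plays the role of the measurement noise.
    noisy_on = [p + rng.gauss(0.0, noise_std) for p in pure_on]
    noisy_off = [p + rng.gauss(0.0, noise_std) for p in pure_off]
    return pure_on, pure_off, noisy_on, noisy_off
```

Because the on-line signal decays faster, its ratio to the off-line signal shrinks with range, so a fixed noise level affects the far-range log ratio most strongly.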
We proposed a false SNR (FSNR), computed from the EEMD components r and C1, to approximately estimate the real SNR (RSNR), computed from Sp and Sn. Here, r is the final residue and can be viewed as the useful information of the signal, C1 is the first IMF and can approximately represent the noise of the signal, and Sp and Sn represent the pure signal and the noise of a signal, respectively. As Figure 4 shows, we calculated the FSNRs and RSNRs of ten simulated signals. The difference between FSNR and RSNR is small, so we can substitute FSNR for RSNR.
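The two quantities can be sketched in code as follows. The exact definitions are an assumption here: the sketch uses the conventional ratio of summed squares expressed in decibels, which may differ from the form used in this study. The function names are hypothetical.

```python
import math


def power_ratio_db(signal_part, noise_part):
    """10 * log10 of the ratio of summed squares (assumed SNR convention)."""
    ps = sum(v * v for v in signal_part)
    pn = sum(v * v for v in noise_part)
    return 10.0 * math.log10(ps / pn)


def fsnr(residue, first_imf):
    """False SNR: final residue r as signal, first IMF C1 as noise."""
    return power_ratio_db(residue, first_imf)


def rsnr(pure, noise):
    """Real SNR: pure signal Sp against the known noise Sn (simulation only)."""
    return power_ratio_db(pure, noise)
```

The RSNR needs the pure signal and is therefore available only in simulation, whereas the FSNR uses quantities that EEMD yields for any measured signal.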
The FSNRs of ten measured signals were calculated, and any two adjacent signals differ only slightly (Table 1). Therefore, three similar simulated signal groups with different SNRs (99, 100, 101) were selected as our research objects (Figure 5). They can be considered as observed signals collected within a short time in stable atmospheric conditions. We segmented the signals into two parts to verify the performance of the method under different signal intensities. One part was approximately from 300 to 1500 m, and the other was from 1500 to 3000 m.
Figure 6 shows the IMFs of the selected simulated signals of the on-line wavelength from 300 to 1500 m decomposed by EEMD. The frequency of the IMFs decreases systematically with increasing IMF order. We generally consider that high-frequency IMFs contain little useful information and are dominated by noise. Therefore, we assumed the high-frequency IMFs to be completely submerged in noise and de-noised the CO2-DIAL signal by removing them. The key to the problem was thus how to determine the IMFs that need to be removed. To solve this problem, the correlation coefficients of the IMFs with the same temporal scale were calculated.
We determined the IMFs that need to be removed and reconstructed the signal as follows. Signal yj is reconstructed from the IMFs on the individual scales, Cjs(t) (s = 1, 2, …, S), each weighted by a relevance factor Ks(t), where CR represents the correlation coefficients of the IMFs with the same temporal scale. We considered an IMF to contain more noise than useful information if half of all its correlation coefficients are less than 0.5. The 0.5 correlation threshold was determined by a statistical experiment: for each group of signals, we found the threshold at which the de-noised signal reaches the maximum SNR, and then counted how often each threshold was selected. To verify the threshold selection across signal qualities, we simulated 50 signals with various SNRs (51, 52, …, 100) and analyzed them with the proposed method in sets of three adjacent signals, giving 48 groups in total. The results showed that the signals obtained the maximum SNR in some cases when 0.3, 0.4, 0.6, or 0.7 was selected as the criterion (Figure 7). Although several thresholds sometimes yield the same highest SNR, only the 0.5 correlation threshold applies to almost all of the cases (45 of the 48 groups). Therefore, we adopted 0.5 as the correlation threshold.
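The selection rule and the reconstruction can be sketched as below. This is one reading of the criterion, stated as an assumption: the coefficients CR at a given scale are taken to be the pairwise Pearson correlations among the signals, an IMF is discarded (Ks = 0) when at least half of them fall below the threshold, and the final residue is treated as the last component and subjected to the same test. All function names are hypothetical.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def relevance_factors(imf_sets, threshold=0.5):
    """imf_sets[j][s] is component s of signal j; returns K_s in {0, 1}."""
    factors = []
    for s in range(len(imf_sets[0])):
        # Pairwise correlations of the same-scale component across signals.
        cr = [pearson(a[s], b[s]) for a, b in combinations(imf_sets, 2)]
        low = sum(1 for c in cr if c < threshold)
        # Assumed rule: drop the scale if at least half of CR is below threshold.
        factors.append(0 if 2 * low >= len(cr) else 1)
    return factors


def reconstruct(components, factors):
    """y_j(t) = sum over s of K_s * C_js(t) for one signal j."""
    n = len(components[0])
    return [sum(k * comp[i] for k, comp in zip(factors, components))
            for i in range(n)]
```

With three signals there are three pairwise coefficients per scale, so under this reading a scale is removed when two or more of them are below 0.5.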
Table 2 lists the correlation coefficients of the IMFs with the same temporal scale for the on-line and off-line wavelengths. The correlation coefficients of the first IMF were comparatively low (i.e., below 0.5), whereas those of IMFs 2–7 exceeded 0.5. Therefore, we removed the first IMF and reconstructed the signal from the remaining components.
The ultimate purpose of de-noising is to improve the inversion accuracy; thus, we simultaneously analyzed the off-line wavelength. Figure 8 shows the linear regression of the de-noised signal as a function of the pure signal. The good linear relationship demonstrates that the signals have a high degree of similarity. Meanwhile, the SNR and mean square error (MSE) were calculated (Figure 9). A better result was obtained after removing the first IMF.
From the above discussion, we can conclude that the method works well for both on-line and off-line wavelength echo signals. However, evaluating the log ratio of the two de-noised signals is more important because it is crucial to the inversion precision. The inversion formula for the CO2 concentration, Equation (11), follows [29,30]; in it, NCO2 represents the concentration of CO2, σ is the absorption cross-section of CO2, and ΔR is the range resolution.
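As a hedged illustration, the standard two-wavelength DIAL relation can be written as follows; it may differ in notation or detail from the formula of [29,30]. The symbols P_on and P_off (received on-line and off-line signals) and R_1, R_2 = R_1 + ΔR (adjacent range gates) are assumptions introduced here, and σ is treated as the differential (on-line minus off-line) absorption cross-section.

```latex
N_{\mathrm{CO_2}} \approx \frac{1}{2\,\sigma\,\Delta R}
\ln\!\left[\frac{P_{\mathrm{on}}(R_1)\,P_{\mathrm{off}}(R_2)}
               {P_{\mathrm{on}}(R_2)\,P_{\mathrm{off}}(R_1)}\right],
\qquad R_2 = R_1 + \Delta R
```

In this form the logarithm equals the change of ln(P_off/P_on) over ΔR, so noise that perturbs either signal enters the retrieved concentration directly.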
In this paper, DAOD refers to the differential absorption optical depth between the on-line and off-line wavelengths in the CO2-DIAL system, which represents the difference between the two laser echo signals caused by CO2 absorption. Equation (11) can then be rewritten in terms of the DAOD as Equation (12).
Figure 10 shows the DAOD of the pure, original, and de-noised signals; the data quality improves significantly after de-noising, and the linear fit is better. The r-square values listed in Table 3 are consistent with this. According to Equation (12), the slope of the DAOD is the ratio of the DAOD to ΔR, which is important for the inversion of the CO2 concentration. Comparing the slopes of the three data types shows that the slope of the de-noised signal is closer to the expected value, which demonstrates the validity of the de-noising method.
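The evaluation by linear fitting can be sketched as follows. The sketch assumes DAOD(R) = ln(P_off(R)/P_on(R)), i.e., the two-way log ratio without a factor of 1/2, and a concentration N = slope/(2σ) with σ the differential cross-section; these conventions and all function names are assumptions and may differ from Equation (12).

```python
import math


def daod(p_on, p_off):
    """Log ratio of off-line to on-line signal at each range gate.

    Raises ValueError if a ratio is not positive, which can happen for
    noisy signals that cross zero.
    """
    return [math.log(off / on) for on, off in zip(p_on, p_off)]


def linear_fit(x, y):
    """Ordinary least squares; returns (slope, intercept, r_square)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def number_density(slope, sigma):
    """N = slope / (2 * sigma) under the DAOD convention assumed above."""
    return slope / (2.0 * sigma)
```

For a uniform CO2 concentration the DAOD grows linearly with range, which is why both the slope and the r-square of the fit serve as quality indicators.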
The above simulation results indicate that the method performed well in the range from 300 to 1500 m, where the influence of noise is relatively small. Further analysis was needed to assess the capability of the de-noising method under stronger noise. We repeated the same steps for the range from 1500 to 3000 m, which is more readily affected by noise. Seven IMFs were obtained after EEMD, and we reconstructed the signal after removing the first four IMFs, which did not meet the criterion (Table 4). The de-noised signal was consistent with the pure signal (Figure 11). Therefore, we can deduce that the de-noising process destroys little useful information in the original signal. The highest SNR and the lowest MSE were obtained when the first four IMFs were removed (Figure 12). These results suggest that the de-noising method works well for single signals.
Because of the influence of noise, the intensities of the on-line and off-line wavelength signals fluctuated strongly; the resulting large changes in the log ratio of the two signals produced many negative values (Figure 13, left). These values impaired the validity and credibility of the linear fitting and led to a poor result. Nevertheless, a good result was obtained after implementing the de-noising method (Figure 13, right). The figure shows that no negative value remained, which illustrates that the errors of the two echo signals were greatly reduced. Because the DAOD is the log ratio of two signals that are both substantially smoothed by de-noising, some smooth fluctuations in the DAOD are to be expected. Moreover, the slope of the de-noised data was closer to the slope of the pure signal than that of the original signal (Table 4), which improved the inversion precision.
To further demonstrate the advantage of the proposed method, we compared it with several other de-noising methods (Figure 14). The multi-pulse average method contributed little to the result. Compared with the EEMD method, the EMD and WT methods did not sufficiently improve the performance. The data listed in Table 5 indicate that the EEMD method performs better than the other methods.
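For reference, the multi-pulse average baseline amounts to a point-by-point mean over profiles, as sketched below. The function name is hypothetical. Assuming uncorrelated noise, averaging N profiles lowers the noise standard deviation by a factor of √N, so a small N (for example, three profiles) gives only a modest gain.

```python
def multi_pulse_average(profiles):
    """Point-by-point mean of several equal-length lidar profiles."""
    n = len(profiles)
    # zip(*profiles) groups the samples that share the same range gate.
    return [sum(samples) / n for samples in zip(*profiles)]
```

The same averaging also blends any real change between pulses into the result, which is the smoothing drawback noted in the Introduction.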
4. Observed Signal Analysis
The CO2-DIAL echo signal for analysis was collected on 29 December 2015 under relatively stable atmospheric conditions without abrupt changes. Thus, we assumed that the signals collected within a 30 s window carry the same useful information and that any difference among them is due to noise. Three signal pairs sequentially collected within 30 s were selected as our research objects; each pair includes on-line and off-line wavelength echo signals.
In our CO2-DIAL system, two acquisition modes were used, namely, photon-counting and analog modes. The former has a better SNR, and the latter compensates for the former's limitation in collecting the near-range echo signal. We chose the photon-counting signals of the three signal pairs in the range from 1100 to 3000 m for presentation and named them T1, T2, and T3. In particular, we only de-noised T2; T1 and T3 served as reference signals.
Figure 15 shows the IMFs of the off-line wavelength echo signals of T1, T2, and T3. We calculated the correlation coefficients of the IMFs with the same temporal scale. Table 6 shows that the correlation coefficients of the first two IMFs are lower than 0.5, whereas those of the remaining IMFs, which are not listed, are higher than 0.5. Thus, we removed the first two IMFs and reconstructed the signal from the remaining components.
Figure 16 shows a good de-noising result: the de-noised signal became smoother while retaining the characteristics of the original signal. To validate the effect of the de-noising method on the observed signal, we calculated the DAOD of the original and the de-noised signals. A significant improvement was obtained. The DAOD of the original signal showed negative values and large scatter (Figure 17). In contrast, the de-noised signal had good linearity, and the r-square after linear fitting was 0.31 higher than that of the original signal (Table 7). Moreover, the interference of the negative values was eliminated, which enhances the data reliability. Meanwhile, the slope of the DAOD improved to some extent.
5. Conclusions
For the CO2-DIAL, CO2 inversion is an important process. However, noise severely limits the inversion precision. In this paper, we used the EEMD algorithm to de-noise the echo signals of the ground-based CO2-DIAL system. The number of IMFs to be removed was determined by applying a threshold to the correlation coefficients of the same-scale components of multiple adjacent signals decomposed by EEMD. This method removes the noise component while effectively retaining the useful information. We decomposed three adjacent signal pairs into IMFs using EEMD and removed the IMFs according to the correlation coefficients of the components with the same temporal scale. Then, we reconstructed the signal from the remaining components to complete the de-noising.
To verify the feasibility of the method, we analyzed the simulated CO2-DIAL signal. Three similar simulated signal groups with different SNRs were selected as our research objects; they can be considered as observed signals collected within a short time in stable atmospheric conditions. We segmented the signals into two parts to verify the performance of the method under different signal intensities. First, the de-noised signal was compared with the pure signal. The good linear relationship between them demonstrates that the signals have a high degree of similarity. Meanwhile, the SNR and MSE were calculated to demonstrate the effectiveness of the proposed method. Then, the DAOD of the de-noised signal was calculated. The r-square of the DAOD reaches 0.841 and 0.835 for the ranges of 300–1500 m and 1500–3000 m, respectively, indicating high data quality, which is conducive to the high-precision inversion of CO2. To further demonstrate the advantage of the proposed method, we also compared it with several other de-noising methods. The results show that our method performs better than the other methods in CO2-DIAL de-noising. Finally, the method was applied to de-noise the observed CO2-DIAL signal. The de-noised signal became smoother while retaining the characteristics of the original signal. In contrast to the original signal, the de-noised signal had good linearity, and the r-square of the DAOD improved from 0.444 for the original signal to 0.755 for the de-noised signal, which is beneficial to the inversion precision. This work has some shortcomings. For example, the threshold for the correlation coefficients was determined by many experiments but lacks a theoretical foundation. In addition, we discussed the effect of the method only under stable atmospheric conditions, and other cases were not considered. These deficiencies require further research, and the proposed method requires continued improvement.