Enhancing Significant Wave Height Retrieval with FY-3E GNSS-R Data: A Comparative Analysis of Deep Learning Models
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This study made a comparative analysis of different deep learning models on SWH retrievals using FY-3E data, and found that the ViT-Wave model performs better than other models when the SWH is less than 8 m. Moreover, it is the first time that FY-3E data have been used for SWH retrievals. Therefore, I recommend publication of the manuscript after some modifications.
1. In Line 273-274, how to judge outliers and how to remove these outliers should be clearly explained.
2. In Line 443-445, Tab.3 and 4 did not show the ViT-Wave model’s limitations under extreme conditions.
3. In Line 456-458, since the light color represents high SWH, the color in Fig.13 should be darker than in Fig.12, indicating that the ViT-model underestimates the SWH compared to ERA5 data.
4. The reason that ViT-Wave model is superior to others should be speculated in the result analysis part.
5. In Line 90-92, the authors declare that the models used for SWH are often less sophisticated, and the amount of training data is relatively sparse. The sentence is very subjective, and related references should be added to support this conclusion.
6. In Line 188, the number of Table [] should be added.
Author Response
Comments 1:
In Line 273-274, how to judge outliers and how to remove these outliers should be clearly explained.
Response 1:
We agree with this comment. An explanation has been added to the article, as follows:
The data preprocessing stage involved meticulous steps to ensure the completeness and reliability of the dataset utilized in subsequent analyses. The process began with a thorough data cleaning operation, where outliers and missing values, such as NaN, were identified and removed. Common outliers included data points that exceeded typical geographic coordinates (latitude and longitude) and unusual fill values such as -9999 and infinity (inf). Moreover, entries associated with these outliers and missing values, including corresponding data points and labels, were systematically eliminated to prevent any distortion of results and degradation of model performance.
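The cleaning step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `clean_samples`, the array layout (one row of features per sample), and the longitude range of -180 to 180 degrees are assumptions; only the fill value -9999, NaN, and inf are named in the response.

```python
import numpy as np

# Fill value named in the response; any other fill values would be an assumption.
FILL_VALUES = (-9999.0,)


def clean_samples(lat, lon, features, labels):
    """Drop samples with invalid coordinates, fill values, NaN, or inf.

    Hypothetical helper: `features` is assumed to be a 2-D array with one
    row per sample, and `labels` a 1-D array of reference SWH values.
    """
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels, dtype=float)

    # Coordinates must be finite and inside the usual geographic ranges
    # (longitude assumed to be given in [-180, 180]).
    valid = np.isfinite(lat) & np.isfinite(lon)
    valid &= (np.abs(lat) <= 90.0) & (np.abs(lon) <= 180.0)

    # Features and labels must be finite (no NaN or inf) ...
    finite = np.isfinite(features).all(axis=1) & np.isfinite(labels)
    # ... and must not contain a fill value.
    not_fill = ~np.isin(features, FILL_VALUES).any(axis=1)
    not_fill &= ~np.isin(labels, FILL_VALUES)

    # Remove the whole sample (data point and label) if any check fails.
    keep = valid & finite & not_fill
    return lat[keep], lon[keep], features[keep], labels[keep]
```

Removing the whole sample, rather than only the offending value, keeps each data point aligned with its label, which matches the response's statement that corresponding data points and labels were eliminated together.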
Comments 2:
In Line 443-445, Tab.3 and 4 did not show the ViT-Wave model’s limitations under extreme conditions.
Response 2:
Thank you for pointing this out. We agree with this comment. Tables 3 and 4 do not reflect the model’s limitations under extreme conditions, so the relevant statements have been deleted.
Comments 3:
In Line 456-458, since the light color represents high SWH, the color in Fig.13 should be darker than in Fig.12, indicating that the ViT-model underestimates the SWH compared to ERA5 data.
Response 3:
Thank you for pointing this out. We agree with this comment. The color in Fig.13 is darker than in Fig.12, and the manuscript has been revised accordingly.
Comments 4:
The reason that ViT-Wave model is superior to others should be speculated in the result analysis part.
Response 4:
We agree with this comment. The relevant analyses were added to the Discussion section of the paper.
Comments 5:
In Line 90-92, the authors declare that the models used for SWH are often less sophisticated, and the amount of training data is relatively sparse. The sentence is very subjective, and related references should be added to support this conclusion.
Response 5:
We agree with this comment. The statement has been reworded more objectively and relevant citations have been added. The changes are as follows:
It has been noted in several studies that the models employed for SWH often exhibit less sophistication and the datasets used are relatively sparse.
Comments 6:
In Line 188, the number of Table [] should be added.
Response 6:
Thank you for pointing this out. We agree with this comment. Modified as requested; the table number (Table 1) has been added.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This work compares several deep-learning models for retrieving significant wave height (SWH) from GNSS-R data. The paper is clear and well-structured, but there are several issues regarding background knowledge that need to be addressed before acceptance:
- L29: The altimeter has been extremely successful in SWH measurement, and its data processing is not particularly complex (atmospheric correction is not a significant factor in SWH measurement). There's no need to downplay the role of altimeters to emphasize the importance of GNSS-R in SWH measurements. New data sources like GNSS-R are valuable in their own right.
- L33: The coverage issue you mention also applies to GNSS-R. Additionally, the citation provided is not appropriate as it does not pertain to wave measurement. A more relevant reference for the coverage limitations of altimeters would be Jiang et al. (2020), "Evaluation of altimeter undersampling in estimating global wind and wave climate using virtual observation," Remote Sensing of Environment.
- L39: What do you mean by "human error"? Are you referring to Voluntary Observing Ship (VOS) data? If so, it's more accurate to describe them as poor quality rather than attributing errors to human factors.
- L47-49: Is this assertion accurate? I have serious doubts. If possible, consider including a plot of 1-day GNSS-R coverage in Figure 1 to clarify this point.
- L71: The term "roughness" is more closely related to wind speed rather than SWH.
- L78-79: The citation here is incorrect. Peter Janssen did not conduct the work you are referencing. If you are referring to Stopa's work, the correct citation should be Quach et al. (2020), "Deep Learning for Predicting Significant Wave Height From Synthetic Aperture Radar," IEEE Transactions on Geoscience and Remote Sensing.
- L104: Clarify the mention of CFOSAT. Is this reference correct?
- L126: Avoid using "First" as it suggests this is something special, which may not be the case.
- Figure 1: Specify what the observation represents and the time period it covers.
- L155-167: The tone and style of this section are not appropriate for a scientific paper. Please revise for clarity and formality.
- L168: ERA5 SWH data has known limitations and should not be used as a primary reference. While it's acceptable for wave climate studies, for algorithm validation, it's better to use altimeter or buoy SWH data (as these are assimilated into ERA5 to improve accuracy).
- L188: There seems to be a placeholder or an error here. Please clarify.
- Scripts and Code: No scripts or code snippets are provided. To ensure replicability, please upload these to a suitable repository.
- Section 3.1: Why apply Batch Normalization to a fully connected neural network instead of using direct normalization?
- Section 4.2: Given the simplicity of the models, there's no need to emphasize the platform and computational resources used.
- Section 4.4: The discussion is overly verbose for basic metrics. Please condense this section.
- Figures 12, 13, and 14: The current figures are not informative. Consider drawing the distributions of RMSE, correlation (R), and bias across different grid points (e.g., in a 2x2 format) for better visualization.
- L492: Note that an RMSE of 0.4 meters is not indicative of good performance for global SWH estimation.
The English must be improved; in addition, the writing of this paper does not strictly adhere to academic conventions.
Author Response
Comments 1:
L29: The altimeter has been extremely successful in SWH measurement, and its data processing is not particularly complex (atmospheric correction is not a significant factor in SWH measurement). There's no need to downplay the role of altimeters to emphasize the importance of GNSS-R in SWH measurements. New data sources like GNSS-R are valuable in their own right.
Response 1:
Thank you for pointing this out. We agree with this comment. We have deleted the relevant parts of the original text and emphasized the role of GNSS-R.
Comments 2:
L33: The coverage issue you mention also applies to GNSS-R. Additionally, the citation provided is not appropriate as it does not pertain to wave measurement. A more relevant reference for the coverage limitations of altimeters would be Jiang et al. (2020), "Evaluation of altimeter undersampling in estimating global wind and wave climate using virtual observation," Remote Sensing of Environment.
Response 2:
Thank you for pointing this out. We agree with this comment. Reference has been replaced: Jiang H. Evaluation of altimeter undersampling in estimating global wind and wave climate using virtual observation[J]. Remote Sensing of Environment, 2020, 245: 111840.
Comments 3:
L39: What do you mean by "human error"? Are you referring to Voluntary Observing Ship (VOS) data? If so, it's more accurate to describe them as poor quality rather than attributing errors to human factors.
Response 3:
Thank you for pointing this out. We agree with this comment. We have corrected the original statement and changed it to: Ship-based observations, while valuable for direct measurements, are infrequent, geographically constrained, and costly, often restricted to specific routes or missions. In the context of Voluntary Observing Ship (VOS) data, the quality of data can be suboptimal, which may stem from the inherent limitations and challenges of the data collection process itself.
Comments 4:
L47-49: Is this assertion accurate? I have serious doubts. If possible, consider including a plot of 1-day GNSS-R coverage in Figure 1 to clarify this point.
Response 4:
Thank you for pointing this out. Figure 1 illustrates the trajectory coverage map of all data collected from the FY-3E satellite over a one-month period. Currently, the FY-3E observation system comprises only one satellite. There are additional satellite projects planned which, upon successful network formation, will enable repeated observations within a single day. At present, the same technological approach is employed by the CYGNSS satellite system, which can revisit the same location with a revisit cycle of less than 7 hours. This capability is detailed in the CYGNSS handbook (Ruf, Chris. "CYGNSS handbook." 2022).
The figure in the attached Word document is a schematic diagram of the observation points of the FY-3E satellite in one day.
Comments 5:
L71: The term "roughness" is more closely related to wind speed rather than SWH.
Response 5:
Thank you for pointing this out. We agree with this comment. We have replaced the term "roughness" with "sea conditions".
Comments 6:
L78-79: The citation here is incorrect. Peter Janssen did not conduct the work you are referencing. If you are referring to Stopa's work, the correct citation should be Quach et al. (2020), "Deep Learning for Predicting Significant Wave Height From Synthetic Aperture Radar," IEEE Transactions on Geoscience and Remote Sensing.
Response 6:
Thank you for pointing this out. We agree with this comment. We replaced the reference as follows: Quach B, Glaser Y, Stopa J E, et al. Deep learning for predicting significant wave height from synthetic aperture radar[J]. IEEE Transactions on Geoscience and Remote Sensing, 2020, 59(3): 1859-1867.
Comments 7:
L104: Clarify the mention of CFOSAT. Is this reference correct?
Response 7:
Thank you for pointing this out. CFOSAT is the China-France Oceanography Satellite.
Related reference: D. Hauser, D. Xiaolong, L. Aouf, C. Tison and P. Castillan, "Overview of the CFOSAT mission," 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 2016, pp. 5789-5792.
Comments 8:
L126: Avoid using "First" as it suggests this is something special, which may not be the case.
Response 8:
Thank you for pointing this out. We agree with this comment. We replaced the statement with "Pioneering Application of FY-3E GNOS Data in SWH Retrieval".
Comments 9:
Figure 1: Specify what the observation represents and the time period it covers.
Response 9:
Thank you for pointing this out. We agree with this comment. Figure 1 illustrates the trajectory coverage map of all data collected from the FY-3E satellite over a one-month period. We provide explanations in the figure title.
Comments 10:
L155-167: The tone and style of this section are not appropriate for a scientific paper. Please revise for clarity and formality.
Response 10:
Thank you for pointing this out. We agree with this comment. We rewrote this paragraph and added a table.
Comments 11:
L168: ERA5 SWH data has known limitations and should not be used as a primary reference. While it's acceptable for wave climate studies, for algorithm validation, it's better to use altimeter or buoy SWH data (as these are assimilated into ERA5 to improve accuracy).
Response 11:
Thank you for pointing this out. In response to concerns regarding data selection, it is important to clarify why ERA5 reanalysis data was chosen as the primary dataset for our experiments. Unlike other available datasets, ERA5 offers a significantly large volume of data, which is crucial for the robustness and generalizability of our study. In previous attempts to align our experimental setup with buoy and altimeter data, the volume of matched data proved insufficient for rigorous experimental validation. This issue is not unique to our study; as evidenced in the literature cited in Table 4, many researchers in the field have similarly relied on ERA5 as the experimental label due to its comprehensive coverage and high resolution.
Comments 12:
L188: There seems to be a placeholder or an error here. Please clarify.
Response 12:
Thank you for pointing this out. We agree with this comment. We have corrected it.
Comments 13:
Scripts and Code: No scripts or code snippets are provided. To ensure replicability, please upload these to a suitable repository.
Response 13:
Thank you for pointing this out. We agree with this comment. To help readers better understand the model, the relevant code has been placed on GitHub. The link is given in the Data Availability Statement at the end of the article.
Comments 14:
Section 3.1: Why apply Batch Normalization to a fully connected neural network instead of using direct normalization?
Response 14:
Thank you for pointing this out. In the context of training convolutional neural networks (CNNs), Batch Normalization (BN) is generally preferred over Layer Normalization (LN). While LN standardizes the inputs across features within a single training example and is quite effective for models processing individual data points independently, BN exhibits superior performance in CNNs. This distinction arises because BN normalizes across the entire mini-batch, effectively reducing internal covariate shift, which is particularly beneficial in layered network architectures like CNNs where feature scales can vary significantly due to the hierarchical structure of the network.
Lei Ba J, Kiros J R, Hinton G E. Layer normalization[J].
Ioffe S. Batch normalization: Accelerating deep network training by reducing internal covariate shift[J].
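The distinction drawn in this response, BN normalizing each feature across the mini-batch and LN normalizing each sample across its own features, can be illustrated with a short NumPy sketch. It is a simplified illustration only: the learnable scale and shift parameters and the running statistics that real BN layers use at inference time are omitted, and the input values are made up.

```python
import numpy as np


def batch_norm(x, eps=1e-5):
    """Normalize each feature (column) over the mini-batch axis."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def layer_norm(x, eps=1e-5):
    """Normalize each sample (row) over its own features."""
    mean = x.mean(axis=1, keepdims=True)
    var = x.var(axis=1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


# Illustrative mini-batch: 4 samples, 3 features with very different scales.
x = np.array([
    [1.0, 200.0, 3.0],
    [2.0, 100.0, 5.0],
    [3.0, 300.0, 4.0],
    [4.0, 400.0, 8.0],
])

bn = batch_norm(x)  # every column now has zero mean across the batch
ln = layer_norm(x)  # every row now has zero mean across its features
```

For inputs to a fully connected network, where each column is a distinct physical quantity, the per-column behaviour of `batch_norm` resembles standardizing each input variable, which is relevant to the reviewer's question about direct normalization.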
Comments 15:
Section 4.2: Given the simplicity of the models, there's no need to emphasize the platform and computational resources used.
Response 15:
Thank you for pointing this out. We agree with this comment. We deleted this paragraph.
Comments 16:
Section 4.4: The discussion is overly verbose for basic metrics. Please condense this section.
Response 16:
Thank you for pointing this out. We agree with this comment. We have shortened this section.
Comments 17:
Figures 12, 13, and 14: The current figures are not informative. Consider drawing the distributions of RMSE, correlation (R), and bias across different grid points (e.g., in a 2x2 format) for better visualization.
Response 17:
Thank you for pointing this out. In our experimental analysis, the overall error evaluation was conducted using Root Mean Square Error (RMSE) and correlation (R). These metrics were selected because they provide a comprehensive assessment of model performance across the entire dataset, rather than focusing on single-point evaluations. For the Mean Absolute Error (MAE) and bias metrics, the data points are numerous and dispersed across the globe, so creating a visual representation that clearly depicts the error distribution can be challenging. Such visualizations can become cluttered and may not effectively communicate the geographic distribution of errors. Global distribution maps of bias for the five models are provided in the attached Word document.
Comments 18:
L492: Note that an RMSE of 0.4 meters is not indicative of good performance for global SWH estimation.
Response 18:
Thank you for pointing this out. We agree with this comment. The relevant statements have been rephrased and reorganized.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The paper pioneers the use of FY-3E satellite's GNSS-R data for Significant Wave Height (SWH) retrieval, offering a significant contribution to oceanographic research. The integration of Vision Transformer (ViT) technology with GNSS-R data is a commendable effort that demonstrates the potential of employing advanced deep learning models to enhance the accuracy of SWH retrieval. The detailed comparative analysis of various deep learning models, such as ANN-Wave, CNN-Wave, Hybrid-Wave, Trans-Wave, and ViT-Wave, provides valuable insights into their efficacy. The reviewer has the following comments.
1. The paper would benefit from a more comprehensive literature review that includes earlier works on deep learning-based GNSS-R remote sensing and other applications using FY-3E data. Including these references could provide a more robust context for the study and acknowledge foundational research in the field.
2. A suggestion for improvement is to ensure there is a space before parentheses and brackets to improve the paper's readability.
3. Table 3, did the author mis-mark the best Bias?
4. The complexity of the deep learning models may be challenging for those without a strong background in machine learning.
5. The error analysis could benefit from a deeper exploration into the causes of discrepancies and potential mitigation strategies.
Author Response
Comments 1:
The paper would benefit from a more comprehensive literature review that includes earlier works on deep learning-based GNSS-R remote sensing and other applications using FY-3E data. Including these references could provide a more robust context for the study and acknowledge foundational research in the field.
Response 1:
Thank you for pointing this out. We agree with this comment. We have added more relevant literature in related fields. They are as follows:
Yan Q, Chen Y, Jin S, Liu S, Jia Y, Zhen Y, Chen T, Huang W. Inland water mapping based on GA-LinkNet from CYGNSS data. IEEE Geoscience and Remote Sensing Letters.
Xie Y, Yan Q. Stand-alone retrieval of sea ice thickness from FY-3E GNOS-R data. IEEE Geoscience and Remote Sensing Letters.
Comments 2:
A suggestion for improvement is to ensure there is a space before parentheses and brackets to improve the paper's readability.
Response 2:
Thank you for pointing this out. We agree with this comment. Spaces have been added before parentheses and brackets throughout the text.
Comments 3:
Table 3, did the author mis-mark the best Bias?
Response 3:
Thank you for pointing this out. We agree with this comment. We have marked the best Bias -0.0012 in the text.
Comments 4:
The complexity of the deep learning models may be challenging for those without a strong background in machine learning.
Response 4:
We agree with this comment. To help readers better understand the model, the relevant code has been placed on GitHub. The link is given in the Data Availability Statement at the end of the article.
Comments 5:
The error analysis could benefit from a deeper exploration into the causes of discrepancies and potential mitigation strategies.
Response 5:
We agree with this comment. We have conducted an error analysis following relevant methods; the details are in the Discussion section.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
On Response 4:
So GNSS satellites do not provide a better coverage than altimeters (it is not wide swath). Remove the sentence.
On Response 8:
Remove "Pioneering "
On Response 11:
As I mentioned, ERA5 should not be regarded as the "ground truth". The NDBC buoy data is sufficient for a rigorous validation of SWH. There is no need to argue this with the reviewer. Since you have made attempts to align your experimental setup with buoy data, just show the results. I have never seen studies use ERA5 SWH to evaluate high-accuracy wave measurements (e.g., altimeters, buoys).
On Response 14:
You are using BN in fully-connected NN, not CNN...That's why I think it is strange.
On Response 17:
Reconsider this comment. Figures 12, 13, and 14 are not informative and are meaningless for readers. Also, Figure 14 seems to show residuals instead of bias. There is nothing "challenging" at all about drawing these error metrics on a 2deg by 2deg or 3deg by 3deg grid...
Comments on the Quality of English Language
Author Response
Comments 1:
On Response 4:
So GNSS satellites do not provide a better coverage than altimeters (it is not wide swath). Remove the sentence.
Response 1:
Thank you for pointing this out. We agree with this comment. The sentence has been removed.
Comments 2:
On Response 8:
Remove "Pioneering "
Response 2:
Thank you for pointing this out. We agree with this comment. "Pioneering " has been removed.
Comments 3:
On Response 11:
As I mentioned, ERA5 should not be regarded as the "ground truth". The NDBC buoy data is sufficient for a rigorous validation of SWH. There is no need to argue this with the reviewer. Since you have made attempts to align your experimental setup with buoy data, just show the results. I have never seen studies use ERA5 SWH to evaluate high-accuracy wave measurements (e.g., altimeters, buoys).
Response 3:
Thank you for pointing this out. In our previous attempts to integrate buoy data for machine learning applications, we collected measurements from 255 buoys, numbered from 41001 to 46279. However, we were only able to match data from 90 points in the same month, which is insufficient for machine learning analysis due to the limited amount of data. Therefore, we adopted ERA5 as training labels, which is a practice used in significant wave height (SWH) inversion studies, as evidenced by several publications in this field. In our experiments, we successfully matched nearly 100,000 data points using ERA5, enabling a robust experimental analysis. If we obtain a sufficient number of buoy data in the future, we plan to conduct further experiments to validate and improve our model.
Wang, F.; Yang, D.; Yang, L. Retrieval and assessment of significant wave height from CYGNSS mission using neural network. Remote Sensing 2022, 14(15), 3666.
Bu, J.; Yu, K.; Ni, J.; et al. Combining ERA5 data and CYGNSS observations for the joint retrieval of global significant wave height of ocean swell and wind wave: a deep convolutional neural network approach. Journal of Geodesy 2023, 97(8), 81.
Bu, J.; Yu, K. Significant wave height retrieval method based on spaceborne GNSS reflectometry. IEEE Geoscience and Remote Sensing Letters 2022, 19, 1-5.
Comments 4:
On Response 14:
You are using BN in fully-connected NN, not CNN...That's why I think it is strange.
Response 4:
Apologies for the oversight in addressing your query regarding our choice of Batch Normalization (BN) for fully-connected neural networks. We opted for BN over Layer Normalization (LN) after experimental comparisons showed similar outcomes with a slight edge for BN in terms of training efficiency.
Comments 5:
On Response 17:
Reconsider this comment. Figures 12, 13, and 14 are not informative and are meaningless for readers. Also, Figure 14 seems to show residuals instead of bias. There is nothing "challenging" at all about drawing these error metrics on a 2deg by 2deg or 3deg by 3deg grid...
Response 5:
Thank you for pointing this out. We agree with this comment. We apologize for misunderstanding the reviewer's comment. We have updated the figures in the article to use a 3deg by 3deg grid.
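The gridded error maps requested by the reviewer can be sketched as below. This is a hedged illustration rather than the authors' plotting code: the function name `gridded_errors`, the convention of longitude in [-180, 180], and the definition of error as retrieval minus reference are assumptions. Per-cell correlation (R) is left out for brevity.

```python
import numpy as np


def gridded_errors(lat, lon, pred, ref, cell_deg=3.0):
    """Return per-cell bias, RMSE, and sample count on a regular lat/lon grid.

    Hypothetical helper: errors are defined as pred - ref, and longitude is
    assumed to lie in [-180, 180].
    """
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    err = np.asarray(pred, dtype=float) - np.asarray(ref, dtype=float)

    n_lat = int(round(180.0 / cell_deg))
    n_lon = int(round(360.0 / cell_deg))

    # Map each sample to a cell index; clip so lat=90 or lon=180 stay in range.
    i = np.clip(((lat + 90.0) // cell_deg).astype(int), 0, n_lat - 1)
    j = np.clip(((lon + 180.0) // cell_deg).astype(int), 0, n_lon - 1)
    flat = i * n_lon + j

    # Accumulate count, sum of errors, and sum of squared errors per cell.
    n_cells = n_lat * n_lon
    count = np.bincount(flat, minlength=n_cells).astype(float)
    sum_err = np.bincount(flat, weights=err, minlength=n_cells)
    sum_sq = np.bincount(flat, weights=err ** 2, minlength=n_cells)

    # Cells without samples become NaN so they can be masked when plotting.
    with np.errstate(invalid="ignore", divide="ignore"):
        bias = sum_err / count
        rmse = np.sqrt(sum_sq / count)

    shape = (n_lat, n_lon)
    return bias.reshape(shape), rmse.reshape(shape), count.reshape(shape)
```

The returned arrays have shape (60, 120) for a 3deg grid and can be passed to any map-plotting routine; the sample count per cell is returned as well so that sparsely sampled cells can be filtered before interpretation.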
Reviewer 3 Report
Comments and Suggestions for Authors
After checking the format and references, this paper can be accepted.
Author Response
Thank you for pointing this out. We agree with this comment.