3.1. Regional Sampling Variation
An understanding of the quality of the sea state observations is an essential part of any data analysis, providing insight into altimeter performance close to the coast and offshore. This is particularly important for interpreting the results of extreme analyses, where sampling of rare (extreme) events can be problematic. Thus, we investigate both the quality of the data (for all missions) as a function of distance to the coast, and the annual sampling characteristics of data within a 50 km radius of each buoy. The six regions (see
Figure 1) representing the positions of pairs of buoys are examined. The distance of the nearshore buoy is marked on the figures for reference. The sampling characteristics with distance to the coast for region #1 are shown in
Figure 2. We focus primarily on the “Good” data (qual_flag = 3, solid blue and red lines). Be aware that the total number of observations, and their distribution with distance to coast, is a function of our circular sampling area which itself depends on the exact position of the buoy in question. Therefore comparisons between regions should be made carefully, but we advise that the percentage of good data (solid red line) is a more site independent measure in this regard. We calculate the distance to the coast by applying the “great-circle” distance based on the high resolution dataset, Global Self-consistent Hierarchical High-resolution Geography, GSHHG [
26]. The high-resolution GSHHG dataset represents a reduction of about 80% in size and quality relative to the original (full) resolution data, which has a resolution of 0.1 km
(
https://www.soest.hawaii.edu/pwessel/gshhg/).
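As an illustration of the distance calculation, the following minimal Python sketch computes a great-circle distance and takes the minimum over a set of coastline vertices. The haversine formula, the mean Earth radius of 6371 km, the function names and the `coast_points` list (standing in for vertices read from GSHHG) are all assumptions made for illustration; the text does not specify how the great-circle distance was implemented.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance (km) between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1 to guard against rounding slightly above the valid asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def distance_to_coast_km(lat, lon, coast_points):
    """Smallest great-circle distance from (lat, lon) to any coastline vertex.

    coast_points is a hypothetical list of (lat, lon) tuples; reading the
    GSHHG files themselves is not shown here.
    """
    return min(great_circle_km(lat, lon, clat, clon) for clat, clon in coast_points)
```

A brute-force minimum over all vertices is adequate for a single buoy region; a spatial index would be a natural refinement for larger domains.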
There is a marked difference in data quality and abundance in the first decade of the record (
Figure 2a) compared with the most recent decade (
Figure 2b). In particular, the total number of good data points (solid blue line) is substantially lower in the first period but good data also represents a much lower percentage of the total data (solid red line). While this percentage rises with increasing distance to the coast, it remains below 40% up to 20 km and remains around 50% thereafter. This contrasts considerably with more recent years where, at even less than 10 km to the coast, good data exceeds 50% and is typically closer to 80%. The proportions of data associated with rejection flags are also shown (dotted and dashed lines) and these clearly correspond to the remaining data that is not marked as “Good” (qual_flag = 1, 2). Note that more than one rejection flag may be assigned to a single measurement. Approaching the coast the “waveform_validity” flag (green dotted line) appears to be the dominant flag but this is far more pronounced in the earlier part of the record (panel a). While we do not describe in detail the causes for the designation of rejection flags, we note that they are somewhat mission specific and inference for their application requires a thorough case-by-case examination that we do not conduct here.
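The percentage of good data as a function of distance to the coast can be sketched as follows. This is a minimal illustration only: the record layout `(distance_km, qual_flag)`, the 5 km bin width and the function name are assumptions, and only the convention qual_flag = 3 for "Good" data is taken from the text.

```python
from collections import defaultdict


def good_fraction_by_distance(records, bin_width_km=5.0, good_flag=3):
    """Count total and good observations per distance-to-coast bin.

    records: iterable of (distance_km, qual_flag) tuples (assumed layout).
    Returns {bin_start_km: (n_good, n_total, percent_good)}.
    """
    totals = defaultdict(int)
    goods = defaultdict(int)
    for dist, flag in records:
        b = int(dist // bin_width_km)
        totals[b] += 1
        if flag == good_flag:
            goods[b] += 1
    return {
        b * bin_width_km: (goods[b], totals[b], 100.0 * goods[b] / totals[b])
        for b in sorted(totals)
    }
```

Reporting the percentage alongside the raw counts mirrors the distinction made above between the total number of good observations and the more site-independent percentage of good data.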
The overall general pattern, with an increase in total observations and a considerable improvement in good data percentage in more recent years, applies to all other regions (#2 to #6,
Figure 3 and
Figures S1, S3, S5 and S7). This phenomenon clearly corresponds to the changes in missions from fewer earlier spacecraft and instruments with typically lower reliability, to more numerous and higher quality platforms more recently. Across all regions, however, there is considerable variability in data quality and rejection flagging with distance to the coast. This arises from the different combinations of satellite tracks that happen to intersect the particular location. In region #3 (U.S. east coast, near buoy 41113,
Figure S1), for example, a large increase in observations is seen at approximately 35 km. This likely corresponds to an orbital track from a specific mission. A close examination of the satellite tracks contributing to this data in the latter decade, shown in
Figure 4, suggests this is likely attributable to a combination of Jason-1, Jason-2, Jason-3 and SARAL.
The unique and varied properties of different geographic locations are a function of many variables. Firstly, owing to the orbital paths of the spacecraft, sampling density increases at higher latitudes. As a result, at lower latitudes, there is a higher probability that a given site (of the same spatial extent) will not intersect the path of a given mission. Conversely, some sites may lie close to the intersection of tracks from multiple missions that then provide an abundance of observations. As can be seen in some locations (see e.g.,
Figures S1 and S7), numbers of observations can vary by over an order of magnitude with both distance to coast and sampling period. Note also that altimeters may occasionally change orbits, change instrument mode and go on- or off-line at various times.
Detailed temporal properties of altimeter sampling are also of interest. Variability in total annual “Good” Hs measurements by mission is shown for the various buoy locations in regions #1 and #2 in
Figure 5 and
Figure 6 respectively. The coloured bars, which correspond to the different missions that contribute measurements, clearly show the missions entering and leaving service. However, there is substantial variability even where the same missions contribute to successive years. In both regions (#1 and #2), there is a clear increase in data at the offshore buoy location, largely because the sampling radius (50 km) is not partially over land, as is the case for the nearshore location. While this is somewhat obvious, it raises questions about how the ocean surface area should, or could, be geographically sampled and aggregated when focusing on the nearshore. Clearly, if a buoy is being used for calibration, then employing a fixed radius around the buoy is somewhat incompatible with the requirement for data being equidistant from the coast. Decisions regarding sample aggregation are likely to be needed on a case-by-case basis, and this is discussed further in
Section 4.2. Nonetheless, in this case although the land removes approximately 50% of the sampling area nearshore (
Figure 5a), the increase in data seen offshore (
Figure 5b) is only partially accounted for by the increase in sampling area. The additional data loss nearshore is explained by the increase in rejected data seen in e.g.,
Figure 2 with approach to the coast. However, remarkably, in region #6 (
Figure S8a), although the nearshore sampling area is partially over land, more observations are available than the corresponding buoy offshore (
Figure S8b) for the entire record.
The heterogeneity in inter-annual sampling of good data is in fact remarkable, all the more so since it is a strong function of location. For example, within 50 km of buoy 41002 (
Figure 6b) in the western Atlantic (∼330 km offshore), TOPEX/Poseidon and the Jason missions are largely missing from the record and only contribute data during their corresponding satellite orbit interleave modes. This can be compared with
Figure 5b where Jason-2 (marked in pale orange) was the majority contributor between 2009 and 2016. In addition, at buoy 41002 (
Figure 6b), inter-annual variation sometimes exceeds 100% with notable increases in sampling around 2005 and 2010. In particular the average difference between the periods 1991 to 1999, and 2003 to 2005 is approximately a factor of 3. Similar patterns are seen at the other sites (see
Figures S2, S4, S6 and S8), together with large variability in annual sampling as a function of location.
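The annual sampling statistics discussed above can be sketched as follows. This is a minimal illustration, assuming each record is a `(year, mission, qual_flag)` tuple; the record layout and function names are hypothetical, and the percentage change is computed relative to the preceding year in the record.

```python
from collections import Counter


def annual_counts(records, good_flag=3):
    """Count good observations per (year, mission).

    records: iterable of (year, mission, qual_flag) tuples (assumed layout).
    """
    counts = Counter()
    for year, mission, flag in records:
        if flag == good_flag:
            counts[(year, mission)] += 1
    return counts


def yearly_totals(counts):
    """Sum the per-mission counts into one total per year."""
    totals = Counter()
    for (year, _mission), n in counts.items():
        totals[year] += n
    return dict(sorted(totals.items()))


def percent_change(totals):
    """Percentage change in total good observations relative to the previous year present."""
    years = sorted(totals)
    return {
        y1: 100.0 * (totals[y1] - totals[y0]) / totals[y0]
        for y0, y1 in zip(years, years[1:])
    }
```

With counts of 2 and 5 in successive years, for example, `percent_change` returns 150%, the kind of inter-annual variation exceeding 100% noted for buoy 41002.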
In summary, this overview of the data reveals the substantial heterogeneity in sampling both temporally and with distance to the coast when exploiting the Sea State CCI L2P dataset. This sampling variation itself is also highly sensitive to location which motivates considerable caution where small scale analyses are desirable for specific applications. While this cannot be linked to impacts on extreme analysis directly (which we examine in more detail in
Section 3.2 and
Section 3.3), any analysis (including extremes) of variability will be profoundly sensitive to location and impacted by loss of good quality data approaching the coast, particularly in early parts of the record where, in some locations, good data is virtually absent. The assessment of long term temporal trends, particularly within 50 km of the coast, must therefore be treated with caution and considered carefully on a case-by-case basis.
3.2. Representation of Extremes
Results in
Section 3.1 reveal the extent of the loss of sampling of satellite observations when approaching the coast, the heterogeneity in inter-annual sampling rates and the sensitivity of these factors to geographic location. While it is clear that sampling rates can be low and variable, such information does not support a quantitative assessment of the impact on the estimation of extremes. Even with low sampling rates, it is important to establish, for example, whether observations made nearer the coast accurately capture extreme events, and whether this is dependent upon mission.
A comparison of a short time series from a buoy and the Sea State CCI L2P product is shown in
Figure 7. The series of hourly measurements of Hs acquired at buoy 41010 during 2011 (red points, all panels) is overlaid by 1 Hz measurements acquired from all satellite missions observing a region of radius 50 km around the buoy location. The choice of this buoy as an example is somewhat arbitrary although in part is due to the passage of high intensity storms, including two major hurricanes (Irene and Ophelia), making it a useful example in the context of extremes. A total of four different missions were flying concurrently during this particular period and how they each contributed to this data can be seen in
Figure 7b.
In panel (a), all 1 Hz measurements (black dots) are overlaid. The scatter of the 1 Hz measurements is readily apparent and clustering occurs since several point measurements are made during each pass. Since the number of 1 Hz measurements varies in each pass, it is typical to take a median value of Hs of the track section that intersects the area of interest. Such an approach is shown in panel (c). In spite of the scatter, we can see that the satellite measurements generally coincide well with the buoy and, although they are sporadic (characterised by their orbital paths), the frequency of passage is fairly homogeneous. In particular we draw attention to the period between the two solid vertical black lines in
Figure 7a. This part of the time series can be seen in more detail in panels (b) and (c), discussed shortly. Three particularly energetic storms passed fairly close to buoy 41010 during this time, the closest being Hurricane Irene. Jason-2 was passing just before the peak of the storm and captured values of Hs at around 7 m. (Note that the highest value captured, around 9 m, does not coincide precisely with the storm peak, being a little earlier, and so was somewhat overestimated, probably due to the energetic sea state at the time.)
In addition to the Hs observations, non-zero valued rejection flags associated with each 1 Hz observation are also shown (blue diamonds) in
Figure 7a. The value scale for these flags is provided on the y axis on the right-hand side and indicated by horizontal blue lines. The inclusion of this data provides an indication of their frequency of occurrence and how they relate to the Hs observations. Noting that buoy 41010 lies ∼178 km offshore, some spuriously low and high values of Hs are evident. While the rejection flags are numerous and the underlying reasons for their issuance are somewhat complex, we draw attention in particular to the “swh_outlier” flag (=128). With careful scrutiny, it can be seen that while there are occurrences of this flag, it does not tend to coincide with high energy events. Put another way, (climatologically) high values are not typically being marked as “outliers” and discarded. Indeed, the absence of any rejection flags for the most extreme events shown suggests that data quality is largely independent of the energetic magnitude of the sea state.
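Since more than one rejection flag may be assigned to a single measurement, a combined flag value can be checked for the "swh_outlier" flag as sketched below. This assumes that rejection flags are encoded as a bit mask in which each flag occupies one power of two; the text quotes the value 128 for "swh_outlier" but does not state the encoding, so the bit-mask interpretation and the function names are assumptions.

```python
SWH_OUTLIER = 128  # value quoted in the text for the "swh_outlier" flag


def set_bits(rejection_value):
    """Split a combined flag value into its power-of-two components (assumed bit mask)."""
    return [1 << i for i in range(rejection_value.bit_length()) if rejection_value & (1 << i)]


def is_swh_outlier(rejection_value):
    """True if the swh_outlier bit is set in the combined flag value."""
    return bool(rejection_value & SWH_OUTLIER)
```

Under this assumption a value of 130 would decompose into 2 and 128, so the observation would carry the "swh_outlier" flag together with one other rejection flag.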
In
Figure 7c, results were filtered by qual_flag = 3, and median values are shown for each track segment that passed through the area of 50 km radius. This approach reduces noise and is similar to that used in the generation of the Sea State CCI L4 gridded product. When presented in this way, the generally good agreement between the two sources during the most energetic events is readily visible. We note, however, the presence of some spurious high-valued satellite data, in particular from Cryosat at the beginning of October. This data was flagged as “Poor” quality (qual_flag = 1) and hence was removed when filtered. After filtering, the corresponding median value, seen in
Figure 7c, is much closer to the buoy observation. Also seen is the fairly low frequency of sampling and it is fair to assume that in this example, the coincidence of satellite passage and the extremes was reasonably fortuitous. What we can take away from this therefore is that altimetry captures the extremes accurately when passage happens to coincide with the event.
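The filtering and per-pass median described for panel (c) can be sketched as follows. This is a minimal illustration, assuming the 1 Hz observations have already been restricted to the 50 km radius and are supplied as `(pass_id, hs_m, qual_flag)` tuples; the `pass_id` label identifying a track segment and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import median


def pass_medians(observations, good_flag=3):
    """Median Hs per track segment, using only observations flagged as good.

    observations: iterable of (pass_id, hs_m, qual_flag) tuples (assumed layout).
    Passes with no good observations are omitted from the result.
    """
    by_pass = defaultdict(list)
    for pass_id, hs, flag in observations:
        if flag == good_flag:
            by_pass[pass_id].append(hs)
    return {p: median(values) for p, values in by_pass.items()}
```

In this sketch a spurious 9 m value flagged with qual_flag = 1 would be discarded before the median is taken, in the same way that the poor-quality data noted above is removed by filtering.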
It is important to verify that agreement between buoy and satellite is good on longer time scales. Q-Q plots (see
Figures S9–S14) in a number of locations show close agreement in terms of the overall climatology but, given our interest in understanding the differences in the representation of extremes, such methods are insufficient. For example, it is important to understand whether (extreme) events are observed concurrently by both observation systems and, if so, to what extent they agree in magnitude. Scatter plots based upon paired hourly mean values reveal the joint structure of the two data sets.
Plots for the pair of buoys (44005 and 44007) in region #1 are shown in
Figure 8, for the various combinations of the offshore and nearshore data. Note that a small amount of data is unused in this approach since pairs of observations are required. Over the 26 year period a total of 170,038 hourly observations are available from the buoy, compared with only 3125 (=1.8%) available in the Sea State CCI L2P product. For
Figure 8a, the total number of pairwise comparisons is 2336, fewer than the total available since there is not always a corresponding value in the specific hourly slot in the opposing record. Furthermore, substantially more data pairs, and larger values of Hs, are present in
Figure 8b and
Figure 9b. (This is also the case for the Q-Q plots,
Figures S9–S14). This is due to the fact that the buoy-buoy comparisons tend to involve near-complete 1-h time series, and so very little data, including extremes, is lost during pair-wise analyses.
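The pairing of hourly mean values can be sketched as follows. This is a minimal illustration, assuming both records are supplied as `(epoch_seconds, hs_m)` tuples and that hourly slots are aligned to the top of the hour; the function names and the record layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def hourly_means(samples):
    """Average Hs within each hourly slot.

    samples: iterable of (epoch_seconds, hs_m) tuples (assumed layout).
    Returns {hour_index: mean Hs}, where hour_index = epoch_seconds // 3600.
    """
    buckets = defaultdict(list)
    for t, hs in samples:
        buckets[int(t // 3600)].append(hs)
    return {h: mean(values) for h, values in buckets.items()}


def paired_hours(buoy, satellite):
    """Keep only the hourly slots present in both records."""
    b, s = hourly_means(buoy), hourly_means(satellite)
    return [(h, b[h], s[h]) for h in sorted(b.keys() & s.keys())]


def coverage_percent(n_satellite_hours, n_buoy_hours):
    """Satellite hourly values as a percentage of buoy hourly values."""
    return 100.0 * n_satellite_hours / n_buoy_hours
```

Using the totals quoted above, `coverage_percent(3125, 170038)` gives approximately 1.84%. Any hourly slot missing from either record is dropped, which is why the number of pairs is smaller than the number of satellite values available.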
Figure 8a,d show the comparison between the buoy and satellite pairs for the offshore and nearshore buoys respectively. In both cases the agreement overall is good but it can also be seen in the upper right quadrants that agreement in the extremes is good. Notably, for the majority of the most extreme values nearshore, the agreement is excellent. In this example there are however three anomalous points, one offshore and two nearshore. The regression models (red lines) suggest that there is systematic bias in both cases (of different sign), but such an analysis is not appropriate for the extremes alone, since it is based on all the data. The general absence of points in the top left and bottom right quadrants suggests that extremes observed by the satellite are typically consistent with those observed by the buoy. We note also that there is not any clear connection between mission and observations of extremes, although some features are apparent, such as the tendency for the early missions (ERS-1, TOPEX, ERS-2) to underestimate (see
Figure 8a). The four remaining panels (b,c,e,f) show the relationship between the wave climate at the different buoy locations, and how well this relationship is represented in the satellite data. Panels (b) and (c) can be compared, although it is immediately clear that the relatively few pairs of points from the satellites (panel c) make it difficult to draw robust conclusions. In fact,
Figure 9c (and also
Figures S15–S18) reveal an extremely low number of coincident observations. This is attributed to the fact that some buoy pairs share a direction of separation that happens to coincide with the direction of travel of one or more satellites. Thus, on a single passage, there is a high probability that the satellite will pass both buoy locations, leading to temporally coincident measurements. For buoy pairs with different spatial orientations this is not the case and coincident measurements are much rarer. This effect is further exacerbated by increased spatial separation.
Interesting comparisons can be made with the other regions. Similar scatterplots for regions (#2 to #5) are provided in
Figures S15–S18 in the supplementary information but we comment specifically on results from region #6 shown in
Figure 9. The regression indicates excellent overall agreement but in terms of extremal agreement, the results are comparable with region #1. An absence of points in the top left and bottom right quadrants reveals that extremes are being captured accurately. For Hs above 6 m, differences are rarely greater than 1 m, and often less. At buoy 46005, the apparent lack of more recent missions capturing high energy events is likely related to mission orbital trajectories “missing” that particular location, and some loss of data from the buoy for extended periods of time in more recent years.
A more detailed explanation of the joint properties of the buoy pairs is beyond the scope of this analysis; however, we make a few observations regarding the difference between the east and west coast sites. Firstly, on the east coast and Gulf of Mexico there is a consistent pattern of high bias in the satellite measurements at the nearshore site, seen at regions #1, #2, #3 and #4. This pattern is not apparent in the two west coast regions (#5, #6). Due to the limited number of comparisons, these results cannot be said to be robust, nor is it clear that this applies to the most extreme observations. However, based on these findings we advocate for a more detailed investigation of this issue. Secondly, there is a clear difference in the joint characteristics between buoys.
Figure 8b reveals a distinct absence of points in the upper left quadrant characterized by a strong alignment with the 1:1 line. However, on the west coast, seen in
Figure 9b, this feature is completely absent. This difference is likely due to the fetch-limited conditions on the east coast created by prevailing offshore winds. In contrast, the west coast wave climate is predominantly dictated by westerly winds that cross the entire North Pacific basin. In this case, the same high waves impact both offshore and nearshore sites, creating the somewhat “symmetric” looking joint structure seen in
Figure 9b.
In respect of the extreme events jointly captured in all six regions, and whether offshore or nearshore, there are very few examples of large differences in Hs magnitudes. We have found little evidence to suggest that the extremes captured by satellites are deficient in any systematic way, although there does appear to be a limited dependency on geographic location. The exceptions to this are two east coast nearshore locations, at buoys 41110 and 41113 (
Figures S14 and S15 respectively), where a strong positive bias in satellite measurements affects all the data, including the most extreme Hs values. It is particularly pronounced for buoy 41113. These two buoys are of the same “wave rider” design, lie in very shallow water (approximately 10 m and 20 m respectively) and have a fairly short duration of coverage (∼10 years). Without further detailed study it is difficult to speculate on the underlying causes of the systematic disagreement. In such shallow water, both depth-induced wave breaking and local currents are factors that may cause localised conditions that are not well captured by altimeters and re-tracking algorithms. It could also be that in such conditions the buoys also do not perform well and that the “true” value lies somewhere between the buoy and satellite measurements.
3.3. Impact of Under-Sampling on Extreme Analyses
Results suggest that the Sea State CCI L2P product generally gives a good representation of extreme Hs at distances of up to a few kilometers from the coast, where it is limited primarily by low sampling of these events. The impact of this under-sampling on analysis depends upon many factors, including the wave climate at any given location. A detailed analysis for specific locations is beyond the scope of this paper but, in order to quantify this impact in a general way, we have estimated the Hs 10-year return level from the buoy data only, in regions #1, #4 and #5. These represent the east and west coast regions and the Gulf of Mexico, and provide the longest observational records (see
Section 2.3 for methodological details). In order to approximate the effects of satellite under-sampling we have evaluated the 10-year return level based upon random sub-samples comprising only a percentage of the total data. 500 iterations of the sampling procedure were performed at each sub-sampling level in order to obtain average values of the return level and the 95% confidence bounds. This approach is intended to be approximately representative of the actual satellite sampling rates observed. For example, in
Section 3.2 we showed that in region #1 (see
Figure 8), the satellite data accounted for only 1.84% of the buoy time series for 1-h averages. This was almost the same (1.81%) in region #6 (see
Figure 9). Although it is clear from
Figure 6 that total annual data increases over time, for the purpose of convenience here only the average sampling rate is used. Note that similar analyses adopting variable annual rates gave similar results (not shown) and satellite estimates of 10-year return level were commensurate with results from sub-sampling at the appropriate level (typically <2%).
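The sub-sampling experiment can be sketched as follows. This is an illustration under stated assumptions and not the method of Section 2.3, which is not reproduced here: the return level is estimated from a Gumbel distribution fitted to annual maxima by the method of moments, chosen only because it needs no external libraries. The `(year, hs_m)` record layout, the function names and the percentile-based 95% bounds are likewise assumptions; only the 500 iterations and the use of random sub-samples at a fixed percentage come from the text.

```python
import math
import random
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def annual_maxima(series):
    """Largest Hs in each year. series: iterable of (year, hs_m) tuples (assumed layout)."""
    best = {}
    for year, hs in series:
        if year not in best or hs > best[year]:
            best[year] = hs
    return [best[y] for y in sorted(best)]


def gumbel_return_level(maxima, period_years=10.0):
    """Return level from a Gumbel fit by method of moments (assumed stand-in model)."""
    beta = stdev(maxima) * math.sqrt(6.0) / math.pi
    mu = mean(maxima) - EULER_GAMMA * beta
    return mu - beta * math.log(-math.log(1.0 - 1.0 / period_years))


def subsampled_return_level(series, fraction, iterations=500, seed=0):
    """Mean and approximate 95% bounds of the return level over random sub-samples."""
    series = list(series)
    rng = random.Random(seed)
    n_keep = max(2, round(fraction * len(series)))
    levels = []
    for _ in range(iterations):
        maxima = annual_maxima(rng.sample(series, n_keep))
        if len(maxima) >= 2:  # need at least two years to estimate a spread
            levels.append(gumbel_return_level(maxima))
    levels.sort()
    lower = levels[int(0.025 * (len(levels) - 1))]
    upper = levels[int(0.975 * (len(levels) - 1))]
    return mean(levels), lower, upper
```

Calling `subsampled_return_level(series, 0.02)` would then approximate a sampling rate of about 2%. With `fraction=1.0` every sub-sample is the full record, so the mean and both bounds collapse onto the full-sample estimate.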
Figure 10 shows the Hs 10-year return level estimated at sampling levels between 1% and 100%.
Figure 10a shows the estimates of absolute values of Hs, and it can be seen how these estimates are reduced as the level of sampling drops. Initially the decrease is fairly linear, but below approximately 50% sampling the estimates fall nonlinearly and a fairly substantial bias is introduced. Since this is a general approximation to the sampling rates seen in the Sea State CCI L2P product in any given geographic location, typically <2%, it is reasonable to expect that 10-year return levels would typically be severely underestimated. In
Figure 10a,b, for reference, red vertical lines are marked at 2.5% and 20%.
Figure 10b provides further insight. The curves (and uncertainty bounds) have been scaled by their 10-year return levels such that the common scale shows the difference in relative response to increased sampling. Some locations (such as NE Pacific) appear to be somewhat less affected by under-sampling than others (such as NW Atlantic). However, improvement in the estimate due to increased sampling is not consistent across sites. Inspection of the uncertainty bounds reveals that once sampling is ∼20% of the total, the upper bound tends to exceed the “true” value. The red vertical line at 20% intersects a number of the upper bound (dashed) lines. An exception is buoy 42035 in the Gulf of Mexico, which has very wide confidence bounds. While a detailed examination of this particular difference could be time consuming, we speculate that the impact of infrequent but powerful hurricanes is an important cause of this uncertainty. Infrequent high magnitude events, in a predominantly low magnitude climate, tend to substantially increase the uncertainty in fitting extreme value models due to very “long tails” in the probability distribution (see e.g., [
27]). Regardless, these results suggest that under-estimation of 10-year return levels could be substantial where sampling remains below the 50% to 75% range.