1. Introduction
Through smallsats, electro-optical Earth observation (EO) is rapidly expanding, enabled by advances in electronics, imaging sensors, data transmission, and the miniaturization of components. The resulting smallsat constellations provide rapid repeat imagery that is needed to better understand and manage the unprecedented planetary-scale threats from climate change. Perhaps the greatest challenge for all EO data applications is that the data are obtained through the atmosphere, which variably corrupts the data. The solution is to correct the data to surface reflectance, a process that seeks to remove the atmospheric effect entirely, resulting in clear images and restored digital signals. However, atmospheric correction is problematic for the many hundreds of EO smallsats that lack the onboard equipment to calibrate sensor output that would permit direct conversion to surface reflectance.
A proposed pathway for smallsat surface reflectance correction applies the Land Surface Reflectance Code (LaSRC) and cross calibration to the data of two research grade satellite platforms, Landsat-8/9 and Sentinel-2 A/B [
1,
2], atmospherically corrected by LaSRC. Such cross-calibration can introduce uncertainty due to mismatched overpass timing and spectral responses between sensor platforms. LaSRC currently requires ancillary data to assess the degree of atmospheric effect on the day of the smallsat’s image acquisition, both for calibration and application. This potentially adds another layer of uncertainty in operational surface reflectance retrieval due to any temporally mismatched image collection for the ancillary data. Ancillary data also delay image processing and output and may have coarser granularity that reduces spatial sensitivity. CMAC was formulated to avoid these sources of uncertainty.
The Closed-form Method for Atmospheric Correction (CMAC) was developed to deliver surface reflectance with no delay upon image download. Its development was prompted by a seminal observation of atmospherically driven reflectance changes. The novel CMAC pathway was tested against Sen2Cor for Sentinel-2 correction [
1] and LaSRC for Landsat-8/9 correction [
2]. Corrected Landsat data by LaSRC, L2A, are available only through Earth Explorer and represent the general current state of the art in surface reflectance retrieval. CMAC proved accurate and precise for higher levels of atmospheric effect, estimated from each image’s spectral data alone. This paper investigates an additional inquiry into CMAC application: whether the reliability of surface reflectance output from one area of interest (AOI) applies to all other environments, especially those with very different spectral characteristics. As a yardstick for this comparison, CMAC reliability is compared to the state-of-the-art LaSRC correction for the four visible and near infrared bands (VNIR) of Landsat-8/9.
The paucity of surface reflectance data is a challenge for evaluating atmospheric correction. A few such datasets exist, but virtually never in time and space to support sustained, focused testing to compare methods, thus necessitating a workaround for investigating the reliability of atmospheric corrections. The workaround for this investigation relies upon a truism relative to atmospheric correction: correction accuracy is greatest for scenes taken through a relatively “clean” atmosphere. This is partially because clean images require much less adjustment to achieve surface reflectance but also because the engineering tolerances to accurately retrieve surface reflectance become tighter and the solution accuracy is more critical as atmospheric effects increase. In a prior investigation [
2], CMAC and LaSRC closely agreed on images acquired under relatively clean atmospheric conditions. These data were accepted as surrogate surface reflectance and served as the reference for constructing the datasets that support this analysis. The surrogate estimates were consulted to find top-of-atmosphere reflectance (TOAR) values present in both datasets that could be used to test the null hypothesis described below.
This paper (1) investigates the reliability of CMAC to provide surface reflectance output for environments widely different from where CMAC was developed and calibrated, and (2) evaluates the functionality of CMAC in relationship to the widely accepted state of the art, LaSRC. A null hypothesis was formulated to express atmospheric correction reliability: Equivalent top-of-atmosphere reflectance from images of different environments collected under the same level of atmospheric effect, when atmospherically corrected, will yield equivalent surface reflectance (i.e., no difference). This hypothesis can be stated more simply as “the same input affected by the same conditions will yield the same output”.
As in past papers, we again use image appearance, including the clarity and color balance of atmospherically corrected images, as a point of reference [
1,
2]. Though a subjective interpretation, images that appear clear are logically closer to surface reflectance than those containing visible haze.
Figure 1 is an example of an extremely hazy image portrayed as top-of-atmosphere reflectance (TOAR) and CMAC- and LaSRC-corrected views and illustrates the importance of scene appearance for judging atmospheric correction. While the lack of surface reflectance groundtruth in the appropriate time and at an appropriate scale can be problematic for cross-checking the accuracy of retrieved surface reflectance, image appearance is useful as a qualitative guide for correction accuracy.
CMAC represents several innovations. These include the use of scene statistics alone for retrieval of surface reflectance, rather than dependence upon ancillary data from another satellite. Scene statistics provide an assessment of the atmospheric effect as a lump sum in the form of a grayscale map, with brighter values conveying a greater degree of correction. The resulting surface reflectance retrieval accounts for scatter and absorption using a conceptual model that inverts and adjusts the empirical line method [
3]. Another innovation is the representation of the atmospherically induced deviation of TOAR from surface reflectance in Cartesian space as a line. The slope and offset of the line are applied as the two parameters that reverse TOAR to deliver surface reflectance differentially for each pixel across the image.
This paper does not specifically address solar angle or pointing direction, which inevitably lead to adjustment of the signal correction: these use cases await further investigation. However, given that the atmospheric effect is evaluated as a lump-sum “see it—correct it” approach, such effects may be partially controlled without additional consideration. Solar angle and path length through variable levels of atmospheric aerosol affect target irradiance, and this correction is planned to be approached through future empirical measurements.
2. Materials and Methods
CMAC is a recently developed method intended for atmospheric correction of smallsat images. Sentinel-2 data were used as the research and development testbed for CMAC. The process flow applies image reflectance in two steps that map and then reverse the atmospheric effect spatially.
The first step estimates the atmospheric effect based on the remarkably stable blue band reflectance properties of vegetation. Using a reference crop, alfalfa, together with image extraction and sampling, the top-of-atmosphere reflectance (TOAR) was modeled based on the sampled VNIR spectral band responses. In application, the model is applied through grid sampling of VNIR spectral bands that predict the TOAR blue band response. Because the atmospheric model works with scene statistics, it supports surface reflectance retrieval in near real time without other inputs. When displayed as an image, the output of this initial step is a grayscale map whose brightness is applied to adjust the degree of correction.
In a second step, the CMAC processing reverses the atmospheric effect using a conceptual model that captures our observation of reflectance behavior under increasing aerosol loading: TOAR increases for dark targets due to backscatter and decreases for bright targets due to attenuation. This response is a bandwise linear continuum from dark to bright reflectance that encodes the deviation from surface reflectance as a line for all pixels under a set atmospheric condition. The slopes and offsets of the linear response are calibrated for each band of a satellite, which then enables efficient reversal of the atmospheric effect to deliver the original surface reflectance. These steps and how they were developed are described in
Appendix A, and the interested reader can also consult previous journal papers [
1,
2] for additional information.
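The second step described above can be illustrated with a minimal sketch. It assumes, for illustration only, that the linear response takes the form TOAR = slope × surface reflectance + offset for one band under one atmospheric condition; the actual CMAC parameterization is given in Appendix A and the earlier papers, and the function name and numeric values here are hypothetical.

```python
import numpy as np

def reverse_linear_effect(toar, slope, offset):
    """Invert the ASSUMED relation TOAR = slope * SR + offset.

    slope and offset may be scalars or per-pixel arrays, so the
    correction can vary across the image with the mapped atmospheric effect.
    """
    toar = np.asarray(toar, dtype=float)
    slope = np.asarray(slope, dtype=float)
    offset = np.asarray(offset, dtype=float)
    return (toar - offset) / slope

# Hypothetical surface reflectances: a dark, a mid, and a bright target.
surface = np.array([0.02, 0.20, 0.60])

# Hypothetical band parameters for one atmospheric condition.
slope, offset = 0.85, 0.04

# Forward model: dark targets brighten (backscatter), bright targets dim (attenuation).
toar = slope * surface + offset

# Reversal recovers the original surface reflectance.
recovered = reverse_linear_effect(toar, slope, offset)
```

With a slope below one and a positive offset, the sketch reproduces the behavior described in the text: the dark target appears brighter at the top of the atmosphere and the bright target appears darker, and a single slope and offset pair reverses both.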
This investigation applies an earlier comparison of CMAC to LaSRC (version LPGS_15.5.0) for 31 relatively clear Landsat-8 and -9 images of five AOIs of warehouse/industrial districts in Southern California (SoCal) known to have consistent surface reflectance [
2]. In that analysis, the average cumulative distribution functions (CDFs) for these two disparate methods agreed to such a close extent that they plotted virtually atop one another (
Figure 2). Those paired datasets are employed here as surrogate estimates of surface reflectance. As can be followed through an annotated spreadsheet in
Appendix B, averaging and interpolating the values for CDF extractions from the 31-image cohort were used to construct datasets for CMAC and LaSRC comparison. This pairing found the same atmospheric conditions and the same TOAR input from completely different environments than SoCal, where these methods were in close agreement. The high dark-to-bright dynamic range of reflectance in the SoCal dataset is an important distinction, because the AOIs selected for comparison of method reliability have extremely low dark-to-bright dynamic spectral range.
Landsat-8 and -9 data were not considered separately given the close agreement of these paired satellites [
4]. LaSRC applies a radiative transfer (RadTran)-based workflow that is documented in readily attainable remote sensing literature that readers are urged to consult [
5,
6]. RadTran calculations account for the various reflectance, absorbance, transmittance, etc., components to estimate the amount of light and radiance measured by the sensor.
Two locations were selected for analysis of method reliability, both with extremely low dark-to-bright dynamic spectral range, very different from other areas where CMAC has been calibrated or applied. No specific selection criteria were considered for these two locations other than their low dark-to-bright spectral diversity, which is notably difficult for RadTran-based atmospheric correction. The first AOI investigated was located just west of Lake Newell, Alberta, Canada, and was chosen to represent shortgrass prairie, a vegetation cover occupying a band of semiarid climate that runs north to south for over 2000 km across the United States and Canada. After reviewing the results from Lake Newell, a second site adjacent to the El Pinacate volcanic uplands in Sonora, Mexico, was chosen for confirmation. El Pinacate represents a profound desert of exposed sand, with sparse, widespread shrubby trees constituting less than three percent cover within the AOI investigated; such profound deserts are found in significant proportions of South America, Asia, Africa, and Australia. Shapefiles were mapped to enclose homogeneous cover for both AOIs (
Figure 3). A regional view from the 4 July 2022 Landsat-8 El Pinacate region in
Appendix C provides a wider view of the TOAR, CMAC-corrected, and LaSRC-corrected examples and further context for atmospheric correction over deserts with low dynamic spectral reflectance, contrasting with adjacent areas surrounding the Gulf of California shore exhibiting high dynamic spectral reflectance.
Three Landsat-8 and -9 images were selected for each AOI from the Landsat archives and downloaded as both uncorrected and LaSRC-corrected (by version LPGS_15.5.0) images from Earth Explorer (
Table 1). These six images were corrected by CMAC v1.1L and calibrated for Landsat-8/9 application; the same processing was also applied in the previous CMAC-to-LaSRC comparison [
2]. The dataset to test the null hypothesis was constructed through back comparison to the SoCal datasets from AOIs with quasi-invariant reflectance, where CMAC and LaSRC showed close agreement. Any of the five SoCal AOIs would work for this application, because the 31 images were affected by a relatively moderate atmospheric effect, and because these two disparate methods resulted in virtually the same corrected data distributions per AOI. Two were selected, Fontana and Rochester, whose reflectance distributions are shown in
Figure 2. Since the true surface reflectance is unknown, and there were slight differences between CMAC and LaSRC in the SoCal results, each method was evaluated against its own SoCal surface reflectance estimates: CMAC against CMAC and LaSRC against LaSRC. This ensured that the choice of datasets did not bias the results. For these comparisons, the atmospheric effect was measured with Atm-I, the CMAC measure of atmospheric effect, rather than with the aerosol optical thickness (AOT) ancillary data that LaSRC obtains from the Moderate Resolution Imaging Spectroradiometer (MODIS) [
7]; however, the LaSRC datasets downloaded from Earth Explorer were generated with the workflow that applies MODIS AOT for assessment of atmospheric effect [
5,
6].
Pixel values for the blue, green, red, and NIR bands were extracted from within the AOI polygons shown in
Figure 3 and exported to spreadsheets for analysis. Reflectance distributions from each image for three treatments (i.e., TOAR and corrected CMAC and LaSRC) were extracted and represented as CDFs in 21 percentiles from 1% to 3% and in 5-percentile steps between 5% and 95%. Data for the Rochester AOI were matched with the 14 August 2023 Lake Newell image to check for bias from selection of the SoCal AOI; none was found in comparison to data for the Fontana AOI, which was matched with the other five images (
Table 1).
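The CDF extraction described above can be sketched as follows. The percentile grid is an assumption: it is taken here as 1% and 3% followed by 5% to 95% in steps of 5, which yields the 21 percentiles stated in the text. The function name and the synthetic pixel values are hypothetical stand-ins for the pixels extracted from an AOI polygon.

```python
import numpy as np

# ASSUMED grid: 1, 3, then 5..95 in steps of 5, for 21 percentiles in total.
PERCENTILES = np.array([1, 3] + list(range(5, 100, 5)), dtype=float)

def reflectance_cdf(pixels, percentiles=PERCENTILES):
    """Summarize one band of one treatment as reflectance at fixed percentiles."""
    pixels = np.asarray(pixels, dtype=float)
    pixels = pixels[np.isfinite(pixels)]  # drop no-data values
    return np.percentile(pixels, percentiles)

# Synthetic blue-band pixels standing in for an AOI extraction.
rng = np.random.default_rng(0)
fake_blue = rng.normal(0.08, 0.01, size=5000)

cdf_blue = reflectance_cdf(fake_blue)  # 21 values, non-decreasing
```

Storing every image and treatment on the same percentile grid is what allows the columns of the spreadsheets to be averaged and compared row by row.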
Construction of the dataset for testing the null hypothesis began by finding the Atm-I of multiple images of the SoCal datasets whose averages equaled the median Atm-I’s from the images selected for Lake Newell and El Pinacate. The individual values of the 21 percentile steps for the distributions are arrayed in columns in the spreadsheets, one column per image, ranked by increasing Atm-I. This format facilitated averaging image values to support the comparisons by pairing the experimental image TOAR, CMAC, and LaSRC values with the corresponding averaged values for the SoCal images.
Appendix B provides portions of the combined 29 July 2023 Lake Newell and Fontana datasets that were reformatted and annotated to explain the workflow that identifies values for the three defining properties of the null hypothesis: (1) selecting Atm-I conditions in the SoCal dataset whose averages equaled those of the experimental datasets, thereby achieving the same atmospheric conditions; (2) interpolating to identify the exact TOAR and its percentile position in each of the six experimental images to match the SoCal TOAR input for atmospheric correction; and (3) identifying the corresponding atmospherically corrected output values from the SoCal dataset for testing the null hypothesis. Interrelating the TOAR data and the surface reflectance calculated from these data was accomplished within each dataset using their percentile positions. The 29 July 2023 Lake Newell spreadsheet contained in the
Supplementary Materials can be compared to
Appendix C to assist in following the calculation workflow.
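The percentile-based interpolation in steps (2) and (3) can be sketched as below. This is an illustrative reading of the workflow, not the spreadsheet calculation itself: the function name and the three-point CDFs are hypothetical, and linear interpolation between tabulated percentiles is an assumption.

```python
import numpy as np

def match_by_percentile(target_toar, pct, toar_cdf, corrected_cdf):
    """Locate a TOAR value within one dataset and read the corrected value there.

    pct           : percentile grid (increasing)
    toar_cdf      : TOAR at each percentile (must be increasing for np.interp)
    corrected_cdf : corrected reflectance at each percentile
    """
    # Percentile position of the target TOAR within this dataset.
    position = np.interp(target_toar, toar_cdf, pct)
    # Corrected reflectance at that same percentile position.
    corrected = np.interp(position, pct, corrected_cdf)
    return position, corrected

# Hypothetical three-point CDFs for one band of one image.
pct = np.array([5.0, 50.0, 95.0])
toar_cdf = np.array([0.10, 0.12, 0.14])
corrected_cdf = np.array([0.05, 0.08, 0.11])

position, corrected = match_by_percentile(0.11, pct, toar_cdf, corrected_cdf)
```

Applying the same function to the SoCal dataset and to an experimental image with a shared TOAR value gives the pair of corrected reflectances that the null hypothesis compares.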
The error for the atmospherically corrected data was estimated by treating the Fontana- and Rochester-corrected surface reflectance estimates as the standard to assess CMAC and LaSRC error: % error = 100 × (value − standard)/standard. This comparison was judged to be valid because the atmospheric correction results for the spectrally diverse SoCal AOIs were accepted as surrogate surface reflectance. The “value” in this formula represents the Lake Newell and El Pinacate surface reflectance estimates.
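The error formula above translates directly into code. The sketch below uses hypothetical reflectance values; `value` stands for a Lake Newell or El Pinacate estimate and `standard` for the matching SoCal surrogate.

```python
import numpy as np

def percent_error(value, standard):
    """% error = 100 * (value - standard) / standard, elementwise."""
    value = np.asarray(value, dtype=float)
    standard = np.asarray(standard, dtype=float)
    return 100.0 * (value - standard) / standard

# Hypothetical estimates and surrogate standards for two matched TOAR inputs.
errors = percent_error([0.0505, 0.099], [0.05, 0.10])

# Average absolute error, the summary statistic reported per band.
mean_abs_error = np.mean(np.abs(errors))
```

Keeping the signed errors alongside the average absolute error makes it possible to check both the magnitude of the error and whether it is biased toward positive or negative values.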
This statistical distribution-based workflow was repeated for all four bands for each Lake Newell and El Pinacate image. In this manner, a series of common TOAR values, and the CMAC and LaSRC surface reflectances estimated from them, were interpolated from these two datasets.
3. Results
CDFs for the four bands of the three treatments of the three images per AOI afford a comprehensive look at the responses per correction method (
Figure 4). The TOAR CDFs for Lake Newell illustrate diverse reflectance due to the vegetated shortgrass prairie in comparison to El Pinacate, where the reflectance remained consistent for the ground surface virtually devoid of perennial vegetation.
The extremely low variability of the El Pinacate distributions in
Figure 4 illustrates several trends. The CMAC-corrected bands have greater spacing and are positioned left of LaSRC. All three treatments portray red reflectance distributions as having less coherence.
The Lake Newell CMAC distributions are tighter than LaSRC. The unexpected discrepancy for NIR observed in the Lake Newell data was due to rain prior to the 14 August 2023 image, which was investigated and confirmed as described in
Appendix E. The Lake Newell LaSRC CDFs for the image with the highest atmospheric effect, 6 August 2023 (Atm-I of 1044 versus the 920s for the other two dates), are displaced rightward. This discrepancy indicates that the increase of Atm-I from the 920s to 1044 resulted in under-correction by LaSRC, an interpretation based on the fact that atmospheric correction reduces the brightening effect of backscatter by moving the CDFs to the left. Hence, under-correction leaves the 6 August 2023 image displaced rightward in relation to the corrected images from 29 July 2023 and 14 August 2023. The CMAC corrections of the visible bands were unaffected by Atm-I and maintain consistency and close agreement, as would be expected for the reflectance of midsummer shortgrass prairie when vegetation growth is essentially static.
Figure 5 presents the bandwise Lake Newell CDFs plotted with the surface reflectance points reconstructed from the SoCal AOIs identified through the workflow described earlier. For all bands, the reconstructed SoCal surface reflectance points of CMAC lie on the CMAC CDFs for Lake Newell. Many of the LaSRC points reconstructed in the same workflow also lie on or close to the Lake Newell CMAC CDFs, partially corroborating that CMAC provides accurate surface reflectance estimates, though they disagree with the LaSRC CDFs. Thus, for the Lake Newell comparisons, the null hypothesis that CMAC provides output equivalent to the surface reflectance surrogate dataset of SoCal is accepted. Judged by the data plots in
Figure 5, any error between the SoCal and the Lake Newell datasets was slight.
In contrast to CMAC results, the points displayed for LaSRC reconstructed from the SoCal dataset TOAR values disagree with the Lake Newell reflectance distributions in all twelve graphic comparisons in
Figure 5. The Lake Newell analysis was performed first. The analysis for the El Pinacate AOI was initiated to verify the same relationships for a different environment, one of profound aridity and almost no vegetation cover.
The El Pinacate data plotted in
Figure 6 confirmed the results for the CMAC visible bands in
Figure 5 calculated from the shared SoCal TOAR reflectance values. The Fontana CMAC surface reflectance estimates lie close to the CMAC El Pinacate surface reflectance distributions for blue, green, and red. The SoCal LaSRC points plotted closer to the CMAC distribution than to the LaSRC El Pinacate distributions.
Figure 6 LaSRC NIR points plot differently than for Lake Newell (
Figure 5), instead essentially lying on the reflectance distribution for El Pinacate TOAR, indicating that virtually no correction for NIR occurred.
The error for surface reflectance estimation by CMAC and LaSRC was calculated by treating the surrogate SoCal reflectance values as true surface reflectance and is presented in
Table 2 and
Table 3. For CMAC, the Lake Newell and El Pinacate surface reflectance estimates agree well with the SoCal surrogate true surface reflectance; CMAC error was low and almost evenly distributed between positive and negative values, and hence unbiased. CMAC results were comparable between the Lake Newell and El Pinacate datasets. The average absolute value of CMAC surface reflectance error did not exceed 1% for the blue band, which experienced the greatest error. The average absolute value of LaSRC error for the Lake Newell shortgrass prairie was severe, around 50% for the blue band. The error for LaSRC was lower for El Pinacate but still an order of magnitude greater than CMAC. The error for both CMAC and LaSRC decreased with increasing wavelength.
Sufficient data are presented in the
Supplementary Materials to allow the interested reader to reconstruct and verify the workflow and the results. These include averages, interpolations, error calculations and spreadsheets. Values derived through this analysis are summarized in tables within
Appendix D. Spreadsheets and shapefiles of the Fontana and Rochester AOIs are provided along with spreadsheets and shapefiles for the Lake Newell and El Pinacate AOIs. Cloud-based image browsing, selection, and CMAC correction and download of Landsat-8/9 and Sentinel-2 VNIR bands can be accessed through a link in
Supplementary Materials.
4. Discussion
The CMAC surface reflectance estimates for Lake Newell and El Pinacate were within 99% agreement with the CMAC SoCal surrogate surface reflectance estimates in all 58 TOAR-based comparisons across the four VNIR bands (agreement was calculated as 100% minus the % error). The strong agreement for CMAC results between datasets from widely diverse environments validates the accuracy and reliability of CMAC processing and of its constituent assessment of atmospheric effect and the conceptual model-derived workflow that reverses it. Further corroborating CMAC accuracy is the observation that the LaSRC surface reflectance in
Figure 5 and
Figure 6 for Lake Newell and El Pinacate lie closer to the CMAC distributions than to the LaSRC distributions, in many cases plotting atop the CMAC SoCal points.
The CMAC data demonstrate accuracy independent of the dynamic spectral range (highest minus lowest reflectance values): the SoCal AOIs had extremely wide ranges of values (
Figure 2), while the spectral ranges for Lake Newell and El Pinacate were extremely narrow (
Figure 4). For the clear to moderately hazy conditions examined here, the null hypothesis is accepted: CMAC analyses produced the same surface reflectance estimates from the same TOAR input under the same atmospheric conditions despite differences in the two terrestrial environments examined.
The LaSRC analysis demonstrated surface reflectance estimates with average agreement as low as 50%; hence, the null hypothesis is rejected: LaSRC was not reliable for estimation of surface reflectance across the two environments. This discrepancy may be related to the low dynamic spectral range of the Lake Newell and El Pinacate locations; however, the issue is more complicated, because the bandwise dynamic spectral ranges were comparable between these two experimental AOIs, yet the degree of error for Lake Newell was about four times that for El Pinacate.
Atmospheric correction of satellite imagery by LaSRC, widely viewed as the state of the art in radiative transfer application for EO imagery, is proposed as the basis for smallsat atmospheric correction through a cross-calibration process with harmonized data from Landsat-8/9 and Sentinel-2 [
8,
9]. However, reliance upon LaSRC for smallsat applications can be expected to incorporate the same problems that reduce LaSRC accuracy. These problems include the loss of accuracy at higher levels of Atm-I that was found for LaSRC under conditions of increasing haze from wildfire [
2], and from these results, the lack of reliable accuracy, hypothetically related to low-spectral-diversity environments.
CMAC is a unique pathway for atmospheric correction, and its testing here and in the previous two journal papers shows that its performance is more accurate over a wider range of atmospheric effects than Sen2Cor and LaSRC. Rather than delaying surface reflectance output while waiting for ancillary data, CMAC can process images immediately upon download from the satellite, because the only input for surface reflectance correction is the image itself. Due to its robust and simple mathematical structure, CMAC can readily be calibrated for smallsat application for any VNIR band combination. CMAC will be adapted to correct data from hyperspectral sensors in a next-generation program that will include improving the accuracy of the Atm-I model, reliable overwater correction, and development and application of a calibration target and the technology to apply it under automation.
The greatest source of uncertainty in the CMAC workflow is the measure of atmospheric effect, Atm-I. While Atm-I can be shown to be far more sensitive than the ancillary data currently in use by LaSRC from MODIS [
2], it was generated from a static assumption of the reflectance of a reference crop rather than from actual reflectance measurements. The key to upgrading the Atm-I model is extensive groundtruth. Likewise, extensive groundtruth will permit spectral modeling to isolate and remove the specular reflectance component of water surfaces, which could yield accurate water-leaving reflectance. Though CMAC generally performs well over water, the overarching effect of image geometry has not yet been characterized.
Calibration is the key for application of surface reflectance retrieval. This step is presently performed vicariously, requiring the use of master images of Sentinel-2 compared to proxy images from smallsats. Even though these steps are automated, this program is inefficient because it requires visual assessment steps to ensure accuracy. Vicarious calibration can be replaced by a workflow that employs a well-engineered, constructed, managed, and monitored calibration target. Such a target is expected to yield greater precision, accuracy, and automation for CMAC calibration. The benefit of periodic automated calibration is that it allows detection and compensation of episodic in-orbit radiation-related sensor degradation [
10].