1. Introduction
Imaging smallsat flocks may potentially cover the entire globe during daylight hours constrained only by cloud cover. The resulting intelligence, surveillance, and reconnaissance (ISR) can provide critical support for national security, precision agriculture, conflict resolution, and monitoring natural disasters from storms, wildfires, and flooding induced by climate change [
1,
2,
3]. For these images to be useful for visual interpretation or automated analysis, the spectral distortion of images from haze and other atmospheric effects must be removed in a process of atmospheric correction (AC). Zhang et al. [
4] provide a thorough review of existing AC methods.
Closed-form Method for Atmospheric Correction (CMAC) is software nearing completion for reliable, accurate, and near-real-time surface reflectance (SR) retrieval for smallsats. CMAC was developed for the AC of the four spectral bands typically found on smallsats: blue, green, red, and near-infrared, but can be adjusted to allow the AC of any visible through near-infrared (VNIR) band in near real time. In contrast, the application of current radiative transfer (RadTran)-based methods for smallsat images includes delays, potentially by days, in waiting for ancillary data from other satellites [
5,
6]. CMAC is an empirically derived simplification of AC, a “see it, correct it approach” made solely with the spectral statistics of each scene and treating atmospheric effects as a lumped sum rather than the product of gases such as water vapor considered separately from aerosol and other effects. CMAC does not suffer a loss of utility from this simplification: in repeated tests, CMAC is proving accurate and reliable in near real time across a wide range of environments, including low-spectral-diversity deserts and water bodies [
7]. To ensure maximum data utility, AC can be performed in or near real time upon image acquisition.
CMAC was introduced in an earlier article by Groeneveld et al. [
7], referred to here as “paper 1”. CMAC is a radical departure for AC because it does not apply RadTran methods that have been the primary approach for automated satellite imagery [
4]. After decades of intensive effort, two RadTran software applications that are considered the state of the art and stand out as accounting for the vast majority of AC-treated imagery are Land Surface Reflectance Code (LaSRC) of the Landsat program [
6] and Sen2Cor of the European Space Agency’s Sentinel 2 program [
8]. Data from the sensors of Landsat 8 and 9 and Sentinel 2 have been proposed as the basis for RadTran application for AC of smallsat data, and of these two satellite systems, LaSRC can be taken to represent the best output possible for AC of smallsat data using RadTran [
5,
9].
LaSRC and Sen2Cor can be regarded as representative of the current RadTran state of the art. As a continuation for comparison to RadTran, CMAC is evaluated here using the same analyses and metrics applied in paper 1. This workflow employed statistical metrics of precision and accuracy as well as the general performance of processing time, repeatability, reliability, and ease for smallsat application. Additionally, qualitative image-to-image comparisons of CMAC and LaSRC corrections were made using criteria of clarity, color balance and lack of induced artifacts resulting from the correction.
Before CMAC, RadTran provided the only mathematical basis for fully automated software to retrieve SR from satellite imagery. RadTran application requires radiance calibration for each spectral band that is not directly measurable in smallsats because onboard equipment to do so is omitted to optimize weight and size for economical launch in flocks. Further complicating this problem are variable changes to the sensor response of each spectral band that are well known to occur in the orbital environment [
10,
11]. Such sensor degradation is known to occur episodically, and thus, each smallsat must be periodically recalibrated to maintain accuracy for both the top-of-atmosphere reflectance (TOAR) and SR retrieval calculated from it.
Since orbiting smallsat radiance responses are unknown, a method has been offered for calibration of smallsat radiance to standardize AC from RadTran workflows in a process of sensor cross-calibration with datasets of harmonized reflectance from S2 and L8/9 [
5,
9] and following a similar workflow to LaSRC SR retrieval [
9]. This AC pathway is inconvenient because cloud-free and relatively clear atmospheric conditions are needed to match the necessary data between limited sets of satellite overpasses. This pathway may also add uncertainty, partially because bandwise sensor responses for these two satellite standards are not equivalent. More seriously, cross-calibration can introduce uncertainty through temporally mismatched images due to constantly changing atmospheric conditions, even when overpass times vary by minutes [
7]. To assure accurate results, such discrepancies require that the workflow for harmonization be closely monitored and managed.
Automation and ease of application are paramount for the operation of large flocks of smallsats. Rather than following a radiance-based workflow, CMAC was derived empirically based on the observed effect of atmospheric transmission upon reflectance. This reflectance pathway permits bypassing the consideration of sensor radiance for SR retrieval. Instead, each smallsat can use its pre-launch instrument calibration as the basis for conversion of the recorded raw radiance to an assumed reflectance following well-established processing methods [
12]. The assumed reflectance can then be converted to SR estimates through bandwise relationships derived in an automated calibration procedure that customizes CMAC for each VNIR band of each smallsat.
Every satellite must be calibrated to enable accurate CMAC processing. Once the bands of a smallsat are calibrated, CMAC SR retrieval can be performed from then on, subject to periodic recalibration to account for changing sensor response. CMAC research and development (R&D) is currently developing calibration software and design of calibration targets to be constructed. In the interim, sensors are calibrated vicariously using TOAR data that are paired with corrected images of AOIs with established SR. While far less precise than applying a calibration target, these temporary workarounds still yield excellent results for sensors aboard smallsats (tested in-house for Planet cubesats), for S2 (as presented in paper 1), and as tested here, for L8/9.
CMAC is a new and radical departure from the existing AC technology. The analyses and results presented here can be cross-checked and verified. Available for download from the
Supplemental Materials are an annotated image gallery that provides ancillary information related to CMAC and LaSRC AC, a python version of CMAC software v1.1, live spreadsheets used to develop the tables and figures, shapefiles for polygons that spatially define pixel data extractions, and a link to a cloud site where readers can browse, download and test correct images with current CMAC versions for SR retrieval of Landsat 8/9 or S2.
Section 2, the Materials and Methods, consists of five subsections.
Section 2.1 describes how CMAC derives a grayscale raster of atmospheric effect that scales the reversion to surface reflectance spatially to retrieve surface reflectance across each image as explained in
Section 2.2.
Section 2.3 introduces the application of quasi-invariant areas for procedures of image-to-image comparison to overcome the lack of surface reflectance at the scale and timing for calibration, evaluation and comparison of the methods.
Section 2.4 describes a method for image-to-image calibration that enables the rapid translation of the method for other satellites, including smallsats. Finally,
Section 2.5 describes four tests that were conducted to compare CMAC with LaSRC outputs. The results of these four tests are then reported in
Section 3.
2. Materials and Methods
There are two parts to the CMAC workflow. Part 1 generates a grayscale raster of the atmospheric effect across the image using extracted spectral statistics. In Part 2, the grayscale brightness is used as a spatial scalar to adjust the degree of correction to be imposed for SR retrieval from the input TOAR rasters of each band. To apply CMAC, all sensor bands to be corrected must undergo calibration in a step that determines coefficients for translation of the atmospheric effect, grayscale brightness, into two dynamic parameters that guide the retrieval for any level of atmospheric effect. The CMAC formulation captures the regular mathematical structure of changes to reflectance driven by atmospheric effects in the form of families of curves. The description of the CMAC method here draws from greater detail provided in paper 1.
Sentinel 2 (S2) data were selected as the testbed for CMAC R&D, capitalizing on their cadence, resolution, and high quality. Paper 1 tested the precision and accuracy of the provisional CMAC version 1.1 software for S2 compared to Sen2Cor software version 2.11. The AC analysis in paper 1 was applied to seven areas of interest (AOIs), across 46 images and thousands of extracted statistics. CMAC output was found to be more precise and accurate than Sen2Cor over a greater range of atmospheric effects. It required ~91 s for 4-band VNIR AC on a desktop PC (64 GB RAM), a short time span made possible through the use of scene statistics and a closed-form solution.
The present paper expands CMAC evaluation through:
Customizing a Landsat 8 and 9 (L8/9) calibration by reference to the S2 calibration in a test of a rapid automated method, “master-proxy calibration” (MPC).
Measuring the precision and accuracy of CMAC L8/9 SR retrieval generated from MPC.
Evaluating whether CMAC SR retrieval matched LaSRC output, thus verifying CMAC accuracy and stability.
Comparing the stability of CMAC and LaSRC as Atm-I of the images increased.
Test correction and comparison of indices derived from CMAC and LaSRC-corrected images as an indicator of CMAC statistical stability to support machine analysis and AI under moderate-to-extreme levels of haze.
2.1. Derivation and Scaling of CMAC Surface Reflectance Retrieval Methods
Field spectroscopy applied a portable Analytical Spectral Devices unit to initially sample vegetation spectra to understand the potential for a vegetation-based SR reference against which to measure atmospheric effects. The S2 blue band 2 reflectance was selected for representation of atmospheric effects because it retains sensitivity over a wide range of atmospheric aerosol content and because its SR remains remarkably consistent for continuous healthy canopies formed by many different plant species, thus offering a competent reference against which to judge atmospheric effects. Such maximal vegetation expression has been given the name “dense dark vegetation” (DDV) and has been applied to estimate atmospheric effects for AC of Landsat and S2 images [
6,
9]. An initial concept of blue reflectance to serve as a reference was tested for images containing continuous DDV cover in the Amazon basin and during summer images in the intensively cropped regions of the American Midwest. Spectra of such cover were sampled with non-overlapping grids after applying an NIR threshold to first remove the low blue reflectance caused by water surfaces. This test application provided a sensitive portrayal of atmospheric effects where DDV is continuous; however, it was obviously of limited value where DDV was lacking.
The requirement for DDV to be present in images was bypassed through the assembly of a model that uses VNIR band statistics as input. The first step in model assembly was to select alfalfa as a standard because it is a cultivated crop that reaches maximal canopy expression in a wide range of conditions, whether through irrigation in low-plant-cover arid climates or watered by rainfall in humid climates. This wide latitude of conditions supported the sampling of many types of adjacent plant cover to model atmospheric effects for the entire range of cover from deserts to continuous healthy canopies. The known spectral statistics of the highest-NDVI alfalfa fields were used as a surrogate ground truth estimate for blue band surface reflectance.
To assemble the statistically based model, verdant alfalfa fields were identified on numerous mid-summer images under clear to hazy conditions across a range of arid to humid climates; these cultivated fields were treated as index plots. Adjacent plots were identified as subsamples in ranges of cover from none to continuous canopies of whatever vegetation was present, not necessarily alfalfa, nor cultivated. The spectral band values of the index plots and their subsamples were extracted, pooled and then used in regression modeling to predict blue band values of the index samples that were assumed from relationships generated during earlier steps of spectral canopy measurements.
The model output of the measured atmospheric index is used in subsequent CMAC calculations to adjust the magnitude of the calculations that remove the atmospheric effect, thereby delivering SR. This scalar, the predicted blue-band DDV response to the atmospheric effect, is abbreviated “Atm-I”, and in operation, is estimated by the model from extracted TOAR spectral data within spatially discrete, non-overlapping sampling grid cells arrayed across the image. The model provides spatially discrete Atm-I values that represent the lumped sum atmospheric effect applicable to many surfaces, including water bodies. The brightness of this grayscale is the driving variable for AC and scales the degree of correction necessary to reverse the lumped-sum atmospheric effect as shown by the example in
Figure 1.
2.2. Derivation of the CMAC Equation to Reverse Atm-I
CMAC development began after observation of the response of reflectance data between clean and hazy images collected across short time spans for the same AOI. The same pattern of reflectance alteration was observed in every band for the images affected by haze: in the continuum of dark to bright responses, low reflectance was increased due to aerosol backscatter, while high reflectance was attenuated. These dark-to-bright responses form a continuum; between these opposing end member responses, there exists a reflectance level called the “axis point”, where backscatter and attenuation cancel, and the reflectance value does not change between TOAR and SR. These observations were translated into a graphic conceptual model (
Figure 2) that inverts and adjusts the well-known empirical line method [
13] into a linear representation of the atmospheric effect at the time of image capture. The resulting TOAR deviation line encodes the reflectance response for any pixel, dark to bright, affected by a single level of atmospheric effect. TOAR deviation lines are uniquely defined by their slope and offset that constitute the two parameters CMAC requires to reverse the atmospheric effects to retrieve SR. In agreement with the original observation, as Atm-I increases, backscatter, defined by offset, increases dark reflectance, and attenuation, defined by both slope and offset, decreases the bright reflectance.
Precedence for the linear conceptual model can be found in the literature (
Figure 3).
The conceptual model of
Figure 2 was translated into the CMAC equation for a reversal of the Atm-I effect for each pixel across the scene. The closed form of this solution allows rapid bandwise AC of each image pixel guided only by the brightness of the Atm-I grayscale raster. The CMAC equation reverses the atmospheric effect and can be applied once calibration has established the mathematical relationships translating the Atm-I grayscale brightness into the TOAR deviation line slopes (m) and offsets (b). This translation occurs through equations that express two functions derived by pairing the slope and offset with the driving parameter, Atm-I, observed from many QIA images obtained under sequentially increasing Atm-I.
2.3. Master Satellite (S2) Calibration and Translating the Calibration to the Proxy L8/9 Satellites
CMAC calibration enables SR retrieval applications for new, proxy satellites through two possible pathways. Precise calibration can be performed using a purpose-built calibration target. However, no such target yet exists; hence, vicarious (image-to-image) methods were employed. Multiple overpasses were averaged to control the uncertainty inherent in vicarious calibration.
Assessing how each band’s TOAR deviation line slopes and offsets change according to the driving variable, Atm-I, constitutes CMAC calibration. Though simple in concept, Atm-I is variable across a wide range of image states from clear conditions to extreme haze. While calibration could include iterative fitting and testing as different ranges of Atm-I are recorded, such processing is prohibitively time-consuming for application to the many calibrations required annually for an expected vast field of imaging smallsats. As an example of this potentially protracted process, the CMAC S2 calibration alone required several years of testing and analysis. Fortunately, through observation, the calibrated points of slope and offset on the driving parameter of Atm-I form families of curves that stack in Cartesian space according to wavelength. This fact enables master calibration curves to guide extrapolation of single precise proxy calibration points for the slope and offset of each band. The existence of this family-of-curves structure demonstrates that atmospheric reflectance responses are highly structured. Calibration simply encodes bandwise reflectance responses to enable reversing the reflectance changes due to atmospheric transmission.
The data to run vicarious calibration was extracted from multiple images of the Ontario-1 QIA. The first step for the proxy L8/9 calibration was to run the S2 Atm-I model on the L8/9 images and perform cross-calibration to calculate a relationship that emulates S2 Atm-I results. This was a necessary step because the relative spectral responses (RSRs) of the S2 bands are different than those of L8/9; hence, the input and output statistics from the Atm-I model are different. However, the RSRs of the four VNIR bands of these two satellites are sufficiently close that the model derived for S2 also functions for L8/9 though imperfectly. A linear function resulting from the cross-calibration of same-day overpasses was added to adjust the S2 Atm-I model output for L8/9 to emulate S2 and included as a step in the CMAC L8/9 software v1.1L. During these and other analyses, no overt differences were noted between Landsat 8 and 9 responses, consistent with the findings from comparison analyses by Gross et al. [
15]. Consequently, L8 and L9 were not considered separately.
A second calibration step for L8/9 was applied to the Ontario-1 data to determine the slopes and offsets for multiple L8/9 low Atm-I TOAR images. The reflectance distributions of these images were adjusted through the iterative fitting of slopes and offsets to match the distributions of a best-available SR estimation from multiple low Atm-I, AC-corrected images of the Ontario-1 QIA. The Atm-I of each image and resulting proxy slopes and offsets for each band were averaged to generate single values that were then projected into the families of S2 slope and offset curves and extrapolated to enable the new L8/9 CMAC version to function across the entire range of Atm-I. This vicarious CMAC calibration workflow applied to L8/9, resulted in calibration “v1.1L”, requiring about 6 h to select the images, extract the data, and perform the calculations.
2.4. Quasi-Invariant Areas for Evaluating Surface Reflectance Retrieval
To have value for atmospheric correction, ground truth must be collected in time and space over the selected area, and in sufficient quantity across diverse reflectance targets to constitute statistically valid sampling. Ideally, the sample must define not just single points of SR but entire distributions. CMAC R&D revealed that the value from comparison of individual estimates of SR is limited because greater mathematical context is lacking. The selection of a QIA with a range of reflectance values, low to high, provides reflectance distributions that convey a great deal more information than monotypic plots. For example, a single reflectance measurement could be correct while the underlying distribution is wrong, but not vice versa.
An impediment that exists for calibrating SR retrieval methods or for evaluating their output is that sets of appropriately scaled ground truth reflectance datasets available for image snapshots during image capture generally do not exist. Further, due to variable antecedent conditions and cloud cover, ground truth sampling seldom works well in real time. Instead, an optimized solution for ground truth is to select targets with known surface reflectance from archived images rather than from a new collection. For CMAC calibration and verification, a concept similar to pseudo-invariant targets (PITs) [
16] was employed: quasi-invariant areas (QIAs), defined as an AOI with a variable SR distribution that remains consistent between image acquisitions over time. QIAs allow investigation of reflectance distributions under varying atmospheric conditions and also for comparison of SR retrieval methods. PITs are homogeneous areas containing only one target of known reflectance, while QIAs can be chosen to contain many separate targets with widely varying reflectances, hence providing the desired reflectance distributions for calibration and testing.
When using a QIA to evaluate SR retrieval methods, the first step is to define boundaries that exclude cover that may change between image dates. Such surfaces generally consist of vegetation, depressions and bare soil subject to water retention in puddles or wetted soil. Warehouse and industrial districts are permanent facilities appropriate for testing across extended periods because they are engineered for the quick drainage of precipitation and include vegetation management or elimination. Such QIAs were chosen in the urban area of Southern California (SoCal) east of Los Angeles (
Figure 4). QIA boundaries were mapped to exclude vacant land potentially containing vegetation that could grow/senesce during the period of study or present surface conditions of variable wetness from antecedent rainfall.
Five QIAs were selected within the sidelap region of adjacent L8/9 tiles, in the Landsat World Reference System-2 path/row 40/36 and 41/36. The AOIs containing these QIAs are located in the vicinity of the Ontario, California Airport. Three of these QIAs were also used to evaluate S2 AC reported in paper 1. Twenty-four L8 images acquired between 2019 and 2022 and seven L9 images acquired in 2022, were selected for analysis. A list of these images is provided in the
Supplemental Materials. The selected images were displayed and inspected to assure that no gross anthropogenic changes were visible through the period of analysis, and each was verified to be cloud and cloud-shadow free. If haze was visible in an image, it was closely inspected to assure even expression within each QIA. Band 9 was inspected for each image to ensure that no cirrus was present, since cirrus, though correctable by CMAC, is seldom evenly expressed across QIA scales of a square kilometer or greater. The selected SoCal warehouse and industrial areas present mostly rooftops, paved roadways and parking that are well drained and are sufficiently large so that variability in parked or moving vehicles, or other individual sub-targets across multiple image-to-image comparisons is reduced to minor stochastic variation.
As described in paper 1, CMAC v1.1 was calibrated for correction of S2 data using a warehouse district QIA located north of the international airport in Reno, NV, USA. The S2 calibration, since upgraded through additional imagery and testing of the Reno QIA, is the basis for the L8/9 calibration used for this paper. This recalibration produced only slight changes to the S2 master calibration.
2.5. Data Analysis Workflow
The two overarching goals for this paper were as follows: (1) determining whether CMAC applied to correct L8/9 images provides comparable SR retrievals to LaSRC; (2) evaluating the precision, accuracy and consistency of CMAC SR output derived through MPC in comparison to LaSRC. CMAC performance for v1.1L SR retrieval for L8/9 data is near-real-time, requiring about one minute per image using a compiled C++ implementation executed on a desktop PC with 64 GB system RAM.
Thirty-one images were selected and downloaded as standard TOA and LaSRC-corrected products from USGS EarthExplorer [
17]. The TOA radiance product was processed to TOAR following the published radiance-to-reflectance conversion workflow [
12]. Collection dates for the 31 images were selected to minimize the influence of solar zenith angle (SZA) that would alter building shadows in the QIA and the solar irradiance pathlength through the atmosphere. The dates ranged from June 13 to August 11 of 2019, 2020. 2021 and 2022. The scene center SZA ranged between 21.8° and 28.0°.
The same analytics described in paper 1 were applied to the extracted SoCal v1.1L-treated statistical distributions of each image generated from 21 set percentile increments of reflectance: 1%, 3%, 5% and in 5% increments to 95%. This data characterization permitted comparing the distributions for three reflectance “treatments”: uncorrected TOAR and CMAC and LaSRC SR retrievals. All numerical reflectance values were extracted from the pixels within QIA shapefiles and scaled by 10,000 following the convention adopted by the Sentinel 2 program [
8]. In the graphical figures that follow, data are listed as DN (digital number) within the range of 16-bit unsigned integers. Four tests were performed to assess the performance of CMAC versus LaSRC.
Test 1. Performance statistics of corrected images under low-to-moderate Atm-I: Data were extracted from the 31 images of the Ontario-1, Fontana, Rochester, Ontario-2, and Ontario-3 QIAs. These images were obtained under Atm-I conditions that ranged from relatively clean (Atm-I = 883) to moderate (Atm-I = 1088).
Statistical metrics of the CMAC and LaSRC outputs of reflectance averages, standard deviations and coefficients of variation (CV% = StDev/Average × 100) were compared across the suite of images for each of the five QIAs. Error estimates were generated based on the logical and measurable assumption that images with the lowest Atm-I values provide best available estimates of true SR. Data distributions from the five lowest Atm-I images were selected and averaged for each QIA to control uncertainty in error estimation. These best-available SR estimates for each QIA were treated as the standard for percent error calculation at each percentile step of each image of each QIA: % error = (value − standard)/standard × 100). For the error calculation of LaSRC, the low Atm-I percentile LaSRC averages were used as the appropriate reference standard. Likewise, the percentile-wise Atm-I averages for CMAC were used as the standard for assessing CMAC.
Test 2. SR distributions derived under low-to-moderate Atm-I: The CMAC v1.1-L output was compared to LaSRC output in the form of average cumulative distribution functions (CDFs) extracted from the images of each QIA for each band. This test was performed to determine the reproducibility and stability of CMAC and MPC calibration, and to determine whether CMAC v1.1L measures reflectance of L8/9 data within the same numerical ranges as LaSRC. This was tested through displays of the average reflectance distributions of CMAC and LaSRC results compared to TOAR for each band of the 5 QIAs.
Tests 1 and 2 provided a robust sample for analysis of atmospheric correction for three treatments, TOAR, CMAC and LaSRC, across the 31 images in each of the five QIAs evaluated in 21 percentile-steps of the reflectance distributions. This sampling provided 9765 separate statistics for each band.
Test 3. Quantitative performance in moderate-to-extreme Atm-I: Atm-I is increasing due to climate change-driven wildfires in many regions around the globe. For example, smoke from pervasive West Coast, Intermountain Region and Canadian wildfires dominates the skies across the American Midwest and much of North America during each summer growing season at levels that impact imagery-based applications for precision agriculture and carbon farming. Similarly, extreme anthropogenic emissions in Southern Asia produce high levels of visible haze that may impact remote sensing applications through much of the year [
18]. Tests 1 and 2 were performed using QIA targets known to express consistent reflectance through the summer period. However, the images used captured only low-to-moderate levels of Atm-I. Located in an urbanized region of Southern California within 60 km of the Pacific Ocean to the west and receiving onshore southwesterly winds throughout each summer day, the SoCal QIAs offered no potential to assess AC performance for high levels of Atm-I resulting from wildfire smoke or other causes.
Future remote sensing applications are expected to be dominated by automated applications applying various indices, e.g., NDVI. To answer the question of how the increasing levels of smoke will affect such applications, Test 3 was formulated to investigate how such indices, generated from TOAR, CMAC and LaSRC-corrected data, perform under moderate-to-extreme Atm-I conditions.
In Test 3, CMAC and LaSRC-corrections were compared across low Atm-I that were assessed first in Tests 1 and 2. Landsat images acquired over British Columbia, Canada, that were expected to form QIAs with consistent surface reflectance conditions across yearly snapshot-in-time images were identified exhibiting low Atm-I and high Atm-I. Sampling was performed for these conditions from clear conditions and from visibly homogeneous haze in three urban locations: Penticton, Kelowna and Vernon. Polygons were mapped to enclose visibly homogeneous areas of haze (
Figure 5). Landsat images were selected to support this analysis from three consecutive years within a six day-of-year time span for each year: 3 August 2021 (smoke-affected L8) and 6 August 2022 (clear L8) and 1 August 2023 (clear L9). The third image was selected to test an assumption underlying this experimental design: that the conditions affecting peak vegetation expression were consistent across the annual snapshots during the day-of-year period of image selection. The peak plant expression in these polygons was not expected to be limited by antecedent rainfall because of irrigation.
Three circular sample areas were mapped around each of the three city polygons to include and sample irrigation-cultivated crops. For direct comparison between clear versus smoke-affected conditions, the pixel values for blue, green, red and NIR were extracted for the three treatments from the three images for the 20 pixels expressing the highest NDVI values in each sampled circle. The null hypothesis was for no difference among the three years, confirming that the predisposing environmental conditions were consistent and that the treatment was effective for SR retrieval. A fair comparison among the three years was sought through a checklist of conditions to be met to assure that peak NDVI would be present each year: irrigated crops cultivated in a strongly Mediterranean-type climate, sampling at the height of the growing season, nearly exact day-of-year image snapshot times, and relatively large, sampled areas encompassing numerous cultivated fields. These considerations were expected to isolate the effects of the smoke haze on the images and for the treatments. The sampled areas are numbered for reference in
Figure 6.
Indices were calculated for average extracted band values of the 20 highest NDVI pixels in each of the circular AOIs and to assess the relative performance of the three treatments for three indices, NDVI: NDVI = (NIR − Red)/(NIR + Red) [
19] and two indices based upon the same normalized difference format but with blue and green bands substituted for the red band, designated here as NDBI and NDGI, respectively. Together, these three indices provide an indication of the relative utility of AC-processed data to support automated machine analyses of any kind.
Test 4. Qualitative evaluation in moderate-to-extreme Atm-I: Multiple images representing moderate-to-extreme Atm-I were selected for correction by CMAC and LaSRC and visual inspection of the results in comparison to the TOAR view. Color balance and clearing of haze have been applied during CMAC development as indicators of the quality of image correction [
7] that found clear images were generally close to SR. While not conclusory, attributes of clarity, natural appearing color balance, and absence of induced artifacts have proven to be valuable for R&D as a quick confirmation of the quality of the AC. Color balance can be a competent indicator of which bands are incorrect, presenting two possible states for the interpretation. As an example, an overly green appearance may indicate that the green band is over-represented (under-corrected) but also potentially that the blue and red bands are under-represented (over-corrected).
3. Results
The CMAC calibration v1.1 for S2 was tested for the direct correction of L8/9 data without alteration. The two satellite systems are nominally the same, having blue, green, red and NIR bands but somewhat different RSRs, as shown in
Figure 7. The resulting S2 v1.1 AC-processed images often contain clearly incorrect color balance – RGB color balance and lack of haze provide competent indications when AC results for RGB bands are incorrect and which bands may be over- or under-represented. The MPC calibration performed for this investigation was confirmed via visual inspection to produce acceptable results (
Figure 8).
Data generated for Tests 1 and 2 were extracted from the L8/9 images as TOAR or AC output from CMAC and LaSRC within the QIA shapefile boundaries. These data were analyzed in spreadsheets. As a precursor to the three tests listed earlier, the provisional CMAC v1.1-L calibration was developed through the MPC pathway and tested qualitatively by correcting several relatively high-level Atm-I images. The example CMAC v1.1-L correction in
Figure 8 exhibits elevated Atm-I ranging between 970 and 1240. The CMAC test correction can be seen to have cleared most of the haze and restored a natural-appearing color balance. This result provided qualitative assurance that the MPC calibration was appropriate for application to process clean-to-moderate Atm-I recorded for the 31 test images of the five SoCal QIAs.
Although the v1.1-L calibration is acceptable, it is provisional and subject to further improvement.
Figure 8 is instructive: there are obvious features that indicate that the Atm-I levels exceeded the calibration for both methods. For example, the LaSRC image exhibits residual haze and a strong green color shift for thicker patches of haze visible in
Figure 1, indicating that Atm-I levels exceeded LaSRC capability—the green shift visible here was also found for LaSRC-corrected images for all four tests. Residual haze and thin wisps of uncorrected smoke are visible toward the upper left of Lake Tahoe in the CMAC correction. Because the scale of these features approaches the 300 m grid cells that mapped the Atm-I grayscale, the residual haze may be related to spatial resolution. The uncorrected wisps of haze in
Figure 8 are features of the TOAR smoke plume visible in
Figure 1a that are not captured by the Atm-I grayscale (
Figure 1b).
Figure 8 illustrates how the trend for increasingly higher resolution in smallsat imaging can enhance the accuracy and precision for SR retrieval. As spatial resolution increases, the finer detail from a higher resolution can capture Atm-I. Also, with higher resolutions, pixel mixing will decrease, and this will enhance spectral diversity, which is important in AI and other machine analyses; it is a property actively being explored for machine learning in remote sensing investigations [
20], for meteorology [
21], and especially for vegetation diversity as affected by climate change [
22,
23,
24].
Test 1. Performance statistics of corrected images under low-to-moderate Atm-I: CV% was estimated from the average and standard deviation of each band for the 31 images of each QIA.
Table 1 provides grand averages of the CV% for L8/9 AC across the five QIAs that show relatively close agreement of CV% among the three treatments.
Table 1 values exhibit patterns similar to those observed in the corresponding S2 analysis presented in paper 1: higher CV% (greater dispersion and lower precision) exists at the upper and lower tails of both distributions. Dispersion increased as band wavelength decreased—lowest for NIR and highest for blue. Each CV% value in
Table 1 represents an average of 155 values (31 images × 5 QIAs). A comparison of the average CV% distributions shows that CMAC performed better at the lowest reflectance levels. CMAC provides higher precision for the low reflectance of the red band and for the moderately high levels of NIR, both forming the indices for assessing vegetation performance for precision agriculture, measuring carbon cycling [
25,
26] and potentially for AI feature extraction [
27,
28]. The slightly higher dispersion (~1–2%) observed in the central portions of the CMAC-corrected visible band distributions may result from the relatively cursory L8/9 calibration.
The SR retrieval error for each image was estimated from the average distribution of the five lowest Atm-I images. This low Atm-I average was treated as the best available estimate of the true SR distribution. These calculations were employed individually in each QIA as the standard for error estimation of each percentile in the statistical distributions. To reduce the potential for biasing the comparison, the CMAC low Atm-I average was used as the standard for CMAC error estimation, and the LaSRC low Atm-I average was used for LaSRC error estimation.
Like CV%, % error is a statistic normalized through division by average values; hence, it can be combined for the five SoCal QIAs for an evaluation of the overall error of the two methods. In
Figure 9, % error estimates from each QIA were pooled and plotted according to their percentile rank.
Figure 9 demonstrates that the accuracy achieved using the CMAC provisional v1.1-L calibration is roughly equivalent to the accuracy of LaSRC. The error distributions for CMAC shown in
Figure 9 closely approximate the error distributions measured for S2 SR retrieval in Figure 14 of our earlier paper [
7].
To further investigate how error may affect the lower end of reflectance that is critical for AI feature extraction and vegetation indices, the absolute values of the estimated error were averaged for the reflectance distribution percentiles between 1% and 20% to create one index value representing the error for each image-QIA combination (n = 155 per band). These statistics were then pooled across the five QIAs and plotted according to the QIA median Atm-I (
Figure 10). The degree of data scatter increases with Atm-I.
Regression lines for the low-reflectance index values in
Figure 10 allow a statistical comparison of low-reflectance error from the clouds of points. In all four bands, LaSRC low-reflectance error was greater than CMAC judged in this manner. This disparity was greatest in the blue band, less in the green band, and even less but still present for the red band. This same pattern was also apparent in the CMAC-to-Sen2Cor comparison in paper 1, but more pronounced, especially for the blue band. A direct comparison of the same test reported in paper 1 indicates that LaSRC is more stable at the low end of Atm-I than Sen2Cor. Given the results of paper 1, this testing indicates that CMAC is more stable and accurate than Sen2Cor and LaSRC.
The results from the error analysis that are summarized in
Figure 9 and
Figure 10 here, and presented in paper 1, provide conclusive proof that the lumped-sum approach for determining atmospheric effects upon SR retrieval employed through Atm-I in CMAC has no discernible negative effect upon accuracy, at least in comparison to LaSRC and Sen2Cor output.
Test 2. SR distributions derived under low-to-moderate Atm-I: The SR retrieval values in each QIA were averaged across the 31 images at each of the percentile steps for the three treatments. The resulting average CDFs are plotted per band in
Figure 11 for each of the five QIAs. This comparison confirms that the CMAC SR retrieval is equivalent to that of LaSRC. The CMAC average SR retrievals are virtually identical to LaSRC and effectively plot one atop the other. Slight disagreement can be seen where the orange display of CMAC is exposed. Each of these cases occurred in the blue and green bands for complexities of the CMAC shape that followed from the same shape in the TOAR curves. These results confirm that CMAC v1.1L output is at least as accurate as LaSRC for low-to-moderate levels of Atm-I. Furthermore, this result indicates that the MPC based on single precise points for the translation of the S2 calibration for LaSRC application is appropriate for the calibration of CMAC for new satellites, and especially flocks of numerous smallsats.
Test 3. Quantitative performance of CMAC and LaSRC in moderate-to-extreme Atm-I: The August 2021 and 2022 British Columbia image pair, taken a year apart, captured ideal conditions for a comparison of images that were haze-affected compared to relatively haze-free conditions (
Figure 5). The three AOIs selected to support a comparison of CMAC and LaSRC L8/9 SR retrievals (cities of Penticton, Kelowna, and Vernon) captured a gradient of conditions from moderate to extreme haze effects on the 2021 image. The three circular AOIs for extraction of pixel statistics around these three cities were mapped to contain significant proportions of cultivated dense dark vegetation (DDV). Live spreadsheets for Test 3 containing extracted four-band VNIR pixel values and their statistical analyses are available for download in the
Supplemental Materials. CMAC v1.1L software can be accessed through a link to cloud-based correction for a verification of the SR retrieval reported here, and shapefiles are included to support data extraction for this purpose.
Table 2 provides a summarization of the results for the NDVI and NDVI-like indices calculated from the extracted data of moderate-to-extreme Atm-I (3 August 2021) versus relatively low Atm-I (6 August 2022). Data from the low-Atm-I image of 2023 (1 August) are included for comparison to the 2022 data.
Table 2 cells contain the results from averaging the band values extracted for the 20 highest NDVI per circular sampled area (
Figure 6) that were, in turn, averaged and used to calculate the indices presented. As mentioned previously, the lower the Atm-I, the closer the SR retrieval data are to true surface reflectance. Data from 2023 and 2022 (low Atm-I), agreed closely, as do the index values between CMAC and LaSRC for 2023 and 2022, confirming that true surface reflectance between these two dates was consistent, that surface reflectance for the 2021 smokey condition was likely equivalent to 2022 and 2023, and that the experimental design was appropriate.
As in Test 1, the clean 2022 indices can be treated as having been calculated from the true surface reflectance for the calculation of % error for the comparison to the smoke-affected indices of 2021. These calculations applied the equation presented in the Materials and Methods (% error = (value − standard)/standard × 100). The calculated CMAC error remained consistent under conditions of increasing Atm-I and even showed a slight reduction. From these results, CMAC correction can therefore be judged as reliable for applications utilizing NDVI or similar indices, AI feature extraction and various other machine analyses. Likewise, error propagation in the visible L8/9 bands was found to be lower than LaSRC for low reflectance in Test 1 (
Figure 10). This stability is reflected in CMAC’s superior performance for the Test 3 indices generated under moderate-to-extreme levels of Atm-I.
The high degree of error of the TOAR data in
Table 2 provides strong quantitative evidence for why surface reflectance retrieval is important. Given that far more smallsat images are now being generated than is practicable for analyst attention, accurate AC assumes a critical role to prepare image data for automated analyses. If performed correctly, a by-product of an AC processing step is image clarity. While existing methods can already achieve this result, they require analyst-proctored software and considerable attention to produce admittedly approximate output [
29]. CMAC can provide both in one near-real-time operation upon image download. An independent GIS-based application is planned for fine tuning severely impacted images, is an application described in the context of extreme Atm-I that affected Payette, Idaho and Davenport Iowa, USA, examples in the image gallery that supports Test 4. This secondary processing could be reserved for application to the most severely atmospherically affected images where tolerances for SR retrieval can be overwhelmed by the associated uncertainties. This application would incorporate the calibrated CMAC relationships adjusted by the analyst.
From repeated data correction and observation of the magnitude of Atm-I generated by the CMAC process, it is apparent that the extreme Atm-I levels measured in and around Vernon, BC can be expected to be episodic and due directly to coherent smoke plumes relatively close to the wildfire. Given the record levels of wildfire activity in North America in recent years, moderate levels of Atm-I resulting from dispersed smoke, such as those recorded for Penticton and Kelowna, can be expected to dominate the skies across much of North America during each growing season (spring through late fall). As indicated by
Table 2, this trend for increasing Atm-I can seriously impact LaSRC NDVI values and render TOAR NDVI unusable for precision agriculture and other automated applications.
Test 4. Qualitative results of image comparisons.
Approximately 60 L8/9 images from a wide range of conditions and locations have been downloaded and processed for CMAC development and for quantitative verification of the v1.1L calibration. Approximately 40 more images have been downloaded and processed for qualitative observation to compare LaSRC and CMAC results. Six of these images that are instructive for improving the state of the art were selected for presentation. This short image gallery can be downloaded from the
Supplemental Materials. In none of the approximately 100 total comparisons were the results from LaSRC an improvement over the image appearance from CMAC application.
4. Discussion
CMAC is a new method for atmospheric correction with no touchpoint for comparison to existing methods except through an evaluation of AC performance. While the CMAC workflow bears little resemblance to radiative transfer-based methods such as LaSRC, it could not have been developed without the prior pioneering work in surface reflectance retrieval that developed it for Landsat [
30] and applications applied to Sentinel 2 [
31]. Likewise, CMAC R&D was only made possible through analyses of copious high-quality imagery from the European Space agency’s Sentinel 2 program and the U.S. Geological Survey and NASA-supported Landsat program.
CMAC is based upon observations of the response of atmospherically transmitted reflectance. The empirical line method was adapted, inverted and adjusted to become the CMAC conceptual model that estimates a linear relationship expressing the deviation of surface reflectance to TOAR reflectance. The CMAC conceptual model is a linear representation of the effect from an atmospheric state upon the reflectance of all pixels, dark to bright, in the visible-through-NIR spectrum. Precedence exists in the remote sensing literature for such an approach [
14].
CMAC provides spatial estimates of atmospheric effect that are calculated by a model that applies scene statistics to estimate an atmospheric index, Atm-I. Through this model, Atm-I magnitude represents the blue band reflectance of hypothetical DDV plus the backscatter from the atmosphere. Atm-I values form a grayscale whose brightness provides a spatially discrete scalar to adjust the CMAC conceptual model. The combination of this conceptual model and the Atm-I model constitute the workflow that retrieves SR from TOAR. A third step calibrates each band’s sensor response to Atm-I. Calibration measures the slopes and offsets for the dynamic linear responses driven by Atm-I that exist for each sensor band. The calibration workflow results in curves of slope and offset to Atm-I that can then translate the Atm-I grayscale into a linear expression for atmospheric effect upon any pixel’s TOAR, from dark to bright. These three separate parts allow CMAC to reverse the atmospheric effect and retrieve an SR estimate individually for each band of each pixel of a calibrated satellite using only two inputs: TOAR and the Atm-I that caused it.
The CMAC calibration for L8/9 was derived from single points per band in application of master-proxy calibration that extrapolates single precise points for the range of Atm-I that can be encountered. The extrapolation was guided by the shapes of master curves of the slope and offset of each band driven by Atm-I that form families of curves of similar shape that nest together and stack in Cartesian space according to their wavelength. The L8/9 data were treated as the proxy while the highly calibrated S2 families of curves were treated as the master. This calibration generated the CMAC v1.1L software’s SR retrieval from the L8/9 TOAR presented in this paper.
Presently, calibration is applied vicariously through fitting slopes and offsets of multiple images to match known SR distributions. While this procedure was approached with programmatic steps, because such vicarious fitting incorporates uncertainty, multiple images were applied and the resulting slopes, offsets and Atm-I parameters were averaged to control the uncertainty. When fully operational, CMAC will, instead, apply a calibration target to achieve highly precise and fully automated calibration for potential application to an unlimited field of smallsats. The efficiency of this workflow is expected to ensure that high volumes of smallsats can be calibrated and periodically recalibrated to counter the inevitable but unpredictable episodic degradation of sensor response that is the consequence of the orbital environment [
11].
The accepted state-of-the-art LaSRC software was applied by the Landsat program and served as a milestone for judging the performance of CMAC SR retrieval. LaSRC uses ancillary data from MODIS [
6] to represent atmospheric effects. Instead, CMAC uses scene statistics to bypass the need for ancillary data to enable image processing immediately upon download. This difference, plus the economy for data processing in the CMAC workflow, may support multiple ISR objectives. For example, a real-time edge application with CMAC residing in-satellite and lagging by several seconds can correct an image as it is scrolled into memory by the pushbroom scanner. A digital image cleared of haze could then be transmitted to first responders or warfighters with only seconds of latency.
LaSRC and CMAC SR retrievals are compared here for 31 images obtained from a commonly encountered range of atmospheric effects. A comparison of the results showed that the CMAC and LaSRC SR retrievals were virtually identical for low-to-moderate levels of Atm-I. SR retrieval was also studied for moderate-to-extreme levels of Atm-I through comparisons of British Columbia images from three consecutive years under comparable environmental conditions and roughly commensurate SR distributions. Two images captured clear conditions and the third was impacted by a range of Atm-I wildfire smoke. Extracted data for the four VNIR bands processed by CMAC and LaSRC for the three years were combined into three normalized difference indices. These indices demonstrated that CMAC SR retrieval remains competent to support machine analysis and AI as Atm-I transitions from low to extreme effects. Judged through the nine index values calculated, CMAC contained −3% to 2% error (average 0.4%). Comparatively, LaSRC errors indicate a bias for increasing under correction at moderate-to-extreme levels of Atm-I, varying from −6% to −20% (average −9.9%).
Although it is well known that atmospheric correction is a necessary step in image processing for most applications, smallsat data purveyors may be tempted to omit this step if they find application of the RadTran AC and the harmonization pathway to be complex, inconvenient, or to provide insufficient accuracy. However, the overwhelming volume of images currently generated by smallsats can only be evaluated through automated machine analysis workflows and/or AI, and this data overload will only grow as more imaging smallsats are brought online. Against the need for automated image analysis, the results presented in
Table 2 offer the best argument for why AC is a crucial step in image processing: data left as uncorrected TOAR incorporated significant uncertainty that increased the apparent error 13% to 38% as Atm-I increased. Depending upon the application, such high levels of error are likely unacceptable.
Test results of CMAC for the 30 m pixel, L8/9 data and higher-resolution 10 m pixel S2 data, provided confirmation that CMAC accuracy and precision will increase as pixel spatial resolution increases. This enhancement is due to higher granularity for Atm-I assessment and reduced pixel-mixing effects.
As can be appreciated from the results of these analyses, current state-of-the-art LaSRC functions relatively well for images that are not highly affected by the atmosphere, but it is limited for images affected by higher levels of Atm-I. However, with each passing year, climate change is intensifying and causing increasing wildfire incidence and magnitude [
32]. Smoke haze can cover most of North America for weeks at a time [
33] and elevated wildfires are now occurring over large pan-boreal regions [
34]. Wildfire smoke has a significant impact upon all applications of satellite imagery, a ready example being precision agriculture’s reliance upon vegetation indices that are highly affected by variable atmospheric effects [
35], a finding corroborated by the Test 3 NDVI results.
The intent for CMAC development has been its application to serve the smallsat industry. Likewise, LaSRC, widely regarded as state-of-the-art in AC, has been tasked for SR retrieval from smallsat data [
5] through a radiance harmonization pathway [
5,
9]. Paper 1 examined S2 data using the same testing procedures as those employed in this paper; those results and those presented here found that LaSRC provided SR retrieval with greater accuracy than Sen2Cor. CMAC retrieved SR that was remarkably consistent with LaSRC for low-to-moderate Atm-I, but with greater accuracy at higher levels of Atm-I. These results confirm that CMAC is application-ready for optical satellites, including numerous members of smallsat flocks.