Article

Clean Collector Algorithm for Satellite Image Pre-Processing of SAR-to-EO Translation

1 Department of Information and Communication Engineering, Hanbat National University, Daejeon 34014, Republic of Korea
2 Korea Research Institute of Ships and Ocean Engineering (KRISO), 32, Yuseong-daero 1312 beon-gil, Yuseong-gu, Daejeon 34103, Republic of Korea
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Electronics 2024, 13(22), 4529; https://doi.org/10.3390/electronics13224529
Submission received: 28 September 2024 / Revised: 8 November 2024 / Accepted: 12 November 2024 / Published: 18 November 2024
(This article belongs to the Collection Computer Vision and Pattern Recognition Techniques)

Abstract

In applications such as environmental monitoring, algorithms and deep learning-based methods using synthetic aperture radar (SAR) and electro-optical (EO) data have been proposed with promising results. These results, however, have been achieved using training datasets that were already cleaned. In real-world data collection, data are often collected regardless of environmental noise (clouds, night-time acquisition, missing data, etc.). If this noise is not removed, the trained model suffers severe performance degradation. To address these issues, we propose the Clean Collector Algorithm (CCA). First, we use a pixel-based approach that exploits the QA60 cloud mask and removes pixel-value outliers. Secondly, we remove missing data and night-time data that can act as noise in the training process. Finally, we use a feature-based refinement step that removes residual cloud images using the Fréchet inception distance (FID). We demonstrate the effectiveness of the CCA by winning first place in the SAR-to-EO translation track of the MultiEarth 2023 challenge. We also highlight its performance and robustness on other cloud-affected datasets, SEN12MS-CR-TS and Scotland&India.

1. Introduction

The field of remote sensing has undergone a remarkable transformation in recent years, driven by the synergistic advancement of satellite technology and deep learning. Satellite systems have evolved rapidly, bringing about significant improvements in data acquisition. These advancements include enhanced spatial and temporal resolutions, expanded spectral ranges, and more sophisticated sensor capabilities [1,2,3]. Parallel to this, the exponential growth in computational power, coupled with the development of increasingly sophisticated deep learning algorithms, has unlocked new possibilities for processing and analyzing the vast quantities of satellite data now available [4].
In particular, deep learning models have shown exceptional promise in tackling various remote sensing challenges. Convolutional Neural Networks (CNNs) have proven highly effective for tasks such as image classification and object detection [5,6], while generative adversarial networks (GANs) have demonstrated their utility in image-to-image translation tasks [7,8,9,10,11,12]. These approaches have significantly enhanced the accuracy and efficiency of applications ranging from environmental monitoring and urban planning to disaster response [13].
Satellite remote sensing data can capture a diverse array of scenarios and are therefore extensively employed in tasks including environmental monitoring [14,15], change detection [16,17,18], and disaster assessment [19,20,21].
SAR has the advantage of capturing images regardless of weather conditions or time of day; for instance, it can accurately observe terrain, buildings, and objects even in cloudy weather or at night. In contrast, electro-optical (EO) imagery relies on visible light, making it more intuitive and easier for humans to interpret. Converting SAR imagery to the optical domain therefore provides easily interpretable images regardless of weather and lighting conditions, and various research efforts and competitions have focused on developing SAR-to-EO conversion models. Because EO data are based on visible light, however, they are highly susceptible to weather variations, which can impact data quality and consistency in SAR-to-EO conversion. Consequently, effective pre-processing has emerged as a critical factor in maximizing the potential of deep learning in satellite remote sensing applications [22]. Noise arising from factors such as weather, clouds, smoke, and lighting conditions imposes substantial constraints on the utilization of EO data; in particular, its presence poses a significant challenge for continuous EO imagery tasks in environmental monitoring. To surmount these limitations, methodologies incorporating synthetic aperture radar (SAR) have been proposed [23,24], offering advantages such as the ability to characterize surface properties and to collect data during both day and night, even under adverse weather conditions. Nonetheless, SAR data present challenges for human interpretation, limiting their practicality. To address these hurdles, converting SAR data into EO data has been suggested, facilitating easier comprehension and interpretation by humans.
Cabrera et al. [21] proposed a methodology to learn a better mapping between SAR and EO images by leveraging additional complementary modalities, such as OpenStreetMap, infrared (IR), latitude, and longitude. Haixia Wang et al. [25] proposed a parallel generative adversarial network (Parallel-GAN) consisting of a backbone image translation subnetwork that converts SAR images to EO images and an EO image reconstruction subnetwork that extracts hierarchical optical features. Lei Wang et al. [26] combined the advantages of CycleGAN [27] and pix2pix [28] to propose a supervised cycle-consistent adversarial network (S-CycleGAN) that preserves both land cover and structure information. While these methods perform well, they assume that the dataset is noise-free. Specifically, the public datasets [29,30,31] used in most studies have been cleaned by experts to remove various real-world noise sources, and existing methodologies have not focused on data cleaning. If a model is trained on a noisy dataset, the performance degradation can be critical.
Among the real-world noise sources in EO data, cloud occlusion causes the most problems by hiding geographic information. To address this, the cloud removal task has been proposed, which converts cloudy images into cloud-free images. Methods that utilize generative models [32,33,34] to translate cloudy images into cloud-free images are effective, but the results may differ from the real geographic information when texture information is lacking (e.g., when half or all of the image is covered by clouds). The use of time series data [35,36] has also been proposed to mitigate these problems, but collecting time series data from the same region can be time-consuming and expensive. In addition, the use of other modalities [37,38] to preserve texture and geographic information has been considered, but like the methods mentioned above, they require refined data free of real-world noise to train high-performance models.
To address these issues, we propose the Clean Collector Algorithm (CCA), which effectively refines real-world noisy data (clouds, night-time, missing data). The CCA consists of three main stages. Firstly, using the QA60 band and experimentally obtained pixel values, we eliminate images with significant cloud cover. Secondly, we remove night-time images and images with missing pixels by computing the average brightness value in HSV space and the proportion of low-value pixels. Lastly, to handle the remaining cloud data that could not be removed by the QA60 band, we construct a hand-crafted subset of cloudy EO images and use the Fréchet inception distance (FID) between this subset and each EO image. We have confirmed that the FID values allow us to approximately assess the extent of cloud cover in the data. Figure 1 displays example images from each stage.
We participated in the MultiEarth 2023 challenge [39] for environmental monitoring in the Amazon rainforest and achieved first place in the SAR-to-EO translation track by applying the CCA proposed in this paper. The dataset of the competition was collected in the Amazon rainforest with high rainfall, and it contains a large amount of environmental noise (weather, clouds, night and day, missing pixels, etc.).
To validate the effectiveness of the CCA, we evaluated it on an evaluation dataset of MultiEarth 2023 that we built ourselves. To prove that our method is not biased toward one dataset, we also validated its performance on two existing datasets, SEN12MS-CR-TS [40] and Scotland&India [41], which contain real-world noise. The code is available at https://github.com/cjf8899/MultiEarth2023_SAR2EO_1st_Place_Solution (accessed on 11 November 2024).

2. Methodology

EO data collected in the real world are unrefined and contain real-world noise such as clouds, weather effects, and illumination differences. An image translation model trained on data containing such noise may produce noisy results and suffer performance degradation; for example, a translation model trained on data containing many clouds may produce clouds in its translation results. To address these issues, we propose the Clean Collector Algorithm (CCA) for effective data cleaning in three stages.

2.1. QA60, Outlier

The datasets collected by Sentinel-2 are multi-spectral datasets that support monitoring studies by observing soil, coastal, and vegetated areas. These data contain various bands according to wavelength (B1–B12, QA60, etc.). QA60 has a spatial resolution of 60 m and is a bitmask band that contains cloud mask information. Using this cloud mask information, we can easily identify images containing clouds. Note that this step is not necessary if the ingested datasets do not provide a cloud mask.
Furthermore, we found that in the datasets collected by Sentinel-2, images containing clouds tend to contain pixels above a certain value. Figure 2 shows a graph of the maximum pixel value for each EO image (B4, B3, and B2 bands corresponding to RGB channels) versus the FID value for that image (the method for building the FID is described in Section 2.3). In this figure, we can see that a clear image with no clouds has low pixel values and high FID values, while an image with clouds has high pixel values and low FID values. Note that this check is not necessary for datasets built from images that have already been normalized.
Therefore, as a first stage, we excluded images with 1 ∈ M_QA from the training dataset, where M_QA is the QA60 band cloud mask. Based on the analysis in Figure 2, we also considered images containing pixels above a threshold α as cloud data and excluded them from the training dataset. We empirically set α = 4096.
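As an illustration, a minimal sketch of this first stage is given below. It assumes the standard Sentinel-2 QA60 bit layout (bit 10 for opaque clouds, bit 11 for cirrus) to derive the binary mask M_QA, and assumes raw, unnormalized pixel values; the names are illustrative rather than taken from the released code.

import numpy as np

ALPHA = 4096  # empirical pixel-value outlier threshold (Section 2.1)

def passes_stage1(eo_rgb: np.ndarray, qa60: np.ndarray) -> bool:
    """Return True if an EO image survives the QA60 and outlier checks."""
    # qa60 is the integer QA60 bitmask band of the same tile
    opaque = (qa60 & (1 << 10)) > 0   # bit 10: opaque clouds
    cirrus = (qa60 & (1 << 11)) > 0   # bit 11: cirrus clouds
    if np.any(opaque | cirrus):       # any flagged pixel -> discard the image
        return False
    if eo_rgb.max() > ALPHA:          # bright outlier pixels indicate clouds
        return False
    return True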

2.2. Missing, Night-Time

Due to satellite sensor malfunctions and errors in data collection, remote sensing images are missing a lot of information, as shown in Figure 1b, which reduces their utility and hinders subsequent interpretation. In addition, these missing images can change the distribution of training data, which can adversely affect learning.
Furthermore, the collected data could be taken at any time of the day and include night-time data. Such night-time data are difficult to interpret due to the lack of illumination and the limited self-luminescence of objects, resulting in very sparse shapes, structures, and characters of objects.
The main purpose of translating SAR to EO is to produce images that are easily interpretable by humans, but these issues can negatively impact that goal. To remove these problematic data, we convert each image to HSV space, where the V (value) channel represents the brightness of each pixel. Missing data have a value of 0 because the pixels are absent, and night-time data have low values because the scene is dark. Therefore, we minimize the impact of dark and missing data on the training set by removing images whose average value is lower than 30. In addition, we remove images in which the number of RGB pixels with values below 10 exceeds 30% of the total pixels, because images with only a small proportion of missing pixels may not be caught by the average brightness criterion.

2.3. Fréchet Inception Distance

While the QA60 band mask can remove images that contain a lot of clouds, it has the limitation of not being able to remove thin or sparse clouds. To address this limitation, we utilize the Fréchet inception distance (FID), a feature-based similarity measure, to remove residual cloud data. The FID is used in image generation tasks to measure the quality of generated images and to assess how similar their distribution is to that of real images. Inspired by this concept, we can calculate and quantify the distribution similarity between a subset of cloud images and an EO image.
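As a rough sketch of how such a per-image score can be computed, the snippet below fits Gaussians to Inception-style feature vectors and evaluates the Fréchet distance between them; the choice of feature extractor and the way the single image's feature statistics are formed (e.g., from the spatial positions of an intermediate feature map) are illustrative assumptions, not a description of our exact implementation.

import numpy as np
from scipy import linalg

def gaussian_stats(features):
    """Fit a mean and covariance to a set of feature vectors of shape (N, D)."""
    return features.mean(axis=0), np.cov(features, rowvar=False)

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between the Gaussians N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny imaginary parts from numerical error
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# cloud_feats: features of the hand-selected cloud subset, shape (N, D)
# img_feats:   features of one EO image (hypothetical per-image sampling), shape (M, D)
# mu_c, sig_c = gaussian_stats(cloud_feats)
# mu_i, sig_i = gaussian_stats(img_feats)
# score = frechet_distance(mu_c, sig_c, mu_i, sig_i)  # low score -> cloud-like image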
The last stage is as follows. To start with, we hand-select and build a cloud subset of 1000 images that contain a lot of clouds. When building this set, we chose images with thick clouds covering at least 4/5 of the total image area so that cloud images can be reliably identified using the FID score. Then, we calculated the FID values between the cloud subset and the EO data to create a value set S = {S_i | i = 1, 2, …, N}, where N is the number of EO images that passed the previous two stages. We found that the FID value can roughly indicate the degree of cloud cover in an image. We set a threshold of 0.3 through several experiments, and that threshold is used in Equation (1) to remove EO data with values lower than the threshold:
F_th = min(S) + (max(S) − min(S)) × β        (1)
where β is a hyperparameter that determines the quality of the training dataset; we used 0.4 for all experiments.
To summarize, we propose the CCA with three stages to remove images containing real-world noise that degrades performance during training. The corresponding Python-style pseudocode is shown in Algorithm 1.
Algorithm 1 Python Implementation of CCA
import cv2
import numpy as np

# D       : iterable of (EO image, QA60 band mask M_QA) pairs
# D_cloud : cloud subset of 1000 hand-selected cloudy images
# FID     : function computing the per-image FID score against D_cloud
# alpha   : pixel-value outlier threshold (4096); beta : quality hyperparameter (0.4)

stage1_2_list = []  # images that passed stages 1 and 2
stage3_list = []    # images that passed stage 3
for img, M_QA in D:
    # Stage 1: QA60 & outlier removal
    if (1 in M_QA) or (img.max() > alpha):
        continue

    # Stage 2: missing-data & night-time removal
    img = normalize(img)  # scale to uint8 in [0, 255] for OpenCV
    img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    brightness = np.mean(img_hsv[:, :, 2])  # mean of the V (brightness) channel

    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    threshold = img.shape[0] * img.shape[1] * 0.3   # 30% of all pixels
    missing_num = len(img_gray[img_gray <= 10])     # near-zero (missing/dark) pixels

    if (brightness < 30) or (missing_num > threshold):
        continue
    stage1_2_list.append(img)

# Stage 3: Frechet inception distance removal
S = FID(np.array(stage1_2_list), D_cloud)
S_min, S_max = np.min(S), np.max(S)
F_th = S_min + (S_max - S_min) * beta  # Equation (1)

for img, score in zip(stage1_2_list, S):
    if score > F_th:  # keep images far from the cloud distribution
        stage3_list.append(img)

3. Experiments

3.1. Dataset

  • MultiEarth 2023 The MultiEarth 2023 dataset is a collection of remote sensing data with Sentinel-1 and Sentinel-2 bands, containing 780,706 and 5,241,119 images respectively, all of size 256 × 256. It focuses on Earth ecosystem analysis through challenges centered on the Amazon rainforest, a region with heavy cloud cover.
  • SEN12MS-CR-TS The SEN12MS-CR-TS dataset is a multi-modal and multi-temporal dataset designed for global and all-season cloud removal techniques. It consists of 53 diverse regions of interest, each covering about 40 × 40 km². The dataset includes over 700 patches per region, with a patch size of 256 × 256.
  • Scotland&India The Scotland&India dataset consists of 444 samples, all of size 256 × 256. The data include clear-sky images from 2019 and cloudy images from 2020.

3.2. Baseline Selection

We selected three GAN-based methods, pix2pix, pix2pixHD, and SPADE, to compare the performance of the CCA.
  • pix2pix pix2pix proposes a conditional generative adversarial network (cGAN) for image-to-image translation. The model can perform various image translation tasks in a single unified framework and has become the basis for various GAN-based models.
  • pix2pixHD pix2pixHD is an extended version of pix2pix and proposes a model that enables high-resolution image-to-image translation. It is designed to generate high-resolution images (2048 × 1024 resolution) that the existing pix2pix model has difficulty processing, making it very useful for realistic image generation and complex scene processing.
  • SPADE Spatially-adaptive Denormalization (SPADE) is a GAN-based model designed to generate high-resolution images for tasks like image synthesis and transformation. It excels at producing realistic images from semantic layouts (segmentation maps) and effectively overcomes the limitations of previous GAN-based models.

3.3. Implementation Details and Metrics

We pre-processed the SAR data following the official papers of SEN12MS-CR-TS and Scotland&India. We also generated three-channel SAR data as described by Cabrera et al. [21]. The SAR and EO data from all datasets used for training and inference were normalized to 0–1 before being input to the model.
To evaluate the effect of our CCA, we selected the SSIM, PSNR, MAE, and MSE as metrics to assess the quality of the generated EO data. These metrics measure the visual quality difference and the loss of information between the EO data generated by the translation model and the real EO data.
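For reference, a minimal sketch of these metrics is given below; it assumes scikit-image and that both images are float arrays normalized to 0–1 (data_range = 1.0), which may differ from the exact evaluation script.

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(pred, target):
    """SSIM, PSNR, MAE, and MSE for one generated/real EO image pair in [0, 1]."""
    return {
        "SSIM": structural_similarity(target, pred, channel_axis=-1, data_range=1.0),
        "PSNR": peak_signal_noise_ratio(target, pred, data_range=1.0),
        "MAE": float(np.mean(np.abs(target - pred))),
        "MSE": float(np.mean((target - pred) ** 2)),
    }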

3.4. Quantitative Results

To evaluate the performance of the proposed CCA, we compared it to several baseline methods on the MultiEarth 2023, SEN12MS-CR-TS, and Scotland&India datasets. Table 1 shows the quantitative results measured by evaluation metrics popular in image translation tasks: the SSIM, PSNR, MAE, and MSE. All three baseline methods improved on all datasets when the CCA was applied. On the MultiEarth 2023 dataset, pix2pix, the earliest of the evaluated models, showed the largest gains across all metrics: +0.144 SSIM, +4.882 dB PSNR, −0.079 MAE, and −0.027 MSE compared to training without the CCA. In addition, comparing the pix2pix model with the CCA to the other models without the CCA, the pix2pix model achieves better performance on all datasets. These results indicate that noise in the training data has a significant impact on performance.

3.5. Ablation Study

We conducted an ablation study to evaluate the contribution of each stage of the CCA; our goal is to convert SAR images into high-quality EO images by removing noisy EO images during training. Table 3 shows the results of this ablation study, obtained using the pix2pixHD model. The baseline model, without any CCA stages, struggled with the inherent challenges posed by noisy EO images, particularly those affected by cloud coverage. This initial performance highlighted the limitations of handling raw data with significant atmospheric disturbances and inconsistencies.
The QA60&Outlier stage effectively identified and removed cloud-covered areas and outliers, especially at cloud edges, resulting in cleaner and more reliable outputs. This stage underscored the importance of preemptively addressing cloud-related distortions and outliers, which had been a major source of error in the baseline model.
The Missing&Night-time stage addressed gaps in data coverage and the inclusion of night-time images, which often lack sufficient detail for accurate analysis. By excluding these lower-quality inputs, the model became more consistent in generating images with higher visual fidelity. The cumulative effect of the first two stages led to noticeable improvements in image clarity, reducing artifacts and inconsistencies that had hindered the baseline model.
The final stage moved beyond addressing large-scale issues like clouds and missing data, focusing instead on more nuanced distinctions in image quality. By leveraging the FID score, the model was able to detect and eliminate subtle defects, such as residual atmospheric distortions, that previous stages might have overlooked. This resulted in a significant leap in image quality, with reconstructions that were both visually clearer and quantitatively superior. The substantial improvement in PSNR (3.568 dB increase), along with the significant reductions in MAE (−0.046) and MSE (−0.017) from the baseline to the final stage, underscores the effectiveness of the complete CCA pipeline. These enhancements translated into visually clearer and more accurate EO image reconstructions from SAR inputs, with fewer artifacts and distortions caused by atmospheric conditions or data quality issues.
Furthermore, to directly evaluate the ability of the CCA to remove cloud data, we labeled a portion of the MultiEarth 2023 data and performed cloud/clean classification. We randomly sampled 22,273 images and labeled 2271 as clean and 20,002 as noisy. We evaluated the CCA using 1000 cloud images that are not included in this sample.
The 1000 cloud images utilized in the FID score step can be seen in Figure 3a. As can be seen, the selected data contain dense and thick clouds. In SAR-to-EO image conversion, images with shallow, broadly shaped clouds and images with small but thick clouds also cause performance degradation during training. Using the cloud subset in (a), we were able to successfully separate cloud data of mixed types, including the sparse and shallow cases shown in (b). Table 2 shows that the CCA identifies clouds with a high accuracy of 0.97 in this experiment.
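These metrics can be reproduced directly from the confusion matrix in Table 2 when the clean class is treated as positive (TP = 1797, TN = 19,910, FP = 92, FN = 474); the short check below is purely illustrative and is not taken from the released code.

tp, tn, fp, fn = 1797, 19910, 92, 474
accuracy = (tp + tn) / (tp + tn + fp + fn)           # ≈ 0.9746
precision = tp / (tp + fp)                           # ≈ 0.9513
recall = tp / (tp + fn)                              # ≈ 0.7913
f1 = 2 * precision * recall / (precision + recall)   # ≈ 0.8639
print(accuracy, precision, recall, f1)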
These results illustrate the cumulative benefits of the CCA pipeline. Each stage contributed meaningfully to improving the model’s performance, addressing both large-scale issues like cloud coverage and more subtle image quality factors. The final output demonstrated substantial improvements in both clarity and accuracy, highlighting the value of a multi-stage approach to handling noisy EO data.

3.6. Qualitative Results

Figure 4 qualitatively compares the image translation results of pix2pix, pix2pixHD, and SPADE on each dataset. As shown in the figure, without the CCA, both pix2pix and SPADE struggle to learn the texture of the EO data due to cloud interference, leading to blurry images. pix2pixHD is particularly affected, often generating clouds even when they are absent from the EO data; it is also highly affected by night-time images, generating darker images compared to the other models. With the CCA, on the other hand, the models are less affected by real-world noise such as clouds and generate cleaner images. SPADE generates better-quality images than without the CCA, and pix2pixHD, which generated the most clouds before applying the CCA, generates images without clouds.

4. Conclusions

In this paper, we introduced the Clean Collector Algorithm (CCA) to address the challenges posed by noisy data in SAR-to-EO translation. The CCA is a three-stage process designed to remove specific sources of noise, such as clouds, night-time images, and missing data, ensuring cleaner and more reliable EO outputs. Even when a sensor provides cloud masking information, such as the QA60 band, it is not possible to remove all clouds that are actually present. By incorporating FID scores, our method effectively addresses these residual cloud artifacts, improving both visual quality and quantitative performance. This demonstrates that the precise handling of atmospheric distortions, cloud cover, and low-quality inputs is essential for generating accurate EO images from SAR data. While high-performance models are important, it is the careful pre-processing and removal of noisy data that unlock the full potential of SAR-to-EO translation. Our results are validated across three benchmark datasets and further evidenced by a first-place finish in the MultiEarth 2023 challenge.
The FID-based filtering in the CCA removes even faintly clouded images to some extent, but further research is needed on detecting and removing very shallow clouds, such as fog-like cloud cover. We hope that this work encourages further exploration of noise handling techniques for SAR-to-EO translation and broader remote sensing applications, where data quality plays a fundamental role in model success.

Author Contributions

Conceptualization, J.-G.J., H.-C.N. and M.-W.K.; methodology, J.-G.J., H.-C.N. and M.-W.K.; software, J.-G.J., H.-C.N. and M.-W.K.; validation, J.-G.J., H.-C.N. and M.-W.K.; formal analysis, J.-G.J., H.-C.N., M.-W.K. and S.-K.P.; investigation, J.-G.J., H.-C.N. and M.-W.K.; resources, S.-K.P. and D.-G.C.; writing—original draft preparation, J.-G.J., H.-C.N., S.-K.P. and M.-W.K.; writing—review and editing, J.-G.J., H.-C.N., S.-K.P. and M.-W.K.; visualization, J.-G.J., H.-C.N. and M.-W.K.; supervision, D.-G.C.; project administration, D.-G.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Korea Research Institute of Ships and Ocean Engineering through a grant from the Endowment Project “Development of Open Platform Technologies for Smart Maritime Safety and Industries” funded by the Ministry of Oceans and Fisheries (1525014880, PES5230), and in part by the NRF grant funded by the Korean Government, MSIT (Ministry of Science and ICT) (2023R1A2C1007428).

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Búrquez, A.; Ochoa, M.B.; Martínez-Yrízar, A.; de Souza, J.O.P. Human-made small reservoirs alter dryland hydrological connectivity. Sci. Total Environ. 2024, 947, 174673. [Google Scholar] [CrossRef]
  2. Paheding, S.; Saleem, A.; Siddiqui, M.F.H.; Rawashdeh, N.; Essa, A.; Reyes, A.A. Advancing horizons in remote sensing: A comprehensive survey of deep learning models and applications in image classification and beyond. Neural Comput. Appl. 2024, 36, 16727–16767. [Google Scholar] [CrossRef]
  3. Wu, J.; Huang, X.; Xu, N.; Zhu, Q.; Zorn, C.; Guo, W.; Wang, J.; Wang, B.; Shao, S.; Yu, C. Combining Satellite Imagery and a Deep Learning Algorithm to Retrieve the Water Levels of Small Reservoirs. Remote Sens. 2023, 15, 5740. [Google Scholar] [CrossRef]
  4. Adegun, A.A.; Viriri, S.; Tapamo, J.R. Review of deep learning methods for remote sensing satellite images classification. J. Big Data 2023, 10, 93. [Google Scholar] [CrossRef]
  5. Xu, D.; Wu, Y. An Efficient Detector with Auxiliary Network for Remote Sensing Object Detection. Electronics 2023, 12, 4448. [Google Scholar] [CrossRef]
  6. Kanagavelu, R.; Dua, K.; Garai, P.; Thomas, N.; Elias, S.; Elias, S.; Wei, Q.; Yong, L.; Rick, G.S.M. Federated unet model with knowledge distillation for land use classification from satellite and street views. Electronics 2023, 12, 896. [Google Scholar] [CrossRef]
  7. Song, J.; Li, J.; Chen, H.; Wu, J. RSMT: A remote sensing image-to-map translation model via adversarial deep transfer learning. Remote Sens. 2022, 14, 919. [Google Scholar] [CrossRef]
  8. Park, T.; Liu, M.Y.; Wang, T.C.; Zhu, J.Y. Semantic Image Synthesis with Spatially-Adaptive Normalization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2337–2346. [Google Scholar]
  9. Liu, G.H.; Vahdat, A.; Huang, D.A.; Theodorou, E.A.; Nie, W.; Anandkumar, A. I2SB: Image-to-Image Schrödinger Bridge. Proc. Mach. Learn. Res. 2023, 202, 22042–22062. [Google Scholar]
  10. Li, B.; Xue, K.; Liu, B.; Lai, Y.K. BBDM: Image-to-image translation with Brownian bridge diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 1952–1961. [Google Scholar]
  11. Li, S.; Cheng, M.M.; Gall, J. Dual Pyramid Generative Adversarial Networks for Semantic Image Synthesis. arXiv 2022, arXiv:2210.04085. [Google Scholar]
  12. Wang, T.C.; Liu, M.Y.; Zhu, J.Y.; Tao, A.; Kautz, J.; Catanzaro, B. High-resolution image synthesis and semantic manipulation with conditional gans. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 8798–8807. [Google Scholar]
  13. Wang, P.; Bayram, B.; Sertel, E. A comprehensive review on deep learning based remote sensing image super-resolution methods. Earth-Sci. Rev. 2022, 232, 104110. [Google Scholar] [CrossRef]
  14. Huang, Z.; Wang, L.; An, Q.; Zhou, Q.; Hong, H. Learning a contrast enhancer for intensity correction of remotely sensed images. IEEE Signal Process. Lett. 2021, 29, 394–398. [Google Scholar] [CrossRef]
  15. Pan, X.; Xie, F.; Jiang, Z.; Yin, J. Haze removal for a single remote sensing image based on deformed haze imaging model. IEEE Signal Process. Lett. 2021, 22, 1806–1810. [Google Scholar] [CrossRef]
  16. Noh, H.; Ju, J.; Seo, M.; Park, J.; Choi, D.G. Unsupervised change detection based on image reconstruction loss. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 1352–1361. [Google Scholar]
  17. Noh, H.; Ju, J.; Kim, Y.; Kim, M.; Choi, D.G. Unsupervised change detection based on image reconstruction loss with segment anything. Remote Sens. Lett. 2024, 15, 919–929. [Google Scholar] [CrossRef]
  18. Seo, M.; Lee, H.; Jeon, Y.; Seo, J. Self-Pair: Synthesizing Changes from Single Source for Object Change Detection in Remote Sensing Imagery. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2023; pp. 6374–6383. [Google Scholar]
  19. Hussain, M.; Chen, D.; Cheng, A.; Wei, H.; Stanley, D. Change detection from remotely sensed images: From pixel-based to object-based approaches. ISPRS J. Photogramm. Remote Sens. 2013, 80, 91–106. [Google Scholar] [CrossRef]
  20. Zhang, X.; Xiao, P.; Feng, X.; Yuan, M. Separate segmentation of multi-temporal high-resolution remote sensing images for object-based change detection in urban area. Remote Sens. Environ. 2017, 201, 243–255. [Google Scholar] [CrossRef]
  21. Cabrera, A.; Cha, M.; Sharma, P.; Newey, M. SAR-to-EO Image Translation with Multi-Conditional Adversarial Networks. In Proceedings of the 2021 55th Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA, 31 October–3 November 2021; pp. 1710–1714. [Google Scholar]
  22. Lin, F.; Chuang, Y. Interoperability study of data preprocessing for deep learning and high-resolution aerial photographs for forest and vegetation type identification. Remote Sens. 2021, 13, 4036. [Google Scholar] [CrossRef]
  23. Fornaro, G.; Reale, D.; Verde, S. Bridge Thermal Dilation Monitoring With Millimeter Sensitivity via Multidimensional SAR Imaging. IEEE Geosci. Remote Sens. Lett. 2012, 10, 677–681. [Google Scholar] [CrossRef]
  24. Luzi, G.; Pieraccini, M.; Mecatti, D.; Noferini, L.; Macaluso, G.; Tamburini, A.; Atzeni, C. Monitoring of an Alpine Glacier by Means of Ground-Based SAR Interferometry. IEEE Geosci. Remote Sens. Lett. 2007, 4, 495–499. [Google Scholar] [CrossRef]
  25. Wang, H.; Zhang, Z.; Hu, Z.; Dong, Q. SAR-to-Optical Image Translation with Hierarchical Latent Features. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–12. [Google Scholar] [CrossRef]
  26. Wang, L.; Xu, X.; Yu, Y.; Yang, R.; Gui, R.; Xu, Z.; Pu, F. SAR-to-Optical Image Translation Using Supervised Cycle-Consistent Adversarial Networks. IEEE Access 2019, 7, 129136–129149. [Google Scholar] [CrossRef]
  27. Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2223–2232. [Google Scholar]
  28. Isola, P.; Zhu, J.Y.; Zhou, T.; Efros, A.A. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1125–1134. [Google Scholar]
  29. Schmitt, M.; Hughes, L.H.; Qiu, C.; Zhu, X.X. SEN12MS–A Curated Dataset of Georeferenced Multi-Spectral Sentinel-1/2 Imagery for Deep Learning and Data Fusion. arXiv 2019, arXiv:1906.07789. [Google Scholar] [CrossRef]
  30. Schmitt, M.; Prexl, J.; Ebel, P.; Liebel, L.; Zhu, X.X. Weakly supervised semantic segmentation of satellite images for land cover mapping–challenges and opportunities. arXiv 2020, arXiv:2002.08254. [Google Scholar] [CrossRef]
  31. Shermeyer, J.; Hogan, D.; Brown, J.; Van Etten, A.; Weir, N.; Pacifici, F.; Hansch, R.; Bastidas, A.; Soenen, S.; Bacastow, T.; et al. SpaceNet 6: Multi-Sensor All Weather Mapping Dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 14–19 June 2020; pp. 196–197. [Google Scholar]
  32. Singh, P.; Komodakis, N. Cloud-Gan: Cloud Removal for Sentinel-2 Imagery Using a Cyclic Consistent Generative Adversarial Networks. In Proceedings of the IGARSS 2018—2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia, Spain, 22–27 July 2018; pp. 1772–1775. [Google Scholar]
  33. Pan, H. Cloud removal for remote sensing imagery via spatial attention generative adversarial network. arXiv 2018, arXiv:2009.13015. [Google Scholar]
  34. Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H. Restormer: Efficient transformer for high-resolution image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 5728–5739. [Google Scholar]
  35. Sarukkai, V.; Jain, A.; Uzkent, B.; Ermon, S. Cloud Removal in Satellite Images Using Spatiotemporal Generative Networks. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision, Snowmass Village, CO, USA, 1–5 March 2020; pp. 1796–1805. [Google Scholar]
  36. Ebel, P.; Garnot, V.S.F.; Schmitt, M.; Wegner, J.D.; Zhu, X.X. UnCRtainTS: Uncertainty Quantification for Cloud Removal in Optical Satellite Time Series. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Vancouver, BC, Canada, 18–22 June 2023; pp. 2085–2095. [Google Scholar]
  37. Meraner, A.; Ebel, P.; Zhu, X.X.; Schmitt, M. Cloud Removal in Sentinel-2 Imagery Using a Deep Residual Neural Network and SAR-Optical Data Fusion. ISPRS J. Photogramm. Remote Sens. 2020, 166, 333–346. [Google Scholar] [CrossRef] [PubMed]
  38. Enomoto, K.; Sakurada, K.; Wang, W.; Fukui, H.; Matsuoka, M.; Nakamura, R.; Kawaguchi, N. Filmy Cloud Removal on Satellite Imagery with Multispectral Conditional Generative Adversarial Nets. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 48–56. [Google Scholar]
  39. Cha, M.; Angelides, G.; Hamilton, M.; Soszynski, A.; Swenson, B.; Maidel, N.; Isola, P.; Perron, T.; Freeman, B. MultiEarth 2023—Multimodal Learning for Earth and Environment Workshop and Challenge. arXiv 2023, arXiv:2306.04738. [Google Scholar]
  40. Ebel, P.; Xu, Y.; Schmitt, M.; Zhu, X.X. SEN12MS-CR-TS: A Remote-Sensing Data Set for Multi-modal Multi-temporal Cloud Removal. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–14. [Google Scholar] [CrossRef]
  41. Czerkawski, M.; Upadhyay, P.; Davison, C.; Werkmeister, A.; Cardona, J.; Atkinson, R.; Michie, C.; Andonovic, I.; Macdonald, M.; Tachtatzis, C. Deep Internal Learning for Inpainting of Cloud-Affected Regions in Satellite Imagery. Remote Sens. 2022, 14, 1342. [Google Scholar] [CrossRef]
Figure 1. Example images of our Clean Collector Algorithm (CCA). (a) shows QA60 band and pixel outliers, (b) shows missing data and night-time, and (c) shows samples by FID score.
Figure 2. Statistics of the corresponding image's FID values by the maximum pixel value in the image. (a) is below the α value and above F_th, and (b,c) are above the α value and below F_th, representing half ranges respectively.
Figure 3. Images of the MultiEarth 2023 dataset sample: (a) cloud images out of the 1000 selected, (b) cloud-labeled data, and (c) clean-label data without clouds.
Figure 4. Qualitative comparison of pix2pix (c,f), pix2pixHD (d,g), and SPADE (e,h) with and without the CCA (marked with an *) on the MultiEarth 2023, SEN12MS-CR-TS, and Scotland&India datasets. The top rows (1, 2) show the results on the MultiEarth 2023 dataset, the middle rows (3, 4) show the results on the SEN12MS-CR-TS dataset, and the bottom rows (5, 6) show the results on the Scotland&India dataset. Our proposed CCA shows better results in terms of visual quality.
Table 1. Performance comparison of different SAR-to-EO translation models on the MultiEarth 2023, SEN12MS-CR-TS, and Scotland&India datasets with and without the CCA. ↑ indicates higher values are better (SSIM and PSNR); ↓ indicates lower values are better (MAE and MSE). The best results are in bold.

Method          | MultiEarth 2023               | SEN12MS-CR-TS                 | Scotland&India
                | SSIM↑  PSNR↑   MAE↓   MSE↓    | SSIM↑  PSNR↑   MAE↓   MSE↓    | SSIM↑  PSNR↑   MAE↓   MSE↓
pix2pix         | 0.449  14.814  0.165  0.040   | 0.482  13.232  0.189  0.054   | 0.555  15.803  0.142  0.034
pix2pixHD       | 0.444  16.066  0.132  0.030   | 0.566  16.483  0.130  0.027   | 0.527  15.931  0.147  0.040
SPADE           | 0.407  15.668  0.148  0.034   | 0.462  14.661  0.169  0.049   | 0.522  14.592  0.168  0.037
pix2pix + CCA   | 0.593  19.696  0.086  0.013   | 0.539  14.179  0.165  0.044   | 0.700  18.950  0.105  0.022
pix2pixHD + CCA | 0.503  19.634  0.086  0.013   | 0.601  18.085  0.105  0.020   | 0.709  20.301  0.111  0.030
SPADE + CCA     | 0.493  17.590  0.116  0.024   | 0.495  15.390  0.152  0.039   | 0.535  15.041  0.156  0.034
Table 2. Classification results of CCA.

        Performance Metrics                         Confusion Matrix
      | Accuracy | Precision | Recall | F1-Score | TN     | FP | FN  | TP
Score | 0.9746   | 0.9513    | 0.7913 | 0.8639   | 19,910 | 92 | 474 | 1797
Table 3. Ablation study results comparing the performance of the pix2pixHD model on the MultiEarth 2023 dataset with the three stages of the CCA. The best results are in bold. Also, blue text indicates a performance improvement.

CCA                  | MultiEarth 2023
                     | SSIM↑          | PSNR↑           | MAE↓           | MSE↓
-                    | 0.444          | 16.066          | 0.132          | 0.030
QA60&Outlier         | 0.472          | 17.954          | 0.111          | 0.022
+ Missing&Night-time | 0.489          | 18.212          | 0.098          | 0.017
+ FID score          | 0.503 (+0.059) | 19.634 (+3.568) | 0.086 (−0.046) | 0.013 (−0.017)
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
