Article

GF-1 Satellite Imagery Data Service and Application Based on Open Data Cube

1 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
2 School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
3 College of Land Science and Technology, China Agricultural University, Beijing 100083, China
4 School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430072, China
5 Satellite Application Center for Ecology and Environment, Ministry of Ecology and Environment, Beijing 100094, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(15), 7816; https://doi.org/10.3390/app12157816
Submission received: 30 June 2022 / Revised: 21 July 2022 / Accepted: 2 August 2022 / Published: 3 August 2022

Abstract

With the application of big data in Earth observation, satellite imagery data are gradually becoming important means of observation for monitoring changes in vegetation, water bodies, and urbanization. Therefore, new satellite imagery data organization and management paradigms are urgently needed to fully mine the useful information from these data and provide new ways to better quantify and serve the sustainable development of resources and the environment. In this paper, a framework for processing and analyzing Chinese GF-1 satellite imagery data was developed using the latest technologies such as Open Data Cube (ODC) grids, Analysis Ready Data (ARD) generation, and space subdivision, which extended the data loading and processing capacities of the ODC grids for Chinese satellite imagery data. Using the proposed framework, we conducted a case study to investigate the spatial and temporal changes in vegetation and water mapping with GF-1 data collected from 2014 to 2021 covering the Miyun Reservoir, Beijing, China. The experimental results showed that the proposed framework had significantly improved temporal and spatial efficiency compared with the traditional scene-based data management approach, thus demonstrating the advantages and potential of the ODC grids as a new data management paradigm.

1. Introduction

Due to aggressive human intervention in natural resources and the environment, environmental issues are gradually becoming a global concern, and the sustainable development of resources and natural ecology has become a major challenge for humanity in the 21st century [1,2]. The advances in aerospace technology and Earth observation (EO) technology have greatly improved the spatial resolution, temporal resolution, and spectral resolution of satellite imagery data and significantly enhanced the EO capability and data support capability, which gradually propelled EO into the era of big data [3]. With features such as global coverage, repeatable access, and relatively consistent spatial data [4], EO data are becoming an important observation tool for monitoring and analysis of changes in vegetation, water bodies, and urbanization. Traditionally, most remote sensing data are organized and managed in the scene format, which separates the time and space aspects. As a result, spatial/temporal analysis and the application of remote sensing data become cumbersome [5], and the true value of many archived remote sensing data is not fully utilized. Therefore, new tools or techniques are urgently required to minimize the time, expertise, and tools needed to prepare, access, and analyze large-scale remote sensing data while mining useful information from these data to the maximum extent to support the sustainable development of resources and the environment [6].
In 2017, the Committee on Earth Observation Satellites (CEOS) launched the Open Data Cube (ODC) program [7]. As a generic term, ODC is often understood as a multidimensional array that simplifies data storage, data access, and data analysis to improve data organization, query, and analysis performance [8]. As a new data management approach applied to the organization and management of satellite imagery data, ODC shifts the paradigm from scene to pixel. ODC grids offer efficient storage for satellite imagery data with multi-temporal, multi-spatial, multi-spectral, and multi-attribute characteristics. ODC grids also account for the temporal and spatial correlation of satellite imagery data and avoid the separation of time and space inherent in scene-based management, making long-time-series applications and image data analysis simple and efficient. In addition, ODC facilitates programmatic access to satellite imagery data and enables large-scale analysis that would not be feasible if files had to be accessed one by one. Currently, exploratory ODC applications have been conducted in countries such as the United States, Australia, and Switzerland. Using ODC as the base framework, these applications improve the effectiveness of satellite imagery data by providing easy data access for data scientists, researchers, local authorities, and business owners, while meeting the requirements of specific scenarios by customizing the programs on demand [9].
Research teams around the world have conducted a series of studies for different application scenarios. Australian researchers constructed ODCs for Landsat data dating back to 1987 [10] and mapped surface water time series across the Australian continent to provide rapid and high-resolution detection of continental-scale surface water changes [11,12,13,14], environmental changes [15], mangrove forest expansion on the northern coast [16], and land coverage [17]. Researchers from Switzerland adopted the ODC to manage large amounts of Landsat and Sentinel imagery data in studies on vegetation [18], urbanization [19], water quality [20], and snow coverage [21], and they improved the understanding of the Swiss environment by monitoring its spatial and temporal changes [22]. Through the ODC initiative and its demonstration application in Africa, African research teams applied the ODC to Landsat and Sentinel imagery data management in key fields such as water coverage, land change, and urbanization consistent with the UN Sustainable Development Goals, which satisfied local and national decision-making needs [23,24,25,26,27,28]. By fusing Landsat optical image data and Sentinel-1 radar image data, Vietnamese researchers developed a prototype application for flood monitoring in the lower Mekong River basin based on the ODC [29]. In addition, research teams from Colombia, Mexico, and Brazil applied the ODC to long-time-series Landsat and Sentinel imagery data covering their countries for thematic studies of hydrology [30], soils [31], vegetation [32], and island area [30]. On the basis of ODC research, Chinese scholars have mostly adopted cloud storage strategies for long-time-series water body index and vegetation index analyses [33] or land coverage and land-use dynamic mapping [34,35]. However, localized data management requirements have seldom been considered, so the higher security requirements of satellite imagery data have not been satisfied.
In summary, ODC has seen initial applications worldwide, but its use can still be improved and expanded in certain areas. First, in terms of Analysis Ready Data (ARD) generation for ODC, international Landsat and Sentinel imagery data are usually processed in advance by the data distributors, while Chinese imagery data require processing by the users, which places an extra burden on them. Second, compared with other countries, ODC applications are less extensive in China, and open-source tools for loading and managing Chinese imagery data are especially needed, because complex data loading and application with the ODC framework often constitute high technical barriers that greatly limit its adoption. To remedy these limitations, we discuss the implementation of a GF-1 imagery data management framework based on ODC grids, which proved significantly superior to the traditional scene-based data management approach in terms of time efficiency and spatial efficiency. The original ODC cannot directly load GF-1 image data or support their application; the work in this paper is an improved and modified version of ODC that provides data loading and application support for Chinese GF-1 satellite image data. To obtain GF-1 imagery ARD, we conducted preprocessing operations such as radiometric calibration and geometric normalization for the GF-1 imagery data, which solved the problem that raw Chinese GF-1 imagery data could not be directly managed by ODC. Considering the basic development template of ODC and the characteristics of the GF-1 imagery data, we proposed a batch data loading method specific to the EO mode so that standardized GF-1 imagery data loading and management could be achieved in ODC grids. Whereas GF-1 image data have traditionally been stored and managed as scenes, ODC-based data management represents a new paradigm.
The Miyun Reservoir in Beijing, China, was selected to verify the usability and superiority of the proposed framework. Using GF-1 imagery data from 2014 to 2021, we conducted spatial and temporal analyses on the long-time-series imagery data of vegetation and water bodies in the Miyun Reservoir and the surrounding area. The GF-1 imagery data processing efficiencies of the ODC grid-based approach and the scene-based method were compared in terms of processing time and memory occupation.
The remaining sections are arranged as follows: Section 2 introduces the research methodology and outlines the ODC grid-based Chinese GF-1 spatial and temporal satellite imagery data management framework. A case study is designed and conducted in Section 3. Section 4 summarizes the research results and lists future work directions.

2. Methods

With the Open Data Cube [11] open-source geospatial data management and analysis software as a foundation, we propose a Chinese GF-1 spatial and temporal satellite imagery data management framework based on ODC grids. Radiometric calibration based on absolute radiometric calibration coefficients, atmospheric correction based on the FLAASH method [36,37], and geometric normalization based on the HighImgCorrect method [38] were conducted to preprocess Chinese GF-1 satellite imagery data and generate GF-1 imagery ARD. The retrieval, query, and analysis of GF-1 satellite imagery data at the pixel grid scale are then described. The strengths of the proposed framework are highlighted through a case study of spatial and temporal changes in vegetation and water mapping and a comparison with the scene-based management approach in terms of temporal and spatial efficiency. The proposed framework is shown in Figure 1.

2.1. Data Preprocessing Based on the FLAASH and RPC Methods

Satellite imagery data can be loaded and managed by ODC only after being preprocessed into the ARD format, so preprocessing steps such as radiometric correction and geometric correction are very important. Imagery data from different sources (e.g., GF-1, GF-2, GF-6, HJ1A, HJ1B, and JL-1) have different formats and varied data quality. For example, the GF-1 data are of L1A quality with only simple radiometric correction. Therefore, further processing such as high-precision radiometric correction and geometric correction is required to generate Analysis Ready Data with surface reflectance. In this paper, the absolute radiometric calibration coefficients were used for radiometric calibration of GF-1 satellite imagery data, and the FLAASH method was used for atmospheric correction. The HighImgCorrect method was adopted for geometric normalization. Eventually, the quality of GF-1 satellite imagery data products was improved to Level 4, which met the requirements for quantitative remote sensing analysis of water body changes, vegetation changes, urbanization, and coastline changes.

2.1.1. Radiometric Correction Based on the FLAASH Method

Radiometric correction includes radiometric calibration and atmospheric correction. Radiometric calibration applies absolute radiometric calibration coefficients to the multispectral and panchromatic remote sensing images to convert their DN (digital number) values into radiance, yielding radiance images. The radiance of the images is then converted to reflectance through atmospheric correction. Electromagnetic waves undergo absorption, reradiation, reflection, and spectrum redistribution during their transmission from the radiation source to the sensor, so the true surface reflectance must be retrieved by estimating and removing the interference caused by atmospheric effects [39]. The FLAASH atmospheric correction method was adopted in this paper, and Equation (1) [36] was applied to describe the radiance of the single-pixel spectra received by the sensor. Only the visible light waves were considered, and the thermal infrared band was neglected. After the atmospheric correction parameters were obtained, the spatially averaged reflectance was estimated, and the surface reflectance was calculated by applying Equation (1) to each pixel of each band.
$$L = \frac{A\rho}{1 - \rho_e S} + \frac{B\rho_e}{1 - \rho_e S} + L_a, \tag{1}$$
where L is the radiance of one pixel received by the sensor, ρ is the surface reflectance of that pixel, ρe is the average surface reflectance of the pixel and its surrounding area, S is the spherical albedo of the atmosphere, La is the radiance backscattered by the atmosphere, and A and B are coefficients determined by atmospheric and geometric conditions, independent of the ground surface.
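As a minimal sketch of how Equation (1) is used in practice, the snippet below inverts it for the pixel reflectance ρ once the neighborhood average ρe and the atmospheric terms are known. The linear gain/offset form of the calibration step is an assumption (the text only states that absolute radiometric calibration coefficients were used), and all numeric values are illustrative rather than GF-1 coefficients.

```python
def dn_to_radiance(dn, gain, offset=0.0):
    """Assumed linear calibration: radiance = gain * DN + offset."""
    return gain * dn + offset


def forward_radiance(rho, rho_e, A, B, S, L_a):
    """Equation (1): at-sensor radiance for pixel reflectance rho."""
    denom = 1.0 - rho_e * S
    return A * rho / denom + B * rho_e / denom + L_a


def invert_reflectance(L, rho_e, A, B, S, L_a):
    """Solve Equation (1) for rho, given the neighborhood average rho_e."""
    return ((L - L_a) * (1.0 - rho_e * S) - B * rho_e) / A


# Illustrative numbers only; they are not GF-1 calibration or atmospheric values.
A, B, S, L_a = 120.0, 15.0, 0.1, 8.0
rho_true, rho_e = 0.25, 0.20

L = forward_radiance(rho_true, rho_e, A, B, S, L_a)
rho_back = invert_reflectance(L, rho_e, A, B, S, L_a)
```

Running the forward model and then the inversion recovers the starting reflectance, which is a convenient sanity check on the reconstructed signs in Equation (1).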

2.1.2. Geometric Normalization Based on the HighImgCorrect Method

The HighImgCorrect method is a geometric correction program developed by our team for a variety of Chinese satellite imagery data (such as GF-1, GF-2, GF-6, and HJ). Compared with IDL and other conventional tools, the method used in this paper has the advantages of one-stop data operation, high efficiency, and quickness. Due to differences in imaging method, satellite platform, and imaging parameter precision, multisource remote sensing images are not fully consistent geospatially. Therefore, geometric normalization is required to achieve geospatial consistency of multisource remote sensing images. To improve the processing efficiency of GF-1 satellite imagery data, we adopted the HighImgCorrect geometric normalization method. First, a pyramid tile structure was constructed for the baseline images, and the baseline tile data were published. Then, the level and name of the tiles to be downloaded were calculated according to the resolution and geographic coordinates of the images to be geometrically normalized, and the corresponding data were downloaded from the baseline tile data to form the standard tileset. Next, the images to be geometrically normalized were geometrically corrected with the tiles in the standard tileset. After that, the generated images of the previous step were matched with the tiles in the standard tileset to obtain multiple control points, and the calculation results of these control points were used to evaluate accuracy. Lastly, the geometric correction was terminated immediately if the accuracy met the requirements. Otherwise, a new round was conducted until the accuracy requirements were met.
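The correct, match, and evaluate loop described above can be sketched as follows. This is not the HighImgCorrect implementation: the accuracy measure (root-mean-square error of control-point residuals), the `max_rounds` safety limit, and the `correct_and_match` callable are all assumptions introduced for illustration.

```python
import math


def control_point_rmse(residuals):
    """Root-mean-square error of (dx, dy) control-point residuals, in pixels."""
    if not residuals:
        raise ValueError("at least one control point is required")
    total = sum(dx * dx + dy * dy for dx, dy in residuals)
    return math.sqrt(total / len(residuals))


def normalize_until_accurate(correct_and_match, threshold, max_rounds=5):
    """Repeat correction and matching until the RMSE is within the threshold.

    correct_and_match(round_index) is a hypothetical callable that performs one
    round of geometric correction against the standard tileset and returns the
    residuals of the matched control points.
    """
    rmse = float("inf")
    for round_index in range(1, max_rounds + 1):
        rmse = control_point_rmse(correct_and_match(round_index))
        if rmse <= threshold:
            return round_index, rmse
    raise RuntimeError(f"accuracy not reached after {max_rounds} rounds (RMSE={rmse:.3f})")


def fake_round(round_index):
    """Toy stand-in for a real correction round: residuals halve every round."""
    scale = 2.0 / (2 ** round_index)
    return [(scale, 0.0), (0.0, scale), (-scale, 0.0)]


rounds_used, final_rmse = normalize_until_accurate(fake_round, threshold=0.5)
```

With the toy residuals, the first round gives an RMSE of 1.0 pixel and the second gives 0.5, so the loop stops after two rounds.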

2.2. Data Management Based on the Pixel-Level Grid

2.2.1. Database Connection and Initialization

Database connection and initialization are necessary to ensure that ODC is connected to the PostgreSQL [40] open-source database so that the metadata of the imagery data and the attribute information of the band data can be stored in it. To store the data, the proposed framework automatically generates a database schema containing five database tables (dataset, dataset_location, dataset_source, dataset_type, and metadata_type) during the initialization of the PostgreSQL database. The relationship between the five database tables is shown in Figure 2. As seen in Figure 2, the dataset table is the core that stores all the data records added to the database. Since each data record contains the relevant metadata information, data type information, data location information, and data source information after the data ingestion, this information is separately recorded in the other four tables.

2.2.2. Product Definition and GiST Index

ODC provides definition files and organization scripts for EO data such as Landsat and Sentinel, and configuration files and image metadata files in the YAML format (YAML Ain’t Markup Language, a highly readable data serialization format) [41] can be automatically generated from the image files. However, access methods and tools supporting GF-1 satellite imagery data are still lacking. To this end, we extracted spectral information, band information, and spatial reference information from GF-1 satellite imagery files and automatically generated metadata files in YAML format that conform to ODC standards. The specific processes included imagery data product definition, imagery data format conversion, and index construction, as shown in Figure 3. In the imagery data product definition phase, shown in Figure 3(I), a unified GF-1 imagery data product definition file type was formulated (e.g., GF-1Product.yaml). In the imagery data format conversion stage, as shown in Figure 3(II), the imagery data were evaluated to determine whether single-band separation was necessary, as shown in Figure 3(II)-(1), before converting their metadata files in formats such as XML (Extensible Markup Language) into the YAML format, as shown in Figure 3(II)-(2). In the index construction phase, shown in Figure 3(III), an index file (metadata.yaml) with a UUID (Universally Unique Identifier) was generated for each scene of imagery data, and its relevant information was stored in the database. The details are as follows:
(I) The imagery data product definition focused on defining and describing the type of prestored imagery data. Considering the characteristics of GF-1 imagery metadata, the defined information included product name, basic metadata information, spatial reference information, and band information, which were saved in YAML format.
(II) The imagery data format conversion mainly comprised single-band extraction and YAML-format metadata generation. Since ODC is based on single-band storage and the GF-1 imagery data are in a composite band format, a single-band separation operation is required to extract single-band data from the original images. To avoid NoData inconsistencies arising from metadata documents in XML and other formats, we used the Geospatial Data Abstraction Library (GDAL) tool to unify the NoData values of the original images to the parameters defined in the YAML documents.
(III) Index construction focused on storing the above YAML-format data products in the PostgreSQL database with the GiST indexing method. GiST (Generalized Search Tree) is a balanced, tree-structured access method that can be used as a base template to implement arbitrary indexing schemes such as B-trees and R-trees.
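To make the per-scene index file more concrete, the sketch below assembles a small metadata record with a UUID and writes it as YAML text. The field names, the product name `gf1_wfv_ard`, the scene name, and the file-naming pattern are hypothetical and do not reproduce the ODC metadata schema; the use of a name-based UUID (version 5) is likewise an assumption, since the text does not state how the UUID is generated. A hand-written YAML writer is used only to keep the example free of third-party dependencies.

```python
import uuid


def build_scene_metadata(scene_name, product_name, crs, bands):
    """Assemble a metadata record for one scene (field names are hypothetical)."""
    return {
        # uuid5 yields the same ID whenever the same scene name is indexed again.
        "id": str(uuid.uuid5(uuid.NAMESPACE_URL, scene_name)),
        "product": product_name,
        "crs": crs,
        # One entry per single-band file produced by the band separation step.
        "bands": {band: {"path": f"{scene_name}_{band}.tif"} for band in bands},
    }


def to_yaml(node, indent=0):
    """Tiny YAML writer for nested dicts of strings; not a general serializer."""
    lines = []
    pad = " " * indent
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.append(to_yaml(value, indent + 2))
        else:
            lines.append(f'{pad}{key}: "{value}"')
    return "\n".join(lines)


record = build_scene_metadata(
    "GF1_WFV1_example_scene",  # hypothetical scene name
    "gf1_wfv_ard",             # hypothetical product name
    "EPSG:4326",
    ["blue", "green", "red", "nir"],
)
metadata_yaml = to_yaml(record)
```

A deterministic identifier makes re-indexing the same scene idempotent; a random `uuid.uuid4()` would serve equally well if scenes are only ever indexed once.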

2.2.3. Data Access and Analysis

During data access and analysis, the imagery data were converted from scene-based management to pixel grids. The raw imagery data were stored as scenes in the PostgreSQL database. During data loading and analysis, the imagery data were written into memory for pixel-level operations. To better clarify the data storing process based on pixels, the proposed framework was designed with database table views for imagery data sharing the same type of metadata and separate views for each type of imagery data, such as the GF-1 imagery data. A view is a virtual table with contents defined by queries. Views can be used for data retrieval with custom queries and complete information integration from multiple tables, thus achieving a storage structure conversion from scene-based file storage to views in databases. According to the actual needs, certain fields can be retrieved and displayed to give pixel-level imagery data for subsequent application and analysis.
After the imagery data were written into memory, the proposed framework employed Xarray, an open-source Python package that improves the simplicity and efficiency of operations on labeled multidimensional arrays, to conduct pixel-level operations. Specifically, the in-memory multidimensional dataset, xarray.Dataset, and the labeled N-dimensional array, xarray.DataArray, serve as the data structures for pixel grid loading, computing, and storage. An xarray.Dataset is a dictionary-like container of labeled arrays with aligned dimensions and acts as the in-memory representation of the imagery data.

2.2.4. Data Ingestion

The swath width of imagery data from a given type of sensor is fixed. For example, GF-1 WFV sensor data have an imaging width greater than 800 km. When the study area is much smaller than the coverage of one imagery scene, cropping and mosaicking are required, which adds to the data processing burden for specific tasks. The proposed framework has a data ingestion process that subdivides the original scene data into data blocks of customizable size, thus enabling fast and flexible access to and operations on data for small study areas.
During data ingestion, the proposed framework created and executed a YAML document containing information about custom slice sizes, specific projection formats, and specific latitude and longitude range so that the user-imported imagery data could be sliced, resampled, and compressed to generate easily computable data slices in the NetCDF (Network Common Data Form) [42] format and store them in the local file system. As a computer- and application-independent self-describing file format, NetCDF enables efficient storage and management of matrix data in heterogeneous network environments and is widely used in research areas such as meteorological sciences and geophysics [43]. In the meantime, we created dynamic B-tree-based indices for data slices in the PostgreSQL database to facilitate data query and analysis. Due to the insecurity of network storage and transmission, especially for the high-resolution GF-1 imagery data, local file system storage was adopted for the proposed framework.
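The subdivision step can be illustrated with a short sketch that splits a longitude/latitude extent into fixed-size tiles. The grid anchored at (0, 0), the tile size in degrees, and the example extent are assumptions chosen for illustration; the framework's actual slice sizes and projection are set in its ingestion YAML document.

```python
import math


def subdivide_extent(min_lon, min_lat, max_lon, max_lat, tile_size_deg):
    """Yield (col, row, bounds) for tiles of tile_size_deg covering the extent.

    Tiles are aligned to a global grid anchored at (0, 0), so a given tile
    index always refers to the same ground area regardless of the scene.
    """
    first_col = math.floor(min_lon / tile_size_deg)
    last_col = math.ceil(max_lon / tile_size_deg)
    first_row = math.floor(min_lat / tile_size_deg)
    last_row = math.ceil(max_lat / tile_size_deg)
    for row in range(first_row, last_row):
        for col in range(first_col, last_col):
            bounds = (
                col * tile_size_deg,
                row * tile_size_deg,
                (col + 1) * tile_size_deg,
                (row + 1) * tile_size_deg,
            )
            yield col, row, bounds


# Illustrative extent for a reservoir-sized area (values are made up).
tiles = list(subdivide_extent(116.75, 40.25, 117.25, 40.75, tile_size_deg=0.25))
```

Because tile indices are tied to ground position rather than to a scene, slices from different acquisition dates over the same tile can be stacked along the time axis without cropping or mosaicking.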

2.3. Comparative Analysis

To evaluate the advantages and disadvantages of the ODC grid-based satellite imagery data management strategy over the traditional scene-based strategy, we proposed a comparative approach in terms of time efficiency and memory efficiency. For the 2014–2021 GF-1 image data that were loaded and managed in the ODC grids, we selected the Miyun Reservoir and its surroundings as the study area and conducted a comparative analysis of the monitoring of spatial changes in water bodies and vegetation of the Miyun Reservoir. Specifically, we first conducted the comparative analysis of time efficiency by calculating the time spent on GF-1 data loading and processing in the case study. Then, the comparative analysis of memory efficiency was performed by calculating the memory consumption during data loading and processing in the case study. Lastly, the spatial subdivision technique was used to study the spatial and temporal variation of surface monitoring indicators in the longitude and latitude dimensions of the Miyun Reservoir.
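One way to collect the two quantities compared here is sketched below with Python's standard `time` and `tracemalloc` modules. The measurement tooling actually used in the experiments is not stated, so this harness and its toy workload are assumptions; note that `tracemalloc` only tracks memory allocated through Python's allocator, not the full process footprint.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def toy_workload(n):
    """Stand-in for an index calculation: builds a list and sums it."""
    values = [i * 0.5 for i in range(n)]
    return sum(values)


result, seconds, peak_bytes = measure(toy_workload, 100_000)
```

The same wrapper could be applied to the loading and index-calculation functions of both strategies so that the two are timed and profiled under identical conditions.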

3. Experiment and Results

3.1. Experiment Design and Environment

The proposed framework was applied to the GF-1 imagery data of the Miyun Reservoir pilot area in Beijing spanning from 2014 to 2021 to verify its usability and advantages in terms of data storage and management and application analysis. The experiments were conducted on a PC with an Intel Core i7-8700 processor, 16 GB RAM, and a 13 TB hard disk running the Ubuntu 20.04 operating system, with PostgreSQL 12.4 installed to store the location information and auxiliary information of metadata and imagery data.

3.2. Study Area and Data

The Miyun Reservoir and its drainage basin in the northeast of the Miyun District, Beijing, were selected as the study area, as shown in Figure 4a. The Miyun Reservoir is the largest reservoir in Beijing, and its main functions include flood control, water supply, and water exchange for urban lakes. The upstream area of the reservoir has a continental monsoon climate with an average annual temperature of 6 °C to 10 °C and average annual precipitation of 500 mm to 700 mm. The annual precipitation distribution is very uneven, with precipitation in July and August accounting for 70% of the annual total, mostly in the form of heavy rainfall [44]. Because the reservoir provides 80% of Beijing’s urban water supply, its water quality is particularly important. The drainage basin of the reservoir is a remote and mountainous area with severe soil erosion. Therefore, monitoring the vegetation coverage information and water body coverage information with the GF-1 data could inform the development of effective measures to improve the ecological environment of the reservoir basin in the future [40]. Moreover, the Miyun Reservoir is the only surface drinking water source in Beijing, so its water demand and water quality directly affect people’s livelihoods and the sustainable development of the economy [45].
The GF-1 satellite data processed in this paper have been received since April 2013, a record of nearly 10 years. GF-1 WFV sensor data include four multispectral bands (red, green, blue, and NIR), with a spatial resolution of 16 m and a revisit period of 4 days. The wide-field-of-view multispectral data from the GF-1 16 m medium-resolution camera can be used to monitor NDVI, NDWI, and other indices. The GF-1 data covering the Miyun Reservoir study area (Figure 4a) are presented in Figure 4b; they are organized and stored as scenes. On the basis of the proposed framework and GF-1 imagery data covering the study area from June to October in each year from 2014 to 2021, the normalized difference vegetation index (NDVI) [46] and normalized difference water index (NDWI) [47,48] were calculated to investigate the spatial and temporal dynamics of vegetation and water bodies in and around the Miyun Reservoir.
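For reference, the two indices can be written as simple band ratios, as in the sketch below. NDVI uses the NIR and red bands; for NDWI, the green/NIR form is assumed here, since the text cites the index without giving its formula. The pixel reflectances are made-up values, not GF-1 measurements.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero (NoData)."""
    total = a + b
    return None if total == 0 else (a - b) / total


def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


def ndwi(green, nir):
    """NDWI, green/NIR form (assumed) = (Green - NIR) / (Green + NIR)."""
    return normalized_difference(green, nir)


# Illustrative surface reflectances for two pixels (values are made up).
vegetation_pixel = {"green": 0.08, "red": 0.05, "nir": 0.45}
water_pixel = {"green": 0.06, "red": 0.04, "nir": 0.02}

veg_ndvi = ndvi(vegetation_pixel["nir"], vegetation_pixel["red"])
water_ndwi = ndwi(water_pixel["green"], water_pixel["nir"])
```

Dense vegetation reflects strongly in the NIR band and so yields a high NDVI, while open water absorbs NIR and yields a positive NDWI; both indices fall in the range from −1 to 1.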

3.3. Experimental Results

3.3.1. Comparative Analysis of Time Efficiency for the Two Strategies

To evaluate the computational time efficiency of the two data organization strategies based on ODC grid and scene, we recorded the time consumption and time efficiency improvement of the two strategies for calculating NDVI and NDWI, as shown in Figure 5. The red bars and green bars in Figure 5a indicate the time consumption of the ODC grid- and scene-based approaches for calculating NDVI. The red bars and green bars in Figure 5b indicate the time consumption of the ODC grid- and scene-based approaches for calculating NDWI. The curves in Figure 5c,d show the time efficiency improvement of the ODC grid-based approach over the scene-based approach for calculating NDVI and NDWI, respectively. Taking the year 2015 as an example, it took 41.2 s to calculate NDVI with the ODC grid-based approach and 65.18 s with the scene-based approach; measured relative to the ODC grid-based time, (65.18 − 41.2)/41.2, the proposed framework improved the time efficiency by 58.2%. According to the trends of the bars in Figure 5a,b and the time efficiency improvement curves in Figure 5c,d, the time costs of calculating NDVI and NDWI followed consistent trends. Therefore, the ODC grid-based framework was more efficient than the scene-based approach in terms of NDVI and NDWI calculation in the study area.

3.3.2. Comparative Analysis of Memory Efficiency for the Two Strategies

To evaluate the memory efficiency of the two data organization strategies based on the ODC grid and scene [49], we recorded the memory consumption and memory efficiency of the two strategies for calculating NDVI and NDWI, as shown in Figure 6. The red bars and green bars in Figure 6a indicate the memory consumption of the ODC grid- and scene-based approaches for calculating NDVI. The red bars and green bars in Figure 6b indicate the memory consumption of the ODC grid- and scene-based approaches for calculating NDWI. The curves in Figure 6c,d show the memory efficiency improvement of the ODC grid-based approach over the scene-based approach for calculating NDVI and NDWI, respectively. As shown in Figure 6a,b, the ODC grid-based approach consumed less memory than the scene-based approach. Taking the year 2016 as an example, it took 531.9 MB of memory to calculate NDVI with the ODC grid-based approach and 843.2 MB with the scene-based approach; measured relative to the ODC grid-based figure, (843.2 − 531.9)/531.9, the proposed framework improved the memory efficiency by 58.5%. According to the curves in Figure 6c,d and their trends, the memory costs of calculating NDVI and NDWI followed consistent trends. Therefore, the ODC grid-based framework consumed less memory than the scene-based approach in terms of NDVI and NDWI calculation in the study area.
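The two quoted percentages can be reproduced if the improvement is expressed relative to the ODC grid-based figure. This definition is inferred from the reported numbers rather than stated explicitly, so the short check below should be read as an assumption about how the percentages were derived.

```python
def relative_improvement(scene_cost, odc_cost):
    """Percentage by which the scene-based cost exceeds the ODC grid-based cost."""
    return (scene_cost - odc_cost) / odc_cost * 100.0


# NDVI calculation time in 2015 (seconds) and memory use in 2016 (MB), as quoted.
time_gain_2015 = relative_improvement(65.18, 41.2)
memory_gain_2016 = relative_improvement(843.2, 531.9)
```

Rounded to one decimal place, these evaluate to the 58.2% and 58.5% figures given in the text. Normalizing by the scene-based cost instead would give smaller percentages, so the choice of baseline matters when comparing these results with other studies.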

3.3.3. Spatiotemporal Analysis of NDVI and NDWI

To verify the usability of GF-1 satellite image data managed by the ODC grid, two sample points in the study area were selected for analysis, and the annual change curves of their NDVI and NDWI time series are displayed in Figure 7. From the NDVI and NDWI change curves, the trends of vegetation and water bodies at the sampling points and their surroundings can be inferred. As can be seen in Figure 7, the vegetation and water bodies in the study area showed an overall decreasing trend, indicating a deteriorating overall ecological environment. In Figure 7a, the highest vegetation level at the sample point occurred on 12 September 2017, which was influenced by the establishment of the Xiongan New Area and the development of relevant protection policies. In Figure 7b, the lowest water volume value at the sample point occurred on 4 September 2018, owing to the reduction in reservoir water caused by unusual weather.
On the basis of the loaded GF-1 imagery data, we calculated the spatial distribution of NDVI and NDWI in and around the Miyun Reservoir, as shown in Figure 8. The area and color shades of vegetation and water bodies in Figure 8 reflect the changes in vegetation and water bodies in the Miyun Reservoir and the surrounding area. For example, the second image in Figure 8a from the top left and the second image in Figure 8b from the top left clearly show a significant decrease in the area of the Miyun Reservoir in 2015, indicating a relatively dry climate in that year. The eighth image in Figure 8a from the top left and the eighth image in Figure 8b from the top left show the largest area of the Miyun Reservoir in 2021, indicating relatively abundant precipitation in that year. In short, owing to the speed and processing improvements provided by ODC and the good user interaction of Jupyter Notebook, change monitoring and application analyses became simple and feasible, which also illustrates the need for developing this ODC framework.

4. Conclusions

The use of ODC grids for the storage and application of satellite imagery data is a new attempt at exploring satellite imagery data organization and management in the era of big data, and it provides new approaches to this field. However, the current remote sensing data infrastructure in China cannot meet the actual requirements of remote sensing applications in terms of capacity, scalability, usability, and performance. The integration, management, and on-demand service capacity of remote sensing big data infrastructure have become the bottleneck of remote sensing science and engineering in China [50]. Therefore, a Chinese GF-1 satellite imagery data management framework based on ODC grids was implemented, which extended the ODC grid with loading and processing support for Chinese GF-1 satellite imagery data. To verify the reliability of the proposed framework, we conducted water and vegetation monitoring experiments on the Miyun Reservoir in Beijing with multiyear GF-1 satellite imagery data. Compared with the traditional scene-based data management approach, the ODC grid-based framework significantly improved the temporal efficiency on average by 51.8% and spatial efficiency on average by 48.5% in our experiments, thus demonstrating the advantages and potential of the ODC grid as a new data management paradigm.
The proposed framework has several limitations. First, it focused only on the loading and application analysis of Chinese GF-1 satellite imagery data, and the management of other types of Chinese satellite imagery data was not considered. The framework therefore has considerable room for expansion in terms of remote sensing data types, and future work may focus on loading more data types (including HJ imagery data, GF series imagery data, and JL1 imagery data). Secondly, the proposed framework was subjected to only one case study, of the Miyun Reservoir and its surrounding area, and thus lacks data management analysis at a larger spatial scale. Future work may attempt to manage larger volumes of satellite imagery data (at regional, provincial, and national levels). Importantly, parallel computing and distributed technologies will be introduced to improve the efficiency of processing large volumes of satellite imagery data, drawing on the strategy of Google Earth Engine (GEE). Thirdly, the proposed framework addressed only the monitoring of vegetation and water body changes in the study area and thus lacks experimental analysis for other application scenarios. Future work may extend the framework to other applications to further validate its usability. Fourthly, research on ODC grids is still limited to a few countries and regions, with little cooperation among ODC research teams, which is a pressing problem worldwide. Future work may therefore consider increasing collaboration with other ODC research teams and sharing code and experience in organizing and managing Chinese imagery data with ODC grids on platforms such as GitHub and Slack. Lastly, in future work, we will optimize the existing data preprocessing algorithms and refine the dimensions of the comparative performance analysis to make the experimental results more rigorous and accurate.
Compared with existing geospatial cloud computing platforms, such as Sentinel Hub, Google Earth Engine (GEE), or Microsoft Azure, the main advantage of the grid framework in this paper is that the core code of ODC grids is open-source: any team or individual can download the code and customize it according to specific data management and application needs. The main disadvantage is that the data interoperability of ODC grids needs further improvement. The OGC (Open Geospatial Consortium) community is discussing the development of related protocols to standardize the data interoperability of ODC grids.

Author Contributions

Conceptualization, Q.C., G.L. and X.Y.; data curation, L.Z. and D.X.; formal analysis, T.J.; software, G.Y., H.Z. and X.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China, grant number 2019YFE0127000.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors thank Yue Ma for processing some of the experimental data.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Overpeck, J.T.; Meehl, G.A.; Bony, S.; Easterling, D.R. Climate Data Challenges in the 21st Century. Science 2011, 331, 700–702.
  2. Yao, X.; Li, G.; Xia, J.; Ben, J.; Cao, Q.; Zhao, L.; Ma, Y.; Zhang, L.; Zhu, D. Enabling the Big Earth Observation Data via Cloud Computing and DGGS: Opportunities and Challenges. Remote Sens. 2020, 12, 62.
  3. Li, G.; Pang, L. A new age of public-oriented Earth observation development (In Chinese). Scientia Sinica Inf. 2017, 47, 193–206.
  4. Anderson, K.; Ryan, B.; Sonntag, W.; Kavvada, A.; Friedl, L. Earth observation in service of the 2030 Agenda for Sustainable Development. Geo-Spat. Inf. Sci. 2017, 20, 77–96.
  5. Giuliani, G.; Camara, G.; Killough, B.; Minchin, S. Earth Observation Open Science: Enhancing Reproducible Science Using Data Cubes. Data 2019, 4, 147.
  6. Casu, F.; Manunta, M.; Agram, P.; Crippen, R. Big Remotely Sensed Data: Tools, applications and experiences. Remote Sens. Environ. 2017, 202, 1–2.
  7. Ross, J.; Killough, B.; Dhu, T.; Paget, M. Open Data Cube and the Committee on Earth Observation Satellites Data Cube Initiative; IAC: New York, NY, USA, 2017.
  8. Augustin, H.; Sudmanns, M.; Tiede, D.; Lang, S.; Baraldi, A. Semantic Earth Observation Data Cubes. Data 2019, 4, 102.
  9. Vâjâială-Tomici, C.-M.; Filip, I.-D.; Pop, F. Landscape Change Monitoring using Satellite Data and Open Data Cube Platform. In Proceedings of the 2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP), Cluj-Napoca, Romania, 3–5 September 2020; pp. 573–580.
  10. Lewis, A.; Oliver, S.; Lymburner, L.; Evans, B.; Wyborn, L.; Mueller, N.; Raevksi, G.; Hooke, J.; Woodcock, R.; Sixsmith, J.; et al. The Australian Geoscience Data Cube—Foundations and lessons learned. Remote Sens. Environ. 2017, 202, 276–292.
  11. Dhu, T.; Dunn, B.; Lewis, B.; Lymburner, L.; Mueller, N.; Telfer, E.; Lewis, A.; McIntyre, A.; Minchin, S.; Phillips, C. Digital earth Australia—Unlocking new value from earth observation data. Big Earth Data 2017, 1, 64–74.
  12. Krause, C.E.; Newey, V.; Alger, M.J.; Lymburner, L. Mapping and Monitoring the Multi-Decadal Dynamics of Australia’s Open Waterbodies Using Landsat. Remote Sens. 2021, 13, 1437.
  13. Malthus, T.J.; Lehmann, E.; Ho, X.; Botha, E.; Anstee, J. Implementation of a Satellite Based Inland Water Algal Bloom Alerting System Using Analysis Ready Data. Remote Sens. 2019, 11, 2954.
  14. Bishop-Taylor, R.; Nanson, R.; Sagar, S.; Lymburner, L. Mapping Australia’s dynamic coastline at mean sea level using three decades of Landsat imagery. Remote Sens. Environ. 2021, 267, 112734.
  15. Lewis, A.; Lymburner, L.; Purss, M.B.J.; Brooke, B.; Evans, B.; Ip, A.; Dekker, A.G.; Irons, J.R.; Minchin, S.; Mueller, N.; et al. Rapid, high-resolution detection of environmental change over continental scales from satellite data—the Earth Observation Data Cube. Int. J. Digit. Earth 2016, 9, 106–111.
  16. Brooke, B.; Lymburner, L.; Lewis, A. Coastal dynamics of Northern Australia–Insights from the Landsat Data Cube. Remote Sens. Appl. Soc. Environ. 2017, 8, 94–98.
  17. Lucas, R.; Mueller, N.; Siggins, A.; Owers, C.; Clewley, D.; Bunting, P.; Kooymans, C.; Tissott, B.; Lewis, B.; Lymburner, L. Land cover mapping using digital earth Australia. Data 2019, 4, 143.
  18. Honeck, E.; Castello, R.; Chatenoux, B.; Richard, J.-P.; Lehmann, A.; Giuliani, G. From a Vegetation Index to a Sustainable Development Goal Indicator: Forest Trend Monitoring Using Three Decades of Earth Observations across Switzerland. ISPRS Int. J. Geo-Inf. 2018, 7, 455.
  19. Giuliani, G.; Chatenoux, B.; Piller, T.; Moser, F.; Lacroix, P. Data Cube on Demand (DCoD): Generating an earth observation Data Cube anywhere in the world. Int. J. Appl. Earth Obs. Geoinf. 2020, 87, 102035.
  20. Chatenoux, B.; Richard, J.-P.; Small, D.; Roeoesli, C.; Wingate, V.; Poussin, C.; Rodila, D.; Peduzzi, P.; Steinmeier, C.; Ginzler, C. The Swiss data cube, analysis ready data archive using earth observations of Switzerland. Sci. Data 2021, 8, 1–11.
  21. Poussin, C.; Guigoz, Y.; Palazzi, E.; Terzago, S.; Chatenoux, B.; Giuliani, G. Snow Cover Evolution in the Gran Paradiso National Park, Italian Alps, Using the Earth Observation Data Cube. Data 2019, 4, 138.
  22. Giuliani, G.; Chatenoux, B.; De Bono, A.; Rodila, D.; Richard, J.-P.; Allenbach, K.; Dao, H.; Peduzzi, P. Building an Earth Observations Data Cube: Lessons learned from the Swiss Data Cube (SDC) on generating Analysis Ready Data (ARD). Big Earth Data 2017, 1, 100–117.
  23. Killough, B. The impact of analysis ready data in the Africa regional data cube. In Proceedings of the IGARSS 2019-2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 5646–5649.
  24. Yuan, F.; Repse, M.; Leith, A.; Rosenqvist, A.; Milcinski, G.; Moghaddam, N.F.; Dhar, T.; Burton, C.; Hall, L.; Jorand, C.; et al. An Operational Analysis Ready Radar Backscatter Dataset for the African Continent. Remote Sens. 2022, 14, 351.
  25. Yuan, F.; Lewis, A.; Leith, A.; Dhar, T.; Gavin, D. Analysis Ready Data for Africa. In Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium, 11–16 July 2021; pp. 1789–1791.
  26. Mfundisi, K.; Mubea, K.; Yuan, F.; Burton, C.; Boamah, E. Analysing effects of drought on inundation extent and vegetation cover dynamics in the Okavango delta. Earth Space Sci. Open Arch. 2021, 15, GC25B-0652.
  27. Halabisky, A.M.; Mubea, K.; Mar, F.; Yuan, F.; Burton, C.; Birchall, E.; Moghaddam, N.F.; Adimou, G.; Mamane, B.; Ongo, D.O.; et al. Water Observations from Space: Accurate maps of surface water through time for the continent of Africa. Earth Space Sci. Open Arch. 2022, 9.
  28. Burton, C.; Yuan, F.; Ee-Faye, C.; Halabisky, M.; Ongo, D.; Mar, F.; Addabor, V.; Mamane, B.; Adimou, S. Co-Production of a 10-m Cropland Extent Map for Continental Africa using Sentinel-2, Cloud Computing, and the Open-Data-Cube. Earth Space Sci. Open Arch. 2021, 10, GC451-0924.
  29. Quang, N.H.; Tuan, V.A.; Hao, N.T.P.; Hang, L.T.T.; Hung, N.M.; Anh, V.L.; Phuong, L.T.M.; Carrie, R. Synthetic aperture radar and optical remote sensing image fusion for flood monitoring in the Vietnam lower Mekong basin: A prototype application for the Vietnam Open Data Cube. Eur. J. Remote Sens. 2019, 52, 599–612.
  30. Luis, O.D.A.J.; Carlos, C.P.J.; Alfredo, S.M.H. Open Data Cube for Natural Resources Mapping in Mexico. In Proceedings of the 1st International Conference, Hangzhou, China, 20–21 April 2019; pp. 70–78.
  31. Ferreira, K.R.; Queiroz, G.R.; Vinhas, L.; Marujo, R.F.; Simoes, R.E.; Picoli, M.C.; Camara, G.; Cartaxo, R.; Gomes, V.C.; Santos, L.A. Earth observation data cubes for Brazil: Requirements, methodology and products. Remote Sens. 2020, 12, 4033.
  32. Bravo, G.; Castro, H.; Moreno, A.; Ariza-Porras, C.; Galindo, G.; Cabrera, E.; Valbuena, S.; Lozano-Rivera, P. Architecture for a Colombian data cube using satellite imagery for environmental applications. In Proceedings of the Colombian Conference on Computing, Cali, Colombia, 19–22 September 2017; pp. 227–241.
  33. Gao, F.; Yue, P.; Jiang, L.; Zhipeng, C.; Liang, Z.; Shangguan, B.; Hu, L.; Zhao, S. GeoCube: A multi-source EO cube towards large-scale analysis. J. Remote Sens. 2021, 26, 1051–1066.
  34. Liu, H.; Gong, P. 21st century daily seamless data cube reconstruction and seasonal to annual land cover and land use dynamics mapping–iMap (China) 1.0. Natl. Remote Sens. Bull. 2021, 25, 126–147.
  35. Liu, H.; Gong, P.; Wang, J.; Wang, X.; Ning, G.; Xu, B. Production of global daily seamless data cubes and quantification of global land cover change from 1985 to 2020—iMap World 1.0. Remote Sens. Environ. 2021, 258, 112364.
  36. San, A.B. Evaluation of different atmospheric correction algorithms for EO-1 Hyperion imagery. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2010, XXXVIII.
  37. Anderson, G.; Felde, G.; Hoke, M.; Ratkowski, A.; Cooley, T.; Chetwynd, J.; Gardner, J.; Adler-Golden, S.; Matthew, M.; Berk, A.; et al. MODTRAN4-Based Atmospheric Correction Algorithm: FLAASH (Fast Line-of-Sight Atmospheric Analysis of Spectral Hypercubes); SPIE: Bellingham, WA, USA, 2002; Volume 4725.
  38. Zhao, Y.; Shan, X.; Tang, P. Spatial Consistency Analysis and Relative Geometric Correction of Low Spatial Resolution Multi-source Remote Sensing Data. Remote Sens. Technol. Appl. 2014, 29, 155–163.
  39. Zhao, Y. Principles and Methods of Remote Sensing Application Analysis (Chinese Edition); Science Press: Beijing, China, 2003; pp. 413–416.
  40. PostgreSQL: The World’s Most Advanced Open Source Database. Available online: https://www.postgresql.org/ (accessed on 1 March 2022).
  41. The Official YAML Web Site. Available online: https://yaml.org/ (accessed on 1 March 2022).
  42. Unidata|NetCDF. Available online: https://www.unidata.ucar.edu/software/netcdf/ (accessed on 1 March 2022).
  43. Rew, R.; Davis, G. NetCDF: An interface for scientific data access. IEEE Comput. Graph. Appl. 1990, 10, 76–82.
  44. Li, M.; Wu, B.; Yan, C.; Zhou, W. Estimation of Vegetation Fraction in the Upper Basin of Miyun Reservoir by Remote Sensing. Resour. Sci. 2004, 26, 153–159.
  45. Zhang, B. Analysis on the change trend of water quality of Miyun reservoir introduced from the middle route of South-to-North water transfer Project. Beijing Water 2018, z2, 27–32.
  46. Crippen, R.E. Calculating the vegetation index faster. Remote Sens. Environ. 1990, 34, 71–73.
  47. Gao, B.-C. Normalized difference water index for remote sensing of vegetation liquid water from space. In Proceedings of the Imaging Spectrometry, Orlando, FL, USA, 12 June 1995; pp. 225–236.
  48. McFeeters, S.K. The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features. Int. J. Remote Sens. 1996, 17, 1425–1432.
  49. GitHub—Pixelb/ps_mem: A utility to Accurately Report the in Core Memory Usage for a Program. Available online: https://github.com/pixelb/ps_mem (accessed on 1 March 2022).
  50. Guoqing, L.; Zhenchun, H. Data infrastructure for remote sensing big data: Integration, management and on-demand service. J. Comput. Res. Dev. 2017, 54, 267.
Figure 1. The Chinese GF-1 spatial and temporal satellite imagery data management framework.
Figure 2. The UML diagram of the five-table ODC structure.
Figure 3. The process from source data to indexed data.
Figure 4. (a) Large volume of GF-1 WFV imagery from April 2013 to April 2022 covering Beijing; (b) the selected study area, the Miyun Reservoir, located in Beijing; (c) GF-1 WFV imagery covering the Miyun Reservoir (date: 22 October 2021).
Figure 5. Processing time and time efficiency improvement of the ODC grid-based and scene-based approaches. (a) NDVI calculation time (s) for the ODC grid-based approach (red bars) and the scene-based approach (green bars). (b) NDWI calculation time (s) for the ODC grid-based approach (red bars) and the scene-based approach (green bars). (c,d) Time efficiency improvement (%) of the ODC grid-based approach over the scene-based approach for calculating NDVI and NDWI, respectively.
Figure 6. Memory consumption and memory efficiency improvement of the ODC grid-based and scene-based approaches. (a) NDVI calculation memory consumption (MB) for the ODC grid-based approach (red bars) and the scene-based approach (green bars). (b) NDWI calculation memory consumption (MB) for the ODC grid-based approach (red bars) and the scene-based approach (green bars). (c,d) Memory efficiency improvement (%) of the ODC grid-based approach over the scene-based approach for calculating NDVI and NDWI, respectively.
Figure 7. NDVI and NDWI time series at sampling sites on the Miyun Reservoir. (a) Annual change curves of the NDVI time series for the selected sample point (116.987° E, 40.534° N) from 2014 to 2021. (b) Annual change curves of the NDWI time series for the selected sample point (116.945° E, 40.534° N) from 2014 to 2021.
Figure 8. Spatial distribution of NDVI and NDWI in and around the Miyun Reservoir from 2014 to 2021.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Cao, Q.; Li, G.; Yao, X.; Jia, T.; Yu, G.; Zhang, L.; Xu, D.; Zhang, H.; Shan, X. GF-1 Satellite Imagery Data Service and Application Based on Open Data Cube. Appl. Sci. 2022, 12, 7816. https://doi.org/10.3390/app12157816


