Achieving the Full Vision of Earth Observation Data Cubes
Round 1
Reviewer 1 Report
Overall this is a well written paper that provides a clear outline of the needs and challenges of providing efficient access to large collections of Earth observation data through the development of EO data cubes and services built on top of those data cubes. There were some minor editorial fixes that can be made (highlighted in red in the attached manuscript). The primary substantive comments on the manuscript are as follows:
Lines 380-382 This statement should be checked as there is a theoretical capability for users to select bands from a multi-band source published using the OGC WMS service - according to the specification. This is theoretical as it is dependent upon a specific WMS server implementing optional SLD support as part of their WMS service package.
Results (Case studies) The case studies seem a bit brief. They could be improved with some reference to the performance of the services and/or a brief comparison with a similar result obtained (if possible) using a more "traditional" data integration and analysis workflow.
Lines 535-537 Information (links to repositories, web sites, etc.) for how to access the referenced scripts and other materials should be provided here.
Comments for author File: Comments.pdf
Author Response
Comment 1: Overall this is a well written paper that provides a clear outline of the needs and challenges of providing efficient access to large collections of Earth observation data through the development of EO data cubes and services built on top of those data cubes. There were some minor editorial fixes that can be made (highlighted in red in the attached manuscript). The primary substantive comments on the manuscript are as follows:
Reply 1: Thank you for finding those. The red-highlighted changes have been made, along with additional typo, grammar, and clarity fixes.
Comment 2: Lines 380-382 This statement should be checked as there is a theoretical capability for users to select bands from a multi-band source published using the OGC WMS service - according to the specification. This is theoretical as it is dependent upon a specific WMS server implementing optional SLD support as part of their WMS service package.
Reply 2: You are correct; I was not aware that WMS SLD supported band specification. The section has been updated accordingly.
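For readers of this exchange, the mechanism raised in Comment 2 can be sketched roughly as follows. This is an illustration only, not code from the manuscript: it builds a WMS 1.1.1 GetMap URL whose SLD_BODY parameter carries an SLD 1.0 RasterSymbolizer with a ChannelSelection element. The server URL, layer name, and band names are hypothetical placeholders, and, as the reviewer notes, a given WMS server may not implement the optional SLD support at all.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape


def band_selection_sld(layer, red, green, blue):
    """Return an SLD 1.0 document that maps three source bands to R, G, B."""
    def channel(tag, band):
        return (f"<{tag}><SourceChannelName>{escape(str(band))}"
                f"</SourceChannelName></{tag}>")

    return (
        '<StyledLayerDescriptor version="1.0.0" '
        'xmlns="http://www.opengis.net/sld">'
        f"<NamedLayer><Name>{escape(layer)}</Name>"
        "<UserStyle><FeatureTypeStyle><Rule><RasterSymbolizer>"
        "<ChannelSelection>"
        + channel("RedChannel", red)
        + channel("GreenChannel", green)
        + channel("BlueChannel", blue)
        + "</ChannelSelection>"
        "</RasterSymbolizer></Rule></FeatureTypeStyle></UserStyle>"
        "</NamedLayer></StyledLayerDescriptor>"
    )


def getmap_url(base_url, layer, bands, bbox, size=(512, 512)):
    """Build a WMS 1.1.1 GetMap URL that requests a custom band combination."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        # Assumption: servers differ in whether they expect LAYERS and STYLES
        # alongside SLD_BODY; both are sent here.
        "LAYERS": layer,
        "STYLES": "",
        "SRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": size[0],
        "HEIGHT": size[1],
        "FORMAT": "image/png",
        "SLD_BODY": band_selection_sld(layer, *bands),
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint and layer; "4", "3", "2" are placeholder band names.
    print(getmap_url("https://example.com/wms", "multiband_scene",
                     ("4", "3", "2"), (-10, 40, 0, 50)))
```

Whether the request returns the expected band combination depends entirely on the server honoring SLD_BODY and ChannelSelection, which is why the reviewer describes the capability as theoretical.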
Comment 3: Results (Case studies) The case studies seem a bit brief. They could be improved with some reference to the performance of the services and/or a brief comparison with a similar result obtained (if possible) using a more "traditional" data integration and analysis workflow.
Reply 3: Example 1 has approximately doubled in length and now refers more to the usage workflow and performance, compared with a traditional workflow of downloading imagery for local use.
I also added a new paragraph at the end of the examples and a graphic to help summarize the overall story of the 3 examples. The pattern of usage, and the comparison to how this would have been done a few years ago, is the same for all 3 examples.
The next idea for adding more detail would be to provide a detailed server system architecture design for one example. The authors decided early on to avoid such a diagram because it would require significant explanation for a reader not familiar with Amazon AWS system design. However, we could add this if you think it would prove beneficial.
If I missed the point of what you were asking for here, please let me know.
Comment 4: Lines 535-537 Information (links to repositories, web sites, etc.) for how to access the referenced scripts and other materials should be provided here.
Reply 4: Training materials reside on learn.arcgis.com. I provided a link to a subsection specific to the UN Sustainable Development Goals. The scripts developed for these exercises are not organized into a single repository at this time. We anticipate creating a central repository of Python raster types for integrating with a variety of data cubes, but that does not exist yet. Until then, readers can reach out to us directly.
Reviewer 2 Report
You present a very valuable (high level) overview of EO data cubes and related technologies. However, the paper would benefit from the following add-ons:
a more thorough description of the data cube concept in section 1, followed by some brief examples
some figures in section 2 (e.g. in 2.1.2) would improve the section's intelligibility
some detailed sample function chains or analysis processes in sections 2.3 and 2.4 would help to get a more "hands-on" understanding
a clearer distinction between the "abstract" concept of a data cube, its implementation and its actual deployment and provision would not only help to get a better understanding of the underlying concepts, but would also help to identify the "open" parts; it would be helpful to have a figure to contrast these perspectives
Author Response
Comment 1: a more thorough description of the data cube concept in section 1, followed by some brief examples
Reply 1: I added a few paragraphs and a new illustration to section 1 of the paper to explain the generic data cube concept and how it specifically manifests itself as an Earth observation data cube.
Comment 2: some figures in section 2 (e.g. in 2.1.2) would improve the section's intelligibility
Reply 2: A new graphic and supporting text have been added to 2.1.2 in an attempt to demystify the path/row/tile/projection/zone discussion. Doing justice to this topic could take a full article on its own. Roy et al. (https://doi.org/10.1080/2150704X.2016.1212419) do quite a good job of explaining this and are cited as such. I can envision creating a few more illustrations and expanding this to another page of text if you feel the current content plus citation is insufficient.
Comment 3: some detailed sample function chains or analysis processes in sections 2.3 and 2.4 would help to get a more "hands-on" understanding
Reply 3: Two code snippet graphics were added and text updated accordingly to illustrate examples of the analysis processes.
Comment 4: a clearer distinction between the "abstract" concept of a data cube, its implementation and its actual deployment and provision would not only help to get a better understanding of the underlying concepts, but would also help to identify the "open" parts; it would be helpful to have a figure to contrast these perspectives
Reply 4: Very valid point; I had missed that the generic data cube concept was never defined. This was added in section 1 as described in Reply 1 above. Additionally, Figure 3, which shows data ingest when creating a cube from standard input data files, and Figure 9, which illustrates the conceptual framework of the example projects publishing standard web services and APIs, may help address what you are asking.
Regarding “open”, this is a big topic that I tried to address throughout the paper, because there are open hooks in the workflow for people to modify or add what they want at almost every step. I included a few links as examples for context. Starting at the beginning, we use GDAL for data I/O, as described in relation to our format and compression tests in support of the new Table 1. Supported formats can be extended in the same way one adds a new format to GDAL. Raster types are written in Python. Analysis functions and geoprocessing tools can be authored and extended using Python and R libraries as well as C++ and other languages. (link to repo of custom raster functions) Web services can be published in common OGC formats as well as REST. Web applications are written in JavaScript. Customers use our drag-and-drop app builders to get started, but if they want to get fancy there is a free developer edition that allows them to write their own custom GUI, which they can add to the system and share openly if they wish. (link to community of people building and sharing custom web app widgets)