Article
Peer-Review Record

Intuitively Searching for the Rare Colors from Digital Artwork Collections by Text Description: A Case Demonstration of Japanese Ukiyo-e Print Retrieval

Future Internet 2022, 14(7), 212; https://doi.org/10.3390/fi14070212
by Kangying Li 1,*, Jiayun Wang 2, Biligsaikhan Batjargal 1 and Akira Maeda 3,*
Reviewer 1:
Reviewer 2:
Submission received: 9 June 2022 / Revised: 3 July 2022 / Accepted: 5 July 2022 / Published: 18 July 2022
(This article belongs to the Special Issue Big Data Analytics, Privacy and Visualization)

Round 1

Reviewer 1 Report

This paper describes the exploration of an Information Retrieval System (IRS) for the retrieval of Japanese Ukiyo-e print artwork. Existing IRSs use color pickers or similar approaches for retrieval. The authors of this paper propose an approach that allows query formulation based on words.

 

The research question and implementation are very interesting. Many thanks for this! The structure of the paper is generally good, and so is the chosen approach in general (though the datasets seem to be a bit small). A few questions would be interesting to address: Can this approach be generalized? What are the next steps?

 

 

Section 1: The introduction section reads very smoothly and well. I enjoyed reading it very much. However, the outline could be improved, as for me it contains a mixture of state-of-the-art science and technology and motivation. This could be divided up even better. The example with the three pictures is very interesting. I would call the comparison of the commercial search engines preliminary work and place it in the evaluation chapter. Very entertaining and pictorial examples in Section 1.2. I was pleased that the authors picked up on individual perception in search. Chapter one should be completed at the end with a description of the structure of the work.

 

Chapter 3 has no introduction; I think it should be titled "Architecture" rather than "Methodology", since no methodology is actually described. The section is quite fragmented, and the description of the technologies used is very superficial. Although the many pictures are very helpful, it would be worth considering taking out some pictures in favor of more text. The applied approach seems to be sound, but I miss a more detailed description.

The fourth chapter also feels very fragmented due to the many very small subsections.

 

Author Response

Thank you for allowing us to submit a revised draft of the manuscript “Intuitively Searching for the Rare Colors from Digital Artwork Collections by Text Description: A Case Demonstration of Japanese Ukiyo-e Print Retrieval”.

Below, we have attached our response outlining the explanations that will accompany the revised version of our paper.

We highly respect the time and effort that you committed to providing feedback on our document, and we are very thankful for your essential opinions, which guided the principal improvements our paper needed. We would like to convey our appreciation for the suggestions you offered, which have benefited us considerably.

Author Response File: Author Response.pdf

Reviewer 2 Report

 

The authors of the article “Intuitively Searching for the Rare Colors from Digital Artwork Collections by Text Description: A Case Demonstration of Japanese Ukiyo-e Print Retrieval” propose a framework to search digital artworks based on text queries and find the rare colors. Unlike previous research works, where we have seen the use of RGB or HSL values to find relevant images, this research work takes into consideration human senses and makes use of textual description for colors, i.e., it supports queries like “sea” to return images in blue color.

The article is very well structured. Section 1 introduces the problem, clearly explaining the various challenges associated with image search by color names or values. Section 2, though very short, presents some relevant related works. Section 3 presents the methodology, where the authors present the key components of their proposed framework. The experiments, the test data, and the associated results are described in Section 4. Section 5 concludes the article.

Compared to Section 1, I feel that Sections 2 and 3 need some improvement. We see the use of various language models like BERT in the experiments section (Section 4), but I do not find a detailed discussion of these models in the related works section (Section 2). Researchers are increasingly using language models for a wide variety of tasks, including identifying colors, as seen in Section 4 of the article.

Section 3 presents the components in a disproportionate manner. The authors describe some components in detail with examples, whereas others are described very briefly. For example, it is not very clear to me what role is played by HOG (Section 3.1.2) and triplet data sampling. Considering Figure 7, I do not completely understand the triplet data sampling: is it a component completely separate from the input image? I do not see any link between the input image and the anchor/positive/negative images. It is also unclear what role the Euclidean distance plays in calculating the distance between the anchor and the positive/negative images.

Section 3.3 presents a number of equations, but many of the terms are not very clear, for example, I_img and T_color in Equation 4. I would suggest the authors check Equations 4 and 5 and present in detail the significance of the various terms.

In Section 3.3.2, I see that the anchor image is the same as the input image, which is not the case in Figure 7. I think that this section needs some more explanation, since it is unclear what role this component plays in the framework.

In Section 4, it is not very clear why the authors chose the BERT model for comparison. Furthermore, the authors cite the high cost of computation as the reason for choosing only 1,000 images; however, there is no prior discussion of this computation or of the computational cost of the various components. I would suggest the authors add a brief description explaining the time required to obtain the required values for one input image.

Author Response

Thank you for allowing us to submit a revised draft of the manuscript “Intuitively Searching for the Rare Colors from Digital Artwork Collections by Text Description: A Case Demonstration of Japanese Ukiyo-e Print Retrieval”.

Attached is our response, which details the explanations that accompany the revised version of our paper.

We greatly value the time and effort that you set aside to provide feedback on our manuscript, and we appreciate your vital perspective and suggestions. We would like to convey our gratitude for your advice and assistance in amending our paper.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Firstly, I would like to thank the authors for taking my review comments into consideration. They have provided an updated version with the required changes. The main changes include an introduction to CLIP in Section 1; a brief introduction to HOG and its relevance to this work; a discussion of, and an introduction to, the experiments with BERT models; and properly defined terms in the different equations. They have also explained the datasets, especially their size, and improved the conclusion section by elaborating on possible future work.

 

Minor remarks:

Section 2.1: Suffur from monomodal design -> suffer from monomodal design

Section 4: insection 4.2 -> in Section 4.2
