Fast and Effective Retrieval for Large Multimedia Collections
Round 1
Reviewer 1 Report
The authors tackle the interesting problem of indexing and retrieval of multimedia data. They propose a graph-based technique that utilizes text, audio, image, and video features. The authors need to address the following concerns:
- The paper begins with the following sentence: “Indexing and Retrieval of Multimedia is generally implemented by employing feature graphs.” However, the graph-based approach is only one of the major approaches to this problem. For instance, there are advanced techniques that use similarity search to solve it [1]. The authors should consider such approaches in their related work and experiments. Moreover, they should emphasize the advantages their method brings compared to the state-of-the-art methods.
- The authors claim there is no unifying framework to fuse and integrate text, audio, image, and video. However, there are several interesting works on multimodal learning [2] that can be used with similarity search techniques [1] to tackle this problem.
- The authors need to use a benchmark to evaluate the effectiveness of their proposed model.
- The authors need to address typos. For example, a reference number on page 17 is missing.
References:
[1] Johnson, Jeff, Matthijs Douze, and Hervé Jégou. "Billion-scale similarity search with GPUs." IEEE Transactions on Big Data (2019).
[2] Alayrac, Jean-Baptiste, et al. "Self-supervised multimodal versatile networks." arXiv preprint arXiv:2006.16228 (2020).
Author Response
Dear Reviewer, first of all, thank you for your support and feedback on our paper. I highly appreciate the two additional sources you suggested on our topic. As you recommended, I added them, integrated them into the state of the art section, and described the differences to our approach. I also addressed the topic of benchmarks in more detail in the evaluation section of the paper and hope you will find the changes appropriate. A spell-check and language correction have been performed as well.
Thank you very much!
Stefan.
Reviewer 2 Report
In this paper, the authors present a graph/matrix-theoretic model for multimedia collection retrieval, similarity, and queries, via so-called graph codes. The authors present details of their mathematical model, and give experimental results.
I cannot comment on the significance of the work; however, I am familiar with the graph-theoretic mathematics being invoked in the paper. The paper presents no formal results proving the efficacy and efficiency of the work, so I cannot comment on its true (mathematical) effectiveness, algorithmically speaking. The authors do devote considerable space to explaining their ideas, which may be of value to an audience more at home in this community.
Below I give some main comments:
-Commonly there is an issue with the casing of certain words. Very often terms that do not refer to a whole area, a proper noun, or something similar are capitalized when they should in fact be written in lowercase. For example, in some contexts "multimedia" should not be capitalized, while in others (as in an area) it may be capitalized. Examples include:
----Page 2: "Images, Videos, Text, Audio"
----Page 3: "Metadata and Annotations"
----Words commonly capitalized when they should be lowercase (check each instance to see whether this applies): Images, Videos, Text, Audio, Multimedia, Social Media, Documents, Multimedia Collections, Multimedia Big Data, Machine Learning, Retrieval Algorithm, Basic Boolean Algorithm, Machine learning (notice casing), Neural Network, Multimedia Assets, Adjacency Matrix, Valuation Matrix/Matrices, Multimedia Features.
-Page 3, Section 2.2, paragraph 1: If the graph is a directed graph, maybe define the MMFG as a directed graph right away in the paragraph? It makes things less confusing to the reader, as it begins less specific, talking about "graph structures". Perhaps I am mistaken, but it would be important to be as specific as possible so the reader understands the scope of the work.
-The next set of comments concerns Reference 24 and a few places where terms are not defined or summarized well enough for me to understand them properly, despite my being a graph algorithms expert:
----Reference 24 is to a ScienceDirect page which is a collection of definitions of adjacency matrices from various sources. Please cite the specific source you wish to cite; each section of the page names the author and paper the write-up is taken from. Replace any usage of Reference 24 with this more specific reference, where applicable.
----Bottom of Page 3, top of Page 4: Is there any special meaning to "valuation matrices", beyond encoding the weights of the arcs/edges in the adjacency matrix? If not, I suggest, for the sake of ease for the reader, to just call it an adjacency matrix on a weighted digraph, unless you want to do something more specific (if so, define it more exactly); your audience will likely understand what this means. I have not heard of the term 'valuation matrix' used in this context, nor is it used in Reference 24. The paper cites reference 24, which does not define a 'valuation matrix'. Would there be any benefit in defining more carefully what an adjacency matrix and 'valuation matrix' is, in exact terms? They seem critical in later parts of the paper.
----Page 4, paragraph 1. What is the "Eigenvalue Method"? Your reference (Reference 24) never states what the "Eigenvalue Method" is. The meta-article given does describe things about eigenvalues with respect to these classes of matrices, but not an "Eigenvalue Method".
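For illustration only, the terminology suggested in the comments above (an adjacency matrix on a weighted digraph) can be sketched as follows. This is a minimal sketch, not the authors' implementation: the node names, the arc weights, and the helper `weighted_adjacency` are all hypothetical.

```python
def weighted_adjacency(nodes, weighted_arcs):
    """Build the weighted adjacency matrix of a directed graph.

    Entry [i][j] holds the weight of the arc from nodes[i] to nodes[j],
    or 0 if there is no such arc.
    """
    index = {name: i for i, name in enumerate(nodes)}
    # One row and one column per node, so the matrix is always square.
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target, weight in weighted_arcs:
        matrix[index[source]][index[target]] = weight
    return matrix


# Hypothetical example data (not taken from the paper).
nodes = ["image", "person", "hat"]
arcs = [("image", "person", 2), ("person", "hat", 3)]
matrix = weighted_adjacency(nodes, arcs)
```

Because the graph is directed, the resulting matrix is square but not necessarily symmetric: the entry at row "image", column "person" is set, while the entry at row "person", column "image" stays 0.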
I did not carefully examine the example in Section 3. I did not review Section 5.
The paper should be run through a spell checker, in case I missed other typos. I also suggest that the references be checked carefully; I comment on one of them (Reference 24) above.
Specific Comments
=================
Title
-----
-Should the title of the paper read: "Fast and Effective Retrieval for Large Multimedia Collections", not "Fast and Effective Retrieval for large Multimedia Collections"?
Abstract
--------
-Line 7: The word "mandatory" may be inappropriate, since somebody CAN in fact do this without it (if not, a mathematical proof should be presented or cited). Maybe use a word like "ideal" or "desirable".
-Line 11: The word "prove" is not precise and can be misleading, use a more careful phrase instead, such as "demonstrate experimentally".
-Line 16: "prove a significant increase in efficiency and effectiveness" -> With respect to what? Add that here.
Section 1
---------
-Page 1, last line: What is the definition or expectation when describing "acceptable"? I suggest putting in clear terms what "acceptable" is or is not prior to using a subjective term like this. I would just recommend removing the phrase altogether and stitching together the last part of the sentence so that it says exactly what is intended. The point of the sentence is to say that it takes minutes to compute, and that the proposed paper seeks to remedy this.
-Page 2, paragraph 1, line 10: Write "algorithmic" instead of "algorithmical".
Section 2
---------
-Page 2, Section 2.1, paragraph 2: What is the "Basic Boolean Algorithm"? Cite the paper that introduces this algorithm.
-Page 3, paragraph 1: Writing "has increased the effectiveness of retrieval a lot" does not add much precision or usefulness to the statement. A more precise statement about how much the effectiveness increases would be more helpful for the reader (e.g. numeric figures, formal claims, some guarantee of how much better it is); "a lot" is not a scientific term as it is used here, and I do not know what it means within this context. If the improvement was experimentally observed, then state this, and instead of “a lot”, write in more precise language what improved.
-Page 3, paragraph 2: The sentence where "Shot Detection" is first discussed begins with a reference, which is not common; I suggest rephrasing it. "Shot detection" is never discussed in the paper up until this point, so a sentence explaining what it is should perhaps come first. That would help readers determine whether the reference is relevant for them.
-Page 3, 2nd last paragraph: Check spacing around "and / or", there should be no space between "and" and "or" with the "/".
-Page 3, last paragraph: "...all of these features is not existing" -> Do you want to say instead that presently it does not exist? If so, rephrase this.
-Page 4, Figure 1: Should the reference at the end of the caption be after the period, or before? The journal requests citations come before the punctuation.
-Page 4, Section 2.3, middle of paragraph: Is there a particular author of the "Eigenvalue Method" described? That is, can you write "the eigenvalue method of ____ [CITATION]"?
-Page 4, Section 2.3, middle of paragraph on page: Do you mean O((n+e)^2)? Unless something else is meant, do not put the exponent outside the Big-Oh notation.
-Page 4, Section 2.3, last line: O(n+e) + O(1) is O(n+e), I do not understand why O(1) is present. If there is a mistake in how it is written, just write O(n+e).
-Page 4, Section 2.4, line 3: I suggest not using the phrase "any graph"; be as specific as possible about the kinds of graphs you mean. Some may interpret this statement to include all sorts of classes or types of graphs, e.g. multigraphs. If you mean simple graphs, say "any simple graph", for instance.
-Page 4, 2nd last line: Insert a comma, as follows, "Valuation matrices contain one row and column for each node, always resulting in square matrices."
-Page 4, last line: Instead of saying "between nodes", say instead "incident on nodes".
-Page 5, line 1: Instead of position, say row and column maybe? Perhaps say "... at row n_1 and column n_2, i.e. position/coordinate/edge (n_1,n_2)."
-Page 5, "Edge types are coloured according to their edge types" -> This sentence does not help the reader. Do you mean instead that "Edges are coloured according to their edge types"? If not, correct it.
-Page 6, line 4: "prove" -> "demonstrate"
-Page 6, Section 2.5, line 2: "prove" -> "demonstrate"
-Page 6, bullet listing: Consider the casing of items listed here. Should anything here be capitalized?
-Page 6, 2nd last paragraph: "Thus, high-resolution dataset..." -> "Thus, a high-resolution dataset..."
-Page 6, last paragraph: I suggest deleting this whole paragraph, it introduces no new information.
-Page 7, line 1: "... a sufficient set of in our view appropriate algorithms..." -> "... a sufficient set of, in our view, appropriate algorithms..."
-Page 7, 3rd last line of section: "one major open challenge" -> Is the word "major" necessary? Why is this a "major" one, as opposed to just an open challenge? Perhaps more discussion can be had here, or reference another part of the paper where this is properly discussed?
-Page 7, last line of section: "the solution" -> Can you be more specific than this here? "The solution" may mean a couple of things.
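The two Big-Oh comments on Section 2.3 above rest on standard asymptotic identities, spelled out here as a short worked derivation. As in the paper, n and e denote the number of nodes and edges; the function f and the constants c_1, c_2 are introduced only for this illustration.

```latex
% Assumptions for this sketch: n + e >= 1, and c_1, c_2 > 0 are some constants.
% (1) A constant term is absorbed by a linear term:
\[
  f(n,e) \le c_1 (n+e) + c_2 \le (c_1 + c_2)(n+e)
  \quad\Longrightarrow\quad
  O(n+e) + O(1) = O(n+e).
\]
% (2) A quadratic bound is written with the exponent inside the notation:
\[
  f(n,e) \le c_1 (n+e)^2
  \quad\Longrightarrow\quad
  f(n,e) \in O\bigl((n+e)^2\bigr).
\]
```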
Section 3
---------
-Page 7, paragraph 1: "For graphs like the MMFG, a metric for similarity would be e.g. the Cosine Similarity" -> "For graphs like the MMFG, a metric for similarity would be, for example, the cosine similarity metric".
-Page 10, paragraph 2: "better to apply aa much smaller dictionary" -> "better to apply a much smaller dictionary"
-Page 11, 2nd-last line of page: "ration" -> "ratio".
-Page 12, line 1: Does |v| get used in the formulae below it? Please check this carefully. If not, I suggest removing this sentence.
-Page 15: "Thus, the comparison of two Graph Codes can be done in O(1)". As it reads, this is unclear, if true. If it is true, I suggest explaining in 2-3 sentences how the graph code is calculated in O(1) time under the given conditions. I strongly suggest explaining this further; the sentence preceding it does not justify the claim on its own. If it is seen as "obvious", just 'spell it out' for the reader with a concise explanation. If this claim cannot be justified in this manner, remove it. It is not necessary for the claims being made in the paper, as no claim of formal time complexity is made in the paper; it is all experimental.
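For readers unfamiliar with the cosine similarity metric mentioned in the first comment on this section, a minimal sketch follows. It operates on plain numeric vectors; how the paper's graphs or Graph Codes would be turned into such vectors is not covered here, and the function name and example vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical feature vectors (not taken from the paper).
a = [1.0, 2.0, 0.0]
b = [2.0, 4.0, 0.0]  # same direction as a
c = [0.0, 0.0, 5.0]  # orthogonal to a
```

Vectors pointing in the same direction score (up to rounding) 1, and orthogonal vectors score 0, regardless of their lengths.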
Section 4
---------
-Page 17, Section 4.1, line 2: "java implementation" -> "Java implementation"
-Page 17, Section 4.1, paragraph 2, line 2: Citation is missing [?].
-Page 18, Figures 7 and 8: Would it look nicer to typeset the code snippets?
-Page 18, line 2: "Pseudo-Code" -> pseudocode
-Page 19, Figure 10: "Highresolution image" -> "High-resolution image".
-Page 20, Table 11: Earlier in the paper "n" and "e" were used to define number of nodes and number of edges, respectively. Would it make sense to include the "#" in front of each? Or is "n" used in a different context elsewhere in the paper?
Section 6
---------
-Remove the first sentence, it does not seem to add anything meaningful to the section. Regardless, your paper cites another paper to properly define many aspects of the work, mathematically speaking at least.
-Line 3: Maybe insert the phrase "for Graph Codes", between "to graph-traversal operations" and ", by"?
-Line 4: As stated earlier in my review, I recommend using more precise language, the paper doesn't "prove" things mathematically (important, given that this involves a mathematical model) about effectiveness (if so, it would have theorems formally stated, with written proofs in the paper). Use phrases like "demonstrated" or "demonstrated experimentally" instead.
-Paragraph 2, line 1: "focussing" -> "focusing"
-Paragraph 2, sentence 2: I suggest rephrasing this sentence, it is a bit strangely phrased, at least in my opinion.
-Paragraph 2, last sentence: "Highly efficient" is not precise in how it is being used here. How is it "highly efficient"? Imagine if somebody were just reading the conclusion and wanted to know what your work demonstrates, what did you demonstrate? This work is mostly experimental, so the reader will want to know, in brief terms, what was demonstrated numerically. What is the statistic you want the reader to walk away from the paper with? I suggest, if you have a mathematical statement backing this statement, that you write it. If it is a strictly experimental claim, state some numeric statistics from the previous section to back this claim. It will read far better.
Author Response
Dear Reviewer,
first of all, thank you very much for your time and effort in providing feedback on our paper. I appreciate it very much, especially the detailed support in terms of phrasing and optimizing the language; that was a great help. We have, of course, now performed an additional spell and language check and integrated all of your comments and remarks in that regard.
In addition to that, I would like to comment on some of your other remarks, as well:
- the topic of adjacency matrices, the eigenvalue method, etc. has been reworked, and sources for the various types of mathematical elements have been added.
- your suggestions, like "prove" vs "demonstrate", have been reviewed and reworked
- a further definition and source of "acceptable" has been provided
- O-notation has been reworked
- the source-code snippets have been replaced by inline typesets
- some more smaller corrections
I hope this reworked version meets with your approval.
Thanks again for your effort, best regards
Stefan.
Round 2
Reviewer 1 Report
The authors addressed some of the major concerns. However, there remains a major concern regarding the effectiveness of their approach compared with state-of-the-art deep learning retrieval techniques. If the title does not include 'effective', and in the rest of the paper the authors state that the main benefit of their framework is being fast, this major concern is addressed!
Author Response
Dear Reviewer, thank you for your comments. With regard to "effectiveness", I want to point out that we provided various Precision & Recall experiments in chapter 5, conducted on two different datasets. Our results demonstrate that the effectiveness of the concept is up to 15% higher than in other experiments on the same dataset (see e.g. the middle of page 25). Therefore, I do not fully agree (or understand) that we should remove the term "effective" from the document. I will upload a revised version; maybe you can have an additional look at the mentioned sections. I really hope this convinces you in terms of effectiveness. Thank you very much in advance. Stefan.
Reviewer 2 Report
The authors appear to have made appropriate changes based on most of my comments. I cannot comment on the changes made in the Sections I did not previously review.
Below I provide additional comments:
-Page 4. I feel my comment about the "Eigenvalue Method" has not been properly addressed. I still do not know what the "Eigenvalue Method" is. The reference was indeed changed, but the article from Wolfram MathWorld does not state what the "Eigenvalue Method" is. I suggest, if this is indeed a reference for this concept, that better, more exact terminology be used. There are at least a couple of different algorithms (methods) mentioned by the source; is it one of the algorithms mentioned there? If so, state exactly what is being computed. Is this about computing the eigenvalues and eigenvectors of a matrix? If so, state this instead. Otherwise, provide a source that uses this exact phrase and states who came up with the "Eigenvalue Method".
-Page 26, Section 6, paragraph 2, sentence 2: "As we have been focusing on images in this paper, further research in regards of the application of... have to be made". Can this sentence be rephrased? Some usage of "of" might be replaced with "to", and stating research as an action will be more helpful, i.e., "have to be made" -> "remains to be conducted".
-Reference 31: I noticed it reads as "Mark Needham (2019)... Publisher: nullISBN:...", is the "null" something that should be there?
Author Response
Dear Reviewer,
thank you for your further comments. I rephrased the section on the Eigenvalue method and gave a more detailed description of what it does, what it is used for, and why it is important for the paper.
Furthermore, I changed the sentence in section 6, paragraph 2 to make clearer what we mean by it. I fully understand your remark here (thank you).
Finally, those "nullISBN" typos have been corrected; sorry for that mistake.
I hope you can now give your approval to the paper. In case of any further comments, do not hesitate to contact me. Thanks a lot for your effort!
Stefan.