Squeezing Backbone Feature Distributions to the Max for Efficient Few-Shot Learning
Round 1
Reviewer 1 Report
The paper is a minor follow-up to previously published work https://arxiv.org/pdf/2006.03806.pdf (Hu et al. 2021), which shares many ideas with the current paper, including content such as figures, tables, the design of experiments, etc.
Here, the authors propose an approach to few-shot image classification that combines feature preprocessing using PEM with an optimal-transport framework based on the Sinkhorn algorithm. Similar ideas were already described in Hu et al. 2021. The novelty of the current paper lies in the fact that the proposed approach does not require any explicit prior on the distribution of data between classes, together with other minor tweaks.
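For context, the Sinkhorn algorithm mentioned above refers to the Sinkhorn-Knopp iteration for entropy-regularized optimal transport. A minimal sketch in Python follows; the function name, parameters, and the use of fixed marginals are illustrative assumptions and are not taken from the manuscript (which, as noted, aims precisely to avoid an explicit class prior on the column marginals).

import numpy as np

def sinkhorn(cost, row_marginals, col_marginals, eps=0.1, n_iters=50):
    # Entropy-regularized optimal transport via Sinkhorn-Knopp scaling.
    # cost:          (n, m) cost matrix, e.g. distances between query features
    #                and class centroids
    # row_marginals: (n,) target row sums of the transport plan
    # col_marginals: (m,) target column sums of the transport plan
    # eps:           entropic regularization strength
    K = np.exp(-cost / eps)              # Gibbs kernel
    u = np.ones_like(row_marginals)
    v = np.ones_like(col_marginals)
    for _ in range(n_iters):
        u = row_marginals / (K @ v)      # rescale rows toward their marginals
        v = col_marginals / (K.T @ u)    # rescale columns toward their marginals
    return u[:, None] * K * v[None, :]   # transport plan P = diag(u) K diag(v)

In a transductive few-shot setting, the rows of the resulting plan can be read as soft class assignments for the query samples; passing uniform marginals (np.full(n, 1/n) and np.full(m, 1/m)) recovers the class-balanced case that the proposed method reportedly relaxes.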
Some remarks:
1. I understand the meaning of 'backbone' as explained in line 71 (a pre-trained feature extractor), but I do not understand the explanation in line 39 ('well-thought-out transfer architectures'); this is confusing. The whole explanation in lines 39-41 is not clear.
2. In line 162, the reference to the section number is broken ('??').
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
The authors consider a very interesting problem: squeezing backbone feature distributions to the max for few-shot learning, which is especially important in the absence of representative datasets for classifier training. Typically, transferring knowledge acquired on another task through a pretrained feature extractor is used to circumvent the limitations of few-shot learning. To avoid class-balancing dilemmas, the authors propose a novel transfer-based method that is able to cope with both inductive and transductive cases.
There are only minor shortcomings in this very well-written manuscript, which the authors can correct themselves.
1. The description of the article's organization is missing (at the end of the Introduction section). There is also no brief presentation of the directions for further research at the end (in the last section, Conclusions).
2. When assessing the quality of the new method, the focus has generally been on accuracy; the value of the area under the ROC curve is also given. What about F1? It may also be worth reporting training or prediction times (a minimal sketch of these metrics is given after this list).
3. Typos:
a) line 152: "more details in Section ??";
b) lines 154-155: text lines are not numbered;
c) Table 8: Proposed => proposed.
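As an illustration of point 2, a minimal sketch of the suggested metrics using scikit-learn (the arrays below are placeholders, not results from the manuscript):

import time
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Placeholder predictions for a 5-way episode: y_score holds per-class
# probabilities, y_pred the hard class assignments.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 5, size=75)
y_score = rng.dirichlet(np.ones(5), size=75)
y_pred = y_score.argmax(axis=1)

start = time.perf_counter()
acc = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred, average="macro")           # macro-averaged F1
auc = roc_auc_score(y_true, y_score, multi_class="ovr")  # one-vs-rest AUC
# Elapsed time here covers metric computation only; wrapping the model's
# forward pass instead would report the prediction time asked about above.
elapsed = time.perf_counter() - start

print(f"accuracy={acc:.3f}  macro-F1={f1:.3f}  AUC={auc:.3f}  ({elapsed * 1e3:.1f} ms)")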
Overall, the manuscript is very well written, and the above shortcomings can be corrected by the authors without the need for further review.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf