Learning DOM Trees of Web Pages by Subpath Kernel and Detecting Fake e-Commerce Sites
Round 1
Reviewer 1 Report
To detect fake e-commerce websites (an approach that could be extended to other sites), the authors propose to analyse DOM trees and recognize patterns related to fake websites using the SubPath Kernel.
The positive value of the paper is its use of real data from one of the biggest e-commerce platforms in Japan to test the relevance and efficiency of the approach. The authors were also able to rely on a decent number of fake websites to train their model to distinguish fake from legitimate websites. The results are very interesting.
The introduction and the problem statement are not clear to the reader. At the end of the introduction, it is very hard to identify which problem this paper is tackling. Is it an enhancement of the SubPath Kernel? Is it a solution for identifying fake websites using the SubPath Kernel? Is it both? The problem should be clearly explained and the pain points clearly identified.
The state of the art also needs clarification, especially with regard to other solutions proposed in the literature to identify fake websites. Are the other approaches inefficient?
Many references, such as "Fake-website detection tools: Identifying elements that promote individuals' use and enhance their performance" or "Detecting fake websites: the contribution of statistical learning theory", are missing from the state of the art. What is the difference in terms of precision, false positive rate, or performance?
Author Response
Thank you for your valuable comments. We have submitted our responses to your review.
Best,
Kilho Shin, corresponding author
Author Response File: Author Response.pdf
Reviewer 2 Report
The paper deals with an important task in e-commerce systems.
The paper has a good structure, an interesting approach, and many results presented in visual form. The abstract is excellent!
The paper has scientific novelty and great practical value.
Suggestions
- Please explain all abbreviations, for example DOM.
- Please provide a link to the open-access repository containing the dataset used for modeling.
- The legend of Fig. 8 is not clear or informative. This also applies to other figures. Please fix it.
- Most of the references are outdated. Please update them.
- It would be good to take into account some SVM improvements that provide more precise results (DOI: 10.5815/ijisa.2018.09.05).
Overall, the article made a very good impression and I enjoyed reading it.
Author Response
Thank you for your valuable comments. We have submitted our responses to your review.
Best,
Kilho Shin, corresponding author
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
All the comments and questions raised in the first review are clearly addressed. The problem and the solution are now clarified, giving the reader a better understanding.
The state of the art has also been enhanced according to the suggestions, and it now takes into account very relevant references on similar approaches, with a clear argument about what differentiates them. A comparative table has been added to position the proposed solution relative to the state-of-the-art ones. This is a valuable contribution to the paper.