Peer-Review Record

Hardware–Software Co-Design of an Audio Feature Extraction Pipeline for Machine Learning Applications

Electronics 2024, 13(5), 875; https://doi.org/10.3390/electronics13050875

by Jure Vreča ¹,²,*, Ratko Pilipović ³ and Anton Biasizzo ¹

Reviewer 1: Ya-long Yang

Reviewer 2: Anonymous

Submission received: 31 January 2024 / Revised: 17 February 2024 / Accepted: 22 February 2024 / Published: 24 February 2024

(This article belongs to the Special Issue Embedded AI)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper explores simplifications of the MFCC audio features and derives a simplified version that can be used more easily in embedded applications. Additionally, the paper implements a hardware generator that generates an appropriate hardware pipeline for the simplified audio feature extraction. The cited references are generally related to the research content and include some cutting-edge work, but recent literature is relatively scarce, and it is recommended to add more. No important references were excluded.

Comments on the Quality of English Language

The paper has good readability, smooth logic, and clear expression. It is recommended to provide a more detailed description of the core scheme of the paper. The tables in the paper clearly reflect the experimental results. It is recommended to add more detailed and prominent annotations to the figures, or to adjust the proportions of the key regions shown, so that they better reflect the experimental results.

Author Response

Dear reviewer,

Please find our responses in the attached pdf file.

Best regards,

Jure Vreča

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The manuscript is well written and organized; the proposed approach, while certainly not novel in its general form, is interesting for the specific case. Furthermore, the authors report convincing results that, in my opinion, support their proposal well. The manuscript is somewhat short and could be extended by better explaining certain choices that have been made and, more generally, by providing more details on the implementation.

The following are some additional issues that I noticed in the manuscript:

* Section 0 (Introduction) should be numbered Section 1.

* In Section 0, it is mentioned that audio is sampled at 16 kHz, but in Equation 1, 16383 samples are considered. The authors should explain the reason for these additional 383 samples.

* Subsection 1.1 is unnecessary, being the only subsection of Section 1, and should be removed.

* In Section 2.2, the authors mention that the FFT results are incremented by 1 to achieve better numerical stability. This choice needs to be motivated.

* Note 1 should be included in the text of the section, if the authors feel it is important to mention.

* Figure 6 is difficult to understand even by reading the explanatory text. To me, it creates unneeded complexity and it should be changed/removed, unless there is a specific reason for the proposed graphical representation (in this case, it should definitely be explained better).

Author Response

Dear reviewer,

Please find our responses in the attached pdf file.

Best regards,

Jure Vreča

Author Response File: Author Response.pdf
