Review
Peer-Review Record

Machine Learning Applications for Mass Spectrometry-Based Metabolomics

Metabolites 2020, 10(6), 243; https://doi.org/10.3390/metabo10060243
by Ulf W. Liebal 1,*, An N. T. Phan 1, Malvika Sudhakar 2,3,4, Karthik Raman 2,3,4 and Lars M. Blank 1,*
Reviewer 1:
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 30 April 2020 / Revised: 9 June 2020 / Accepted: 11 June 2020 / Published: 13 June 2020
(This article belongs to the Special Issue Metabolic Engineering and Synthetic Biology Volume 2)

Round 1

Reviewer 1 Report

The manuscript by Liebal et al. is a review article introducing recent trends in machine learning applications for mass spectrometry-based metabolomics. The article will be very useful for many readers of Metabolites due to its comprehensive citation of recently published papers and its concise summary of their relationships, as in Tables 5, 6, and 7. I have a few comments on this excellent review article.

1) A small flaw of this manuscript is its vague conclusion, since the most important message of the article appears to be: “Because each data set is unique, and any data property can affect the performance of the different statistical approaches, it is advisable to test multiple ML tools on the data.” (lines 336-337). Readers need more practical lessons, such as a sensible order in which to try multiple ML methods, as well as the pros and cons of the three strategies for multi-omics integration.

2) All sections explain that each technology has been more or less improved by ML, but not enough. Readers cannot tell how much improvement there was. A more quantitative explanation using some benchmark would be very useful, for example for missing-data imputation, QC normalization, and peak picking. If no benchmarking method exists, this is a current bottleneck of the field and should be discussed in this review.

3) In Table 3, ReSpect [45]: “Over 9,000 MS7MS spectrum of phytochemicals” => MS/MS.

Author Response

Thank you for giving us the opportunity to further improve our manuscript “Machine learning applications for mass spectrometry-based metabolomics” (metabolites-805982) based on the reports you received. We thank the reviewers for their responses. We have addressed all points raised by the reviewers and incorporated the corresponding changes in the manuscript. Our changes are highlighted in yellow.

We have further included a point-by-point response to all comments raised by the reviewers. To highlight the responses, we give our reply (in italics) to each specific suggestion and describe how we addressed it in the revised manuscript.

We would like to thank the reviewers again for their reviews, as well as the editor for handling this manuscript.  

Sincerely, 

Ulf W. Liebal, An N. T. Phan, Malvika Sudhakar, Karthik Raman and Lars M. Blank 

Reviewer #1

We are grateful for the reviewer's response and pleased to learn that they recognize the value of our contribution.

 

Point-by-point response: 

Point 1: A small flaw of this manuscript is its vague conclusion, since the most important message of the article appears to be: “Because each data set is unique, and any data property can affect the performance of the different statistical approaches, it is advisable to test multiple ML tools on the data.” (lines 336-337). Readers need more practical lessons, such as a sensible order in which to try multiple ML methods, as well as the pros and cons of the three strategies for multi-omics integration.

Response: We added the following text to describe the pros and cons of multi-omics integration strategies: 

‘Integrating data using concatenation and ensemble methods discovers data correlations across omics layers that are invisible to the post-analysis approach. The post-analysis, however, is relevant for analyzing data from different experiments when homogeneous data across omics sets are not available.’ 
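To make the trade-off concrete, below is a minimal sketch of the concatenation strategy, assuming two hypothetical omics feature matrices measured on the same samples; all names, dimensions, and data are illustrative and not taken from the manuscript.

```python
# Minimal sketch: concatenation-based multi-omics integration.
# Two hypothetical omics layers (metabolomics, transcriptomics) are
# stacked feature-wise so a single model can exploit cross-layer
# correlations; the synthetic data stand in for real measurements.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X_metab = rng.normal(size=(40, 120))   # 40 samples x 120 metabolite features
X_trans = rng.normal(size=(40, 500))   # 40 samples x 500 transcript features
y = np.repeat([0, 1], 20)              # binary phenotype labels

X_joint = np.hstack([X_metab, X_trans])  # the concatenation step
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X_joint, y, cv=5).mean())
```

By contrast, the post-analysis strategy would train one model per omics layer and merge only the per-layer results afterwards.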

Point 2: All sections explain that each technology has been more or less improved by ML, but not enough. Readers cannot tell how much improvement there was. A more quantitative explanation using some benchmark would be very useful, for example for missing-data imputation, QC normalization, and peak picking. If no benchmarking method exists, this is a current bottleneck of the field and should be discussed in this review.

Response: The benchmark problem is highly relevant for comparing machine learning tools, and we are grateful to the reviewer for highlighting this issue. We have added to Table 5 (new version) a column with the data origin, which also shows the benchmarks currently employed. We hope that in this way we can contribute to the evolution of a coherent benchmark data set.

Point 3: In Table 3, ReSpect [45]: “Over 9,000 MS7MS spectrum of phytochemicals” => MS/MS.

Response: Corrected.

Reviewer 2 Report

This paper is well-organized, easy to read, and covers every important topic of current ML applications for MS. I therefore recommend it for publication as a review, but first the authors should address the revisions I suggest below:

  1. Page 2, lines 65-66. The statement made in the manuscript reads unclear: “The ML tools differ in their statistical models…
  2. Page 5, line 137. Please specify some references for the SIMCA-P and MetaboAnalyst tools.
  3. Table 3: I suggest inserting an additional column for the database URL.
  4. Page 12, lines 351-352. Please add a reference for the statement “However, ML-based approaches are being developed …. to support mechanistic model formulation”.
  5. Page 13, line 387. MFA is a huge domain of research. Referencing a specific MFA review paper would be useful.

Author Response

Thank you for giving us the opportunity to further improve our manuscript “Machine learning applications for mass spectrometry-based metabolomics” (metabolites-805982) based on the reports you received. We thank the reviewers for their responses. We have addressed all points raised by the reviewers and incorporated the corresponding changes in the manuscript. Our changes are highlighted in yellow.

We have further included a point-by-point response to all comments raised by the reviewers. To highlight the responses, we give our reply (in italics) to each specific suggestion and describe how we addressed it in the revised manuscript.

We would like to thank the reviewers again for their reviews, as well as the editor for handling this manuscript.  

Sincerely, 

Ulf W. Liebal, An N. T. Phan, Malvika Sudhakar, Karthik Raman and Lars M. Blank 

 

Reviewer #2 

We are happy that the reviewer found our article useful and we thank them for all the constructive suggestions. 

Point 1: Page 2, lines 65-66. The statement made in the manuscript reads unclear: “The ML tools differ in their statistical models…

Response: We hope to clarify by the following adaptation: 

‘The ML tools use different algorithms and Table 1 provides a brief overview...’ 

Point 2: Page 5, line 137. Please specify some references for the SIMCA-P and MetaboAnalyst tools.

Response: This sentence was deleted; instead, we added a citation to the latest toolbox paper by O’Shea & Misra:

In addition, there are several software tools for manual data processing, as reviewed by O’Shea and Misra (2020) [26]. 

Point 3: Table 3: I suggest inserting an additional column for the database URL. 

Response: Column added. 

Point 4: Page 12, lines 351-352. Please add a reference for the statement “However, ML-based approaches are being developed …. to support mechanistic model formulation”.

Response: We added references to Alber et al. (2019), who provide a general review on combining machine learning with mechanistic models, and Heckmann et al. (2018), who present a strategy to generate enzyme parameters from protein structure to enable large-scale model construction.

Point 5: Page 13, line 387. MFA is a huge domain of research. Referencing a specific MFA review paper would be useful.

Response: We added the reference by Antoniewicz (2018), an excellent introduction to MFA, including clinical applications.

Reviewer 3 Report

The purpose of the paper is to review machine learning methods and applications in mass spectrometry-based metabolomics.

Unfortunately, I found the paper mostly confusing. It is not clear to me who the authors think the target audience of the paper is, nor what the added value of the paper for that audience is supposed to be.

The main problems of this paper in my view are:

1) The workflows in metabolomics are not clearly reflected. Several tasks are discussed without making it clear how the different tasks connect to each other and what the inputs and outputs of each phase are. I felt the order of discussion is at many points not logical and makes the paper hard to follow. For example, section 2.1 first discusses peak detection (aka peak picking, feature finding), i.e., deciding which peaks to store and which to discard from raw LC/MS retention time data from complex samples, and combines that with a discussion of metabolite identification methods, which are trained using MS/MS reference spectra and thus represent a completely different case from an ML point of view.

2) The different modes of generating and using MS data are not clearly depicted. E.g., metabolomics is often done using MS1 + retention time data, and MS/MS is used as a data source for obtaining more accurate annotations if required. From an ML point of view the two setups, the data, and consequently the best methods used are different, but this is not at all apparent from the text. Also, the biomarker discovery and classification applications can work using either MS1-RT based features, molecular fingerprints, or putative annotations of molecules, making a big difference in what ML techniques are needed. Without grounding the comparison in the kind of data being used, the method comparisons are not meaningful.

4) A significant part of the references in the paper do not actually discuss MS-based metabolomics data. At several points the authors aim to draw conclusions from such papers, but I do not think this is a valid approach. Metabolomics data are different from proteomics, not to mention transcriptomics, so it is not clear that the lessons learned there would directly translate. I was confused at many points in trying to understand why a certain paper outside metabolomics was discussed.

5) At many points, I feel the continuum and the state of the art in the research field are not properly presented. This is probably mostly due to selecting very recent papers as references, which may not reflect the field that well, at least not yet. The prime example is the discussion about “peak annotation” and molecular fingerprints, where the text implies that the idea of predicting molecular fingerprints was due to a few very recent ANN-based papers, while in reality the original approach (FingerID by Heinonen et al. 2012) was SVM-based and continues to be used in CSI:FingerID, SIRIUS, and several other tools.

 

Details

=====

Line 78

  • Not sure if RF is highly interpretable. If the forest consists of, say, 100 trees with a dozen variables each, it is hard to interpret.
  • SVMs are not notorious for overfitting; in fact, on medium-sized datasets they have state-of-the-art accuracy, on par with RF and logistic regression and better than ANN. On big data (10^5 data points and upwards) ANN is better than the others.
  • I don’t think ANNs are any more interpretable than SVMs.
  • I don’t understand the GA block, in particular how a “function tree” is related to GA; GA is a very generic methodology, and the function tree appears to be a specific instance. This needs to be made clearer.

96: Weka is general-purpose ML software, whereas KniMET is specific to metabolomics. It is confusing to refer to them as if they were alternatives to each other, without explaining the context.

100: What does “with relevance to MS data analysis” mean? In a review paper in Metabolites you should be more specific.

112: Is Table 2 really helpful for this paper? I don’t think the explanations given for the terms are sufficient to actually understand them if you don’t already, and the terms do not appear that many times in the rest of the paper (e.g., “softmax” and “activation function” only appear once in the rest of the article).

122: I think this is too simplified a view of the standard of metabolite annotation (see e.g. Blazenovic et al. 2018).

125: These tools/databases do not address the same task, and there are many more that could be mentioned. Please be more specific.

132: Why is only the PLS family of methods highlighted when other statistical tools could be and are used as well, including PCA, t-SNE, ...

136: It is not clearly explained here how the cross-validation and the statistical tests relate. I feel the discussion here is too superficial to be helpful.

137: Again, a general tool (SIMCA-P) and a specific one (MetaboAnalyst) are mentioned side by side without mentioning their very different scopes.

144: The cited review papers do NOT actually discuss the peak detection problem at all. Please correct this bit.

146: The reference [48] is not about MS; there should be a better choice for introducing the CNN discussion here. I don’t understand what “was also performed” refers to.

148: I think saying the non-linearity of retention times is because of “all the atoms” does not hit the actual issue. I think it is mostly about finding the correct features to describe the interaction of the molecule with the LC column. It may or may not be non-linear in these features, but the main problem is that we do not know what the optimal features are.

154: There are lots of papers about retention time prediction; highlighting one paper does not feel representative.

161-166: This paragraph does not give a good picture of the state-of-the-art. 

168: tandem MS or MS/MS

176: This section actually reads well, but it is very much dependent on a few references, while the previous section compresses a huge body of literature into similar space.

207-: This section misses a discussion of what causes the missing values (lacking ionization, too low abundance, ...).

244-250: I am not sure what the authors are aiming to achieve with this conceptual cross-mapping of methods. Even though all methods could be mapped to ANNs, that does not answer the question of which method to use for a given dataset.

256: The selection of applications in the table is too heterogeneous to learn any lessons from. What more useful information does it convey than the frequencies of use of particular methods?

258: This section lacks the high-level view: whether we use unidentified MS profiles or putatively identified metabolites, which are completely different approaches.

298: I don’t think RFE is generally regarded as a stable feature selection algorithm. It would require more than one paper to conclude that it is in the specific case of metabolomics.

304: I am not sure RF is more data-hungry than SVM or ANN. Generally, ANN starts to work better than RF or SVM only when there is a lot of data.

310: DeepSpectra is not about MS spectra in particular, but spectra in general. Again, caution is advised against drawing direct conclusions based on this paper.

323: I don’t think that binary output is the explanation for conventional methods faring well. The bigger reason is that a lot of the benchmark datasets lack complex non-linear dependencies between input and output, which ML methods implicitly assume.

470: The conclusion does not seem to correspond to the contents of the paper too well

Author Response

Thank you for giving us the opportunity to further improve our manuscript “Machine learning applications for mass spectrometry-based metabolomics” (metabolites-805982) based on the reports you received. We thank the reviewers for their responses. We have addressed all points raised by the reviewers and incorporated the corresponding changes in the manuscript. Our changes are highlighted in yellow.

We have further included a point-by-point response to all comments raised by the reviewers. To highlight the responses, we give our reply (in italics) to each specific suggestion and describe how we addressed it in the revised manuscript.

We would like to thank the reviewers again for their reviews, as well as the editor for handling this manuscript.  

Sincerely, 

Ulf W. Liebal, An N. T. Phan, Malvika Sudhakar, Karthik Raman and Lars M. Blank 

 

Reviewer #3

We greatly appreciate the detailed review and constructive criticism, which has helped us greatly improve the manuscript. We believe we have addressed all of the reviewer's comments satisfactorily, and we hope that the present manuscript is convincing and will be a useful information resource for machine learning in metabolomics. 

 

Point-by-point response: 

Point 1: The workflows in metabolomics are not clearly reflected. Several tasks are discussed without making it clear how the different tasks connect to each other and what the inputs and outputs of each phase are. I felt the order of discussion is at many points not logical and makes the paper hard to follow. For example, section 2.1 first discusses peak detection (aka peak picking, feature finding), i.e., deciding which peaks to store and which to discard from raw LC/MS retention time data from complex samples, and combines that with a discussion of metabolite identification methods, which are trained using MS/MS reference spectra and thus represent a completely different case from an ML point of view.

Response: We added a new Figure to visualize the metabolomics workflow and how machine learning relates to each step. We reworked section 2.1 in particular by reducing topic jumps (for example, classification on raw m/z spectra now concludes the paragraph) and by streamlining the content (for example, by removing the description of peptide retention time prediction). We believe the association of peak picking followed by metabolite annotation is justified, as the data requirements of the machine learning approaches are similar.

Point 2: The different modes of generating and using MS data are not clearly depicted. E.g., metabolomics is often done using MS1 + retention time data, and MS/MS is used as a data source for obtaining more accurate annotations if required. From an ML point of view the two setups, the data, and consequently the best methods used are different, but this is not at all apparent from the text. Also, the biomarker discovery and classification applications can work using either MS1-RT based features, molecular fingerprints, or putative annotations of molecules, making a big difference in what ML techniques are needed. Without grounding the comparison in the kind of data being used, the method comparisons are not meaningful.

Response: We added a new column to the new Table 4, ‘Knowledge production with ML support’, with details on the experimental basis of each study. Thus, it is now possible to find recent references for defined experimental technologies.

Point 4: A significant part of the references in the paper do not actually discuss MS-based metabolomics data. At several points the authors aim to draw conclusions from such papers, but I do not think this is a valid approach. Metabolomics data are different from proteomics, not to mention transcriptomics, so it is not clear that the lessons learned there would directly translate. I was confused at many points in trying to understand why a certain paper outside metabolomics was discussed.

Response: We removed the discussion of peptide retention time prediction (2.1) and of transcriptome data preprocessing (3.1). We agree that the data properties are probably not sufficiently conserved across these domains. We have kept a more general review on the benefit of CNN for spectrometric analysis, because its relevance and educational benefit are high.

Point 5: At many points, I feel the continuum and the state of the art in the research field are not properly presented. This is probably mostly due to selecting very recent papers as references, which may not reflect the field that well, at least not yet. The prime example is the discussion about “peak annotation” and molecular fingerprints, where the text implies that the idea of predicting molecular fingerprints was due to a few very recent ANN-based papers, while in reality the original approach (FingerID by Heinonen et al. 2012) was SVM-based and continues to be used in CSI:FingerID, SIRIUS, and several other tools.

Response: We agree that the limitation to very recent publications is an insufficient representation of the field. We chose this restriction to be able to convey the broad impact of ML across the overall MS field. We believe we cite authoritative reviews for each subfield to enable informative further reading. We rewrote the section 2.1 passage on metabolite annotation with QSSR and fingerprints to include the article by Heinonen et al. (2012) as well as a recent comparison of ML-based retention time predictions (Bouwmeester et al., 2019).

 

---- 

Details: 

Line 78

  • Not sure if RF is highly interpretable. If the forest consists of, say, 100 trees with a dozen variables each, it is hard to interpret.

Response: RF is interpretable, irrespective of the number of trees, by finding the contribution of a feature across the trees. Also, the decision path can be visualized to provide additional insight. To increase clarity of the ‘interpretation’ term in Table 1, we added the sentence: ‘Methods transforming features into latent variables impede the interpretation of individual feature contributions to the prediction.’

In the introduction we added: ‘The item ‘interpretation’ judges how directly a feature is connected to the target value prediction and thus allows direct biological understanding of the decision. Methods transforming features into latent variables impede the interpretation of individual feature contributions to the prediction.’
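As an aside for readers, the kind of per-feature contribution we refer to can be computed as in the minimal sketch below, using scikit-learn's impurity-based importances; the data and dimensions are synthetic and purely illustrative.

```python
# Minimal sketch: interpreting an RF via feature contributions
# aggregated across all trees (mean decrease in impurity).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a metabolite intensity matrix with labels.
X, y = make_classification(n_samples=60, n_features=30, random_state=1)
forest = RandomForestClassifier(n_estimators=100, random_state=1).fit(X, y)

# feature_importances_ averages each feature's impurity reduction
# over the 100 trees, giving one interpretable score per feature.
ranked = np.argsort(forest.feature_importances_)[::-1]
for i in ranked[:5]:
    print(f"feature {i}: importance {forest.feature_importances_[i]:.3f}")
```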

 

  • SVMs are not notorious for overfitting; in fact, on medium-sized datasets they have state-of-the-art accuracy, on par with RF and logistic regression and better than ANN. On big data (10^5 data points and upwards) ANN is better than the others.

Response: We use three levels to judge overfitting risk. Linear models have the lowest risk of overfitting, whereas RF has an elevated risk. SVM ‘carries [the] risks of overfitting and unnecessary complexity if one is not careful’ (Brereton & Lloyd, 2010). It is more reasonable to keep the overfitting risk of SVM higher than that of RF.

 

  • I don’t think ANNs are any more interpretable than SVMs 

Response: ANN provide more interpretative power than SVM. SVM acts through a single kernel function, e.g., the RBF, that can capture highly nonlinear features, whereas ANN generate a latent space that is interpretable in principle. The articles of Morton et al. (2019) and Le et al. (2019) exemplify how the hidden nodes of neural networks can be analysed. The Conclusion also focuses on the potential of ANN for interpretation.

 

  • I don’t understand the GA block, in particular how a “function tree” is related to GA; GA is a very generic methodology, and the function tree appears to be a specific instance. This needs to be made clearer.

Response: We reformulated the description of GA as follows: ‘The solution space is searched by operations similar to natural genetic processes to identify suitable solutions. A fitness function is defined to find the fittest solutions. The fittest solutions are subject to cross-over and mutation to evolve towards the best solution.’
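For illustration, the loop described in this reformulation can be written out as a minimal sketch; the bit-string encoding and the toy fitness function are invented for the example and are not part of the manuscript.

```python
# Minimal sketch of a genetic algorithm: a fitness function ranks
# candidate solutions; the fittest are recombined (cross-over) and
# mutated to evolve the population towards better solutions.
import random

random.seed(0)
N_BITS, POP, GENERATIONS = 20, 30, 50

def fitness(sol):            # toy fitness: number of 1-bits
    return sum(sol)

def crossover(a, b):         # single-point cross-over
    cut = random.randrange(1, N_BITS)
    return a[:cut] + b[cut:]

def mutate(sol, rate=0.05):  # flip bits with small probability
    return [1 - g if random.random() < rate else g for g in sol]

pop = [[random.randint(0, 1) for _ in range(N_BITS)] for _ in range(POP)]
for _ in range(GENERATIONS):
    pop.sort(key=fitness, reverse=True)
    parents = pop[: POP // 2]            # select the fittest half
    children = [mutate(crossover(random.choice(parents),
                                 random.choice(parents)))
                for _ in range(POP - len(parents))]
    pop = parents + children
print(fitness(max(pop, key=fitness)))    # best fitness found
```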

 

96: Weka is general-purpose ML software, whereas KniMET is specific to metabolomics. It is confusing to refer to them as if they were alternatives to each other, without explaining the context.

Response: Reformulated to focus on the specific MS contributions in each general-purpose software: ‘MS data analysis guides and add-ons for tools with a visual interface were published for WEKA [16] and KNIME [17].’ 

 

100: What does “with relevance to MS data analysis” mean? In a review paper in Metabolites you should be more specific. 

Response: The specific benefit to MS is briefly mentioned: 'New methods evolved from ANN, including convolutional neural networks (CNN) suited for peak characterization and encoder-decoder systems suited for latent variable projections.' 

 

112: Is Table 2 really helpful for this paper? I don’t think the explanations given for the terms are sufficient to actually understand them if you don’t already, and the terms do not appear that many times in the rest of the paper (e.g., “softmax” and “activation function” only appear once in the rest of the article).

Response: To improve the flow while reading, we moved the definitions of the terms to an abbreviation section at the end of the article. While the brevity of the definitions will not lead to full understanding, we are convinced the descriptions provide insight.

 

122: I think this is too simplified a view of the standard of metabolite annotation (see e.g. Blazenovic et al. 2018).

Response: Simple indeed, but not false. 

125: These tools/databases do not address the same task, and there are many more that could be mentioned. Please be more specific.

Response: We deleted the incomplete list and added instead a reference to the authoritative review by O'Shea and Misra (2020) on the latest software tools in MS analytics. 

132: Why is only the PLS family of methods highlighted when other statistical tools could be and are used as well, including PCA, t-SNE, …

Response: In order not to confuse the audience with too many methods, we decided not to mention any specific MVAs here. 

136: It is not clearly explained here how the cross-validation and the statistical tests relate. I feel the discussion here is too superficial to be helpful.

Response: We deleted the information on additional statistical tests on variables to strengthen the message of the importance of cross-validation.

137: Again, a general tool (SIMCA-P) and a specific one (MetaboAnalyst) are mentioned side by side without mentioning their very different scopes.

Response: The sentence is deleted. 

144: The cited review papers do NOT actually discuss the peak detection problem at all. Please correct this bit.

Response: We moved the citation to the paragraph discussing peak detection. 

146: The reference [48] is not about MS; there should be a better choice for introducing the CNN discussion here. I don’t understand what “was also performed” refers to.

Response: We added the article by Risum and Bro (2019) about CNN in MS. The old reference [48] is about spectral analysis with CNN; there is certainly something to learn from spectral analyses generated in other experimental domains.

The sentence containing ‘was also performed’ introduces an independent topic; we moved it to a better position at the end of the paragraph and changed it to: ‘While data pre-processing increases the information content of the raw data and allows for more complex analysis, methods were developed to bridge from raw spectral data directly to phenotype characterization.’

 

148: I think saying the non-linearity of retention times is because of “all the atoms” does not hit the actual issue. I think it is mostly about finding the correct features to describe the interaction of the molecule with the LC column. It may or may not be non-linear in these features, but the main problem is that we do not know what the optimal features are.

Response: Adapted as: ‘The ab initio prediction of metabolite retention time is a complex problem because unknown subsets of a metabolite’s atoms are involved.’

154: There are lots of papers about retention time prediction; highlighting one paper does not feel representative.

Response: We removed a reference to peptide spectrometry and added a reference with a comparative study of different ML approaches for retention time prediction:

‘Bouwmeester et al. (2019) [47] conducted an illustrative comparison of different ML approaches for LC retention time prediction.’
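For readers who want to run such a comparison themselves, below is a minimal sketch in the spirit of (but not reproducing) that study; the descriptor matrix and retention times are synthetic stand-ins, not data from Bouwmeester et al.

```python
# Minimal sketch: cross-validated comparison of regressors for
# retention time prediction on synthetic molecular descriptors.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 50))  # 200 molecules x 50 descriptors
y = X[:, :5].sum(axis=1) + rng.normal(scale=0.3, size=200)  # toy RTs

for name, model in [("ridge", Ridge()),
                    ("svr", SVR(kernel="rbf")),
                    ("rf", RandomForestRegressor(n_estimators=100,
                                                 random_state=0))]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean CV R^2 = {r2:.2f}")
```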

161-166: This paragraph does not give a good picture of the state-of-the-art.  

Response: Newly written with a focus on ML properties rather than conversion. We start with the CSI:FingerID spectrum analysis using SVM that led to SIRIUS. This is followed by a characterization of the ML-specific benefits of ANN and CNN.

168: tandem MS or MS/MS 

Response: Corrected. 

176: This section actually reads well, but it is very much dependent on a few references, while the previous section compresses a huge body of literature into similar space.

Response: Indeed, the topic of normalization has fewer publications with machine learning contributions compared to peak picking and metabolite annotation, and the cited articles have already set the bar high.

207-: This section misses a discussion of what causes the missing values (lacking ionization, too low abundance, ...).

Response: We do mention the origins of missing values, e.g., ‘While MAR is usually caused by a failure in data preprocessing, such as inaccurate peak detection and deconvolution of co-eluting compounds, MCAR is mainly due to the data acquisition process like incomplete derivatization or ionization [73].’ (line 226ff, new version)  
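To give one concrete example of handling such gaps, a minimal sketch of k-nearest-neighbour imputation follows; KNN imputation is one common strategy for MAR-type gaps, and the intensity matrix here is synthetic and purely illustrative.

```python
# Minimal sketch: imputing missing values in a metabolite
# intensity matrix with k-nearest-neighbour imputation.
import numpy as np
from sklearn.impute import KNNImputer

rng = np.random.default_rng(3)
X = rng.lognormal(size=(30, 15))    # 30 samples x 15 metabolites
mask = rng.random(X.shape) < 0.1    # ~10% of values go missing
X[mask] = np.nan

# Each gap is filled from the 5 most similar samples.
X_imputed = KNNImputer(n_neighbors=5).fit_transform(X)
print(np.isnan(X_imputed).sum())    # 0 remaining missing values
```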

244-250: I am not sure what the authors are aiming to achieve with this conceptual cross-mapping of methods. Even though all methods could be mapped to ANNs, that does not answer the question of which method to use for a given dataset.

Response: We believe there is a benefit for readers with marginal ML knowledge. Knowing the interrelationships of the methods helps to categorize them mentally and to appreciate the functional modifications each approach makes to represent data complexity.

256: The selection of applications in the table is too heterogeneous to learn any lessons from. What more useful information does it convey than the frequencies of use of particular methods?

Response: Precisely by exposing this heterogeneity, the table helps anyone interested in applying a specific method. The table provides a broad overview and guides the reader with its categorization. We have added a column identifying the spectrometric method used for data generation.

258: This section lacks the high-level view: whether we use unidentified MS profiles or putatively identified metabolites, which are completely different approaches.

Response: Wherever possible, we added the origin of the data to the text. Also, we added an additional column to the table listing the articles with their spectrometric technique.

298: I don’t think RFE is generally regarded as a stable feature selection algorithm. It would require more than one paper to conclude that it is in the specific case of metabolomics. 

Response: RFE deleted; the focus is now on the minimum sample number per class for robust classification: ‘On the lower limit, one study reported robust binary classification with as few as three samples in each class for linear SVM with untargeted data derived from archaeal cultivation and pig urine after traumatization [101].’
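To make this setting tangible, below is a minimal sketch of leave-one-out evaluation of a linear SVM with three samples per class; the data are synthetic and are not related to the study cited as [101].

```python
# Minimal sketch: binary classification with a linear SVM on a
# tiny data set (3 samples per class), evaluated by leave-one-out
# cross-validation, which suits such small sample numbers.
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 1, size=(3, 40)),   # class 0: 3 samples
               rng.normal(1, 1, size=(3, 40))])  # class 1: 3 samples
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0)
acc = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
print(f"leave-one-out accuracy: {acc:.2f}")
```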

304: I am not sure RF is more data-hungry than SVM or ANN. Generally, ANN starts to work better than RF or SVM only when there is a lot of data.

Response: Data hunger is not an independent ML property but depends on the data. We cite the comparative analyses by van der Ploeg et al. (2014) on clinical data and Mendez et al. (2019) on metabolomics data to support this claim.

310: DeepSpectra is not about MS spectra in particular, but spectra in general. Again, caution is advised against drawing direct conclusions based on this paper.

Response: We agree and have reduced the emphasis on this paper, first, by only reporting on the possibility of m/z raw spectrum classification/regression and leaving out the controversial data pre-processing results, and second, by moving the passage to the peak processing part of section 2.1.

323: I don’t think that binary output is the explanation for conventional methods faring well. The bigger reason is that a lot of the benchmark datasets lack complex non-linear dependencies between input and output, which ML methods implicitly assume.

Response: Binary classification is a simpler problem than regression. Linear models can often separate two classes efficiently even if the underlying data are nonlinear. The advantage is that data-efficient linear models can then fit better than nonlinear models. It might be reasonable to assume a lack of complex nonlinearities in the clinical data, but in the absence of a mechanistic model we do not know.

470: The conclusion does not seem to correspond to the contents of the paper too well 

Response: The conclusion has now been streamlined to focus on the inevitable diversity of the methods and how to tackle this diversity with standards and benchmarks. 

Round 2

Reviewer 3 Report

I think the manuscript has improved massively and now provides a good overview of ML in metabolomics.
