Metabolomics Data Processing and Data Analysis—Current Best Practices

A special issue of Metabolites (ISSN 2218-1989). This special issue belongs to the section "Bioinformatics and Data Analysis".

Deadline for manuscript submissions: closed (31 May 2019) | Viewed by 152542

Printed Edition Available!
A printed edition of this Special Issue is available here.

Special Issue Editors


Dr. Kati Hanhineva
Guest Editor
Academy Research Fellow, University of Eastern Finland, 70211 Kuopio, Finland
Interests: food and nutritional metabolomics; LC-MS based metabolic profiling approaches; development of data-analytical procedures for metabolomics

Dr. Justin van der Hooft
Guest Editor
Bioinformatics Group, Department of Plant Sciences, Wageningen University, 6708 PB Wageningen, The Netherlands
Interests: metabolomics; metabolite annotation; metabolite identification; metabolome mining; mass spectrometry; mass fragmentation; machine learning-based approaches; substructures; chemical classes; natural product discovery; food metabolome

Special Issue Information

Dear Colleagues,

Metabolomics data-analytical approaches are developing at an accelerating pace, alongside technical improvements in the instrumentation used in the field. There is currently a plethora of vendor-specific and open-source software solutions for various aspects of metabolomics data analysis: some cover the whole workflow, whereas others focus on specific aspects, such as the in silico prediction of metabolite structures. Thus, the choice of methods may be confusing for new scholars entering the field, and the selection of a suitable approach is a tedious process. This Special Issue is devoted to reviewing the current practical aspects of metabolomics data-analytical workflows, from data collection all the way to the presentation of publication-ready metabolomics results, to serve as a tutorial on current best practices. We therefore invite review and viewpoint manuscripts devoted to various aspects of non-targeted metabolite profiling data analysis, with a specific emphasis on peak picking, data preprocessing (e.g., normalization, scaling, imputation), metabolite annotation and identification, as well as visualization practices. Finally, we also invite manuscripts with innovative and integrative solutions for peak picking and metabolite annotation, which may well become “current practices” in the near future.

The Special Issue is open for submission now. A proper extension may be granted. Please kindly let us know in advance. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website.

Dr. Kati Hanhineva
Dr. Justin van der Hooft
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Metabolites is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • metabolomics
  • data processing
  • data analysis
  • data interpretation: annotation and visualization

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (14 papers)

Research

16 pages, 2912 KiB  
Article
In Silico Optimization of Mass Spectrometry Fragmentation Strategies in Metabolomics
by Joe Wandy, Vinny Davies, Justin J. J. van der Hooft, Stefan Weidt, Rónán Daly and Simon Rogers
Metabolites 2019, 9(10), 219; https://doi.org/10.3390/metabo9100219 - 9 Oct 2019
Cited by 15 | Viewed by 6380
Abstract
Liquid chromatography (LC) coupled to tandem mass spectrometry (MS/MS) is widely used in identifying small molecules in untargeted metabolomics. Various strategies exist to acquire MS/MS fragmentation spectra; however, the development of new acquisition strategies is hampered by the lack of simulators that let researchers prototype, compare, and optimize strategies before validation on real machines. We introduce Virtual Metabolomics Mass Spectrometer (ViMMS), a metabolomics LC-MS/MS simulator framework that allows for scan-level control of the MS2 acquisition process in silico. ViMMS can generate new LC-MS/MS data based on empirical data or virtually re-run a previous LC-MS/MS analysis using pre-existing data to allow the testing of different fragmentation strategies. To demonstrate its utility, we show how ViMMS can be used to optimize N for Top-N data-dependent acquisition (DDA), giving results comparable to modifying N on the mass spectrometer. We expect that ViMMS will save method development time by allowing for offline evaluation of novel fragmentation strategies and optimization of the fragmentation strategy for a particular experiment.
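
To make the Top-N idea in the abstract above concrete, the following is a minimal Python sketch of a Top-N DDA scheduler with dynamic exclusion. It is not the ViMMS API: the function name `top_n_dda`, its parameters (`exclusion_s`, `mz_tol`, `min_intensity`), and the toy scans are all assumptions made for illustration.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def top_n_dda(
    ms1_scans: List[Tuple[float, List[Peak]]],
    n: int,
    exclusion_s: float = 15.0,
    mz_tol: float = 0.01,
    min_intensity: float = 0.0,
) -> List[Tuple[float, float]]:
    """Return (retention time, precursor m/z) for every simulated MS2 event."""
    excluded: List[Tuple[float, float]] = []  # (m/z, time at which exclusion ends)
    ms2_events: List[Tuple[float, float]] = []
    for rt, peaks in ms1_scans:
        # Drop exclusions that have expired by the time of this MS1 scan.
        excluded = [(mz, until) for mz, until in excluded if until > rt]
        picked = 0
        # Walk the peaks from most to least intense.
        for mz, intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
            if picked == n or intensity < min_intensity:
                break
            if any(abs(mz - ex_mz) <= mz_tol for ex_mz, _ in excluded):
                continue  # fragmented recently, so skip this precursor
            ms2_events.append((rt, mz))
            excluded.append((mz, rt + exclusion_s))
            picked += 1
    return ms2_events


# Hypothetical toy data: two MS1 scans, one second apart, with three ions each.
toy_scans = [
    (0.0, [(100.0, 50.0), (200.0, 80.0), (300.0, 10.0)]),
    (1.0, [(100.0, 60.0), (200.0, 90.0), (300.0, 20.0)]),
]
events_n1 = top_n_dda(toy_scans, n=1)
events_n2 = top_n_dda(toy_scans, n=2)
```

With N = 2 the low-intensity ion at m/z 300 is fragmented in the second scan because the two stronger ions are still excluded; with N = 1 it is never reached in these two scans. Comparing such outcomes across values of N is the kind of question a simulator lets one explore offline.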

16 pages, 1594 KiB  
Article
R-MetaboList 2: A Flexible Tool for Metabolite Annotation from High-Resolution Data-Independent Acquisition Mass Spectrometry Analysis
by Manuel D. Peris-Díaz, Shannon R. Sweeney, Olga Rodak, Enrique Sentandreu and Stefano Tiziani
Metabolites 2019, 9(9), 187; https://doi.org/10.3390/metabo9090187 - 17 Sep 2019
Cited by 9 | Viewed by 5288
Abstract
Technological advancements have permitted the development of innovative multiplexing strategies for data-independent acquisition (DIA) mass spectrometry (MS). Software solutions and extensive compound libraries facilitate the efficient analysis of MS1 data, regardless of the analytical platform. However, the development of comparable tools for DIA data analysis has significantly lagged. This research introduces an update to the former MetaboList R package and a workflow for full-scan MS1 and MS/MS DIA processing of metabolomic data from multiplexed liquid chromatography high-resolution mass spectrometry (LC-HRMS) experiments. Compared to the former version, new functions were incorporated into the update to address isolated MS1 and MS/MS workflows, processing of MS/MS data from stepped collision energies, performance scoring of metabolite annotations, and batch job analysis. The flexibility and efficiency of this strategy were assessed through the study of the metabolite profiles of human urine, leukemia cell culture, and medium samples analyzed by either liquid chromatography quadrupole time-of-flight (q-TOF) or quadrupole orbital (q-Orbitrap) instruments. This open-source alternative was designed to promote global metabolomic strategies based on recursive retrospective research of multiplexed DIA analysis.

14 pages, 4734 KiB  
Article
rMSIKeyIon: An Ion Filtering R Package for Untargeted Analysis of Metabolomic LDI-MS Images
by Esteban del Castillo, Lluc Sementé, Sònia Torres, Pere Ràfols, Noelia Ramírez, Manuela Martins-Green, Manel Santafe and Xavier Correig
Metabolites 2019, 9(8), 162; https://doi.org/10.3390/metabo9080162 - 2 Aug 2019
Cited by 2 | Viewed by 3710
Abstract
Many MALDI-MS imaging experiments are case versus control studies of different tissue regions, designed to highlight significant compounds affected by the variables of study. This is a challenge because the tissue samples to be compared come from different biological entities, and therefore they exhibit high variability. Moreover, the statistical tests available cannot properly compare ion concentrations in two regions of interest (ROIs) within or between images. The high correlation between the ion concentrations due to the existence of different morphological regions in the tissue means that the common statistical tests used in metabolomics experiments cannot be applied. Another difficulty with the reliability of statistical tests is the elevated number of undetected MS ions in a high percentage of pixels. In this study, we report a procedure for discovering the most important ions in the comparison of a pair of ROIs within or between tissue sections. These ROIs were identified by an unsupervised segmentation process, using the popular k-means algorithm. Our ion filtering algorithm aims to find the up- or down-regulated ions between two ROIs by using a combination of three parameters: (a) the percentage of pixels in which a particular ion is not detected, (b) the Mann–Whitney U ion concentration test, and (c) the ion concentration fold-change. The undetected MS signals (null peaks) are discarded from the histogram before the calculation of parameters (b) and (c). With this methodology, we found the important ions between the different segments of a mouse brain tissue sagittal section and determined some lipid compounds (mainly triacylglycerols and phosphatidylcholines) in the liver of mice exposed to thirdhand smoke.
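
The three parameters named in the abstract above can be sketched in a few lines of Python. This is an illustration only and not the rMSIKeyIon implementation: the function names, the thresholds in `is_key_ion`, the way the three parameters are combined, and the toy pixel intensities are assumptions, and the Mann–Whitney p-value uses a plain normal approximation without tie or continuity correction.

```python
import math
from statistics import mean
from typing import Dict, List, Optional


def mann_whitney_p(a: List[float], b: List[float]) -> float:
    """Two-sided p-value via the normal approximation (no tie/continuity correction)."""
    n1, n2 = len(a), len(b)
    # U counts pairs where a value from `a` exceeds one from `b`; ties count 0.5.
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2.0))


def ion_parameters(roi_a: List[float], roi_b: List[float]) -> Dict[str, Optional[float]]:
    """Compute (a) null fractions, (b) Mann-Whitney p-value, (c) log2 fold-change."""
    null_a = sum(1 for v in roi_a if v <= 0) / len(roi_a)
    null_b = sum(1 for v in roi_b if v <= 0) / len(roi_b)
    # Null peaks are discarded before computing (b) and (c).
    det_a = [v for v in roi_a if v > 0]
    det_b = [v for v in roi_b if v > 0]
    if not det_a or not det_b:
        return {"null_a": null_a, "null_b": null_b, "p_value": None, "log2_fc": None}
    return {
        "null_a": null_a,
        "null_b": null_b,
        "p_value": mann_whitney_p(det_a, det_b),
        "log2_fc": math.log2(mean(det_a) / mean(det_b)),
    }


def is_key_ion(params, max_null=0.5, alpha=0.05, min_abs_log2_fc=1.0) -> bool:
    """Hypothetical decision rule combining the three parameters."""
    if params["p_value"] is None:
        return False
    detected_enough = min(params["null_a"], params["null_b"]) <= max_null
    return (
        detected_enough
        and params["p_value"] < alpha
        and abs(params["log2_fc"]) >= min_abs_log2_fc
    )


# Hypothetical pixel intensities of one ion in two ROIs (0.0 = not detected).
roi_a = [0.0, 10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 10.0, 11.0, 13.0]
roi_b = [0.0, 0.0, 2.0, 3.0, 2.0, 4.0, 3.0, 2.0, 3.0, 2.0]
params = ion_parameters(roi_a, roi_b)
```

For real data, a vetted implementation of the Mann–Whitney U test from a statistics library would be preferable to the hand-written approximation used here.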

25 pages, 4407 KiB  
Article
MolNetEnhancer: Enhanced Molecular Networks by Integrating Metabolome Mining and Annotation Tools
by Madeleine Ernst, Kyo Bin Kang, Andrés Mauricio Caraballo-Rodríguez, Louis-Felix Nothias, Joe Wandy, Christopher Chen, Mingxun Wang, Simon Rogers, Marnix H. Medema, Pieter C. Dorrestein and Justin J.J. van der Hooft
Metabolites 2019, 9(7), 144; https://doi.org/10.3390/metabo9070144 - 16 Jul 2019
Cited by 262 | Viewed by 22024
Abstract
Metabolomics has started to embrace computational approaches for chemical interpretation of large data sets. Yet, metabolite annotation remains a key challenge. Recently, molecular networking and MS2LDA emerged as molecular mining tools that find molecular families and substructures in mass spectrometry fragmentation data. Moreover, in silico annotation tools obtain and rank candidate molecules for fragmentation spectra. Ideally, all structural information obtained and inferred from these computational tools could be combined to increase the resulting chemical insight one can obtain from a data set. However, integration is currently hampered as each tool has its own output format and efficient matching of data across these tools is lacking. Here, we introduce MolNetEnhancer, a workflow that combines the outputs from molecular networking, MS2LDA, in silico annotation tools (such as Network Annotation Propagation or DEREPLICATOR), and the automated chemical classification through ClassyFire to provide a more comprehensive chemical overview of metabolomics data whilst at the same time illuminating structural details for each fragmentation spectrum. We present examples from four plant and bacterial case studies and show how MolNetEnhancer enables the chemical annotation, visualization, and discovery of the subtle substructural diversity within molecular families. We conclude that MolNetEnhancer is a useful tool that greatly assists the metabolomics researcher in deciphering the metabolome through combination of multiple independent in silico pipelines.

12 pages, 2001 KiB  
Article
Visualization and Interpretation of Multivariate Associations with Disease Risk Markers and Disease Risk—The Triplot
by Tessa Schillemans, Lin Shi, Xin Liu, Agneta Åkesson, Rikard Landberg and Carl Brunius
Metabolites 2019, 9(7), 133; https://doi.org/10.3390/metabo9070133 - 6 Jul 2019
Cited by 11 | Viewed by 4630
Abstract
Metabolomics has emerged as a promising technique to understand relationships between environmental factors and health status. Through comprehensive profiling of small molecules in biological samples, metabolomics generates high-dimensional data objectively, reflecting exposures, endogenous responses, and health effects, thereby providing further insights into exposure-disease associations. However, the multivariate nature of metabolomics data contributes to high complexity in analysis and interpretation. Efficient visualization techniques of multivariate data that allow direct interpretation of combined exposures, metabolome, and disease risk, are currently lacking. We have therefore developed the ‘triplot’ tool, a novel algorithm that simultaneously integrates and displays metabolites through latent variable modeling (e.g., principal component analysis, partial least squares regression, or factor analysis), their correlations with exposures, and their associations with disease risk estimates or intermediate risk factors. This paper illustrates the framework of the ‘triplot’ using two synthetic datasets that explore associations between dietary intake, plasma metabolome, and incident type 2 diabetes or BMI, an intermediate risk factor for lifestyle-related diseases. Our results demonstrate advantages of triplot over conventional visualization methods in facilitating interpretation in multivariate risk modeling with high-dimensional data. Algorithms, synthetic data, and tutorials are open source and available in the R package ‘triplot’.

15 pages, 2531 KiB  
Article
Mass Spectrometry Data Repository Enhances Novel Metabolite Discoveries with Advances in Computational Metabolomics
by Hiroshi Tsugawa, Aya Satoh, Haruki Uchino, Tomas Cajka, Makoto Arita and Masanori Arita
Metabolites 2019, 9(6), 119; https://doi.org/10.3390/metabo9060119 - 24 Jun 2019
Cited by 40 | Viewed by 7752
Abstract
Mass spectrometry raw data repositories, including Metabolomics Workbench and MetaboLights, have contributed to increased transparency in metabolomics studies and the discovery of novel insights in biology by reanalysis with updated computational metabolomics tools. Herein, we reanalyzed the previously published lipidomics data from nine algal species, resulting in the annotation of 1437 lipids, a 40% increase in annotation compared to the previous results. Specifically, diacylglyceryl-carboxyhydroxy-methylcholine (DGCC) in Pavlova lutheri and Pleurochrysis carterae, glucuronosyldiacylglycerol (GlcADG) in Euglena gracilis and P. carterae, phosphatidylmethanol (PMeOH) in E. gracilis, and several oxidized phospholipids (oxidized phosphatidylcholine, OxPC; phosphatidylethanolamine, OxPE; phosphatidylglycerol, OxPG; phosphatidylinositol, OxPI) in Chlorella variabilis were newly characterized with the enriched lipid spectral databases. Moreover, we integrated the data from untargeted and targeted analyses from data-independent tandem mass spectrometry (DIA-MS/MS) acquisition, specifically the sequential window acquisition of all theoretical fragment-ion MS/MS (SWATH-MS/MS) spectra, to increase the lipidomic annotation coverage. After the creation of a global library of precursor and diagnostic ions of lipids by the MS-DIAL untargeted analysis, the co-eluted DIA-MS/MS spectra were resolved in MRMPROBS targeted analysis by tracing the specific product ions involved in acyl chain compositions. Our results indicated that the metabolite quantifications based on DIA-MS/MS chromatograms were somewhat inferior to the MS1-centric quantifications, while the annotation coverage outperformed those of the untargeted analysis of the data-dependent and DIA-MS/MS data. Consequently, integrated analyses of untargeted and targeted approaches are necessary to extract the maximum amount of metabolome information, and our results showcase the value of data repositories for the discovery of novel insights in lipid biology.

18 pages, 5687 KiB  
Article
Comparison of Bi- and Tri-Linear PLS Models for Variable Selection in Metabolomic Time-Series Experiments
by Qian Gao, Lars O. Dragsted and Timothy Ebbels
Metabolites 2019, 9(5), 92; https://doi.org/10.3390/metabo9050092 - 9 May 2019
Cited by 4 | Viewed by 4735
Abstract
Metabolomic studies with a time-series design are widely used for discovery and validation of biomarkers. In such studies, changes of metabolic profiles over time under different conditions (e.g., control and intervention) are compared, and metabolites responding differently between the conditions are identified as putative biomarkers. To incorporate time-series information into the variable (biomarker) selection in partial least squares regression (PLS) models, we created PLS models with different combinations of bilinear/trilinear X and group/time response dummy Y. In total, five PLS models were evaluated on two real datasets, and also on simulated datasets with varying characteristics (number of subjects, number of variables, inter-individual variability, intra-individual variability and number of time points). Variables showing specific temporal patterns observed visually and determined statistically were labelled as discriminating variables. Bootstrapped-VIP scores were calculated for variable selection and the variable selection performance of the five PLS models was assessed based on their capacity to correctly select the discriminating variables. The results showed that the bilinear PLS model with group × time response as dummy Y provided the highest recall (true positive rate) of 83–95% with high precision, independent of most characteristics of the datasets. Trilinear PLS models tend to select a small number of variables with high precision but a relatively high false negative rate (lower power). They are also less affected by noise compared to bilinear PLS models. In datasets with high inter-individual variability, bilinear PLS models tend to provide higher recall while trilinear models tend to provide higher precision. Overall, we recommend bilinear PLS with group × time response Y for variable selection applications in metabolomics intervention time series studies.
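
The recall and precision figures quoted in the abstract above compare the variables a model selects with the variables known to be discriminating. A minimal Python sketch of that bookkeeping follows; the function name `selection_metrics` and the metabolite labels are hypothetical, and the sketch covers only the evaluation step, not the PLS modelling or the bootstrapped VIP scoring.

```python
from typing import Iterable, Tuple


def selection_metrics(selected: Iterable[str], discriminating: Iterable[str]) -> Tuple[float, float]:
    """Return (precision, recall) of a variable selection against the known truth."""
    selected_set = set(selected)
    truth_set = set(discriminating)
    true_positives = len(selected_set & truth_set)
    # Precision: fraction of selected variables that are truly discriminating.
    precision = true_positives / len(selected_set) if selected_set else 0.0
    # Recall (true positive rate): fraction of discriminating variables recovered.
    recall = true_positives / len(truth_set) if truth_set else 0.0
    return precision, recall


# Hypothetical example: a model selects five variables, three of them correct.
precision, recall = selection_metrics(
    selected=["m1", "m2", "m3", "m7", "m8"],
    discriminating=["m1", "m2", "m3", "m4"],
)
```

In this toy case precision is 3/5 and recall is 3/4: a model that selects fewer variables can raise precision while lowering recall, which is the trade-off the abstract describes between trilinear and bilinear models.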

23 pages, 2914 KiB  
Article
CFM-ID 3.0: Significantly Improved ESI-MS/MS Prediction and Compound Identification
by Yannick Djoumbou-Feunang, Allison Pon, Naama Karu, Jiamin Zheng, Carin Li, David Arndt, Maheswor Gautam, Felicity Allen and David S. Wishart
Metabolites 2019, 9(4), 72; https://doi.org/10.3390/metabo9040072 - 13 Apr 2019
Cited by 200 | Viewed by 20827
Abstract
Metabolite identification for untargeted metabolomics is often hampered by the lack of experimentally collected reference spectra from tandem mass spectrometry (MS/MS). To circumvent this problem, Competitive Fragmentation Modeling-ID (CFM-ID) was developed to accurately predict electrospray ionization-MS/MS (ESI-MS/MS) spectra from chemical structures and to aid in compound identification via MS/MS spectral matching. While earlier versions of CFM-ID performed very well, CFM-ID’s performance for predicting the MS/MS spectra of certain classes of compounds, including many lipids, was quite poor. Furthermore, CFM-ID’s compound identification capabilities were limited because it did not use experimentally available MS/MS spectra nor did it exploit metadata in its spectral matching algorithm. Here, we describe significant improvements to CFM-ID’s performance and speed. These include (1) the implementation of a rule-based fragmentation approach for lipid MS/MS spectral prediction, which greatly improves the speed and accuracy of CFM-ID; (2) the inclusion of experimental MS/MS spectra and other metadata to enhance CFM-ID’s compound identification abilities; (3) the development of new scoring functions that improve CFM-ID’s accuracy by 21.1%; and (4) the implementation of a chemical classification algorithm that correctly classifies unknown chemicals (based on their MS/MS spectra) in >80% of the cases. This improved version called CFM-ID 3.0 is freely available as a web server. Its source code is also accessible online.

10 pages, 1056 KiB  
Article
MetaboAnalystR 2.0: From Raw Spectra to Biological Insights
by Jasmine Chong, Mai Yamamoto and Jianguo Xia
Metabolites 2019, 9(3), 57; https://doi.org/10.3390/metabo9030057 - 22 Mar 2019
Cited by 236 | Viewed by 18704
Abstract
Global metabolomics based on high-resolution liquid chromatography mass spectrometry (LC-MS) has been increasingly employed in recent large-scale multi-omics studies. Processing and interpretation of these complex metabolomics datasets have become a key challenge in current computational metabolomics. Here, we introduce MetaboAnalystR 2.0 for comprehensive LC-MS data processing, statistical analysis, and functional interpretation. Compared to the previous version, this new release seamlessly integrates XCMS and CAMERA to support raw spectral processing and peak annotation, and also features high-performance implementations of mummichog and GSEA approaches for predictions of pathway activities. The application and utility of the MetaboAnalystR 2.0 workflow were demonstrated using a synthetic benchmark dataset and a clinical dataset. In summary, MetaboAnalystR 2.0 offers a unified and flexible workflow that enables end-to-end analysis of LC-MS metabolomics data within the open-source R environment.

15 pages, 1765 KiB  
Article
Annotating Nontargeted LC-HRMS/MS Data with Two Complementary Tandem Mass Spectral Libraries
by Herbert Oberacher, Vera Reinstadler, Marco Kreidl, Michael A. Stravs, Juliane Hollender and Emma L. Schymanski
Metabolites 2019, 9(1), 3; https://doi.org/10.3390/metabo9010003 - 23 Dec 2018
Cited by 27 | Viewed by 7856
Abstract
Tandem mass spectral databases are indispensable for fast and reliable compound identification in nontargeted analysis with liquid chromatography–high resolution tandem mass spectrometry (LC-HRMS/MS), which is applied to a wide range of scientific fields. While many articles now review and compare spectral libraries, in this manuscript we investigate two high-quality and specialized collections from our respective institutes, recorded on different instruments (quadrupole time-of-flight or QqTOF vs. Orbitrap). The optimal range of collision energies for spectral comparison was evaluated using 233 overlapping compounds between the two libraries, revealing that spectra in the range of CE 20–50 eV on the QqTOF and 30–60 nominal collision energy units on the Orbitrap provided optimal matching results for these libraries. Applications to complex samples from the respective institutes revealed that the libraries, combined with a simple data mining approach to retrieve all spectra with precursor and fragment information, could confirm many validated target identifications and yield several new Level 2a (spectral match) identifications. While the results presented are not surprising in many ways, this article adds new results to the debate on the comparability of Orbitrap and QqTOF data and the application of spectral libraries to yield rapid and high-confidence tentative identifications in complex human and environmental samples.
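
Spectral library matching of the kind discussed in the abstract above rests on a similarity score between a measured and a reference MS/MS spectrum. The Python sketch below shows one common choice, a cosine score with greedy peak matching inside an m/z tolerance. It is a simplified illustration and not the scoring used in the paper: the function name `cosine_score`, the tolerance `mz_tol`, and the example spectra are assumptions, and no intensity or m/z weighting is applied.

```python
import math
from typing import List, Tuple

Spectrum = List[Tuple[float, float]]  # centroided peaks as (m/z, intensity)


def cosine_score(query: Spectrum, reference: Spectrum, mz_tol: float = 0.01) -> float:
    """Cosine similarity with greedy one-to-one peak matching within mz_tol."""
    # All candidate peak pairs within tolerance, largest intensity product first.
    candidates = sorted(
        (
            (q_int * r_int, i, j)
            for i, (q_mz, q_int) in enumerate(query)
            for j, (r_mz, r_int) in enumerate(reference)
            if abs(q_mz - r_mz) <= mz_tol
        ),
        reverse=True,
    )
    used_query, used_reference = set(), set()
    dot = 0.0
    for product, i, j in candidates:
        if i in used_query or j in used_reference:
            continue  # each peak may be matched only once
        used_query.add(i)
        used_reference.add(j)
        dot += product
    norm_query = math.sqrt(sum(intensity ** 2 for _, intensity in query))
    norm_reference = math.sqrt(sum(intensity ** 2 for _, intensity in reference))
    if norm_query == 0.0 or norm_reference == 0.0:
        return 0.0
    return dot / (norm_query * norm_reference)


# Hypothetical spectra sharing one fragment (m/z 100) but not the other.
measured = [(100.0, 3.0), (200.0, 4.0)]
library_entry = [(100.005, 3.0), (300.0, 4.0)]
score = cosine_score(measured, library_entry)
```

Because fragment intensities shift with collision energy and instrument type, a score like this depends on which spectra are compared, which is why the abstract evaluates collision energy ranges on each platform.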

14 pages, 3363 KiB  
Article
Mind the Gap: Mapping Mass Spectral Databases in Genome-Scale Metabolic Networks Reveals Poorly Covered Areas
by Clément Frainay, Emma L. Schymanski, Steffen Neumann, Benjamin Merlet, Reza M. Salek, Fabien Jourdan and Oscar Yanes
Metabolites 2018, 8(3), 51; https://doi.org/10.3390/metabo8030051 - 15 Sep 2018
Cited by 46 | Viewed by 9999
Abstract
The use of mass spectrometry-based metabolomics to study human, plant and microbial biochemistry and their interactions with the environment largely depends on the ability to annotate metabolite structures by matching mass spectral features of the measured metabolites to curated spectra of reference standards. While reference databases for metabolomics now provide information for hundreds of thousands of compounds, barely 5% of these known small molecules have experimental data from pure standards. Remarkably, it is still unknown how well existing mass spectral libraries cover the biochemical landscape of prokaryotic and eukaryotic organisms. To address this issue, we have investigated the coverage of 38 genome-scale metabolic networks by public and commercial mass spectral databases, and found that on average only 40% of nodes in metabolic networks could be mapped by mass spectral information from standards. Next, we deciphered computationally which parts of the human metabolic network are poorly covered by mass spectral libraries, revealing gaps in the eicosanoids, vitamins and bile acid metabolism. Finally, our network topology analysis based on the betweenness centrality of metabolites revealed the top 20 most important metabolites that, if added to MS databases, may facilitate human metabolome characterization in the future.
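
Betweenness centrality, used in the abstract above to rank metabolites, counts how often a node lies on shortest paths between other nodes. The Python sketch below implements Brandes' algorithm for a small unweighted, undirected graph. It is an illustration under stated assumptions: the function name `betweenness` and the four-node toy network are hypothetical and not taken from the paper, and real genome-scale metabolic networks are far larger and usually directed.

```python
from collections import deque
from typing import Dict, List


def betweenness(graph: Dict[str, List[str]]) -> Dict[str, float]:
    """Brandes' algorithm for an unweighted, undirected graph {node: [neighbours]}."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        order = []                              # nodes in order of increasing distance
        predecessors = {v: [] for v in graph}
        n_paths = {v: 0 for v in graph}         # number of shortest paths from source
        n_paths[source] = 1
        distance = {v: -1 for v in graph}
        distance[source] = 0
        queue = deque([source])
        while queue:                            # breadth-first search
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if distance[w] < 0:             # first time w is reached
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        dependency = {v: 0.0 for v in graph}
        while order:                            # accumulate from the farthest nodes back
            w = order.pop()
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # In an undirected graph every path is counted from both of its endpoints.
    return {v: value / 2.0 for v, value in score.items()}


# Hypothetical toy network: one hub metabolite connected to three others.
toy_network = {
    "glucose": ["G6P"],
    "G6P": ["glucose", "F6P", "6PG"],
    "F6P": ["G6P"],
    "6PG": ["G6P"],
}
centrality = betweenness(toy_network)
```

In the toy network the hub lies on all three shortest paths between the other metabolites and therefore scores 3, while the leaves score 0. Ranking nodes by such a score is one way to prioritise which missing reference standards would matter most.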

Review

30 pages, 3229 KiB  
Review
From Samples to Insights into Metabolism: Uncovering Biologically Relevant Information in LC-HRMS Metabolomics Data
by Julijana Ivanisevic and Elizabeth J. Want
Metabolites 2019, 9(12), 308; https://doi.org/10.3390/metabo9120308 - 17 Dec 2019
Cited by 74 | Viewed by 10190
Abstract
Untargeted metabolomics (including lipidomics) is a holistic approach to biomarker discovery and mechanistic insights into disease onset and progression, and response to intervention. Each step of the analytical and statistical pipeline is crucial for the generation of high-quality, robust data. Metabolite identification remains the bottleneck in these studies; therefore, confidence in the data produced is paramount in order to maximize the biological output. Here, we outline the key steps of the metabolomics workflow and provide details on important parameters and considerations. Studies should be designed carefully to ensure appropriate statistical power and adequate controls. Subsequent sample handling and preparation should avoid the introduction of bias, which can significantly affect downstream data interpretation. It is not possible to cover the entire metabolome with a single platform; therefore, the analytical platform should reflect the biological sample under investigation and the question(s) under consideration. The large, complex datasets produced need to be pre-processed in order to extract meaningful information. Finally, the most time-consuming steps are metabolite identification, as well as metabolic pathway and network analysis. Here we discuss some widely used tools and the pitfalls of each step of the workflow, with the ultimate aim of guiding the reader towards the most efficient pipeline for their metabolomics studies.

15 pages, 885 KiB  
Review
Metabolic Modeling of Human Gut Microbiota on a Genome Scale: An Overview
by Partho Sen and Matej Orešič
Metabolites 2019, 9(2), 22; https://doi.org/10.3390/metabo9020022 - 28 Jan 2019
Cited by 56 | Viewed by 12624
Abstract
There is growing interest in the metabolic interplay between the gut microbiome and host metabolism. Taxonomic and functional profiling of the gut microbiome by next-generation sequencing (NGS) has unveiled substantial richness and diversity. However, the mechanisms underlying interactions between diet, gut microbiome and host metabolism are still poorly understood. Genome-scale metabolic modeling (GSMM) is an emerging approach that has been increasingly applied to infer diet–microbiome, microbe–microbe and host–microbe interactions under physiological conditions. GSMM can, for example, be applied to estimate the metabolic capabilities of microbes in the gut. Here, we discuss how meta-omics datasets, such as shotgun metagenomics, can be processed and integrated to develop large-scale, condition-specific, personalized microbiota models in healthy and disease states. Furthermore, we summarize various tools and resources available for metagenomic data processing and GSMM, highlighting the experimental approaches needed to validate the model predictions.

Other

35 pages, 15010 KiB  
Protocol
“Notame”: Workflow for Non-Targeted LC–MS Metabolic Profiling
by Anton Klåvus, Marietta Kokla, Stefania Noerman, Ville M. Koistinen, Marjo Tuomainen, Iman Zarei, Topi Meuronen, Merja R. Häkkinen, Soile Rummukainen, Ambrin Farizah Babu, Taisa Sallinen, Olli Kärkkäinen, Jussi Paananen, David Broadhurst, Carl Brunius and Kati Hanhineva
Metabolites 2020, 10(4), 135; https://doi.org/10.3390/metabo10040135 - 31 Mar 2020
Cited by 79 | Viewed by 13463
Abstract
Metabolomics analysis generates vast arrays of data, necessitating comprehensive workflows involving expertise in analytics, biochemistry and bioinformatics in order to provide coherent and high-quality data that enable discovery of robust and biologically significant metabolic findings. In this protocol article, we introduce notame, an analytical workflow for non-targeted metabolic profiling approaches, utilizing liquid chromatography–mass spectrometry analysis. We provide an overview of lab protocols and statistical methods that we commonly practice for the analysis of nutritional metabolomics data. The paper is divided into three main sections: the first and second sections introduce the background and the study designs available for metabolomics research, and the third section describes in detail the steps of the main methods and protocols used to produce, preprocess and statistically analyze metabolomics data and, finally, to identify and interpret the compounds that have emerged as interesting.