Identifying and Analyzing Topic Clusters in a Nutri-, Food-, and Diet-Proteomic Corpus Using Machine Reading
Abstract
1. Introduction
2. Materials and Methods
2.1. Querying and Document Parsing
2.2. Document Annotation
2.3. Co-Mention Analysis
2.4. Document Clustering
2.5. Disease Annotation
3. Results
3.1. Query Results
3.2. Document Annotation
3.3. Co-Mention Analysis
3.4. Document Clustering
3.5. Simple Quantitative Characteristics of the Thematic Clusters
3.6. Disease Annotation
3.7. From Group Level Data to Individual Papers
3.7.1. Cluster A—Protein, Liver, Diet, Mouse, Fatty, Rat, Disease, Obesity, Plasma, Muscle
3.7.2. Cluster B—Milk, Protein, Peptide, Human, Infant, Milk Fat Globule Membrane, Colostrum, Lactation, Bovine, Membrane
3.7.3. Cluster C—Protein, Plant, Seed, Food, Allergen, Gluten, Wheat, Soybean, Crop, Fruit
3.7.4. Cluster E—Gut, Probiotic, Microbiota, Protein, Intestinal, Cell, Microbiome, Bacterial, Human, Host
3.7.5. Cluster G—Cell, Protein, Cancer, Meat, Colorectal, Quality, Proteomics, Fish, Study, Muscle
4. Discussions and Conclusions
4.1. Data-Driven Analysis and Organization of Literature
4.2. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Lee, M.B.; Hill, C.M.; Bitto, A.; Kaeberlein, M. Antiaging diets: Separating fact from fiction. Science 2021, 374, eabe7365.
- Afman, L.; Milenkovic, D.; Roche, H.M. Nutritional aspects of metabolic inflammation in relation to health--insights from transcriptomic biomarkers in PBMC of fatty acids and polyphenols. Mol. Nutr. Food Res. 2014, 58, 1708–1720.
- Maruvada, P.; Lampe, J.W.; Wishart, D.S.; Barupal, D.; Chester, D.N.; Dodd, D.; Djoumbou-Feunang, Y.; Dorrestein, P.C.; Dragsted, L.O.; Draper, J.; et al. Perspective: Dietary Biomarkers of Intake and Exposure—Exploration with Omics Approaches. Adv. Nutr. 2020, 11, 200–215.
- Landberg, R.; Hanhineva, K.; Tuohy, K.; Garcia-Aloy, M.; Biskup, I.; Llorach, R.; Yin, X.; Brennan, L.; Kolehmainen, M. Biomarkers of cereal food intake. Genes Nutr. 2019, 14, 28.
- Cuparencu, C.; Praticó, G.; Hemeryck, L.Y.; Sri Harsha, P.S.C.; Noerman, S.; Rombouts, C.; Xi, M.; Vanhaecke, L.; Hanhineva, K.; Brennan, L.; et al. Biomarkers of meat and seafood intake: An extensive literature review. Genes Nutr. 2019, 14, 35.
- Mathias, M.G.; Coelho-Landell, C.D.A.; Scott-Boyer, M.P.; Lacroix, S.; Morine, M.J.; Salomao, R.G.; Toffano, R.B.D.; Almada, M.O.R.D.V.; Camarneiro, J.M.; Hillesheim, E.; et al. Clinical and Vitamin Response to a Short-Term Multi-Micronutrient Intervention in Brazilian Children and Teens: From Population Data to Interindividual Responses. Mol. Nutr. Food Res. 2018, 62, 1700613.
- Lundberg, M.; Eriksson, A.; Tran, B.; Assarsson, E.; Fredriksson, S. Homogeneous antibody-based proximity extension assays provide sensitive and specific detection of low-abundant proteins in human blood. Nucleic Acids Res. 2011, 39, e102.
- Dayon, L.L.; Núñez Galindo, A.; Corthesy, J.; Cominetti, O.; Kussmann, M. Comprehensive and Scalable Highly Automated MS-Based Proteomic Workflow for Clinical Biomarker Discovery in Human Plasma. J. Proteome Res. 2014, 13, 3837–3845.
- Cominetti, O.; Galindo, A.N.; Corthesy, J.; Moreno, S.O.; Irincheeva, I.; Valsesia, A.; Astrup, A.; Saris, W.H.M.; Hager, J.; Kussmann, M.; et al. Proteomic Biomarker Discovery in 1000 Human Plasma Samples with Mass Spectrometry. J. Proteome Res. 2016, 15, 389–399.
- Gold, L.; Ayers, D.; Bertino, J.; Bock, C.; Bock, A.; Brody, E.N.; Carter, J.; Dalby, A.B.; Eaton, B.E.; Fitzwater, T.; et al. Aptamer-based multiplexed proteomic technology for biomarker discovery. PLoS ONE 2010, 5, e15004.
- Morine, M.J.; Priami, C.; Coronado, E.; Haber, J.; Kaput, J. A Comprehensive and Holistic Health Database. In Proceedings of the 2022 IEEE International Conference on Digital Health (ICDH), Barcelona, Spain, 10–16 July 2022; pp. 202–207.
- Lee, J.; Yoon, W.; Kim, S.; Kim, D.; Kim, S.; So, C.H.; Kang, J. BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 2020, 36, 1234–1240.
- Lacroix, S.; Klicic Badoux, J.; Scott-Boyer, M.P.; Parolo, S.; Matone, A.; Priami, C.; Morine, M.J.; Kaput, J.; Moco, S. A computationally driven analysis of the polyphenol-protein interactome. Sci. Rep. 2018, 8, 2232.
- Leaman, R.; Doǧan, R.I.; Lu, Z. DNorm: Disease name normalization with pairwise learning to rank. Bioinformatics 2013, 29, 2909–2917.
- Wei, C.H.; Kao, H.Y.; Lu, Z. GNormPlus: An Integrative Approach for Tagging Genes, Gene Families, and Protein Domains. BioMed Res. Int. 2015, 2015.
- Van Der Maaten, L.; Hinton, G. Visualizing Data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
- Van Der Maaten, L.; Hinton, G. Visualizing non-metric similarities in multiple maps. Mach. Learn. 2012, 87, 33–55.
- Sherman, B.T.; Hao, M.; Qiu, J.; Jiao, X.; Baseler, M.W.; Lane, H.C.; Imamichi, T.; Chang, W. DAVID: A web server for functional enrichment analysis and functional annotation of gene lists (2021 update). Nucleic Acids Res. 2022, 50, W216–W221.
- Szklarczyk, D.; Gable, A.L.; Nastou, K.C.; Lyon, D.; Kirsch, R.; Pyysalo, S.; Doncheva, N.T.; Legeay, M.; Fang, T.; Bork, P.; et al. The STRING database in 2021: Customizable protein-protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Res. 2021, 49, D605–D612.
- Chen, L.; Zhang, Y.H.; Wang, S.P.; Zhang, Y.H.; Huang, T.; Cai, Y.D. Prediction and analysis of essential genes using the enrichments of gene ontology and KEGG pathways. PLoS ONE 2017, 12, e0184129.
- Walker, M.E.; Song, R.J.; Xu, X.; Gerszten, R.E.; Ngo, D.; Clish, C.B.; Corlin, L.; Ma, J.; Xanthakis, V.; Jacques, P.F.; et al. Proteomic and metabolomic correlates of healthy dietary patterns: The framingham heart study. Nutrients 2020, 12, 1476.
- Kim, Y.; Lu, S.; Ho, J.E.; Hwang, S.J.; Yao, C.; Huan, T.; Levy, D.; Ma, J. Proteins as mediators of the association between diet quality and incident cardiovascular disease and all-cause mortality: The framingham heart study. J. Am. Heart Assoc. 2021, 10, e021245.
- Maitiabola, G.; Tian, F.; Sun, H.; Zhang, L.; Gao, X.; Xue, B.; Wang, X. Proteome characteristics of liver tissue from patients with parenteral nutrition-associated liver disease. Nutr. Metab. 2020, 17, 43.
- Yubero-Serrano, E.M.; Fernandez-Gandara, C.; Garcia-Rios, A.; Rangel-Zuñiga, O.A.; Gutierrez-Mariscal, F.M.; Torres-Peña, J.D.; Marin, C.; Lopez-Moreno, J.; Castaño, J.P.; Delgado-Lista, J.; et al. Mediterranean diet and endothelial function in patients with coronary heart disease: An analysis of the CORDIOPREV randomized controlled trial. PLoS Med. 2020, 17, e1003282.
- Valsesia, A.; Chakrabarti, A.; Hager, J.; Langin, D.; Saris, W.H.M.; Astrup, A.; Blaak, E.E.; Viguerie, N.; Masoodi, M. Integrative phenotyping of glycemic responders upon clinical weight loss using multi-omics. Sci. Rep. 2020, 10, 9236.
- Manoni, M.; Di Lorenzo, C.; Ottoboni, M.; Tretola, M.; Pinotti, L. Comparative Proteomics of Milk Fat Globule Membrane (MFGM) Proteome across Species and Lactation Stages and the Potentials of MFGM Fractions in Infant Formula Preparation. Foods 2020, 9, 1251.
- Cao, X.; Zheng, Y.; Wu, S.; Yang, N.; Wu, J.; Liu, B.; Ye, W.; Yang, M.; Yue, X. Characterization and comparison of milk fat globule membrane N-glycoproteomes from human and bovine colostrum and mature milk. Food Funct. 2019, 10, 5046–5058.
- Lu, J.; Wang, X.; Zhang, W.; Liu, L.; Pang, X.; Zhang, S.; Lv, J. Comparative proteomics of milk fat globule membrane in different species reveals variations in lactation and nutrition. Food Chem. 2016, 196, 665–672.
- Yang, M.; Deng, W.; Cao, X.; Wang, L.; Yu, N.; Zheng, Y.; Wu, J.; Wu, R.; Yue, X. Quantitative Phosphoproteomics of Milk Fat Globule Membrane in Human Colostrum and Mature Milk: New Insights into Changes in Protein Phosphorylation during Lactation. J. Agric. Food Chem. 2020, 68, 4546–4556.
- Dingess, K.A.; Li, C.; Zhu, J. Human milk proteome: What’s new? Curr. Opin. Clin. Nutr. Metab. Care 2021, 24, 252–258.
- Holm, M.; Saraswat, M.; Joenväärä, S.; Seppo, A.; Looney, R.J.; Tohmola, T.; Renkonen, J.; Renkonen, R.; Järvinen, K.M. Quantitative glycoproteomics of human milk and association with atopic disease. PLoS ONE 2022, 17, e0267967.
- Afzal, M.; Pfannstiel, J.; Zimmermann, J.; Bischoff, S.C.; Würschum, T.; Longin, C.F.H. High-resolution proteomics reveals differences in the proteome of spelt and bread wheat flour representing targets for research on wheat sensitivities. Sci. Rep. 2020, 10, 14677.
- Kumar, A.; Anju, T.; Kumar, S.; Chhapekar, S.S.; Sreedharan, S.; Singh, S.; Choi, S.R.; Ramchiary, N.; Lim, Y.P. Integrating omics and gene editing tools for rapid improvement of traditional food plants for diversified and sustainable food security. Int. J. Mol. Sci. 2021, 22, 8093.
- Chai, Y.N.; Qin, J.; Tong, Y.L.; Liu, G.H.; Wang, X.R.; Liu, C.Y.; Peng, M.H.; Qin, C.Z.; Xing, Y.R. TMT proteomics analysis of intestinal tissue from patients of irritable bowel syndrome with diarrhea: Implications for multiple nutrient ingestion abnormality. J. Proteom. 2021, 231, 103995.
- Mindikoglu, A.L.; Abdulsada, M.M.; Jain, A.; Jalal, P.K.; Devaraj, S.; Wilhelm, Z.R.; Opekun, A.R.; Jung, S.Y. Intermittent fasting from dawn to sunset for four consecutive weeks induces anticancer serum proteome response and improves metabolic syndrome. Sci. Rep. 2020, 10, 18341.
- Shen, X.; Li, Y.; Sun, G.; Guo, D.; Bai, X. miR-181c-3p and -5p Promotes High-Glucose-Induced Dysfunction in Human Umbilical Vein Endothelial Cells by Regulating Leukemia Inhibitory Factor; Elsevier: Amsterdam, The Netherlands, 2018; Volume 115, ISBN 860451859.
- Handjieva-Darlenska, T.; Handjiev, S.; Larsen, T.M.; Van Baak, M.A.; Jebb, S.; Papadaki, A.; Pfeiffer, A.F.H.; Martinez, J.A.; Kunesova, M.; Holst, C.; et al. Initial weight loss on an 800-kcal diet as a predictor of weight loss success after 8 weeks: The Diogenes study. Eur. J. Clin. Nutr. 2010, 64, 994–999.
- Jung, T.W.; Yoo, H.J.; Choi, K.M. Implication of hepatokines in metabolic disorders and cardiovascular diseases. BBA Clin. 2016, 5, 108–113.
- Thumser, A.E.; Moore, J.B.; Plant, N.J. Fatty acid binding proteins: Tissue-specific functions in health and disease. Curr. Opin. Clin. Nutr. Metab. Care 2014, 17, 124–129.
- Ludka-Gaulke, T.; Ghera, P.; Waring, S.C.; Keifer, M.; Serogy, C.; Gern, J.E.; Kirkhorn, S. Farm Exposure in Early Childhood is Associated with a Lower Risk of Severe Respiratory Illnesses. J. Allergy Clin. Immunol. 2018, 141, 454–456.
- Li, L.; Wang, P.; Yan, J.; Wang, Y.; Li, S.; Jiang, J.; Sun, Z.; Tang, B.; Chang, T.H.; Wang, S.; et al. Real-world data medical knowledge graph: Construction and applications. Artif. Intell. Med. 2020, 103, 101817.
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2018, arXiv:1810.04805.
- Cho, H.; Lee, H. Biomedical named entity recognition using deep neural networks with contextual information. BMC Bioinform. 2019, 20, 735.
- Cenikj, G.; Eftimov, T.; Seljak, B.K. SAFFRON: Transfer Learning For Food-Disease Relation extraction. In Proceedings of the 20th Workshop on Biomedical Language Processing, Online, June 2021; pp. 30–40.
- Zhu, L.; Zheng, H. Biomedical event extraction with a novel combination strategy based on hybrid deep neural networks. BMC Bioinform. 2020, 21, 1–12.
- Boyack, K.W.; Newman, D.; Duhon, R.J.; Klavans, R.; Patek, M.; Biberstine, J.R.; Schijvenaars, B.; Skupin, A.; Ma, N.; Börner, K. Clustering more than two million biomedical publications: Comparing the accuracies of nine text-based similarity approaches. PLoS ONE 2011, 6, e18029.
- Abdelkader, W.; Navarro, T.; Parrish, R.; Cotoi, C.; Germini, F.; Iorio, A.; Haynes, R.B.; Lokker, C. Machine learning approaches to retrieve high-quality, clinically relevant evidence from the biomedical literature: Systematic review. JMIR Med. Inform. 2021, 9, e30401.
- Rossanez, A.; dos Reis, J.C.; Torres, R.d.S.; de Ribaupierre, H. KGen: A knowledge graph generator from biomedical scientific literature. BMC Med. Inform. Decis. Mak. 2020, 20, 314.
- Xu, J.; Kim, S.; Song, M.; Jeong, M.; Kim, D.; Kang, J.; Rousseau, J.F.; Li, X.; Xu, W.; Torvik, V.I.; et al. Building a PubMed knowledge graph. Sci. Data 2020, 7, 205.
- De Barros, T.T.; Venâncio, V.D.P.; Hernandes, L.C.; Greggi Antunes, L.M.; Hillesheim, E.; Salomão, R.G.; Mathias, M.G.; Coelho-Landell, C.A.; Toffano, R.B.D.; Do Vale Almada, M.O.R.; et al. DNA damage is inversely associated to blood levels of DHA and EPA fatty acids in Brazilian children and adolescents. Food Funct. 2020, 11, 5115–5121.
- Almada, M.O.R.D.V.; Almeida, A.C.F.; Ued, F.d.V.; Mathias, M.G.; Coelho-Landell, C.d.A.; Salomão, R.G.; Toffano, R.B.D.; Camarneiro, J.M.; Hillesheim, E.; de Barros, T.T.; et al. Metabolic groups related to blood vitamin levels and inflammatory biomarkers in Brazilian children and adolescents. J. Nutr. Sci. Vitaminol. 2020, 66, 515–525.
- Coelho-Landell, C.A.; Salomão, R.G.; Almada, M.O.R.d.V.; Mathias, M.G.; Toffano, R.B.D.; Hillesheim, E.; Barros, T.T.; Camarneiro, J.M.; Camelo-Junior, J.S.; Rosa, J.C.; et al. Metabo groups in response to micronutrient intervention: Pilot study. Food Sci. Nutr. 2020, 8, 683–693.
- Kang, T.; Zou, S.; Weng, C. Pretraining to recognize piCO elements from randomized controlled trial literature. Stud. Health Technol. Inform. 2019, 264, 188–192.
- Davagdorj, K.; Wang, L.; Li, M.; Pham, V.H.; Ryu, K.H.; Theera-Umpon, N. Discovering Thematically Coherent Biomedical Documents Using Contextualized Bidirectional Encoder Representations from Transformers-Based Clustering. Int. J. Environ. Res. Public Health 2022, 19, 5893.
Query | PubMed Terms |
---|---|
protnutr_mh | Proteomics [MH] AND “Diet, Food, and Nutrition” [MH] and Human [MH] |
protnutr_majr | Proteomics [MAJR] AND “Diet, Food, and Nutrition” [MAJR] and Human [MH] |
protnutr_ab | (proteomics [TIAB] OR “DNA aptamer” [TIAB] OR Somascan [TIAB]) and (“Nutrition” [TIAB] OR “Nutritional” [TIAB]) AND (Human [MH] OR Human [TIAB] or individuals [TIAB] or patients [TIAB] or participants [TIAB] or subjects [TIAB]) |

Query | PubMed Central Terms |
---|---|
protnutr_mh | Proteomics [MH] AND “Diet, Food, and Nutrition” [MH] AND Human [MH] AND (open access [filter] OR author manuscript [filter]) |
protnutr_majr | Proteomics [MH] AND “Diet, Food, and Nutrition” [MH] AND Human [MH] AND (open access [filter] OR author manuscript [filter]) |
protnutr_ab | (proteomics [Abstract] OR “DNA aptamer” [Abstract] OR Somascan [Abstract]) and (“Nutrition” [Abstract] OR “Nutritional” [Abstract]) AND (Human [MH] OR Human [TIAB] or individuals [Abstract] or patients [Abstract] or participants [Abstract] or subjects [Abstract]) AND (open access [filter] OR author manuscript [filter]) |
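
As an illustration of how a query from the table above could be submitted programmatically, the following minimal sketch builds a request URL for the NCBI E-utilities ESearch endpoint. It is not the retrieval code used in the study: the helper name `esearch_url`, the `retmax` value, and the normalization of the query strings (straight quotes, upper-case `AND`) are assumptions made for this example, and no network request is performed.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Two PubMed queries transcribed from the table above.
# Assumption: curly quotes replaced by straight quotes and Boolean operators upper-cased.
PUBMED_QUERIES = {
    "protnutr_mh": 'Proteomics [MH] AND "Diet, Food, and Nutrition" [MH] AND Human [MH]',
    "protnutr_majr": 'Proteomics [MAJR] AND "Diet, Food, and Nutrition" [MAJR] AND Human [MH]',
}


def esearch_url(term, db="pubmed", retmax=100):
    """Return an ESearch URL for `term` (hypothetical helper; builds the URL only)."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    for name, term in PUBMED_QUERIES.items():
        print(name, esearch_url(term))
```

Passing `db="pmc"` together with a term from the PubMed Central table would target PubMed Central instead; fetching the URL and paging through results are left out of this sketch.
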
Cluster | Theme | Cluster Size | Median SJR | Median Year | Earliest | Latest |
---|---|---|---|---|---|---|
A | protein, liver, diet, mouse, fatty, rat, disease, obesity, plasma, muscle | 112 | 1.189 | 2017 | 2001 | 2022 |
B | milk, protein, peptide, human, infant, milk fat globule membrane, colostrum, lactation, bovine, membrane | 101 | 1.189 | 2017 | 2006 | 2022 |
C | protein, plant, seed, food, allergen, gluten, wheat, soybean, crop, fruit | 181 | 1.279 | 2018 | 2002 | 2022 |
D | gut, probiotic, microbiota, protein, intestinal, cell, microbiome, bacterial, human, host | 158 | 0.934 | 2012 | 2000 | 2022 |
E | nutrition, nutritional, food, research, nutrigenomics, health, metabolomics, disease, genomics, science | 104 | 1.085 | 2017 | 2004 | 2022 |
F | cell, protein, cancer, meat, colorectal, quality, proteomics, fish, study, muscle | 156 | 1.323 | 2017 | 2003 | 2022 |
G | protein, egg, salivary, plasma, child, biomarkers, saliva, disease, proteomics, proteome | 133 | 1.189 | 2016 | 2003 | 2022 |
Cluster | Protein_ID | Name | #/Cluster | p-Value |
---|---|---|---|---|
A | INS | insulin | 21 | 3.14 × 10−7 |
A | ADIPOQ | adiponectin, C1Q and collagen domain containing | 5 | 7.43 × 10−3 |
A | APOE | apolipoprotein E | 5 | 1.84 × 10−2 |
A | HP | haptoglobin | 4 | 2.75 × 10−2 |
A | Insr | insulin receptor | 3 | 4.03 × 10−2 |
A | PRKAA2 | protein kinase AMP-activated catalytic subunit alpha 2 | 3 | 4.03 × 10−2 |
A | Srebf1 | sterol regulatory element binding transcription factor 1 | 3 | 4.03 × 10−2 |
A | TNF | tumor necrosis factor | 5 | 4.88 × 10−2 |
A | CLU | clusterin | 3 | 6.33 × 10−2 |
A | TF | transferrin | 3 | 6.33 × 10−2 |
A | VIM | vimentin | 3 | 6.33 × 10−2 |
A | CRP | C-reactive protein | 4 | 7.67 × 10−2 |
A | TTR | transthyretin | 4 | 7.67 × 10−2 |
A | APOA4 | apolipoprotein A4 | 3 | 9.10 × 10−2 |
A | APOC3 | apolipoprotein C3 | 3 | 9.10 × 10−2 |
A | Apoa1 | apolipoprotein A1 | 2 | 9.86 × 10−2 |
A | B2M | beta-2-microglobulin | 2 | 9.86 × 10−2 |
A | Irs1 | insulin receptor substrate 1 | 2 | 9.86 × 10−2 |
A | PON1 | paraoxonase 1 | 2 | 9.86 × 10−2 |
A | PUM3 | pumilio RNA binding family member 3 | 2 | 9.86 × 10−2 |
A | Slc2a4 | solute carrier family 2 member 4 | 2 | 9.86 × 10−2 |
A | VDR | vitamin D receptor | 2 | 9.86 × 10−2 |
B | MFGE8 | milk fat globule EGF with factor V/VIII domain | 25 | 2.90 × 10−8 |
B | LALBA | lactalbumin alpha | 9 | 7.72 × 10−4 |
B | LYZ | lysozyme | 6 | 2.24 × 10−2 |
B | CSN2 | casein beta | 5 | 1.29 × 10−2 |
B | LTF | lactotransferrin | 4 | 2.66 × 10−2 |
B | MfgE8 | milk fat globule EGF with factor V/VIII domain | 4 | 8.48 × 10−2 |
C | IGHE | immunoglobulin heavy epsilon chain | 4 | 2.57 × 10−2 |
C | NT5C3A | 5 prime-nucleotidase, cytosolic IIIA | 3 | 2.77 × 10−2 |
C | LOC112695262 | | 2 | 7.67 × 10−2 |
D | FGB | fibrinogen beta chain | 3 | 1.52 × 10−2 |
D | LEP | leptin | 3 | 2.46 × 10−2 |
D | FGG | fibrinogen gamma chain | 2 | 5.11 × 10−2 |
D | Ldlr | low density lipoprotein receptor | 2 | 5.11 × 10−2 |
D | NOS1 | nitric oxide synthase 1 | 2 | 5.11 × 10−2 |
D | Nos1 | murine nitric oxide synthase 1 | 2 | 5.11 × 10−2 |
D | TXN | thioredoxin | 2 | 5.11 × 10−2 |
D | APOB | apolipoprotein B | 2 | 7.98 × 10−2 |
E | ABCB1 | ATP binding cassette subfamily B member 1 | 3 | 1.41 × 10−2 |
E | ABCC2 | ATP binding cassette subfamily C member 2 | 3 | 1.41 × 10−2 |
E | ABCC3 | ATP binding cassette subfamily C member 3 | 3 | 1.41 × 10−2 |
E | ABCG2 | ATP binding cassette subfamily G member 2 | 2 | 4.87 × 10−2 |
E | CASP3 | caspase 3 | 2 | 4.87 × 10−2 |
E | Gusb | glucuronidase beta | 4 | 7.06 × 10−3 |
E | HSP90AA1 | heat shock protein 90 alpha family class A member 1 | 2 | 7.61 × 10−2 |
E | IL10 | interleukin 10 | 2 | 4.87 × 10−2 |
E | SLC15A1 | solute carrier family 15 member 1 | 3 | 1.41 × 10−2 |
E | SLCO2B1 | solute carrier organic anion transporter 2B1 | 2 | 4.87 × 10−2 |
G | AKT1 | AKT serine/threonine kinase 1 | 3 | 2.95 × 10−2 |
G | MTOR | mechanistic target of rapamycin kinase | 3 | 2.95 × 10−2 |
G | Crtc1 | CREB regulated transcription coactivator 1 | 2 | 5.79 × 10−2 |
G | SOD1 | superoxide dismutase 1 | 2 | 5.79 × 10−2 |
G | TNFSF10 | TNF superfamily member 10 | 2 | 5.79 × 10−2 |
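
The statistical test behind the p-values in the table above is not stated in this excerpt. As a hedged illustration only, one common way to score over-representation of a protein in a cluster is a one-sided hypergeometric test, sketched below with the standard library. The function name `hypergeom_sf` is hypothetical, and the corpus-wide count `K = 40` is an invented placeholder, not a value from the study; `N = 945` assumes that each document belongs to exactly one of the seven clusters, so that the cluster sizes sum to the corpus size.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for a hypergeometric variable.

    N: documents in the corpus
    K: documents in the corpus that mention the protein
    n: documents in the cluster
    k: documents in the cluster that mention the protein
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the exact probabilities of observing k, k+1, ..., upper mentions.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    N = 945   # assumption: sum of the seven cluster sizes
    n = 112   # size of cluster A
    k = 21    # documents in cluster A mentioning INS (from the table above)
    K = 40    # hypothetical corpus-wide count, for illustration only
    print(f"one-sided p-value: {hypergeom_sf(k, N, K, n):.3g}")
```

Because `K` is a placeholder, the printed value is not expected to reproduce the tabulated p-value; in practice a correction for multiple testing would also be applied across proteins.
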
Cluster | Term ID | Term Description | Observed Protein Count | Background Protein Count | False Discovery Rate |
---|---|---|---|---|---|
A | DOID:0060158 | Acquired metabolic disease | 42 | 320 | 4.98 × 10−27 |
A | DOID:0014667 | Disease of metabolism | 64 | 997 | 1.02 × 10−26 |
A | DOID:4 | Disease | 138 | 5921 | 4.54 × 10−20 |
A | DOID:9120 | Amyloidosis | 19 | 70 | 1.96 × 10−16 |
A | DOID:7 | Disease of anatomical entity | 109 | 4452 | 3.59 × 10−15 |
A | DOID:4194 | Glucose metabolism disease | 21 | 125 | 8.46 × 10−15 |
A | DOID:0050828 | Artery disease | 20 | 118 | 3.34 × 10−14 |
A | DOID:178 | Vascular disease | 24 | 223 | 2.44 × 10−13 |
A | DOID:9351 | Diabetes mellitus | 19 | 118 | 3.82 × 10−13 |
A | DOID:1287 | Cardiovascular system disease | 31 | 454 | 1.21 × 10−12 |
B | DOID:0014667 | Disease of metabolism | 32 | 997 | 7.14 × 10−11 |
B | DOID:0050161 | Lower respiratory tract disease | 16 | 206 | 1.24 × 10−9 |
B | DOID:1579 | Respiratory system disease | 17 | 263 | 2.40 × 10−9 |
B | DOID:0060158 | Acquired metabolic disease | 18 | 320 | 3.47 × 10−9 |
B | DOID:4 | Disease | 73 | 5921 | 1.42 × 10−8 |
B | DOID:0050828 | Artery disease | 12 | 118 | 2.09 × 10−8 |
B | DOID:850 | Lung disease | 13 | 172 | 8.18 × 10−8 |
B | DOID:326 | Ischemia | 7 | 23 | 2.91 × 10−7 |
B | DOID:77 | Gastrointestinal system disease | 19 | 510 | 3.18 × 10−7 |
B | DOID:552 | Pneumonia | 7 | 25 | 3.81 × 10−7 |
D | DOID:0014667 | Disease of metabolism | 32 | 997 | 6.28 × 10−13 |
D | DOID:4 | Disease | 71 | 5921 | 9.45 × 10−12 |
D | DOID:0050636 | Familial visceral amyloidosis | 9 | 21 | 4.70 × 10−11 |
D | DOID:0060158 | Acquired metabolic disease | 18 | 320 | 2.45 × 10−10 |
D | DOID:9120 | Amyloidosis | 11 | 70 | 4.55 × 10−10 |
D | DOID:0050828 | Artery disease | 10 | 118 | 1.19 × 10−6 |
D | DOID:7 | Disease of anatomical entity | 52 | 4452 | 1.31 × 10−6 |
D | DOID:178 | Vascular disease | 12 | 223 | 2.55 × 10−6 |
D | DOID:1247 | Blood coagulation disease | 8 | 76 | 7.61 × 10−6 |
D | DOID:0050161 | Lower respiratory tract disease | 11 | 206 | 1.00 × 10−5 |
E | DOID:178 | Vascular disease | 14 | 223 | 1.01 × 10−8 |
E | DOID:4 | Disease | 58 | 5921 | 1.01 × 10−8 |
E | DOID:7 | Disease of anatomical entity | 50 | 4452 | 1.01 × 10−8 |
E | DOID:326 | Ischemia | 7 | 23 | 5.33 × 10−8 |
E | DOID:77 | Gastrointestinal system disease | 16 | 510 | 9.92 × 10−7 |
E | DOID:2914 | Immune system disease | 17 | 611 | 1.37 × 10−6 |
E | DOID:0014667 | Disease of metabolism | 21 | 997 | 1.47 × 10−6 |
E | DOID:0050828 | Artery disease | 9 | 118 | 2.32 × 10−6 |
E | DOID:11162 | Respiratory failure | 5 | 10 | 2.32 × 10−6 |
E | DOID:552 | Pneumonia | 6 | 25 | 2.32 × 10−6 |
F | DOID:0060158 | Acquired metabolic disease | 10 | 320 | 1.43 × 10−7 |
F | DOID:10652 | Alzheimers disease | 6 | 35 | 1.43 × 10−7 |
F | DOID:9351 | Diabetes mellitus | 7 | 118 | 7.14 × 10−7 |
F | DOID:9352 | Type 2 diabetes mellitus | 5 | 29 | 9.97 × 10−7 |
F | DOID:10763 | Hypertension | 5 | 36 | 2.33 × 10−6 |
F | DOID:0014667 | Disease of metabolism | 12 | 997 | 5.77 × 10−6 |
F | DOID:0050828 | Artery disease | 6 | 118 | 1.19 × 10−5 |
F | DOID:4 | Disease | 24 | 5921 | 1.19 × 10−5 |
F | DOID:5844 | Myocardial infarction | 4 | 20 | 1.69 × 10−5 |
G | DOID:18 | Urinary system disease | 16 | 315 | 1.36 × 10−7 |
G | DOID:4 | Disease | 65 | 5921 | 1.75 × 10−7 |
G | DOID:0050686 | Organ system cancer | 21 | 677 | 2.16 × 10−7 |
G | DOID:14566 | Disease of cellular proliferation | 25 | 1012 | 2.40 × 10−7 |
G | DOID:850 | Lung disease | 12 | 172 | 3.12 × 10−7 |
G | DOID:77 | Gastrointestinal system disease | 18 | 510 | 3.23 × 10−7 |
G | DOID:9120 | Amyloidosis | 9 | 70 | 3.23 × 10−7 |
G | DOID:162 | Cancer | 23 | 895 | 3.46 × 10−7 |
G | DOID:0060158 | Acquired metabolic disease | 14 | 320 | 1.61 × 10−6 |
G | DOID:0050687 | Cell type cancer | 15 | 406 | 3.51 × 10−6 |
Ref | Top Document Words | Top Thematic Cluster Words | Cluster |
---|---|---|---|
[21] | pattern; dietary; dash; ahei; md; metabolomic; fdr; metabolite; index; framingham | protein, liver, diet, mouse, fatty, rat, disease, obesity, plasma, muscle | A |
[22] | diet; score; mortality; cvd; association; p0; incident; mediation; all cause; quality | ||
[23] | pnald; liver; mitochondrial; glycolipid; deps; parenteral; oxidative; differentially; expressed; patient | ||
[24] | endothelial; ci; diet; mediterranean; cordioprev; lowfat; coronary; patient; difference; dysfunction | ||
[25] | nonresponders; curve; responder; body; ketone; roc; glycemic; weight; difference; baseline | ||
[26] | mfgm; bovine; ifs; specie; globule; fraction; milk; attempt; proteome; fat | milk, protein, peptide, human, infant, milk fat globule membrane, colostrum, lactation, bovine, membrane | B |
[27] | mfgm; milk; bovine; colostrum; nglycoproteomes; nglycoproteins; mature; lactation; glycosylation; globule | ||
[28] | lipid; fat; mfgm; milk; globule; catabolic; membrane; specie; enzyme; utilization | ||
[29] | phosphorylation; site; mfgm; milk; colostrum; mature; protein; phosphoproteomics; phosphoprotein; globule | ||
[30] | milk; infant; proteome; ptms; human; expanded; endogenous; mother; like; peptide | ||
[31] | atopic; glycopeptides; oom; milk; nglycoprotein; rochester; lifestyle; york; mother; child | ||
[32] | wheat; spelt; bread; flour; ncws; heritability; differed; celiac; protein; hypersensitivities; bread | protein, plant, seed, food, allergen, gluten, wheat, soybean, crop, fruit | C |
[33] | tfps; editing; trait; food; plant; improvement; omics; sustainable; nutritious; traditional | ||
[34] | ibsd; intestinal; irritable; bowel; patient; molecule; tissue; syndrome; tmt; ingestion | gut, probiotic, microbiota, protein, intestinal, cell, microbiome, bacterial, human, host | E |
[35] | fasting; intermittent; clock; sunset; day; circadian; dawn; consecutive; neuropsychiatric; syndrome | cell, protein, cancer, meat, colorectal, quality, proteomics, fish, study, muscle | G |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Monteiro, J.P.; Morine, M.J.; Ued, F.V.; Kaput, J. Identifying and Analyzing Topic Clusters in a Nutri-, Food-, and Diet-Proteomic Corpus Using Machine Reading. Nutrients 2023, 15, 270. https://doi.org/10.3390/nu15020270