Assessment of Data-Independent Acquisition Mass Spectrometry (DIA-MS) for the Identification of Single Amino Acid Variants
Abstract
1. Introduction
2. Methods
2.1. Cell Culture
2.2. Sample Preparation
2.3. Mass Spectrometry Analysis
2.3.1. LC-MS Analysis—Eclipse DDA and DIA
2.3.2. LC-MS Analysis—Exploris 480, PRM
2.4. Retrieval and Filtering of Single Amino Acid Variants
2.5. In Silico Decoy DB Generation for Entrapment Search
2.6. Protein Sequence DBs
2.7. Database Searching
2.8. DIA Data Analysis Pipelines
2.9. DDA Data Analysis Pipelines
3. Results
3.1. Overview of the DIA-MS Evaluation Strategy
3.2. HeLa SAAV Peptide Identification Using Current DIA Search Engine Pipelines
3.3. Evaluation of DIA Search Engine Pipeline Performance Using Entrapment DBs
3.4. DIA-MS and DDA-MS: CPS-DB and Entrapment DB Searches
3.5. Validation of SAAV Peptides Using PRM
4. Discussion
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Fierro-Monti, I.; Fröhlich, K.; Schori, C.; Schmidt, A. Assessment of Data-Independent Acquisition Mass Spectrometry (DIA-MS) for the Identification of Single Amino Acid Variants. Proteomes 2024, 12, 33. https://doi.org/10.3390/proteomes12040033