Reinspection of a Clinical Proteomics Tumor Analysis Consortium (CPTAC) Dataset with Cloud Computing Reveals Abundant Post-Translational Modifications and Protein Sequence Variants
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Protein Databases Utilized in This Study
2.2. Data Used in This Study
2.3. Description of the Bolt Parameters Utilized in This Study
2.4. Custom Genomic Data Analysis of Nine Patient Samples
2.5. CPTAC Data Pipeline Used to Compare the Proteogenomic Results
2.6. CRDC Data Pipeline Used to Compare PTM/Proteomics Results
2.7. Construction of the Bolt CPTAC Web Portal
3. Results and Discussion
3.1. Peptide IDs with Bolt
3.2. Evaluation of the Protein Sequencing Depth
3.3. Comparison with Proteogenomic Analysis
3.4. Evaluation of Discrepancies between the Bolt Results and the Original Analysis
3.5. Validation of Bolt IDs
4. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Abbatiello, S.E.; Schilling, B.; Mani, D.R.; Zimmerman, L.J.; Hall, S.C.; MacLean, B.; Albertolle, M.; Allen, S.; Burgess, M.; Cusack, M.P.; et al. Large-scale interlaboratory study to develop, analytically validate and apply highly multiplexed, quantitative peptide assays to measure cancer-relevant proteins in plasma. Mol. Cell. Proteom. 2015, 14, 2357–2374.
- Stephens, P.; Edkins, S.; Davies, H.; Greenman, C.; Cox, C.; Hunter, C.; Bignell, G.; Teague, J.; Smith, R.; Stevens, C.; et al. A screen of the complete protein kinase gene family identifies diverse patterns of somatic mutations in human breast cancer. Nat. Genet. 2005, 37, 590–592.
- Mertins, P.; Mani, D.R.; Ruggles, K.V.; Gillette, M.A.; Clauser, K.R.; Wang, P.; Wang, X.; Qiao, J.W.; Cao, S.; Petralia, F.; et al. Proteogenomics connects somatic mutations to signalling in breast cancer. Nature 2016, 534, 55–62.
- Vasaikar, S.; Huang, C.; Wang, X.; Petyuk, V.A.; Savage, S.R.; Wen, B.; Dou, Y.; Zhang, Y.; Shi, Z.; Arshad, O.A.; et al. Proteogenomic Analysis of Human Colon Cancer Reveals New Therapeutic Opportunities. Cell 2019, 177, 1035–1049.e19.
- Huang, K.L.; Li, S.Q.; Mertins, P.; Cao, S.; Gunawardena, H.P.; Ruggles, K.V.; Mani, D.R.; Clauser, K.R.; Tanioka, M.; Usary, J.; et al. Proteogenomic integration reveals therapeutic targets in breast cancer xenografts. Nat. Commun. 2017, 8.
- Prakash, A.; Ahmad, S.; Majumder, S.; Jenkins, C.; Orsburn, B. Bolt: A New Age Peptide Search Engine for Comprehensive MS/MS Sequencing Through Vast Protein Databases in Minutes. J. Am. Soc. Mass Spectrom. 2019, 30, 2408–2418.
- Meier, F.; Brunner, A.D.; Koch, S.; Koch, H.; Lubeck, M.; Krause, M.; Goedecke, N.; Decker, J.; Kosinski, T.; Park, M.A.; et al. Online parallel accumulation–serial fragmentation (PASEF) with a novel trapped ion mobility mass spectrometer. Mol. Cell. Proteom. 2018, 17, 2534–2545.
- Eliuk, S.; Makarov, A. Evolution of Orbitrap Mass Spectrometry Instrumentation. Annu. Rev. Anal. Chem. 2015, 8, 61–80.
- Bekker-Jensen, D.B.; Kelstrup, C.D.; Batth, T.S.; Larsen, S.C.; Haldrup, C.; Bramsen, J.B.; Sorensen, K.D.; Hoyer, S.; Orntoft, T.F.; Andersen, C.L.; et al. An Optimized Shotgun Strategy for the Rapid Generation of Comprehensive Human Proteomes. Cell Syst. 2017, 4, 587–599.e4.
- Bekker-Jensen, D.B.; Martínez-Val, A.; Steigerwald, S.; Rüther, P.; Fort, K.L.; Arrey, T.N.; Harder, A.; Makarov, A.; Olsen, J.V. A compact quadrupole-orbitrap mass spectrometer with FAIMS interface improves proteome coverage in short LC gradients. Mol. Cell. Proteom. 2020, 19, 716–729.
- Specht, H.; Emmott, E.; Petelski, A.A.; Gray Huffman, R.; Perlman, D.H.; Serra, M.; Kharchenko, P.; Koller, A.; Slavov, N. Single-cell mass-spectrometry quantifies the emergence of macrophage heterogeneity. Genome Biol. 2021, 22, 50.
- Jenkins, C.; Norris, A.; O’Neill, M.; Das, S.; Andresson, T.; Orsburn, B. Reporter Ion Data Analysis Reduction (R.I.D.A.R) for isobaric proteomics quantification studies. bioRxiv 2018, 437210.
- Prakash, A.; Majumder, S.; Ahmad, S.; Varkey, M.; Anish, T.A.; Jenkins, C.; Rigby, M.; Orsburn, B. Detection and verification of 2.3 million cancer mutations in NCI60 cancer cell lines with a cloud search engine. J. Proteom. 2019, 209, 103488.
- Mani, D.R.; Maynard, M.; Kothadia, R.; Krug, K.; Christianson, K.E.; Heiman, D.; Clauser, K.R.; Birger, C.; Getz, G.; Carr, S.A. PANOPLY: A cloud-based platform for automated and reproducible proteogenomic data analysis. Nat. Methods 2021, 18, 580–582.
- Krug, K.; Jaehnig, E.J.; Satpathy, S.; Blumenberg, L.; Karpova, A.; Anurag, M.; Miles, G.; Mertins, P.; Geffen, Y.; Tang, L.C.; et al. Proteogenomic Landscape of Breast Cancer Tumorigenesis and Targeted Therapy. Cell 2020, 183, 1436–1456.e31.
- Flores, M.A.; Lazar, I.M. XMAn v2—A database of Homo sapiens mutated peptides. Bioinformatics 2020, 36, 1311–1313.
- Davies, R.W.; Kucka, M.; Su, D.; Shi, S.; Flanagan, M.; Cunniff, C.M.; Chan, Y.F.; Myers, S. Rapid genotype imputation from sequence with reference panels. Nat. Genet. 2021.
- Brademan, D.R.; Riley, N.M.; Kwiecien, N.W.; Coon, J.J. Interactive Peptide Spectral Annotator: A Versatile Web-based Tool for Proteomic Applications. Mol. Cell. Proteom. 2019, 18, S193–S201.
- Tyanova, S.; Albrechtsen, R.; Kronqvist, P.; Cox, J.; Mann, M.; Geiger, T. Proteomic maps of breast cancer subtypes. Nat. Commun. 2016, 7, 10259.
- Tang, W.; Zhou, M.; Dorsey, T.H.; Prieto, D.A.; Wang, X.W.; Ruppin, E.; Veenstra, T.D.; Ambs, S. Integrated proteotranscriptomics of breast cancer reveals globally increased protein-mRNA concordance associated with subtypes and survival. Genome Med. 2018, 10, 94.
- Gomig, T.H.B.; Cavalli, I.J.; Souza, R.L.R.d.; Lucena, A.C.R.; Batista, M.; Machado, K.C.; Marchini, F.K.; Marchi, F.A.; Lima, R.S.; Urban, C.d.A.; et al. High-throughput mass spectrometry and bioinformatics analysis of breast cancer proteomic data. Data Brief 2019, 25, 104125.
- Lawrence, R.T.; Perez, E.M.; Hernández, D.; Miller, C.P.; Haas, K.M.; Irie, H.Y.; Lee, S.-I.; Blau, C.A.; Villén, J. The proteomic landscape of triple-negative breast cancer. Cell Rep. 2015, 11, 630–644.
- Hollingshead, M.G.; Stockwin, L.H.; Alcoser, S.Y.; Newton, D.L.; Orsburn, B.C.; Bonomi, C.A.; Borgel, S.D.; Divelbiss, R.; Dougherty, K.M.; Hager, E.J.; et al. Gene expression profiling of 49 human tumor xenografts from in vitro culture through multiple in vivo passages—Strategies for data mining in support of therapeutic studies. BMC Genom. 2014, 15, 393.
- Gholami, A.M.; Hahne, H.; Wu, Z.; Auer, F.J.; Meng, C.; Wilhelm, M.; Kuster, B. Global proteome analysis of the NCI-60 cell line panel. Cell Rep. 2013, 4, 609–620.
- Liu, Y.; Mi, Y.; Mueller, T.; Kreibich, S.; Williams, E.G.; Van Drogen, A.; Borel, C.; Germain, P.-L.; Frank, M.; Bludau, I.; et al. Genomic, Proteomic and Phenotypic Heterogeneity in HeLa Cells across Laboratories: Implications for Reproducibility of Research Results. bioRxiv 2018, 307421.
- Hunt, D.F.; Henderson, R.A.; Shabanowitz, J.; Sakaguchi, K.; Michel, H.; Sevilir, N.; Cox, A.L.; Appella, E.; Engelhard, V.H. Characterization of peptides bound to the class I MHC molecule HLA-A2.1 by mass spectrometry. Science 1992, 255, 1261–1263.
- Xue, J.Y.; Zhao, Y.; Aronowitz, J.; Mai, T.T.; Vides, A.; Qeriqi, B.; Kim, D.; Li, C.; de Stanchina, E.; Mazutis, L.; et al. Rapid non-uniform adaptation to conformation-specific KRAS(G12C) inhibition. Nature 2020, 577, 421–425.
- Drosten, M.; Barbacid, M. Targeting the MAPK Pathway in KRAS-Driven Tumors. Cancer Cell 2020, 37, 543–550.
- Gillson, J.; Ramaswamy, Y.; Singh, G.; Gorfe, A.A.; Pavlakis, N.; Samra, J.; Mittal, A.; Sahni, S. Small molecule KRAS inhibitors: The future for targeted pancreatic cancer therapy? Cancers 2020, 12, 1341.
- Bache, N.; Geyer, P.E.; Bekker-Jensen, D.B.; Hoerning, O.; Falkenby, L.; Treit, P.V.; Doll, S.; Paron, I.; Müller, J.B.; Meier, F.; et al. A novel LC system embeds analytes in pre-formed gradients for rapid, ultra-robust proteomics. Mol. Cell. Proteom. 2018, 17, 2284–2296.
- Manjili, M.H. The premise of personalized immunotherapy for cancer dormancy. Oncogene 2020, 39, 4323–4330.
- Geyer, P.E.; Kulak, N.A.; Pichler, G.; Holdt, L.M.; Teupser, D.; Mann, M. Plasma Proteome Profiling to Assess Human Health and Disease. Cell Syst. 2016, 3, 185–195.
- Ghazalpour, A.; Bennett, B.; Petyuk, V.A.; Orozco, L.; Hagopian, R.; Mungrue, I.N.; Farber, C.R.; Sinsheimer, J.; Kang, H.M.; Furlotte, N.; et al. Comparative analysis of proteome and transcriptome variation in mouse. PLoS Genet. 2011, 7, e001393.
- Xiao, W.; Ren, L.; Chen, Z.; Fang, L.T.; Zhao, Y.; Lack, J.; Guan, M.; Zhu, B.; Jaeger, E.; Kerrigan, L.; et al. Toward best practice in cancer mutation detection with whole-genome and whole-exome sequencing. Nat. Biotechnol. 2021, 39, 1141–1150.
- Krassowski, M.; Das, V.; Sahu, S.K.; Misra, B.B. State of the Field in Multi-Omics Research: From Computational Needs to Data Mining and Sharing. Front. Genet. 2020.
- Handler, D.C.; Pascovici, D.; Mirzaei, M.; Gupta, V.; Salekdeh, G.H.; Haynes, P.A. The Art of Validating Quantitative Proteomics Data. Proteomics 2018, 18, 1800222.
Protein Database | Number of Protein Sequences | Version/Date/Source |
---|---|---|
Human SwissProt; Canonical + isoforms | 42,414 | UniProt, September, 2019 |
Human UniProt TrEMBL | 53,211 | UniProt, September, 2019 |
Common contaminants | 269 | cRAP database (gpm.org) |
Known somatic variants (missense + nonsense) | 2,537,773 | February, 2020 (Lazar Lab) [16] |
Known population variants (dbSNP) | 1,042,598 | dbSNP, July, 2020 |
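
The sequence counts in the table above can be added up to gauge the overall search space. The short Python sketch below does that arithmetic. It assumes the five databases are simply concatenated with no de-duplication, which the table itself does not state, and the dictionary and function names are illustrative only.

```python
# Sequence counts transcribed from the protein database table above.
# Assumption: the databases are concatenated without de-duplication.
DATABASES = {
    "Human SwissProt; canonical + isoforms": 42_414,
    "Human UniProt TrEMBL": 53_211,
    "Common contaminants (cRAP)": 269,
    "Known somatic variants (missense + nonsense)": 2_537_773,
    "Known population variants (dbSNP)": 1_042_598,
}

# Entries treated here as sequence-variant databases (illustrative grouping).
VARIANT_KEYS = (
    "Known somatic variants (missense + nonsense)",
    "Known population variants (dbSNP)",
)


def total_sequences(databases):
    """Sum the number of protein sequences across all databases."""
    return sum(databases.values())


def variant_fraction(databases, variant_keys=VARIANT_KEYS):
    """Fraction of all sequences that come from the variant databases."""
    variants = sum(databases[key] for key in variant_keys)
    return variants / total_sequences(databases)


if __name__ == "__main__":
    print(f"Total sequences: {total_sequences(DATABASES):,}")
    print(f"Variant share:   {variant_fraction(DATABASES):.1%}")
```

Under that assumption, the two variant databases account for the large majority of the entries, which is why the size of the search space is dominated by variant sequences rather than by the reference proteome.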
PRIDE Identifier | Type | FDR | Number of Fractions | Total Peptide IDs | Percentage Identified by Bolt |
---|---|---|---|---|---|
PXD009766 | Tumor | 1% | 6 | 164935 | 77 |
PXD009766 | Tumor + SuperSILAC | 1% | 6 | 218107 | 70 |
PXD005692 | Tumor | 1% | 17 | 101781 | 81 |
PXD005692 | Tumor | 0.1% | 17 | 70936 | 87 |
PXD012431 | Tumor | 1% | 0 | 23483 | 96 |
PXD013455 | Cell lines | 1% | 5 | 58305 | 69 |
PXD013455 | Tumor | 1% | 5 | 90336 | 78 |
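
To make the percentages in the table above easier to compare, the following Python sketch converts each one into an approximate peptide count. It rests on two assumptions that the table does not spell out: that "Percentage Identified by Bolt" is the share of each dataset's total peptide IDs that Bolt also identified, and that rows without a PRIDE identifier belong to the identifier of the row above them. Because the percentages are rounded to whole numbers, the resulting counts are estimates only. All names in the code are illustrative.

```python
# Rows transcribed from the table above:
# (PRIDE identifier, sample type, FDR, total peptide IDs, % identified by Bolt).
# Assumption: rows lacking an identifier belong to the preceding identifier.
ROWS = [
    ("PXD009766", "Tumor", "1%", 164935, 77),
    ("PXD009766", "Tumor + SuperSILAC", "1%", 218107, 70),
    ("PXD005692", "Tumor", "1%", 101781, 81),
    ("PXD005692", "Tumor", "0.1%", 70936, 87),
    ("PXD012431", "Tumor", "1%", 23483, 96),
    ("PXD013455", "Cell lines", "1%", 58305, 69),
    ("PXD013455", "Tumor", "1%", 90336, 78),
]


def approx_identified(total_ids, percent):
    """Estimate how many peptide IDs correspond to a rounded percentage."""
    return round(total_ids * percent / 100)


def summarize(rows):
    """Return (identifier, type, FDR, estimated count) for every row."""
    return [
        (pride_id, sample, fdr, approx_identified(total, pct))
        for pride_id, sample, fdr, total, pct in rows
    ]


if __name__ == "__main__":
    for pride_id, sample, fdr, count in summarize(ROWS):
        print(f"{pride_id:<10} {sample:<20} FDR {fdr:<5} ~{count:,} peptide IDs")
```

The rows are deliberately not pooled into a single overall percentage, because the 1% and 0.1% FDR entries for PXD005692 describe the same dataset at different thresholds and would be counted twice.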
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Prakash, A.; Taylor, L.; Varkey, M.; Hoxie, N.; Mohammed, Y.; Goo, Y.A.; Peterman, S.; Moghekar, A.; Yuan, Y.; Glaros, T.; et al. Reinspection of a Clinical Proteomics Tumor Analysis Consortium (CPTAC) Dataset with Cloud Computing Reveals Abundant Post-Translational Modifications and Protein Sequence Variants. Cancers 2021, 13, 5034. https://doi.org/10.3390/cancers13205034