Visualizing Single-Cell RNA-seq Data with Semisupervised Principal Component Analysis
Abstract
1. Introduction
2. Materials and Methods
2.1. The Methods
Algorithm 1: The ssPCA algorithm
Given the cluster labels, scRNA-seq data X, a similarity (kernel) matrix, and a hyperparameter:
  1. Recode the labels into a binary matrix Y, calculate […], and compute the kernel matrix from X if it is not available.
  2. Find W, the eigenvectors of […] corresponding to the k largest eigenvalues.
  3. Project to low dimension with […] for cluster visualization.
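The sketch below is one possible reading of the three steps of Algorithm 1 in NumPy, not the author's implementation. It makes several assumptions that are not stated in the algorithm box: X is stored as cells × genes; the label kernel is L = Y Yᵀ, as in HSIC-based supervised PCA; the label kernel and the similarity kernel K are mixed as (1 − lam)·L + lam·K; and an RBF kernel is used when K is not supplied. The names `sspca_sketch`, `one_hot`, `rbf_kernel`, and `lam` are hypothetical.

```python
import numpy as np

def one_hot(labels):
    """Recode a vector of cluster labels into a binary indicator matrix Y."""
    labels = np.asarray(labels).ravel()
    classes, idx = np.unique(labels, return_inverse=True)
    Y = np.zeros((labels.size, classes.size))
    Y[np.arange(labels.size), idx] = 1.0
    return Y

def rbf_kernel(X, gamma=None):
    """RBF similarity between rows of X (an assumed default kernel)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    if gamma is None:
        gamma = 1.0 / X.shape[1]
    return np.exp(-gamma * d2)

def sspca_sketch(X, labels, K=None, lam=0.5, k=2):
    """Project cells (rows of X) onto k supervised components.

    Assumed objective: top-k eigenvectors of Xc^T ((1 - lam) L + lam K) Xc,
    where Xc is the column-centered data and L = Y Y^T.
    """
    n = X.shape[0]
    # Step 1: binary label matrix, label kernel, and data kernel if missing.
    Y = one_hot(labels)
    L = Y @ Y.T
    if K is None:
        K = rbf_kernel(X)
    # Centering matrix H = I - (1/n) 1 1^T.
    H = np.eye(n) - np.ones((n, n)) / n
    Xc = H @ X
    # Step 2: eigenvectors for the k largest eigenvalues (assumed mixing form).
    M = (1.0 - lam) * L + lam * K
    Q = Xc.T @ M @ Xc
    Q = (Q + Q.T) / 2.0  # symmetrize against round-off
    evals, evecs = np.linalg.eigh(Q)  # eigenvalues in ascending order
    W = evecs[:, ::-1][:, :k]
    # Step 3: low-dimensional coordinates for plotting.
    Z = Xc @ W
    return Z, W
```

With `lam=0` this reduces to supervised PCA driven only by the labels; larger values of `lam` give more weight to the similarity structure of the data itself. For real scRNA-seq matrices with many genes, forming the genes × genes matrix `Q` can be expensive, so a dual (cells × cells) formulation may be preferable.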
2.2. The scRNA-seq Datasets
3. Results
3.1. Simulation Data
3.2. Real Data
4. Discussion
5. Conclusions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
PCA | Principal Component Analysis |
ssPCA | Semisupervised Principal Component Analysis |
t-SNE | t-distributed Stochastic Neighbor Embedding |
scRNA-seq | Single-Cell RNA-sequencing |
PHATE | Potential of Heat-diffusion for Affinity-based Transition Embedding |
UMAP | Uniform Manifold Approximation and Projection |
Appendix A. Supplementary Figures A1–A4
© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Liu, Z. Visualizing Single-Cell RNA-seq Data with Semisupervised Principal Component Analysis. Int. J. Mol. Sci. 2020, 21, 5797. https://doi.org/10.3390/ijms21165797