Genes

Editorial

5 pages, 164 KiB

Open Access Editorial

The Versatility of SMRT Sequencing

by Matthew S. Hestand and Adam Ameur

Genes 2019, 10(1), 24; https://doi.org/10.3390/genes10010024 - 4 Jan 2019

Cited by 17 | Viewed by 5525

Abstract

The adoption of single molecule real-time (SMRT) sequencing [...]

(This article belongs to the Special Issue Advances in Single Molecule, Real-Time (SMRT) Sequencing)

Research

17 pages, 4698 KiB

Open Access Article

Single-Molecule Real-Time (SMRT) Full-Length RNA-Sequencing Reveals Novel and Distinct mRNA Isoforms in Human Bone Marrow Cell Subpopulations

by Anne Deslattes Mays, Marcel Schmidt, Garrett Graham, Elizabeth Tseng, Primo Baybayan, Robert Sebra, Miloslav Sanda, Jean-Baptiste Mazarati, Anna Riegel and Anton Wellstein

Genes 2019, 10(4), 253; https://doi.org/10.3390/genes10040253 - 27 Mar 2019

Cited by 13 | Viewed by 9492

Abstract

Hematopoietic cells are continuously replenished from progenitor cells that reside in the bone marrow. To evaluate molecular changes during this process, we analyzed the transcriptomes of freshly harvested human bone marrow progenitor (lineage-negative) and differentiated (lineage-positive) cells by single-molecule real-time (SMRT) full-length RNA-sequencing. This analysis revealed a ~5-fold higher number of transcript isoforms than previously detected and showed a distinct composition of individual transcript isoforms characteristic for bone marrow subpopulations. A detailed analysis of messenger RNA (mRNA) isoforms transcribed from the ANXA1 and EEF1A1 loci confirmed their distinct composition. The expression of proteins predicted from the transcriptome analysis was evaluated by mass spectrometry, which validated previously unknown protein isoforms predicted, e.g., for EEF1A1. These protein isoforms distinguished the lineage-negative cell population from the lineage-positive cell population. Finally, transcript isoforms expressed from paralogous gene loci (e.g., CFD, GATA2, HLA-A, B, and C) also distinguished cell subpopulations but were only detectable by full-length RNA sequencing. Thus, qualitatively distinct transcript isoforms from individual genomic loci separate bone marrow cell subpopulations, indicating complex transcriptional regulation and protein isoform generation during hematopoiesis.

16 pages, 2391 KiB

Open Access Article

Genome Sequencing Illustrates the Genetic Basis of the Pharmacological Properties of Gloeostereum incarnatum

by Xinxin Wang, Jingyu Peng, Lei Sun, Gregory Bonito, Jie Wang, Weijie Cui, Yongping Fu and Yu Li

Genes 2019, 10(3), 188; https://doi.org/10.3390/genes10030188 - 1 Mar 2019

Cited by 18 | Viewed by 5225

Abstract

Gloeostereum incarnatum is a precious edible mushroom that is widely grown in Asia and known for its useful medicinal properties. Here, we present a high-quality genome of G. incarnatum using the single-molecule real-time (SMRT) sequencing platform. The G. incarnatum genome, which is the first complete genome to be sequenced in the family Cyphellaceae, was 38.67 Mbp, with an N50 of 3.5 Mbp, encoding 15,251 proteins. Based on our phylogenetic analysis, the Cyphellaceae diverged ~174 million years ago. Several genes and gene clusters associated with lignocellulose degradation, secondary metabolites, and polysaccharide biosynthesis were identified in G. incarnatum, and compared with other medicinal mushrooms. In particular, we identified two terpenoid-associated gene clusters, each containing a gene encoding a sesterterpenoid synthase adjacent to a gene encoding a cytochrome P450 enzyme. These clusters might participate in the biosynthesis of incarnal, a known bioactive sesterterpenoid produced by G. incarnatum. Through a transcriptomic analysis comparing the G. incarnatum mycelium and fruiting body, we also demonstrated that the genes associated with terpenoid biosynthesis were generally upregulated in the mycelium, while those associated with polysaccharide biosynthesis were generally upregulated in the fruiting body. This study provides insights into the genetic basis of the medicinal properties of G. incarnatum, laying a framework for future characterization of bioactive proteins and pharmaceutical uses of this fungus.

18 pages, 1879 KiB

Open Access Article

Genome Sequencing of Cladobotryum protrusum Provides Insights into the Evolution and Pathogenic Mechanisms of the Cobweb Disease Pathogen on Cultivated Mushroom

by Frederick Leo Sossah, Zhenghui Liu, Chentao Yang, Benjamin Azu Okorley, Lei Sun, Yongping Fu and Yu Li

Genes 2019, 10(2), 124; https://doi.org/10.3390/genes10020124 - 8 Feb 2019

Cited by 22 | Viewed by 6112

Abstract

Cladobotryum protrusum is one of the mycoparasites that cause cobweb disease on cultivated edible mushrooms. However, the molecular mechanisms of evolution and pathogenesis of C. protrusum on mushrooms are largely unknown. Here, we report a high-quality genome sequence of C. protrusum using the single-molecule, real-time sequencing platform of PacBio and perform a comparative analysis with closely related fungi in the family Hypocreaceae. The C. protrusum genome, the first complete genome to be sequenced in the genus Cladobotryum, is 39.09 Mb long, with an N50 of 4.97 Mb, encoding 11,003 proteins. The phylogenomic analysis confirmed its inclusion in Hypocreaceae, with its evolutionary divergence time estimated to be ~170.1 million years ago. The genome encodes a large and diverse set of genes involved in secreted peptidases, carbohydrate-active enzymes, cytochrome P450 enzymes, pathogen–host interactions, mycotoxins, and pigments. Moreover, C. protrusum harbors arrays of genes with the potential to produce bioactive secondary metabolites and stress response-related proteins that are significant for adaptation to hostile environments. Knowledge of the genome will foster a better understanding of the biology of C. protrusum and mycoparasitism in general, as well as help with the development of effective disease control strategies to minimize economic losses from cobweb disease in cultivated edible mushrooms.

18 pages, 1699 KiB

Open Access Article

Genome Assembly and Annotation of the Trichoplusia ni Tni-FNL Insect Cell Line Enabled by Long-Read Technologies

by Keyur Talsania, Monika Mehta, Castle Raley, Yuliya Kriga, Sujatha Gowda, Carissa Grose, Matthew Drew, Veronica Roberts, Kwong Tai Cheng, Sandra Burkett, Steffen Oeser, Robert Stephens, Daniel Soppet, Xiongfeng Chen, Parimal Kumar, Oksana German, Tatyana Smirnova, Christopher Hautman, Jyoti Shetty, Bao Tran, Yongmei Zhao and Dominic Esposito

Genes 2019, 10(2), 79; https://doi.org/10.3390/genes10020079 - 23 Jan 2019

Cited by 14 | Viewed by 7836

Abstract

Background: Trichoplusia ni-derived cell lines are commonly used to enable recombinant protein expression via baculovirus infection to generate materials approved for clinical use and in clinical trials. In order to develop systems biology and genome engineering tools to improve protein expression in this host, we performed de novo genome assembly of the Trichoplusia ni-derived cell line Tni-FNL. Methods: By integration of PacBio single-molecule sequencing, Bionano optical mapping, and 10X Genomics linked-reads data, we have produced a draft genome assembly of Tni-FNL. Results: Our assembly contains 280 scaffolds, with an N50 scaffold size of 2.3 Mb and a total length of 359 Mb. Annotation of the Tni-FNL genome resulted in 14,101 predicted genes, and 93.2% of the predicted proteome contained recognizable protein domains. Ortholog searches within the superorder Holometabola provided further evidence of high accuracy and completeness of the Tni-FNL genome assembly. Conclusions: This first draft Tni-FNL genome assembly was enabled by complementary long-read technologies and represents a high-quality, well-annotated genome that provides novel insight into the complexity of this insect cell line and can serve as a reference for future large-scale genome engineering work in this and other similar recombinant protein production hosts.

11 pages, 1686 KiB

Open Access Article

A High-Quality De novo Genome Assembly from a Single Mosquito Using PacBio Sequencing

by Sarah B. Kingan, Haynes Heaton, Juliana Cudini, Christine C. Lambert, Primo Baybayan, Brendan D. Galvin, Richard Durbin, Jonas Korlach and Mara K. N. Lawniczak

Genes 2019, 10(1), 62; https://doi.org/10.3390/genes10010062 - 18 Jan 2019

Cited by 91 | Viewed by 25127

Abstract

A high-quality reference genome is a fundamental resource for functional genetics, comparative genomics, and population genomics, and is increasingly important for conservation biology. PacBio Single Molecule, Real-Time (SMRT) sequencing generates long reads with uniform coverage and high consensus accuracy, making it a powerful technology for de novo genome assembly. Improvements in throughput and concomitant reductions in cost have made PacBio an attractive core technology for many large genome initiatives; however, relatively high DNA input requirements (~5 µg for standard library protocol) have placed PacBio out of reach for many projects on small organisms that have lower DNA content, or on projects with limited input DNA for other reasons. Here we present a high-quality de novo genome assembly from a single Anopheles coluzzii mosquito. A modified SMRTbell library construction protocol without DNA shearing and size selection was used to generate a SMRTbell library from just 100 ng of starting genomic DNA. The sample was run on the Sequel System with chemistry 3.0 and software v6.0, generating, on average, 25 Gb of sequence per SMRT Cell with 20 h movies, followed by diploid de novo genome assembly with FALCON-Unzip. The resulting curated assembly had high contiguity (contig N50 3.5 Mb) and completeness (more than 98% of conserved genes were present and full-length). In addition, this single-insect assembly now places 667 (>90%) of formerly unplaced genes into their appropriate chromosomal contexts in the AgamP4 PEST reference. We were also able to resolve maternal and paternal haplotypes for over 1/3 of the genome. By sequencing and assembling material from a single diploid individual, only two haplotypes were present, simplifying the assembly process compared to samples from multiple pooled individuals. The method presented here can be applied to samples with starting DNA amounts as low as 100 ng per 1 Gb genome size. This new low-input approach puts PacBio-based assemblies in reach for small, highly heterozygous organisms that comprise much of the diversity of life.

16 pages, 2207 KiB

Open Access Article

De Novo Assembly of Two Swedish Genomes Reveals Missing Segments from the Human GRCh38 Reference and Improves Variant Calling of Population-Scale Sequencing Data

by Adam Ameur, Huiwen Che, Marcel Martin, Ignas Bunikis, Johan Dahlberg, Ida Höijer, Susana Häggqvist, Francesco Vezzi, Jessica Nordlund, Pall Olason, Lars Feuk and Ulf Gyllensten

Genes 2018, 9(10), 486; https://doi.org/10.3390/genes9100486 - 9 Oct 2018

Cited by 32 | Viewed by 11199

Abstract

The current human reference sequence (GRCh38) is a foundation for large-scale sequencing projects. However, recent studies have suggested that GRCh38 may be incomplete and give a suboptimal representation of specific population groups. Here, we performed a de novo assembly of two Swedish genomes that revealed over 10 Mb of sequences absent from the human GRCh38 reference in each individual. Around 6 Mb of these novel sequences (NS) are shared with a Chinese personal genome. The NS are highly repetitive, have an elevated GC-content, and are primarily located in centromeric or telomeric regions. Up to 1 Mb of NS can be assigned to chromosome Y, and large segments are also missing from GRCh38 at chromosomes 14, 17, and 21. Inclusion of NS into the GRCh38 reference radically improves the alignment and variant calling from short-read whole-genome sequencing data at several genomic loci. A re-analysis of a Swedish population-scale sequencing project yields >75,000 putative novel single nucleotide variants (SNVs) and removes >10,000 false positive SNV calls per individual, some of which are located in protein coding regions. Our results highlight that the GRCh38 reference is not yet complete and demonstrate that personal genome assemblies from local populations can improve the analysis of short-read whole-genome sequencing data.

14 pages, 2482 KiB

Open Access Article

A Statistical Method for Observing Personal Diploid Methylomes and Transcriptomes with Single-Molecule Real-Time Sequencing

by Yuta Suzuki, Yunhao Wang, Kin Fai Au and Shinichi Morishita

Genes 2018, 9(9), 460; https://doi.org/10.3390/genes9090460 - 19 Sep 2018

Cited by 3 | Viewed by 4370

Abstract

We address the problem of observing personal diploid methylomes, CpG methylome pairs of homologous chromosomes that are distinguishable with respect to phased heterozygous variants (PHVs), which is challenging due to the scarcity of PHVs in personal genomes. Single molecule real-time (SMRT) sequencing is promising as it outputs long reads with CpG methylation information, but a serious concern is whether reliable PHVs are available in erroneous SMRT reads with an error rate of ~15%. To overcome the issue, we propose a statistical model that reduces the error rate of phasing CpG sites to 1%, thereby calling CpG hypomethylation in each haplotype with >90% precision and sensitivity. Using our statistical model, we examined the GNAS complex locus, known for a combination of maternally, paternally, or biallelically expressed isoforms, and observed allele-specific methylation patterns almost perfectly reflecting their respective allele-specific expression status, demonstrating the merit of elucidating comprehensive personal diploid methylomes and transcriptomes.
