Feature Papers in Bioinformatics and Systems Biology Section

A topical collection in Biomolecules (ISSN 2218-273X). This collection belongs to the section "Bioinformatics and Systems Biology".


Editor


Dr. Lukasz Kurgan
Collection Editor
Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
Interests: structural bioinformatics; intrinsically disordered proteins; protein function prediction; protein-ligand interactions; protein-nucleic acids interactions; structural genomics

Topical Collection Information

This Topical Collection, “Feature Papers in Bioinformatics and Systems Biology”, collects high-quality research articles on outstanding new algorithms and databases in computational molecular biology and on significant discoveries made through the specialized use of computational tools, methods, and databases. It also includes review articles that cover popular and emerging areas of research. To cover the frontiers of research in bioinformatics and systems biology, submissions, which should reflect the latest progress in their research field, are limited to the Editorial Board Members of Biomolecules and the senior authors they invite. These invited papers will be published online, free of charge, once accepted.

Topics covered in this Collection include, without being limited to, the following:

  • Databases and ontologies;
  • Function determination and prediction;
  • Gene and protein expression analysis;
  • Gene and protein networks;
  • Genome and proteome analysis;
  • Phylogenetics;
  • Sequence analysis;
  • Structural bioinformatics;
  • Structure determination and prediction;
  • Systems biology.

Dr. Lukasz Kurgan
Collection Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the collection website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Biomolecules is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (18 papers)

2023


15 pages, 1519 KiB  
Article
Comparison of Biomolecular Condensate Localization and Protein Phase Separation Predictors
by Erich R. Kuechler, Alex Huang, Jennifer M. Bui, Thibault Mayor and Jörg Gsponer
Biomolecules 2023, 13(3), 527; https://doi.org/10.3390/biom13030527 - 13 Mar 2023
Cited by 4 | Viewed by 3182
Abstract
Research in the field of biochemistry and cellular biology has entered a new phase due to the discovery of phase separation driving the formation of biomolecular condensates, or membraneless organelles, in cells. The implications of this novel principle of cellular organization are vast and can be applied at multiple scales, spawning exciting research questions in numerous directions. Of fundamental importance are the molecular mechanisms that underlie biomolecular condensate formation within cells and whether insights gained into these mechanisms provide a gateway for accurate predictions of protein phase behavior. Within the last six years, a significant number of predictors for protein phase separation and condensate localization have emerged. Herein, we compare a collection of state-of-the-art predictors on different tasks related to protein phase behavior. We show that the tested methods achieve high AUCs in the identification of biomolecular condensate drivers and scaffolds, as well as in the identification of proteins able to phase separate in vitro. However, our benchmark tests reveal that their performance is poorer when used to predict protein segments that are involved in phase separation or to classify amino acid substitutions as phase-separation-promoting or -inhibiting mutations. Our results suggest that the phenomenological approach used by most predictors is insufficient to fully grasp the complexity of the phenomenon within biological contexts and make reliable predictions related to protein phase behavior at the residue level.

19 pages, 6812 KiB  
Article
On the Best Way to Cluster NCI-60 Molecules
by Saiveth Hernández-Hernández and Pedro J. Ballester
Biomolecules 2023, 13(3), 498; https://doi.org/10.3390/biom13030498 - 8 Mar 2023
Cited by 8 | Viewed by 3406
Abstract
Machine learning-based models have been widely used in the early drug-design pipeline. To validate these models, cross-validation strategies have been employed, including those using clustering of molecules in terms of their chemical structures. However, the poor clustering of compounds will compromise such validation, especially on test molecules dissimilar to those in the training set. This study aims at finding the best way to cluster the molecules screened by the National Cancer Institute (NCI)-60 project by comparing hierarchical, Taylor–Butina, and uniform manifold approximation and projection (UMAP) clustering methods. The best-performing algorithm can then be used to generate clusters for model validation strategies. This study also aims at measuring the impact of removing outlier molecules prior to the clustering step. Clustering results are evaluated using three well-known clustering quality metrics. In addition, we compute an average similarity matrix to assess the quality of each cluster. The results show variation in clustering quality from method to method. The clusters obtained by the hierarchical and Taylor–Butina methods are more computationally expensive to use in cross-validation strategies, and both cluster the molecules poorly. In contrast, the UMAP method provides the best quality, and therefore we recommend it to analyze this highly valuable dataset.

17 pages, 26393 KiB  
Article
Improving Protein–Ligand Interaction Modeling with cryo-EM Data, Templates, and Deep Learning in 2021 Ligand Model Challenge
by Nabin Giri and Jianlin Cheng
Biomolecules 2023, 13(1), 132; https://doi.org/10.3390/biom13010132 - 9 Jan 2023
Cited by 12 | Viewed by 4131 | Correction
Abstract
Elucidating protein–ligand interaction is crucial for studying the function of proteins and compounds in an organism and critical for drug discovery and design. The problem of protein–ligand interaction is traditionally tackled by molecular docking and simulation, which is based on physical forces and statistical potentials and cannot effectively leverage cryo-EM data and existing protein structural information in the protein–ligand modeling process. In this work, we developed a deep learning bioinformatics pipeline (DeepProLigand) to predict protein–ligand interactions from cryo-EM density maps of proteins and ligands. DeepProLigand first uses a deep learning method to predict the structure of proteins from cryo-EM maps, which is averaged with a reference (template) structure of the proteins to produce a combined structure to add ligands. The ligands are then identified and added into the structure to generate a protein–ligand complex structure, which is further refined. The method based on the deep learning prediction and template-based modeling was blindly tested in the 2021 EMDataResource Ligand Challenge and was ranked first in fitting ligands to cryo-EM density maps. These results demonstrate that the deep learning bioinformatics approach is a promising direction for modeling protein–ligand interactions on cryo-EM data using prior structural information.

2022


21 pages, 1130 KiB  
Review
Progress and Impact of Latin American Natural Product Databases
by Alejandro Gómez-García and José L. Medina-Franco
Biomolecules 2022, 12(9), 1202; https://doi.org/10.3390/biom12091202 - 30 Aug 2022
Cited by 12 | Viewed by 4085
Abstract
Natural products (NPs) are a rich source of structurally novel molecules, and the chemical space they encompass is far from being fully explored. Over history, NPs have represented a significant source of bioactive molecules and have served as a source of inspiration for developing many drugs on the market. On the other hand, computer-aided drug design (CADD) has contributed to drug discovery research, mitigating costs and time. In this sense, compound databases represent a fundamental element of CADD. This work reviews the progress toward developing compound databases of natural origin, and it surveys computational methods, emphasizing chemoinformatic approaches to profile natural product databases. Furthermore, it reviews the present state of the art in developing Latin American NP databases and their practical applications to the drug discovery area.

18 pages, 1756 KiB  
Article
Prediction of Aggregation of Biologically-Active Peptides with the UNRES Coarse-Grained Model
by Iga Biskupek, Cezary Czaplewski, Justyna Sawicka, Emilia Iłowska, Maria Dzierżyńska, Sylwia Rodziewicz-Motowidło and Adam Liwo
Biomolecules 2022, 12(8), 1140; https://doi.org/10.3390/biom12081140 - 18 Aug 2022
Cited by 4 | Viewed by 2993
Abstract
The UNited RESidue (UNRES) model of polypeptide chains was applied to study the association of 20 peptides with sizes ranging from 6 to 32 amino-acid residues. Twelve of those were potentially aggregating hexa- or heptapeptides excised from larger proteins, while the remaining eight contained potentially aggregating sequences, functionalized by attaching larger ends rich in charged residues. For 13 peptides, the experimental data of aggregation were used. The remaining seven were synthesized, and their properties were measured in this work. Multiplexed replica-exchange simulations of eight-chain systems were conducted at 12 temperatures from 260 to 370 K at concentrations from 0.421 to 5.78 mM, corresponding to the experimental conditions. The temperature profiles of the fractions of monomers and octamers showed a clear transition corresponding to aggregate dissociation. Low simulated transition temperatures were obtained for the peptides, which did not precipitate after incubation, as well as for the H-GNNQQNY-NH2 prion–protein fragment, which forms small fibrils. A substantial amount of inter-strand β-sheets was found in most of the systems. The results suggest that UNRES simulations can be used to assess peptide aggregation except for glutamine- and asparagine-rich peptides, for which a revision of the UNRES sidechain–sidechain interaction potentials appears necessary.

13 pages, 1734 KiB  
Article
Comparative Analysis of Gene Correlation Networks of Breast Cancer Patients Based on Mutations in TP53
by Byungkyu Park, Jinho Im and Kyungsook Han
Biomolecules 2022, 12(7), 979; https://doi.org/10.3390/biom12070979 - 13 Jul 2022
Cited by 3 | Viewed by 2409
Abstract
Breast cancer is one of the most prevalent cancers in females, with more than 450,000 deaths each year worldwide. Among the subtypes of breast cancer, basal-like breast cancer, also known as triple-negative breast cancer, shows the lowest survival rate and does not have effective treatments yet. Somatic mutations in the TP53 gene frequently occur across all breast cancer subtypes, but comparative analysis of gene correlations with respect to mutations in TP53 has not been done so far. The primary goal of this study is to identify gene correlations in two groups of breast cancer patients and to derive potential prognostic gene pairs for breast cancer. We partitioned breast cancer patients into two groups: one group with a mutated TP53 gene (mTP53) and the other with a wild-type TP53 gene (wtTP53). For every gene pair, we computed the hazard ratio using the Cox proportional hazard model and constructed gene correlation networks (GCNs) enriched with prognostic information. Our GCN is more informative than typical GCNs in the sense that it indicates the type of correlation between genes, the concordance index, and the prognostic type of a gene. Comparative analysis of correlation patterns and survival time of the two groups revealed several interesting findings. First, we found several new gene pairs with opposite correlations in the two GCNs and the difference in their correlation patterns was the most prominent in the basal-like subtype of breast cancer. Second, we obtained potential prognostic genes for breast cancer patients with a wild-type TP53 gene. From a comparative analysis of GCNs of mTP53 and wtTP53, we found several gene pairs that show significantly different correlation patterns in the basal-like breast cancer subtype and obtained prognostic genes for patients with a wild-type TP53 gene.
The GCNs and prognostic genes identified in this study will be informative for the prognosis of survival and for selecting a drug target for breast cancer, in particular for basal-like breast cancer. To the best of our knowledge, this is the first attempt to construct GCNs for breast cancer patients with or without mutations in the TP53 gene and to find prognostic genes accordingly.

12 pages, 1129 KiB  
Article
Automated Protein Secondary Structure Assignment from Cα Positions Using Neural Networks
by Mohammad N. Saqib, Justyna D. Kryś and Dominik Gront
Biomolecules 2022, 12(6), 841; https://doi.org/10.3390/biom12060841 - 17 Jun 2022
Cited by 2 | Viewed by 2695
Abstract
The assignment of secondary structure elements in protein conformations is necessary to interpret a protein model that has been established by computational methods. The process essentially involves labeling the amino acid residues with H (Helix), E (Strand), or C (Coil, also known as Loop). When particular atoms are absent from an input protein structure, the procedure becomes more complicated, especially when only the alpha carbon locations are known. Various techniques have been tested and applied to this problem during the last forty years. The application of machine learning techniques is the most recent trend. This contribution presents the HECA classifier, which uses neural networks to assign protein secondary structure types. The technique exclusively employs Cα coordinates. The Keras (TensorFlow) library was used to implement and train the neural network model. The BioShell toolkit was used to calculate the neural network input features from raw coordinates. The study’s findings show that neural network-based methods may be successfully used to take on structure assignment challenges when only Cα trace is available. Thanks to the careful selection of input features, our approach’s accuracy (above 97%) exceeded that of the existing methods.

32 pages, 4767 KiB  
Article
Immune-Related Protein Interaction Network in Severe COVID-19 Patients toward the Identification of Key Proteins and Drug Repurposing
by Pakorn Sagulkoo, Apichat Suratanee and Kitiporn Plaimas
Biomolecules 2022, 12(5), 690; https://doi.org/10.3390/biom12050690 - 11 May 2022
Cited by 4 | Viewed by 4105
Abstract
Coronavirus disease 2019 (COVID-19) is still an active global public health issue. Although vaccines and therapeutic options are available, some patients experience severe conditions and need critical care support. Hence, identifying key genes or proteins involved in immune-related severe COVID-19 is necessary to find or develop the targeted therapies. This study proposed a novel construction of an immune-related protein interaction network (IPIN) in severe cases with the use of a network diffusion technique on a human interactome network and transcriptomic data. Enrichment analysis revealed that the IPIN was mainly associated with antiviral, innate immune, apoptosis, cell division, and cell cycle regulation signaling pathways. Twenty-three proteins were identified as key proteins to find associated drugs. Finally, poly (I:C), mitomycin C, decitabine, gemcitabine, hydroxyurea, tamoxifen, and curcumin were the potential drugs interacting with the key proteins to heal severe COVID-19. In conclusion, IPIN can be a good representative network for the immune system that integrates the protein interaction network and transcriptomic data. Thus, the key proteins and target drugs in IPIN help to find a new treatment with the use of existing drugs to treat the disease apart from vaccination and conventional antiviral therapy.

12 pages, 2442 KiB  
Article
Differentiating Inhibitors of Closely Related Protein Kinases with Single- or Multi-Target Activity via Explainable Machine Learning and Feature Analysis
by Christian Feldmann and Jürgen Bajorath
Biomolecules 2022, 12(4), 557; https://doi.org/10.3390/biom12040557 - 8 Apr 2022
Cited by 8 | Viewed by 2654
Abstract
Protein kinases are major drug targets. Most kinase inhibitors are directed against the adenosine triphosphate (ATP) cofactor binding site, which is largely conserved across the human kinome. Hence, such kinase inhibitors are often thought to be promiscuous. However, experimental evidence and activity data for publicly available kinase inhibitors indicate that this is not generally the case. We have investigated whether inhibitors of closely related human kinases with single- or multi-kinase activity can be differentiated on the basis of chemical structure. Therefore, a test system consisting of two distinct kinase triplets has been devised for which inhibitors with reported triple-kinase activities and corresponding single-kinase activities were assembled. Machine learning models derived on the basis of chemical structure distinguished between these multi- and single-kinase inhibitors with high accuracy. A model-independent explanatory approach was applied to identify structural features determining accurate predictions. For both kinase triplets, the analysis revealed decisive features contained in multi-kinase inhibitors. These features were found to be absent in corresponding single-kinase inhibitors, thus providing a rationale for successful machine learning. Mapping of features determining accurate predictions revealed that they formed coherent and chemically meaningful substructures that were characteristic of multi-kinase inhibitors compared with single-kinase inhibitors.

13 pages, 6750 KiB  
Article
Darling: A Web Application for Detecting Disease-Related Biomedical Entity Associations with Literature Mining
by Evangelos Karatzas, Fotis A. Baltoumas, Ioannis Kasionis, Despina Sanoudou, Aristides G. Eliopoulos, Theodosios Theodosiou, Ioannis Iliopoulos and Georgios A. Pavlopoulos
Biomolecules 2022, 12(4), 520; https://doi.org/10.3390/biom12040520 - 30 Mar 2022
Cited by 12 | Viewed by 4043
Abstract
Finding, exploring and filtering frequent sentence-based associations between a disease and a biomedical entity, co-mentioned in disease-related PubMed literature, is a challenge, as the volume of publications increases. Darling is a web application, which utilizes Named Entity Recognition to identify human-related biomedical terms in PubMed articles, mentioned in OMIM, DisGeNET and Human Phenotype Ontology (HPO) disease records, and generates an interactive biomedical entity association network. Nodes in this network represent genes, proteins, chemicals, functions, tissues, diseases, environments and phenotypes. Users can search by identifiers, terms/entities or free text and explore the relevant abstracts in an annotated format.

16 pages, 3641 KiB  
Article
VEGFA, B, C: Implications of the C-Terminal Sequence Variations for the Interaction with Neuropilins
by Charles Eldrid, Mire Zloh, Constantina Fotinou, Tamas Yelland, Lefan Yu, Filipa Mota, David L. Selwood and Snezana Djordjevic
Biomolecules 2022, 12(3), 372; https://doi.org/10.3390/biom12030372 - 26 Feb 2022
Cited by 2 | Viewed by 2740
Abstract
Vascular endothelial growth factors (VEGFs) are the key regulators of blood and lymphatic vessels’ formation and function. Each of the proteins from the homologous family VEGFA, VEGFB, VEGFC and VEGFD employs a core cysteine-knot structural domain for the specific interaction with one or more of the cognate tyrosine kinase receptors. Additional diversity is exhibited by the involvement of neuropilins–transmembrane co-receptors, whose b1 domain contains the binding site for the C-terminal sequence of VEGFs. Although all relevant isoforms of VEGFs that interact with neuropilins contain the required C-terminal Arg residue, there is selectivity of neuropilins and VEGF receptors for the VEGF proteins, which is reflected in the physiological roles that they mediate. To decipher the contribution made by the C-terminal sequences of the individual VEGF proteins to that functional differentiation, we determined structures of molecular complexes of neuropilins and VEGF-derived peptides and examined binding interactions for all neuropilin-VEGF pairs experimentally and computationally. While X-ray crystal structures and ligand-binding experiments highlighted similarities between the ligands, the molecular dynamics simulations uncovered conformational preferences of VEGF-derived peptides beyond the C-terminal arginine that contribute to the ligand selectivity of neuropilins. The implications for the design of the selective antagonists of neuropilins’ functions are discussed.

14 pages, 1363 KiB  
Article
Prediction and Modeling of Protein–Protein Interactions Using “Spotted” Peptides with a Template-Based Approach
by Chiara Gasbarri, Serena Rosignoli, Giacomo Janson, Dalila Boi and Alessandro Paiardini
Biomolecules 2022, 12(2), 201; https://doi.org/10.3390/biom12020201 - 25 Jan 2022
Cited by 3 | Viewed by 3559
Abstract
Protein–peptide interactions (PpIs) are a subset of the overall protein–protein interaction (PPI) network in the living cell and are pivotal for the majority of cell processes and functions. High-throughput methods to detect PpIs and PPIs usually require time and costs that are not always affordable. Therefore, reliable in silico predictions represent a valid and effective alternative. In this work, a new algorithm is described, implemented in a freely available tool, i.e., “PepThreader”, to carry out PPIs and PpIs prediction and analysis. PepThreader threads multiple fragments derived from a full-length protein sequence (or from a peptide library) onto a second template peptide, in complex with a protein target, “spotting” the potential binding peptides and ranking them according to a sequence-based and structure-based threading score. The threading algorithm first makes use of a scoring function that is based on peptide sequence similarity. Then, a rerank of the initial hits is performed, according to structure-based scoring functions. PepThreader has been benchmarked on a dataset of 292 protein–peptide complexes that were collected from existing databases of experimentally determined protein–peptide interactions. An accuracy of 80%, when considering the top predicted 25 hits, was achieved, which performs in a comparable way with the other state-of-the-art tools in PPIs and PpIs modeling. Nonetheless, PepThreader is unique in that it is able at the same time to spot a binding peptide within a full-length sequence involved in PPI and model its structure within the receptor. Therefore, PepThreader adds to the already-available tools supporting the experimental PPIs and PpIs identification and characterization.

18 pages, 1079 KiB  
Review
Looking at COVID-19 from a Systems Biology Perspective
by Emily Samuela Turilli, Marta Lualdi and Mauro Fasano
Biomolecules 2022, 12(2), 188; https://doi.org/10.3390/biom12020188 - 22 Jan 2022
Cited by 3 | Viewed by 6219
Abstract
The sudden outbreak and worldwide spread of the SARS-CoV-2 pandemic pushed the scientific community to find fast solutions to cope with the health emergency. COVID-19 complexity, in terms of clinical outcomes, severity, and response to therapy, suggested the use of multifactorial strategies, characteristic of network medicine, to approach the study of the pathobiology. Proteomics and interactomics especially allow the generation of datasets that, reduced and represented in the form of networks, can be analyzed with the tools of systems biology to unveil specific pathways central to virus–human host interaction. Moreover, artificial intelligence tools can be implemented for the identification of druggable targets and drug repurposing. In this review article, we provide an overview of the results obtained so far, from a systems biology perspective, in the understanding of COVID-19 pathobiology and virus–host interactions, and in the development of disease classifiers and tools for drug repurposing.

17 pages, 9357 KiB  
Article
Determination of the Amino Acid Recruitment Order in Early Life by Genome-Wide Analysis of Amino Acid Usage Bias
by Mingxiao Zhao, Ruofan Ding, Yan Liu, Zhiliang Ji and Yufen Zhao
Biomolecules 2022, 12(2), 171; https://doi.org/10.3390/biom12020171 - 21 Jan 2022
Cited by 5 | Viewed by 3036
Abstract
The mechanisms shaping the pattern of amino acid recruitment into proteins in early life history presently remain a huge mystery. In this study, we conducted genome-wide analyses of amino acid usage and genetic codon structure in 7270 species across three domains of life. The analyses evidenced ubiquitous usage bias of amino acids that was likely independent from codon usage bias. Taking advantage of codon usage bias, we performed pseudotime analysis to re-determine the chronological order of the species emergence, which inspired a new species relationship by tracing the imprint of codon usage evolution. Furthermore, the multidimensional data integration showed that the amino acids A, D, E, G, L, P, R, S, T and V might be the first recruited into the last universal common ancestor (LUCA) proteins. The data analysis also indicated that the remaining amino acids most probably were gradually incorporated into the proteogenesis process in the course of two long-timescale parallel evolutionary routes: I→F→Y→C→M→W and K→N→Q→H. This study provides new insight into the origin of life, particularly in terms of the basic protein composition of early life. Our work provides crucial information that will help in a further understanding of protein structure and function in relation to their evolutionary history.

13 pages, 1428 KiB  
Article
Deregulation of Trace Amine-Associated Receptors (TAAR) Expression and Signaling Mode in Melanoma
by Anastasia N. Vaganova, Savelii R. Kuvarzin, Anastasia M. Sycheva and Raul R. Gainetdinov
Biomolecules 2022, 12(1), 114; https://doi.org/10.3390/biom12010114 - 11 Jan 2022
Cited by 5 | Viewed by 2515
Abstract
Trace amine-associated receptors (TAARs) interact with amine compounds called “trace amines”, which are present in tissues at low concentrations. TAAR expression in neoplastic tumors was recently reported. In this study, TAAR expression was analyzed in public RNAseq datasets of nevi and melanoma samples and compared to the expression of dopamine receptors (DRDs), which are known to be involved in melanoma pathogenesis. All DRDs and TAARs were found to be expressed in nevi at comparable levels. Differential expression analysis demonstrated a drastic decrease of TAAR1, TAAR2, TAAR5, TAAR6, and TAAR8 expression in melanomas compared to benign nevi, with only TAAR6, TAAR8, and TAAR9 remaining detectable in malignant tumors. No association between TAAR expression levels and melanoma clinicopathological characteristics was observed. Genes co-expressed with TAARs in melanoma and nevi were selected by correlation values for comparative pathway enrichment analysis between malignant and benign neoplasia. The co-expression of TAARs with genes involved in neurotransmitter signaling was found to be lost in melanoma, and a tumor-specific association of TAAR6 expression with the mTOR pathway and inflammatory signaling was observed. TAARs may therefore have certain functions in melanoma pathogenesis, the significance of which for tumor progression is yet to be understood.

2021

12 pages, 1303 KiB  
Article
Accurate Sequence-Based Prediction of Deleterious nsSNPs with Multiple Sequence Profiles and Putative Binding Residues
by Ruiyang Song, Baixin Cao, Zhenling Peng, Christopher J. Oldfield, Lukasz Kurgan, Ka-Chun Wong and Jianyi Yang
Biomolecules 2021, 11(9), 1337; https://doi.org/10.3390/biom11091337 - 9 Sep 2021
Viewed by 2773
Abstract
Non-synonymous single nucleotide polymorphisms (nsSNPs) may result in pathogenic changes that are associated with human diseases. Accurate prediction of these deleterious nsSNPs is in high demand. The existing predictors of deleterious nsSNPs secure modest levels of predictive performance, leaving room for improvement. We propose a new sequence-based predictor, DMBS, which addresses the need to improve predictive quality. The design of DMBS relies on the observation that deleterious mutations are likely to occur at highly conserved and functionally important positions in the protein sequence. Correspondingly, we introduce two innovative components. First, we improve the estimates of conservation computed from multiple sequence profiles based on two complementary databases and two complementary alignment algorithms. Second, we utilize putative annotations of functional/binding residues produced by two state-of-the-art sequence-based methods. These inputs are processed by a random forest model that provides favorable predictive performance when empirically compared against five other machine-learning algorithms. Empirical results on four benchmark datasets reveal that DMBS achieves AUC > 0.94, outperforming current methods, including protein structure-based approaches. In particular, DMBS secures AUC = 0.97 for the SNPdbe and ExoVar datasets, compared to AUC = 0.70 and 0.88, respectively, obtained by the best available methods. Further tests on the independent HumVar dataset show that our method significantly outperforms the state-of-the-art method SNPdryad. We conclude that DMBS provides accurate predictions that can effectively guide wet-lab experiments in a high-throughput manner.

16 pages, 3639 KiB  
Article
GNAi2/gip2-Regulated Transcriptome and Its Therapeutic Significance in Ovarian Cancer
by Ji Hee Ha, Muralidharan Jayaraman, Mingda Yan, Padmaja Dhanasekaran, Ciro Isidoro, Yong Sang Song and Danny N. Dhanasekaran
Biomolecules 2021, 11(8), 1211; https://doi.org/10.3390/biom11081211 - 14 Aug 2021
Cited by 10 | Viewed by 4196
Abstract
Increased expression of GNAi2, which encodes the α-subunit of G-protein i2, has been correlated with the late-stage progression of ovarian cancer. GNAi2, also referred to as the proto-oncogene gip2, transduces signals from lysophosphatidic acid (LPA)-activated LPA receptors to oncogenic cellular responses in ovarian cancer cells. To identify the oncogenic program activated by gip2, we carried out microarray-based transcriptomic and bioinformatic analyses using the ovarian cancer cell line SKOV3, in which the expression of GNAi2/gip2 was silenced by specific shRNA. A cut-off value of 5-fold change in gene expression (p < 0.05) indicated that a total of 264 genes were dependent upon gip2 expression, with 136 genes coding for functional proteins. Functional annotation of the transcriptome indicated the hitherto unknown role of gip2 in stimulating the expression of oncogenic/growth-promoting genes such as KDR/VEGFR2, CCL20, and VIP. The array results were further validated in a panel of High-Grade Serous Ovarian Carcinoma (HGSOC) cell lines that included Kuramochi, OVCAR3, and OVCAR8 cells. Gene set enrichment analyses using the DAVID, STRING, and Cytoscape applications indicated the potential role of the gip2-stimulated transcriptomic network in the upregulation of cell proliferation, adhesion, migration, cellular metabolism, and therapy resistance. The results unravel a multi-modular network in which the hub and bottleneck nodes are defined by ACKR3/CXCR7, IL6, VEGFA, CYCS, COX5B, UQCRC1, UQCRFS1, and FYN. The identification of these genes as the critical nodes in the GNAi2/gip2-orchestrated onco-transcriptome establishes their role in ovarian cancer pathophysiology. In addition, these results point to these nodes as potential targets for novel therapeutic strategies.
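The selection rule quoted in the abstract above (a 5-fold change in expression with p < 0.05) can be sketched as a simple filter. This is only an illustration under stated assumptions: the function `select_dependent_genes`, the gene labels, and the expression values are hypothetical, and whether the authors treated the 5-fold threshold as inclusive or used a different statistical procedure is not stated in the abstract.

```python
def select_dependent_genes(records, min_fold=5.0, max_p=0.05):
    """Keep genes whose expression changes at least `min_fold`-fold
    in either direction with p-value below `max_p`.

    `records` holds (gene, control_expr, silenced_expr, p_value) tuples.
    Hypothetical helper; the inclusive fold threshold is an assumption.
    """
    selected = []
    for gene, control, silenced, p_value in records:
        if control <= 0 or silenced <= 0:
            continue  # a fold change is undefined for non-positive values
        ratio = control / silenced
        fold = max(ratio, 1.0 / ratio)  # direction-independent fold change
        if fold >= min_fold and p_value < max_p:
            selected.append(gene)
    return selected


# Made-up records, not data from the study.
toy_records = [
    ("GENE_A", 100.0, 10.0, 0.01),   # 10-fold down, significant
    ("GENE_B", 100.0, 50.0, 0.001),  # only 2-fold
    ("GENE_C", 10.0, 80.0, 0.20),    # 8-fold up, not significant
    ("GENE_D", 10.0, 60.0, 0.03),    # 6-fold up, significant
]
dependent = select_dependent_genes(toy_records)
```

In this toy input only `GENE_A` and `GENE_D` pass both criteria, which mirrors how a combined fold-change and significance cut-off narrows a transcriptome to a candidate gene list.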

17 pages, 2750 KiB  
Article
Deep Learning for Novel Antimicrobial Peptide Design
by Christina Wang, Sam Garlick and Mire Zloh
Biomolecules 2021, 11(3), 471; https://doi.org/10.3390/biom11030471 - 22 Mar 2021
Cited by 50 | Viewed by 8380
Abstract
Antimicrobial resistance is an increasing issue in healthcare as the overuse of antibacterial agents rises during the COVID-19 pandemic. The need for new antibiotics is high, while the arsenal of available agents is decreasing, especially for the treatment of infections by Gram-negative bacteria such as Escherichia coli. Antimicrobial peptides (AMPs) offer a promising route for novel antibiotic development, and deep learning techniques can be utilised for successful AMP design. In this study, a long short-term memory (LSTM) generative model and a bidirectional LSTM classification model were constructed to design short novel AMP sequences with potential antibacterial activity against E. coli. Two versions of the generative model and six versions of the classification model were trained and optimised using Bayesian hyperparameter optimisation. These models were used to generate sets of short novel sequences that were classified as antimicrobial or non-antimicrobial. The validation accuracies of the classification models were 81.6–88.9%, and the novel AMPs were classified as antimicrobial with accuracies of 70.6–91.7%. Predicted three-dimensional conformations of selected short AMPs exhibited an alpha-helical structure with amphipathic surfaces. This demonstrates that LSTMs are effective tools for generating novel AMPs against targeted bacteria and could be utilised in the search for new antibiotic leads.
