1. Introduction
Proteins are macromolecules that are essential components of every living organism. Humans obtain the necessary quantity and quality of protein through the food they eat. Many studies have shown that some proteins contain specific amino acid sequences (peptides) that, once released from the native structure, have the potential to benefit or harm specific functions or systems in the human body [1]. Commonly reported activities include antihypertensive [2], antioxidant [3], antimicrobial [4], cholesterol-lowering [5], and immunomodulatory [6] activities. Practically every protein can contain biologically active peptide (BAP) sequences; once identified, such sequences can also be synthesized artificially.
Bioinformatics and chemoinformatics tools can be used to design peptides with new functions that have applications in the food industry, cosmetics, pharmacology, and medicine. These tools rely on biological databases.
The pursuit of healthy eating has led to growing interest in biologically active food-derived peptides (FDBP). This, in turn, has increased the demand for information about this type of peptide and, accordingly, has led to the creation of databases containing such information. A literature search was carried out in the PubMed, Web of Science, and Elsevier ScienceDirect databases using several keywords: “food-derived peptides” (19,443, 16,793, and 7011 results, respectively), “bioactive peptides from food” (5846, 4181, and 5145 results, respectively), and “bioactive peptides database” (632, 2635, and 696 results, respectively). Despite this volume of literature, only a limited number of open-access databases currently focus on bioactive peptides derived from foods: BIOPEP-UWM [7,8], APD3 [9], AHTPDB [10], PlantPepDB [11], FeptideDB [12], CancerPPD [13], BioPepDB [14], MBPDB [15], DFBP [16], and FermFooDb [17]. The study of peptide sequences by in silico or in vitro methods requires the availability of correct FDBP data. A review of publicly available databases and data sets revealed the following shortcomings:
- some public databases duplicate information across records;
- some public databases contain contradictory and even incorrect information about bioactive peptides;
- different databases report different biological activities for the same peptide, so it is beneficial to collect the information about all of its studied activities in one place;
- information on the physicochemical properties of peptides is insufficient, and the available values often contain errors in some indicators;
- with few exceptions, it is not possible to extract information from an arbitrary peptide sequence or to present it graphically;
- not all of the provided tools work correctly, and some tools cannot be accessed at all;
- existing peptide databases are not optimized for search engines, so entering a peptide sequence into an Internet search engine, in either one-letter or three-letter format, returns no information about it.
To overcome these shortcomings, the web-based open-access platform PepLab (www.pep-lab.info, accessed on 4 December 2022) was created. It includes a database of biologically active peptides extracted from food and software tools for their processing and analysis.
PepLab’s database contains several thousand peptide sequence records; the records are updated and new peptides are added on a daily basis. Each peptide has been checked against available biological activity databases and the scientific literature.
PepLab also includes the DMPep module for data mining analysis of the physicochemical properties of a peptide or a set of peptides. DMPep extracts information on multiple physicochemical characteristics. All data are presented both textually and graphically and can be downloaded for further processing by researchers. The development of additional modules for the PepLab platform is planned, for example a module for predicting biological activity using artificial intelligence methods.
The main advantages of the PepLab platform are that it has a convenient and intuitive user interface, has a responsive design that allows it to be used on devices with different screen resolutions, and is optimized for search engines in order to improve its visibility on the Internet. The database does not allow a peptide sequence that already exists in it to be recorded again. In addition, users can download data and results in a format convenient for them (text and/or graphic). The advantages of the PepLab platform and its comparison with the databases cited above are summarized in Table 1.
PepLab is designed mainly as a platform where researchers can find, in one place, correct and clearly visualized information about FDBP together with additional tools for peptide sequence analysis.
2. PepLab Platform Description
The database of the PepLab platform contains 2764 unique peptide sequences (last checked on 4 December 2022), and the information in it is constantly updated. According to the classification of databases presented in [18], the PepLab database is primary and community-curated. Primary means that it contains information about the structure and/or sequence of a given molecule, while community-curated means that the database curators work as an integrated and collaborative group of professionals.
Peptide sequences in the PepLab database are divided into 15 groups according to their biological activity. In addition, a group of multifunctional peptides was created for peptides that have more than one type of biological activity. These groups were chosen for several reasons: they are the most extensively studied, enough public data are available for them, and they are the most widely used, as they serve in the treatment and prevention of the most common diseases. Peptide sequences with biological activities other than those included in PepLab also exist. However, public data for them are limited (fewer than 10 peptides of a given class). Such activities can be added to the database at a later stage.
Peptide sequences are also classified according to the source from which they were derived. In this case, seven large groups were formed—terrestrial animal, terrestrial plant, marine organisms, human, milk, microorganisms, and synthetic. The name of the specific organism from which the sequence was derived is recorded in the peptide information card.
Peptide sequences included in the PepLab database were manually verified by the team against the resources of the National Center for Biotechnology Information at the US National Library of Medicine [19]; UniProt, the world’s largest protein and peptide information resource [20]; and scientific publications indexed in Scopus and Web of Science. If the information on a given amino acid sequence cannot be verified, that sequence is not added to the PepLab database. Each peptide in the activity categories is listed only with the activity demonstrated in the scientific literature, and sequences with more than one activity are placed in a separate category, multifunctional peptides. Synthetic biomolecules whose activities and characteristics largely duplicate the properties of FDBP have also been reported.
The data are stored in a MySQL 8.0 database, and PHP 7.4, HTML 5, CSS 3, JavaScript, and the Bootstrap 5.1 framework are used for processing and visualization. The design is fully responsive and uses a multi-column layout, so the information is displayed well both on large screens and on small-screen devices such as tablets and phones. Charts are implemented using the Highcharts (highcharts.com, accessed on 4 December 2022) graphics library, and users can export them to the most popular graphic formats.
2.1. PepLab Architecture
The PepLab platform is built from several modules that implement the main functionalities of the application for the input, output, processing, and presentation of information. These are the input data, DMPep, statistical, download, output data, and supplementing database data modules described in Sections 2.1.1 to 2.1.6.
A module for predicting the biological activity of peptides using artificial intelligence techniques and methods is under development.
The PepLab platform is based on a standard three-tier model comprising the web presentation layer, the application layer, and the database layer (Figure 1).
2.1.1. Input Data Module
The input data module accepts the data entered by the user and passes it to the application layer, where it is processed. Information can be submitted to the PepLab platform at four entry points: when searching for information in the database, when analyzing a peptide sequence or a dataset of peptides, when proposing a new peptide for addition, and when using the contact form to reach the team.
2.1.2. DMPep Module
The DMPep module provides a complex analysis of the physicochemical characteristics of a peptide sequence. It accepts data from the input data module for a single peptide or a dataset of peptides and extracts peptide length, molecular weight, hydrophobicity, grand average of hydropathy index (GRAVY) [21], aliphatic index [22], protein-binding potential index (Boman index) [23], acidity, polarity, isoelectric point, and net charge, as well as atomic distribution, amino acid composition (AAC), and grouped AAC. The results are passed to the output data module for visualization in text or graphic format, as well as to the download module for generating spreadsheet (CSV) files.
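Some of the descriptors listed above follow directly from published definitions. The sketch below is only an illustration, not the DMPep implementation: it computes length, AAC, GRAVY as the mean Kyte and Doolittle hydropathy value [21], and the aliphatic index of Ikai [22]. The function names and the restriction to the 20 standard one-letter codes are assumptions made for this example.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids [21].
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def validate(sequence):
    """Return the upper-cased sequence or raise if it is empty or non-standard."""
    seq = sequence.strip().upper()
    if not seq or any(residue not in KYTE_DOOLITTLE for residue in seq):
        raise ValueError(f"not a standard one-letter peptide sequence: {sequence!r}")
    return seq


def amino_acid_composition(sequence):
    """Fraction of each residue type that occurs in the sequence (AAC)."""
    seq = validate(sequence)
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in sorted(counts)}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = validate(sequence)
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)


def aliphatic_index(sequence):
    """Aliphatic index [22]: 100 * (x_Ala + 2.9 x_Val + 3.9 (x_Ile + x_Leu))."""
    seq = validate(sequence)
    n = len(seq)
    x = {aa: seq.count(aa) / n for aa in "AVIL"}  # mole fractions
    return 100.0 * (x["A"] + 2.9 * x["V"] + 3.9 * (x["I"] + x["L"]))


def describe(sequence):
    """Collect the descriptors of one peptide in a dictionary."""
    seq = validate(sequence)
    return {
        "sequence": seq,
        "length": len(seq),
        "aac": amino_acid_composition(seq),
        "gravy": gravy(seq),
        "aliphatic_index": aliphatic_index(seq),
    }


if __name__ == "__main__":
    print(describe("IPP"))
```

Descriptors that depend on a chosen parameter set, such as the isoelectric point and net charge (pKa values) or the molecular weight (residue masses), are left out of the sketch because the values used by DMPep are not stated here.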
2.1.3. Statistical Module
The statistical module displays summary information about the peptides available in the database. The user can choose between data for all peptides or only for a specific type of biological activity. For the selected criterion, minimum and maximum values are calculated, as well as the frequency distributions of amino acid composition (AAC), molecular mass, isoelectric point, hydrophobicity index (GRAVY), aliphatic index, and protein-binding potential (Boman index). The data are grouped in compliance with standard statistical data grouping rules. The number of classes differs for peptides with different biological activities, as it depends directly on the sample size (i.e., the number of peptide sequences with the corresponding activity), according to Sturges’ rule [24]:
$$k = 1 + \log_2 n \approx 1 + 3.322 \log_{10} n$$
where k is the number of groups (classes) and n is the sample size.
The obtained results are transmitted to the output data module for visualization.
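As an illustration of the grouping step, the sketch below derives the number of classes from Sturges’ rule [24] and counts values in equal-width classes. Rounding k up to the next integer and the function names are assumptions of this example; the rounding convention used by the platform is not stated.

```python
import math


def sturges_classes(n):
    """Number of classes k = 1 + log2(n), rounded up (rounding is an assumption)."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return math.ceil(1 + math.log2(n))


def class_intervals(values):
    """Split values into k equal-width classes and count the members of each."""
    k = sturges_classes(len(values))
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # fall back to 1.0 when all values are equal
    edges = [lo + i * width for i in range(k + 1)]
    counts = [0] * k
    for value in values:
        # The maximum value falls on the last edge, so clamp it into the last class.
        index = min(int((value - lo) / width), k - 1)
        counts[index] += 1
    return edges, counts


if __name__ == "__main__":
    edges, counts = class_intervals([1, 2, 3, 4, 5, 6, 7, 8])
    print(edges, counts)
```

Under this rounding convention, a sample of 2764 sequences would be grouped into 13 classes, while a smaller activity group of 64 sequences would be grouped into 7.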
2.1.4. Download Module
The download module belongs to the application layer of the PepLab platform architecture. It accepts input data from the DMPep and statistical modules. Based on them, it creates data files either in FASTA, the most popular format in bioinformatics, or in CSV, a tabular format convenient for processing with spreadsheet and data analysis software.
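The two export formats can be sketched as follows. This is a minimal example under stated assumptions rather than the platform's code: the record fields (`id`, `sequence`, `activity`), the identifier `PL0001`, and the layout of the FASTA header line are hypothetical.

```python
import csv
import io


def to_fasta(records, line_width=60):
    """Serialize records as FASTA: a '>' header line followed by the sequence."""
    lines = []
    for record in records:
        lines.append(f">{record['id']} {record['activity']}")
        sequence = record["sequence"]
        # Wrap long sequences to a fixed line width.
        for start in range(0, len(sequence), line_width):
            lines.append(sequence[start:start + line_width])
    return "\n".join(lines) + "\n"


def to_csv(records, fields=("id", "sequence", "activity")):
    """Serialize the chosen fields of each record as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(fields), extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical record used only to show the two output formats.
    sample = [{"id": "PL0001", "sequence": "IPP", "activity": "antihypertensive"}]
    print(to_fasta(sample), end="")
    print(to_csv(sample), end="")
```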
2.1.5. Output Data Module
The output data module provides the platform user with the results of their query. Data can arrive here from all modules of the application layer, with the exception of the supplementing database data module. The results are visualized in text or graphic form, and the files generated by the download module can also be downloaded. The graphic presentation is dynamic, which allows additional information to be displayed when hovering over a chart. A full-screen option is provided, as well as the possibility to download the charts in various raster and vector formats.
2.1.6. Supplementing Database Data Module
The supplementing database data module fills in missing information in database records. When a new peptide is recorded in the database, part of the information about it is missing. After the primary information has been entered, this module is run to add the missing information to the database using the capabilities of DMPep.
Table 2 shows which parts of the peptide information card come from the scientific literature and which are extracted by the DMPep module.
2.2. PepLab Platform Functionalities
The peptide information card is accessed through the Database menu, which provides two options: Browse and Search. The Browse option provides general information about the available data and three browsing options, by activity, by source, and by length, each with a corresponding graph showing the share of each group. For each of them, the information is visualized in a similar way.
Figure 2 shows the workflow of the Browse by Activity option, illustrated by screenshots of the pages leading to a single peptide information card. After an activity is selected, the results are displayed page by page, and through the Details option the user can open the complete information for the selected peptide sequence.
The second way to access information is through the Search option. The search can be carried out by the following parameters: ID, sequence, length, activity, and organism. After a parameter is selected from the drop-down menu and a value is entered for it, the results are displayed in the same way, and from there the user can go to the information card of the peptide (Figure 3).
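A search by one of these parameters can be sketched as a parameterized query. The example below is illustrative only: it uses Python's built-in sqlite3 module as a stand-in for the MySQL and PHP stack of the platform, and the table layout, column names, and sample rows are hypothetical. A uniqueness constraint on the sequence column shows one possible way to reject a sequence that is already recorded.

```python
import sqlite3

# Columns a user may search by; the field name is checked against this
# whitelist because column names cannot be passed as bound parameters.
SEARCHABLE = {"id", "sequence", "length", "activity", "organism"}


def create_db():
    """Create an in-memory database with a hypothetical peptide table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE peptide (
               id INTEGER PRIMARY KEY,
               sequence TEXT NOT NULL UNIQUE,
               length INTEGER NOT NULL,
               activity TEXT NOT NULL,
               organism TEXT NOT NULL
           )"""
    )
    return conn


def add_peptide(conn, sequence, activity, organism):
    """Insert a peptide; return False if the sequence is already recorded."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO peptide (sequence, length, activity, organism) "
                "VALUES (?, ?, ?, ?)",
                (sequence, len(sequence), activity, organism),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def search(conn, field, value):
    """Return all rows whose chosen field equals the given value."""
    if field not in SEARCHABLE:
        raise ValueError(f"unsupported search field: {field!r}")
    query = (
        "SELECT id, sequence, length, activity, organism "
        f"FROM peptide WHERE {field} = ? ORDER BY id"
    )
    return conn.execute(query, (value,)).fetchall()


if __name__ == "__main__":
    conn = create_db()
    add_peptide(conn, "IPP", "antihypertensive", "Bos taurus")
    add_peptide(conn, "VPP", "antihypertensive", "Bos taurus")
    print(search(conn, "length", 3))
```

The value is always passed as a bound parameter, and only the whitelisted column name is interpolated into the query text.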
The PepLab platform provides users with the opportunity to download available data on peptides with certain biological activity in tabular or FASTA format. This is done through the Downloads menu.
The input screen of the DMPep physicochemical feature extraction tool allows the values for a single peptide to be visualized and downloaded in tabular format (Figure 4a), and the same features to be calculated for an entire dataset supplied by the user. In the latter case, the output comprises the number of peptides in the dataset; the minimum and maximum values of molecular weight, length, isoelectric point, grand average of hydropathy index (GRAVY), aliphatic index, and protein-binding potential index; and a file with the calculated values for all peptides (Figure 4b).
Summary statistics for all database records are presented in the Statistics menu. Six graphs show the frequency distribution of amino acids and the distributions of molecular weight, pI, GRAVY, aliphatic index, and Boman index (Figure 5). This information can be displayed both for all peptides and for any available biological activity.
For feedback to the team that developed the PepLab platform, the form in the contact menu can be used. In addition, researchers have the possibility to propose a peptide sequence for addition to the database. For this purpose, the submit peptide option is used. The PepLab platform contains a database focused primarily on FDBP, but as we plan to expand its capabilities, any proposed peptide with credible scientific information would be useful to us.
3. Conclusions and Future Work
This article introduces the open-access web-based platform PepLab. Currently, it includes a database of peptide sequences extracted from foods and the DMPep tool, which is used to calculate the theoretical values of the physicochemical characteristics of peptides. The architecture, structure, and functionalities of the PepLab platform and the way to work with it are demonstrated. Currently, there are 2764 peptide entries in the database, distributed in sixteen categories according to their biological activity and in seven categories according to the source from which they were obtained.
Among the main advantages of the platform are its convenient and intuitive user interface and responsive design. PepLab is optimized for search engines to improve its visibility on the Internet. In addition, users can download data and results in a format convenient for them (text and/or graphic).
In the future, the development of additional modules has been planned; an example of such a development is a module for predicting the biological activity of peptides using artificial intelligence methods. In addition, a system for registering users with different roles and with different levels of access will be implemented. A news module will also be created with the purpose of keeping researchers informed about new developments in the platform. A smooth transition to structural bioinformatics is also envisaged, i.e., inclusion in the database of images of peptide sequences, and their recognition with deep learning algorithms.
Author Contributions
Conceptualization, Z.T. and M.T.; methodology, Z.T.; software, Z.T. and M.T.; validation, A.K., M.T., Z.T., D.M., E.H., I.D. and S.H.; formal analysis, A.K., D.M. and I.D.; investigation, S.H. and E.H.; resources, A.K., D.M., S.H., I.D. and E.H.; data curation, Z.T., A.K., M.T., S.H., E.H., I.D. and D.M.; writing—original draft preparation, Z.T. and M.T.; writing—review and editing, S.H., I.D. and A.K.; visualization, E.H. and D.M.; supervision, M.T.; project administration, M.T.; funding acquisition, M.T. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Science Fund of the University of Food Technologies (UFT), Plovdiv, Bulgaria, grant No. 08/21-H, and the Bulgarian Scientific Fund, grant No. KP-06-M36/2.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The PepLab platform is freely available at the following website: www.pep-lab.info (accessed on 4 December 2022).
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Karami, Z.; Akbari-Adergani, B. Bioactive food derived peptides: A review on correlation between structure of bioactive peptides and their functional properties. J. Food Sci. Technol. 2019, 56, 535–547.
2. Okagu, I.U.; Ezeorba, T.P.; Aham, E.C.; Aguchem, R.N.; Nechi, R.N. Recent findings on the cellular and molecular mechanisms of action of novel food-derived antihypertensive peptides. Food Chem. Mol. Sci. 2022, 4, 100078.
3. Nwachukwu, I.D.; Aluko, R.E. Structural and functional properties of food protein-derived antioxidant peptides. J. Food Biochem. 2019, 43, e12761.
4. Magana, M.; Pushpanathan, M.; Santos, A.L.; Leanse, L.; Fernandez, M.; Ioannidis, A.; Giulianotti, M.A.; Apidianakis, Y.; Bradfute, S.; Ferguson, A.L.; et al. The value of antimicrobial peptides in the age of resistance. Lancet Infect. Dis. 2020, 20, e216–e230.
5. Boachie, R.; Yao, S.; Udenigwe, C.C. Molecular mechanisms of cholesterol-lowering peptides derived from food proteins. Curr. Opin. Food Sci. 2018, 20, 58–63.
6. Lee, J.H.; Paik, H.D. Anticancer and immunomodulatory activity of egg proteins and peptides: A review. Poult. Sci. 2019, 98, 6505–6516.
7. Minkiewicz, P.; Iwaniak, A.; Darewicz, M. BIOPEP-UWM database of bioactive peptides: Current opportunities. Int. J. Mol. Sci. 2019, 20, 5978.
8. Minkiewicz, P.; Iwaniak, A.; Darewicz, M. BIOPEP-UWM Virtual—A Novel Database of Food-Derived Peptides with In Silico-Predicted Biological Activity. Appl. Sci. 2022, 12, 7204.
9. Wang, G.; Li, X.; Wang, Z. APD3: The antimicrobial peptide database as a tool for research and education. Nucleic Acids Res. 2016, 44, D1087–D1093.
10. Kumar, R.; Chaudhary, K.; Sharma, M.; Nagpal, G.; Chauhan, J.S.; Singh, S.; Gautam, A.; Raghava, G.P. AHTPDB: A comprehensive platform for analysis and presentation of antihypertensive peptides. Nucleic Acids Res. 2015, 43, D956–D962.
11. Das, D.; Jaiswal, M.; Khan, F.N.; Ahamad, S.; Kumar, S. PlantPepDB: A manually curated plant peptide database. Sci. Rep. 2020, 10, 2194.
12. Panyayai, T.; Ngamphiw, C.; Tongsima, S.; Mhuantong, W.; Limsripraphan, W.; Choowongkomon, K.; Sawatdichaikul, O. FeptideDB: A web application for new bioactive peptides from food protein. Heliyon 2019, 5, e02076.
13. Tyagi, A.; Tuknait, A.; Anand, P.; Gupta, S.; Sharma, M.; Mathur, D.; Joshi, A.; Singh, S.; Gautam, A.; Raghava, G.P. CancerPPD: A database of anticancer peptides and proteins. Nucleic Acids Res. 2015, 43, D837–D843.
14. Li, Q.; Zhang, C.; Chen, H.; Xue, J.; Guo, X.; Liang, M.; Chen, M. BioPepDB: An integrated data platform for food-derived bioactive peptides. Int. J. Food Sci. Nutr. 2018, 69, 963–968.
15. Nielsen, S.D.; Beverly, R.L.; Qu, Y.; Dallas, D.C. Milk bioactive peptide database: A comprehensive database of milk protein-derived bioactive peptides and novel visualization. Food Chem. 2017, 232, 673–682.
16. Qin, D.; Bo, W.; Zheng, X.; Hao, Y.; Li, B.; Zheng, J.; Liang, G. DFBP: A comprehensive database of food-derived bioactive peptides for peptidomics research. Bioinformatics 2022, 38, 3275–3280.
17. Chaudhary, A.; Bhalla, S.; Patiyal, S.; Raghava, G.P.; Sahni, G. FermFooDb: A database of bioactive peptides derived from fermented foods. Heliyon 2021, 7, e06668.
18. Iwaniak, A.; Darewicz, M.; Minkiewicz, P. Databases of bioactive peptides. In Biologically Active Peptides; Academic Press: Cambridge, MA, USA, 2021; pp. 309–330.
19. White, J. PubMed 2.0. Med. Ref. Serv. Q. 2020, 39, 382–387.
20. The UniProt Consortium. UniProt: The universal protein knowledgebase in 2021. Nucleic Acids Res. 2021, 49, D480–D489.
21. Kyte, J.; Doolittle, R.F. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 1982, 157, 105–132.
22. Ikai, A. Thermostability and aliphatic index of globular proteins. J. Biochem. 1980, 88, 1895–1898.
23. Boman, H.G. Antibacterial peptides: Basic facts and emerging concepts. J. Intern. Med. 2003, 254, 197–215.
24. Sturges, H.A. The choice of a class interval. J. Am. Stat. Assoc. 1926, 21, 65–66.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).