Article
Peer-Review Record

iORandLigandDB: A Website for Three-Dimensional Structure Prediction of Insect Odorant Receptors and Docking with Odorants

Insects 2023, 14(6), 560; https://doi.org/10.3390/insects14060560
by Shuo Jin, Kun Qian, Lin He and Zan Zhang *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 12 April 2023 / Revised: 28 May 2023 / Accepted: 9 June 2023 / Published: 15 June 2023
(This article belongs to the Special Issue Advance in Insect Chemosensory Receptors)

Round 1

Reviewer 1 Report

In this manuscript, Jin and colleagues present a database of predicted 3D structures for Odorant Receptors (OR). This is a much needed resource in the field that is also well structured, but could improve with a few minor tweaks.

Comments:
*) Line 51, “we constructed a website capable of predicting the three-dimensional structure…”. The website itself does not predict the structure, it is AlphaFold the one doing the predictions (as the authors state in the Methods section); please correct.
*) There are a bunch of typos scattered throughout the manuscript; the authors must carefully read the manuscript and correct them. Example: lines 173 and 179, “kinds of information” should read “types of information”. Also, in line 176, “Instatility Index” should read “Instability Index”; this must be replaced also in the database.
*) The Introduction should mention the current knowledge about the structure of OR proteins. There is only one experimentally-solved structure of a protein from the OR family, the Orco protein from Apocrypta bakeri (PMID:30111839).
*) Why was the set of OR sequences obtained directly from NCBI, even though it may contain fragments and predicted proteins, instead from a curated dataset of OR sequences? Examples in PMID:35627304. Also, how were these sequences obtained indeed? No procedure is detailed in Methods.
*) Lines 83-84: “A total of 151 sequences with amino acid deletion were corrected in the amino acid sequence library”; how were they initially identified?
*) The detailed explanation about the meaning of pLDDT scores (lines 191-198) should be moved to Methods.
*) In Figure 4 I think the data would be much better seen if they were plotted by compound and not by taxa. What I mean is having one panel per compound, instead of the current one panel per taxa. By doing that one would be able to compare the response to the same compound by different taxa, which is more difficult to do in the present form of the figure.
*) Is the text from line 311 to line 328 all part of the figure legend? It is in a smaller font than the text from line 330. If it is a legend, please shorten it.

Comments about the database itself:
*) Typo in Help section, “Q2: Sequneces”.
*) In Sequences section, on the bottom of the page, there is some text (next page?) in what I assume is chinese. Please translate.
*) I think it would be good to include a table or file to download with all the information about proteins and odors. As of now, one can only go protein by protein and access this information by clicking on “Browse” in the field Odors. Also, include either in the submission or in the database a link to a multifasta with all 4426 OR sequences used in the database.
*) There should be a consensus in nomenclature, because in a protein entry the authors refer to “Odors”, while on top of the page they let the users access information about “Ligand”, which are the odors themselves.

 


Author Response

In this manuscript, Jin and colleagues present a database of predicted 3D structures for Odorant Receptors (OR). This is a much needed resource in the field that is also well structured, but could improve with a few minor tweaks.

Comments:

*) Line 51, “we constructed a website capable of predicting the three-dimensional structure…”. The website itself does not predict the structure, it is AlphaFold the one doing the predictions (as the authors state in the Methods section); please correct.

Answer: Thanks for your suggestion, we have corrected it in the manuscript.

 

*) There are a bunch of typos scattered throughout the manuscript; the authors must carefully read the manuscript and correct them. Example: lines 173 and 179, “kinds of information” should read “types of information”. Also, in line 176, “Instatility Index” should read “Instability Index”; this must be replaced also in the database.

Answer: Thanks for your suggestion, we have corrected it in the manuscript.

 

*) The Introduction should mention the current knowledge about the structure of OR proteins. There is only one experimentally-solved structure of a protein from the OR family, the Orco protein from Apocrypta bakeri (PMID:30111839).

Answer: Thank you for your suggestion. We have added relevant content about the experimentally-solved structure in the manuscript.

 

*) Why was the set of OR sequences obtained directly from NCBI, even though it may contain fragments and predicted proteins, instead from a curated dataset of OR sequences? Examples in PMID:35627304. Also, how were these sequences obtained indeed? No procedure is detailed in Methods.

Answer: Thank you for your question. We agree that reliable data are the basis of a database. We searched the literature for experimentally determined insect odorant receptor sequences and downloaded the corresponding sequence files from NCBI, so we believe our dataset is also a curated dataset. The sequence information page shows that most sequences in the database are literature-based, and the 'paper' column in Appendix Table S1 lists the literature information. The sequences were obtained as follows: first, we searched for articles that experimentally determined insect olfactory receptor sequences; then, we took the reported insect olfactory receptor sequences from those articles; finally, we searched for and downloaded the corresponding sequence files from the NCBI database. We apologize for the unclear wording in the text and have added more details to the manuscript.

 

*) Lines 83-84: “A total of 151 sequences with amino acid deletion were corrected in the amino acid sequence library”; how were they initially identified?

Answer: Thank you for your question; we should clarify how the erroneous sequences were identified. AlphaFold2 mainly uses multiple sequence alignment (MSA) to integrate the structural and biological information of proteins into the deep learning algorithm (PMID: 34265844). During the multiple sequence alignment at the data input stage, sequences with amino acid deletions cause errors in AlphaFold2. We recorded these errors and identified the corresponding sequences. We then corrected the amino acid deletions in these sequences, as described in the manuscript.

 

*) The detailed explanation about the meaning of pLDDT scores (lines 191-198) should be moved to Methods.

Answer: Thanks for your suggestion, we have corrected it in the manuscript.

 

*) In Figure 4 I think the data would be much better seen if they were plotted by compound and not by taxa. What I mean is having one panel per compound, instead of the current one panel per taxa. By doing that one would be able to compare the response to the same compound by different taxa, which is more difficult to do in the present form of the figure.

Answer: Thank you for your suggestion. We have provided a docking binding energy diagram for each compound. However, due to space limitations, we have put the diagram in Appendix Figure 4s as a supplement to Figure 4. Please refer to Appendix Figure 4s for details.

 

*) Is the text from line 311 to line 328 all part of the figure legend? It is in a smaller font than the text from line 330. If it is a legend, please shorten it.

Answer: Thanks for your suggestion, we have corrected it in the manuscript.

 

 

Comments about the database itself:

*) Typo in Help section, “Q2: Sequneces”.

Answer: Thanks for your suggestion, we have corrected it in the manuscript.

 

*) In Sequences section, on the bottom of the page, there is some text (next page?) in what I assume is chinese. Please translate.

Answer: Thanks for your suggestion, we have corrected it on the website.

 

 

*) I think it would be good to include a table or file to download with all the information about proteins and odors. As of now, one can only go protein by protein and access this information by clicking on “Browse” in the field Odors. Also, include either in the submission or in the database a link to a multifasta with all 4426 OR sequences used in the database.

Answer: Thank you for your suggestion. We have added download links for the table and multifasta files in the 'Help' tab of the database.

 

*) There should be a consensus in nomenclature, because in a protein entry the authors refer to “Odors”, while on top of the page they let the users access information about “Ligand”, which are the odors themselves.

Answer: Thanks for your suggestion, we have unified the name "Ligand" in the website.

 

 

Comments on the Quality of English Language

There are a bunch of typos scattered throughout the manuscript; the authors must carefully read the manuscript and correct them. Example: lines 173 and 179, “kinds of information” should read “types of information”. Also, in line 176, “Instatility Index” should read “Instability Index”; this must be replaced also in the database.

Answer: Thanks for your suggestion, we have corrected it in the website.

Reviewer 2 Report

Binding affinities generated from the use of AutoDock Vina should be treated with care. Nguyen et al., Autodock Vina Adopts More Accurate Binding Poses but Autodock4 Forms Better Binding Affinity. J. Chem. Inf. Model. 2020, 60, 1, 204–211.

 

Line 199 states that “In total, 4426 protein three-dimensional structures were predicted by Alphafold2”. This, in combination with lines 210-211, is rather confusing. Insect odorant receptors are transmembrane heterotetrameric complexes (indicatively, see references 1 and 2 below). There is no mention of this anywhere in the submitted paper. I would be very surprised if the authors were unaware of this. It is reasonable to assume that the AlphaFold-generated structures are monomeric. Given that the ORs are ligand-gated ion channels, how was it determined where to position the docking box, and how were the dimensions/volume of the binding pocket calculated using fPocket?

 

Finally, it would be more than useful, if the authors provided examples of the generated models with the generated ligand-protein structures.

 

Mika, Benton. Olfactory Receptor Gene Regulation in Insects: Multiple Mechanisms for Singular Expression. Front Neurosci. 2021; 15: 738088

 

Konopka et al. Olfaction in Anopheles mosquitoes. Chemical Senses, 2021, Vol 46, 1–24 

Lines 44-50. The sentence needs restructuring. The words “contradiction” and “inefficient”. Probably, the authors mean that data acquisition has proceeded faster than the acquisition of biological knowledge.

 

Line 53. The word “achieve” is wrong in the context of the sentence. Probably, it is meant that the website allows the query of sequences, etc.

 

Lines 73-74. Does “structure prediction” precede the acquisition of secondary and tertiary structure analysis or the opposite?

 

Lines 76-80 are totally confusing. The past tense “….data were integrated into a ligand database, which was used to …..” is followed by future tense “When a three-dimensional structural model with higher precision is trained, iORandLigandDB will be fully updated.” Does this mean that the structural model has not been trained?

 

Line 119. The word “dewatering” can be replaced by a more appropriate term.

 

Line 175. Secondary structure rather than “Second structure”. This is obviously a typing error

Author Response

Binding affinities generated from the use of AutoDock Vina should be treated with care. Nguyen et al., Autodock Vina Adopts More Accurate Binding Poses but Autodock4 Forms Better Binding Affinity. J. Chem. Inf. Model. 2020, 60, 1, 204–211.

Answer: Thank you for raising this question. We will re-validate our results using Autodock4 to ensure the accuracy of our binding energies. The updated results will be provided in the second version of the website.

 

Line 199 states that “In total, 4426 protein three-dimensional structures were predicted by Alphafold2”. This, in combination with lines 210-211, is rather confusing. Insect odorant receptors are transmembrane heterotetrameric complexes (indicatively, see references 1 and 2 below). There is no mention of this anywhere in the submitted paper. I would be very surprised if the authors were unaware of this. It is reasonable to assume that the AlphaFold-generated structures are monomeric. Given that the ORs are ligand-gated ion channels, how was it determined where to position the docking box, and how were the dimensions/volume of the binding pocket calculated using fPocket?

Answer: Thank you for your suggestion. We have read these two references and added an introduction to the transmembrane structure of insect odorant receptors in the manuscript.

Here is an explanation of how we located the docking boxes and calculated their volumes: fpocket takes as input a protein structure (our predicted protein structure in PDB format) or a list of PDB files and returns information about candidate pockets, numbered by rank. On exit, fpocket generates a file containing statistics about each pocket, which lists different characteristics and scores of the pockets identified on the surface of the protein, and a PDB file containing the atoms defining the pocket (PMID: 19486540, PMID: 20478829). Based on the information output by fpocket, we selected and located the docking boxes and calculated their volumes.
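
The box-placement step described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' actual procedure: it assumes the atoms defining a pocket are available as PDB-format ATOM/HETATM records (such as the per-pocket atom file that fpocket writes), and the function names and the `padding` value are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def parse_pdb_coords(lines: Iterable[str]) -> List[Vec3]:
    """Read x, y, z from ATOM/HETATM records using the fixed PDB columns 31-54."""
    coords: List[Vec3] = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            coords.append((x, y, z))
    return coords


def docking_box(coords: Sequence[Vec3], padding: float = 4.0) -> Tuple[Vec3, Vec3, float]:
    """Return (center, size, volume) of an axis-aligned box around the pocket atoms.

    `padding` (in angstroms) is a hypothetical margin added on every side;
    the value used for the database is not stated in this record.
    """
    if not coords:
        raise ValueError("no pocket atoms supplied")
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    volume = size[0] * size[1] * size[2]
    return center, size, volume
```

The center and size returned here correspond to the kind of box parameters a docking program expects. The volume is that of the rectangular box itself, which is not the same quantity as the pocket volume that fpocket reports in its statistics file.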

 

Finally, it would be more than useful, if the authors provided examples of the generated models with the generated ligand-protein structures.

Mika, Benton. Olfactory Receptor Gene Regulation in Insects: Multiple Mechanisms for Singular Expression. Front Neurosci. 2021; 15: 738088

Konopka et al. Olfaction in Anopheles mosquitoes. Chemical Senses, 2021, Vol 46, 1–24

Answer: Thank you for the questions. We have shown examples in Appendix Figure S1, please refer to Appendix Figure S1 for details.

 

 

Comments on the Quality of English Language

Lines 44-50. The sentence needs restructuring. The words “contradiction” and “inefficient”. Probably, the authors mean that data acquisition has proceeded faster than the acquisition of biological knowledge.

Answer: Thanks for your question, we have made it clearer in the manuscript.

 

Line 53. The word “achieve” is wrong in the context of the sentence. Probably, it is meant that the website allows the query of sequences, etc.

Answer: Thanks for asking the question, we have changed the word “achieve” to “allow”.

 

Lines 73-74. Does “structure prediction” precede the acquisition of secondary and tertiary structure analysis or the opposite?

Answer: Thank you for your question. The correct order is to predict the three-dimensional structure first, and then the secondary structure. We have corrected it in the manuscript.

 

 

Lines 76-80 are totally confusing. The past tense “….data were integrated into a ligand database, which was used to …..” is followed by future tense “When a three-dimensional structural model with higher precision is trained, iORandLigandDB will be fully updated.” Does this mean that the structural model has not been trained?

Answer: We apologize for the confusion caused by the sentences in the manuscript. What we want to express is that we will continue to improve the database in the future. First, we will constantly expand our database. Second, the improvement of the model's predictive ability requires continuous optimization. We will continue to work hard and update the database with more accurate data. The expression in the manuscript is ambiguous and we have carefully corrected it. Thank you very much for asking this question.

 

 

Line 119. The word “dewatering” can be replaced by a more appropriate term.

Answer: Thanks for asking the question, we have changed the word “dewatering” to “dehydration”.

 

 

Line 175. Secondary structure rather than “Second structure”. This is obviously a typing error

Answer: Thanks for your question, we have corrected it in the manuscript.

Reviewer 3 Report

 

The paper describes the construction of a database giving access to the structures of odorant receptors and their binding to odorant molecules. It will undoubtedly be useful for researchers working in the field of insect odorant receptors.

However, several modifications would be necessary for the work to be publishable.

 

- The search for binding sites could be improved. It would be worth taking into account the data already available. Indeed, several publications highlight amino acids important for odorant binding, sometimes based on experimental data (such as mutagenesis).

It would also be interesting to classify the receptors according to their sequence homology. Homologous proteins would be predicted to have the same binding site.

 

- It is not clear what structural information figure 3 provides. The usefulness of this figure is not apparent.

 

- As the 3D structure of the receptors will be available, what is the point of generating secondary structure, as this information is already included in the 3D data?

 

- Some terms should be defined, such as the Instability Index or the InChI Key.

 

- The number of OR sequences studied is not clear. Is it 4426 or 4421? Both numbers appear in the text.

 

Author Response

The paper describes the construction of a database giving access to the structures of odorant receptors and their binding to odorant molecules. It will undoubtedly be useful for researchers working in the field of insect odorant receptors.

 

However, several modifications would be necessary for the work to be publishable.

 

 

- The search for binding sites could be improved. It would be worth taking into account the data already available. Indeed, several publications highlight amino acids important for odorant binding, sometimes based on experimental data (such as mutagenesis).

 

It would also be interesting to classify the receptors according to their sequence homology. Homologous proteins would be predicted to have the same binding site.

 

Answer: Thank you for your valuable suggestions, which we found very interesting and meaningful. We are making modifications and refinements relevant to the binding sites. However, because the revision period was short, we could not complete this work in the current revision. We will finish it as soon as possible and update the database accordingly.

 

- It is not clear what structural information figure 3 provides. The usefulness of this figure is not apparent.

Answer: Thank you for asking this question. Figure 3 conveys the volume statistics of the docking box, which is the predicted protein binding cavity. The volume of the protein binding cavity has been shown to correlate with docking results (PMID: 19706358, PMID: 23107480). Figure 3 shows that the binding cavity of most odorant-binding proteins did not differ significantly between orders. Combined with the similar docking binding energies between orders in Figure 4, we believe that the docking box and the docking binding energy are related. In addition, the outliers in Figure 3 are also related to the exceptional docking binding energies in Figure 4 (iORL002865, iOR002782, iORL003762, etc.). Therefore, we think that Figure 3 is useful: it helps us understand the docking binding energy information in Figure 4.

 

- As the 3D structure of the receptors will be available, what is the point of generating secondary structure, as this information is already included in the 3D data?

Answer: Thank you for your question. We do not extract the secondary structure merely because it exists within the three-dimensional structure. Our database provides basic information on insect odorant receptors. First, both the secondary structure information and the 3D structure information are structural information, so to obtain a comprehensive dataset we think it is necessary to extract the secondary structure. Furthermore, not every researcher investigates the 3D structure of insect odorant receptors, and researchers from other fields may have difficulty identifying secondary structures from 3D structures. We should take these researchers' needs into account and display the secondary structure information clearly. Finally, we also plan to add more analyses based on secondary structure information in the future, so the extracted secondary structure information can serve as the basis for those updates.

 

- Some terms should be defined, such as the Instability Index or the InChI Key.

Answer: Thank you for the questions. Here are the definitions of these terms:

Instability Index: The instability index calculates an estimate of the stability of the protein in a test tube (PMID: 2075190).

InChI: InChI is a structure-based identifier, strictly unique, and non-proprietary, open source and freely accessible (https://www.inchi-trust.org/).

InChI Key: InChI Key is a hashed version of InChI which allows for a compact representation and for searching in standard search engines (https://www.inchi-trust.org/).

The explanations of these terms have been added in the manuscript, and we have also added explanations for other terms.
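
As a small illustration of the InChI Key definition above: an InChIKey is a fixed-length, 27-character string made of three hyphen-separated blocks of 14, 10 and 1 uppercase letters. The sketch below checks only that layout. It is not part of the database code, the function names are hypothetical, and passing the check does not show that a key corresponds to a real compound.

```python
import re
from typing import Tuple

# Layout of an InChIKey: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
_INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def is_inchikey_format(key: str) -> bool:
    """Return True if `key` has the 14-10-1 uppercase block layout of an InChIKey."""
    return _INCHIKEY_RE.fullmatch(key) is not None


def split_inchikey(key: str) -> Tuple[str, str, str]:
    """Split a well-formed key into its three blocks; raise ValueError otherwise."""
    if not is_inchikey_format(key):
        raise ValueError(f"not an InChIKey-shaped string: {key!r}")
    first, second, third = key.split("-")
    return first, second, third
```

A check of this kind can catch truncated or mistyped keys before they are used to search a ligand table, which is where a compact hashed identifier is most useful.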

 

- The number of OR sequences studied is not clear. Is it 4426 or 4421? Both numbers appear in the text.

Answer: Thank you for your question. We need to clarify this matter. Both numbers are correct because they convey different information. 4426 in the manuscript means that there are 4426 sequences and structural information in total in the database. 4421 in the manuscript means that 4421 sequences are predicted to have docking boxes. This also means that 5 sequences are not predicted to have docking boxes. Meanwhile, only 4421 sequences proceeded to the docking. We intended to accurately express each process in this way. That is why both numbers appear in the manuscript.

Round 2

Reviewer 2 Report

See attached file ("Report 2.docx")

Comments for author File: Comments.pdf

The quality of English has been improved since the last submission. Indicative examples.

1. In the abstract and elsewhere, it is better to use "insect-specific odorants", than "insect-sensitive" or "olfactory-sensitive substances".

Line 142-143  "rocket v4.0 was used to predict the active pocket of ORs". Binding pocket should be used instead of "active pocket".

Author Response

It is surprising that the authors make no reference to the following very important papers in the field.

Amino acid coevolution reveals three-dimensional structure and functional domains of insect odorant receptors, 2015, Hopf et al. Nat Commun. 2015;6:6077

The structural basis of odorant recognition in insect olfactory receptors, 2021, Mármol et al. https://doi.org/10.1038/s41586-021-03794-8

Functional properties of insect olfactory receptors: ionotropic receptors and odorant receptors, 2021, Wicher, D., Miazzi, F. Cell and Tissue Research (2021) 383:7-19. https://doi.org/10.1007/s00441-020-03363-x

A Functional Agonist of Insect Olfactory Receptors: Behavior, Physiology and Structure, 2019, Batra et al., Front. Cell. Neurosci. 13:134. doi:10.3389/fncel.2019.00134

These papers present experimental and bioinformatically-derived OR structures and suggest specific epitopes for ligand binding. In my view, it is extremely important to use the results reported in these papers to cross-validate the docking results reported in the submitted paper.

 

Answer: Thank you very much for raising this question. We have carefully read these papers, cross-validated our results with them, and cited them in the manuscript.

 

The authors used Fpocket to predict the putative binding sites for the docking simulations. They have detected binding pockets well into the transmembrane domains of the olfactory receptors. It is assumed that the lipid bilayer determines to a large extent the packing of the transmembrane helices. In such cases is Fpocket suitable for binding site detection? The authors should address this point by providing appropriate references.

 

Answer: Thank you for your question. Exploring the influence of the lipid bilayer on ligand docking is very meaningful, and analysis of the lipid bilayer can be used to further optimize odorant receptor ligand prediction (PMID:30967440), as in the analysis of human olfactory receptor hOR 17-209 (PMID: 22563330). However, the current practice of protein structure-affinity relationship analysis is primarily based on protein-ligand interactions (PMID: 28648526), such as the analysis of the insect odorant receptor MhOR5 (PMID: 35933834), the analysis of the mouse eugenol odorant receptor (PMID: 21142015), the analysis of the bark beetle odorant receptors (PMID: 33499862), the comparative prediction of human and mouse odorant receptors (PMID: 14691239), and so on. Many published studies have not taken into account the impact of the lipid bilayer in the OR docking process. We also did not consider the role of the lipid bilayer when using the Fpocket software; optimization for the lipid bilayer is a direction for follow-up research. In addition, regarding Fpocket itself, a 2022 article described it as follows: “Among the few programs that fall into the energy-based category, the open-source tool fpocket is particularly popular” (PMID: 36008829), so we believe Fpocket is an excellent pocket detection algorithm. Thank you again for your questions. We will further study the role of lipid bilayers and demonstrate this in the upcoming database updates.

 

The quality of English has been improved since the last submission. Indicative examples.

  1. In the abstract and elsewhere, it is better to use "insect-specific odorants", than "insect-sensitive" or "olfactory-sensitive substances".

Line 142-143  "rocket v4.0 was used to predict the active pocket of ORs". Binding pocket should be used instead of "active pocket".

Answer: Thank you very much for your suggestion. We have corrected the manuscript accordingly. In addition, we consulted native English speakers and made our English in the manuscript even smoother with their help.

Reviewer 3 Report

The authors have correctly answered the questions/comments. The work is now acceptable for publication.

Author Response

The authors have correctly answered the questions/comments. The work is now acceptable for publication.

Answer: Thank you very much for your suggestion. We are pleased to receive your recognition, and we also attach great importance to the issues and comments you have raised before. Thank you again for your recognition and suggestions!

Round 3

Reviewer 2 Report

No further comments

Following the earlier corrections, the use of English is acceptable.
