1. Introduction
The immune system plays an important role in preventing bacterial infections. Neutrophils undergo chemotaxis, which allows them to migrate toward sites of infection or inflammation. They recognize chemotactic factors (chemoattractants) via cell-surface receptors [1]. Detection of chemotactic factors by these receptors activates cellular movement, so that neutrophils quickly congregate at an infection site, attracted by cytokines expressed by activated endothelium, mast cells, and macrophages or by bacterial by-products [
2]. In humans, neutrophil chemoattractants belong to four biochemically distinct subfamilies, i.e., chemokine (C-X-C motif) ligand (CXCL) family (CXCL1 to CXCL3 and CXCL5 to CXCL8 in humans), complement anaphylatoxins (C3a and C5a), chemotactic lipids (e.g., leukotriene B4 or LTB4), and formyl peptides [
3]. A well-known example of a formyl peptide is
N-formyl-Met-Leu-Phe (fMLP), which was synthesized to mimic bacterial peptides.
N-Formylmethionine is commonly the initiating amino acid of bacterial proteins; on this basis, Schiffman et al. demonstrated that N-formylmethionyl peptides show chemoattractant activity resembling that of chemotactic factors produced by bacteria [4]. Apart from fMLP, no other peptide ligands were reported for a long time, and the discovery of new ligands is an important research target in drug discovery.
Peptides are generally generated by cleaving proteins or by enzymatic ligation of amino acids. The enzymatic method employs amino acid ligases, which are ATP-dependent enzymes [
5]. The enzymatic method is more cost-effective and proceeds under milder conditions than solid-phase synthesis; however, owing to substrate specificity, a single kind of amino acid ligase can produce only a limited variety of peptides. Therefore, many kinds of amino acid ligases must be prepared to apply the enzymatic method to industrial production.
Amino acid ligases have been discovered mainly from microorganisms. New amino acid ligases have been found by homology searches using the sequences of known enzymes, or by selecting candidate sequences that carry the characteristic ATP-binding domains of ligases, followed by expression in host cells such as
Escherichia coli for functional analysis [
6]. As this approach requires the full-length sequence information of the genes or genomic sequence, it is mainly performed on established strains. However, standard culturing techniques support the establishment of less than 1% of the bacteria found in the environment [
7]. More than 99% of the microorganisms in natural environments are reported to be difficult or impossible to culture, the so-called ‘viable but non-culturable’ (VBNC) bacteria [8]. Consequently, VBNC bacteria have been overlooked as a source for novel enzyme screening and as a potential source of functional genes. For this reason, metagenome analyses have gained popularity for the screening and identification of genes of interest. With a metagenome library, it should be possible not only to use the genetic information of VBNC bacteria, but also to express their genes in host cells such as E. coli and assay the activity, thereby uncovering non-elucidated characteristics or undetermined phenotypic properties. In fact, we have reported a novel esterase [
9] and a novel gene involved in cadmium accumulation [
10] from sponge-associated bacterial metagenome by library screening. We have constructed two types of sponge-associated bacterial metagenome libraries derived from
Hyrtios erecta [
9] and
Stylissa massa [
10] because marine sponges are known to be among the largest producers of secondary metabolites [11] and have high potential for harboring unique functional genes, including those related to heavy metal accumulation [
12,
13].
However, the functional screening approach required 26,496 clones of a plasmid library [9] and 3301 clones of a fosmid library [10], and cultivating such a large number of clones is laborious. To address this inefficiency, we determined DNA sequences of about 2 kbp at both ends in every vector from the metagenome library and established an in-house database to combine in silico screening with functional screening. All DNA sequences were analyzed and annotated based on homology and domain searches and registered in the in-house database.
In this study, we first selected 120 candidates for amino acid ligases and peptide synthases from the in-house database, using clusters of orthologous groups (COG) analysis or Pfam analysis of the ATPases associated with diverse cellular activities (AAA) domain, an ATP-binding motif conserved in peptide synthetases, and then identified a gene cluster that activates neutrophils.
3. Discussion
Although the
N-terminal domain of ORF1 and
C-terminal domain of ORF4 were deleted, the remaining domains of both ORFs were still functional and the expected peptide was obtained. However, because of the missing domains, we cannot affirm that it is the natural product of ORF1–4. Moreover, ORF1 was required for activity, suggesting that the products of ORF1 and ORF3 act together to form a pentapeptide that is modified by a phospho-
N-acetylmuramoyl-pentapeptide-transferase of ORF4. ORF2 did not show significant similarity to any known gene; however, it appears to be required for ORF4 (
Figure 6). The genes of ORF1, 3, and 4 showed homologies with the genes involved in peptidoglycan biosynthesis in
E. coli cells [
18]. Peptidoglycan consists of linear glycan chains interlinked by short peptides. The glycan chains are composed of alternating units of
N-acetylglucosamine and
N-acetylmuramic acid. Muramyl residues bear short pentapeptides. Peptidoglycan is synthesized by the key enzymes MurA to MurF. The
murA to
murF genes are all essential in bacteria. MurE, an ATP-dependent amino acid ligase, converts UDP-
N-acetylmuramyl-
l-alanine-
d-glutamate to UDP-
N-acetylmuramyl-tripeptide with meso-diaminopimelic acid. MurF catalyses the addition of
d-alanyl
d-alanine dipeptide to UDP-
N-acetylmuramyl-tripeptide with meso-diaminopimelic acid and forms UDP-
N-acetylmuramyl-pentapeptide [
18]. ORF1 appears to retain part of the MurE activity. ORF4 showed homology with MraY, which converts UDP-
N-acetylmuramyl-pentapeptide with undecaprenyl-phosphate into undecaprenyl-diphosphate-
N-acetylmuramyl-pentapeptide and UMP [
19].
Based on the peptidoglycan biosynthetic pathway, the function of the metagenomic fragment was hypothesized (
Figure 7). As MurE and MurF are homologous to ORF1 and ORF3, respectively, the sequence of the peptide chain might be similar to
l-Ala-
d-Gln-
l-Lys-
d-Ala-
d-Ala, or it might differ. As ORF4 is homologous to MraY, it may function to add an undecaprenyl-diphosphate to
N-acetylmuramyl-pentapeptide; however, ORF4 is also incomplete, so this pentapeptide might be secreted from the cells into the medium rather than bound to the cell wall. In addition, although the murA, murB, murC, and murD genes are necessary for peptidoglycan synthesis, they were not included in the metagenomic fragment, so the glyco-pentapeptide would be in an incomplete form. It is possible that native genes of the host
E. coli complemented these functions. The de novo sequencing result from Orbitrap MS, however, did not correspond to the accurate masses of Ala and Lys, indicating that the peptide sequence would differ from
l-Ala-
d-Gln-
l-Lys-
d-Ala-
d-Ala. According to Tabata et al. [
5], amino acid ligase is known for low substrate specificity and accepts a wide variety of
l-amino acids. The MS fragments were complicated and did not show known accurate masses, possibly because several peptide sequences were present. Based on the gene analysis, the neutrophil-activating peptide is predicted to be, at least, a pentapeptide modified by
N-acetylmuramic acid.
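The mass comparison mentioned above can be illustrated with a short calculation. The following is a minimal sketch, not the authors' analysis: it computes the monoisotopic mass that the unmodified reference pentapeptide Ala-Gln-Lys-Ala-Ala would have (stereochemistry does not affect mass). The residue masses are standard monoisotopic values supplied here as an assumption, and the N-acetylmuramic acid modification is deliberately left out.

```python
# Standard monoisotopic residue masses in Da (assumed here; not given in the text).
MONOISOTOPIC_RESIDUE = {
    "Ala": 71.03711,
    "Gln": 128.05858,
    "Lys": 128.09496,
}
WATER = 18.01056   # added once per linear peptide (free N- and C-termini)
PROTON = 1.00728   # for the singly protonated ion [M+H]+


def peptide_mass(residues):
    """Monoisotopic mass of a linear, unmodified peptide."""
    return sum(MONOISOTOPIC_RESIDUE[r] for r in residues) + WATER


reference_pentapeptide = ["Ala", "Gln", "Lys", "Ala", "Ala"]
neutral_mass = peptide_mass(reference_pentapeptide)
mh_plus = neutral_mass + PROTON

print(f"neutral mass: {neutral_mass:.4f} Da, [M+H]+: {mh_plus:.4f}")
```

Observed fragment masses that cannot be explained by these residue values would be consistent with a sequence other than the reference pentapeptide, which is the argument made in the text.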
With improvements in next-generation sequencing, metagenome sequencing has become easier and less costly, and it can be used for gene screening based on conserved sequences or homology searches. However, genes identified in this way show characteristics similar to those of known genes, so the chance of finding novel genes and enzymes is lower than with functional screening. Because of this limitation, we have so far conducted activity-based screening for metagenome analysis. Constructing a metagenome library and performing activity-based screening take much time and effort; however, expressing the genes in host cells of
E. coli and determining the activity can reveal non-elucidated characteristics or undetermined phenotypic properties. Moreover, because the in-house database was established for the same metagenome libraries, we were able to compare the screening efficiency of general activity-based screening with that of activity-based screening combined with in silico screening. As mentioned above, we used 26,496 clones to identify a novel esterase gene [
9]. Similarly, a xylanase gene was selected among 40,000 clones from the soil metagenome library [
20]. Thus, we estimated the efficiency of activity-based screening to be about 1 in 10⁴ clones. In this study, we cultivated only 120 clones selected by in silico screening; the effort was reduced to about 1/100 and the efficiency increased about 100-fold. An additional merit of this approach is that the candidates can be accessed and the recombinants propagated once the libraries are constructed.
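The efficiency comparison can be checked with a few lines of arithmetic. The sketch below uses only the clone counts quoted in the text; treating each screen as yielding a single positive clone is an assumption made for illustration.

```python
# Clone counts quoted in the text. One positive clone per screen is an
# assumption made here for illustration.
ACTIVITY_SCREENS = {
    "esterase, sponge plasmid library": 26_496,
    "xylanase, soil library": 40_000,
}
ROUNDED_ACTIVITY_ESTIMATE = 10**4   # the order-of-magnitude estimate in the text
IN_SILICO_CLONES = 120              # clones cultivated in this study


def fold_reduction(clones_before: int, clones_after: int) -> float:
    """How many times fewer clones must be cultivated per hit."""
    return clones_before / clones_after


fold_vs_estimate = fold_reduction(ROUNDED_ACTIVITY_ESTIMATE, IN_SILICO_CLONES)
fold_per_screen = {
    name: fold_reduction(n, IN_SILICO_CLONES)
    for name, n in ACTIVITY_SCREENS.items()
}

print(f"vs rounded estimate: {fold_vs_estimate:.0f}-fold")
for name, fold in fold_per_screen.items():
    print(f"vs {name}: {fold:.0f}-fold")
```

Depending on which baseline is used, the reduction falls between roughly 80-fold and 330-fold, so the 100-fold figure is best read as an order-of-magnitude statement.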
4. Materials and Methods
4.1. Metagenomic Libraries
The metagenomic libraries used in this study were established by Okamura et al. [
9] and Mori et al. [
10]. LB medium containing chloramphenicol (Cm; final concentration: 20 μg/mL) was used for bacterial culture.
4.2. In Silico Screening
For the first screening, amino acid ligase or peptide synthase genes were selected from our in-house database. Specifically, we targeted genes containing an adenylation domain, which is required for activation by ATP to form a peptide bond. COG and Pfam domain searches were employed to select ligase and AAA domains. Clones that retained full-length ORFs were selected.
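The selection logic of the first screening can be sketched as a simple filter over annotated records. This is an illustrative sketch only: the in-house database schema is not described in the text, so the `CloneRecord` fields, the keyword list, and the example annotations are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CloneRecord:
    """Hypothetical record layout; the real database schema is not described."""
    clone_id: str
    cog_terms: set = field(default_factory=set)      # COG functional annotations
    pfam_domains: set = field(default_factory=set)   # Pfam domain names
    has_full_length_orf: bool = False


# Keywords assumed here to stand for "ligase or peptide synthase" COG hits.
LIGASE_KEYWORDS = ("ligase", "peptide synthase", "peptide synthetase")


def is_candidate(rec: CloneRecord) -> bool:
    """Keep clones with a ligase-type COG hit or an AAA domain, and a full-length ORF."""
    cog_hit = any(k in term.lower() for term in rec.cog_terms for k in LIGASE_KEYWORDS)
    pfam_hit = "AAA" in rec.pfam_domains
    return rec.has_full_length_orf and (cog_hit or pfam_hit)


# Made-up example records.
records = [
    CloneRecord("clone_001", {"D-alanine-D-alanine ligase"}, set(), True),
    CloneRecord("clone_002", set(), {"AAA"}, True),
    CloneRecord("clone_003", {"UDP-N-acetylmuramate ligase"}, set(), False),
    CloneRecord("clone_004", {"ABC transporter"}, {"ABC_tran"}, True),
]
candidates = [r.clone_id for r in records if is_candidate(r)]
print(candidates)
```

In this toy data set, the third record is dropped because its ORF is truncated, and the fourth because it has neither a ligase annotation nor an AAA domain.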
4.3. Sample Preparation for Neutrophil Activity Assay
For the second screening, all 120 selected clones were grown in a 96-DeepWell plate (maximum volume 2 mL/well) (Nunc™, Nalge Nunc International, New York, NY, USA) with 1 mL of LB (Cm) with continual agitation at 170 rpm in a Bio-Shaker (TITEC, Saitama, Japan) at 37 °C for 8 h. After centrifugation at 10,000× g for 5 min, the supernatant was boiled at 80 °C for 20 min.
For the third screening, selected clones were cultured in an L-shaped test tube (i.d. 18 mm, custom-made at Hiroshima University) with 5 mL of LB (Cm) with horizontal shaking at 120 rpm at 37 °C for 8 h. After centrifugation and boiling as above, the supernatant was lyophilized, redissolved in Milli-Q water, and concentrated fivefold.
4.4. Purification
The culture supernatant after boiling was lyophilized, redissolved in Milli-Q water, concentrated 5–10 times, and used for HPLC purification. The concentrated supernatant was applied to a C18 column (TSKgel ODS-80Ts, Tosoh, Yamaguchi, Japan) and eluted with a linear gradient of 0–100% acetonitrile (solvent A: 0.1% TFA/water; solvent B: 0.1% TFA/acetonitrile) from 8 to 40 min (32 min) at a flow rate of 1.0 mL/min. After the linear gradient, the mobile phase was held at 100% and then at 0% acetonitrile for 5 min each. A column oven was used to keep the column at 40 °C during the measurement. The eluate was monitored at 230 nm with an MD-2010 multiwavelength photodiode array detector (JASCO Corporation, Tokyo, Japan). Fractions were collected over the first 10 min (Fraction #1) and then at intervals of 5 min/fraction (Fractions #2–9). For further purification, the same mobile phase was used at a flow rate of 0.4 mL/min with a linear gradient over 120 min. The fractions were separated by peak appearance. The obtained fractions were completely dried by decompression drying and redissolved in 20 μL of Hanks’ Balanced Salt Solution (HBSS (1×): GIBCO®, Thermo Fisher Scientific, Waltham, MA, USA) for the assay.
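The gradient and fraction scheme of the first purification run can be written out as two small functions. This is a sketch under stated assumptions: the text does not give the solvent composition before 8 min (0% acetonitrile is assumed), and the 50 min total run time is inferred from the 40 min gradient end plus the two 5 min holds, which matches nine fractions (10 min + 8 × 5 min).

```python
def percent_acetonitrile(t_min: float) -> float:
    """Approximate % solvent B at time t (min) in the first purification run."""
    if t_min < 8:
        return 0.0                        # assumed isocratic start (not stated)
    if t_min <= 40:
        return (t_min - 8) / 32 * 100     # linear 0-100% over 32 min
    if t_min <= 45:
        return 100.0                      # 5 min hold at 100%
    return 0.0                            # 5 min hold at 0%


def fraction_number(t_min: float) -> int:
    """Fraction #1 covers 0-10 min; Fractions #2-9 cover 5 min each up to 50 min."""
    if not 0 <= t_min < 50:
        raise ValueError("time outside the 0-50 min run assumed here")
    if t_min < 10:
        return 1
    return 2 + int((t_min - 10) // 5)


for t in (5, 24, 42, 47):
    print(t, percent_acetonitrile(t), fraction_number(t))
```

Mapping an elution time to both a fraction number and a solvent composition makes it easier to relate an active fraction to the approximate hydrophobicity of its contents.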
4.5. Neutrophil Activating Assay
The assay using neutrophils was based on the method of Hasegawa et al. [
14]. Whole blood was used in the assay. The assay is based on the NADPH oxidase activity of neutrophils that have received chemoattractants: the resulting superoxide anion reacts with luminol, so the activity can be detected as luminescence intensity. The luminol solution was prepared as follows: 17.72 mg of luminol (SIGMA) was dissolved in 3 mL of 1 N NaOH, and then 2.5 mL of 1 N HCl was added. Then, 30 mL of HBSS (-) (HBSS (1×) without Ca²⁺ and Mg²⁺) was added, the pH was adjusted to 7.4 (37 °C) with 0.2 N HCl, and the solution was finally made up to 40 mL with HBSS (1×). For the assay, the luminol solution was diluted fivefold with HBSS (1×). The blood samples were donated by male and female volunteers in their 20s to 40s. To prevent blood coagulation, 10 μL of 10 U/μL heparin (Wako Pure Chemical Corporation, Tokyo, Japan) dissolved in 50 μL of HBSS (1×) was added to the Falcon tube in advance. Blood was kept at 37 °C as much as possible. A 96-well plate was used for measurement. To each well, 20 μL of boiled supernatant or purified fraction and 60 μL of a reaction mixture of blood and luminol solution in equal volumes were added, and the measurement was started immediately. The plate reader was set up with a 10 s shaking step immediately before scanning, and the luminescence in each well was measured for 30 cycles of 3 min each, including a 170 s waiting period after scanning, for an overall measurement time of 90 min. The fluorescence plate reader was an ARVO (Perkin Elmer, Waltham, MA, USA) or a Fluoroskan Ascent FL (ThermoFisher Scientific, Waltham, MA, USA). A Nunclon® 96-well plate (clear flat bottom, black, product no. 165305) (ThermoFisher Scientific, Waltham, MA, USA) was used for measurement. The highest value in the assay was used as the peak height for comparison.
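The quantities in this protocol can be cross-checked with a short calculation. The sketch below is illustrative only: the molar mass of luminol is supplied here as an assumption (it is not stated in the text), the scan time itself is neglected in the cycle timing, and the example readings are made up.

```python
LUMINOL_MW = 177.16  # g/mol for luminol (C8H7N3O2); assumed, not stated in the text

luminol_mg = 17.72
final_volume_ml = 40.0
# mg / (g/mol) = mmol; mmol/mL equals mol/L, so x1000 gives mmol/L (mM).
stock_mM = luminol_mg / LUMINOL_MW / final_volume_ml * 1000
working_mM = stock_mM / 5                 # fivefold dilution with HBSS (1x)

# Each well: 20 uL sample + 60 uL of blood:luminol (1:1), i.e. 30 uL luminol.
well_volume_ul = 20 + 60
in_well_mM = working_mM * 30 / well_volume_ul

# Timing: 10 s shaking + 170 s waiting per cycle (scan time neglected), 30 cycles.
cycle_s = 10 + 170
total_min = 30 * cycle_s / 60


def peak_height(readings):
    """Peak height as defined in the text: the highest value in the time course."""
    return max(readings)


example_readings = [120, 480, 950, 700, 310]   # made-up luminescence counts
print(f"stock {stock_mM:.2f} mM, working {working_mM:.2f} mM, in well {in_well_mM:.3f} mM")
print(f"total measurement time {total_min:.0f} min, peak {peak_height(example_readings)}")
```

Under these assumptions the stock is about 2.5 mM luminol, and the 10 s and 170 s steps add up to the stated 3 min cycle.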
4.6. Nucleotide Sequence Analysis of Metagenomic Insertion Fragments
The plasmid DNA was extracted from positive clones showing neutrophil activation, and the metagenome fragment was sequenced by Takara Bio Inc. (Shiga, Japan). ORFs were predicted using the free software ApE (A plasmid Editor, v2.0.61) (
https://jorgensen.biology.utah.edu/wayned/ape/, accessed on 10 February 2020). The sequences were analyzed by Basic Local Alignment Search Tool (BLAST) [
15]. Promoter regions were predicted using GENETYX Ver.9 (GENETYX CORPORATION, Tokyo, Japan).
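For readers unfamiliar with ORF prediction, the idea can be illustrated with a minimal forward-strand scanner. This is a generic sketch and not the algorithm used by ApE: it only looks for ATG-to-stop stretches in the three forward reading frames, and the function name, the `min_codons` threshold, and the toy sequence are assumptions for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 30):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    `start` is 0-based, `end` is exclusive and includes the stop codon.
    `min_codons` counts the codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                       # first start codon in this stretch
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                    # look for the next ORF
    return orfs


toy_sequence = "CC" + "ATG" + "GCT" * 3 + "TAA" + "G"   # made-up example
print(find_orfs(toy_sequence, min_codons=3))
```

A practical analysis would also scan the reverse complement and consider alternative start codons, which this sketch omits for brevity.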
4.7. Construction of Expression Vectors and Recombinants
To determine the gene(s) responsible for neutrophil activation, the four ORFs from the metagenome fragment were cloned into the pGEX-6p-1 expression vector, either separately or as clusters. The PCR primers used to amplify each ORF or gene cluster are listed in
Supplemental Table S2. Restriction enzyme sites within the multi-cloning site were employed, and translation was initiated at the vector start codon. KOD-Plus-Neo (TOYOBO, Osaka, Japan) was used with 2 µL of the extracted plasmid (10 ng/μL) as template and 1.5 μL of each primer (10 pmol/μL) in 50 µL of total volume. The thermal cycling conditions were as follows: pre-denaturation at 94 °C for 2 min, followed by 30 cycles of denaturation at 98 °C for 10 s, annealing and extension at 68 °C for 30 s for amplicon length <500 bp, 45 s for <800 bp, 60 s for <1 kbp, 75 s for <1.3 kbp, 105 s for <1.7 kbp, or 150 s for 2.3 kbp. The resulting PCR product was ligated into the appropriate sites downstream of the promoter in the pGEX-6p-1 vector and cloned in
E. coli DH5α and plated on LB agar medium supplemented with ampicillin at 50 μg/mL. Colony PCR was performed to check for insertion of the specific open reading frames (ORFs) in the recombinant clones. KOD Dash (TOYOBO) was used for colony PCR, with 0.5 μL of each primer (10 pmol/μL) in a total volume of 20 μL. The primers used were pGEX 5’ (5’-GGGCTGGCAAGCCACGTTTGGTG-3’) and pGEX 3’ (5’-CCGGGAGCTGCATGTGTCAGAGG-3’). The thermal cycling conditions were as follows: pre-denaturation at 95 °C for 4 min, followed by 35 cycles of denaturation at 98 °C for 20 s, annealing at 60 °C for 10 s, and extension at 72 °C for 2 min, with a final extension at 60 °C for 4 min. The PCR products were confirmed by electrophoresis in a 1% agarose gel. The resulting transformants were grown in LB medium supplemented with ampicillin (50 µg/mL), and the culture supernatant of each transformant was collected for the assay.
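The length-dependent annealing/extension times given for the KOD-Plus-Neo amplification can be expressed as a lookup. This is a sketch based only on the thresholds listed in the text; how lengths between 1.7 and 2.3 kbp were handled is not stated, so applying 150 s to that whole range is an assumption, as is rejecting anything longer.

```python
import bisect

# Upper bounds (bp, exclusive) and annealing/extension times (s) from the text.
_LIMITS_BP = [500, 800, 1000, 1300, 1700]
_TIMES_S = [30, 45, 60, 75, 105]
_LONGEST_BP, _LONGEST_TIME_S = 2300, 150   # the single 2.3 kbp case in the text


def extension_time_s(amplicon_bp: int) -> int:
    """Annealing/extension time at 68 C for a given amplicon length."""
    if amplicon_bp <= 0:
        raise ValueError("amplicon length must be positive")
    # Index of the first limit strictly greater than the amplicon length.
    i = bisect.bisect_right(_LIMITS_BP, amplicon_bp)
    if i < len(_TIMES_S):
        return _TIMES_S[i]
    if amplicon_bp <= _LONGEST_BP:
        return _LONGEST_TIME_S              # assumption for 1.7-2.3 kbp
    raise ValueError("no extension time given in the text for this length")


for length in (450, 750, 1200, 2300):
    print(length, extension_time_s(length))
```

The listed times correspond to a little under 1 min per kbp for most bins.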
4.8. Accession Number
The sequence of the metagenomic fragment was deposited in DDBJ (LC634461).