1. Introduction
The COVID-19 pandemic had a global impact not only on human health but also on national economies and everyday livelihoods. Disease surveillance is one of the key tools used to manage pandemics and is an integral part of national health management systems worldwide. Robust and stable point-of-care (POC) diagnostics in remote settings are key to effective surveillance. Such diagnostics help to curb the spread of pandemics, as was evident during the SARS-CoV2 (COVID-19) pandemic [1]. To stop the spread of the disease, it was necessary to detect the disease in patients, quarantine them, and prioritize the vaccine implementation program. As a fast and easy-to-use diagnostic tool, the Lateral Flow Immunoassay (LFIA) has caught the attention of many scientists. It has an advantage over other POC techniques as it meets the WHO ASSURED criteria for POC testing in resource-limited settings [2]. An LFIA, an analytical tool for the onsite detection of target substances, has advantages including rapidity, simplicity, and relative cost-effectiveness [3].
The coronavirus disease of 2019 (COVID-19) is caused by the novel coronavirus SARS-CoV2, a beta-coronavirus with an ssRNA genome that encodes sixteen different non-structural proteins (NSPs) and four essential structural proteins: spike glycoprotein (S), small envelope protein (E), matrix protein (M), and nucleocapsid protein (NP or N-protein) [1,4]. In response to the COVID-19 pandemic, a number of diagnostic tests based on molecular diagnostics, antigens, and serology were developed to detect the SARS-CoV2 virus, its antigens, or the antibodies generated in response to infection. Studies suggest that the majority of patients develop an antibody response only in the second week after the onset of symptoms [5]. A diagnosis of COVID-19 infection based on antibody response is therefore often only possible in the recovery phase, which makes antibody detection most useful for disease surveillance and epidemiologic research. Vaccines that have proven to be immunogenic incorporate the spike protein, rather than the nucleocapsid, as the functional antigen. To distinguish the antibody response to natural infection from the response to vaccine immunization, the nucleocapsid protein must therefore be used as the diagnostic antigen. The nucleocapsid is used for serology tests due to its abundant expression and its conservation in the genome. High-titer antibodies against the N-protein are found in SARS-CoV2-infected patients and can be detected as a biomarker of infection through various diagnostic tools; these N-specific antibodies dominate the overall antibody response [6,7].
The full-length nucleocapsid protein has been expressed in and purified from bacterial cells both in soluble form and as inclusion bodies (IBs). To compare the reactivity of the two forms with COVID-positive patient serum, an ELISA was developed, and the IB form was found to be slightly more reactive [8]. Researchers have also purified the whole nucleocapsid and its N-terminal domain for serological assay development. Biochemical studies, such as static light scattering, small-angle X-ray scattering (SAXS), and size exclusion chromatography, confirm that the N-protein is largely present in its dimeric form [9]. The refolded N-protein (IB form) has also been used for the preliminary detection of infection [1].
According to published studies, the first 40 amino acids of the N-protein sequence are intrinsically disordered. A recombinant fragmented NP (rfNP) devoid of this disordered region (1–40 aa) was expressed in E. coli and purified in the soluble form. An ELISA was validated with 68 negative and 50 positive patient samples, which showed that rfNP is also useful for the detection of SARS-CoV2 antibodies during infection [10]. Studies have also shown that the N-terminally truncated nucleocapsid protein is a better serological marker than the complete N-protein for evaluating SARS-CoV2 immunogenicity [11].
One study also examined the impact of the dimerization of the SARS-CoV2 nucleocapsid protein on the sensitivity of ELISA-based COVID-19 diagnostics and showed that the dimeric form of the N-protein has higher stability and antigenicity than the monomeric form [12].
Although efforts have been made to understand infection and immunity mechanisms based on N-protein peptide structure, little has been done to integrate these insights into the development and manufacturing of a stable N-protein that can be used in remote settings as a capture antigen for diagnostics. The full-length recombinant N-protein is expressed as inclusion bodies in E. coli. Incubation temperature and medium composition were optimized for soluble expression; however, the purified protein remained unstable due to aggregation and proteolytic degradation (lab data in Supplementary File). Unstable capture antigens are poorly suited to diagnostic use because they may give inconsistent results and compromise the specificity and sensitivity of the assay.
In our study, to identify a stable and immune-reactive version of the N-protein, we first predicted linear B-cell epitopes from the antigen sequence of the N-protein using the Immune Epitope Database (IEDB) Analysis Resource website. We then integrated the peptide instability index to predict a stable region of the N-protein comprising linear B-cell epitopes, and also incorporated the sequence variant information available from UniProt. Based on this analysis, we identified two probable variants of the full-length nucleocapsid protein representing two different regions of the protein. These two variants, along with the native full-length N-protein, were successfully expressed in E. coli and purified. The stability of all three purified proteins was then evaluated in wet lab experiments. Among the three stretches of the N-protein, the smallest region, which has the fewest predicted epitope sites (according to IEDB analysis), was found to be more stable than the other constructs at different temperature conditions.
2. Materials and Methods
2.1. Computational Analysis and Websites
2.1.1. UniProt and Protter
The protein sequence of the SARS-CoV2 nucleocapsid was obtained from the UniProt database (https://www.uniprot.org/, initially accessed on 30 April 2020 and rechecked on 6 August 2023), and the variant data retrieved for the SARS-CoV2 nucleocapsid protein were used for further analysis. The variants of the nucleocapsid protein were visualized using the online tool Protter (https://wlab.ethz.ch/protter/#, last accessed on 6 August 2023) [13].
2.1.2. The Immune Epitope Database (IEDB)
The Immune Epitope Database (IEDB) (https://www.iedb.org/, initially accessed in July 2021 and rechecked on 6 August 2023) is an online computational resource that was used for the prediction of linear B-cell epitopes in the nucleocapsid protein sequence (NP1-419) of SARS-CoV2 [14,15]. The BepiPred-2.0 server predicts linear B-cell epitopes from a protein sequence using a random forest algorithm trained on epitope and non-epitope amino acids determined from crystal structures. Sequential prediction smoothing is performed afterwards. Residues with scores above the threshold (the default value is 0.5) are predicted to be part of an epitope. The values of the scores are not affected by the selected threshold.
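The thresholding step described above can be illustrated with a minimal Python sketch. It assumes the per-residue scores have already been exported from the prediction server; the function name `call_epitopes` and the toy score list are hypothetical and are not BepiPred-2.0 output for the N-protein.

```python
from typing import List, Tuple


def call_epitopes(scores: List[float], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Group consecutive residues scoring above the threshold into epitope stretches.

    Returns 1-based, inclusive (start, end) residue positions.
    """
    regions = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = position  # a new stretch begins here
        elif start is not None:
            regions.append((start, position - 1))  # the stretch ended at the previous residue
            start = None
    if start is not None:
        regions.append((start, len(scores)))  # stretch runs to the end of the sequence
    return regions


# Hypothetical toy scores for a 7-residue peptide (illustration only).
toy_scores = [0.40, 0.60, 0.70, 0.50, 0.30, 0.55, 0.90]
print(call_epitopes(toy_scores))
```

Because the scores themselves do not depend on the threshold, the same score list can be re-thresholded (for example at 0.55) without rerunning the prediction.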
2.1.3. Instability Index and Isoelectric Point
Instability Index: The instability index provides an estimate of the stability of the protein in a test tube. Statistical analysis has shown that certain dipeptides occur at significantly different frequencies in unstable proteins than in stable ones. In this method, an instability weight value (DIWV) is assigned to each of the 400 different dipeptides. Using these weight values, it is possible to compute an instability index (II). A protein whose instability index is smaller than 40 is predicted to be stable; a value above 40 predicts that the protein may be unstable [16,17]. The instability index (II) and isoelectric point were calculated using the online tool ProtParam (https://web.expasy.org/cgi-bin/protparam, initially accessed on 30 April 2020 and rechecked on 6 August 2023).
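For reference, the instability index described above is given in the ProtParam documentation in the following form, where L is the length of the sequence and DIWV(x_i x_{i+1}) is the instability weight value of the dipeptide starting at position i. The symbols follow that documentation and are not defined elsewhere in this paper.

```latex
\mathrm{II} = \frac{10}{L} \sum_{i=1}^{L-1} \mathrm{DIWV}\left(x_i x_{i+1}\right)
```

Since the sum runs over all L − 1 overlapping dipeptides and is normalized by L, truncating a protein changes II only through the dipeptide composition of the retained region, which is why different stretches of the same antigen can fall on either side of the threshold of 40.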
2.2. Strains, Plasmids, Chemicals, and Filters
E. coli DH5α [F– endA1 glnV44 thi-1 recA1 relA1 gyrA96 deoR nupG purB20 φ80dlacZΔM15 Δ(lacZYA-argF)U169, hsdR17(rK–mK+), λ–] was used for sub-cloning and plasmid construction. BL21(DE3) [F– ompT gal dcm lon hsdSB(rB–mB–) λ(DE3 [lacI lacUV5-T7p07 ind1 sam7 nin5]) [malB+]K-12(λS)] was used for the expression of the fusion protein. Plasmid pET-28b with a 6Xhis tag was used for the expression of the fusion proteins. The SARS-CoV2 nucleocapsid protein gene, synthesized by GenScript, was cloned between the NcoI and BamHI restriction sites of the pET-28b vector. Details of the hosts and plasmids are given in Table 1. Restriction endonucleases, Agarose I™ (cat No.: 0710), acrylamide (cat No.: 0341), bis-acrylamide (cat No.: 0172), sodium dodecyl sulfate (SDS) (cat No.: 0227), ammonium persulfate (APS) (cat No.: 0486), isopropyl β-D-1-thiogalactopyranoside (IPTG) (cat No.: 0487), and TEMED (cat No.: 0761) were purchased from Amresco, Framingham, MA, USA. The GeneJET plasmid miniprep kit (cat No.: K0502) and GeneJET gel extraction kit (cat No.: K0691) were purchased from Thermo Fisher Scientific, Lithuania, Europe. Luria Bertani (LB) medium (cat No.: M1245) and kanamycin (cat No.: MB105) were purchased from HiMedia, Mumbai, India. Sodium hydroxide (cat No.: 68451) and ethylene diamine tetra-acetic acid (EDTA) (cat No.: 054960) were purchased from SRL, Gurugram, India. All other chemicals and buffer salts were procured from Merck-Millipore, Darmstadt, Germany. Amicon® Ultracel® 10K (cat No.: UFC901096) and 3K (cat No.: UFC900324) centrifugal filters were purchased from Merck-Millipore, Cork, Ireland, and restriction enzymes, ligase, and polymerase enzymes were purchased from Fermentas, Vilnius, Lithuania. The Pierce™ BCA protein assay kit (cat No.: 23227) was procured from Thermo Fisher Scientific, Tokyo, Japan, and Nuvia™ IMAC Ni-Charged Resin (cat No.: 780-0800) was procured from Bio-Rad, Feldkirchen, Germany.
2.3. Construction of Fusion Nucleocapsid Protein (NP1-419) and its Variants (NP121-419 and NP250-365)
The SARS-CoV2 nucleocapsid gene NP1-419 (codon-optimised for E. coli) was procured from GenScript, Piscataway, NJ, USA. Its two variants, NP121-419 and NP250-365, were cloned in-house into the pET-28b vector with the 6Xhis tag at the C-terminus. NP121-419 and NP250-365 were amplified from the nucleocapsid gene plasmid with their respective primers; the primer details are provided in Table 2. The PCR amplification for NP121-419 consisted of 30 cycles, each comprising denaturation at 94 °C for 30 s, annealing at 62 °C for 30 s, and elongation at 72 °C for 45 s, followed by a final extension step at 72 °C for 5 min. Amplified genes and vectors were digested with restriction enzymes, purified by gel extraction, and ligated in the presence of T4 DNA ligase. The ligated product was transformed into E. coli DH5α cells for the screening of positive recombinant clones. Positive clones, pET-28-NP121-419 and pET-28-NP250-365, were identified by colony PCR under conditions similar to those mentioned above, and all positive recombinant clones were confirmed by Sanger sequencing. Glycerol stocks of the positive clones were prepared and stored at −70 °C or below for further use.
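As a quick sanity check for colony PCR, the expected coding lengths of the three constructs and the programmed cycling time can be worked out from the numbers above. The sketch below is illustrative only: the residue ranges are read from the construct names, the lengths cover the coding region alone (no 6Xhis tag, stop codon, or restriction sites), and the cycling time ignores ramping and any initial denaturation step, which the text does not specify.

```python
# Residue ranges taken from the construct names used in the text.
CONSTRUCTS = {
    "NP1-419": (1, 419),
    "NP121-419": (121, 419),
    "NP250-365": (250, 365),
}


def coding_length_bp(first_residue: int, last_residue: int) -> int:
    """Length in bp of the coding region only (three bases per residue)."""
    return 3 * (last_residue - first_residue + 1)


def cycling_time_s(cycles: int, step_seconds: list, final_extension_s: int) -> int:
    """Total programmed hold time; ramping between temperatures is ignored."""
    return cycles * sum(step_seconds) + final_extension_s


lengths = {name: coding_length_bp(*residues) for name, residues in CONSTRUCTS.items()}
total_s = cycling_time_s(cycles=30, step_seconds=[30, 30, 45], final_extension_s=5 * 60)

for name, bp in lengths.items():
    print(f"{name}: {bp} bp")
print(f"Programmed cycling time: {total_s / 60:.1f} min")
```

Under these assumptions the inserts are 1257, 897, and 348 bp, and the programme for NP121-419 holds for 57.5 min in total.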
2.4. Expression Study of Nucleocapsid Protein and Its Variants (NP1-419, NP121-419 and NP250-365)
The positive clones of pET-28-NP1-419 and the variants pET-28-NP121-419 and pET-28-NP250-365 were transformed into BL21(DE3) competent cells. The transformed colonies were grown overnight in Luria broth medium at 25 °C and 200 rpm. The overnight cultures were inoculated into the secondary medium, 1000 mL of Terrific broth (TB) with 1% glycerol, and grown at 25 °C and 200 rpm. Cultures were allowed to grow until the OD at 600 nm (OD600) reached 1, at which point 3% ethanol was added to the culture. The cells were then allowed to grow until the OD600 reached 1.5 and were induced with 0.5 mM IPTG. At 16 h post-induction, samples were collected for the evaluation of protein expression. According to the final OD of the cells, 10^8 cells/mL were collected from each flask, centrifuged at 8000× g, and resuspended in 100 µL of reducing SDS loading buffer. The samples were boiled for 10 min and centrifuged at 8000× g, and the supernatant was loaded on a 12% SDS-PAGE gel to check expression. The remaining culture was harvested by centrifugation at 8000× g for 30 min at 4 °C. The cell pellet was resuspended in lysis buffer (10 mM sodium phosphate pH 8, 150 mM NaCl, 0.25% Triton X100, 5 mM benzamidine, and 0.2% lysozyme) and sonicated in an ice bath. The soluble fraction of the protein was collected after centrifugation at 10,000× g for 30 min at 4 °C for further purification with IMAC.
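The additive volumes implied by the induction protocol follow from the dilution relation C1·V1 = C2·V2. The sketch below is a rough illustration under stated assumptions: a 1 M IPTG stock and absolute ethanol added as 3% v/v are assumptions (the text gives neither), and the final volume is approximated by the 1000 mL culture volume.

```python
def stock_volume_ml(final_conc: float, final_volume_ml: float, stock_conc: float) -> float:
    """Solve C1*V1 = C2*V2 for the stock volume V1 (both concentrations in the same unit)."""
    return final_conc * final_volume_ml / stock_conc


CULTURE_ML = 1000.0        # TB culture volume stated in the text
IPTG_STOCK_MM = 1000.0     # assumption: 1 M IPTG stock, not stated in the text
ETHANOL_STOCK_PCT = 100.0  # assumption: absolute ethanol, with 3% read as v/v

iptg_ml = stock_volume_ml(0.5, CULTURE_ML, IPTG_STOCK_MM)         # for 0.5 mM IPTG
ethanol_ml = stock_volume_ml(3.0, CULTURE_ML, ETHANOL_STOCK_PCT)  # for 3% ethanol

print(f"IPTG stock to add: {iptg_ml} mL")
print(f"Ethanol to add: {ethanol_ml} mL")
```

With a different stock concentration, only the `IPTG_STOCK_MM` constant needs to change.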
2.5. Purification of Nucleocapsid Protein and Its Variants (NP1-419, NP121-419 and NP250-365) with Immobilized Metal Affinity Chromatography (IMAC)
The soluble fractions of NP1-419, NP121-419, and NP250-365 were filtered through 0.45 µm filters. The filtered fraction was loaded on the FPLC column (ÄKTA start, Cytiva, Marlborough, USA) packed with Ni-NTA resin (Bio-Rad, California, USA) pre-equilibrated with five column volumes (CVs) of 10 mM sodium phosphate buffer pH 8, 50 mM NaCl, 5 mM benzamidine, and 5 mM imidazole at a flow rate of 0.5 mL/min. The flow-through was collected separately to further analyze the unbound proteins. The column was washed with 5 CV of base buffer containing 10 mM sodium phosphate, 5 mM benzamidine, 50 mM NaCl, and 5 mM imidazole. Elution was performed over 2 CV at 0.5 mL/min with an imidazole gradient of 20–500 mM. The eluted fractions were collected and analyzed on 12% SDS-PAGE [18,19]. The fractions containing a protein band of the size expected for NP1-419, NP121-419, or NP250-365 were pooled separately for each protein, and the buffer was exchanged to 10 mM sodium phosphate buffer pH 8 with 150 mM NaCl.
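To relate an eluted fraction to the imidazole concentration at which it left the column, the 20–500 mM gradient over 2 CV can be written as a simple function of eluted volume. This sketch assumes the gradient is linear, which the text does not state explicitly, and it ignores system dead volume; the function name is hypothetical.

```python
def imidazole_mM(volume_cv: float, start_mM: float = 20.0,
                 end_mM: float = 500.0, gradient_cv: float = 2.0) -> float:
    """Imidazole concentration after `volume_cv` column volumes of a linear gradient."""
    if not 0.0 <= volume_cv <= gradient_cv:
        raise ValueError("volume must lie within the gradient length")
    return start_mM + (end_mM - start_mM) * volume_cv / gradient_cv


# Concentration at the start, midpoint, and end of the gradient.
for cv in (0.0, 1.0, 2.0):
    print(f"{cv:.1f} CV -> {imidazole_mM(cv):.0f} mM imidazole")
```

Under the linearity assumption, the midpoint of the gradient (1 CV) corresponds to 260 mM imidazole.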
2.6. Qualitative and Quantitative Estimations of the Purified Proteins
The quantitative estimation of the purified protein was conducted with the Pierce BCA kit (Thermo Fisher Scientific) according to the manufacturer's instructions. For qualitative estimation, the purified NP1-419, NP121-419, and NP250-365 were analyzed by SDS-PAGE on a 12% SDS-polyacrylamide gel under reducing conditions (25% β-mercaptoethanol, v/v). The molecular weights of the proteins were estimated with suitable markers. The gel was stained with Coomassie blue R-250.
2.7. Activity of the Purified Nucleocapsid Protein and Its Variants with the COVID-19 Positive Sera
(i) Enzyme-linked Immunosorbent Assay (ELISA): The proteins generated in this study, along with the positive control protein (full-length nucleocapsid protein obtained from Fapon Biotech, China) and the negative control protein (glutathione S-transferase), were further analyzed for their activity with COVID-positive serum. For coating, 50 µL of each protein (stock 2 µg/mL) was coated in duplicate on a high-binding ELISA plate with carbonate buffer (pH 9.5) and incubated overnight at 4 °C. The next day, the plate was washed with 100 µL of wash buffer (1X PBS with 0.1% Tween 20). Then, 200 µL of blocking buffer (3% skimmed milk in 1X PBS) was added to the plate and incubated at RT for 1 h. After the blocking step, the plate was washed thrice with the wash buffer. Following the washing step, 50 µL of COVID-positive sera (1:50 dilution) was added to each well and the plate was further incubated at RT for 1 h. After primary antibody incubation, the plate was washed 6 times with the wash buffer. Then, 100 µL of secondary anti-human antibody (Sigma A0170; 1:10,000 dilution) was added to each well. The plate was incubated for 1 h at RT. After incubation, the plate was washed 6 times with wash buffer. Then, 100 µL of TMB/H2O2 was added to each well and incubated at RT for 10 min. Following this, 100 µL of stop solution (1 M sulphuric acid) was added to each well and the absorbance was measured at 450 nm with reference to 650 nm.
(ii) Lateral Flow Immunoassay (LFIA): LFIAs were prepared to check the seroreactivity of the recombinant proteins with COVID-positive sera. A strip containing an absorbent pad, nitrocellulose membrane, conjugate pad, and sample pad was placed inside the plastic housing/cassette. Goat anti-rabbit IgG was immobilized on the nitrocellulose membrane at control line position ‘C’ and anti-human IgG at test line position ‘2’. A gold conjugate pad containing colloidal gold sol conjugated with recombinant N-protein and rabbit IgG was prepared. Test strips of 3.0 mm were prepared and housed in a plastic cassette.
Colloidal gold sol was prepared using gold chloride and sodium citrate. The pH of the colloidal gold solution was adjusted to 8.0 ± 0.2 using potassium carbonate. Recombinant protein (4.0 mg) was added slowly to 1000 mL of pH-adjusted colloidal gold sol kept in a beaker on a magnetic stirrer. The colloidal gold sol containing the recombinant protein was mixed under mild stirring for one hour. A spectrophotometer scan was conducted to check for the absorbance maximum at 530 ± 5 nm. Unbound sites on the colloidal gold were blocked by adding BSA (bovine serum albumin) to a final concentration of 2%, and the entire solution was mixed gently for half an hour. The entire solution was then centrifuged, the supernatant was discarded, and the pellets were reconstituted in 10 mM Tris buffer (pH 8.0) containing BSA, Tween-20, sucrose, and trehalose. A cocktail of two conjugates (12 OD of recombinant nucleocapsid protein and 3 OD of rabbit IgG) was prepared for drying on a conjugate pad with a 5 mm width. The strip of the conjugate pad was impregnated with the cocktail of colloidal gold conjugates and dried in a humidity-controlled area for further use, along with a nitrocellulose membrane immobilized with goat anti-rabbit IgG at control line region ‘C’, anti-human IgG at test line region ‘2’, and anti-human IgM at test line region ‘1’.
Tests were run using known COVID-positive and -negative sera obtained from SS serum, India. A 10 µL sample was added to the sample well (S) of the plastic cassette/device, followed by 1–2 drops of assay buffer (50 mM Tris buffer, pH 8.5, containing biological detergents and preservatives). The results were read within 15–20 min.
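For the ELISA readout in (i), two small calculations are implicit: the antigen mass coated per well and the reference-corrected signal for duplicate wells. The sketch below is an illustration under stated assumptions: the 2 µg/mL stock is taken to be the coating concentration, "with reference to 650 nm" is interpreted as subtracting A650 from A450, and the absorbance values are hypothetical, not measured data.

```python
from statistics import mean


def coated_mass_ng(volume_ul: float, conc_ug_per_ml: float) -> float:
    """Mass per well in ng; µL x µg/mL is numerically equal to ng."""
    return volume_ul * conc_ug_per_ml


def corrected_signal(wells) -> float:
    """Mean of (A450 - A650) over replicate wells given as (a450, a650) pairs."""
    return mean(a450 - a650 for a450, a650 in wells)


# 50 µL of a 2 µg/mL solution per well, as in the coating step.
print(f"Coated antigen: {coated_mass_ng(50, 2):.0f} ng per well")

# Hypothetical duplicate readings for one antigen (illustration only).
duplicate_wells = [(1.25, 0.05), (1.15, 0.05)]
print(f"Corrected signal: {corrected_signal(duplicate_wells):.2f}")
```

Comparing the corrected signal of each construct against the negative control protein coated at the same mass keeps the comparison independent of plate background.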
2.8. Protein Stability Studies
The purified proteins were also checked for their stability at different temperatures. For this, 10 µg of each protein was stored in 3 separate sealed tubes, one at each temperature (−20 °C, 4 °C, and 37 °C), for 3 consecutive days. The temperature of each set was kept constant. Once the incubation period was over, the samples were analyzed on 12% SDS-PAGE for their stability and degradation at the different temperatures.
The protein samples from the stability study were also analyzed for intact mass by RP-HPLC coupled to mass spectrometry, using an Agilent 1260 Infinity Bio-inert Quaternary LC system (a quaternary pump with a degasser, an autosampler with a cooling unit, and a diode array detector (DAD)) connected to an Agilent 6230 ESI-TOF-MS instrument; the intact mass was measured with an AdvanceBio RP-mAb C8 column. The column was equilibrated with 90% mobile phase A (0.1% v/v formic acid (FA) in MilliQ water) and 10% mobile phase B (0.1% v/v FA in acetonitrile) prior to injection. Samples were filtered through 3 kDa Nanosep® centrifugal filters (Pall Corporation, New York, USA) and the buffer was exchanged to 0.1% v/v FA in MilliQ water. The sample was loaded onto the column at a concentration of 1 mg/mL and separated using a linear gradient of 10 to 60% B over 35 min at a flow rate of 0.5 mL/min. Detection was achieved by monitoring UV absorbance at 280 nm and recording the total ion chromatogram (TIC) over 1000–7000 m/z. Prior to analysis, positive ion mode calibration for MS spectra was performed. The fragmentor voltage (Vfrag) was set to 400 V, the gas temperature to 350 °C, and the capillary voltage (Vcap) to 5500 V. The MS spectra were deconvoluted using the Agilent MassHunter Qualitative Analysis and BioConfirm software, ver 10.0, utilizing maximum entropy (MaxEnt) and peak modelling (pMod) algorithms.
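Deconvolution works back from a series of multiply protonated ions to a single intact mass, so it helps to know which charge states of a protein can appear inside the 1000–7000 m/z acquisition window. The sketch below uses the standard positive-mode relation m/z = (M + z·m_p)/z, with m_p the proton mass; the 13,000 Da example mass is hypothetical and is not a measured or theoretical mass of any construct in this study.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in Da


def mz(mass_da: float, charge: int) -> float:
    """m/z of an [M + zH]^z+ ion in positive ion mode."""
    return (mass_da + charge * PROTON_MASS_DA) / charge


def charge_states_in_window(mass_da: float, mz_min: float = 1000.0,
                            mz_max: float = 7000.0, max_charge: int = 200) -> list:
    """Charge states whose m/z falls inside the acquisition window."""
    return [z for z in range(1, max_charge + 1) if mz_min <= mz(mass_da, z) <= mz_max]


example_mass = 13000.0  # hypothetical intact mass in Da, for illustration only
states = charge_states_in_window(example_mass)
print(f"Observable charge states: {states[0]}+ to {states[-1]}+")
```

A smaller construct therefore contributes fewer charge states to the window than the full-length protein, which is worth keeping in mind when comparing deconvoluted spectra across constructs.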
4. Conclusions
The approach adopted in this work involves reducing the size of the nucleocapsid antigen to enhance its stability while retaining the epitopes required for antibody recognition. Such an approach has the potential to improve the accuracy and efficiency of diagnostic tests. A previous study has shown that the truncation of antigens is key to improving the sensitivity of nucleocapsid antigens. It demonstrated that the C-terminal domain shows better sensitivity than the N-terminal domain; however, details of stability or manufacturability were not presented [7]. Integrating theoretical knowledge about B-cell epitope occurrence with peptide stability helped to identify a stretch that was experimentally quite stable, probably due to a decrease in net positive charge on the protein. Coincidentally, this stable region has also been identified as the C-terminal domain (CTD), which interacts with SARS-CoV2 RNA [20].
Stability-enhanced diagnostic antigens are less prone to denaturation or degradation during storage and assay procedures. We have also shown that the stability of the nucleocapsid can be improved by using a suitable formulation buffer (borax buffer with trehalose). However, such a formulation buffer does not support the further downstream process of manufacturing stable gold nanoparticles. We have shown the improved stability of the theoretically identified truncated short nucleocapsid protein over the native protein. We expect that this will increase the quality of portable testing devices, such as the Lateral Flow Immunoassay (LFIA), because antigen stability is a critical quality attribute that helps a diagnostic test perform consistently and reliably over time. Smaller antigens with enhanced stability are less likely to interact with unrelated antibodies or molecules, minimizing potential cross-reactivity issues. Specificity is crucial in diagnostic tests to avoid interference and improve test accuracy [21].
This kind of strategy can be used to improve diagnostic antigen stability for new and existing LFIA devices. These devices are intended for use in resource-limited settings and remote areas, where they improve access to diagnostic services. The strategy adopted here to improve stability for diagnostic applications can also be applied in fields other than diagnostics, particularly in immunology, biotechnology, and medicine.