1. Introduction
Hepatocellular carcinoma (HCC) is a global health problem: it is the fifth most common cancer and the third leading cause of cancer-related death [1]. Etiologically, the majority of HCC cases develop in chronic hepatitis B virus (HBV) carriers, especially in East Asia and sub-Saharan Africa, where HBV is endemic [2]. Although an effective vaccine has been in use for about two decades, more than 350 million people worldwide are chronic carriers of this virus [3]. The exact mechanism underlying hepatocarcinogenesis in chronic HBV infection remains elusive.
The HBV genome comprises a partially double-stranded circular DNA molecule of approximately 3200 base pairs. It encodes four overlapping open reading frames (ORFs): X, for the X protein; precore/core (C), for the nucleocapsid; pre-S/S, for the surface/envelope protein; and P, for the DNA polymerase [4]. HBV replicates through an RNA intermediate, using a reverse transcriptase that lacks a proofreading function. Thus, HBV accumulates replication errors at a much higher rate than other DNA viruses, with an estimated mutation rate of about 1 nucleotide/10,000 bases per year [5]. The effect of viral mutations on HCC pathogenesis has been investigated extensively, and several important high-risk mutations have been identified. The most convincing association between viral mutation and the development of HCC is that of the A1762T/G1764A double mutation in the basal core promoter (BCP) [6,7]. Additionally, pre-S deletion, the T1753V mutation in BCP, and the C1653T mutation in box-α of Enhancer II have been associated with an increased risk of HCC in several reports [8,9,10,11,12]. However, studies that search for other potential predictive mutations by comparative analysis of complete HBV genomes remain limited. Meanwhile, distinct clinical and virologic characteristics of HBV infection have been reported in different geographical regions, and are increasingly associated with the genetic diversity of the infecting virus [13]. The township of Qidong is one of the regions in China where HBV-related HCC is most highly endemic. In the first stage of this two-stage study, the complete HBV genome was analyzed in sera from patients in a prospective cohort of male HBV carriers in Qidong, in order to identify new mutation biomarkers of HCC development in addition to the traditional hot-spot mutations in the pre-S and X genes. In the second stage, an independent validation study was conducted to confirm the relationship between the newly identified mutations and HCC risk. Furthermore, the C gene was sequenced in sequential serum samples to assess the longitudinal evolution of C gene mutations during HCC development.
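To put the mutation-rate estimate quoted in this section on a per-genome scale, the short sketch below multiplies the approximate genome length by the quoted per-base rate. It assumes a constant rate applied uniformly across the genome, which is a simplification made only for illustration; the function name is hypothetical.

```python
# Figures quoted in the text; both are rough estimates.
GENOME_LENGTH_BP = 3200               # approximate HBV genome length
RATE_PER_BASE_PER_YEAR = 1 / 10_000   # about 1 nucleotide per 10,000 bases per year


def expected_substitutions(years: float) -> float:
    """Expected substitutions per genome after `years`, assuming a constant, uniform rate."""
    return GENOME_LENGTH_BP * RATE_PER_BASE_PER_YEAR * years


per_year = expected_substitutions(1)
print(f"{per_year:.2f} substitutions per genome per year")
print(f"about one substitution every {1 / per_year:.1f} years")
```

Under these assumptions the estimate corresponds to roughly 0.32 substitutions per genome per year, or about one substitution every three years.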
3. Discussion
HBV infection is a major risk factor for HCC occurrence; ≥75% of HCC cases in China are associated with HBV infection [14]. Compared with non-carriers, patients with chronic HBV infection have a greater than 100-fold increased risk of developing HCC [15]. During the course of chronic HBV infection, a wide variety of liver diseases are observed, ranging from an asymptomatic carrier state to liver cirrhosis and HCC [16]. In our previous studies conducted in Qidong, pre-S deletion and specific mutations in BCP were confirmed to be associated with a high risk of HCC occurrence [17,18]. However, the risk associated with mutations in other regions of the HBV genome has seldom been reported. In the present study, the full HBV genome was analyzed in the serum of patients within a large cohort of male HBV carriers in Qidong. The number of nucleotide substitutions in the full-length sequence was significantly higher in HCC cases than in controls. Our data also showed that mutations carrying a high HCC risk were not evenly distributed across the HBV genome. Ranked by the p value for the difference in mutation number between HCC and control patients, the regions were X (p = 0.002), pre-S2 (p = 0.015), pre-C/C (p = 0.016), P (p = 0.112), pre-S1 (p = 0.483), and S (p = 0.636); only X, pre-S2, and pre-C/C had p < 0.05. Consistent with previous studies, pre-S deletion and the pre-S2 start codon mutation in the pre-S gene, and the C1653T and A1762T/G1764A mutations in the X gene, were associated with a significantly higher risk of HCC development. Furthermore, in this full HBV genome comparison between 30 HCC cases and 30 controls, we also identified some rarely reported or new HCC-related mutations, including G1613A in the X gene, and A2159G, A2189Y, G2203W, and C2288R in the C gene. These new high-risk mutations, together with the confirmatory mutations in the pre-S and Enhancer II/BCP regions from previous studies, suggest that mutation combinations across the full genome sequence might serve as viral markers for predicting the development of HBV-related HCC.
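Several of the mutation labels used here (A2189Y, G2203W, C2288R, and T1753V in the Introduction) end in an IUPAC ambiguity code rather than a single base. As a reading aid, the following minimal sketch expands such labels into the explicit substitutions they cover. The lookup table holds the standard IUPAC nucleotide codes; the helper name `expand_mutation` is hypothetical and not part of the study's methods.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_mutation(name: str) -> list[str]:
    """Expand a label such as 'A2189Y' into explicit single-base substitutions."""
    match = re.fullmatch(r"([ACGT])(\d+)([A-Z])", name)
    if match is None or match.group(3) not in IUPAC:
        raise ValueError(f"unrecognised mutation label: {name!r}")
    ref, pos, alt_code = match.groups()
    # Skip the reference base: a change to the wild-type base is no mutation.
    return [f"{ref}{pos}{alt}" for alt in IUPAC[alt_code] if alt != ref]


for label in ["G1613A", "A2159G", "A2189Y", "G2203W", "C2288R", "T1753V"]:
    print(label, "->", expand_mutation(label))
```

Read this way, A2189Y covers A2189C and A2189T, G2203W covers G2203A and G2203T, and C2288R covers C2288A and C2288G.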
Among the 10 HCC-related mutations identified by the full genome analysis in stage 1, four (A2159G, A2189Y, G2203W, and C2288R) were located in the C gene. The carcinogenic risk of pre-S deletion and the pre-S2 start codon mutation in the pre-S gene, and of the C1653T and A1762T/G1764A mutations in the X gene, has already been extensively investigated in this cohort [17,18]. Compared with studies focusing on mutations in the pre-S and X genes, only a few studies have investigated the effect of mutations in the C gene during natural HBV infection. The clinical effect of mutations in this region is less well elucidated, and the results have been inconsistent [19,20,21,22]. Thus, the temporal relationship between C gene mutations and HCC in chronic HBV infection needs further study. In view of this, we carried out an independent validation study to confirm the findings from the initial full-length sequence comparison. After adjustment for age, history of cigarette smoking, and alcohol consumption, unconditional logistic regression analyses showed that pre-S deletion, C1653T, A1762T/G1764A, A2159G, A2189Y, G2203W, and C2288R mutations were significantly associated with high HCC risk. Multivariate analysis indicated that pre-S deletion, A1762T/G1764A, A2159G, and A2189Y mutations were independent risk factors for HCC progression. To our knowledge, the clinical implications of these C gene mutations for HCC occurrence have been reported in only a few studies. Ni et al. reported that children with HCC had more mutations in the C gene than chronic HBV carriers. In that small-scale study, mutations at core codons 74, 87, and 159 were related to the development of HCC [21]. In a nested case–control study within a prospective study from Taiwan, six mutations in the C gene (nt 1961, 1938, 2045, 2136, 2239, and 2441) were associated with a decreased risk of HCC after accounting for viral genotype. These mutations were also associated with a 0.7- to 1-log decrease in plasma viral load and a high rate of HBeAg seroconversion [19]. However, we did not observe such a protective effect of C gene mutations on HCC development in the current study from Qidong, in mainland China. We speculate that this is because the clinical and virologic characteristics of HBV infection differ between geographical regions, for example in the prevalent HBV genotypes and sub-genotypes in Taiwan and Qidong. Recently, Zhu et al. demonstrated in another study from Qidong that A2189C and G2203W mutations were independent risk factors for HCC, with odds ratios of 3.99 and 9.70, respectively [22]. In accordance with the results of Zhu et al., we confirmed in the present study that A2159G, A2189Y, G2203W, and C2288R mutations were associated with high HCC risk in univariate analysis. However, in multivariate analysis, G2203W and C2288R mutations were not independent risk factors for HCC occurrence. The exact mechanism of hepatocarcinogenesis related to these C gene mutations remains uncertain. Theoretically, hepatitis B core antigen (HBcAg) contains the principal target for cytotoxic T lymphocyte (CTL) attack, as well as various epitopes recognized by immune cells such as T or B lymphocytes. Amino acid substitutions in the C gene may change the immune recognition sites of HBcAg, thereby allowing the virus to elicit or evade immune clearance and have a more direct impact on the natural course of hepatitis B [23,24,25]. A2159G and A2189Y are missense mutations that change HBcAg codons 87 (S87G) and 97 (I97L/F), respectively. Because codon 87 is located in a B cell epitope of the C gene, the missense substitution at codon 87 may alter the recognition site for B cells or antibodies and allow the virus to escape antibody attack [26]. Substitution at codon 97 was the most frequently detected substitution of the C gene in this study. Since codon 97 is located within a potent T-cell epitope, this substitution may lower the quantity of antigen presentation in secretions of immature HBV particles. Thus, the codon 97 substitution may inhibit the immune response and lead to successful maintenance of chronic infection in human HBV carriers [27,28]. Prolonged viral persistence causes continuous liver injury and subsequent regeneration, which significantly increases the risk of HCC.
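The correspondence between nucleotide positions and core codons stated above (A2159G at codon 87, A2189Y at codon 97) can be reproduced with simple arithmetic. The sketch below assumes that the core ORF start codon begins at nucleotide 1901; that value is an assumption chosen because it reproduces the two codon numbers given in the text, and numbering may differ between genotypes and reference sequences. The codon numbers it yields for positions 2203 and 2288 are derived from the same assumption and are not stated in the text.

```python
CORE_START_NT = 1901  # assumption: first nucleotide of the core ORF start codon


def core_codon(nt_position: int) -> tuple[int, int]:
    """Return (codon number, position within the codon, 1 to 3) for a core-gene nucleotide."""
    offset = nt_position - CORE_START_NT
    if offset < 0:
        raise ValueError("position lies upstream of the assumed core start")
    return offset // 3 + 1, offset % 3 + 1


for nt in (2159, 2189, 2203, 2288):
    codon, pos = core_codon(nt)
    print(f"nt {nt}: core codon {codon}, codon position {pos}")
```

Under this numbering, positions 2159 and 2189 both fall on the first base of their codons, which is consistent with the single-base changes S87G and I97L/F described in the text.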
The effect of combined mutations in the Enh II/BCP regions on HCC risk has been documented in several studies [29,30,31]. It has been reported that most C gene mutations tend to cluster in the middle core region [32,33]. However, the majority of earlier studies focused on the relationship between a single point mutation of the C gene and HCC. In the present study, we explored the HCC risk associated with combined mutations in the C gene, including the A2159G, A2189Y, G2203W, and C2288R mutations, and evaluated the value of each mutation, alone or in combination, for the prediction of HCC. The key finding of this study was that the number and pattern of multiple mutations in the C gene (A2159G, A2189Y, G2203W, and C2288R) showed additive combined effects related to HCC progression. Compared with patients who were wild-type at all four hot-spot nucleotides, patients with any mutation combination in the C gene had an increased risk of HCC. The OR for HCC was 2.904 (95% CI, 1.508–5.590) with any single hot-spot mutation, and increased to 6.027 (95% CI, 2.232–16.275) with double mutations and to 8.630 (95% CI, 1.588–46.891) with triple mutations. HCC risk thus showed a significant biological gradient with an increasing number of mutations in the C gene. We then collected a series of serum samples spanning the years before and after HCC diagnosis. This longitudinal observation demonstrated a sequential and cumulative combination of mutations in the C gene during the development of HCC. We speculate that, during the course of chronic HBV infection, the accumulation of complex HBV mutations may play a sequential and synergistic role in the development of HCC. Although the mechanism is unclear, this finding suggests that the detection of these combined mutations may help identify subjects at high risk of HCC among chronic HBV carriers. Additionally, HCC mostly develops in patients with cirrhosis. Therefore, HCC and cirrhosis may share the same risk factors, including high HBV DNA levels, certain genotypes, and naturally occurring viral mutations. In view of this, we speculate that such mutations in the C gene accompany the progression of advanced liver disease, including not only HCC but also liver cirrhosis.
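As a consistency check on the odds ratios reported above, the following sketch recovers the point estimate and the standard error of log(OR) implied by each 95% confidence interval. It assumes the intervals are Wald intervals, symmetric on the log scale, as is usual for logistic regression output; the text does not state how the intervals were computed, so this is an assumption, and the function name is hypothetical.

```python
import math

# Odds ratios and 95% CIs as reported in the text: (OR, lower, upper).
reported = {
    "single": (2.904, 1.508, 5.590),
    "double": (6.027, 2.232, 16.275),
    "triple": (8.630, 1.588, 46.891),
}

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_summary(lower: float, upper: float) -> tuple[float, float]:
    """Return the OR and SE of log(OR) implied by a Wald CI (assumed log-symmetric)."""
    log_or = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return math.exp(log_or), se


for label, (odds_ratio, lower, upper) in reported.items():
    implied_or, se = wald_summary(lower, upper)
    print(f"{label}: reported OR {odds_ratio:.3f}, "
          f"implied OR {implied_or:.3f}, SE(log OR) {se:.3f}")
```

Under this assumption each reported OR matches the geometric mean of its interval bounds to within rounding, and the implied standard error grows from single to triple mutations, so the estimate for triple mutations is the least precise of the three.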
The present study has several limitations. First, most analyses of HBV mutations were based on a single blood sample obtained at the diagnosis of HCC, so we could not assess the effect of changes in mutation status on the development of HCC. Second, direct sequencing reveals only the predominant strains in the host and may underestimate the true mutation level, because mixed infection with different viral strains is common. Third, liver cirrhosis or fibrosis, an important risk factor for HCC, was not evaluated in this study. Finally, all study subjects were male; a larger cohort and a longer follow-up time are needed for a similar study in females. Given these limitations, our results and conclusions should be interpreted with caution, and future studies assessing the effect of C gene mutations on the development of HCC should be designed to overcome them.