1. Introduction
The major histocompatibility complex (MHC) encodes a series of cell surface molecules from a large gene family present in all vertebrates. These molecules occupy an important place in the immune system. Generally, MHC can be divided into three classes. Class I includes peptide-binding proteins that select short peptides for antigen presentation, as well as molecules that aid antigen processing. Class II includes peptide-binding proteins and proteins that assist antigen loading onto the class II peptide-binding proteins. Functioning very differently from the other two classes, class III molecules include several secreted proteins with immune functions, such as components of the complement system, cytokines and heat shock proteins. Because of these important immune roles, MHC molecules have attracted great interest from researchers exploring their functional mechanisms in antigen processing and presentation, mate selection and transplant rejection, as well as their evolutionary diversity.
MHC genes are notable for their polymorphism, which is reflected both in sequence variation and in the presence of multiple loci. Some MHC loci are extraordinarily polymorphic while others are not. Researchers have produced the first complete sequence and gene map of the human MHC [1]. Two hundred and twenty-four gene loci were identified on chromosome 6, and one hundred and twenty-eight of them are predicted to be expressed. These results have uncovered the extraordinary polymorphism and evolution of this region. However, in some species MHC loci are generally monomorphic. Whale MHC genes, for instance, are less polymorphic than those of humans [2]. In any case, polymorphic MHC genes are considered to be related to disease resistance in vertebrates, helping an organism to fight various parasites and pathogens, which partly influences the evolution of the organism in its living environment.
Great efforts have been made to reveal the complex structures of teleost MHC genes. For example, in half-smooth tongue sole (Cynoglossus semilaevis), two MHC class II A genes have been identified [3], and the allele distribution in several individuals suggests that at least two loci exist for each gene. By analyzing bacterial artificial chromosome (BAC) clones, researchers found nine class II A and 15 class II B loci in the tilapiine fish Oreochromis niloticus [4]. By hybridization with specific probes and gene sequencing, two class II A loci and six class II B loci have been identified in zebrafish (Brachydanio rerio) [5]. In recent years, more studies have focused on the origin and evolution of these genes. Researchers have demonstrated a novel class II DE group in some teleost species and shown that teleost class II genes can be classified into three major groups [6]. Medaka (Oryzias latipes) has five pairs of expressed class II genes, each comprising one II A and one II B gene, and the tightly linked II A and II B genes have undergone conserved evolution [7]. In these studies, accurate information on MHC structure and locus distribution in the genome enabled a deeper understanding of their evolution.
Despite the abundant research on MHC gene structure, some problems remain to be solved. The lack of clear locus delimitation and of complete sequence information makes it harder to uncover the complexity of the MHC system. Conclusions deduced from partial MHC sequences with ambiguous locus information may be inaccurate and misleading, so it is necessary to distinguish MHC gene sequences from different loci. Here we analyzed genomic sequences of the MHC class II B gene, which encodes the β chain of MHC class II molecules, in stone flounder (Kareius bicoloratus, Basilewsky, 1855) and Japanese flounder (Paralichthys olivaceus, Temminck and Schlegel, 1846). The two species belong to different families of Pleuronectiformes. The sequences were obtained by gene cloning and sequencing, and some of them covered all exons and introns of the gene. In this way, we aimed to infer the locus number of this gene in stone flounder. We also identified the MHC class II B gene in Japanese flounder and tried to estimate its locus number from a homozygous diploid genome. By enriching basic MHC sequence information, our study may contribute to more accurate prediction of gene locus number and assist further research on MHC genes.
3. Discussion
Because neither a BAC library nor whole-genome information is available for stone flounder, we used two strategies in this study to minimize errors. The first concerned primer design. No sequence disparity was found in the 5'-UTR or 3'-UTR when rapid amplification of cDNA ends (RACE) was conducted for stone flounder gene B [8], so the primers for amplifying the full-length cDNA and DNA sequences were designed in the UTR regions. When amplifying the partial DNA sequences in eight individuals, we designed the primers in the relatively conserved regions of exons 1, 3 and 4, according to the cDNA sequences (Figure 1). The amplification of partial gene B sequences provided new evidence for the complexity of gene B structure. We used only one pair of primers to amplify the full-length DNA sequence of this gene in stone flounder or Japanese flounder. All the exons and introns could thus be amplified in a single reaction, which ensured the accuracy of the gene structure analysis. Considering the conservation of the UTR sequences, we inferred that the sequences obtained in this study were largely comprehensive, although we could not exclude the possibility that the RACE primers were not degenerate enough to capture all UTR sequences. The second strategy was to use high-fidelity DNA polymerase in every PCR, to run multiple batches with the same templates and primers in different thermocyclers at different times, and to sequence as many clones as possible. High-fidelity DNA polymerase reduces the proportion of amplification errors, and repeating the same experimental batches at different times reduces the proportion of erroneous sequences. In addition, we required that each sequence used in the analysis be verified by at least three clones. This standard, however, might also exclude sequences that were expressed at lower levels or that were harder to amplify, so it represents a compromise. Combining the two strategies, we considered the sequences obtained in this study to be relatively comprehensive and accurate.
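The three-clone verification rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `verified_sequences`, the `min_clones` parameter and the toy reads are hypothetical, exact string identity is assumed to define "the same sequence", and this is not the pipeline actually used in the study.

```python
from collections import Counter


def verified_sequences(clone_reads, min_clones=3):
    """Return distinct sequences supported by at least `min_clones` clones.

    Sequences are compared after upper-casing; exact identity is assumed
    to mean "the same sequence" (an assumption of this sketch).
    """
    counts = Counter(read.upper() for read in clone_reads)
    return {seq: n for seq, n in counts.items() if n >= min_clones}


# Hypothetical clone reads (toy strings, not data from this study).
reads = ["ATGGCT", "ATGGCT", "atggct", "ATGGCA", "ATGGCA", "ATGACT"]

kept = verified_sequences(reads)
print(kept)  # only the sequence seen in three clones is retained
```

Raising `min_clones` removes more PCR or sequencing errors but, as noted in the text, also discards rare genuine sequences, which is why the threshold is a compromise.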
Figure 5.
DNA sequence and putative amino acid sequence of the Japanese flounder MHC class II B. The intron sequences are in lowercase letters and the exon sequences are in uppercase letters. The amino acid sequences encoded by the open reading frame (ORF) are in uppercase letters below the exon sequences. The translational initiation codon and termination codon are shaded. Four cysteines are boxed. The start of the GT repeat in intron 2 and the decamer repeat GTCCAGTTGA in intron 3 are underlined.
Figure 6.
Alignment of peptides encoded by MHC class II B in fourteen species. Gaps are indicated by dashes. Conserved sites are marked in black. The consensus sites are indicated on top. The GenBank numbers of these sequences are: AFY98547 (stone flounder), AII02001 (Japanese flounder), ADB43564 (spotted halibut), AAP20186 (red sea bream), ABV48909 (large yellow croaker), ADV36785 (miiuy croaker), AEM75094 (grass carp), CAD87794 (zebrafish), AAD53026 (rainbow trout), CAA49726 (Atlantic salmon), AAC34836 (guppy), P18469 (mouse), and AAA59781 (human).
Figure 7.
Phylogenetic tree of peptides encoded by MHC class II B in fourteen species. The human MHC class II B was set as an outgroup. The 11 alleles in stone flounder and the four alleles in Japanese flounder are indicated by their GenBank numbers. Sequences from other species are the same as those in Figure 6.
Figure 8.
MHC class II B structures of seven fish species. Exons are indicated by boxes and introns are indicated by lines. The taxonomic status of these species is: stone flounder: Pleuronectidae; Japanese flounder: Paralichthyidae; half-smooth tongue sole: Cynoglossidae; spotted halibut: Pleuronectidae; miiuy croaker: Sciaenidae; grass carp: Cyprinidae; and zebrafish: Cyprinidae.
Many studies have focused either on a limited subset of MHC loci or on partial sequences, such as the peptide-binding region (PBR) and its flanking sequences. Such indistinct locus definitions may introduce much redundant information into MHC studies. Some researchers have emphasized the significance of complete MHC class II B sequences covering the full genetic diversity present in a species, which can provide a useful tool for comparing the molecular evolution of these genes between different groups of vertebrates [9]. In zebrafish (Brachydanio rerio), three possible loci were initially identified based on 20 MHC class II B cDNA sequences, but after the intron sequences were investigated, the presence of a fourth locus was inferred [10]. In three-spined stickleback (Gasterosteus aculeatus), fifteen distinct class I exon 2 and exon 3 sequences were assigned to twelve loci based on their intron 2 length differences, and were further grouped into three families derived from different ancestral genes [11]. Although results deduced from partial MHC sequences may not fundamentally affect the precision of an analysis, discriminating MHC loci by full-length sequences does matter in related studies. Here in stone flounder, we found that some alleles with similar intron 1 sequences could differ greatly in the other intron sequences (Figure 3). This observation suggests that MHC genomic sequences may contain much more information than previously expected.
A homozygous or haploid genome is beneficial for studying the locus number of MHC genes. With a single copy of the MHC class II B gene, potbellied seahorse (Hippocampus abdominalis) provides an ideal system for MHC-related studies [12]. When such material is unavailable, the genome of a single individual offers easier access to locus number analysis, especially for heterozygous diploid material. However, the disparity in locus number among individuals should not be ignored. In a study of cichlid MHC class II B, altogether 17 different loci were found in various cichlid species, but probably none of the individuals carried all the loci in a single haplotype, with the number varying from one to thirteen per individual [13]. In this study, we estimated locus number from partial DNA sequences of eight individuals, and analyzed the disparity of each intron using full-length sequences from only one individual. This rough estimate indicated that at least four MHC class II B loci exist in this species and that not all individuals carry all of them. To test locus number analysis in a homozygous individual, a homozygous diploid Japanese flounder was used in this study. Not surprisingly, its level of MHC class II B polymorphism, in terms of cDNA polymorphism, was lower than that of other diploid individuals. When the full-length DNA sequence was cloned, no fragments of different lengths were observed in the electrophoresis pattern. It could be conjectured that if more loci of this gene were found in Japanese flounder, their sequences would differ in intron length.
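The reasoning behind such a lower bound can be made explicit. A diploid individual carries at most two alleles per locus, so k distinct sequences recovered from one heterozygous individual imply at least ceil(k / 2) loci; in a fully homozygous individual each distinct sequence implies its own locus. The sketch below encodes only this counting argument. The function `min_loci` and the example counts are hypothetical, it assumes every sequence is a genuine allele (no PCR chimeras or undetected alleles), and it is not necessarily the procedure followed in this study.

```python
import math


def min_loci(distinct_alleles, ploidy=2, homozygous=False):
    """Lower bound on locus number from alleles seen in ONE individual.

    distinct_alleles: number of distinct verified sequences in the individual.
    ploidy:           chromosome sets (2 for a diploid).
    homozygous:       True if every locus carries a single allele.
    """
    if distinct_alleles < 0:
        raise ValueError("distinct_alleles must be non-negative")
    alleles_per_locus = 1 if homozygous else ploidy
    return math.ceil(distinct_alleles / alleles_per_locus)


# Hypothetical counts, for illustration only.
print(min_loci(7))                   # heterozygous diploid with 7 sequences
print(min_loci(2, homozygous=True))  # homozygous diploid with 2 sequences
```

Because the bound is computed per individual, the species-level estimate is the maximum over individuals, and it can only grow as more individuals are sampled.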
The genomic structure of MHC class II B is not highly conserved among teleosts. A previous comparison of genomic structures revealed the presence of an extra intron in stone flounder [8]. In this study, it was also found in Japanese flounder, which differs from the result of an earlier study [14]. This extra intron between exons 3 and 4 has been demonstrated in the Percomorpha and Atherinomorpha but not in non-acanthopterygian fishes. It arose after the divergence of Ostariophysi and Protacanthopterygii, and before the divergence of Acanthopterygii [15]. The extra intron has also been reported in grass carp (Ctenopharyngodon idella) [16], miiuy croaker (Miichthys miiuy) [17], spotted halibut (Verasper variegatus) [18] and half-smooth tongue sole (Cynoglossus semilaevis) [19], but not in zebrafish (Brachydanio rerio) [5]. The variant genomic structures of MHC class II B may help to understand its evolution. The four Pleuronectiformes species all possess this extra intron and a conserved MHC class II B structure, despite belonging to different families. Both the perciform miiuy croaker and the cypriniform grass carp have this extra intron, whereas another cypriniform, zebrafish, does not. It can reasonably be inferred that this extra intron is a common feature of Pleuronectiformes fishes, and that the structure may vary in other orders of fish. A repeated hexamer, CCAGGT, is conspicuous in this extra intron of perch-like fish [15]. Noting the 5' (GT) and 3' (AG) splice sites in this hexamer, the authors hypothesized that further duplications of the hexamer accumulate once a tandem duplication arises in this region. Intriguingly, in the present study, the repeated decamer GTCCAGTTGA in stone flounder and Japanese flounder also contained a 5' and a 3' splice site, which is consistent with that hypothesis.
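The presence of both splice-site dinucleotides in the two repeat units can be checked directly. The short sketch below simply lists where GT and AG occur in a motif; the helper name `splice_dinucleotides` is hypothetical, and finding these dinucleotides does not by itself show that a site is functional in splicing.

```python
def splice_dinucleotides(motif):
    """Return 0-based start positions of GT (5' donor-like) and
    AG (3' acceptor-like) dinucleotides in `motif`."""
    motif = motif.upper()

    def positions(dinucleotide):
        return [i for i in range(len(motif) - 1)
                if motif[i:i + 2] == dinucleotide]

    return {"GT": positions("GT"), "AG": positions("AG")}


# The hexamer reported in perch-like fish and the decamer found here.
for unit in ("CCAGGT", "GTCCAGTTGA"):
    print(unit, splice_dinucleotides(unit))
```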
Microsatellites have been detected in introns and flanking regions of MHC genes in many species. In house mice (Mus domesticus), a series of microsatellites in and near the MHC provides a simple and inexpensive method to discriminate haplotypes [20]. In humans, the association between some microsatellites and haplotypes provides a powerful tool for studying genetic drift and admixture of populations [21]. So far, several species have been shown to possess locus-specific intron length variation, most of which arises from microsatellites. In a study of two species of cichlid fishes, the amplified segment lengths of MHC class II B varied from 335 to 457 bp, owing to different repeat numbers of a 12-nucleotide element in intron 1. This can serve as a marker for the classification of genes into groups [22]. In half-smooth tongue sole, two distinct loci of MHC class II B were identified by the different nucleotides and lengths of intron 1, which also contains AC repeats [23]. In the stone flounder and Japanese flounder MHC class II B, abundant repeats, such as GT and GTCCAGTTGA, were found in the introns, and differences in their repeat numbers contributed to the diverse intron lengths. In intron 4 of Japanese flounder MHC class II B, there was only one copy of ACCTGTCT, which was similar to the repeat unit in stone flounder. It is reasonable to suspect that these repeats play a role in changing the intron lengths of different gene loci in diploid or polyploid genomes, although only a DNA sequence from one locus was acquired from the homozygous Japanese flounder individual.
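How repeat copy number translates into intron length differences can be illustrated with a small sketch that counts the longest uninterrupted tandem run of a repeat unit. The function `longest_tandem_run` and the two intron fragments are hypothetical toy examples built around the GT and GTCCAGTTGA units named above; they are not sequences from this study.

```python
import re


def longest_tandem_run(sequence, unit):
    """Return the largest number of back-to-back copies of `unit`
    found anywhere in `sequence` (0 if the unit is absent)."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    runs = [len(match.group(0)) // len(unit)
            for match in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


# Hypothetical intron fragments (lowercase, as introns are shown in Figure 5).
intron_a = "ccat" + "gt" * 5 + "ac" + "gtccagttga" * 2 + "tt"
intron_b = "ccat" + "gt" * 11 + "ac" + "gtccagttga" + "tt"

for name, intron in (("intron_a", intron_a), ("intron_b", intron_b)):
    print(name, len(intron),
          longest_tandem_run(intron, "GT"),
          longest_tandem_run(intron, "GTCCAGTTGA"))
```

In this toy case the two fragments differ in length only because of their repeat copy numbers, which is the kind of variation that can make intron length a locus marker.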
Pseudogenes are not rare in the MHC family. In cattle, two out of three DRB genes are pseudogenes [24]. Of six different class II loci in zebrafish (Brachydanio rerio), three are truncated pseudogenes [5]. A class I pseudogene in the rat is distinguished by an additional stop codon in exon 2 [25]. In the No. 3 stone flounder individual, each of the four loci corresponded to one transcriptional product, so with the present results no pseudogene could be demonstrated for the stone flounder MHC class II B gene. Alignment of the different cDNA sequences from stone flounder showed that polymorphic sites were distributed mainly in the PBR. This constrained distribution has also been found in many MHC studies and can be explained by the peptide-binding function of the PBR. Some researchers have proposed that the dN/dS pattern differs between the PBR and other regions [26]. In the stone flounder MHC class II B, dN was higher than dS in the PBR, whereas in the remaining regions the two values were similar. Positive selection might therefore play a prominent role in the evolution of the PBR of the stone flounder MHC class II B.
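For readers unfamiliar with the dN/dS comparison, one widely used estimator (the Nei-Gojobori method with the Jukes-Cantor correction) is summarized below. This section does not state which estimator was applied here, so the formulas are given only as a common reference point. The symbols are: S and N, the numbers of synonymous and nonsynonymous sites; S_d and N_d, the numbers of observed synonymous and nonsynonymous differences between two aligned coding sequences.

```latex
\begin{aligned}
p_S &= \frac{S_d}{S}, \qquad p_N = \frac{N_d}{N},\\[4pt]
d_S &= -\frac{3}{4}\,\ln\!\left(1 - \frac{4}{3}\,p_S\right), \qquad
d_N = -\frac{3}{4}\,\ln\!\left(1 - \frac{4}{3}\,p_N\right),\\[4pt]
\omega &= \frac{d_N}{d_S}.
\end{aligned}
```

A ratio above 1 (dN higher than dS) is conventionally read as a signature of positive selection, a ratio near 1 as neutral evolution, and a ratio below 1 as purifying selection, which matches the contrast drawn above between the PBR and the remaining regions.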
The multi-locus feature of MHC genes is one expression of their high polymorphism. Since at least four MHC class II B loci were found in one stone flounder individual, the duplicated MHC genes might enhance the diversity of the immune response in this species, and more loci of this gene may yet be found in stone flounder. However, more MHC gene loci do not necessarily mean better immune ability. Organisms balance high immune ability against the burden they can tolerate. Too many disparate MHC molecules may occupy too much space on the cell membrane. As proposed by some researchers, increasing the number of MHC molecules per individual will increase the number of foreign peptides that can be presented and the number of different T-cell receptors (TCRs) positively selected in the thymus, but it will also reduce the number of TCRs through negative selection [27]. The number of MHC molecules expressed in an individual is therefore constrained in practice.
The present study provides information on the complexity of MHC gene structure and establishes a feasible method to infer the locus number of such genes. At least four loci of MHC class II B were found in eight stone flounder individuals. We also examined this gene in one heterozygous diploid individual and in a homozygous genome, and described its sequence features. These results should assist further extensive polymorphism studies and the development of molecular markers to distinguish individuals and populations. They should also help to deduce the evolutionary relationships of MHC genes.