XB-ART-53689
Dev Biol July 1, 2017; 427 (1): 84-92.
Co-accumulation of cis-regulatory and coding mutations during the pseudogenization of the Xenopus laevis homoeologs six6.L and six6.S.
Common models for the evolution of duplicated genes after genome duplication are subfunctionalization, neofunctionalization, and pseudogenization. Although the crucial roles of cis-regulatory mutations in subfunctionalization are well-documented, their involvement in pseudogenization and/or neofunctionalization remains unclear. We addressed this issue by investigating the evolution of duplicated homeobox genes, six6.L and six6.S, in the allotetraploid frog Xenopus laevis. Based on a comparative expression analysis, we observed similar eye-specific expression patterns for the two loci and their single ortholog in the ancestral-type diploid species Xenopus tropicalis. However, we detected lower levels of six6.S expression than six6.L expression. The six6.S enhancer sequence was more highly diverged from the orthologous enhancer of X. tropicalis than the six6.L enhancer, and showed weaker activity in a transgenic reporter assay. Based on a phylogenetic analysis of the protein sequences, we observed greater divergence between X. tropicalis Six6 and Six6.S than between X. tropicalis Six6 and Six6.L, and the observed mutations were reminiscent of a microphthalmia mutation in human SIX6. Misexpression experiments showed that six6.S has weaker eye-enlarging activity than six6.L, and targeted disruption of six6.L reduced the eye size more significantly than that of six6.S. These results suggest that enhancer attenuation stimulates the accumulation of hypomorphic coding mutations, or vice versa, in one duplicated gene copy and facilitates pseudogenization. We also underscore the value of the allotetraploid genome of X. laevis as a resource for studying latent pathogenic mutations.
PubMed ID: 28501477
Article link: Dev Biol
Species referenced: Xenopus tropicalis Xenopus laevis
Genes referenced: c14orf39 neurog3 pam six3 six6
Article Images:
|Fig. 1. Synteny conservation around six6 loci in Xenopus tropicalis, X. laevis, and Homo sapiens, and quantitative divergence in the expression of X. laevis six6 homeologs. (A) Diagram of orthologous or homeologous chromosomal regions around X. tropicalis six6, X. laevis six6.L and six6.S, and H. sapiens SIX6. Pentagonal arrows represent genes and their transcriptional directions, and those with the same color are orthologs or homeologs. Gene names are indicated only on the X. tropicalis chromosome, except for six6 genes. The gene named c14orf39 in H. sapiens was named LOC100486324 in the X. tropicalis genome assembly. *Gene models for X. laevis c14orf39/LOC100486324 are absent from the genome assembly, but a BLAST search with X. tropicalis LOC100486324 identified partial coding sequences of X. laevis orthologs in the corresponding regions of chromosomes 8L and 8S. (B) RNA-seq analysis of six6.L and six6.S using whole embryos at various developmental stages. Graph label suffixes “_T” and “_U” indicate the duplicated data sets “Taira201203” and “Ueno201210”, respectively. (C) In situ hybridization analysis of X. tropicalis six6, X. laevis six6.L, and six6.S expression in tailbud embryos. Arrowheads and arrows indicate expression in the retina and optic stalk, respectively.|
|Fig. 2. The six6.S CNE was more highly diverged from X. tropicalis six6 CNE than the six6.L CNE. (A) Genomic sequence of the X. tropicalis six6 locus aligned with the orthologous X. laevis, H. sapiens, M. musculus, and T. rubripes sequences using the Vista alignment tool. A black arrow indicates the X. tropicalis six6 gene and its transcriptional direction. Conserved non-coding, coding, and untranslated sequences are shaded in vermilion, dark blue, and light cyan, respectively. A magenta box indicates a CNE identified with the default conservation thresholds of the alignment tool (70% identity over 100 bp). The scale at the bottom of the alignment indicates the relative positions in the X. tropicalis six6 locus. (B) Unrooted neighbor-joining tree showing the relationships among X. tropicalis six6-CNE, X. laevis six6.L-CNE, and six6.S-CNE. Numbers in parentheses indicate distances calculated based on nucleotide substitution rates. (C) Alignment of six6-CNE sequences from X. tropicalis, X. laevis, H. sapiens, M. musculus, and T. rubripes. Nucleotides identical to those in the X. tropicalis sequences are shaded in gray. Broken lines indicate unaligned sequences. Putative transcription factor-binding motifs mapped in the conserved core region are shown in boxes of different colors.|
|Fig. 3. Quantitative divergence in the enhancer activity of homeologous six6-CNEs. (A) GFP expression detected by in situ hybridization in founder transgenic embryos generated with the reporter constructs shown to the left. Shown are all embryos scored as “reproducible expression” in Supplementary Table S2 and used for the semi-quantitative analysis of expression levels. Expression signals in the eye, brain, and pharyngeal arch are indicated in an embryo injected with six6.S-CNE-βGFP. Abbreviations: e, eye; b, brain; pa, pharyngeal arch. (B) Semi-quantitative analysis of GFP expression in the eye region. Bars represent means and standard deviations. Differences were statistically evaluated using Student's t-tests.|
|Fig. 4. Six6.S was more highly diverged from X. tropicalis Six6 than Six6.L, and contained amino acid substitutions reminiscent of pathogenic mutations identified in human SIX6 and Six3. (A) Unrooted neighbor-joining tree showing the relationships among X. tropicalis Six6 and X. laevis Six6.L and Six6.S proteins. Numbers in parentheses indicate distances calculated based on amino acid substitution rates. (B) Amino acid sequence alignment of X. tropicalis, X. laevis, H. sapiens, M. musculus, and T. rubripes Six6 proteins, and the X. tropicalis and H. sapiens Six3 proteins. Amino acids conserved in more than three sequences are shaded in gray. The Six domain and homeodomain are shown in boxes with blue and green lines, respectively. Amino acid substitutions unique to Six6.S are shaded in pink. Amino acids with substitutions or deletions encoded by pathogenic alleles in H. sapiens SIX6 and Six3 are shaded in purple and orange, respectively. Double-headed arrows at the bottom of the alignment indicate amino acids involved in the formation of three α-helices of the homeodomain.|
|Fig. 5. six6.S encodes a hypomorphic mutant protein. (A) The uninjected and injected sides of representative embryos injected with GFP, six6.L, or six6.S mRNA and subjected to lacZ staining (blue). Strong staining was detected on the injected side. In some embryos, lacZ staining on the injected side was visible from the uninjected side through the head tissues. Magenta arrows indicate the eyes on the injected sides. (B) Analysis of the eye size of the embryos injected with GFP, six6.L, or six6.S mRNA. Bars represent means of eye area size for the injected side relative to the uninjected side, with standard deviations. Numbers of analyzed embryos are shown in parentheses in the bars. Data were statistically analyzed using Student's t-tests (**p<0.01).|
|Fig. 6. Targeted disruption of six6.L resulted in a microphthalmia-like phenotype. (A) An uninjected control embryo, a representative embryo injected with Cas9 mRNA alone, and representative embryos co-injected with Cas9 mRNA and either six6.L-sgRNA or six6.S-sgRNA (upper panel). The head region of each embryo was subjected to genomic DNA extraction, and the resulting DNA was used for HMA (lower panel). A white triangle indicates homoduplex products. Bands shifted relative to the homoduplex products are heteroduplex products. (B) Analysis of the eye size of the uninjected control embryos, embryos injected with Cas9 mRNA alone, and embryos co-injected with Cas9 mRNA and either six6.L-sgRNA or six6.S-sgRNA. Dots indicate relative eye area sizes of analyzed embryos, and bars represent the means with standard deviations. Numbers of analyzed embryos are shown in parentheses in the bars. Data were statistically analyzed using one-way ANOVA followed by Tukey's multiple comparison test (****p<0.0001, **p<0.01, *p<0.05, n.s.=not significant). (C) Sequence analysis of the targeted genomic regions of six6.L (upper panel) and six6.S (lower panel). For each case, we analyzed 10 clones of the targeted region that were obtained from a single representative embryo. Target sequences of six6.L-sgRNA and six6.S-sgRNA are shaded in blue, and PAM sequences are shaded in pink.|
|Fig. 7. A possible model for the evolution of six6 homeologs in X. laevis. Blue circles and black rectangles represent ancestral enhancers and coding sequences inherited from the progenitor gene, respectively. Red circles and a red rectangle represent mutated enhancers and a mutated coding sequence, respectively. Decreases and increases in purifying selection or expression levels are indicated by downward arrows and upward arrows, respectively.|
|Supplementary material. Figure S1. Relationships between six6 homeologs identified in the J-strain and previously reported X. laevis six6 genes. (A) Comparison of Six6.L and Six6.S protein sequences with identical proteins identified by a BLAST search against GenBank. The Six6.1 and Six6.2 sequences from GenBank were partial, but others were full-length. Amino acids that were identical to those of J-strain Six6.L are shaded in gray. Amino acids unique to Six6.S proteins are shaded in pink. Six6.L (non-inbred) and Six6.S (non-inbred) sequences were deduced from cDNA clones isolated from non-inbred animals purchased from a local breeder. NP001079186 is labeled as Six3.2 in GenBank, but shows 100% identity with Six6.S sequences. (B) Comparison of six6.L (upper panel) or six6.S coding sequences (lower panel) with those encoding other proteins shown in (A). Nucleotides identical to those of the J-strain six6.L or six6.S are shaded in gray, synonymous base substitutions are shaded in pink, and target sequences of six6.L-sgRNA and six6.S-sgRNA (without PAM sequences) are shaded in blue. Base substitutions close to the 5′ and 3′ ends of six6.1 or six6.2 appear to be generated by degenerate primers used to clone these cDNAs (Ghanbari et al., 2001).|
|Supplementary material. Figure S2. Asymmetric expression reduction and coding mutations in neurog3 homeologs. (A) RNA-seq analysis of neurog3.L and neurog3.S using whole embryos at various developmental stages. (B) Amino acid sequence alignment of X. tropicalis, X. laevis, H. sapiens, and M. musculus Neurog3 proteins. Amino acids conserved in more than two sequences are shaded in gray. The basic domain and helix-loop-helix (HLH) domain are shown in boxes with blue and green lines, respectively. Amino acid deletions and substitutions unique to Neurog3.L are shaded in pink. Amino acid substitutions encoded by malabsorptive diarrhea alleles of H. sapiens NEUROG3 are shaded in purple (R93L and R107S). Double-headed arrows at the bottom of the alignment indicate amino acids involved in the formation of two α-helices and the intervening loop structure of the HLH domain.|
|Supplementary material. Table S1. RNA-seq analysis of the expression of six6 and neurog3 homeologs during development. Gene name suffixes “_T” and “_U” indicate the duplicated data sets, “Taira201203” and “Ueno201210,” respectively. Expression levels are expressed as TPMs.|
|Supplementary material. Table S2. Scoring results from the transgenic reporter assay in X. laevis embryos. For each construct, embryos with consistent GFP expression in the eye, brain, and pharyngeal arches were scored as “reproducible expression” and are shown in Fig. 3A. Embryos with non-reproducible, spotty GFP expression were collectively scored as “ectopic expression.”|
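
Table S1 reports expression levels as TPMs (transcripts per million). As a rough illustration of what that unit means, the sketch below converts raw read counts to TPM by normalizing each gene for transcript length and then scaling so that all values sum to one million. The function name `tpm`, the read counts, and the transcript lengths are hypothetical placeholders chosen for illustration. They are not values from the paper and do not represent the authors' actual RNA-seq pipeline.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to transcripts per million (TPM).

    counts:     dict mapping gene name -> number of mapped reads
    lengths_bp: dict mapping gene name -> transcript length in base pairs
    """
    # Step 1: reads per kilobase of transcript (length normalization).
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}

    # Step 2: scale so that the values across all genes sum to 1e6.
    total_rpk = sum(rpk.values())
    if total_rpk == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: value / total_rpk * 1e6 for gene, value in rpk.items()}


# Hypothetical numbers, for illustration only (not data from the study).
counts = {"six6.L": 300, "six6.S": 100, "other": 9600}
lengths_bp = {"six6.L": 1500, "six6.S": 1500, "other": 2000}

values = tpm(counts, lengths_bp)
ratio_l_to_s = values["six6.L"] / values["six6.S"]
print(values, ratio_l_to_s)
```

Because TPM is length-normalized and sums to a fixed total within each sample, the ratio between two homeologs of similar transcript length reflects their relative transcript abundance, which is the kind of comparison made between six6.L and six6.S in Fig. 1B.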