XB-ART-52717
Dev Biol June 15, 2017; 426 (2): 301-324.
Conservatism and variability of gene expression profiles among homeologous transcription factors in Xenopus laevis.
Xenopus laevis has an allotetraploid genome of 3.1Gb, in contrast to the diploid genome of a closely related species, Xenopus tropicalis. Here, we identified 412 genes (189 homeolog pairs, one homeologous gene cluster pair, and 28 singletons) encoding transcription factors (TFs) in the X. laevis genome by comparing them with their orthologs from X. tropicalis. Those genes include the homeobox gene family (Mix/Bix, Lhx, Nkx, Paired, POU, and Vent), Sox, Fox, Pax, Dmrt, Hes, GATA, T-box, and some clock genes. Most homeolog pairs for TFs are retained in two X. laevis subgenomes, named L and S, at higher than average rates (87.1% vs 60.2%). Among the 28 singletons, 82.1% were deleted from chromosomes of the S subgenome, a rate similar to the genome-wide average (82.1% vs 74.6%). Interestingly, nkx2-1, nkx2-8, and pax9, which reside consecutively in a postulated functional gene cluster, were deleted from the S chromosome, suggesting cluster-level gene regulation. Transcriptome correlation analysis demonstrated that TF homeolog pairs tend to have more conservative developmental expression profiles than most other types of genes. In some cases, however, either of the homeologs may show strongly different spatio-temporal expression patterns, suggesting neofunctionalization, subfunctionalization, or nonfunctionalization after allotetraploidization. Analysis of otx1 suggests that homeologs with much lower expression levels have undergone greater amino acid sequence diversification. Our comprehensive study implies that TF homeologs are highly conservative after allotetraploidization, possibly because the DNA sequences that they bind were also duplicated, but in some cases, they differed in expression levels or became singletons due to dosage-sensitive regulation of their target genes.
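The reported percentages can be checked with simple arithmetic. The sketch below is one reading that reproduces the 87.1% retention rate: homeolog pairs divided by pairs plus singletons, leaving the single homeologous cluster pair out of the count. The abstract does not state this formula, so it is an assumption. The count of 23 S-subgenome losses is likewise an inference (23 of 28 gives 82.1%) and is not a figure given in the text.

```python
# Counts stated in the abstract.
homeolog_pairs = 189
singletons = 28

# ASSUMPTION: retention rate = pairs / (pairs + singletons),
# excluding the one homeologous gene cluster pair.
def percent(fraction: float) -> float:
    """Convert a fraction to a percentage rounded to one decimal place."""
    return round(100 * fraction, 1)

retention_pct = percent(homeolog_pairs / (homeolog_pairs + singletons))

# ASSUMPTION: 23 singletons lost from S is inferred from the reported 82.1%;
# the abstract gives only the percentage.
s_losses_assumed = 23
s_loss_pct = percent(s_losses_assumed / singletons)

print(f"TF homeolog retention: {retention_pct}%")
print(f"Singletons lost from S: {s_loss_pct}%")
```

Under these assumptions the two values come out at 87.1 and 82.1, matching the abstract.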
PubMed ID: 27810169
Article link: Dev Biol
Genes referenced: alx1 arntl arntl2 arx bix1.1 bix2 bix3 brms1l clock cry1 cry2 crygdl.43 ctc1 dmrt1 drgx eomes foxa1 foxa2 foxd3 foxg1 foxh1 foxh1.2 foxj2 foxm1 foxn1 foxn3 foxp4 foxr1 gata1 gata2 gata3 glis1 grm6 gsc gsc2 hbox10 hes1 hes2 hes3 hes4 hes5.1 hes5.10 hes5.2 hes5.3 hes5.4 hes5.5 hes5.6 hes5.7 hes5.8 hes5.9 hes6.2 hes7.2 id4 impdh1 isl2 isx lhx2 lhx3 lhx5 lhx9 lmx1b.1 mbip mipol1 mix1 mixer nkx1-2 nkx2-1 nkx2-2 nkx2-3 nkx2-4 nkx2-6 nkx2-8 nkx3-1 nkx3-3 nkx6-1 nkx6-2 npas2 nudt9 otx1 otx2 pax1 pax2 pax3 pax4 pax5 pax6 pax7 pax8 pax9 pdcl3 per1 per2 per3 phox2a pitx1 pitx2 pou1f1 pou5f3.1 pou6f1 pou6f2 prrx1 prrx2 rax rlim rpl31 sebox snd1 sox12 sox13 sox14 sox17a sox18 sox2 sox21 sox6 ssbp3 sstr1 tbx1 tbx6 tbx6r tcf19 tdrd6 tef vamp1 vamp2 ventx1.1 ventx1.2 ventx2.1 ventx2.2 ventx3.1 ventx3.2
Article Images:
|Fig. 1. Overview of transcriptome correlation analysis. (A) Workflow of the transcriptome correlation analysis. See Section 2 for details. Homeolog pairs with inconsistent results between clutches are labelled "inc". (B-E) Examples of four groups. The group name is presented in each graph. In the case of inconsistent group names in different clutches, names are presented separately. (B) eomes.L and eomes.S (HCSE), showed quite similar expression patterns. (C) arx.L (HCDE), showed stronger expression than arx.S throughout developmental stages (st15-35). (D) alx1.L and alx1.S (NCSE) showed different expression patterns. (E) prrx1.L (NCDE) showed stronger expression than prrx1.S and their expression patterns are quite different throughout developmental stages. (F, G) Examples of inconsistent categories between clutches. (F) lhx3 in Clutch T (HCDE) had stronger expression of lhx3.L than lhx3.S. However, lhx3 was categorized as NCSE in Clutch U with very low expression levels of homeologs. (G) tbx1 in Clutch U (HCDE) showed stronger expression of tbx1.L than tbx1.S at st15-35. However, tbx1 was categorized as HCSE in Clutch T with similar expression levels of homeologs at st25-35. Line graphs show expression levels of genes during oogenesis and embryogenesis. Magenta, L genes; Blue, S genes; circles, Clutch T; triangles, Clutch U.|
|Fig. 2. Identification and syntenic analysis of Nkx family genes. (A) Using a Xenopus genome browser search, we identified 15 nkx genes from X. tropicalis and 28 from X. laevis. Phylogenetic analysis of deduced amino acid sequences indicates that Xenopus nkx genes belong to four classes. (B) nkx genes of the X. laevis genome retain their homeolog pairs, except nkx2-1.L and nkx2-8.L. Homeologs of these genes are missing, probably due to chromosomal deletion. (C) Schematic representation of chromosomal deletion found in XLA8S including nkx2-1 and nkx2-8 genes. Genes surrounding the deleted region of X. tropicalis (XTR8) and X. laevis (XLA8L and XLA8S) are shown.|
|Fig. 3. Identification and syntenic analysis of clock genes. Phylogenetic analysis of Per (A), Cry (B), Clock (C), and Bmal (D). In circadian rhythms, there are two clock, two bmal, two cry, and three per genes in X. tropicalis, as in mammals. Using clock genes of X. tropicalis, we identified 15 homeolog candidates of X. laevis. Phylogenetic analysis indicated six homeolog pairs (clock.L/clock.S, bmal1.L/bmal1.S, bmal2.L/bmal2.S, per3.L/per3.S, cry1.L/cry1.S, and cry2.L/cry2.S) and three singletons (per1.L, per2.L, and npas2.L). Synteny analyses of per1 (E), per2 (F), and npas2 (G). Synteny analysis of X. laevis and X. tropicalis indicated that per1.S, per2.S, and npas2.S were lost from XLA3S, XLA5S, and XLA2S, respectively.|
|Fig. 4. Identification and syntenic analysis of POU family genes. (A) Phylogenetic analysis of predicted amino acid sequences from 31 genes that encode POU domain proteins. Xtr, X. tropicalis, Xla, X. laevis. As in mammals, POU family genes of X. laevis could be grouped into 6 classes according to sequence similarities. (B) Location of homeologous gene pairs on X. laevis chromosomes. In almost all POU classes, both homeologs were identified (blue box). Gene loss of one homeolog was found in several homeologous gene pairs (white box), leaving singletons (red box). Orthologs of mammalian pou5f1 and pou5f2 were not found in the corresponding position (white box).|
|Fig. 5. Identification, transcriptomic analysis, and syntenic analysis of GATA family genes. (A) Phylogenetic analysis of GATA family genes. GATA family genes are separated into two groups, GATA1-3 and GATA4-6. (B) Expression of gata1.L and gata1.S in oocytes / developmental stages (above) and adult tissues (below). Although gata1 is categorized as HCSE in both data sets, 55.3/82.7-fold expression differences between homeologs are seen in adult testis of Clutches T/U (marked with an asterisk). (C) Comparison of synteny around gata1.L and gata1.S genes. These genes are located on XLA8L and 8S, respectively. XLA8S has undergone multiple, large-scale chromosome rearrangements (Session et al., 2016) and gata1.S is positioned at the rim of the inverted region. grm6.L and LOC496143.L are pseudogenes.|
|Fig. 6. Identification, transcriptomic analysis, and sequence diversification of Pax family genes. (A) Phylogenetic tree analysis of X. tropicalis and X. laevis Pax family proteins. (B) RNA-seq analysis of differential expression of pax8 homeologs during development and in adult tissues. Transcriptomic data are shown in graphs similar to those in Fig. 1. (C) Comparison of amino acid sequences of mammalian and Xenopus Pax5 proteins. Magenta shading indicates substitution of a basic amino acid (R) with a non-polar amino acid (G) in the paired-box domain of Pax5.S. GenBank accession numbers of human and mouse Pax5 proteins used for this alignment are NP_057953 and NP_032808, respectively.|
|Fig. 7. Identification and transcriptomic analysis of Lhx family genes. (A) Phylogenetic tree of Lhx family genes of Xenopus. Xtr, X. tropicalis, Xla, X. laevis. Bootstrap support values for nodes are indicated (n=100). Subfamilies were supported with bootstrap values of 100 and orthologs with values of 93−100, except for lhx2 (83) and lhx9 (65). Transcriptomic data of lhx5 (B), isl2 (C), and lmx1b.1 (D) are shown in graphs similar to those in Fig. 1. See text for detailed explanations of variable expression profiles.|
|Fig. 8. Diversification of otx1 homeologs in sequence and expression profiles. (A) Phylogenetic tree of the Otx subfamily with distances. A scale bar of branch length indicates substitutions per site. The branch length of X. laevis otx1.S is much longer than that of X. laevis otx1.L and X. tropicalis otx1, suggesting that otx1.S evolved fast and diverged from other otx1 genes. (B) Transcriptomic data of otx1 in developmental stages. (C) RT-qPCR using outbred animals. Bars represent means±s.d. of expression levels of otx1.L relative to otx1.S at indicated stages. **, P<0.01, ***, P<0.001 (t-test, two-tailed). (D) Transcriptomic data of otx1 in adult tissues. Two panels represent the same data at different scales. Data are shown in graphs similar to those in Fig. 1. See text for detailed explanations of variable expression profiles.|
|Fig. 9. Genomic structure and expression profiles of tbx6 and tbx6r. (A) Schematic representation of genomic organization around the tbx6 and tbx6r genes. Syntenic genes of tbx6 and tbx6r are fragmented in scaffolds in the X. tropicalis genome. Unfortunately, tbx6 was not found in v9 of the X. tropicalis genome, although it was found in v4.1. Transposon sequences in scaffolds suggested that scaffold_22 is from the L chromosome and scaffold_29 is from the S chromosome. The sequence of tbx6.S is partial, possibly because 5′ exons were covered with gaps (Ns) in the genome. tbx6rp.S means that tbx6r.S became a pseudogene. In addition to tbx6r.S, nudt9.S seems to have been deleted from the X. laevis genome. (B) Sequences of exon4 for tbx6r.L and tbx6r.S with translated amino acid sequences. A mutation to insert a stop codon was found in tbx6r.S, as indicated with bold letters and an underline, resulting in a pseudogene. (C) Expression profiles of tbx6 and tbx6r were similar to those previously reported (Callery et al., 2010), in which tbx6 is analyzed as one gene without distinguishing tbx6.L and tbx6.S and relative expression levels were analyzed with quantitative RT-PCR. Data are shown in graphs similar to those in Fig. 1 with additional data of tbx6r shown in green. See text for detailed explanations of the variable expression profiles. (D) RT-qPCR using outbred animals. Bars represent means±s.d. of expression levels of tbx6.L relative to tbx6.S at indicated stages. **, P<0.01, ***, P<0.001 (t-test, two-tailed).|
|Fig. 10. Genomic structure of Mix/Bix gene clusters and multi-to-multi orthologous relationships between bix genes in Xenopus. (A) Synteny of the mix/mixer/bix gene clusters. Human (Homo sapiens; HSA1), chicken (Gallus gallus; GGA3), X. tropicalis (XTR5), X. laevis (XLA5L and XLA5S), Tibetan frog (Nanorana parkeri; NPA scaffold31), pufferfish (Takifugu rubripes; FUGU scaffold72), and zebrafish (Danio rerio; DRE20) chromosomes are indicated. Black gene symbols represent pseudogenes. (B) Phylogenetic analysis of mix/bix family genes. Maximum likelihood analysis was performed with 1,000 bootstrap pseudoreplicates. The parameter model was estimated with MEGA6 (Tamura et al., 2013), using the JTT model. Nodes of + and ++ are prospective first and second tandem duplications of ancestral mix genes, respectively. (C) A hypothesis for the generation of the mix/bix gene cluster. A single mix/bix ancestral gene in the vertebrate ancestor was duplicated into proto-mix1/mixer and proto-bix genes in the anuran ancestor (indicated as +; see also B), followed by duplication of the proto-mix1/mixer to mix1 and mixer genes in the Xenopus ancestor (++). Further, the gene expansion and subsequent gene conversion of the bix gene took place in each species and/or subgenome.|
|Fig. 11. Genomic structure and expression profiles of Ventx gene clusters. (A) Structures of ventx gene clusters. Humans only have one ventx gene in HSA10. However, in Xenopus, ventx genes are tandemly duplicated to form a cluster. X. tropicalis has 6 genes, whereas X. laevis has 11 genes, in which ventx3.1.S is a singleton due to pseudogenization of ventx3.1.L. Nanorana has single copies of ventx1 and ventx2 and two copies of ventx3 in scaffold_336. Zebrafish have vent and vox on DRE13 and ved on DRE10. HSA, Homo sapiens; NPA, Nanorana parkeri; DRE, Danio rerio. (B-G) Transcriptomic analyses for ventx cluster genes during oogenesis and embryogenesis. Data are shown in graphs similar to those in Fig. 1. See text for detailed explanations of variable expression profiles.|
|Fig. 12. Syntenic analysis of hes genes between X. laevis and X. tropicalis. (A) Synteny around hes1 and hes6 loci. (B) Synteny around hes2, hes4, hes5, and the hes5.3 cluster loci. (C) Synteny around the hes7 locus. Species and chromosome numbers are as indicated: XTR, X. tropicalis; XLA, X. laevis. Sc indicates scaffold. Pentagon arrows show genes with the 5′ to 3′ direction. Curved blue arrows indicate flipping of genome sequences to align gene orders. A broken-lined pentagon arrow indicates a pseudogene.|
|Fig. 13. Comparison of pharyngeal gene clusters in vertebrate genomes. (A) Comparison of pharyngeal cluster 1, which includes the transcription factors nkx2-1, nkx2-8, pax9, and foxa1. In the X. laevis genome, genes from mbip.S to mipol1.S are deleted, but foxa1.S is retained in XLA8S. In actinopterygian genomes, genes from mbip to slc25a21 are retained as a cluster, but genes from mipol1 to sstr1 are located in a distant genomic region. Teleost genomes only possess one cluster, possibly due to deletion of the duplicated cluster after whole genome duplication. (B) Comparison of pharyngeal cluster 2, which consists of the transcription factors nkx2-4, nkx2-2, pax1, and foxa2. In the X. laevis genome, duplicated clusters are completely conserved in both XLA5L and 5S. In the coelacanth genome, nkx2-2, pax1, and foxa2 are separated into scaffolds, JH130518, JH128098, and JH128726, respectively, one gene per scaffold. In teleost genomes, duplicated clusters are partially conserved after the teleost-specific, whole-genome duplication. Syntenic gene homologs with foxa genes are indicated with bold letters. HSA, Homo sapiens; XTR, Xenopus tropicalis; XLA, Xenopus laevis; LCH, Latimeria chalumnae (coelacanth); LOC, Lepisosteus oculatus (spotted gar); DRE, Danio rerio (zebrafish); TRU, Takifugu rubripes (fugu); CMI, Callorhinchus milii (elephant shark).|