XB-ART-374
PLoS Genet. April 1, 2006; 2 (4): e56.
Multiple mechanisms promote the retained expression of gene duplicates in the tetraploid frog Xenopus laevis.
Gene duplication provides a window of opportunity for biological variants to persist under the protection of a co-expressed copy with similar or redundant function. Duplication catalyzes innovation (neofunctionalization), subfunction degeneration (subfunctionalization), and genetic buffering (redundancy), and the genetic survival of each paralog is triggered by mechanisms that add, compromise, or do not alter protein function. We tested the applicability of three types of mechanisms for promoting the retained expression of duplicated genes in 290 expressed paralogs of the tetraploid clawed frog, Xenopus laevis. Tests were based on explicit expectations concerning the ka/ks ratio, and the number and location of nonsynonymous substitutions after duplication. Functional constraints on the majority of paralogs are not significantly different from a singleton ortholog. However, we recover strong support that some of them have an asymmetric rate of nonsynonymous substitution: 6% match predictions of the neofunctionalization hypothesis in that (1) each paralog accumulated nonsynonymous substitutions at a significantly different rate and (2) the one that evolves faster has a higher ka/ks ratio than the other paralog and than a singleton ortholog. Fewer paralogs (3%) exhibit a complementary pattern of substitution at the protein level that is predicted by enhancement or degradation of different functional domains, and the remaining 13% have a higher average ka/ks ratio in both paralogs that is consistent with altered functional constraints, diversifying selection, or activity-reducing mutations after duplication. We estimate that these paralogs have been retained since they originated by genome duplication between 21 and 41 million years ago. Multiple mechanisms operate to promote the retained expression of duplicates in the same genome, in genes in the same functional class, over the same period of time following duplication, and sometimes in the same pair of paralogs. 
None of these paralogs are superfluous; degradation or enhancement of different protein subfunctions and neofunctionalization are plausible hypotheses for the retained expression of some of them. Evolution of most X. laevis paralogs, however, is consistent with retained expression via mechanisms that do not radically alter functional constraints, such as selection to preserve post-duplication stoichiometry or temporal, quantitative, or spatial subfunctionalization.
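The comparisons described in the Figure 3 caption (Model A versus Model B, and Model B versus Model C) each pit a simpler model against a more parameterized one. A minimal sketch of how such a comparison could be scored is a likelihood ratio test. It assumes the models are nested, that each comparison adds exactly one free parameter, and that a chi-square approximation with one degree of freedom applies; none of these details are stated in this record. The log-likelihood values and the 0.05 threshold are hypothetical.

```python
import math


def lrt_p_value(lnl_simple: float, lnl_complex: float) -> float:
    """P-value of a likelihood ratio test with one extra free parameter.

    Assumes a chi-square approximation with 1 degree of freedom, whose
    survival function is erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (lnl_complex - lnl_simple)
    if stat <= 0.0:
        # The richer model did not improve the fit at all.
        return 1.0
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log-likelihoods for one gene (not values from the paper).
lnl_model_a = -1234.50  # single ka/ks ratio R0 on all branches
lnl_model_b = -1231.20  # separate ratio R1 after duplication

p = lrt_p_value(lnl_model_a, lnl_model_b)
# Hypothetical significance threshold of 0.05.
supports_model_b = p < 0.05
```

Under these assumptions, a "+" in the Figure 3 table would correspond to `supports_model_b` being true for the relevant comparison.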
PubMed ID: 16683033
PMC ID: PMC1449897
Article link: PLoS Genet.
Genes referenced: dsp fgf4 nr5a2 tbx2 trim9
Article Images:
|Figure 1. A Non-Exhaustive Diagram Relating Various Models for the Fate of Duplicate Genes. Citations that either propose mechanisms or discuss them: Clark 1994; Ferris and Whitt 1979; Force et al. 1999; Gibson and Spring 1998; Goodman et al. 1987; Gu et al. 2003; Hughes 1994; Jensen 1976; Kondrashov et al. 2002; Li 1980; Li et al. 1982; Lynch and Conery 2000; Lynch and Conery 2003; Lynch and Force 2000; Ohno 1973; Ohta 1987; Piatigorsky and Wistow 1991; Rodin and Riggs 2003; Sidow 1996; Stoltzfus 1999; Takahata and Maruyama 1979; Wagner 1999; Wagner 2000; and Zhang et al. 1998.|
|Figure 2. Putative Allopolyploid Evolution of the Tetraploid X. laevis. Daggers indicate extinct diploid ancestors or genes. Nodes 1 and 2 correspond with the divergence and union, respectively, of two diploid genomes, and Node 3 marks the diversification of Xenopus tetraploids. (A) A reticulate phylogeny with ploidy in parentheses. (B) Nuclear genealogy assuming no recombination and no gene conversion between alleles at different paralogous loci (α and β). The dashed portion of the paralogous lineages evolved independently in different diploid ancestors.|
|Figure 3. Assignment of Putative Retention Mechanisms Based on Molecular Changes in the Coding Region. We assigned a retention mechanism to paralogs based on the results of three analyses. The first one compared a model with no change in the ka/ks ratio after duplication (Model A, in which the ka/ks ratio on all branches is indicated by R0) to a model with a higher ka/ks ratio after duplication (Model B, with ka/ks ratio R1 > R0). The second one compared a model with no difference in the nonsynonymous substitution rate (Model B, in which R0 and R1 are nonsynonymous rates on each branch) to a model with different rates of nonsynonymous substitution in each paralog (Model C, in which R0, R1, and R2 are nonsynonymous rates on each branch), with the stipulation that the paralog with the higher nonsynonymous rate also have a higher ka/ks ratio than the slower paralog and a higher ka/ks ratio than the diploid lineage. The third analysis tested for complementarity of amino acid substitution in each paralog. In the table in the figure, a minus sign (−) indicates either no significant difference between the models or no significant complementarity of nonsynonymous substitutions. A plus sign (+) indicates a significant improvement in likelihood of the more parameterized model or significant complementarity of nonsynonymous substitution. An asterisk (*) denotes the caveat that an increased substitution ratio could stem from relaxed purifying selection and therefore be a consequence of rather than a cause for retention.|
|Figure 4. Nonsynonymous Substitutions in Each X. laevis Paralog (α and β) and the Diploid Lineage in Representative Genes. Substitutions in the diploid lineage (d) occurred on the thick branches in the rooted topologies to the left of each locus. (A) liver-type arginase, (B) fibroblast growth factor receptor (FGFR), (C) embryonic fibroblast growth factor (EFGF), and (D) FTZ-F1–related orphan receptor. In (A) a gap indicates a single amino acid deletion, an arrow above the paralog indicates a single amino acid insertion, and this paralog is shortened due to an early stop codon. In (B) three red boxes and a blue box indicate three immunoglobulin domains and a tyrosine kinase domain. In (C) arrows below the paralog indicate predicted cleavage sites in each paralog. In (D) yellow, green, and two lighter blue boxes indicate the DNA-binding C-domain, FTZ-F1 box, and DNA binding domain regions II and III.|
|Figure 5. Probability Distribution of the Difference in the Number of Substitutions in Concatenated Paralogs (“Superparalogs”). Analysis was performed on concatenated data from (A) nonsynonymous substitutions of paralogs identified by the likelihood analysis as having asymmetric rates of evolution, (B) synonymous substitutions of these paralogs, (C) nonsynonymous substitutions of the other paralogs that were not identified as having asymmetric rates, and (D) synonymous substitutions of these paralogs. Black circles are the expected Skellam distributions, gray dots are dSP distributions from ten example simulations (out of 1,000 total), and white circles are the observed distribution of superparalog differences.|
|Figure 6. The Observed Relationship between ka/ks and ks. The observed relationship between ka/ks and ks corresponds with simulations that predict a negative relationship under neutral or near-neutral evolution of synonymous substitutions because of stochastic sampling of synonymous substitutions in slowly evolving or young genes. The plot shows the average ka/ks ratio on each branch of 290 genealogies versus average ks of bins of 50 lineages ranked by ks of each one. The last bin has only 20 lineages. Bars indicate the standard deviation of each bin.|
|Figure 7. The ka/ks Ratio of Genes with No Significant Difference before and after Tetraploidization. The ka/ks ratio is often slightly higher in the paralogs (above the dashed line), even though this average is not significantly higher than the diploid lineage. Only ratios from genes with no significant difference are shown (226 out of 292 genes). A dashed line indicates an equal ka/ks ratio before and after duplication.|
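The Figure 5 caption compares observed superparalog differences with expected Skellam distributions. A Skellam distribution describes the difference between two independent Poisson counts, so a minimal sketch of the expected curve can be built by summing products of Poisson probabilities. The function names, the truncation at 200 terms, and the expected count of 30 substitutions per paralog are hypothetical choices for illustration, not values from the paper.

```python
import math


def poisson_pmf(n: int, mu: float) -> float:
    """Probability of observing n events under a Poisson mean mu > 0."""
    if n < 0:
        return 0.0
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))


def skellam_pmf(k: int, mu1: float, mu2: float, terms: int = 200) -> float:
    """Probability that (count1 - count2) equals k.

    count1 and count2 are independent Poisson counts with means mu1 and
    mu2. The infinite sum is truncated at `terms`, which is adequate only
    when both means are well below that value.
    """
    start = max(0, -k)  # keep the first count non-negative
    return sum(
        poisson_pmf(j + k, mu1) * poisson_pmf(j, mu2)
        for j in range(start, terms)
    )


# Hypothetical expectation: 30 substitutions in each concatenated paralog.
expected = {k: skellam_pmf(k, 30.0, 30.0) for k in range(-40, 41)}
total = sum(expected.values())
```

With equal means the expected distribution is symmetric around zero, which is the pattern predicted when the two paralogs accumulate substitutions at the same rate; an observed distribution shifted away from it would point to asymmetric rates.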