XB-ART-70
BMC Evol Biol August 7, 2006; 6: 60.
Phylogenetic analysis of the tenascin gene family: evidence of origin early in the chordate lineage.
Tenascins are a family of glycoproteins found primarily in the extracellular matrix of embryos where they help to regulate cell proliferation, adhesion and migration. In order to learn more about their origins and relationships to each other, as well as to clarify the nomenclature used to describe them, the tenascin genes of the urochordate Ciona intestinalis, the pufferfish Tetraodon nigroviridis and Takifugu rubripes and the frog Xenopus tropicalis were identified and their gene organization and predicted protein products compared with the previously characterized tenascins of amniotes. A single tenascin gene was identified in the genome of C. intestinalis that encodes a polypeptide with domain features common to all vertebrate tenascins. Both pufferfish genomes encode five tenascin genes: two tenascin-C paralogs, a tenascin-R with domain organization identical to mammalian and avian tenascin-R, a small tenascin-X with previously undescribed GK repeats, and a tenascin-W. Four tenascin genes corresponding to tenascin-C, tenascin-R, tenascin-X and tenascin-W were also identified in the X. tropicalis genome. Multiple sequence alignment reveals that differences in the size of tenascin-W from various vertebrate classes can be explained by duplications of specific fibronectin type III domains. The duplicated domains are encoded on single exons and contain putative integrin-binding motifs. A phylogenetic tree based on the predicted amino acid sequences of the fibrinogen-related domains demonstrates that tenascin-C and tenascin-R are the most closely related vertebrate tenascins, with the most conserved repeat and domain organization. Taking all lines of evidence together, the data show that the tenascins referred to as tenascin-Y and tenascin-N are actually members of the tenascin-X and tenascin-W gene families, respectively. The presence of a tenascin gene in urochordates but not other invertebrate phyla suggests that tenascins may be specific to chordates. 
Later genomic duplication events led to the appearance of four family members in vertebrates: tenascin-C, tenascin-R, tenascin-W and tenascin-X.
PubMed ID: 16893461
PMC ID: PMC1578592
Article link: BMC Evol Biol
Grant support: C06 RR-12088-01 NCRR NIH HHS , C06 RR012088 NCRR NIH HHS
Genes referenced: egf fga fn1 rxrb tnc tnn tnr tnxb
Article Images:
|Figure 1. The tenascins. Six tenascins have been described in the literature: tenascins-C, -R, -X, -W, -Y and -N. This figure shows the repeat and domain organization of a tenascin that is representative of the group belonging to the genus where it was first described. The shapes found in the diagrams at the right symbolize the N-terminal linker domain (home plate), heptad repeats (zig-zag), EGF-like repeats (diamonds and partial diamonds), FN type III domains (rectangles), and a C-terminal FReD (circle). The serine/proline-rich domains of tenascin-X and tenascin-Y are indicated by an oval.|
|Figure 2. Ciona intestinalis tenascin. A. The amino acid sequence of a tenascin from C. intestinalis. The N-terminal linker region is at the top, with a signal peptide shown in bold and putative heptad repeats underlined. Between amino acids 208 and 458 are 8 EGF-like repeats. These are followed by 18 FN type III domains between amino acids 459 and 2120. The tryptophan (w), leucine (l) and tyrosine (y) residues that are characteristic of these domains are highlighted and aligned, and a putative integrin-binding motif (rge) found in the third FN type III domain is shown in bold. The C-terminal FReD is composed of amino acids 2128 through 2355. B. The repeat and domain organization of the C. intestinalis tenascin shown in A. A key to the shapes symbolizing each domain can be found in the legend to Figure 1. C. A rabbit antiserum against a recombinant fragment of C. intestinalis tenascin was used to immunostain whole larvae. The antiserum recognized the tunic, a line of matrix in the tail (arrows), and faintly labelled the tail muscles (between the arrows). D. The rabbit preimmune serum inconsistently labelled the tunic but not the line of matrix in the tail or the tail muscles.|
|Figure 3. There are two tenascin-Cs in pufferfish. A. Analysis of the genomic sequences of Tetraodon nigroviridis (T.n.) and Takifugu rubripes (T.r.) reveals that each species of pufferfish has two tenascin-C genes. The repeat and domain organization of the paralogous genes are illustrated here. All four have a putative integrin-binding motif (kgd, rgd or kge) in the third FN type III domain. B. The C-terminal FReDs (underlined in red) of the two tenascin-Cs from Tetraodon nigroviridis are highly conserved. Identical residues are boxed in blue and similar residues are boxed in yellow.|
|Figure 4. Tetraodon nigroviridis tenascin-X. A. The predicted amino acid sequence of T. nigroviridis tenascin-X. Putative heptad repeats (underlined) are found near the N-terminus. Between amino acids 41 and 86 are one complete and two partial EGF-like repeats. This is followed by a short (48 amino acid) linking region and 7 complete and two partial GK repeats (between amino acids 135 and 270). Between the GK repeats and amino acid 571 is a region rich in charged amino acids. Three complete and one partial FN type III domains are found between amino acids 572 and 884. There is a FReD at the C-terminus. B. The repeat and domain organization of pufferfish tenascin-X. The region between amino acids 87 and 571 is similar to the DUF612 domain of UNC-89. C. The T. nigroviridis tenascin-X gene (TNX) is found between the genes encoding cytochrome p450 21-hydroxylase (cp450), C4 complement and retinoid X receptor beta (RXRB). The same genes overlap or flank tenascin-X genes in birds and mammals.|
|Figure 5. Tenascin-W diversity is generated by duplications of a FN type III domain. A. The tenascin-W of Tetraodon nigroviridis is predicted to be encoded on 14 exons. The figure shows a schematic of the predicted protein's repeat and domain organization and the corresponding exons. The N-terminal linker is encoded on the first exon. The second exon encodes the heptad repeats and the EGF-like repeats. This is conserved in all of the tenascin-Ws illustrated here. FN type III domains 1, 2 and 4 are encoded on two exons, but the third FN type III domain is encoded on a single exon (shaded). The FReDs of all of the tenascin-Ws are encoded on five exons. B. The full-length predicted tenascin-W of Takifugu rubripes has five FN type III domains. The additional domain is the result of a duplication of the third FN type III domain, which is encoded on a single exon. C, D. The predicted tenascin-Ws of Danio rerio (C) and Gallus gallus (D) have 6 FN type III domains, the apparent consequence of an additional duplication of the third FN type III domain. E, F. In mouse (Mus) and man (Homo) the very large tenascin-W predicted proteins can also be explained by multiple duplications of the third FN type III domain. Note that the first FN type III domains of the pufferfish, but not the other tenascin-Ws, are encoded on two exons (black). The relative sizes of the exons and introns between the different genera are not shown to scale.|
|Figure 6. Alignment of the duplicated FN type III domains of tenascin-W. The third FN type III domain of Tetraodon nigroviridis tenascin-W has been duplicated one or more times in other tenascin-Ws. Alignment reveals the conservation of sequences within these domains, including putative integrin-binding motifs near the N-terminus of the domain (underlined). The region where an integrin-binding RGD sequence in an exposed loop is found in chicken tenascin-C is indicated by asterisks. Several tenascin-Ws have a potentially active KGD motif in this region. At the left is a rooted phylogenetic tree generated by SATCHMO. This analysis indicates that many of the domain duplications took place after the divergence of primate and rodent lineages. Identical amino acids are shaded blue, while similar amino acids are boxed in yellow.|
|Figure 7. Xenopus tropicalis tenascins. Four tenascins were identified in the X. tropicalis genome. Stick diagrams of the three complete sequences are shown here. The X. tropicalis tenascin-C gene encodes 14.5 EGF-like repeats and 8 FN type III domains. There are no RGD motifs. The domain represented by an oval between the second and third FN type III domains shares sequence similarities with the DUF612 domain of pufferfish tenascin-X and the SP-domain of avian and mammalian tenascin-X. The amphibian tenascin-W contains an RGD motif in an exposed loop in the fourth FN type III domain and a KGD motif in the fifth FN type III domain; these domains appear to have undergone a recent duplication.|
|Figure 8. Tenascins. Stick diagrams illustrating the hypothetical repeat and domain organizations of tenascins based on genomic sequences. A key to the shapes used can be found in the legend to Figure 1. h, Homo; m, Mus; g, Gallus; t, Tetraodon; x, Xenopus.|
|Figure 9. A tenascin phylogenetic tree. The amino acid sequences of the FReDs from urochordate, fish, amphibian and mammalian tenascins were used to construct an unrooted molecular phylogenetic tree. The numbers at the internal nodes are the probability that each branch point is correct. The scale represents 0.100 expected changes. The tree reveals that there are four members of the tenascin family in vertebrates: tenascin-X, tenascin-W, tenascin-C and tenascin-R.|
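
The residue ranges in the Figure 2 legend allow a quick consistency check on the C. intestinalis domain counts. The sketch below is illustrative only and is not part of the original analysis: the names `REGIONS`, `span_length` and `summarize` are hypothetical, and the per-unit means assume the stated ranges contain nothing but the listed repeats, so any linker residues would inflate them slightly.

```python
# Inclusive residue ranges and unit counts taken from the Figure 2 legend
# (C. intestinalis tenascin). The structure and names are illustrative.
REGIONS = {
    "EGF-like repeats": (208, 458, 8),
    "FN type III domains": (459, 2120, 18),
    "FReD": (2128, 2355, 1),
}


def span_length(start, end):
    """Number of residues in an inclusive range."""
    return end - start + 1


def summarize(regions=REGIONS):
    """Map each region to (total residues, mean residues per unit)."""
    summary = {}
    for name, (start, end, count) in regions.items():
        total = span_length(start, end)
        summary[name] = (total, round(total / count, 1))
    return summary


if __name__ == "__main__":
    for name, (total, mean) in summarize().items():
        print(f"{name}: {total} residues, about {mean} per unit")
```

Under these assumptions the ranges work out to roughly 31 residues per EGF-like repeat and 92 per FN type III domain, with a FReD of 228 residues.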