January 1, 2012;
Molecular phylogeny of OVOL genes illustrates a conserved C2H2 zinc finger domain coupled by hypervariable unstructured regions.
OVO-like proteins (OVOL) are members of the zinc finger protein family and serve as transcription factors to regulate gene expression in various differentiation processes. Recent studies have shown that OVOL genes are involved in epithelial development and differentiation in a wide variety of organisms; yet there is a lack of comprehensive studies that describe OVOL proteins from an evolutionary perspective. Using comparative genomic analysis, we traced three different OVOL genes (OVOL1-3) in vertebrates. One gene, OVOL3, was duplicated during a whole-genome-duplication event in fish, but only the copy (OVOL3b) was retained. From early-branching metazoa to humans, we found that a core domain, comprising a tetrad of C2H2 zinc fingers, is conserved. By domain comparison of the OVOL proteins, we found that they evolved in different metazoan lineages by attaching intrinsically disordered (ID) segments of N/C-terminal extensions of 100 to 1000 amino acids to this conserved core. These ID regions originated independently across different animal lineages, giving rise to different types of OVOL genes over the course of metazoan evolution. We illustrated the molecular evolution of metazoan OVOL genes over a period of 700 million years (MY). This study both extends our current understanding of the structure/function relationship of metazoan OVOL genes, and assembles a good platform for further characterization of OVOL genes from diverged organisms.
Figure 1. OVOL proteins are characterized by the presence of hypervariable ID regions. A. Mouse OVOL1 has ID residues in the first 100 amino acids. B. Mouse OVOL2 possesses ID residues in the first 50 amino acids, with a glycine-rich and serine-rich region marked in red. C. Mouse OVOL3 has ID segments within the N-terminal 100 residues. D. Drosophila OVO is intrinsically disordered, with large patches of compositionally biased residues indicated in red. We used DISOPRED2 software for the prediction of ID regions. The horizontal line indicates the ordered/disordered threshold for the default false positive rate of 5%. The 'filter' curve represents the outputs from DISOPRED2 and the 'output' curve represents the outputs from a linear support vector machine (SVM) classifier (DISOPREDsvm). The outputs from DISOPREDsvm are included to indicate shorter, low-confidence predictions of disorder.
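The thresholding described in the Figure 1 caption can be illustrated with a minimal sketch: given per-residue disorder scores, report the runs of residues at or above a cutoff. The function name, the toy scores and the cutoff of 0.5 are all assumptions made for illustration; they are not DISOPRED2 output, and the cutoff that corresponds to a 5% false positive rate is set by the predictor itself.

```python
from typing import List, Tuple


def disordered_segments(
    scores: List[float], threshold: float, min_len: int = 1
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs where score >= threshold.

    Hypothetical helper: `scores` holds one disorder score per residue,
    and runs shorter than `min_len` residues are discarded.
    """
    segments = []
    start = None  # 1-based start of the run currently being extended
    for position, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = position
        elif start is not None:
            # The run ended at the previous residue.
            if position - start >= min_len:
                segments.append((start, position - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(scores) - start + 1 >= min_len:
        segments.append((start, len(scores)))
    return segments


# Toy scores for an eight-residue peptide (illustrative values only).
toy_scores = [0.9, 0.8, 0.1, 0.2, 0.7, 0.6, 0.9, 0.1]
print(disordered_segments(toy_scores, threshold=0.5))
print(disordered_segments(toy_scores, threshold=0.5, min_len=3))
```

Raising `min_len` drops short runs, which loosely mirrors the distinction the caption draws between longer predictions and shorter, low-confidence ones.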
Figure 2. The disordered regions of OVOL have evolved more rapidly than structured regions. A) Structured regions only, B) Disordered segments only, C) Full-length OVOL.
Figure 3. Chromosomal localization of OVOL1 gene from selected vertebrates, flanked by a set of conserved marker genes. SIPA1: signal-induced proliferation-associated 1; RELA: v-rel reticuloendotheliosis viral oncogene homolog A (avian); KAT5: K (lysine) acetyltransferase; SNX32: sorting nexin 32; MUS81: MUS81 endonuclease homolog (S. cerevisiae); BANF1: barrier to autointegration factor 1; EXOC6B: exocyst complex component 6B; DYSF: dysferlin, limb girdle muscular dystrophy 2B; COL4A5: collagen, type IV, alpha 5; DAK: dihydroxyacetone kinase 2 S. cerevisiae homolog.
Figure 4. OVOL2 orthologs identified in vertebrates by comparing chromosomal localization. RRBP1: ribosome binding protein 1 homolog; BANF2: barrier to autointegration factor 2; SNX5: sorting nexin 5; CSRP2BP: CSRP2 binding protein; SEC23B: protein transport protein Sec23B; POLR3F: polymerase (RNA) III (DNA directed) polypeptide F; RBBP9: Retinoblastoma-binding protein 9; DTD1: D-tyrosyl-tRNA deacylase 1.
Figure 5. Synteny analysis of OVOL3 genes illustrates the loss of OVOL3a after the duplication event and the maintenance of paralogous OVOL3b in fishes. LIN37: lin-37 homolog (C. elegans); PRODH2: proline dehydrogenase (oxidase) 2; KIRREL2: kin of IRRE like 2 (Drosophila); APLP1: amyloid beta (A4) precursor-like protein 1; NFKBID: nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, delta; LRFN3: leucine rich repeat and fibronectin type III domain containing 3; SDHAF1: succinate dehydrogenase complex assembly factor 1; CLIP3: CAP-GLY domain containing linker protein 3; POLR2I: polymerase (RNA) II (DNA directed) polypeptide I, 14.5 kDa; CAPNS1: calpain, small subunit 1; COX7A1: cytochrome c oxidase subunit VIIa polypeptide 1 (muscle); DMPK: dystrophia myotonica-protein kinase; HLCS: holocarboxylase synthetase; AMOT: angiomotin; REXO2: REX2 RNA exonuclease 2 homolog (S. cerevisiae).
Figure 6. Sequence logo of four different Cys2-His2 (C2H2) zinc finger motifs (I-IV) present in different OVOL proteins from metazoan genomes. We generated this sequence logo using WebLogo 3.0. C2H2 zinc finger motif IV has 25 amino acids due to the presence of one extra amino acid at the eleventh position in the OVOLNVE1 protein from sea anemone.
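As background for reading a sequence logo such as Figure 6, the height of each column reflects its information content: the maximum possible entropy for the alphabet minus the observed entropy at that position. The sketch below computes this for a toy gap-free protein alignment. The function names and the toy sequences are assumptions for illustration, not OVOL data, and the sketch omits the small-sample correction that logo tools typically apply.

```python
import math
from collections import Counter
from typing import List


def column_information(column: str, alphabet_size: int = 20) -> float:
    """Information content (bits) of one alignment column.

    Computed as log2(alphabet_size) minus the Shannon entropy of the
    observed residue frequencies. Assumes the column has no gaps.
    """
    counts = Counter(column)
    total = len(column)
    entropy = -sum(
        (n / total) * math.log2(n / total) for n in counts.values()
    )
    return math.log2(alphabet_size) - entropy


def logo_heights(sequences: List[str]) -> List[float]:
    """Per-column information content for equal-length aligned sequences."""
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("sequences must all have the same length")
    # zip(*sequences) yields the residues of each column in turn.
    return [column_information("".join(col)) for col in zip(*sequences)]


# Toy alignment (hypothetical sequences, not taken from the paper).
toy_alignment = ["CAH", "CGH", "CAH", "CGH"]
for index, bits in enumerate(logo_heights(toy_alignment), start=1):
    print(f"column {index}: {bits:.3f} bits")
```

A fully conserved column, such as the zinc-coordinating cysteines and histidines of a C2H2 motif, reaches the maximum of log2(20) bits under these assumptions, whereas a column split evenly between two residues loses exactly one bit.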
Figure 7. Phylogenetic history of OVOL proteins using the Bayesian method. A. Full-length OVOL proteins. B. Selected region of OVOL proteins. Posterior probability scores are depicted by colored circles. The placozoan OVOL protein (e_gw1.4.509.1) was used as the outgroup in this phylogenetic tree. A red x marks a sequence whose position did not accord with the species phylogeny. BFL: B. floridae (lancelet), SPU: S. purpuratus (sea urchin), NVE: N. vectensis (sea anemone), HRO: H. robusta (annelid), LGI: L. gigantea (mollusc) and TAD: T. adhaerens (placozoan). The trees in Figures 7A and 7B were generated using MrBayes 3.2 from the alignments supplied in supplementary Files S1 and S2, respectively.
Figure 8. Protein domain evolution of OVOL proteins from different metazoan lineages over a period of >700 MY. A highly conserved domain of a tetrad of C2H2 zinc finger motifs (red and yellow box) is found in various metazoa. Different types of OVOL protein arise primarily from N-terminal extensions to the C2H2 core, with the exception of the OVOL proteins from the leech and sea urchin, in which the extension lies at the C-terminal end of the C2H2 zinc finger motifs. The ID segments with no homology in evolutionarily distant organisms are marked in different colors. Times of divergence are taken from Kumar and Hedge (2003) and Ponting (2008).