PLoS One
2012 Jan 01;7(6):e39399. doi: 10.1371/journal.pone.0039399.
Molecular phylogeny of OVOL genes illustrates a conserved C2H2 zinc finger domain coupled by hypervariable unstructured regions.
Kumar A, Bhandari A, Sinha R, Sardar P, Sushma M, Goyal P, Goswami C, Grapputo A.
Abstract
OVO-like proteins (OVOL) are members of the zinc finger protein family and serve as transcription factors that regulate gene expression in various differentiation processes. Recent studies have shown that OVOL genes are involved in epithelial development and differentiation in a wide variety of organisms, yet comprehensive studies that describe OVOL proteins from an evolutionary perspective are lacking. Using comparative genomic analysis, we traced three different OVOL genes (OVOL1-3) in vertebrates. One gene, OVOL3, was duplicated during a whole-genome-duplication event in fish, but only one copy (OVOL3b) was retained. From early-branching metazoa to humans, we found that a core domain, comprising a tetrad of C2H2 zinc fingers, is conserved. By domain comparison of the OVOL proteins, we found that they evolved in different metazoan lineages by attaching intrinsically disordered (ID) segments, in the form of N/C-terminal extensions of 100 to 1000 amino acids, to this conserved core. These ID regions originated independently across different animal lineages, giving rise to different types of OVOL genes over the course of metazoan evolution. We illustrated the molecular evolution of metazoan OVOL genes over a period of 700 million years (MY). This study both extends our current understanding of the structure/function relationship of metazoan OVOL genes and provides a platform for further characterization of OVOL genes from diverged organisms.
PubMed ID: 22737237
PMC ID: PMC3380836
Article link: PLoS One
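The conserved core described in the abstract is a tetrad of C2H2 zinc fingers. As a rough illustration of how such motifs can be located in a protein sequence, the sketch below scans a sequence with a regular expression modelled on the PROSITE-style C2H2 pattern C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H. This is an assumption made for illustration only: the abstract does not state how the fingers were identified, and the function name `find_c2h2` and the toy sequence are hypothetical.

```python
import re

# PROSITE-style C2H2 pattern written as a regular expression.
# Assumption: this pattern is used here for illustration; the source
# does not say which motif definition the authors applied.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def find_c2h2(sequence):
    """Return (start, end) pairs for non-overlapping C2H2-like matches.

    Coordinates are 0-based with an exclusive end, as in Python slices.
    """
    return [(m.start(), m.end()) for m in C2H2_PATTERN.finditer(sequence.upper())]


# Toy sequence (hypothetical, not a real OVOL protein): two synthetic
# fingers separated by a short linker.
finger = "C" + "AA" + "C" + "AAA" + "F" + "A" * 8 + "H" + "AAA" + "H"
toy_sequence = "MK" + finger + "GG" + finger

hits = find_c2h2(toy_sequence)
for start, end in hits:
    print(start, end, toy_sequence[start:end])
```

A pattern scan of this kind only flags candidate fingers; profile-based searches are generally more sensitive for divergent sequences.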
Figure 1. OVOL proteins are characterized by the presence of hypervariable ID regions. A. Mouse OVOL1 has ID residues in the first 100 amino acids. B. Mouse OVOL2 possesses ID residues in the first 50 amino acids, with a glycine-rich and serine-rich region marked in red. C. Mouse OVOL3 has ID segments within the N-terminal 100 residues. D. Drosophila OVO is intrinsically disordered, with large patches of residue bias indicated in red. We used DISOPRED2 software [47] to predict ID regions. The horizontal line indicates the ordered/disordered threshold for the default false positive rate of 5%. The 'filter' curve represents the outputs from DISOPRED2 and the 'output' curve represents the outputs from a linear support vector machine (SVM) classifier (DISOPREDsvm). The outputs from DISOPREDsvm are included to indicate shorter, low-confidence predictions of disorder.
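The Figure 1 caption describes per-residue disorder scores compared against an ordered/disordered threshold. A minimal sketch of that thresholding step is given below, assuming the scores are already available as a list of floats. The function name `disordered_segments`, the default threshold of 0.5, and the example scores are all hypothetical; the caption gives only the 5% false positive rate, not the numerical cutoff, and this is not DISOPRED2's own code.

```python
def disordered_segments(scores, threshold=0.5, min_length=1):
    """Group residues whose score is >= threshold into contiguous segments.

    Returns a list of (start, end) residue positions, 1-based and inclusive.
    The threshold default is a placeholder, not a value taken from the source.
    """
    segments = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = pos  # a new disordered run begins here
        elif start is not None:
            if pos - start >= min_length:
                segments.append((start, pos - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(scores) - start + 1 >= min_length:
        segments.append((start, len(scores)))
    return segments


# Hypothetical scores for a seven-residue toy protein.
example_scores = [0.9, 0.8, 0.1, 0.2, 0.7, 0.6, 0.9]
print(disordered_segments(example_scores))
print(disordered_segments(example_scores, min_length=3))
```

Raising `min_length` drops short runs, which loosely mirrors the caption's distinction between longer predictions and shorter, low-confidence ones.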
Figure 2. The disordered regions of OVOL have evolved more rapidly than structured regions. A) Structured regions only, B) Disordered segments only, C) Full-length OVOL.
Figure 3. Chromosomal localization of OVOL1 gene from selected vertebrates, flanked by a set of conserved marker genes. SIPA1: signal-induced proliferation-associated 1; RELA: v-rel reticuloendotheliosis viral oncogene homolog A (avian); KAT5: K (lysine) acetyltransferase; SNX32: sorting nexin 32; MUS81: MUS81 endonuclease homolog (S. cerevisiae); BANF1: barrier to autointegration factor 1; EXOC6B: exocyst complex component 6B; DYSF: dysferlin, limb girdle muscular dystrophy 2B; COL4A5: collagen, type IV, alpha 5; DAK: dihydroxyacetone kinase 2 S. cerevisiae homolog.
Figure 4. OVOL2 orthologs identified in vertebrates by comparing chromosomal localization. RRBP1: ribosome binding protein 1 homolog; BANF2: barrier to autointegration factor 2; SNX5: sorting nexin 5; CSRP2BP: CSRP2 binding protein; SEC23B: protein transport protein Sec23B; POLR3F: polymerase (RNA) III (DNA directed) polypeptide F; RBBP9: Retinoblastoma-binding protein 9; DTD1: D-tyrosyl-tRNA deacylase 1.
Figure 5. Synteny analysis of OVOL3 genes illustrates the loss of OVOL3a after duplication event and maintenance of paralogous OVOL3b in fishes. LIN37: lin-37 homolog (C. elegans); PRODH2: proline dehydrogenase (oxidase) 2; KIRREL2: kin of IRRE like 2 (Drosophila); APLP1: amyloid beta (A4) precursor-like protein 1; NFKBID: nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, delta; LRFN3: leucine rich repeat and fibronectin type III domain containing 3; SDHAF1: succinate dehydrogenase complex assembly factor 1; CLIP3: CAP-GLY domain containing linker protein 3; POLR2I: polymerase (RNA) II (DNA directed) polypeptide I, 14.5 kDa; CAPNS1: calpain, small subunit 1; COX7A1: cytochrome c oxidase subunit VIIa polypeptide 1 (muscle); DMPK: dystrophia myotonica-protein kinase; HLCS: holocarboxylase synthetase; AMOT: angiomotin; REXO2: REX2 RNA exonuclease 2 homolog (S. cerevisiae).
Figure 6. Sequence logo of four different Cys2-His2 (C2H2) zinc finger motifs (I-IV) present in different OVOL proteins from metazoan genomes. We generated this sequence logo using WebLogo 3.0 [81]. C2H2 zinc finger motif IV has 25 amino acids due to the presence of one extra amino acid at the eleventh position in the OVOLNVE1 protein from sea anemone.
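A sequence logo such as the one in Figure 6 scales each alignment column by its information content. The sketch below computes that quantity for a protein alignment as log2(20) minus the Shannon entropy of the column. It is a simplified illustration rather than WebLogo's implementation: it omits the small-sample correction, and the function name `column_information` and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_information(alignment, alphabet_size=20):
    """Return the information content (bits) of each alignment column.

    Gap characters ('-') are ignored; a column made only of gaps scores 0.
    No small-sample correction is applied (a simplifying assumption).
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_bits = math.log2(alphabet_size)
    info = []
    for i in range(length):
        residues = [seq[i] for seq in alignment if seq[i] != "-"]
        if not residues:
            info.append(0.0)
            continue
        total = len(residues)
        entropy = -sum(
            (n / total) * math.log2(n / total)
            for n in Counter(residues).values()
        )
        info.append(max_bits - entropy)
    return info


# Toy alignment (hypothetical): invariant cysteines flank two variable columns.
toy_alignment = ["CPEC", "CAEC", "CPDC", "CAKC"]
info = column_information(toy_alignment)
print([round(bits, 2) for bits in info])
```

Fully conserved columns, such as the zinc-coordinating cysteines and histidines of a C2H2 motif, reach the maximum of log2(20) bits under this measure.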
Figure 7. Phylogenetic history of OVOL proteins inferred using the Bayesian method. A. Full-length OVOL proteins. B. Selected region of OVOL proteins. Posterior probability scores are depicted by colored balls. The placozoan OVOL protein (e_gw1.4.509.1) was used as the outgroup in this phylogenetic tree. A red x marks a sequence whose position does not agree with the species phylogeny. BFL: B. floridae (lancelet), SPU: S. purpuratus (sea urchin), NVE: N. vectensis (sea anemone), HRO: H. robusta (annelid), LGI: L. gigantea (mollusc) and TAD: T. adhaerens (placozoan). The trees in Figures 7A and 7B were generated using MrBayes 3.2 [53] from the alignments supplied in supplementary Files S1 and S2, respectively.
Figure 8. Protein domain evolution of OVOL proteins from different metazoan lineages over a period of >700 MY. A highly conserved domain consisting of a tetrad of C2H2 zinc finger motifs (red and yellow box) is found in various metazoa. N-terminal extensions to the C2H2 domain primarily give rise to the different types of protein; the exceptions are the OVOL proteins from the leech and sea urchin, where the extension is found at the C-terminal end of the C2H2 zinc finger domain. The ID segments with no homology in evolutionarily distant organisms are marked in different colors. Times of divergence are taken from Hedges and Kumar (2003) [82] and Ponting (2008) [83].
Adams, The genome sequence of Drosophila melanogaster. 2000, Pubmed
Altschul, Local alignment statistics. 1996, Pubmed
Altschul, Basic local alignment search tool. 1990, Pubmed
Altschul, Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997, Pubmed
Andrews, New AUG initiation codons in a long 5' UTR create four dominant negative alleles of the Drosophila C2H2 zinc-finger gene ovo. 1998, Pubmed
Andrews, OVO transcription factors function antagonistically in the Drosophila female germline. 2000, Pubmed
Aparicio, Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. 2002, Pubmed
Blair Hedges, Genomic clocks and evolutionary timescales. 2003, Pubmed
Brown, Evolutionary rate heterogeneity in proteins with long disordered regions. 2002, Pubmed
Buchan, Protein annotation and modelling servers at University College London. 2010, Pubmed
Campen, TOP-IDP-scale: a new amino acid scale measuring propensity for intrinsic disorder. 2008, Pubmed
C. elegans Sequencing Consortium, Genome sequence of the nematode C. elegans: a platform for investigating biology. 1998, Pubmed
Conrad, Gene duplication: a drive for phenotypic diversity and cause of human disease. 2007, Pubmed
Crooks, WebLogo: a sequence logo generator. 2004, Pubmed
Dalloul, Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo): genome assembly and analysis. 2010, Pubmed
Delon, The Ovo/Shavenbaby transcription factor specifies actin remodelling during epidermal differentiation in Drosophila. 2003, Pubmed
Dunker, Flexible nets. The roles of intrinsic disorder in protein interaction networks. 2005, Pubmed
Dunker, Identification and functions of usefully disordered proteins. 2002, Pubmed
Dunker, The unfoldomics decade: an update on intrinsically disordered proteins. 2008, Pubmed
Dunker, Intrinsic disorder and protein function. 2002, Pubmed
Dyson, Intrinsically unstructured proteins and their functions. 2005, Pubmed
Edgar, MUSCLE: a multiple sequence alignment method with reduced time and space complexity. 2004, Pubmed
Edgar, MUSCLE: multiple sequence alignment with high accuracy and high throughput. 2004, Pubmed
Flicek, Ensembl's 10th year. 2010, Pubmed
Garza, Role of intrinsically disordered protein regions/domains in transcriptional regulation. 2009, Pubmed
Gibbs, Genome sequence of the Brown Norway rat yields insights into mammalian evolution. 2004, Pubmed
Guerzoni, De novo origins of human genes. 2011, Pubmed
Hellsten, The genome of the Western clawed frog Xenopus tropicalis. 2010, Pubmed, Xenbase
Hubbard, Ensembl 2009. 2009, Pubmed
Hulo, The PROSITE database. 2006, Pubmed
Iakoucheva, Intrinsic disorder in cell-signaling and cancer-associated proteins. 2002, Pubmed
International Chicken Genome Sequencing Consortium, Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. 2004, Pubmed
Jaillon, Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype. 2004, Pubmed
Johnson, EGL-38 Pax regulates the ovo-related gene lin-48 during Caenorhabditis elegans organ development. 2001, Pubmed
Karolchik, The UCSC Genome Browser Database: 2008 update. 2008, Pubmed
Kasahara, The medaka draft genome and insights into vertebrate genome evolution. 2007, Pubmed
Knowles, Recent de novo origin of human protein-coding genes. 2009, Pubmed
Kondo, Small peptides switch the transcriptional activity of Shavenbaby during Drosophila embryogenesis. 2010, Pubmed
Koonin, Orthologs, paralogs, and evolutionary genomics. 2005, Pubmed
Li, The LEF1/beta-catenin complex activates movo1, a mouse homolog of Drosophila ovo required for epidermal appendage differentiation. 2002, Pubmed
Li, Ovol2, a mammalian homolog of Drosophila ovo: gene structure, chromosomal mapping, and aberrant expression in blind-sterile mice. 2002, Pubmed
Li, Ovol1 regulates meiotic pachytene progression during spermatogenesis by repressing Id2 expression. 2005, Pubmed
Lü, Drosophila OVO zinc-finger protein regulates ovo and ovarian tumor target promoters. 1998, Pubmed
Masel, Cryptic genetic variation is enriched for potential adaptations. 2006, Pubmed
McGuffin, The PSIPRED protein structure prediction server. 2000, Pubmed
Mével-Ninio, The ovo gene of Drosophila encodes a zinc finger protein required for female germ line development. 1991, Pubmed
Mével-Ninio, ovo, a Drosophila gene required for ovarian development, is specifically expressed in the germline and shares most of its coding sequences with shavenbaby, a gene involved in embryo patterning. 1995, Pubmed
Mével-Ninio, The three dominant female-sterile mutations of the Drosophila ovo gene are point mutations that create new translation-initiator AUG codons. 1996, Pubmed
Mikkelsen, Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences. 2007, Pubmed
Nair, Ovol1 regulates the growth arrest of embryonic epidermal progenitor cells and represses c-myc transcription. 2006, Pubmed
Nair, Ovol1 represses its own transcription by competing with transcription activator c-Myb and by recruiting histone deacetylase activity. 2007, Pubmed
Ohno, Gene duplication and the uniqueness of vertebrate genomes circa 1970-1999. 1999, Pubmed
Oliver, The ovo locus is required for sex-specific germ line maintenance in Drosophila. 1987, Pubmed
Payre, ovo/svb integrates Wingless and DER pathways to control epidermis differentiation. 1999, Pubmed
Ponting, The functional repertoires of metazoan genomes. 2008, Pubmed
Putnam, The amphioxus genome and the evolution of the chordate karyotype. 2008, Pubmed
Putnam, Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. 2007, Pubmed
Ronquist, MrBayes 3: Bayesian phylogenetic inference under mixed models. 2003, Pubmed
Sardar, Conservation of tubulin-binding sequences in TRPV1 throughout evolution. 2012, Pubmed
Siltberg-Liberles, Evolution of structurally disordered proteins promotes neostructuralization. 2011, Pubmed
Sodergren, The genome of the sea urchin Strongylocentrotus purpuratus. 2006, Pubmed
Srivastava, The Trichoplax genome and the nature of placozoans. 2008, Pubmed
Szilágyi, The twilight zone between protein order and disorder. 2008, Pubmed
Tamura, MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. 2011, Pubmed
Tamura, MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. 2007, Pubmed
Teng, Strain-dependent perinatal lethality of Ovol1-deficient mice and identification of Ovol2 as a downstream target of Ovol1 in skin epidermis. 2007, Pubmed
Uversky, Showing your ID: intrinsic disorder as an ID for recognition, regulation and cell signaling. 2005, Pubmed
Uversky, Why are "natively unfolded" proteins unstructured under physiologic conditions? 2000, Pubmed
Uversky, Natively unfolded proteins: a point where biology waits for physics. 2002, Pubmed
Uversky, Intrinsically disordered proteins in human diseases: introducing the D2 concept. 2008, Pubmed
Venter, The sequence of the human genome. 2001, Pubmed
Ward, The DISOPRED server for the prediction of protein disorder. 2004, Pubmed
Ward, Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. 2004, Pubmed
Warren, The genome of a songbird. 2010, Pubmed
Waterston, Initial sequencing and comparative analysis of the mouse genome. 2002, Pubmed
Wheeler, Database resources of the National Center for Biotechnology Information. 2006, Pubmed
Wolfsberg, Using the NCBI map viewer to browse genomic sequence data. 2010, Pubmed
Wolfsberg, Using the NCBI Map Viewer to browse genomic sequence data. 2007, Pubmed