PLoS One
2012 Jan 01;7(6):e39399. doi: 10.1371/journal.pone.0039399.
Molecular phylogeny of OVOL genes illustrates a conserved C2H2 zinc finger domain coupled by hypervariable unstructured regions.
Kumar A, Bhandari A, Sinha R, Sardar P, Sushma M, Goyal P, Goswami C, Grapputo A.
Abstract
OVO-like proteins (OVOL) are members of the zinc finger protein family and serve as transcription factors that regulate gene expression in various differentiation processes. Recent studies have shown that OVOL genes are involved in epithelial development and differentiation in a wide variety of organisms, yet comprehensive studies describing OVOL proteins from an evolutionary perspective are lacking. Using comparative genomic analysis, we traced three different OVOL genes (OVOL1-3) in vertebrates. One gene, OVOL3, was duplicated during a whole-genome duplication event in fish, but only one copy (OVOL3b) was retained. From early-branching metazoa to humans, we found that a core domain, comprising a tetrad of C2H2 zinc fingers, is conserved. By domain comparison of the OVOL proteins, we found that they evolved in different metazoan lineages by attaching intrinsically disordered (ID) segments, in the form of N/C-terminal extensions of 100 to 1000 amino acids, to this conserved core. These ID regions originated independently in different animal lineages, giving rise to different types of OVOL genes over the course of metazoan evolution. We illustrated the molecular evolution of metazoan OVOL genes over a period of 700 million years (MY). This study extends our current understanding of the structure/function relationship of metazoan OVOL genes and provides a platform for further characterization of OVOL genes from diverged organisms.
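The conserved core described above is a run of C2H2 zinc fingers. As a rough illustration of how such motifs can be located in a protein sequence, the sketch below scans for the classical C2H2 pattern written in the style of the PROSITE zinc finger signature (C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H). The choice of this particular pattern and the function name `find_c2h2` are assumptions made for illustration; the abstract does not state how the authors detected the motifs.

```python
import re

# Assumed pattern: classical C2H2 signature in PROSITE-like form,
# C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H, translated to a regex.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def find_c2h2(seq):
    """Return non-overlapping (start, end) matches, 0-based, end exclusive."""
    return [(m.start(), m.end()) for m in C2H2_PATTERN.finditer(seq.upper())]


if __name__ == "__main__":
    # Synthetic toy sequence (not a real OVOL protein): two fingers
    # joined by a short linker.
    finger = "CAACAAAFAAAAAAAAHAAAH"
    toy = "MGS" + finger + "TGEKP" + finger + "KK"
    for start, end in find_c2h2(toy):
        print(start, end, toy[start:end])
```

A protein with the tetrad arrangement would be expected to yield four hits in close succession; a pattern scan of this kind is only a first pass and does not replace profile-based domain annotation.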
Figure 1. OVOL proteins are characterized by the presence of hypervariable ID regions. A. Mouse OVOL1 has ID residues in the first 100 amino acids. B. Mouse OVOL2 possesses ID residues in the first 50 amino acids, with a glycine-rich and a serine-rich region marked in red. C. Mouse OVOL3 has ID segments within the N-terminal 100 residues. D. Drosophila OVO is intrinsically disordered, with large patches of biased residue composition indicated in red. We used DISOPRED2 software [47] for the prediction of ID regions. The horizontal line indicates the ordered/disordered threshold for the default false positive rate of 5%. The 'filter' curve represents the outputs from DISOPRED2 and the 'output' curve represents the outputs from a linear support vector machine (SVM) classifier (DISOPREDsvm). The outputs from DISOPREDsvm are included to indicate shorter, low-confidence predictions of disorder.
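The threshold line in Figure 1 separates residues called ordered from those called disordered. The sketch below shows one way to turn a per-residue score profile into disordered segments by applying such a cutoff. It is a generic illustration and not the DISOPRED2 procedure: the function name `disordered_segments`, the example scores, and the cutoff value of 0.5 are all hypothetical.

```python
def disordered_segments(scores, threshold, min_len=1):
    """Group consecutive residues with score >= threshold into segments.

    Returns 1-based, inclusive (start, end) residue ranges. Runs shorter
    than min_len residues are discarded.
    """
    segments = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = pos          # open a new disordered run
        elif start is not None:
            if pos - start >= min_len:
                segments.append((start, pos - 1))
            start = None             # close the current run
    # Close a run that extends to the last residue.
    if start is not None and len(scores) - start + 1 >= min_len:
        segments.append((start, len(scores)))
    return segments


if __name__ == "__main__":
    # Hypothetical per-residue scores and cutoff, for illustration only.
    profile = [0.9, 0.8, 0.2, 0.1, 0.7, 0.6, 0.9, 0.3]
    print(disordered_segments(profile, threshold=0.5))
    print(disordered_segments(profile, threshold=0.5, min_len=3))
```

Raising `min_len` drops short runs, which loosely mirrors the caption's distinction between long disordered regions and shorter, low-confidence predictions.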
Figure 2. The disordered regions of OVOL have evolved more rapidly than structured regions. A) Structured regions only, B) Disordered segments only, C) Full-length OVOL.
Figure 3. Chromosomal localization of the OVOL1 gene from selected vertebrates, flanked by a set of conserved marker genes. SIPA1: signal-induced proliferation-associated 1; RELA: v-rel reticuloendotheliosis viral oncogene homolog A (avian); KAT5: K (lysine) acetyltransferase; SNX32: sorting nexin 32; MUS81: MUS81 endonuclease homolog (S. cerevisiae); BANF1: barrier to autointegration factor 1; EXOC6B: exocyst complex component 6B; DYSF: dysferlin, limb girdle muscular dystrophy 2B; COL4A5: collagen, type IV, alpha 5; DAK: dihydroxyacetone kinase 2 S. cerevisiae homolog.
Figure 4. OVOL2 orthologs identified in vertebrates by comparing chromosomal localization. RRBP1: ribosome binding protein 1 homolog; BANF2: barrier to autointegration factor 2; SNX5: sorting nexin 5; CSRP2BP: CSRP2 binding protein; SEC23B: protein transport protein Sec23B; POLR3F: polymerase (RNA) III (DNA directed) polypeptide F; RBBP9: Retinoblastoma-binding protein 9; DTD1: D-tyrosyl-tRNA deacylase 1.
Figure 5. Synteny analysis of OVOL3 genes illustrates the loss of OVOL3a after the duplication event and the maintenance of paralogous OVOL3b in fishes. LIN37: lin-37 homolog (C. elegans); PRODH2: proline dehydrogenase (oxidase) 2; KIRREL2: kin of IRRE like 2 (Drosophila); APLP1: amyloid beta (A4) precursor-like protein 1; NFKBID: nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, delta; LRFN3: leucine rich repeat and fibronectin type III domain containing 3; SDHAF1: succinate dehydrogenase complex assembly factor 1; CLIP3: CAP-GLY domain containing linker protein 3; POLR2I: polymerase (RNA) II (DNA directed) polypeptide I, 14.5 kDa; CAPNS1: calpain, small subunit 1; COX7A1: cytochrome c oxidase subunit VIIa polypeptide 1 (muscle); DMPK: dystrophia myotonica-protein kinase; HLCS: holocarboxylase synthetase; AMOT: angiomotin; REXO2: REX2 RNA exonuclease 2 homolog (S. cerevisiae).
Figure 6. Sequence logo of four different Cys2-His2 (C2H2) zinc finger motifs (I-IV) present in different OVOL proteins from metazoan genomes. We generated this sequence logo using WebLogo 3.0 [81]. C2H2 zinc finger motif IV has 25 amino acids due to the presence of one extra amino acid at the eleventh position in the OVOLNVE1 protein from sea anemone.
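In a sequence logo, the height of each column reflects how strongly that position is conserved, measured in bits. The sketch below computes that per-column information content for a small protein alignment as log2(20) minus the Shannon entropy of the column. It is a simplified illustration rather than a reimplementation of WebLogo: the small-sample correction that logo tools usually apply is omitted, gaps are simply ignored, and the function name `column_information` and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_information(aligned, alphabet_size=20):
    """Per-column information content in bits for equal-length sequences.

    Uses log2(alphabet_size) - H(column), where H is the Shannon entropy
    of the observed residue frequencies. Gap characters ('-') are skipped
    and no small-sample correction is applied.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")

    max_bits = math.log2(alphabet_size)
    info = []
    for col in range(length):
        residues = [seq[col] for seq in aligned if seq[col] != "-"]
        if not residues:
            info.append(0.0)         # all-gap column carries no signal
            continue
        total = len(residues)
        entropy = 0.0
        for count in Counter(residues).values():
            p = count / total
            entropy -= p * math.log2(p)
        info.append(max_bits - entropy)
    return info


if __name__ == "__main__":
    # Toy alignment: columns 1 and 3 invariant, column 2 split evenly.
    toy = ["CAH", "CGH", "CAH", "CGH"]
    print([round(bits, 3) for bits in column_information(toy)])
```

Invariant positions, such as the zinc-coordinating cysteines and histidines of a C2H2 motif, reach the maximum of log2(20) bits, while variable positions score lower.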
Figure 7. Phylogenetic history of OVOL proteins using the Bayesian method. A. Full-length OVOL proteins. B. Selected region of OVOL proteins. Posterior probability scores are depicted by differently colored circles. The placozoan OVOL protein (e_gw1.4.509.1) was used as the outgroup in this phylogenetic tree. A red x marks sequences whose position did not agree with the species phylogeny. BFL: B. floridae (lancelet), SPU: S. purpuratus (sea urchin), NVE: N. vectensis (sea anemone), HRO: H. robusta (annelid), LGI: L. gigantea (mollusc) and TAD: T. adhaerens (placozoan). The trees in Figures 7A and 7B were generated using MrBayes 3.2 [53] from the alignments supplied in supplementary Files S1 and S2, respectively.
Figure 8. Protein domain evolution of OVOL proteins from different metazoan lineages over a period of >700 MY. A highly conserved domain consisting of a tetrad of C2H2 zinc finger motifs (red and yellow boxes) is found in various metazoa. In most lineages, N-terminal extensions to the C2H2 domain give rise to the different protein types; the exceptions are the OVOL proteins from the leech and sea urchin, in which the extension lies at the C-terminal end of the C2H2 zinc finger motifs. ID segments with no homology among evolutionarily distant organisms are marked in different colors. Times of divergence are taken from Hedges and Kumar (2003) [82] and Ponting (2008) [83].
References
Adams, The genome sequence of Drosophila melanogaster. 2000.
Altschul, Local alignment statistics. 1996.
Altschul, Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. 1997.
Altschul, Basic local alignment search tool. 1990.
Andrews, New AUG initiation codons in a long 5' UTR create four dominant negative alleles of the Drosophila C2H2 zinc-finger gene ovo. 1998.
Andrews, OVO transcription factors function antagonistically in the Drosophila female germline. 2000.
Aparicio, Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. 2002.
Blair Hedges, Genomic clocks and evolutionary timescales. 2003.
Brown, Evolutionary rate heterogeneity in proteins with long disordered regions. 2002.
Buchan, Protein annotation and modelling servers at University College London. 2010.
C. elegans Sequencing Consortium, Genome sequence of the nematode C. elegans: a platform for investigating biology. 1998.
Campen, TOP-IDP-scale: a new amino acid scale measuring propensity for intrinsic disorder. 2008.
Conrad, Gene duplication: a drive for phenotypic diversity and cause of human disease. 2007.
Crooks, WebLogo: a sequence logo generator. 2004.
Dalloul, Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo): genome assembly and analysis. 2010.
Delon, The Ovo/Shavenbaby transcription factor specifies actin remodelling during epidermal differentiation in Drosophila. 2003.
Dunker, Flexible nets. The roles of intrinsic disorder in protein interaction networks. 2005.
Dunker, The unfoldomics decade: an update on intrinsically disordered proteins. 2008.
Dunker, Intrinsic disorder and protein function. 2002.
Dunker, Identification and functions of usefully disordered proteins. 2002.
Dyson, Intrinsically unstructured proteins and their functions. 2005.
Edgar, MUSCLE: multiple sequence alignment with high accuracy and high throughput. 2004.
Edgar, MUSCLE: a multiple sequence alignment method with reduced time and space complexity. 2004.
Flicek, Ensembl's 10th year. 2010.
Garza, Role of intrinsically disordered protein regions/domains in transcriptional regulation. 2009.
Gibbs, Genome sequence of the Brown Norway rat yields insights into mammalian evolution. 2004.
Guerzoni, De novo origins of human genes. 2011.
Hellsten, The genome of the Western clawed frog Xenopus tropicalis. 2010.
Hubbard, Ensembl 2009. 2009.
Hulo, The PROSITE database. 2006.
Iakoucheva, Intrinsic disorder in cell-signaling and cancer-associated proteins. 2002.
International Chicken Genome Sequencing Consortium, Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. 2004.
Jaillon, Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype. 2004.
Johnson, EGL-38 Pax regulates the ovo-related gene lin-48 during Caenorhabditis elegans organ development. 2001.
Karolchik, The UCSC Genome Browser Database: 2008 update. 2008.
Kasahara, The medaka draft genome and insights into vertebrate genome evolution. 2007.
Knowles, Recent de novo origin of human protein-coding genes. 2009.
Kondo, Small peptides switch the transcriptional activity of Shavenbaby during Drosophila embryogenesis. 2010.
Koonin, Orthologs, paralogs, and evolutionary genomics. 2005.
Li, Ovol1 regulates meiotic pachytene progression during spermatogenesis by repressing Id2 expression. 2005.
Li, The LEF1/beta-catenin complex activates movo1, a mouse homolog of Drosophila ovo required for epidermal appendage differentiation. 2002.
Li, Ovol2, a mammalian homolog of Drosophila ovo: gene structure, chromosomal mapping, and aberrant expression in blind-sterile mice. 2002.
Lü, Drosophila OVO zinc-finger protein regulates ovo and ovarian tumor target promoters. 1998.
Masel, Cryptic genetic variation is enriched for potential adaptations. 2006.
McGuffin, The PSIPRED protein structure prediction server. 2000.
Mikkelsen, Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences. 2007.
Mével-Ninio, The ovo gene of Drosophila encodes a zinc finger protein required for female germ line development. 1991.
Mével-Ninio, ovo, a Drosophila gene required for ovarian development, is specifically expressed in the germline and shares most of its coding sequences with shavenbaby, a gene involved in embryo patterning. 1995.
Mével-Ninio, The three dominant female-sterile mutations of the Drosophila ovo gene are point mutations that create new translation-initiator AUG codons. 1996.
Nair, Ovol1 regulates the growth arrest of embryonic epidermal progenitor cells and represses c-myc transcription. 2006.
Nair, Ovol1 represses its own transcription by competing with transcription activator c-Myb and by recruiting histone deacetylase activity. 2007.
Ohno, Gene duplication and the uniqueness of vertebrate genomes circa 1970-1999. 1999.
Oliver, The ovo locus is required for sex-specific germ line maintenance in Drosophila. 1987.
Payre, ovo/svb integrates Wingless and DER pathways to control epidermis differentiation. 1999.
Ponting, The functional repertoires of metazoan genomes. 2008.
Putnam, Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. 2007.
Putnam, The amphioxus genome and the evolution of the chordate karyotype. 2008.
Ronquist, MrBayes 3: Bayesian phylogenetic inference under mixed models. 2003.
Sardar, Conservation of tubulin-binding sequences in TRPV1 throughout evolution. 2012.
Siltberg-Liberles, Evolution of structurally disordered proteins promotes neostructuralization. 2011.
Sodergren, The genome of the sea urchin Strongylocentrotus purpuratus. 2006.
Srivastava, The Trichoplax genome and the nature of placozoans. 2008.
Szilágyi, The twilight zone between protein order and disorder. 2008.
Tamura, MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. 2007.
Tamura, MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. 2011.
Teng, Strain-dependent perinatal lethality of Ovol1-deficient mice and identification of Ovol2 as a downstream target of Ovol1 in skin epidermis. 2007.
Uversky, Showing your ID: intrinsic disorder as an ID for recognition, regulation and cell signaling. 2005.
Uversky, Intrinsically disordered proteins in human diseases: introducing the D2 concept. 2008.
Uversky, Why are "natively unfolded" proteins unstructured under physiologic conditions? 2000.
Uversky, Natively unfolded proteins: a point where biology waits for physics. 2002.
Venter, The sequence of the human genome. 2001.
Ward, Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. 2004.
Ward, The DISOPRED server for the prediction of protein disorder. 2004.
Warren, The genome of a songbird. 2010.
Waterston, Initial sequencing and comparative analysis of the mouse genome. 2002.
Wheeler, Database resources of the National Center for Biotechnology Information. 2006.
Williams, The protein non-folding problem: amino acid determinants of intrinsic order and disorder. 2001.
Wolfsberg, Using the NCBI Map Viewer to browse genomic sequence data. 2007.
Wolfsberg, Using the NCBI map viewer to browse genomic sequence data. 2010.