BACKGROUND: With well-assembled genomes available for a growing number of organisms, identifying the bioinformatic basis of whole genome duplication (WGD) is a growing field of genomics. Most existing software for detecting the footprints of WGDs is restricted to well-assembled genomes. However, the large number of poor-quality genome assemblies and the more accessible transcriptomes have been largely ignored, although in theory they can also contribute to detecting WGDs with the dS-based method (dS: synonymous substitutions per synonymous site). Here, to address these problems, we designed WGDdetector, a universal and simple tool for detecting WGDs from either genome or transcriptome annotations in different organisms, based on the widely used dS-based method.
RESULTS: We constructed the WGDdetector pipeline, which integrates all analyses, including gene family construction, dS estimation and phasing, and output of the dS values for each paralog pair, in a single command. We then chose four species (Arabidopsis thaliana, Juglans regia, Populus trichocarpa and Xenopus laevis), representing herbaceous plants, woody plants and animals, to test its practicality. Our final results were highly consistent with those of previous studies for both genome and transcriptome data.
CONCLUSION: WGDdetector is not only reliable and stable for genome data, but also offers a new way to use transcriptome data to obtain the correct dS distribution for detecting WGDs. The source code is freely available and runs on the Windows and Linux operating systems.
Funding: "1000 Youth Talents Plan" of Yunnan Province; CAS "Light of West China" Program; start-up research fund of Lanzhou University to YY; start-up research fund of XTBG to ZL (No. B18114BN)
Fig. 1. Workflow of WGDdetector. The input consists only of the protein and CDS files. The proteins are used for similarity searching and gene family construction. The CDSs are used to calculate dS values based on the gene families constructed from the proteins. Further sub-gene family building and dS phasing are implemented with Perl scripts and R.
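The gene family step of this workflow can be illustrated with a minimal Python sketch. It is not the WGDdetector implementation: it assumes single-linkage grouping (union-find) over pairwise similarity hits as a simple stand-in for the pipeline's clustering, and the function names, the hit tuple layout and the `min_identity` threshold are all hypothetical.

```python
from collections import defaultdict


def build_families(hits, min_identity=30.0):
    """Group proteins into families by single-linkage over similarity hits.

    hits: iterable of (query_id, subject_id, percent_identity) tuples.
    min_identity is a hypothetical cutoff chosen for illustration only.
    """
    parent = {}

    def find(x):
        # Register unseen IDs as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for query, subject, identity in hits:
        # Look up both IDs first so weakly linked proteins stay as singletons.
        root_q, root_s = find(query), find(subject)
        if query != subject and identity >= min_identity and root_q != root_s:
            parent[root_s] = root_q

    families = defaultdict(list)
    for gene in parent:
        families[find(gene)].append(gene)
    return sorted(sorted(members) for members in families.values())


def paralog_pairs(family):
    """Enumerate every unordered pair of members within one family."""
    return [(a, b) for i, a in enumerate(family) for b in family[i + 1:]]


if __name__ == "__main__":
    example_hits = [("g1", "g2", 80.0), ("g2", "g3", 55.0), ("g4", "g5", 20.0)]
    for family in build_families(example_hits):
        print(family, paralog_pairs(family))
```

Each pair returned by `paralog_pairs` is the unit for which a dS value would then be estimated from the corresponding CDS alignment.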
Fig. 2. Comparison of the dS distributions within A. thaliana and P. trichocarpa. The y axis shows density and the x axis shows dS values. The four species are marked at the top of each sub-picture. The software and corresponding dataset are listed at the right of each sub-picture.
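A dS distribution of the kind plotted in Fig. 2 is read by looking for peaks in the density of paralog-pair dS values, since a WGD leaves many duplicates of similar age. The sketch below shows one simple way to do this with a normalized histogram and a local-maximum scan. It is an illustration under stated assumptions, not the method used by WGDdetector: the function names, the bin width and the `max_ds` cutoff are hypothetical, and real analyses often use kernel density estimates or mixture models instead.

```python
def ds_density(ds_values, bin_width=0.1, max_ds=3.0):
    """Return a histogram-based density of dS values over [0, max_ds).

    bin_width and max_ds are illustrative defaults, not published settings.
    """
    n_bins = round(max_ds / bin_width)
    counts = [0] * n_bins
    for ds in ds_values:
        if 0.0 <= ds < max_ds:
            # Clamp the index to guard against floating-point edge cases.
            index = min(int(ds / bin_width), n_bins - 1)
            counts[index] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    # Normalize so that the bar areas sum to 1.
    return [count / (total * bin_width) for count in counts]


def local_peaks(density):
    """Return indices of interior bins higher than the previous bin
    and at least as high as the next one."""
    peaks = []
    for i in range(1, len(density) - 1):
        if density[i] > density[i - 1] and density[i] >= density[i + 1]:
            peaks.append(i)
    return peaks


if __name__ == "__main__":
    # Made-up dS values for illustration only.
    values = [0.05, 0.15, 0.15, 0.15, 0.25, 0.95, 1.05, 1.05, 1.15]
    density = ds_density(values, bin_width=0.1, max_ds=2.0)
    for i in local_peaks(density):
        print(f"candidate peak in bin starting at dS = {i * 0.1:.1f}")
```

A peak found this way is only a candidate signal; whether it reflects a WGD still has to be judged against prior evidence for the species.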