Front Genet
2019 Jul 03;10:647. doi: 10.3389/fgene.2019.00647.
Modeling of Genome-Wide Polyadenylation Signals in Xenopus tropicalis.
Zhu S, Wu X, Fu H, Ye C, Chen M, Jiang Z, Ji G.
Abstract
Alternative polyadenylation (APA) is an important post-transcriptional modification event that processes messenger RNA (mRNA) for transcriptional termination, transport, and translation. In the present study, we characterized poly(A) signals in Xenopus tropicalis using 70,918 highly confident poly(A) sites derived from 16,511 protein-coding genes to understand their roles in the regulation of embryo development and gender difference. We examined potential factors that may affect the prevalence of APA, including the gene length, the number of introns in a gene, and the intron length. We observed 12 prominent poly(A) signal patterns, which accounted for approximately 92% of total APA sites in Xenopus tropicalis. Three of these patterns are specific to X. tropicalis and are absent in other animals such as humans or mice. We catalogued APA sites based on their genomic regions and developed a bioinformatics pipeline to identify over-represented signal patterns for each class. We then proposed a schema of cis elements for APA sites in each genomic region. More importantly, APA usage is highly dynamic in embryos across five developmental stages and is well coordinated with the maternal-to-zygotic transition event. We used an entropy-based method to identify developmental stage-specific APA sites and identified significant signal patterns around specific sites and constitutive sites. We found that the APA frequency in different genomic regions varies with developmental stages and that sites located in intron or coding sequence regions contribute most to the dynamics of gene expression across developmental stages. This study deciphers the characteristics and poly(A) signal patterns of both canonical and non-canonical APA sites across different developmental stages and gender dimorphisms in X. tropicalis, providing new insights into the dynamic regulation of distal and proximal APA.
Figure 1. Genomic distribution of poly(A) sites in X. tropicalis. (A) Distribution of different types of genes. Protein-coding genes account for more than 97% of annotated genes. (B) Distribution of APA sites in different locations. “3′ end ss” refers to the 3′ UTR and the extended region of 3′ UTR. “5′ end ss” refers to the 5′ UTR and the extended region of 5′ UTR. “Ex_3′ UTR” refers to the extended region of 3′ UTR. “Ex_5′ UTR” refers to the extended region of 5′ UTR. (C) Distribution of APA frequencies in different genomic regions. The left y-axis denotes the percentage of respective genes with the poly(A) site(s) in the specific region. The right y-axis denotes the APA ratio, which is the ratio between the number of APA sites in the specific region and the number of genes these APA sites are located in. Proximal sites are defined as poly(A) sites located in non-3′ UTR regions, while distal sites are those in 3′ UTR or extended 3′ UTR regions. “non-APA gene” refers to a gene with no APA event. “rare APA gene” refers to a gene with one APA site. “moderate APA gene” refers to a gene with two to four APA sites. “abundant APA gene” refers to a gene with more than four APA sites. “APA ratio” refers to the average number of APA sites per gene (APA site number/APA gene number). (D) The relationship among the APA frequency, the gene length, the number of introns, etc. “Freq” represents the frequency of APA sites in the gene; “Length” represents the length of a gene; “Inum” represents the number of introns; “5′ UTR,” “CDS,” and “Intron” represent the frequency of APA sites in the respective genomic regions. “proximal” represents the frequency of APA sites in CDS, intron, or 5′ UTR regions. “distal” represents the frequency of APA sites in 3′ UTR.
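The gene classes and the APA ratio defined in the Figure 1 caption can be illustrated with a minimal Python sketch. The function names, the gene identifiers, and the input layout (a mapping from gene to its number of APA sites) are hypothetical and chosen for illustration only; this is not the authors' pipeline.

```python
def classify_gene(n_sites: int) -> str:
    """Assign a gene to one of the four classes named in the caption."""
    if n_sites < 0:
        raise ValueError("site count cannot be negative")
    if n_sites == 0:
        return "non-APA gene"
    if n_sites == 1:
        return "rare APA gene"
    if n_sites <= 4:
        return "moderate APA gene"
    return "abundant APA gene"


def apa_ratio(site_counts: dict[str, int]) -> float:
    """APA site number divided by the number of genes carrying those sites.

    Genes with zero sites are excluded from the denominator, following the
    caption's wording ("the number of genes these APA sites are located in").
    """
    counts = [n for n in site_counts.values() if n > 0]
    if not counts:
        return 0.0
    return sum(counts) / len(counts)


# Hypothetical per-gene site counts for one genomic region.
example = {"geneA": 0, "geneB": 1, "geneC": 3, "geneD": 8}
classes = {gene: classify_gene(n) for gene, n in example.items()}
ratio = apa_ratio(example)  # (1 + 3 + 8) / 3 genes
```

Applied per genomic region, the same two functions would give the class percentages (left y-axis) and the APA ratio (right y-axis) described for panel C, assuming site counts are tallied separately for each region.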
Figure 2. Characteristics of poly(A) signals. (A) Nucleotide profiles surrounding poly(A) sites. (B) The difference in frequency between the highest signal and the second highest signal (displayed in the legend) with different pattern sizes. (C) Visual alignment of the top 50 hexamers, as in the sequence graphics view. Each sequence is presented as a single pixel-high horizontal line, and each bright spot represents an occurrence of a signal pattern at its location on the sequence. The patterns are ranked according to their total frequency in the dataset. The higher the ranking, the brighter the point. A continuous vertical band of lines from top to bottom indicates a common location of the signal element. AAUAAA (brightest point) mainly appears around −30 nt to −10 nt (the red dashed box), and signal aggregation was also observed around +20 nt (the blue dashed box). (D) Schematic of cis elements for poly(A) sites in X. tropicalis. Five regions were determined based on the nucleotide composition profile and the signal analysis. The GU-rich element overlaps with the downstream U-rich element.
Figure 3. Top-ranked patterns in different poly(A) signal elements. (A) Top 20 4-nt patterns in USE according to the occurrence number in the upstream region of the poly(A) site (−34 to −100 bp). (B) Top 20 4-nt patterns in CEL according to the occurrence number in the upstream region of the poly(A) site (−3 to −12 bp). (C) Top 20 4-nt patterns in CER according to the occurrence number in the downstream region of the poly(A) site (2 to 34 bp). (D) Top 20 4-nt patterns in PE region according to the occurrence number in the upstream region of the poly(A) site (−13 to −34 bp).
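Ranking 4-nt patterns by occurrence number within a window around the poly(A) site, as in Figure 3, can be sketched as follows. This is an illustrative example, not the authors' code: the function name `top_kmers`, the argument `site_index` (the string index of position 0 in each sequence), and the convention that a pattern must lie entirely inside the window are all assumptions.

```python
from collections import Counter


def top_kmers(seqs, site_index, start, end, k=4, top=20):
    """Count k-nt patterns inside the window [start, end] (inclusive),
    given in coordinates relative to the poly(A) site, and return the
    `top` most frequent ones as (pattern, count) pairs.

    Assumption: string index `site_index` corresponds to position 0,
    so position -1 is the nucleotide just upstream of it.
    """
    counts = Counter()
    for seq in seqs:
        lo = site_index + start
        hi = site_index + end  # inclusive upper bound
        if lo < 0 or hi >= len(seq):
            continue  # window not fully covered by this sequence
        # Report patterns in RNA notation, as the captions do.
        window = seq[lo:hi + 1].upper().replace("T", "U")
        for i in range(len(window) - k + 1):
            kmer = window[i:i + k]
            if set(kmer) <= set("ACGU"):  # skip ambiguous bases such as N
                counts[kmer] += 1
    return counts.most_common(top)


# Toy input: two short sequences with the poly(A) site at index 8.
toy = ["AAAATTTTGGGG", "AAAATTTTGGGG"]
upstream = dict(top_kmers(toy, site_index=8, start=-8, end=-1))
```

With sequences that span at least −100 to +34 nt around each site, a call such as `top_kmers(seqs, site_index, -12, -3)` would correspond to the CEL window given in the caption.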
Figure 4. Signal distribution of non-3′ UTR APA sites in X. tropicalis. (A) Top 20 hexamers according to the occurrence number from −100 to 100 bp around 5′ UTR poly(A) sites. The signal changes dramatically between −40 nt and −1 nt. The most significant signal in the PE region is AAUAAA, similar to the PE signal in the 3′ UTR. (B) Top 20 hexamers according to the occurrence number from −100 to 100 bp around CDS poly(A) sites. The most significant hexamer is ACUUAC, and the highly overlapping hexamer is AAGAAA in CE. (C) Top 20 hexamers sorted according to the occurrence number from −100 to 100 bp around intronic poly(A) sites. The most significant signal is AAUAAA in PE, and there is a GA-rich element around CS. (D) Polyadenylation signal models on 5′ UTR, CDS, and intron. The red triangle denotes the poly(A) site. USE, upstream sequence element; PE, positioning element; CE, cleavage element; CEL, left cleavage element; CS, cleavage site; CER, right cleavage element; DSE, downstream sequence element.
Figure 5. Characteristics of poly(A) signals during the development of X. tropicalis. (A) Information entropy distribution of poly(A) sites. The x-axis is the conventional information entropy (H), and the dashed lines represent thresholds of 0.82 and 2.66. The y-axis represents the adjusted information entropy (modeH), and the dashed lines represent thresholds of 2.65 and 2.70. Each point denotes one poly(A) site. Specific sites are colored in blue, and constitutive sites are colored in red. Red points in the top-right and bottom-left corners are sites that were selected based on the entropy value but were not defined as specific or constitutive sites because of their low expression levels (supported by fewer than five reads). AL: all poly(A) sites; SP: specific sites; CP: constitutive sites. (B) Venn diagram showing the overlap of specifically expressed genes at different developmental stages. “F_adult” includes young females and growing females. “M_adult” includes young males and growing males. (C) Percentages of specific poly(A) sites located in different locations across different periods. We randomly selected the same number of specific sites and constitutive sites and calculated the percentage of the genomic regions where these sites are located. In this figure, we only selected six periods (embryo stage 6, embryo stage 28, young female, growing female, young male, and growing male) with sufficient quantity for analysis.
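The conventional information entropy H on the x-axis of Figure 5A can be illustrated with a short sketch. It computes Shannon entropy (base 2) over a site's read counts across samples and applies the two H thresholds and the five-read cutoff quoted in the caption. Several points are assumptions rather than details from the source: that low H marks candidate specific sites and high H marks candidate constitutive sites, that the read cutoff applies to the total across samples, and the function and label names themselves. The adjusted entropy (modeH) is not implemented here.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (bits) of a site's read counts across samples."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("site has no reads")
    h = 0.0
    for c in counts:
        if c < 0:
            raise ValueError("read counts cannot be negative")
        if c > 0:
            p = c / total
            h -= p * math.log2(p)
    return h


def label_site(counts, low=0.82, high=2.66, min_reads=5):
    """Label a site using H alone.

    Assumption: thresholds 0.82 and 2.66 are applied as simple cutoffs on H,
    and `min_reads` is compared against the total read count.
    """
    if sum(counts) < min_reads:
        return "low expression"
    h = shannon_entropy(counts)
    if h <= low:
        return "specific candidate"
    if h >= high:
        return "constitutive candidate"
    return "unclassified"


# Hypothetical read counts for three sites across eight samples.
one_stage_only = [10, 0, 0, 0, 0, 0, 0, 0]  # H = 0 bits
evenly_used = [5, 5, 5, 5, 5, 5, 5, 5]      # H = 3 bits
too_few_reads = [1, 1, 0, 0, 0, 0, 0, 0]
```

A site used in a single sample has H = 0, while a site used evenly across N samples reaches the maximum of log2(N) bits, which is why the two thresholds separate stage-restricted from broadly used sites.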