XB-ART-47572
BMC Genomics. May 28, 2013; 14: 762.
A genome-wide survey of maternal and embryonic transcripts during Xenopus tropicalis development.
Dynamics of polyadenylation vs. deadenylation determine the fate of several developmentally regulated genes. Decay of a subset of maternal mRNAs and new transcription define the maternal-to-zygotic transition, but the full complement of polyadenylated and deadenylated coding and non-coding transcripts has not yet been assessed in Xenopus embryos. To analyze the dynamics and diversity of coding and non-coding transcripts during development, both polyadenylated mRNA and ribosomal RNA-depleted total RNA were harvested across six developmental stages and subjected to high-throughput sequencing. The maternally loaded transcriptome is highly diverse and consists of both polyadenylated and deadenylated transcripts. Many maternal genes show peak expression in the oocyte and include genes known to be key regulators of events such as oocyte maturation and fertilization. Of all the transcripts that increase in abundance between early blastula and larval stages, about 30% of the embryonic genes are induced fourfold or more by the late blastula stage and another 35% by late gastrulation. Using a gene model validation and discovery pipeline, we identified novel transcripts and putative long non-coding RNAs (lncRNAs). These lncRNA transcripts were stringently selected as spliced transcripts generated from independent promoters, with limited coding potential and a codon bias characteristic of non-coding sequences. Many lncRNAs are conserved and expressed in a developmental stage-specific fashion. These data reveal dynamics of transcriptome polyadenylation and abundance and provide a high-confidence catalogue of novel and long non-coding RNAs.
PubMed ID: 24195446
PMC ID: PMC3907017
Article link: BMC Genomics.
Grant support: R01 HD054356 NICHD NIH HHS, R01 HD069344 NICHD NIH HHS
Genes referenced: aurka aurkb birc5.1 bmp4 cbx4 ccnb1 cdc25c cdk2 celf1 celf3 cer1 chrd.1 eomes ezh2 fbxo43 fbxo5 fgf8 foxa2 foxc1 foxc2 gata1 gata4 gsc hdac3 hes7.1 hist1h2ad hist1h2al hist1h4k hist2h2ab irx5 kif11 lef1 lhx1 mos myod1 neurog1 nkx2-5 nodal3.2 nodal5 nodal6 pax6 sia1 sia2 tacc3 tal1 tbxt tcf3 wnt4
Article Images:
|Figure 1. Generation of RNA-sequencing libraries. (a) Developmental stages of Xenopus tropicalis. (b) RPKM distribution across six developmental time points. Labels on the x-axis are Xenopus tropicalis Nieuwkoop and Faber developmental stages: oocyte (Oo), stage 6, stage 9, stage 12, stage 16 and stage 30. (c) Heat map showing Pearson correlation of expression (RPKM) between all 9 RNA-seq libraries. (d) Scatter plots showing stage-specific Pearson correlation between RNA-seq data generated using two different methods. Log2 RPKM values are plotted on the x and y axes, respectively. PolyA+ (RNA harvested with double polyA+ selection), RZ (ribosomal RNA-depleted total RNA).|
|Figure 2. Total and Polyadenylated RNA profiles of the Maternal Transcriptome. (a) Bar plots showing the gene-specific distribution of log2 RPKM ratios during early development. (b) Heat map showing a stage-specific comparison between PolyA+ and RZ data. The bar plots to the right of the figure represent average PolyA+ and RZ ratios per stage for the cluster numbered to the left of the heat map. Gene names are representative examples from the corresponding cluster. (c) Heat map showing abundance of polyadenylated maternal genes at six developmental time points. Gene names are representative examples from the corresponding cluster. The heat maps (b and c) show scaled expression values (the sum of expression per gene across all stages is set to one). PolyA+ (RNA harvested with double polyA+ selection), RZ (ribosomal RNA-depleted total RNA).|
|Figure 3. Overview of the Embryonic Transcriptome. (a) Density plot showing the distribution of Maternal-Embryonic (grey) and Embryonic (red) ratios of polyA+ vs. RZ expression (RPKM) at stage 9. (b) Heat map showing dynamic expression of 2,481 polyadenylated embryonic genes. Scale represents the log2-transformed RPKM values. Gene names are representative examples from the corresponding cluster. (c) Pie chart showing the percentage of genes whose expression is increased fourfold or more relative to the oocyte. (d) Heat map showing scaled expression (the sum of expression per gene across all stages is set to one) of 2,481 polyadenylated embryonic genes. Gene names are representative examples from the corresponding cluster. (e) Pie chart showing the percentage of embryonic genes peaking in expression per stage.|
|Figure 5. Analysis of Novel transcripts. (a) Subsets of gene models from the updated Xenopus tropicalis gene annotation pipeline. (b and c) Cumulative frequency charts showing the distribution of codon bias (LLR score) and ORF length for new gene models (NGM), all gene models (GM), new gene models with validation support (NGM-vv), random genomic sequences (Genomic seq.) and Xenbase-extracted X. tropicalis mRNAs (X.trop mRNA).|
|Figure 6. Analysis of NGM-vvo transcripts. (a) Cumulative frequency chart showing the distribution of codon bias (LLR score) for NGM-vvo, random genomic sequences (Genomic seq.) and Xenbase-extracted X. tropicalis mRNAs (X.trop mRNA). (b) An example illustrating an NGM-vvo gene model. The H3K4me3 peak indicates that the gene is transcribed from its own promoter. (c and d) Frequency distributions comparing number of exons and transcript length (nt, nucleotides) between all gene models (GM) and new gene models (NGM-vvo).|
|Figure 7. Expression analysis of putative long non-coding RNAs (NGM-vvo). (a) Box plot showing log-transformed expression (RPKM, PolyA+) across six developmental stages in the NGM-vvo subset. (b) Density graph comparing stage 9 expression (RPKM, PolyA+) between all gene models (GM) and the NGM-vvo subset. (c) Heat map showing unsupervised hierarchical clustering of expression (RPKM) of polyA+ and RZ data across embryogenesis. Color scale represents deviation from mean expression calculated row-wise. (d) Density plot comparing the distribution of log10-transformed conservation score (phastCons analysis, see Materials and methods) between random genomic sequences (Genomic Seq), Xenbase-extracted X. tropicalis mRNAs (X.trop mRNA) and the NGM-vvo subset. PolyA+ (RNA harvested with double polyA+ selection), RZ (ribosomal RNA-depleted total RNA).|
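
Two quantities recur in the captions above: scaled expression (the sum of expression per gene across all stages is set to one) and induction of fourfold or more relative to the oocyte. The following is a minimal Python sketch of how such values might be computed from per-stage RPKM values. It is not the authors' pipeline. The function names, the pseudocount of 1 used to avoid division by zero, and the example RPKM values are all assumptions made for illustration.

```python
import math

# Stage labels as used in the figure captions (oocyte, then NF stages).
STAGES = ["Oo", "6", "9", "12", "16", "30"]


def scale_expression(rpkm):
    """Scale one gene's RPKM values so that they sum to one across stages."""
    total = sum(rpkm)
    if total == 0:
        # A gene with no detected expression stays at zero everywhere.
        return [0.0] * len(rpkm)
    return [value / total for value in rpkm]


def log2_fold_change(value, reference, pseudocount=1.0):
    """log2 ratio of value to reference; the pseudocount is an assumption."""
    return math.log2((value + pseudocount) / (reference + pseudocount))


def first_stage_induced(rpkm, fold=4.0, pseudocount=1.0):
    """Return the first stage whose expression is >= `fold` times the oocyte
    level (first element of rpkm), or None if no stage reaches it."""
    threshold = math.log2(fold)
    for stage, value in zip(STAGES[1:], rpkm[1:]):
        if log2_fold_change(value, rpkm[0], pseudocount) >= threshold:
            return stage
    return None


# Hypothetical RPKM values for a single embryonic gene (not real data).
example = [0.5, 0.5, 7.0, 20.0, 12.0, 10.0]
scaled = scale_expression(example)
print([round(v, 2) for v in scaled])
print(first_stage_induced(example))
```

With scaling of this kind, genes of very different absolute abundance can be shown on a common color scale, which emphasizes when each gene peaks rather than how strongly it is expressed.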