Owens ND et al. (2016), Measuring Absolute RNA Copy Numbers at High Tem...

XB-ART-51804

Cell Rep 2016 Jan 26;143:632-47. doi: 10.1016/j.celrep.2015.12.050.

Show Gene links Show Anatomy links

Measuring Absolute RNA Copy Numbers at High Temporal Resolution Reveals Transcriptome Kinetics in Development.

Owens ND , Blitz IL , Lane MA , Patrushev I , Overton JD , Gilchrist MJ , Cho KW , Khokha MK .

???displayArticle.abstract???
Transcript regulation is essential for cell function, and misregulation can lead to disease. Despite technologies to survey the transcriptome, we lack a comprehensive understanding of transcript kinetics, which limits quantitative biology. This is an acute challenge in embryonic development, where rapid changes in gene expression dictate cell fate decisions. By ultra-high-frequency sampling of Xenopus embryos and absolute normalization of sequence reads, we present smooth gene expression trajectories in absolute transcript numbers. During a developmental period approximating the first 8 weeks of human gestation, transcript kinetics vary by eight orders of magnitude. Ordering genes by expression dynamics, we find that "temporal synexpression" predicts common gene function. Remarkably, a single parameter, the characteristic timescale, can classify transcript kinetics globally and distinguish genes regulating development from those involved in cellular metabolism. Overall, our analysis provides unprecedented insight into the reorganization of maternal and embryonic transcripts and redefines our ability to perform quantitative biology.

???displayArticle.pubmedLink??? 26774488
???displayArticle.pmcLink??? PMC4731879
???displayArticle.link??? Cell Rep
???displayArticle.grants??? [+]

Species referenced: Xenopus
Genes referenced: ace2 acta4 actc1 admp ankrd2 bad bicd2 btbd3 calhm2 cirbp ckb col2a1 col3a1 cpb1 ctrb1 ctrl dhx32 eef1a1 eef1a1o foxh1.2 g6pc1 gnao1 gpi hbe1 hhatl hspb1 klf17 klhl41 krt19 krt61 krt62 krt7 lima1 mep1a mep1b mixer myf5 myl1 myl11 myod1 nek2l nodal3 nodal3.2 nodal3.3 nodal5 nodal5.3 nodal5.4 nodal5.5 nodal6 odc1 sia1 sia2 sypl2 tbxt thdl17 thdl18 tmod4 tnnc2 tpi1 ventx3 ventx3.2

???displayArticle.gses??? GSE65785: Xenbase, NCBI

???attribute.lit??? ???displayArticles.show???

	Data generation, mRNA content of the embryo and gene expression dynamics, see also Fig S1, S2 A – Experimental design and sample collection B – Left: eef1a1o and ERCC-00171 abundance in relative normalization (TPM), Clutch A polyA+. RNA standard decreases with time. Right: absolute normalization of left. Red line indicates ERCC-00171 transcripts added based on the manufacturer’s datasheet. The RNA standard is constant at the correct abundance. C – Total mRNA (PolyA+ and rdRNA) in the embryo in nanograms with time. Gray region and line respectively mark Gaussian process 95% confidence interval (CI) and median of Clutch A & B PolyA+ data. D – Dynamics of gene expression of PolyA+ RNA in transcripts per embryo. Circles mark Clutch A PolyA+, triangles mark Clutch B PolyA+. The horizontal axis marks time (bottom) and NF developmental stage (top). Shaded regions mark Gaussian process 95% CIs of the data.
	Clutch A vs Clutch B PolyA+ Differential Expression A – Clutch A vs Clutch B histogram of differential expression scores as measured by the Bayesian Information Criterion (BIC), larger scores indicate greater differential expression (Supplemental Experimental Procedures). Three regions marked: 1) no differential expression BIC ≤ 0, 2) differential expression small effect 0 < BIC ≤ 60, 3) differential expression large effect BIC > 60. See B for explanation of thresholds. B – Differential expression effect size as measured by the log10 mean overlap between Clutch A and Clutch B Gaussian process models (Supplemental Experimental Procedures). Mean overlap decreases with increasing BIC. BIC = 0, 60 and mean overlaps of 60%, 10% and 2.5% marked. At BIC = 60 approximately all genes have less than 10% overlap, and all genes with less than 2.5% overlap have BIC > 60. C – Differential expression examples with decreasing BIC (top right). Genes on the boundary for strong differential expression have highly correlated expression profiles in A and B. Vertical lines mark convergences (red) or divergences (black). D – Left: Examples of convergence (dhx32) and divergence (Xetro.K05169). Right: Histogram of convergence and divergence times.
	Isoform Differential Expression in Clutch A, see also Fig S5 A â€“ Isoform switching event examples: maternal to maternal; maternal to zygotic; and zygotic to zygotic switching events (left). Histogram of isoform switching events by category and time (right). B â€“ bicd2 isoforms switch in PolyA+ data (circles), and do not switch in rdRNA data (squares) indicating differential polyadenylation. PolyA+ and rdRNA abundances agree within uncertainty of the absolute normalization (Supplemental Experimental Procedures). C â€“ Normalized read depth on bicd2 locus in Clutch A polyA+ between 0â€“4.5 hpf. Note temporal dynamics shared by the final three bicd2(1) exons. Heatmap and depth of pile ups corrected for changing total mRNA. D â€“ Spatial expression of three genes with isoforms with differential dynamics. All isoforms show different temporal and spatial expression domains. Grey lines and points mark total expression for each gene (sum of the red and blue isoforms).
	Temporal Synexpression in Clutch A PolyA+, see also Fig S6 A – Temporal map of the transcriptome: an enumeration of all gene expression dynamics in the embryo. Heat maps display all genes normalized by maximum expression and ordered by similarity. Inset (right) expands on genes transiently expressed with early onset. Vertical bars mark Gene Ontology (GO) enrichment (colors correspond to GO terms), numbers appended to GO bars indicate the number of genes of given category. All reported GO blocks are enriched with p < 2×10−4 (Fisher Exact) for entire block. S1, S2 and V1, V2 mark the location of somite and vision temporal synexpression genes respectively in B. “Transcription Factors *” labels an annotation of transcription factors separate to GO (Supplemental Experimental Methods). B – Somite (left) and Vision (right) synexpression groups.
	Transcript copy numbers and kinetics per embryo in Clutch A A â€“ Transcript copy number histogram in polyA+ sequencing at 0.0 (egg), 4.5 hpf (early stage 9), 10 hpf (stage 12.5, end of gastrulation), 66 hpf (tadpole stage 42). Detection limits give number of transcripts required to generate a single read averaged over all transcript lengths (Fig S2E). Transcript copy number distributions transition smoothly between these timepoints (data not shown). B â€“ Histogram of maximum rates of increase in transcripts/hour between any consecutive measurement points calculated from Gaussian process medians for each gene. Blue â€“ PolyA+ 0â€“23.5 hpf; Red â€“ rdRNA 0â€“23.5 hpf; Green â€“ PolyA+ 24â€“66 hpf. Vertical lines mark the gene with max. rate of increase for each category. C â€“ As B but kb/hour. D â€“ Maximum rate of increase in transcripts/hour (left) and kb/hour (right) for 0â€“3.5 hpf and 4â€“23.5 hpf time intervals. PolyA+ and rdRNA distributions are discrepant for 0â€“3.5 hpf reflecting polyadenylation of maternal RNAs. Distributions are identical over 4â€“23.5 hpf.
	Accumulation kinetics of pri-mir427 per allele in Clutch A rdRNA, see also Fig S7 A â€“ pri-mir427 expression over first 12 hpf in rdRNA data and in situ hybridizations. pri-mir427 show ubiquitous expression matching RNA-seq temporal data. Examples of most abundant transcripts shown for comparison. Inset: first detection of pri-mir427 above detection limit at 2.0 hpf (8â€“16 cell transition). B â€“ pri-mir427 abundance in kb per cell. Line marks Gaussian process median, error bar is 95% CI (Supplemental Experimental Procedures). Peak per cell abundance occurs after 11th division at 4.4 hpf. C â€“ pri-mir427 accumulation rate in kb/min/allele with time. Line is median of differential of Gaussian Process, error bar gives 95% CI (Supplemental Experimental Procedures). Peak accumulation rate is achieved at 4.2 hpf just after 10th cell division. D â€“ Organization of pri-mir427 locus on X. tropicalis Chromosome 3 (X. tropicalis v8 genome, Supplemental Experimental Procedures). Clusters of mir427 hairpins appear in tandem with a repeated sequence arranged symmetrically on opposite strands around a gap in the genome assembly.
	Characteristic Timescales of Gene Expression in Clutch A PolyA+ A – Distribution of max. rate of increase of all transcripts (gray) (as Fig 5), transcription factors (blue), and ribosomal proteins rps* and rpl* (red). B – Theoretical covariance (Matérn covariance function, Supplemental Experimental Procedures) between two measurements by time interval for example genes in C. C – Histogram of characteristic timescale over all genes in in Clutch A polyA+. Top row annotates example genes (timescale inset), fast/short timescales are on the left slow/long timescales on the right, Gaussian process 95% CI (gray) and median (blue) marked. Bars below give GO enrichment (calculated as Fig 2, Fig S6A) of histogram, numbers indicate total genes with given category. All enrichments have p < 2.6×10−5 (Table S4). “Transcription Factors *” labels an annotation of transcription factors separate to GO (Supplemental Experimental Methods).
	Fig S1 – RNA-seq data quality, related to Fig 1, Supplemental Experimental Procedures (A) Spearman correlation between all pairs of all samples for relative normalized TPM abundances. Neighboring samples are as correlated as replicates. (B) Scatter plots of all TPM abundances for all samples. Color indicates density of points. Horizontal and vertical strips indicate measurements only present in one sample. (C) RNA standard performance. Scatter plots compare RNA standard TPM abundances between all pairs of datasets. Color indicates spike-in species. Red boxes mark ArrayControl spikes present only in Clutch B PolyA+. Although, we find small numbers of reads that align uniquely to ArrayControl sequences when ArrayControl spikes are not present in the sample. We note that this is not true for ERCC spikes, we never find ERCC reads when ERCC spikes are not present (data not shown). ERCC PolyA+ and rdRNA comparisons lie off the diagonal indicating poor PolyA+ performance. ERCC-00116 has the worst PolyA+ performance and is not used in absolute normalization. (D) Identification of poorly performing libraries. Normalized leave-one-out score (Supplemental Experimental Procedures), higher scores indicate a poorly performing sample. Samples adjacent to a bad sample can also receive high scores. Red and blue samples in Clutch A PolyA+ are selected for repeat library construction and sequencing (E). Red samples excluded from all analysis. (E) Technical replicates in Clutch A PolyA+ of poorly performing samples reveal that the poor performance is highly reproducible indicating that performance issues are not due to library construction and sequencing. Red boxes marked those discarded, black boxes marked those retained. (F) Projections onto the first two principle components for Clutch A PolyA+ (left), Clutch B PolyA+ (center), and Clutch A rdRNA (right). Note strong temporal correlation between samples, major changes in the transcriptome are visible.
	Fig S2 – Statistics of Absolute Normalization, related to Fig 1, Supplemental Experimental Procedures (A) Absolute normalization of RNA standards. Left: relative normalization in TPM on linear and log scale for the three datasets. The trend shared by all spikes is clearly visible and differs between PolyA+ and rdRNA samples. Right: absolute normalization in transcripts per embryo on log and linear scale. Trend in spikes is removed, the sample noise of the spikes is retained and does not contaminate the absolute normalization of native transcripts. (B) Absolute normalization factors. Circles mark per sample factors, lines mark Gaussian Process smoothed factors. Left: Estimates of 𝛽𝑗 (Supplemental Experimental Procedures), for three datasets. Center-left: per dataset correction factors e−𝛽𝑗, note discrepancy in magnitude but not in trend between datasets. Center-right: correction for PolyA+ bias, and averaging Clutch A and Clutch B spike levels results in excellent agreement between the correction factors. Right: Smoothed correction factors calculated with non-stationary Gaussian Process (Supplemental Experimental Procedures). (C) Absolute normalization consistency and accuracy. Scatter plots of ERCC standards spiked vs calculated for Clutch A PolyA+, Clutch A rdRNA, Clutch B PolyA+ and Clutch A/Clutch B PolyA+ combined. Points are averaged over all samples in time to remove spikein sample noise. ArrayControl spikes are given with fold change errors for Clutch B PolyA+, fold changes range from 1.11-1.25. ArrayControl spikes EC02 and EC12 have are spiked in with the same copy number and are marked with the same vertical line.Combined Clutch A/Clutch B gives 95% confidence interval for a linear model with Gaussan noise, with slope unity and with standard deviation 𝜎𝑠 = 0.25. Note that residuals do not depend on expression level. (D) Absolute normalization uncertainty model. Left: model as described in C (far-right), with 25%, 50%, 75%, 95% confidence intervals for 106 transcripts. Right: Propagation of uncertainty model to absolute normalization of eef1a1o Clutch A polyA+, again 25%, 50% 75% and 95% of confidence intervals for the true number of eef1a1o transcripts/embryo marked. (E) RNA-seq detection limits – Transcript abundance required to produce a single read with time, detection limits increase as RNA content in the embryo increases. Top row: Detection limits in kb, points give the per sample detection limits, lines and shaded area give Gaussian Process median and 95% confidence interval. Bottom: Detection limit in transcripts/embryo, calculated by averaging kb Gaussian Process distribution over all transcript lengths. (F) The depletion of non-ribosomal RNAs in rdRNA sequencing. Comparisons between Clutch A PolyA+ and rdRNA sequencing (top). Examples of extreme depletion: mixer and foxh1.2.
	Fig S3 â€“ Spatial expression of selected genes by in situ, related to Fig 1 All panels give Clutch A PolyA+ (circles), Clutch B PolyA+ (triangles) and Gaussian Process median line and 95% confidence intervals. Numbers on in situ images indicate the Gaussian process median number of transcripts for the given stage. (A) Genes involved in muscle induction. (B) Genes with onset midway through the timecourse. (C) Genes with onset late in the timecourse.
	Fig S4 – Genes with constant expression – loading control candidates, related to Table S3 and Supplemental Experimental Procedures (A) Histogram of fold change envelope between maximum and minimum Gaussian Process median abundances in transcripts per embryo for all genes for Clutch A PolyA+ and rdRNA. Candidates for loading control with a fold change < 2 are highlighted. Histogram displays up to 8-fold envelope. (B) Traditional loading controls gapdh, odc1, eef1a1 have large fold changes between maximum and minimum expression. Fold change envelopes indicated. (C) Candidates for loading controls with maximum/minimum fold changes < 2. Full list in Table S3.
	Fig S5 – Differential Isoform Dynamics in Clutch A polyA+, related to Fig 3. (A) Explanation and detection of differential isoform dynamics. Left: isoform expression in transcripts/embryo. Center: Joint model of all isoforms assumed to have the same expression on z-normalized scale (see Supplemental Experimental Methods). Right: Models of all isoforms considered separately. If Bayesian Information Criterion (BIC) favors the joint model, the gene has differential abundance, whereas if the BIC favors the individual models a gene has differential dynamics. eef1a1 has differential abundance, nek2 and lima1 have differential dynamics. (B) Differential dynamics of groups of isoforms, BIC graph (left) between all lima1 isoforms. lima2(1) and lima2(2) have differential abundance, and they have differential dynamics with lima2(3) defining two groups of isoforms (right).
	Fig S6 â€“ Temporal synexpression and gene ontology in Clutch A PolyA+, related to Fig 4. (A) Sliding window gene ontology enrichment. Bottom: Heatmap of all genes (Fig 4) arranged by similarity. Middle: The location of all transcripts factors, blue line marks the number of transcription factors per 500 gene window. Red line marks the expected number of transcripts for any 500 gene window. Top: -log(p) chi-squared test for the association between the window position and the transcription factor density. A region that exceeds â€“log(p) > 5 is considered for annotation (Supplemental Experimental Procedures). (B) Somite Synexpression. Examples from the somite temporal synexpresssion group. Ranks give similarity to ckb between 34-66 hpf (Supplemental Experimental Procedures). Image source given at bottom.
	Fig S7 â€“ Early transcription events, related to Fig 6. (A) Stage 9 RNA pol II ChIP - Pile ups on mir427, eef1a1o, eef1a1, krt5.7 and cirbp loci of RNA pol II ChIP-seq reads (van Heeringen et al., 2014). See Supplemental Experimental Procedures for calculation of mir427 pile up. (B) Accumulation rates in kb/min/allele averaged over all cells in the embryo â€“ for nodal3, nodal5, nodal6, sia1, sia2, klf17, admp and ventx3.1. Left column: kb/embryo. Center column: Detection of activation of each gene during the cleavage stages. Note square root vertical scale. Vertical gray dashed lines indicate cell divisions. Horizontal dashed line gives average detection limit of 970kb for the cleavage stages. Right column: Accumulation rate in kb/min/allele, rates account for exponential increase in cell numbers and are correct if each transcript were transcribed in all cells (Supplemental Experimental Procedures).
	ace2 (angiotensin I converting enzyme 2) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 43, lateral view, anterior right, dorsal up.
	btbd3 (BTB (POZ) domain containing 3) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 33, lateral view, anterior right, dorsal up.
	cpb1 (carboxypeptidase B1) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 41, lateral view, anterior right, dorsal up.
	ctrb1 (chymotrypsinogen B1) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 41, lateral view, anterior right, dorsal up.
	ctrl (chymotrypsin-like) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 42, lateral view, anterior right, dorsal up.
	g6pc2 (glucose-6-phosphatase, catalytic, 2) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 29, lateral view, anterior right, dorsal up.
	hbe1 (hemoglobin subunit epsilon 1) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 29, lateral view, anterior right, dorsal up.
	mep1a (meprin A subunit alpha) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 43, lateral view, anterior right, dorsal up.
	mep1b (meprin A subunit beta) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 43, lateral view, anterior right, dorsal up.
	thdl17(thyroid hormone down-regulated protein (gene 17) ) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 43, lateral view, anterior right, dorsal up.
	thdl18 ( thyroid hormone down-regulated protein (gene 18) ) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 40, lateral view, anterior right, dorsal up.

References [+] :

Aanes, Zebrafish mRNA sequencing deciphers novelties in transcriptome dynamics during maternal to zygotic transition. 2011, Pubmed