|
Data generation, mRNA content of the embryo and gene expression dynamics, see also Fig S1, S2
A – Experimental design and sample collection
B – Left: eef1a1o and ERCC-00171 abundance in relative normalization (TPM), Clutch A polyA+. RNA standard decreases with time. Right: absolute normalization of left. Red line indicates ERCC-00171 transcripts added based on the manufacturer’s datasheet. The RNA standard is constant at the correct abundance.
C – Total mRNA (PolyA+ and rdRNA) in the embryo in nanograms with time. Gray region and line respectively mark Gaussian process 95% confidence interval (CI) and median of Clutch A & B PolyA+ data.
D – Dynamics of gene expression of PolyA+ RNA in transcripts per embryo. Circles mark Clutch A PolyA+, triangles mark Clutch B PolyA+. The horizontal axis marks time (bottom) and NF developmental stage (top). Shaded regions mark Gaussian process 95% CIs of the data.
|
|
Clutch A vs Clutch B PolyA+ Differential Expression
A – Clutch A vs Clutch B histogram of differential expression scores as measured by the Bayesian Information Criterion (BIC), larger scores indicate greater differential expression (Supplemental Experimental Procedures). Three regions marked: 1) no differential expression BIC ≤ 0, 2) differential expression small effect 0 < BIC ≤ 60, 3) differential expression large effect BIC > 60. See B for explanation of thresholds.
B – Differential expression effect size as measured by the log10 mean overlap between Clutch A and Clutch B Gaussian process models (Supplemental Experimental Procedures). Mean overlap decreases with increasing BIC. BIC = 0, 60 and mean overlaps of 60%, 10% and 2.5% marked. At BIC = 60 approximately all genes have less than 10% overlap, and all genes with less than 2.5% overlap have BIC > 60.
C – Differential expression examples with decreasing BIC (top right). Genes on the boundary for strong differential expression have highly correlated expression profiles in A and B. Vertical lines mark convergences (red) or divergences (black).
D – Left: Examples of convergence (dhx32) and divergence (Xetro.K05169). Right: Histogram of convergence and divergence times.
|
|
Isoform Differential Expression in Clutch A, see also Fig S5
A – Isoform switching event examples: maternal to maternal; maternal to zygotic; and zygotic to zygotic switching events (left). Histogram of isoform switching events by category and time (right).
B – bicd2 isoforms switch in PolyA+ data (circles), and do not switch in rdRNA data (squares) indicating differential polyadenylation. PolyA+ and rdRNA abundances agree within uncertainty of the absolute normalization (Supplemental Experimental Procedures).
C – Normalized read depth on bicd2 locus in Clutch A polyA+ between 0–4.5 hpf. Note temporal dynamics shared by the final three bicd2(1) exons. Heatmap and depth of pile ups corrected for changing total mRNA.
D – Spatial expression of three genes with isoforms with differential dynamics. All isoforms show different temporal and spatial expression domains. Grey lines and points mark total expression for each gene (sum of the red and blue isoforms).
|
|
Temporal Synexpression in Clutch A PolyA+, see also Fig S6
A – Temporal map of the transcriptome: an enumeration of all gene expression dynamics in the embryo. Heat maps display all genes normalized by maximum expression and ordered by similarity. Inset (right) expands on genes transiently expressed with early onset. Vertical bars mark Gene Ontology (GO) enrichment (colors correspond to GO terms), numbers appended to GO bars indicate the number of genes of given category. All reported GO blocks are enriched with p < 2×10−4 (Fisher Exact) for entire block. S1, S2 and V1, V2 mark the location of somite and vision temporal synexpression genes respectively in B. “Transcription Factors *” labels an annotation of transcription factors separate to GO (Supplemental Experimental Methods).
B – Somite (left) and Vision (right) synexpression groups.
|
|
Transcript copy numbers and kinetics per embryo in Clutch A
A – Transcript copy number histogram in polyA+ sequencing at 0.0 (egg), 4.5 hpf (early stage 9), 10 hpf (stage 12.5, end of gastrulation), 66 hpf (tadpole stage 42). Detection limits give number of transcripts required to generate a single read averaged over all transcript lengths (Fig S2E). Transcript copy number distributions transition smoothly between these timepoints (data not shown).
B – Histogram of maximum rates of increase in transcripts/hour between any consecutive measurement points calculated from Gaussian process medians for each gene. Blue – PolyA+ 0–23.5 hpf; Red – rdRNA 0–23.5 hpf; Green – PolyA+ 24–66 hpf. Vertical lines mark the gene with max. rate of increase for each category.
C – As B but kb/hour.
D – Maximum rate of increase in transcripts/hour (left) and kb/hour (right) for 0–3.5 hpf and 4–23.5 hpf time intervals. PolyA+ and rdRNA distributions are discrepant for 0–3.5 hpf reflecting polyadenylation of maternal RNAs. Distributions are identical over 4–23.5 hpf.
|
|
Accumulation kinetics of pri-mir427 per allele in Clutch A rdRNA, see also Fig S7
A – pri-mir427 expression over first 12 hpf in rdRNA data and in situ hybridizations. pri-mir427 show ubiquitous expression matching RNA-seq temporal data. Examples of most abundant transcripts shown for comparison. Inset: first detection of pri-mir427 above detection limit at 2.0 hpf (8–16 cell transition).
B – pri-mir427 abundance in kb per cell. Line marks Gaussian process median, error bar is 95% CI (Supplemental Experimental Procedures). Peak per cell abundance occurs after 11th division at 4.4 hpf.
C – pri-mir427 accumulation rate in kb/min/allele with time. Line is median of differential of Gaussian Process, error bar gives 95% CI (Supplemental Experimental Procedures). Peak accumulation rate is achieved at 4.2 hpf just after 10th cell division.
D – Organization of pri-mir427 locus on X. tropicalis Chromosome 3 (X. tropicalis v8 genome, Supplemental Experimental Procedures). Clusters of mir427 hairpins appear in tandem with a repeated sequence arranged symmetrically on opposite strands around a gap in the genome assembly.
|
|
Characteristic Timescales of Gene Expression in Clutch A PolyA+
A – Distribution of max. rate of increase of all transcripts (gray) (as Fig 5), transcription factors (blue), and ribosomal proteins rps* and rpl* (red).
B – Theoretical covariance (Matérn covariance function, Supplemental Experimental Procedures) between two measurements by time interval for example genes in C.
C – Histogram of characteristic timescale over all genes in in Clutch A polyA+. Top row annotates example genes (timescale inset), fast/short timescales are on the left slow/long timescales on the right, Gaussian process 95% CI (gray) and median (blue) marked. Bars below give GO enrichment (calculated as Fig 2, Fig S6A) of histogram, numbers indicate total genes with given category. All enrichments have p < 2.6×10−5 (Table S4). “Transcription Factors *” labels an annotation of transcription factors separate to GO (Supplemental Experimental Methods).
|
|
Fig S1 – RNA-seq data quality, related to Fig 1, Supplemental Experimental
Procedures
(A) Spearman correlation between all pairs of all samples for relative normalized TPM
abundances. Neighboring samples are as correlated as replicates.
(B) Scatter plots of all TPM abundances for all samples. Color indicates density of points.
Horizontal and vertical strips indicate measurements only present in one sample.
(C) RNA standard performance. Scatter plots compare RNA standard TPM abundances
between all pairs of datasets. Color indicates spike-in species. Red boxes mark ArrayControl
spikes present only in Clutch B PolyA+. Although, we find small numbers of reads that align
uniquely to ArrayControl sequences when ArrayControl spikes are not present in the sample.
We note that this is not true for ERCC spikes, we never find ERCC reads when ERCC
spikes are not present (data not shown). ERCC PolyA+ and rdRNA comparisons lie off the
diagonal indicating poor PolyA+ performance. ERCC-00116 has the worst PolyA+
performance and is not used in absolute normalization.
(D) Identification of poorly performing libraries. Normalized leave-one-out score
(Supplemental Experimental Procedures), higher scores indicate a poorly performing
sample. Samples adjacent to a bad sample can also receive high scores. Red and blue
samples in Clutch A PolyA+ are selected for repeat library construction and sequencing (E).
Red samples excluded from all analysis.
(E) Technical replicates in Clutch A PolyA+ of poorly performing samples reveal that the
poor performance is highly reproducible indicating that performance issues are not due to
library construction and sequencing. Red boxes marked those discarded, black boxes
marked those retained.
(F) Projections onto the first two principle components for Clutch A PolyA+ (left), Clutch B
PolyA+ (center), and Clutch A rdRNA (right). Note strong temporal correlation between
samples, major changes in the transcriptome are visible.
|
|
Fig S2 – Statistics of Absolute Normalization, related to Fig 1, Supplemental
Experimental Procedures
(A) Absolute normalization of RNA standards. Left: relative normalization in TPM on linear
and log scale for the three datasets. The trend shared by all spikes is clearly visible and
differs between PolyA+ and rdRNA samples. Right: absolute normalization in transcripts per
embryo on log and linear scale. Trend in spikes is removed, the sample noise of the spikes
is retained and does not contaminate the absolute normalization of native transcripts.
(B) Absolute normalization factors. Circles mark per sample factors, lines mark Gaussian
Process smoothed factors. Left: Estimates of 𝛽𝑗 (Supplemental Experimental Procedures),
for three datasets. Center-left: per dataset correction factors e−𝛽𝑗, note discrepancy in
magnitude but not in trend between datasets. Center-right: correction for PolyA+ bias, and
averaging Clutch A and Clutch B spike levels results in excellent agreement between the
correction factors. Right: Smoothed correction factors calculated with non-stationary
Gaussian Process (Supplemental Experimental Procedures).
(C) Absolute normalization consistency and accuracy. Scatter plots of ERCC standards
spiked vs calculated for Clutch A PolyA+, Clutch A rdRNA, Clutch B PolyA+ and Clutch
A/Clutch B PolyA+ combined. Points are averaged over all samples in time to remove spikein
sample noise. ArrayControl spikes are given with fold change errors for Clutch B PolyA+,
fold changes range from 1.11-1.25. ArrayControl spikes EC02 and EC12 have are spiked in
with the same copy number and are marked with the same vertical line.Combined Clutch
A/Clutch B gives 95% confidence interval for a linear model with Gaussan noise, with slope
unity and with standard deviation 𝜎𝑠 = 0.25. Note that residuals do not depend on expression
level.
(D) Absolute normalization uncertainty model. Left: model as described in C (far-right), with
25%, 50%, 75%, 95% confidence intervals for 106 transcripts. Right: Propagation of
uncertainty model to absolute normalization of eef1a1o Clutch A polyA+, again 25%, 50%
75% and 95% of confidence intervals for the true number of eef1a1o transcripts/embryo
marked.
(E) RNA-seq detection limits – Transcript abundance required to produce a single read with
time, detection limits increase as RNA content in the embryo increases. Top row: Detection
limits in kb, points give the per sample detection limits, lines and shaded area give Gaussian
Process median and 95% confidence interval. Bottom: Detection limit in transcripts/embryo,
calculated by averaging kb Gaussian Process distribution over all transcript lengths.
(F) The depletion of non-ribosomal RNAs in rdRNA sequencing. Comparisons between
Clutch A PolyA+ and rdRNA sequencing (top). Examples of extreme depletion: mixer and
foxh1.2.
|
|
Fig S3 – Spatial expression of selected genes by in situ, related to Fig 1
All panels give Clutch A PolyA+ (circles), Clutch B PolyA+ (triangles) and Gaussian Process median line and 95% confidence
intervals. Numbers on in situ images indicate the Gaussian process median number of transcripts for the given stage.
(A) Genes involved in muscle induction.
(B) Genes with onset midway through the timecourse.
(C) Genes with onset late in the timecourse.
|
|
Fig S4 – Genes with constant expression – loading control candidates, related to Table S3 and
Supplemental Experimental Procedures
(A) Histogram of fold change envelope between maximum and minimum Gaussian Process median abundances in transcripts
per embryo for all genes for Clutch A PolyA+ and rdRNA. Candidates for loading control with a fold change < 2 are highlighted.
Histogram displays up to 8-fold envelope.
(B) Traditional loading controls gapdh, odc1, eef1a1 have large fold changes between maximum and minimum expression. Fold
change envelopes indicated.
(C) Candidates for loading controls with maximum/minimum fold changes < 2. Full list in Table S3.
|
|
Fig S5 – Differential Isoform Dynamics in Clutch A polyA+, related to Fig 3.
(A) Explanation and detection of differential isoform dynamics. Left: isoform expression in transcripts/embryo. Center: Joint
model of all isoforms assumed to have the same expression on z-normalized scale (see Supplemental Experimental Methods).
Right: Models of all isoforms considered separately. If Bayesian Information Criterion (BIC) favors the joint model, the gene has
differential abundance, whereas if the BIC favors the individual models a gene has differential dynamics. eef1a1 has differential
abundance, nek2 and lima1 have differential dynamics.
(B) Differential dynamics of groups of isoforms, BIC graph (left) between all lima1 isoforms. lima2(1) and lima2(2) have differential
abundance, and they have differential dynamics with lima2(3) defining two groups of isoforms (right).
|
|
Fig S6 – Temporal synexpression and gene ontology in Clutch A PolyA+, related to Fig 4.
(A) Sliding window gene ontology enrichment. Bottom: Heatmap of all genes (Fig 4) arranged by similarity. Middle: The location
of all transcripts factors, blue line marks the number of transcription factors per 500 gene window. Red line marks the expected
number of transcripts for any 500 gene window. Top: -log(p) chi-squared test for the association between the window position
and the transcription factor density. A region that exceeds –log(p) > 5 is considered for annotation (Supplemental Experimental
Procedures).
(B) Somite Synexpression. Examples from the somite temporal synexpresssion group. Ranks give similarity to ckb between
34-66 hpf (Supplemental Experimental Procedures). Image source given at bottom.
|
|
Fig S7 – Early transcription events, related to Fig 6.
(A) Stage 9 RNA pol II ChIP - Pile ups on mir427, eef1a1o, eef1a1, krt5.7 and cirbp loci of RNA pol II ChIP-seq reads (van
Heeringen et al., 2014). See Supplemental Experimental Procedures for calculation of mir427 pile up.
(B) Accumulation rates in kb/min/allele averaged over all cells in the embryo – for nodal3, nodal5, nodal6, sia1, sia2, klf17,
admp and ventx3.1. Left column: kb/embryo. Center column: Detection of activation of each gene during the cleavage stages.
Note square root vertical scale. Vertical gray dashed lines indicate cell divisions. Horizontal dashed line gives average detection
limit of 970kb for the cleavage stages. Right column: Accumulation rate in kb/min/allele, rates account for exponential increase
in cell numbers and are correct if each transcript were transcribed in all cells (Supplemental Experimental Procedures).
|
|
ace2 (angiotensin I converting enzyme 2) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 43, lateral view, anterior right, dorsal up.
|
|
btbd3 (BTB (POZ) domain containing 3) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 33, lateral view, anterior right, dorsal up.
|
|
cpb1 (carboxypeptidase B1) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 41, lateral view, anterior right, dorsal up.
|
|
ctrb1 (chymotrypsinogen B1) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 41, lateral view, anterior right, dorsal up.
|
|
ctrl (chymotrypsin-like) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 42, lateral view, anterior right, dorsal up.
|
|
g6pc2 (glucose-6-phosphatase, catalytic, 2) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 29, lateral view, anterior right, dorsal up.
|
|
hbe1 (hemoglobin subunit epsilon 1) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 29, lateral view, anterior right, dorsal up.
|
|
mep1a (meprin A subunit alpha) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 43, lateral view, anterior right, dorsal up.
|
|
mep1b (meprin A subunit beta) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 43, lateral view, anterior right, dorsal up.
|
|
thdl17(thyroid hormone down-regulated protein (gene 17) ) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 43, lateral view, anterior right, dorsal up.
|
|
thdl18 ( thyroid hormone down-regulated protein (gene 18) ) gene expression in Xenopus tropicalis embryo, assayed via in situ hybridization, NF stage 40, lateral view, anterior right, dorsal up.
|