|
Graphical Abstract.
|
|
Figure 1. Filtering of replication patterns of single DNA molecules from DNA combing experiments during unperturbed S phase by an auto-correlation function C(r, f). (A) Workflow of DNA combing experiments in the Xenopus in vitro system. Sperm nuclei were incubated in egg extract in the presence of biotin-dUTP; replication reactions were stopped at different times during the unperturbed, naturally synchronous S phase; and DNA was purified and stretched onto coverslips. Replicated tracks on single DNA fibers were revealed by fluorescence microscopy after immunolabeling [replication eyes (red) and DNA molecule (green)]. (B) Auto-correlation function C(r) measures the spatial regularity of replication patterns of a DNA molecule and its shifted copies as a function of the lag distance r. (C) C(r, f) profile for a simulated dataset (mean with error, blue) for constant I(f) (0.03 kb−1 min−1) and constant v (1 kb/min) and fit (red) with one process at different bins of replicated fractions f. (D) Mean simulated C(r, f) profiles with standard deviation for three independent control DNA combing experiments at different bins of replicated fractions f and fit with one process using the KJMA model. Bins of replicated fraction are f1 = (0, 0.11], f2 = (0.11, 0.21], f3 = (0.21, 0.32], f4 = (0.32, 0.42], f5 = (0.42, 0.54], f6 = (0.54, 0.64], and f7 = (0.64, 0.75].
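The caption does not spell out how C(r) is computed, so the following is only a minimal sketch under stated assumptions: a fiber is represented as a list of 0/1 pixels (1 = replicated) at fixed resolution, and C(r) is taken to be the normalized autocovariance between the fiber and its copy shifted by r pixels. The function names are hypothetical; only the bin edges come from the caption.

```python
from bisect import bisect_left

# Bin edges taken from the caption: f1 = (0, 0.11], ..., f7 = (0.64, 0.75].
BIN_EDGES = [0.0, 0.11, 0.21, 0.32, 0.42, 0.54, 0.64, 0.75]


def replicated_fraction(pattern):
    """Fraction of replicated pixels in a binary pattern (1 = replicated)."""
    return sum(pattern) / len(pattern)


def assign_bin(f):
    """Return the 1-based bin index of replicated fraction f, or None outside (0, 0.75]."""
    i = bisect_left(BIN_EDGES, f)
    return i if 1 <= i < len(BIN_EDGES) else None


def autocorrelation(pattern, max_lag):
    """Normalized autocovariance [C(0), ..., C(max_lag)] of a binary pattern.

    Assumption: this plain estimator stands in for the C(r) of the caption.
    """
    n = len(pattern)
    mean = sum(pattern) / n
    var = sum((x - mean) ** 2 for x in pattern) / n
    if var == 0:
        raise ValueError("pattern is entirely replicated or entirely unreplicated")
    profile = []
    for r in range(max_lag + 1):
        # Average product of deviations between the fiber and its copy shifted by r.
        cov = sum((pattern[i] - mean) * (pattern[i + r] - mean)
                  for i in range(n - r)) / (n - r)
        profile.append(cov / var)
    return profile


# Illustrative, perfectly periodic fiber: eyes of 2 pixels separated by gaps of 2 pixels.
fiber = [1, 1, 0, 0] * 5
profile = autocorrelation(fiber, 4)
```

For this regular toy pattern the profile oscillates with the period of the eyes; an irregular pattern would give a profile that decays with r instead.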
|
|
Figure 2. Hierarchical classification of DNA molecules’ replication patterns. (A) Similarity matrix between the molecules’ C(r) for different replication fractions f. Each cell in the matrix represents the similarity s = 1 − pair-wise correlation coefficient between the C(r) of two molecules in the replication bin. A score s = 0 represents two perfectly similar C(r), and a score s = 1 represents two completely different C(r). (B) Pareto chart of the percent variability explained by each principal component. Each chart corresponds to a different interval of replicated fractions from 0 to 0.75; the average replicated fraction is reported on top. In each chart, the bars represent the percentage of variance described by the corresponding principal component in descending order. The line above represents the cumulative total. (C) Mean C(r, f) profiles for molecules hierarchically classified into two clusters, as suggested by the average silhouette value (Supplementary Figure S1). The C(r, f) of the category containing the smaller number of molecules is in cluster 1, and the C(r, f) of the category containing the larger number of molecules is in cluster 2. The fraction of molecules in each cluster is reported in Supplementary Figure S2. Error bars are standard deviations.
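The similarity score of panel A can be sketched as follows, assuming the pair-wise correlation coefficient is the Pearson coefficient between two C(r) profiles sampled at the same lags. The function names are hypothetical. With a plain Pearson coefficient, anti-correlated profiles would score above 1, so the caption's 0 to 1 reading may rely on a normalization that is not stated here.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)


def similarity_matrix(profiles):
    """Matrix of s = 1 - Pearson correlation between every pair of profiles."""
    return [[1.0 - pearson(p, q) for q in profiles] for p in profiles]


# Illustrative profiles (not data from the paper).
profiles = [
    [1.0, 0.5, 0.2, 0.1],   # decaying profile
    [2.0, 1.0, 0.4, 0.2],   # same shape, different amplitude
    [0.1, 0.2, 0.5, 1.0],   # reversed shape
]
s = similarity_matrix(profiles)
```

Because the score depends only on the shape of the profiles, the first two rows are at distance 0 from each other despite their different amplitudes.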
|
|
Figure 3. Categorization of DNA molecule’s replication patterns into two dynamical processes by RepliCorr. (A) Fit (red curve) of mean C(r, f) profile (blue curve, with standard deviation from three independent experiments) for each replication fraction bin calculated as C(r, f) = Θ(f)C1(r, f) + [1 − Θ(f)]*C2(r, f). The green curve is the correlation profile produced by the fast fork process [C1(r, f), model 1], and the black curve is the correlation profile produced by the slow fork process [C2(r, f), model 2]. The error bars are standard deviations. The inset is the Θ(f) profile. (B) Normalized correlation coefficients (ρ1, ρ2) between the molecule’s C(r, f) and C1(r, f) and C2(r, f) were calculated. The similarity distance between the molecule and each process was defined as 1 − ρ1 for the fast process 1 and 1 − ρ2 for the slow process and represented on a two orthogonal axis plot. The diagonal represents points of equal similarity to the two processes. Points above the diagonal are more similar to the fast process and points below are more similar to the slow process. (C) Representative fibers with replication patterns in the fast and slow process in the same replicated fraction bins. Bins of replicated fraction are f1 = (0, 0.11], f2 = (0.11, 0.21], f3 = (0.21, 0.32], f4 = (0.32, 0.42], f5 = (0.42, 0.54], f6 = (0.54, 0.64], and f7 = (0.64, 0.75].
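The mixture in panel A and the assignment rule in panel B can be sketched as below. This assumes that the normalized correlation coefficients are Pearson coefficients and that the plot uses x = 1 − ρ1 and y = 1 − ρ2, which is consistent with points above the diagonal being closer to the fast process. The function names and numerical profiles are hypothetical, not fitted values.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)


def mix_profiles(theta, c1, c2):
    """C(r, f) = theta * C1(r, f) + (1 - theta) * C2(r, f), evaluated lag by lag."""
    return [theta * a + (1.0 - theta) * b for a, b in zip(c1, c2)]


def classify(profile, c1, c2):
    """Return (1 - rho1, 1 - rho2, label) for one molecule.

    A molecule is labeled 'fast' when its distance to process 1 is the smaller one;
    ties go to 'slow', which is an arbitrary choice of this sketch.
    """
    d_fast = 1.0 - pearson(profile, c1)
    d_slow = 1.0 - pearson(profile, c2)
    return d_fast, d_slow, "fast" if d_fast < d_slow else "slow"


# Illustrative model profiles.
c_fast = [1.0, 0.6, 0.3, 0.1, 0.0]
c_slow = [1.0, 0.2, -0.2, 0.1, 0.0]
mean_profile = mix_profiles(0.25, c_fast, c_slow)
d1, d2, label = classify(c_fast, c_fast, c_slow)
```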
|
|
Figure 4. RepliCorr analysis of the HOMARD experiments in the Xenopus in vitro system. (A) Workflow of the experiment using HOMARD: sperm nuclei were incubated in egg extracts in the presence of AF647-aha-dUTP; reactions were stopped in early (35 min) and late (120 min) S phase; and DNA was isolated, stained with Yoyo-1 and separated in the Irys system. Example DNA fiber imaged with the Bionano Irys system in early S phase (35 min): whole DNA stained with Yoyo-1; small replication tracks (= initiations) labeled directly by AF647-aha-dUTP. Size bar, 20 kb. (B) Normalized correlation coefficients (ρ1, ρ2) between the fibers’ C(r, f) and C1(r, f) and C2(r, f) were calculated. The similarity distance between the fiber and each process was defined as 1 − ρ1 for the fast process and 1 − ρ2 for the slow process and represented on a two orthogonal axis plot. The diagonal represents points of equal similarity to the two processes.
|
|
Figure 5. Strong modification of the replication patterns after Plk1 depletion but less after Chk1 inhibition, overexpression or Rif1 depletion. (A) An outline of experimental workflow: sperm nuclei were incubated in Chk1 inhibited or Chk1 overexpressed egg extracts or Rif1 or Plk1 immunodepleted egg extracts in the presence of biotin-dUTP. Genomic DNA was isolated at different times during the S phase, subjected to combing analysis and further analyzed by RepliCorr. Normalized correlation coefficients (ρ1, ρ2) between the fiber’s C(r, f) and C1(r, f) and C2(r, f) were calculated. The similarity distance between the fiber and each process was defined as 1 − ρ1 for the fast process and 1 − ρ2 for the slow process and represented on a two orthogonal axis plot. The diagonal represents points of equal similarity to the two processes. (B) Chk1 inhibition by UCN-01 (independent experiments n = 2). (C) Chk1 overexpression (n = 2). (D) Rif1 depletion (n = 2) and (E) Plk1 depletion (n = 3).
|
|
Figure 6. Plk1 depletion strongly increases fork speed and decreases initiation frequency in the slow process. (A) Comparison of initiation frequency and fork speed for the fast and slow replication process after Plk1 or Rif1 depletion and Chk1 inhibition or overexpression. Mean fold change of experiment/control for initiation frequencies per replicated fractions and fork speeds per experiment/control are shown for each replication mode, values from Supplementary Figure S10 and Supplementary Table S3, respectively. (B) Model of a Plk1-dependent regulation of the spatial replication program by a fast and slow replication process in Xenopus: in the presence of Plk1, two different replication patterns on DNA molecules can be distinguished, characterized by different fork speeds and initiation rates, leading to a non-uniform pattern of origin activation. Upon Plk1 depletion, origin activation along the genome becomes more homogeneous; the slow replication mode approaches the fast mode.
|
|
Fig. 1. Nucleation and growth in one dimension. The stable domains that
grow from multiple nucleation sites are unions of triangles in the spacetime representation; the resulting region is highlighted in grey.
|
|
Fig. 2. Representation of the one dimensional causal cone for the point
(x, t).
|
|
Fig. 3. Representation of the causal cones for two uncorrelated points.
|
|
Fig. 4. Representation of the causal cones for two correlated points. The
region of overlap (in red) defines the degree of correlation between the
two points.
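The causal-cone picture of Figs. 2 to 4 can be turned into numbers under the simplest assumptions: a constant nucleation rate I per unit length and time, a constant growth speed v, and two points observed at the same time t. A point is still untransformed only if no nucleation occurred inside its causal cone, a triangle of area v t²; for two points a distance r apart, the relevant area is the union of the two cones. The function names and parameter values below are illustrative.

```python
import math


def cone_area(t, v):
    """Area of the causal cone of (x, t): a triangle of base 2*v*t and height t."""
    return v * t * t


def overlap_area(r, t, v):
    """Area shared by the causal cones of two points at the same time t, a distance r apart."""
    width = 2.0 * v * t - abs(r)      # width of the overlap at time 0
    if width <= 0:
        return 0.0                    # cones do not intersect: uncorrelated points
    return width * width / (4.0 * v)  # triangle of base `width` and height width / (2*v)


def untransformed_prob(t, rate, v):
    """Probability that a point is still untransformed at time t."""
    return math.exp(-rate * cone_area(t, v))


def joint_untransformed_prob(r, t, rate, v):
    """Probability that two points a distance r apart are both untransformed at time t."""
    union = 2.0 * cone_area(t, v) - overlap_area(r, t, v)
    return math.exp(-rate * union)


# Illustrative parameters: rate in 1/(length*time), speed in length/time.
rate, v, t = 0.03, 1.0, 5.0
transformed_fraction = 1.0 - untransformed_prob(t, rate, v)
```

When r is at least 2vt the overlap vanishes and the joint probability factorizes into the product of the single-point probabilities, which is the uncorrelated case of Fig. 3.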
|
|
Supplementary Figure S1: Clustering evaluation by the average of silhouette values for the first
control set. Each image corresponds to a different interval of replicated fractions from 0 to 75% (Bin 1-
7) and the averaged replicated fraction is reported on the top. The fibers in each interval of replicated
fraction were grouped into two to five clusters and the average silhouette value was calculated for each
configuration. The silhouette value for a single fiber in a cluster measures how similar its correlation
function is to the correlation functions of other fibers within the same cluster, compared to fibers in
different clusters. The silhouette value ranges from -1 to +1. A value close to 1 indicates that the fiber's
correlation function is much more similar to those within its own cluster than to those in other clusters.
The average silhouette value across all fibers is then computed to assess the overall quality of the
clustering. Mean C(r, f ) profiles for molecules hierarchically classified into two clusters are reported in
Fig. 2C.
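The silhouette value described above can be sketched directly from a distance matrix between fibers and a list of cluster labels. This is a minimal version of the standard definition, s = (b − a) / max(a, b), where a is the mean distance to the other fibers of the same cluster and b is the smallest mean distance to any other cluster; it assumes at least two clusters. The toy distances below are illustrative.

```python
def silhouette_values(dist, labels):
    """Silhouette value of each item given a square distance matrix and cluster labels."""
    n = len(labels)
    clusters = sorted(set(labels))
    values = []
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            values.append(0.0)  # usual convention for a cluster with a single member
            continue
        a = sum(dist[i][j] for j in own) / len(own)
        b = min(
            sum(dist[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scale = max(a, b)
        values.append((b - a) / scale if scale > 0 else 0.0)
    return values


def average_silhouette(dist, labels):
    """Mean silhouette value, used to compare clustering configurations."""
    values = silhouette_values(dist, labels)
    return sum(values) / len(values)


# Toy example: four items at positions 0, 1, 10 and 11 on a line.
positions = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in positions] for p in positions]
good = average_silhouette(dist, [0, 0, 1, 1])
bad = average_silhouette(dist, [0, 1, 0, 1])
```

Repeating the computation for two to five clusters and keeping the configuration with the highest average is the selection logic the caption describes.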
|
|
Supplementary Figure S2: Histogram reporting the fraction of fibers in each cluster for the first
control set. Each image corresponds to a different interval of replicated fractions from 0 to 75% (Bin 1-
7) and the averaged replicated fraction is reported on the top. Mean C(r, f ) profiles for molecules
hierarchically classified in the two represented clusters are reported in Fig. 2C.
|
|
Supplementary Figure S3: Comparison between fits with two processes for correlation
profiles with different or same I0,a and v. (A) Fit with different v and same I0 for the two processes. (B) Fit with same v and
different I0 for the two processes. (C) Table of fitted parameter values from A and B compared to
the parameters from Figure 3A, with Ttot being the time in min to replicate a fiber to 75%. Values
that differ from the fit with two different I0 and v are highlighted in red. The parameter values of
fork speed and initiation frequency were averaged over 100 trials.
|
|
Supplementary Figure S4: Representative computer reconstructed fibers based on real fibers for the fast and
slow process at the different replicated fractions for the control condition.
|
|
Supplementary Figure S5: RepliCorr analysis per individual time point from three different experiments:
Fibers from individual in vitro time points with similar mean replication extent compared to Figure 3B were
analysed as in Figure 3B. Normalized correlation coefficients (ρ1, ρ2) between the molecule's C(r,f) and C1(r,f)
and C2(r,f) were calculated. The similarity distance between the molecule and each process was defined as
1 − ρ1 for the fast process 1 and 1 − ρ2 for the slow process and represented on a two orthogonal axis plot. The red
diagonal represents points of equal similarity to the two processes. Points above the diagonal are more similar
to the fast process and points below are more similar to the slow process. Fibers in three different bins of
replication within each time point are shown in blue (0.5-25%), black (25-50%) and orange (50-75%). (45 min
Replicate 1 (0.07), 50 min Replicate 2 (0.11), 55 min Replicate 1 (0.26), 60 min Replicate 2 (0.3), 70 min
Replicate 3 (0.44), 85 min Replicate 3 (0.67), 100 min Replicate 3 (0.8)).
|
|
Supplementary Figure S6: Control 32P-dATP incorporation kinetics for the HOMARD experiment: Sperm
nuclei (1320 nuclei/µl) were incubated in the same egg extract as in the HOMARD experiment in the presence
of 32P-dATP and AF647-dUTP; reactions were stopped, and DNA was precipitated as described (DeCarli et al., 2017)
and quantified as % of DNA synthesized per input DNA. Time points for the two equivalent incubation times (35,
120 min) analysed in HOMARD experiments are indicated.
|
|
Supplementary Figure S7: Mean autocorrelation function C(r,f) profiles (blue curve, with standard deviation) with fit
(red curve) as in Figure 3A for different pathway perturbations and corresponding controls. The green curve is the
correlation profile produced by the fast fork process (C1(r,f), model 1), and the black curve is the correlation profile
produced by the slow fork process (C2(r,f), model 2). (A) Chk1 inhibition by UCN-01 (n=2). (B) Chk1 overexpression
(n=2). (C) Rif1 depletion (n=2). (D) Plk1 depletion (n=3).
|
|
Supplementary Figure S8: Similarity distances of the fast and the slow process from control experiments
from Figure 5. Normalized correlation coefficients (ϱ1, ϱ2) between the fiber’s C(r,f) and C1(r,f) and C2(r,f) were
calculated for control conditions in Figure 5 B-D. The similarity distance between the fiber and each process was
defined as 1 − ϱ1 for the fast process and 1 − ϱ2 for the slow process and represented on a two orthogonal axis plot. The red
diagonal represents points of equal similarity to the two processes. (A) +DMSO as control condition for Chk1
inhibition by UCN-01. (B) Protein buffer addition as control for Chk1 overexpression. (C) Control depletion for the Rif1
experiment.
|
|
Supplementary Figure S9: Comparative partitions of the replication process between slow and fast
process for each replicated fraction under different conditions from Figure 5. (A) Principle of the
analysis: if the two processes have an equal influence on the replication of a fiber, then the fiber is
represented by a point M that lies on the identity line of equation yM = xM. In other words, if the angle alpha
between the x-axis and the radial line OM, connecting the origin (0,0) and the point M, is π/4, then the fast
and slow processes influence the observed replication pattern with equal probability. To measure how each
fiber deviates from this equiprobable scenario, we construct the variable zeta = π/4 − arctan(yM/xM),
which measures how far the radial line OM deviates from the identity line. To represent the distribution of
observed zeta values for a given sample and a specified bin of replicated fraction, we use kernel
density estimation to estimate the probability density function of zeta from the
sample data. (B) We perform this analysis for all replication fraction bins and all conditions. Positive zeta
values correspond to fibers where the slow process mainly drives the replication patterns, while negative zeta
values reflect the preferential influence of the fast process. Plk1 depletion is the only
condition where both processes are present at all replication fractions. Moreover, in the last replication bin,
both peaks approach 0 after Plk1 depletion but not, or less so, in the other conditions.
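The zeta construction of panel A can be sketched as follows, assuming the plot coordinates are xM = 1 − ρ1 (distance to the fast process) and yM = 1 − ρ2 (distance to the slow process), so that a positive zeta lies below the diagonal, on the slow side. The caption does not state the kernel or bandwidth of the density estimate; a Gaussian kernel with a hand-picked bandwidth is used here purely for illustration, and all names and sample points are hypothetical.

```python
import math


def zeta(x_m, y_m):
    """zeta = pi/4 - arctan(y_m / x_m); atan2 gives the same angle for x_m, y_m >= 0
    and avoids a division by zero when x_m == 0."""
    return math.pi / 4.0 - math.atan2(y_m, x_m)


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of `samples` with a Gaussian kernel."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(z):
        return norm * sum(math.exp(-0.5 * ((z - s) / bandwidth) ** 2) for s in samples)

    return density


# Illustrative points (x_m, y_m) = (1 - rho1, 1 - rho2) for a few fibers.
points = [(0.4, 0.1), (0.5, 0.2), (0.1, 0.4), (0.3, 0.3)]
zetas = [zeta(x, y) for x, y in points]
density = gaussian_kde(zetas, bandwidth=0.1)

# Evaluate the density on a grid covering the possible range of zeta, with some margin.
grid = [-1.5 + 0.01 * k for k in range(301)]
curve = [density(z) for z in grid]
```

A bimodal curve with one peak at negative and one at positive zeta would correspond to a bin in which both processes are present.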
|
|
Supplementary Figure S10: Fits for correlation functions C(r,f) and associated initiation rates for fast
process (process 1, green curves) and slow process (process 2, black curves) from experiments of (A)
Plk1 depletion and its control. (B) Chk1 inhibition by UCN-01 and control. (C) Chk1 overexpression and control. (D)
Rif1 depletion and control.
|
|
Supplementary Figure S11: Increase of the time period to reach near 100% replication after Plk1
depletion: (A) Sperm nuclei were incubated in control or Plk1-depleted egg extracts in the presence of
α32P-dATP; reactions were stopped at the indicated times, and DNA was purified and quantified as described in
Ciardo et al., 2020. In order to compare different independent experiments, each experimental curve was
independently fitted to a logistic function $\frac{1}{1+\exp(-b(x-c))}$ and normalised to the inferred maximum of
incorporation for each experimental condition. The time period in min until 99% of incorporation was
determined. A shift of 13 min in this period was found in the representative experiment shown. Similar
calculated shifts were obtained at 95% or 50% of max. incorporation. (B) Mean ratio of Plk1-depletion to control-depletion
time periods from 4 independent experiments (black points), one-sample t-test, two-tailed, p=0.040. In the
absence of any detectable difference in S phase entry (Ciardo et al., 2020), this observation suggests an
increase in S phase length.
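Once a normalised curve has been fitted to the logistic function 1/(1+exp(−b(x−c))), the time needed to reach a given fraction p of the maximum follows by inverting it: x = c + ln(p/(1−p))/b. The sketch below only illustrates this inversion; the parameter values are hypothetical and are not the fitted values of the experiments.

```python
import math


def logistic(x, b, c):
    """Normalised incorporation at time x for slope b and midpoint c."""
    return 1.0 / (1.0 + math.exp(-b * (x - c)))


def time_to_fraction(p, b, c):
    """Time at which the logistic curve reaches fraction p (0 < p < 1) of its maximum."""
    return c + math.log(p / (1.0 - p)) / b


# Hypothetical fitted parameters (b in 1/min, c in min) for two conditions.
control = {"b": 0.10, "c": 60.0}
depleted = {"b": 0.08, "c": 62.0}

t99_control = time_to_fraction(0.99, **control)
t99_depleted = time_to_fraction(0.99, **depleted)
shift = t99_depleted - t99_control   # delay in reaching 99% incorporation
ratio = t99_depleted / t99_control   # per-experiment ratio, as plotted in panel B
```

The same function evaluated at p = 0.95 or p = 0.5 gives the shifts at the other thresholds mentioned in the caption.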
|
|
Supplementary Figure S12: Application of RepliCorr to DNA combing data from S. cerevisiae: (A) Outline of
experimental procedure as described in Ma et al., 2012. Budding yeast cells were synchronised with α-factor and
released in the presence of BrdU for 35 min and 105 min, then chased with thymidine; DNA combing analysis was
performed as described.
(B) Mean initiation frequencies (blue points with standard deviation) and autocorrelation function C(r,f) profiles
(blue curve, with standard deviation) with fit (red curve), modeled by either considering two processes or (C) a
single process. The fitness function was defined as the normalised mean square error (NMSE):
$\mathrm{NMSE} = \sum (y - y_{ref})^2 / \sum (y - \bar{y}_{ref})^2$. Values close to 0 indicate excellent fits. (D) Values from fits to autocorrelation
profiles. An F-test was used to test whether the second process significantly improves the fit. The test statistic
was calculated as $F = \frac{df_2 (\chi_1 - \chi_2)}{df_1 \chi_2}$, where $df_1$ indicates the difference in degrees of freedom
between the two models, $df_2$ the degrees of freedom of the model with two processes, and $\chi_1$ and $\chi_2$ the sums of
squares of misfits for the two models. Small p-values indicate that the addition of the second process
significantly improves the fit. (E) Normalized correlation coefficients (ρ1, ρ2) between the molecule's C(r,f) and
C1(r,f) and C2(r,f) were calculated. The similarity distance between the molecule and each process was
defined as 1 − ρ1 for the fast process 1 and 1 − ρ2 for the slow process 2 and represented on a two orthogonal
axis plot. The red diagonal represents points of equal similarity to the two processes. Points above the diagonal
are more similar to the fast process and points below are more similar to the slow process. (F) Examples of
reconstructed fibers for both processes at one replicated fraction.
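The two fit-quality measures of panels B to D can be sketched as below. The NMSE follows the formula as it appears in the caption, under the assumption that y is the fitted profile and y_ref the measured one; the F statistic is df2(χ1 − χ2)/(df1 χ2), with χ1 and χ2 the sums of squared misfits of the one-process and two-process fits. Function names and numbers are illustrative.

```python
def nmse(y, y_ref):
    """Normalised mean square error: sum((y - y_ref)^2) / sum((y - mean(y_ref))^2)."""
    mean_ref = sum(y_ref) / len(y_ref)
    numerator = sum((a - b) ** 2 for a, b in zip(y, y_ref))
    denominator = sum((a - mean_ref) ** 2 for a in y)
    return numerator / denominator


def f_statistic(chi1, chi2, df1, df2):
    """F = df2 * (chi1 - chi2) / (df1 * chi2).

    chi1, chi2: sums of squared misfits of the simpler and the richer model.
    df1: difference in degrees of freedom between the two models.
    df2: degrees of freedom of the richer (two-process) model.
    """
    return df2 * (chi1 - chi2) / (df1 * chi2)


# Illustrative values only.
measured = [1.0, 0.6, 0.3, 0.1, 0.0]
fitted = [0.98, 0.62, 0.28, 0.12, 0.01]
fit_quality = nmse(fitted, measured)
f_value = f_statistic(chi1=10.0, chi2=5.0, df1=2, df2=20)
```

Converting F into a p-value requires the survival function of the F distribution with (df1, df2) degrees of freedom, for instance `scipy.stats.f.sf(f_value, df1, df2)` if SciPy is available.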
|