Tolonen M et al. (2026), Single-cell morphodynamics predict cell fate de...

XB-ART-61839

Mol Syst Biol 2026 May 11; doi: 10.1038/s44320-026-00212-x.


Single-cell morphodynamics predict cell fate decisions during mucociliary epithelial differentiation.

Tolonen M, Xu Z, Beker O, Kapoor V, Dumitrascu B, Sedzinski J.

Abstract
Cell state transitions underlie the emergence of diverse cell types and are traditionally defined by changes in gene expression. Yet these transitions also involve coordinated shifts in cell morphology and behavior, which remain poorly characterized in densely packed epithelia. We developed a quantitative live-imaging and computational framework to track thousands of individual cells over time in the rapidly differentiating Xenopus mucociliary epithelium (MCE). From segmentations and trajectories, we extracted dynamic features (cell and nuclear shape, movement, and position) to create a time-resolved morphodynamic dataset spanning the full course of differentiation. While single features showed high noise and low separability of ground-truth cell types, supervised machine learning revealed that integrating time-resolved features improves the prediction of final cell fate. Gradient-boosted trees and multinomial logistic regression achieved moderate but consistent accuracy, especially for abundant epithelial lineages. Key discriminants included normalized Z position, membrane-nucleus offset, and absolute experimental time, whereas movement contributed minimally to the results. Our data show that morphodynamic signatures encode predictive information about cell identity and provide a framework linking cellular dynamics with molecular state.

PubMed ID: 42115437
Article link: Mol Syst Biol

Species referenced: Xenopus laevis
GO keywords: epidermal cell fate specification
Antibodies: Tp63 Ab2, Tuba1b Ab3, Tuba4b Ab4


	Figure 1. Experimental overview of cell fate prediction. (A) Animal caps are cut from Xenopus embryos at NF stages 8–9. (B) The prospective MCE undergoes morphogenetic shaping and cell fate decisions during development into differentiated tissue. (C) Apical surface of immunostained differentiated tissue at NF stage 32. α-tubulin (green) labels multiciliated cells (MCC), and lectin (orange) labels goblet cells and small secretory cells (SSC, granular staining). Ionocytes (IC) are distinguished by their lack of labeling. Arrowheads and labels mark single cells of each cell type. Scale bar: 50 µm. (D) Nuclei and membrane labeling (H2B-RFP and mem-mNeonGreen, respectively) of developing tissue at 0, 8, and 22 h. Overall image scale bar: 300 µm, zoomed-in scale bar: 50 µm. (E) Image analysis pipeline from live image acquisition to cell fate prediction. Membrane and nucleus objects are obtained via 3D segmentation, and cell shapes and movement are quantified. Single-cell features are then used to classify cell fate using ground-truth data from cell immunolabeling.
	Figure 2. Segmentation and tracking overview. (A, B) Nuclei and membrane stacks (raw for nuclei, CARE denoised for membrane) are used as inputs for StarDist (nuclei) and Cellpose (membrane). For Cellpose, segmentation is performed slicewise, and 3D objects are reconstructed based on maximal slice-to-slice overlap (IoU). Close-up panels are single z slices (A). (C) Segmented nuclei and membrane object labels. (D) Cell trajectories are generated by tracking nuclei objects, which are then assigned membrane objects based on the closest object centroid matching. Tracks are initially generated with TrackMate, using StarDist-labeled objects as input, and trajectory branching points are imposed at cells classified as dividing by Oneat using the Fiji plugin TrackMate-Oneat. (E) Resulting cell trajectories at 0, 10, and 20 h. Scale bars for (A, C, E): overall image 300 µm, close-up 50 µm.
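The slice-to-slice linking described in Figure 2 (A, B) can be illustrated with a minimal sketch. It assumes each z slice is a 2D label image given as nested lists with 0 as background, and it links every label in one slice to the label in the previous slice with the highest intersection over union (IoU). The function name `link_slices` and the toy slices are hypothetical; the authors' actual reconstruction code is not shown in this record and may differ.

```python
from collections import Counter


def link_slices(prev, curr):
    """Map each label in `curr` to the `prev` label with the highest IoU.

    `prev` and `curr` are 2D label images (lists of rows of ints, 0 = background).
    Returns {curr_label: (prev_label or None, iou)}.
    """
    area_prev, area_curr, overlap = Counter(), Counter(), Counter()
    for row_p, row_c in zip(prev, curr):
        for p, c in zip(row_p, row_c):
            if p:
                area_prev[p] += 1
            if c:
                area_curr[c] += 1
            if p and c:
                overlap[(p, c)] += 1

    # Labels without any overlapping partner stay unlinked (None).
    best = {c: (None, 0.0) for c in area_curr}
    for (p, c), inter in overlap.items():
        iou = inter / (area_prev[p] + area_curr[c] - inter)
        if iou > best[c][1]:
            best[c] = (p, iou)
    return best


# Toy example (illustrative values only).
prev_slice = [[1, 1, 0, 2],
              [1, 1, 0, 2]]
next_slice = [[0, 5, 5, 6],
              [0, 5, 5, 6]]
links = link_slices(prev_slice, next_slice)
# Label 6 links to label 2 with IoU 1.0; label 5 links to label 1 with IoU 1/3.
```

Chaining these links through consecutive slices yields one 3D object per connected chain. A practical version would likely also apply a minimum IoU threshold, which is left out here.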
	Figure 3. Principal component analysis of single-cell features. (A) Each constructed trajectory in time consists of single cells with shape, nucleus shape, movement, and positional features. Scale bar: 300 µm. (B, C) Membrane and nucleus shapes of example trajectories at 8 and 16 h, with the corresponding point cloud to 16 h. 3D shape measurements of both cell membranes and nuclei are derived from point clouds. For membranes, additional 2D shape descriptor features are calculated from the object center slice. (D) 1st and 2nd principal components of the membrane shape, nucleus shape, movement, and all-features feature spaces. (E) Feature correlations with principal components. For all features, only the top five correlating features for PC1 and PC2 are shown. (F) Selected feature trends across developmental time for all cells. (G) Spearman rank correlations of each feature with time, calculated per trajectory (n = 4495 from Dataset 2); a boxplot shows the distribution of correlations. Box center line shows the 50th percentile of the data. Box bounds are set at the 25th percentile (Q1, lower bound) and the 75th percentile (Q3, upper bound). Whiskers extend to Q1 − 1.5 × IQR and Q3 + 1.5 × IQR (IQR, interquartile range). Outliers beyond the whiskers are not shown. Asterisks denote features with significant nonzero correlation after FDR correction (***P < 0.001) (Wilcoxon signed-rank test across trackwise correlations).
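The per-trajectory statistics in Figure 3 (G) can be sketched as follows, assuming each track is a pair of equal-length lists (timepoints and one feature). The sketch computes Spearman's rho as the Pearson correlation of average ranks and derives the whisker limits from the quartiles. The helper names, the toy tracks, and the choice of the "inclusive" quantile method are assumptions for illustration; the Wilcoxon signed-rank test and FDR correction mentioned in the caption are not reproduced here.

```python
from statistics import mean, quantiles


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman(x, y):
    """Spearman rho: Pearson correlation of the rank-transformed values.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def whisker_limits(values):
    """Return (Q1 - 1.5 * IQR, Q3 + 1.5 * IQR)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Hypothetical tracks: (timepoints in h, feature values).
tracks = {
    "track_a": ([0, 1, 2, 3], [10.0, 12.0, 15.0, 19.0]),
    "track_b": ([0, 1, 2, 3], [5.0, 4.0, 4.0, 1.0]),
}
rho_per_track = {name: spearman(t, f) for name, (t, f) in tracks.items()}
limits = whisker_limits([1, 2, 3, 4, 5])
```

One correlation per track gives the distribution that the boxplot summarizes; the whisker limits are then computed on that distribution.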
	Figure 4. Backtracked cell types’ feature distributions. (A, B) Live-imaged cells are fixed at the end of the experiment and immunolabeled for MCE cell types (A). Cell fate is assigned to a track based on manual annotation of the final timepoint cell type, determined by cell type-specific immunolabeling pattern (B, cyan: Tp63, yellow: lectin PNA, green: α-tubulin). Arrowheads show example positions of manually annotated cells. Scale bar: 30 µm. (C) Tracks with cell fate labeling, overlaid against the last timepoint. Scale bar: 300 µm. (D) Mean and s.d. of selected features, colored by cell type. (E) Membrane shape of randomly selected cells, one per cell type, at 1 h, 4 h, 8 h, 16 h, and 22 h. (F) Single-cell features of randomly selected cells across time, one per cell type.
	Figure 5. Classifying cell fate using XGBoost and logistic regression. (A) Feature vectors of single cells are used as independent features, and cell fate as the dependent feature, for training a classifier model based on either XGBoost or a multinomial (softmax) logistic regressor. Then, each cell in the test dataset is assigned a cell fate based on its feature vector. (B, C) Confusion matrices for the test dataset, cell-fate prediction vs. ground truth, for the XGBoost and logistic regression models, respectively. (D) Example test dataset trajectories showing single-cell predictions per track. (E, F) Percentage-point accuracy difference from the baseline accuracy of the XGBoost model when leaving out a feature category (E) or a single feature (F, only the top feature per feature category is shown). (G) Feature coefficients for a logistic regressor model; only the top feature per category is shown. All calculations in (B, C, E, F, G) are averaged over 20 iterations of a randomized test–train split, model training, and prediction.
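The leave-one-category-out comparison in Figure 5 (E) can be sketched as below: for each of 20 randomized train–test splits, a model is scored with all features and again with one feature category removed, and the accuracy change is reported in percentage points. To stay dependency-free, this sketch swaps in a simple nearest-centroid classifier in place of XGBoost, so it shows only the ablation procedure, not the authors' model. The function names, the category names, the 30% test fraction, and the synthetic data are all assumptions.

```python
import random
from statistics import mean


def centroid_accuracy(train, test, cols):
    """Accuracy of a nearest-centroid classifier using only feature columns `cols`."""
    groups = {}
    for x, y in train:
        groups.setdefault(y, []).append([x[c] for c in cols])
    centroids = {y: [mean(col) for col in zip(*rows)] for y, rows in groups.items()}

    def predict(x):
        v = [x[c] for c in cols]
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(v, centroids[y])))

    return mean(1.0 if predict(x) == y else 0.0 for x, y in test)


def category_ablation(data, categories, n_iter=20, test_frac=0.3, seed=0):
    """Mean accuracy change (percentage points) when one category is left out.

    `data` is a list of (feature_tuple, label); `categories` maps a category
    name to its column indices. Assumes at least two categories.
    """
    rng = random.Random(seed)
    all_cols = [c for cols in categories.values() for c in cols]
    diffs = {name: [] for name in categories}
    for _ in range(n_iter):
        shuffled = data[:]
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * test_frac)
        test, train = shuffled[:n_test], shuffled[n_test:]
        baseline = centroid_accuracy(train, test, all_cols)
        for name, cols in categories.items():
            kept = [c for c in all_cols if c not in cols]
            reduced = centroid_accuracy(train, test, kept)
            diffs[name].append(100 * (reduced - baseline))
    return {name: mean(vals) for name, vals in diffs.items()}


# Synthetic toy data: column 0 separates the two classes, column 1 is noise.
data_rng = random.Random(1)
data = [((10.0 * label, data_rng.random()), label)
        for label in (0, 1) for _ in range(50)]
drops = category_ablation(data, {"informative": [0], "noise": [1]})
```

A negative value means accuracy falls when the category is removed, so the most negative categories are the most important ones under this measure.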
	Figure EV1. Supplemental data for principal component analysis. (A) PCA feature spaces of dataset 1. (B) Individual cell trajectories in feature spaces fit for dataset 2. (C) Timewise positional features and individual trajectories’ positional features for dataset 2. (D) All features’ correlation with all feature PC1 and PC2.
	Figure EV2. Backtracked cell types in PCA feature space. (A) Separability metrics for feature categories. Mean accuracy score (KNN accuracy) based on KNeighborsClassifier (scikit-learn) of different feature categories, where k = 5; Adjusted Rand Index score (KMeans ARI) and Normalized Mutual Information score (KMeans NMI) for comparing k-means clustering of selected feature spaces (corresponding to (B)) with ground-truth cell type labels. Average silhouette width (ASW) is calculated for different feature categories against ground-truth cell-type labels. (B) Cell types labeled in feature spaces corresponding to Fig. 3D and Appendix Fig. S2C. (C) Kernel density estimation of time-windowed cells in the all-features PCA space, estimated per cell type.
	Figure EV3. Comparison of high-resolution imaging and live imaging of the membrane signal and feature space of an extended shape feature set. (A) Top: Representative example of an animal cap region showing apical and basal Z-projections and an orthogonal XZ slice from the reconstructed high-resolution fixed-tissue dataset, generated by registration and combination of apical and basal image stacks. Bottom: Representative example of apical and basal Z-projections and an orthogonal XZ slice from the final timepoint of a live-imaging experiment (Dataset 2). Scale bar: 30 µm. (B) Schematic of the extended membrane feature set used for the comparison of hand-annotated cells with the live-imaging final timepoint, including 2D membrane descriptors extracted from apical, central, and basal slices and 3D point-cloud–based descriptors for membrane and nucleus shapes. (C) Left: PCA of nuclear shape features from the same manually annotated fixed dataset. Right: PCA of membrane shape features derived from manually annotated high-resolution fixed samples. (D) PCA of membrane shape features derived from automatically segmented cells at the final timepoint of live imaging (Dataset 2). (E, F) Correlation matrices of PCA loadings for membrane features from the fixed high-resolution dataset (E) and live-imaging final timepoint dataset (F).
	Figure EV4. Classifier model performance analysis and feature importances. (A) Classifier training dataset sample counts across dataset time, binned every 2 h, mean across n = 20 independent iterations of randomized train–test sampling. Error bars represent the standard deviation (±s.d.) across iterations. For basal, goblet, and MCC classes, only real samples were used; for IC and SSC classes, some samples were synthesized using SMOTE oversampling (imbalanced-learn). (B) Mean and s.d. of the prediction confidence for the predicted class of XGBoost-based predictions, colored by class. (C) Time-wise accuracy confusion matrix of XGBoost-based predictions. (D) Percentage-point accuracy difference from the baseline accuracy of the XGBoost model when leaving out a single feature. (E) Feature coefficients for a logistic regressor model. All calculations in (B–E) are averaged over 20 iterations of a randomized test–train split, model training, and prediction.
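The SMOTE oversampling mentioned in Figure EV4 (A) creates synthetic minority-class samples by interpolating between a real sample and one of its nearest minority-class neighbors. The sketch below illustrates that core idea in plain Python; it is not the imbalanced-learn implementation used by the authors, and the function name `smote_sketch`, its parameters, and the toy points are hypothetical.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create `n_new` synthetic points from `minority` (list of feature tuples).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbors.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        other = rng.choice(neighbors)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


# Toy minority class with two features (illustrative values only).
minority_cells = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_cells = smote_sketch(minority_cells, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two real samples, it stays inside the range spanned by the minority class in each feature.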
	Figure EV5. Effect of global versus local feature normalization on XGBoost model performance and feature importance. (A) Schematic depicting the compared normalization strategies. Left: In global normalization, cell features are standardized across all cells and all experimental timepoints. Right: In the local normalization strategy, the k = 15 nearest neighbors of a cell at the same experimental timepoint are used for min–max normalizing feature values. (B) Confusion matrix showing the difference in XGBoost prediction accuracy between the local and global normalization strategies. (C, D) Change in overall accuracy, balanced accuracy, and per cell-type accuracy upon removal of individual features under the global normalization strategy (C) or the local normalization strategy (D). The five highest- and lowest-accuracy features are shown in each heatmap. (E, F) Change in overall accuracy, balanced accuracy, and per cell-type accuracy upon removal of entire feature categories (position, membrane shape, nucleus shape, movement, neighbor feature distance) under the global normalization strategy (E) or the local normalization strategy (F).
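The two normalization strategies compared in Figure EV5 (A) can be contrasted with a short sketch. Global normalization z-scores a feature over all cells and timepoints, while local normalization min–max scales each cell against its k nearest neighbors at the same timepoint. The dictionary keys "t", "xy", and "value", the use of 2D positions for the neighbor search, the inclusion of the cell itself in its neighborhood, and the fallback value 0.0 for a constant neighborhood are assumptions not stated in the caption.

```python
from statistics import mean, pstdev


def global_standardize(values):
    """Z-score one feature across all cells and all timepoints."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def local_minmax(cells, k=15):
    """Min-max scale each cell's feature within its local neighborhood.

    `cells` is a list of dicts with hypothetical keys "t" (timepoint),
    "xy" (position), and "value" (feature). The neighborhood is the cell
    plus its k nearest neighbors at the same timepoint (an assumption).
    """
    out = []
    for c in cells:
        same_t = [o for o in cells if o["t"] == c["t"] and o is not c]
        same_t.sort(key=lambda o: (o["xy"][0] - c["xy"][0]) ** 2
                    + (o["xy"][1] - c["xy"][1]) ** 2)
        hood = [o["value"] for o in same_t[:k]] + [c["value"]]
        lo, hi = min(hood), max(hood)
        out.append(0.0 if hi == lo else (c["value"] - lo) / (hi - lo))
    return out


# Toy cells (illustrative values only); k=2 keeps the example small.
cells = [
    {"t": 0, "xy": (0, 0), "value": 1.0},
    {"t": 0, "xy": (1, 0), "value": 2.0},
    {"t": 0, "xy": (2, 0), "value": 3.0},
    {"t": 0, "xy": (10, 0), "value": 10.0},
    {"t": 1, "xy": (0, 0), "value": 100.0},
]
local_values = local_minmax(cells, k=2)
global_values = global_standardize([c["value"] for c in cells])
```

Local scaling expresses each value relative to nearby cells at the same time, which removes tissue-wide trends over time that global standardization retains.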