XB-ART-57328

Sci Rep 2020 Sep 04;10(1):14662. doi: 10.1038/s41598-020-71412-0.

Maximizing CRISPR/Cas9 phenotype penetrance applying predictive modeling of editing outcomes in Xenopus and zebrafish embryos.

Naert T , Tulkens D , Edwards NA , Carron M , Shaidani NI , Wlizla M , Boel A , Demuynck S , Horb ME , Coucke P , Willaert A , Zorn AM .

Abstract
CRISPR/Cas9 genome editing has revolutionized functional genomics in vertebrates. However, CRISPR/Cas9 edited F0 animals too often demonstrate variable phenotypic penetrance due to the mosaic nature of editing outcomes after double strand break (DSB) repair. Even with high efficiency levels of genome editing, phenotypes may be obscured by proportional presence of in-frame mutations that still produce functional protein. Recently, studies in cell culture systems have shown that the nature of CRISPR/Cas9-mediated mutations can be dependent on local sequence context and can be predicted by computational methods. Here, we demonstrate that similar approaches can be used to forecast CRISPR/Cas9 gene editing outcomes in Xenopus tropicalis, Xenopus laevis, and zebrafish. We show that a publicly available neural network previously trained in mouse embryonic stem cell cultures (InDelphi-mESC) is able to accurately predict CRISPR/Cas9 gene editing outcomes in early vertebrate embryos. Our observations can have direct implications for experiment design, allowing the selection of guide RNAs with predicted repair outcome signatures enriched towards frameshift mutations, allowing maximization of CRISPR/Cas9 phenotype penetrance in the F0 generation.

PubMed ID: 32887910
PMC ID: PMC7473854

Species referenced: Xenopus tropicalis Xenopus laevis
Genes referenced: abca4 adam12 amer3 bmal1 ccn4 celsr2 dnm2 eya1 ezh2 fap fbxw7 fxr1 hmgcr hmmr itsn1 kdm6a mdk nf1 notch1 nuak1 parm1 pkd1 pkd2 prickle1 prph2 pten pycr1 rab3gap2 rspo2 slc16a2 smad6 sox2 stag2 suz12 tbx4 tpcn1 ush2a
gRNAs referenced: abca4 gRNA1 abca4 gRNA2 abca4 gRNA3 adam12 gRNA1 amer3 gRNA1 arntl gRNA1 ccn4 gRNA1 celsr2 gRNA1 celsr2 gRNA2 dnm2 gRNA1 dnm2 gRNA2 eya1 gRNA1 ezh2 gRNA1 ezh2 gRNA2 fap gRNA1 fbxw7 gRNA1 fxr1 gRNA1 hmgcr gRNA1 hmmr gRNA1 itsn1 gRNA1 itsn1 gRNA2 kdm6a gRNA1 mdk gRNA1 nf1 gRNA1 nf1 gRNA2 nog gRNA1 notch1 gRNA1 nuak1 gRNA1 parm1 gRNA1 pkd1 gRNA1 pkd2 gRNA1 prickle1 gRNA1 prph2 gRNA1 pten gRNA1 rab3gap2 gRNA1 rab3gap2 gRNA2 rspo2 gRNA1 slc16a2 gRNA1 smad6 gRNA1 smad6 gRNA2 sox2 gRNA1 stag2 gRNA1 stag2 gRNA2 suz12 gRNA1 tpcn1 gRNA1 tpcn1 gRNA2 tyr gRNA1 tyr gRNA2 tyr gRNA3 tyr gRNA4 tyr gRNA5 tyr gRNA6 ush2a gRNA2

Phenotypes: Xtr + tyr sgRNA CRISPR (Fig. S6)

	Figure 1. Theoretical models of how gRNA-specific efficiencies and frameshift gene editing outcome probabilities influence the cellular composition and percentage of protein knockout cells in a mosaic F0 animal model. (A) There is a non-linear relationship between the gRNA-specific probability of obtaining a frameshift gene editing outcome (x-axis) and the probability of obtaining a biallelic frameshift gene editing outcome in a single cell (y-axis). E.g., at a gRNA-specific frameshift frequency of 80%, the probability that a single biallelically edited cell is biallelic frameshift mutant is 64% (0.80 × 0.80; grey demarcation). (B) Examples of theoretical outcomes of gene editing (presuming 100% on-target efficiency) in an F0 mosaic, varying one parameter: the gRNA-specific probability of frameshift editing. (C) Examples of theoretical outcomes of gene editing in an F0 mosaic, varying two parameters: the gRNA-specific probability of frameshift editing and the gRNA-specific on-target efficiency. E.g., for a 100% efficient gRNA with an 80% gRNA-specific probability of frameshift editing, we expect 64% of the cells to be biallelic frameshift mutant (see grey demarcation in A). Note that blue circles represent cells that are biallelically gene edited but retain at least one in-frame mutation and cannot be considered complete protein knockouts. (D) Flowchart representing the pipeline for investigating the correlations between experimentally observed in vivo gene editing outcomes and gene editing outcomes projected by computational prediction models.
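The arithmetic behind panels A and C can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The function name is hypothetical, and treating on-target efficiency as an independent per-allele editing probability is an assumption made here, since the caption does not state how the two parameters are combined.

```python
def biallelic_frameshift_fraction(frameshift_prob: float,
                                  allele_edit_prob: float = 1.0) -> float:
    """Expected fraction of cells carrying a frameshift edit on both alleles.

    frameshift_prob:  gRNA-specific probability that an edited allele is a frameshift.
    allele_edit_prob: ASSUMPTION: on-target efficiency, modeled as an independent
                      probability that each allele is edited at all.
    """
    for name, value in (("frameshift_prob", frameshift_prob),
                        ("allele_edit_prob", allele_edit_prob)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1, got {value}")
    # Each allele must be edited AND frameshifted; the two alleles are independent.
    per_allele = allele_edit_prob * frameshift_prob
    return per_allele ** 2


# Sweep the frameshift probability at 100% efficiency (the panel A relationship).
for f in (0.5, 0.8, 0.9):
    print(f"frameshift {f:.0%} -> biallelic frameshift "
          f"{biallelic_frameshift_fraction(f):.0%}")
```

With a frameshift probability of 0.80 and full efficiency, the function reproduces the 64% quoted in the caption. Lowering `allele_edit_prob` shrinks the result quadratically under the independence assumption.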
	Figure 2. The InDelphi prediction model, trained in mESCs, accurately predicts CRISPR/Cas9 gene editing outcomes and outperforms several other prediction models in X. tropicalis embryos. (A) Scatter plot of model-predicted versus experimentally observed cumulative frameshift gene editing frequencies, for each sgRNA (n = 28) separately, in X. tropicalis embryos. Black demarcated lines show the perfect correlation r = 1. Light-grey areas show the standard error of the best-fit linear regression line. (B) Scatter plot of model-predicted versus experimentally observed INDEL patterns, for all gRNAs simultaneously. Black lines show linear regression models of all correlations. Black demarcated lines show the perfect correlation r = 1. (C) Correlations between model-predicted and experimentally observed INDEL patterns, for each gRNA separately. Error bars represent mean ± SD. (*p < 0.001; p < 0.01; *p < 0.05; ns = not significant; Shapiro–Wilk (p > 0.05); Levene (p < 0.05); one-way Welch ANOVA to adjust for unequal variances (p < 0.001), with Games-Howell multiple comparisons) (Table S2). (D) Violin plots of the residuals (predicted frequency − observed frequency) between the model-predicted and experimentally observed frequency of the +1 insertion gene editing outcome. (E) The SEM of the mean residual difference (predicted frequency − observed frequency) between the model-predicted and experimentally observed frequency of all deletion variants modeled.
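The quantities compared in this figure can be illustrated with a short Python sketch. It shows one way to compute a cumulative frameshift frequency from an INDEL size distribution, a Pearson correlation between predicted and observed frequencies, and the residual defined in panel D. All function names and the example numbers are hypothetical; this is not the analysis code used for the figure.

```python
import math
from typing import Mapping, Sequence


def frameshift_frequency(indel_freqs: Mapping[int, float]) -> float:
    """Share of edited alleles whose INDEL size is not a multiple of 3.

    indel_freqs maps INDEL size in bp (negative = deletion, positive = insertion)
    to its frequency among edited alleles; values are renormalized to sum to 1.
    """
    total = sum(indel_freqs.values())
    if total <= 0:
        raise ValueError("INDEL frequencies must sum to a positive value")
    shifted = sum(freq for size, freq in indel_freqs.items() if size % 3 != 0)
    return shifted / total


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical predicted and observed INDEL patterns for a single gRNA.
predicted = {-1: 0.40, -3: 0.20, 1: 0.30, -6: 0.10}
observed = {-1: 0.35, -3: 0.25, 1: 0.32, -6: 0.08}

sizes = sorted(predicted)
r = pearson_r([predicted[s] for s in sizes], [observed[s] for s in sizes])
residual_plus1 = predicted[1] - observed[1]  # predicted frequency - observed frequency
print(frameshift_frequency(predicted), frameshift_frequency(observed), r, residual_plus1)
```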
	Figure 3. The InDelphi-mESC model accurately predicts CRISPR/Cas9 gene editing outcomes in X. tropicalis, X. laevis and zebrafish embryos, which can be exploited to identify high-frameshift frequency gRNAs. (A–F) Scatter plots of InDelphi-mESC-predicted versus experimentally observed cumulative frameshift gene editing frequencies, for each sgRNA separately, in X. tropicalis (n = 14) (Panel A), X. laevis (n = 6) (Panel B) and zebrafish (n = 15) embryos (Panel C). Scatter plots of InDelphi-mESC-predicted versus experimentally observed INDEL patterns, for all gRNAs simultaneously, in X. tropicalis (n = 14) (Panel D), X. laevis (n = 6) (Panel E) and zebrafish (n = 15) (Panel F) embryos. Black demarcated lines show the perfect correlation r = 1. Light-grey areas show the standard error of the best-fit linear regression line. Black lines show the linear regression model. (G) Correlations between model-predicted and experimentally observed INDEL patterns, for each gRNA separately. Correlations for X. tropicalis embryos (n = 14) (dark blue) and X. laevis embryos (n = 6) (middle blue) were analyzed by Sanger sequencing and sequence trace decomposition. Correlations for zebrafish embryos (n = 15) (light blue) were analyzed by targeted amplicon sequencing (TAS). (H) Using the distribution of the expected frameshift frequency for a large dataset of SpCas9 human target sites in mESCs from Shen et al. 2018 (ref. 27; black line, monoallelic), we derive the distribution of the probability that a randomly designed gRNA generates biallelic frameshift editing. This distribution is shown for different editing efficiencies within the F0 mosaic animal: 100%, 50% and 25% (in decreasing intensities of blue; 100 circles are shown, each representing one cell within a mosaic of 100 cells). E.g., the probability that a randomly designed gRNA yields more than 80% biallelic frameshift mutant cells in a developing mosaic, assuming 100% efficiency, is the area under the curve highlighted in pink and amounts to only 3.24%.
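The 80% threshold in panel H can be related back to the squared relationship of Figure 1A. As a worked step, writing f for the monoallelic frameshift frequency of a gRNA (the symbol f is introduced here for illustration only) and assuming 100% editing efficiency:

```latex
\[
f^{2} > 0.80 \;\Longleftrightarrow\; f > \sqrt{0.80} \approx 0.894
\]
```

Under this reading, the pink area covers gRNAs whose predicted monoallelic frameshift frequency exceeds roughly 89%, which the caption reports as 3.24% of randomly designed gRNAs.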
	Figure 4. Integrating CRISPRscan and the InDelphi-mESC model allows identification of efficient high frameshift frequency gRNAs in X. tropicalis. (A) Scatterplot with marginal histograms showing, for 339,693 gRNAs across the coding sequences of 4,860 X. tropicalis genes, the relationships between calculated CRISPRscan score, InDelphi-mESC predicted frequency of MMEJ repair and InDelphi-mESC predicted knockout-score (KO-score). KO-score is defined as the predicted percentage of cells with biallelic out-of-frame mutations within the pool of all mutant cells (i.e. in-frame and out-of-frame; mono- and bi-allelic) in the mosaic mutant embryo and is calculated as the square of the frameshift frequency predicted by InDelphi-mESC. For each gene (n = 4,860), the gRNA with the highest predicted KO-score (Highest-in-class) is highlighted in blue, while the gRNA with the lowest predicted KO-score (Lowest-in-class) is highlighted in orange. Demarcations indicate the quadrants in which gRNAs meet certain cutoff thresholds. Ideally, designed gRNAs fall within the aquamarine demarcation (high predicted KO-score, high CRISPRscan score), but not the orange (low predicted KO-score, high CRISPRscan score) or purple demarcation (high predicted KO-score, low predicted CRISPRscan score). (B) Violin plot illustrating that highest-in-class gRNAs and lowest-in-class gRNAs have a higher predicted percentage of repair by microhomology-mediated end joining than a random selection of guides (****p < 0.001; Table S2). (C) There is no distinct difference in calculated CRISPRscan scores between highest-in-class gRNAs, lowest-in-class gRNAs and a random selection of gRNAs. (D) Comparison of three pairs of gRNAs targeting the second exon of the tyrosinase gene responsible for pigmentation in X. tropicalis. As these three pairs of guides have very similar genome editing efficiencies, as determined by targeted amplicon sequencing, the impact of differential predicted KO-scores on phenotypic penetrance is revealed. (D, E) Phenotypic scoring is based on retinal pigmentation at Nieuwkoop-Faber stage 38, and a trend is observed where guides with higher predicted KO-scores yield a higher phenotypic score under very similar genome editing efficiencies.
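The selection logic described in panel A can be sketched as follows. This is an illustrative Python example under stated assumptions: the `Guide` record, the function names, the example guides and the cutoff values are all hypothetical, and the KO-score is expressed as a fraction (0 to 1) rather than a percentage. Only the definition of the KO-score as the squared predicted frameshift frequency comes from the caption.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Guide:
    gene: str
    name: str
    frameshift_freq: float  # predicted frameshift frequency, as a fraction 0..1
    crisprscan: float       # predicted on-target efficiency score


def ko_score(frameshift_freq: float) -> float:
    """KO-score as a fraction: the square of the predicted frameshift frequency."""
    if not 0.0 <= frameshift_freq <= 1.0:
        raise ValueError("frameshift_freq must lie between 0 and 1")
    return frameshift_freq ** 2


def extremes_per_gene(guides: Iterable[Guide]) -> Dict[str, Tuple[Guide, Guide]]:
    """Return (highest-in-class, lowest-in-class) guide for each gene by KO-score."""
    by_gene: Dict[str, List[Guide]] = {}
    for guide in guides:
        by_gene.setdefault(guide.gene, []).append(guide)

    def score(guide: Guide) -> float:
        return ko_score(guide.frameshift_freq)

    return {gene: (max(group, key=score), min(group, key=score))
            for gene, group in by_gene.items()}


def in_preferred_quadrant(guide: Guide, min_ko: float, min_crisprscan: float) -> bool:
    """True when a guide meets both (hypothetical) cutoff thresholds."""
    return ko_score(guide.frameshift_freq) >= min_ko and guide.crisprscan >= min_crisprscan


# Hypothetical guides; none of these values come from the paper.
guides = [
    Guide("geneA", "gRNA1", 0.90, 60.0),
    Guide("geneA", "gRNA2", 0.55, 70.0),
    Guide("geneB", "gRNA1", 0.75, 30.0),
]

for gene, (best, worst) in extremes_per_gene(guides).items():
    print(gene, best.name, worst.name,
          in_preferred_quadrant(best, min_ko=0.6, min_crisprscan=50.0))
```

Keeping the two criteria as separate thresholds mirrors the quadrant view in panel A: a guide with a high KO-score but a low efficiency score, or the reverse, is rejected rather than averaged into a single rank.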
	Fig. S1: ezh2 CRISPR/Cas9 gene editing outcome can be accurately predicted via the online prediction algorithm InDelphi. (A) Column graphs showing overlay of variant calls (%) between in vivo observations and in silico predictions. (B) Pearson correlation with significance interval between in vivo observations and in silico predictions for the ezh2 gRNA.
	Fig. S2: Pearson correlations between in vivo observed (obtained by targeted amplicon sequencing) and respective in silico predicted variant frequencies for 28 gRNAs injected in X. tropicalis embryos. gRNAs are injected as Cas9/gRNA-ribonucleoprotein complexes at early developmental stages (2- to 8-cell stage). Target regions are PCR amplified and sequenced using MiSeq sequencing (Illumina) and raw data is processed using the BATCH-GE analysis software. In silico predictions are generated by the InDelphi software algorithm. Plots show correlations between in vivo observed and in silico predicted variant frequencies. x_g1, x_g2, x_g3 refer to different guide RNAs against the same gene. (**p < 0.0001; p < 0.001; *p < 0.01).
	Fig. S3: Pearson correlations between in vivo observations (generated by Sanger sequencing and sequence trace deconvolution) and respective in silico predictions of 14 gRNAs injected in X. tropicalis embryos. gRNAs are injected as Cas9/gRNA-ribonucleoprotein complexes at early developmental stages (1-cell stage). Target regions are PCR amplified and sequenced using Sanger sequencing and deconvoluted using the Inference of CRISPR Edits (ICE) algorithm. In silico predictions are generated by the InDelphi software algorithm. Plots show correlations between in vivo observed and in silico predicted variant frequencies. x_g1, x_g2 refer to different guide RNAs against the same gene. (**p < 0.0001; p < 0.001; p < 0.01; p < 0.05; ns = not significant).
	Fig. S4: Pearson correlations between in vivo observations (generated by Sanger sequencing and sequence trace deconvolution) and respective in silico predictions of 10 gRNAs injected in X. laevis embryos. gRNAs are injected as Cas9/gRNA-ribonucleoprotein complexes at early developmental stages (1-cell stage). Target regions are PCR amplified and sequenced using Sanger sequencing and deconvoluted using the Inference of CRISPR Edits (ICE) algorithm. In silico predictions are generated by the InDelphi software algorithm. Plots show correlations between in vivo observed and in silico predicted variant frequencies. Gene name_S and gene name_L refer to the two homeologues of a particular gene present on the small and large chromosome, respectively. (**p < 0.0001; p < 0.01; *p < 0.05; ns = not significant).
	Fig. S5: Pearson correlations between in vivo observations (generated by targeted amplicon sequencing) and respective in silico predictions of 15 gRNAs injected in zebrafish embryos. gRNAs are injected as Cas9/gRNA-ribonucleoprotein complexes at early developmental stages (1-cell stage). Target regions are PCR amplified and sequenced using MiSeq sequencing (Illumina) and raw data is processed using the BATCH-GE analysis software. In silico predictions are generated by the InDelphi software algorithm. Plots show correlations between in vivo observed and in silico predicted variant frequencies. x_g1, x_g2, x_g3 refer to different guide RNAs against the same gene. (**p < 0.0001; p < 0.001; *p < 0.01).
	Fig. S6: Pictures from eyes of tyrosinase mutant embryos with their associated threshold mask used for quantification.
