Environ Toxicol Chem
2025 May 01;44(5):1435-1447. doi: 10.1093/etojnl/vgaf067.
Amphibian studies to investigate the endocrine-disrupting properties of chemicals through the thyroid modality: a comparison of their statistical power.
Rizzuto S, Neri FM, Ercolano V, Ippolito A, Linguadoca A, Villamar Bouza L, Arena M.
Abstract
Amphibians are the current model species for investigating endocrine-disrupting (ED) properties through the thyroid modality in non-mammalian species. A recurrent question in the European Union (EU) regulatory endocrine assessment of pesticide active substances (2018/605) is whether positive results from an in vivo screening test, that is, the Amphibian Metamorphosis Assay (AMA), can be considered sufficient to conclude on the ED properties of a pesticide active substance, or whether the larval amphibian growth and development assay (LAGDA) is a necessary step to further clarify the concerns identified in the AMA. Another recurrent question is how the extended AMA (EAMA) should be considered. To clarify some of the uncertainties around the use of the LAGDA and to support further consideration of the EAMA in a regulatory context, the statistical power of the three test designs was tested for all the parameters measured in each study design (except for thyroid histopathology) by using data from real experimental studies. Our findings showed that the statistical power of the EAMA is in line with that of other Organisation for Economic Co-operation and Development standardized tests, that is, the AMA and the LAGDA. Our results also confirmed that the LAGDA has more power to detect effects on relevant parameters, that is, time to reach metamorphosis, than the other in vivo tests. However, the difference in power was small, which calls into question its contribution to an overall weight of evidence that already supports the identification of a substance as an ED. These findings should be considered only in the context of hazard-based endocrine assessment of active substances (i.e., EU regulatory ED assessment of pesticide active substances, 2018/605), while they may not be fully applicable to a risk assessment-based approach.
Figure 1. Proportional minimum detectable difference (pMDD), proportional upper bound of the confidence interval (pCI), and Cohen’s d of the continuous variables investigated in the amphibian metamorphosis assay (AMA), extended (E)AMA, and larval amphibian growth and development assay (LAGDA). Comparative assessment of hind limb length (HLL) and normalized (n)HLL was not possible for the LAGDA because this parameter is not measured. Horizontal red dashed lines indicate the threshold of 10% for pMDD and pCI, while for Cohen’s d, they highlight the standard thresholds for small (0.2) and medium (0.5) effect size. Black dots represent data points outside 1.5 times the interquartile range. The grey annotations indicate the statistically significant differences between test designs (*p < 0.05, **p < 0.01, and ***p < 0.001). SVL = snout to vent length.
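As an illustration of one of the effect-size measures named in this caption, the sketch below computes Cohen's d for a control and a treatment group and places it against the 0.2 and 0.5 thresholds. It assumes the common pooled-standard-deviation form of Cohen's d; the record does not state which variant the authors used, and the function names and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two samples (sample variances, n - 1)."""
    na, nb = len(a), len(b)
    return sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))


def cohens_d(control, treatment):
    """Standardized mean difference (treatment minus control)."""
    return (mean(treatment) - mean(control)) / pooled_sd(control, treatment)


def effect_size_band(d):
    """Band |d| using the 0.2 (small) and 0.5 (medium) thresholds from the caption."""
    size = abs(d)
    if size < 0.2:
        return "below small"
    if size < 0.5:
        return "small"
    return "medium or larger"


# Hypothetical replicate means, for example snout to vent length in mm.
control = [10.0, 11.0, 12.0, 13.0]
treatment = [9.0, 10.0, 11.0, 12.0]

d = cohens_d(control, treatment)
band = effect_size_band(d)
print(f"Cohen's d = {d:.3f} ({band})")
```

The sign of d only records the direction of the shift; the thresholds are applied to its absolute value.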
Figure 2. Results of the power analysis for scenario A, with a mean effect of one-stage for treatment Group 3 (highest treatment). The power is defined as the proportion of simulations that estimated a no observed effect concentration (NOEC) = 2 (matching the no effect concentration [NEC]). The study with ID 10 was omitted from the analysis because there was no information on the developmental stage of replicates. AMA = amphibian metamorphosis assay.
Figure 3. Overview of the results of the power analysis for the three main scenarios A, B, and C, based on a total of 17 amphibian metamorphosis assay (AMA) studies. The studies are classified into different ranges (colors) of statistical power. A no effect concentration (NEC) = 2 corresponds to an effect on the highest treatment group (treatment = 3); a NEC = 1 to an effect on the highest and second highest treatment groups. The results are for the decreasing direction of the shift; those for the increasing direction are only slightly different (data not shown here but reproducible as included in the available R script).
Figure 4. Results of the power analysis on the parameter time to reach Nieuwkoop and Faber (NF) stage 62 for nine extended amphibian metamorphosis assay (EAMA) studies and five larval amphibian growth and development assay (LAGDA) studies, based on two different approaches: the Cox proportional hazard (PH) model (left) and the restricted mean survival time (RMST; right). The upper panels report the two-sample comparison between the control and a single treatment; the lower panels report the multiple comparison between the control and all treatment groups, followed by a Bonferroni–Holm multiplicity correction.
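The Bonferroni–Holm correction mentioned in this caption can be sketched as follows. This is a minimal illustration of the standard step-down procedure, not the authors' R script; the function name and the raw p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three treatment-versus-control comparisons.
raw = [0.01, 0.04, 0.03]
adj = holm_adjust(raw)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

With these inputs only the first comparison remains below 0.05 after adjustment, which shows how a multiplicity correction can lower power relative to a single two-sample comparison.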
Figure S1. Comparison of estimated quantiles for two different studies. The circles are the median of the distributions of the quantiles of the resampled datasets. The vertical lines are the 95% interval of variability around the median. The distributions are discrete and extremely skewed and the median coincides in several cases with the upper/lower limit of the interval.
Figure S2. Sensitivity to additional noise. The x coordinates are the values of statistical power for scenario A, with decreasing direction and variance = 0 (the same results as the downward blue triangles in Figure 2). The y coordinates are the values calculated for the same scenario, direction and variance adding a small noise to the whole dataset (the control and all the treatments). Two distributions with mean = 0 were used for the noise: with variance = 0.5 (values -1, 0 and 1 with probabilities 0.25, 0.5 and 0.25, respectively) and with variance = 1 (values -1 and 1 with probabilities 0.5 and 0.5, respectively). The line y = x is plotted as a guide.
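The two noise distributions described in this caption can be checked directly. The sketch below computes their mean and variance from the stated values and probabilities, and shows one hypothetical way such noise could be added to a list of developmental stages; the helper names, the seed, and the example stages are assumptions, not taken from the authors' script.

```python
import random


def discrete_moments(values, probs):
    """Mean and variance of a discrete distribution."""
    m = sum(v * p for v, p in zip(values, probs))
    var = sum(p * (v - m) ** 2 for v, p in zip(values, probs))
    return m, var


def add_noise(stages, values, probs, rng):
    """Add one independent noise draw to each observation (illustrative only)."""
    draws = rng.choices(values, weights=probs, k=len(stages))
    return [s + d for s, d in zip(stages, draws)]


# Distributions as given in the caption.
low_noise = ([-1, 0, 1], [0.25, 0.5, 0.25])
high_noise = ([-1, 1], [0.5, 0.5])

low_moments = discrete_moments(*low_noise)    # mean 0, variance 0.5
high_moments = discrete_moments(*high_noise)  # mean 0, variance 1

# Hypothetical NF stages for one replicate.
rng = random.Random(0)
stages = [57, 58, 58, 59, 60]
noisy = add_noise(stages, *low_noise, rng)
print(low_moments, high_moments, noisy)
```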
Figure S3. Sensitivity to methodology for quantiles. The x coordinates are the values of statistical power for scenario A, calculated using the default method in R (number 2 in Hyndman and Fan (1996)) for the quantiles – the same results as in Figure 2. The y coordinates are the values calculated with the default method in SAS for the quantiles, for each corresponding combination of direction and variance. The line y = x is plotted as a guide.
Figure S4. Effect of the inclusion of solvent controls. The results are for six studies with both negative and solvent control data. Each point represents the results for a given study, scenario (shape), value of the variance within the scenario (colour) and direction of the shift (not represented). The values of the variance between brackets are for scenario B; those outside the brackets are for scenarios A and C. The x coordinate is the statistical power calculated using only negative control data – the same results as in Figure 2 and Figure 3. The y coordinate is the statistical power calculated by pooling negative and solvent control data. The line y = x is plotted as a guide.
Figure S5. Distribution of NOECs for scenario C. Each point in the scatter plot corresponds to a study and a combination of direction (shape) and variance (colour) of the shift. The x coordinate is the probability that NOEC = 1 (= NEC), i.e. the statistical power as discussed in the main text; the y coordinate is the probability that NOEC = 1 or 2. The line y = x is shown as a guide. The tables in the margins show the number of points falling in each of the three probability ranges (dashed green and pink lines) for the x and y coordinates separately (tables on the top and the right, respectively). The counts are further broken down by direction and values of the variance within each table.
The step graphs of the Empirical Cumulative Distribution Functions are represented in different colours based on the response variable. The vertical dashed line represents a p-value of 0.05.
Figure S8. Mean of the control group, pooled standard deviation, coefficient of variation and relative standard error for the continuous variables investigated in the AMA, EAMA and LAGDA. Black dots represent data points outside 1.5 times the interquartile range. The results for the coefficient of variation (for the AMA and EAMA) and the results for the relative standard error (including the LAGDA) should be compared with the pMDD in Figure 1.