The impact of biostatistics on hazard characterization using in vitro developmental neurotoxicity assays


Hagen Eike Keßel
Stefan Masjosthusmann
Kristina Bartmann
Jonathan Blum
Arif Dönmez
Nils Förster
Jördis Klose
Axel Mosig
Melanie Pahl
Marcel Leist
Martin Scholze
Ellen Fritsche


In chemical safety assessment, benchmark concentrations (BMC) and their associated uncertainty are needed for the toxicological evaluation of in vitro data sets. A BMC estimate is derived from concentration-response modelling and is the result of several statistical decisions, which depend on factors such as experimental design and assay endpoint features. In current practice, the experimenter is often responsible for the data analysis and relies on statistical software, frequently without being aware of the software's default settings or of how these settings affect the results. To provide more insight into how statistical decision-making can influence the outcomes of data analysis and interpretation, we have developed an automated platform that includes statistical methods for BMC estimation, a novel endpoint-specific hazard classification system, and routines that flag data sets outside the applicability domain of automated data evaluation. We applied the platform in case studies on a large dataset produced by a developmental neurotoxicity (DNT) in vitro battery (DNT IVB), focusing on the estimation of the BMC and its confidence interval (CI) and on the final hazard classification. We identified five crucial statistical decisions the experimenter must make during data analysis: choice of replicate averaging, response data normalization, regression modelling, BMC and CI estimation, and choice of benchmark response levels. The insights gained are intended to raise awareness among experimenters of the importance of statistical decisions and methods, and to demonstrate how important fit-for-purpose, internationally harmonized and accepted data evaluation and analysis procedures are for objective hazard classification.
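To make two of the five decisions concrete, the following minimal sketch shows how the choice of replicate averaging changes a normalized response, and how the choice of benchmark response (BMR) level changes the BMC read from one and the same curve. It is an illustration under stated assumptions, not the platform described above: the four-parameter log-logistic curve, the function names `hill` and `bmc`, and all numbers are hypothetical and chosen only for the example.

```python
import math
from statistics import mean, median


def hill(conc, top, bottom, ec50, slope):
    """Four-parameter log-logistic curve; decreases with conc when slope > 0."""
    return bottom + (top - bottom) / (1.0 + (conc / ec50) ** slope)


def bmc(bmr_percent, top, bottom, ec50, slope):
    """Concentration at which the response has fallen by bmr_percent of `top`."""
    target = top * (1.0 - bmr_percent / 100.0)
    if target <= bottom:
        return math.nan  # the curve never reaches this effect level
    # Invert hill(): (conc / ec50) ** slope = (top - bottom) / (target - bottom) - 1
    ratio = (top - bottom) / (target - bottom) - 1.0
    return ec50 * ratio ** (1.0 / slope)


# Decision: replicate averaging and normalization.
# Made-up solvent-control replicates with one high outlier.
controls = [98.0, 102.0, 100.0, 140.0]
raw_treated = 80.0
norm_by_mean = 100.0 * raw_treated / mean(controls)      # control mean is 110
norm_by_median = 100.0 * raw_treated / median(controls)  # control median is 101

# Decision: benchmark response level.
# Hypothetical fitted parameters (response in % of control, conc in arbitrary units).
params = dict(top=100.0, bottom=0.0, ec50=10.0, slope=2.0)
bmc_by_bmr = {bmr: bmc(bmr, **params) for bmr in (10, 30, 50)}

if __name__ == "__main__":
    print(f"normalized by mean:   {norm_by_mean:.1f} %")
    print(f"normalized by median: {norm_by_median:.1f} %")
    for bmr, value in bmc_by_bmr.items():
        print(f"BMC{bmr}: {value:.2f}")
```

With these assumed numbers, the same raw value is reported as roughly 73 % or 79 % of control depending on the averaging method, and the BMC ranges from about 3.3 to 10 concentration units depending on the BMR. A real analysis would additionally estimate the curve parameters by regression and attach a confidence interval to each BMC, which this sketch deliberately leaves out.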


How to Cite
Keßel, H. E., Masjosthusmann, S., Bartmann, K., Blum, J., Dönmez, A., Förster, N., Klose, J., Mosig, A., Pahl, M., Leist, M., Scholze, M. and Fritsche, E. (2023) “The impact of biostatistics on hazard characterization using in vitro developmental neurotoxicity assays”, ALTEX - Alternatives to animal experimentation. doi: 10.14573/altex.2210171.

