Assessing Proarrhythmic Potential of Environmental Chemicals Using a High Throughput In Vitro-In Silico Model with Human Induced Pluripotent Stem Cell-Derived Cardiomyocytes

Abstract
QT prolongation and the potentially fatal arrhythmia Torsades de Pointes are common causes for withdrawing or restricting drugs; however, little is known about similar liabilities of environmental chemicals. Current in vitro-in silico models for testing proarrhythmic liabilities, using human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CM), provide an opportunity to address this data gap. These methods are still low- to medium-throughput and not suitable for testing the tens of thousands of chemicals in commerce. We hypothesized that combining high-throughput population-based in vitro testing in hiPSC-CMs with a fully in silico data analysis workflow can offer sensitive and specific predictions of proarrhythmic potential. We calibrated the model with a published hiPSC-CM dataset of drugs known to be positive or negative for proarrhythmia and tested its performance using internal cross-validation and external validation. Additionally, we used computational down-sampling to examine three study designs for hiPSC-CM data: one replicate of one donor, five replicates of one donor, and one replicate of a population of five donors. We found that the population of five donors had the best performance for predicting proarrhythmic potential. The resulting model was then applied to predict the proarrhythmic potential of environmental chemicals, additionally characterizing risk through margin of exposure (MOE) calculations. Out of over 900 environmental chemicals tested, over 150 were predicted to have proarrhythmic potential, but only seven chemicals had an MOE < 1. We conclude that a high-throughput in vitro-in silico approach using population-based hiPSC-CM testing provides a reasonable strategy to screen environmental chemicals for proarrhythmic potential.

Plain language summary
This article discusses a new method for testing the potential harmful effects of environmental chemicals on the heart. We used human heart cells grown in a lab to test the chemicals and developed a computer model to predict their potential to cause dangerous heart rhythms. This method could help identify harmful chemicals more quickly and accurately than current testing methods. The study has the potential to improve evaluation of chemical risks and protect public health without the use of animals.

Introduction
In pharmaceutical development, issues of cardiovascular safety, especially drug-induced QTc interval prolongation and the potentially fatal arrhythmia Torsades de Pointes (TdP), are common concerns that may lead to drug withdrawal or restriction of use (Stockbridge et al., 2013; Onakpoya et al., 2016). Cardiotoxicity testing is required during both preclinical development and early-phase clinical trials (Pognan et al., 2023). QTc interval prolongation and the inhibition of cardiac potassium current (IKr blockage) have been established as proarrhythmic biomarkers, based on mechanistic considerations and the finding that TdP-inducing drugs have these effects (Shah, 2002; Woosley et al., 1993). These have led to the publication of preclinical IKr/QTc assessment guidelines (EMA, 2005). However, subsequent studies indicated that tests for IKr blockage may not have sufficient specificity to predict arrhythmia (Gintant et al., 2011) and that some QTc prolongation-inducing drugs do not result in TdP (Vargas et al., 2021), suggesting that additional tests are needed to predict proarrhythmic potential. Concomitantly, the Comprehensive in vitro Proarrhythmia Assay (CiPA) suggested the importance of mechanistic electrophysiological knowledge when assessing proarrhythmic potential (Sager et al., 2014). Also, CiPA endorsed the use of human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CM) in the evaluation of potentially abnormal electrophysiological ECG signals (Yang et al., 2022). However, while this approach could successfully provide accurate and precise estimates for QTc prolongation (Fermini et al., 2016), it may not necessarily be specific for TdP.
In contrast, the potential for cardiotoxicity of environmental and industrial chemicals is largely unknown because cardiotoxicity testing is not commonly required, and it is not feasible to conduct in vivo tests for the tens of thousands of compounds in commerce. A number of in vitro and in silico approaches have been proposed for evaluating cardiotoxicity of non-pharmaceuticals (Daley et al., 2023; Sirenko et al., 2017; Burnett et al., 2021b). While recent studies have focused on identifying the most sensitive cardiotoxicity endpoint for use in screening-level risk assessments (Grimm et al., 2020; Sirenko et al., 2017), there is also a need to better understand proarrhythmic potential in terms of TdP. Recent studies of pharmaceuticals integrating in vitro hiPSC-CM testing and in silico approaches provide an opportunity to improve strategies for assessing proarrhythmic potential (Pognan et al., 2023). For instance, Blinova et al. (2018) used the arrhythmia-like event counts from electrophysiological data in hiPSC-CM exposed to 28 drugs from the CiPA dataset as predictors for their TdP risk categories (low, intermediate, and high). The authors developed logistic and ordinal linear regression models to predict proarrhythmic potential. As presented, the method requires manually categorizing and counting arrhythmia-like events, which hinders the transition to high-throughput screening. Because hiPSC-CM experiments can be done in a high-throughput manner (Burnett et al., 2021a), a fully automated data analysis algorithm would enable evaluation of a far greater number of compounds of interest.
Several previous studies showed that points of departure (PODs) of functional and viability endpoints from high-throughput hiPSC-CM testing using various donor cells can effectively characterize the hazard and risk of QTc prolongation in response to many drugs and hundreds of environmental chemicals (Burnett et al., 2021a; Blanchette et al., 2020). The PODs were derived using automated peak parameter analysis and Bayesian concentration-response modeling. Therefore, we hypothesized that these PODs of hiPSC-CM-derived functional endpoints can be used as predictors to build a classification model for proarrhythmic potential based on the known TdP risk category of tested drugs. Additionally, to optimize experimental design, we drew upon the computational down-sampling analysis by Blanchette et al. (2022), which found that 5 randomly selected donors can provide adequately accurate and precise population-wide PODs.
In this study, we hypothesized that combining high-throughput population-based in vitro testing in hiPSC-CMs with a fully in silico analysis workflow can offer sensitive and specific predictions of proarrhythmic potential for environmental chemicals and expect that this approach can be a useful alternative to in vivo toxicity testing. We calibrate and validate this approach using previously published hiPSC-CM datasets of drugs known to be positive or negative for proarrhythmia. Additionally, we evaluate the merits of using single versus multiple hiPSC-CM donors.

Abbreviations: AUC, area under the curve; CiPA, Comprehensive in vitro Proarrhythmia Assay; Css, steady-state plasma concentration; EPA, US Environmental Protection Agency; hiPSC-CM, human induced pluripotent stem cell-derived cardiomyocytes; HQ, hazard quotient; ICH, International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use; IVIVE, in vitro to in vivo extrapolation; MOE, margin of exposure; PAH, polycyclic aromatic hydrocarbon; POD, point of departure; RfD, reference dose; ROC, receiver operating characteristic; RSL, regional screening level; TdP, Torsades de Pointes; ToxCast, Toxicity ForeCaster
¹ doi:10.14573/altex.2306231s1

Materials and methods
The conceptual model and the model development workflow are presented in Figure 1A and B, respectively. The general approach (Fig. 1A) is to conduct high-throughput screening in hiPSC-CMs, perform automated processing of the resulting Ca²⁺ flux and high-content imaging data, determine PODs for key functional and viability endpoints, and finally utilize these PODs in a logistic regression model for proarrhythmic potential. The specific steps for model development (Fig. 1B) are described in detail below.

In vitro experimental cardiotoxicity data
The first step was to identify relevant hiPSC-CM datasets for use in model building and testing. Two in vitro experimental datasets were identified: Blanchette et al. (2020) (Dataset I) and Burnett et al. (2021a) (Dataset II). The number of pharmaceuticals and non-pharmaceuticals in both datasets is summarized in Table 1. Pharmaceuticals in Dataset I (n = 56) were used as the training set for model calibration and internal cross-validation, and pharmaceuticals in Dataset II (n = 160) were used for independent external validation. The model was then applied to the remaining chemicals in each dataset to predict the proarrhythmic potential of non-pharmaceuticals. Predictions for non-pharmaceuticals in Dataset I (n = 82) represent predictions based on "within study" data, while predictions for non-pharmaceuticals in Dataset II (n = 875) represent predictions based on "out of study" data. The chemicals in these two datasets with their respective CAS numbers and chemical classifications are provided in Tables S1 and S2¹. The experimental information regarding chemicals, hiPSC-CM lines, Ca²⁺ flux assay, and high-content imaging assays and data processing for each dataset was previously reported as detailed below. Briefly, chemicals in Dataset I were provided by the US Environmental Protection Agency, National Center for Computational Toxicology (Research Triangle Park, North Carolina), including 56 pharmaceutical compounds (partially identified by the CiPA initiative) and 82 non-drugs, namely food additives, industrial chemicals, flame retardants, pesticides, polycyclic aromatic hydrocarbons (PAHs), and metals (Blanchette et al., 2020). For Dataset II, there were 160 pharmaceutical compounds and 875 environmental chemicals, including industrial chemicals, pesticides, flame retardants, plasticizers, PAHs, solvents, surfactants, dyes, and food/flavor/fragrance agents (Burnett et al., 2021a). The chemicals in Dataset II were selected from the US Environmental Protection Agency (EPA) Toxicity ForeCaster (ToxCast) library (Williams et al., 2017). Between the two datasets, the overlapping human iPSC-derived cardiomyocyte lines (Tab. S3¹) were from 5 healthy donors without cardiovascular disease or family history of cardiovascular disease, and consisted of 3 females and 2 males, including individuals of African American, European, and mixed ancestry. The experimental design of these studies can be summarized as follows: hiPSC-CM were exposed to test compounds for 90 minutes; the cells were then examined for functional performance and viability using the Ca²⁺ flux assay and high-content imaging assays. The data from the Ca²⁺ flux assay were evaluated with a peak-processing algorithm (Blanchette et al., 2019) to obtain beating parameters. Fluorescence imaging data were obtained as described previously (Burnett et al., 2019). All experiments were performed in a concentration-response design with quality control through positive and negative compounds and replication.

Model calibration
Model inputs: Bayesian population concentration-response modeling of in vitro cardiotoxicity data
Because our goal was to develop a high-throughput screening approach for proarrhythmic potential, we did not use the Ca²⁺ flux and high-content imaging data directly, but rather summarized them first in terms of concentration-based points of departure (PODs). We determined PODs using three different subsamples of Dataset I in order of increasing complexity: Design A, single replicate of one donor (designated #1434); Design B, five replicates of donor #1434; and Design C, single replicates of a population of five donors. External validation was assessed using the parallel subsets of Dataset II.
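As a rough illustration of the three down-sampling designs, the sketch below selects records from a hypothetical table of well-level measurements. The record layout (`donor`, `replicate`, `conc_uM`, `response`), the function name `subsample`, and every donor ID other than 1434 are assumptions made for this example; they are not taken from the published datasets or analysis code.

```python
from collections import namedtuple

# Hypothetical well-level record; the field names are illustrative assumptions.
Well = namedtuple("Well", ["donor", "replicate", "conc_uM", "response"])


def subsample(wells, design, ref_donor="1434", population=None):
    """Return the wells used under study Design A, B, or C.

    A: single replicate of the reference donor
    B: five replicates of the reference donor
    C: single replicate of each donor in `population`
       (assumed here to default to every donor present in `wells`)
    """
    if design == "A":
        return [w for w in wells if w.donor == ref_donor and w.replicate == 1]
    if design == "B":
        return [w for w in wells
                if w.donor == ref_donor and 1 <= w.replicate <= 5]
    if design == "C":
        donors = set(population) if population else {w.donor for w in wells}
        return [w for w in wells if w.donor in donors and w.replicate == 1]
    raise ValueError(f"unknown design: {design!r}")


# Toy data: five donors (IDs other than 1434 are made up), five replicates
# each, all at a single concentration with a placeholder response value.
toy_donors = ["1434", "D2", "D3", "D4", "D5"]
toy_wells = [Well(d, r, 1.0, 0.0) for d in toy_donors for r in range(1, 6)]

design_a = subsample(toy_wells, "A")
design_b = subsample(toy_wells, "B")
design_c = subsample(toy_wells, "C")
```

Each subset would then be passed separately to the concentration-response fit, so that the resulting PODs reflect only the information available under that design.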
Because our analysis involved subsets of the original datasets, new PODs were derived using each subsample. Using previously reported methods (Blanchette et al., 2019, 2020), new Bayesian concentration-response modeling was conducted for 5 phenotypes that reflect both cardiomyocyte function and viability: positive and negative chronotropy, asystole, QT prolongation, and cytotoxicity. PODs were derived based on changes in peak frequency, decay-to-rise time ratio, and cell number. Specifically, PODs for positive and negative chronotropy were defined as the concentration eliciting a 5% increase or decrease in peak frequency (EC05), that for asystole as a 95% decrease in peak frequency (EC95), that for QT prolongation as a 5% increase in decay-rise ratio (EC05), and that for cytotoxicity as a 10% decrease in total cells relative to control (EC10) (Abdo et al., 2015; Chiu et al., 2017).
All parameters under all study designs were fitted on the natural-log scale to ensure that they were strictly positive. The assumptions and limitations of prior distributions for all parameters followed those reported in Blanchette et al. (2020). For Design C, population random effects were included, with individual-level parameters normally distributed around a population mean with a population standard deviation (the hyperparameters). The priors on the hyperparameters were normal for the population mean and half-normal for the population standard deviation.
The posterior distribution sampling was performed by Markov chain Monte Carlo simulation. In each simulation, four independent Markov chains were generated with 4,000-36,000 iterations, with the first half of the iterations discarded and the second half used for verifying convergence and deriving PODs. The final number of iterations for each endpoint from different data profiles was based on convergence, diagnosed by the estimated potential scale reduction factor (R̂) being lower than 1.2 (Gelman and Rubin, 1992). Convergence for a specific chemical and endpoint was tested until the iterations increased to 36,000, and PODs were based on 1,000 posterior samples, consisting of 250 samples from each of the 4 chains.

TdP prediction model calibration
With the PODs of each of the 5 phenotypes as inputs, each model utilized logistic regression for predicting proarrhythmic potential. All PODs were log10 transformed for analysis. The thresholds for identifying positive and negative outcomes were determined by considering the trade-off between sensitivity and specificity. Because the drug data were unbalanced in terms of positive and negative outcomes, the threshold value leading to the maximal geometric mean of sensitivity and specificity was used to categorize proarrhythmic potential. The relationships of sensitivity and specificity at different thresholds were visualized by the receiver operating characteristic (ROC) curve.
Only data from pharmaceuticals (see list in Tab. S1 and S2¹) were used for model calibration and validation because clinical data on TdP risk are only available for these compounds. Specifically, the TdP risks of drugs were obtained from a QT/TdP drug list in the CredibleMeds® database². CredibleMeds® has systematically examined all available evidence to place drugs into three main categories based on whether there are data to indicate QT prolongation or TdP. The group of "known risk" for TdP compounds is defined as drugs that cause QT prolongation and have a clear association with TdP; drugs in the group of "conditional risk" of TdP have been associated with TdP risk only under certain use scenarios. Finally, compounds in the "possible risk" of TdP category are drugs that can cause QT prolongation but currently lack evidence to definitively ascertain the association with TdP risk. The detailed definition for each group is available on the CredibleMeds® website. For the analysis herein, the drugs were categorized as "positive" if they were listed in the group of either "known" or "conditional" risk, while the others were categorized as "negative."
Table 1 footnotes:
a These chemicals were used as the training dataset, including 18 and 38 chemicals with positive and negative proarrhythmic potential, respectively.
b These chemicals were used for conducting the external validation, including 28 and 132 chemicals with positive and negative proarrhythmic potential, respectively.
² https://crediblemeds.org/

Model performance and validation
Performance of the model under each study design (A)-(C) was evaluated using the metrics of sensitivity, specificity, and area under the ROC curve. First, this was evaluated for predicting TdP for drugs in Dataset I using the full model. Then, internal cross-validation was conducted using the "leave-one-out" approach (Wang and Zou, 2021). Specifically, we used the pharmaceuticals in Dataset I, removing one chemical from the training set to create a "leave-one-out" model, and then applied this model to predict the proarrhythmic potential of the left-out chemical. We repeated this process, leaving each of the 56 chemicals out in turn; therefore, there were 56 resulting models, each with its own sensitivity, specificity, area under the curve (AUC) of the ROC, optimized threshold, and accuracy. The performance metrics (i.e., sensitivity, specificity, and accuracy) of predicting the proarrhythmic potential of the left-out chemical were then calculated and aggregated across the 56 chemicals to summarize the "out of sample" performance.
Next, external validation was conducted with an independent dataset, using the PODs of drugs in Dataset II to predict their proarrhythmic potential. We also compared performance metrics with a previously published model by Blinova et al. (2018).
Once the optimized model was determined, the applicability domains, i.e., the POD distributions of the 5 phenotypes, of the training and test sets were examined to make sure they were in a similar range. Additionally, a down-sampling approach was applied as a sensitivity analysis to examine whether the imbalance between positive and negative drugs influenced model performance. We used all positive chemicals (n = 18) and randomly sampled the same number of negative chemicals as a combined training set to build the model and then applied this model to predict the proarrhythmic potential of all chemicals in Dataset I, recording the AUC of the ROC of each model as well as the sensitivity, specificity, and accuracy for predicting the proarrhythmic potential of drugs in Dataset I. This process was repeated 1,000 times, and the results were compared to the sensitivity, specificity, accuracy, and AUC of ROC of our original approach using the full (unbalanced) dataset.

TdP risk characterization for environmental chemicals
The study design (A)-(C) with the best performance was applied to predict the proarrhythmic potential of environmental chemicals in Dataset I and Dataset II using the data from hiPSC-CM experiments that yielded PODs for the 5 input parameters. For environmental chemicals predicted to be "positive" (above the logistic regression "threshold") for TdP hazard, potential risks were quantitatively characterized using a margin of exposure (MOE) approach comparing predicted exposure to the most sensitive POD. The MOE could only be calculated for chemicals that had both exposure predictions and in vitro to in vivo extrapolation (IVIVE) data. Exposure predictions were obtained from ExpoCast (Ring et al., 2018) data in the EPA CompTox database (Tab. S4¹) and consisted of the median and upper 95th percentile estimate for the population median exposure. For comparison with the in vitro PODs, rather than converting the in vitro PODs to oral equivalent doses, these exposure predictions were converted to steady-state plasma concentrations (Css) using the httk package (version 2.2.2) (Pearce et al., 2017). Specifically, using the defaults of the function calc_css, the physiologically-based toxicokinetic model with assumption of oral intake was used to calculate the in vivo Css. The parameters used are listed in an Excel file in the supplementary materials³. The MOEs were calculated as the lowest POD divided by in vivo Css. MOE values less than 1 are typically considered likely to be of concern; values between 1 and 100 are considered of potential concern; and values larger than 100 are considered to be of no concern.
Because Datasets I and II contained some common chemicals (i.e., tested in both Blanchette et al., 2020 and Burnett et al., 2021a), we also compared predictions for proarrhythmic potential to evaluate their reproducibility across datasets. Additionally, one drug in Dataset I and 11 environmental chemicals in Dataset II had replicates, which were also evaluated for reproducibility.

Software
Bayesian population concentration-response modeling with the Markov chain Monte Carlo algorithm was conducted with the rstan package (version 2.21.5) using R (version 4.0.3) on the Texas A&M University High Performance Research Computing Platform. Other statistical analyses, such as logistic regression modeling and subsequent posterior sample processing, were executed using R (version 4.1.2) with RStudio as the interface. The following packages were used for data analysis, model calibration, evaluation, and application: ROSE (version 0.0-4), pROC (version 1.18), dplyr (version 1.0.10), and httk (version 2.2.2). Raw data and computational code for modeling and data analysis are available in the supplementary files¹,³ and the GitHub repository⁴.

Bayesian concentration-response modeling
Bayesian concentration-response modeling was conducted on a total of 1085 test compounds and 5 phenotypes: positive chronotropy, negative chronotropy, asystole, QT prolongation, and cytotoxicity, using 3 subsamples from each dataset: Design A, single replicate of a single donor; Design B, five replicates of a single donor; and Design C, single replicates of a population of five separate donors. For most chemicals and phenotypes, each of the 3 subsets had sufficient Markov chain Monte Carlo samples to achieve convergence. The detailed information on the number of iterations and convergence status for each compound and phenotype in each dataset is compiled in the GitHub repository⁴. Representative results of concentration-response modeling are shown in Figure 2, using the response of decay-rise ratio to cisapride monohydrate and disopyramide phosphate treatment as examples. Comparing the concentration-response patterns among the 3 subsamples, we found that under some subsamples the concentration-response curves showed no QT prolongation response. This finding suggested that different study designs may significantly affect the observed concentration-response relationship. The resulting PODs from these analyses were used for building and evaluating the logistic regression model for proarrhythmic potential.

Modeling of drug proarrhythmic potential based on the PODs of hiPSC-CM effects
We utilized the PODs for the 5 phenotypes from each subsample of Dataset I as predictors for logistic regression (Fig. S1, S2¹, and Fig. 3, respectively, for Design A, B, and C; fitted parameters in Tab. S5¹). Figure 3A presents results for Design C for the relationship of linear predictor and logistic probability, showing a separation, though with overlap, between negative and positive drugs. Figure 3B shows for Design C the relationship between the threshold for defining "positive" or "negative" substances and the resulting sensitivity, specificity, true accuracy, and geometric mean accuracy. As the threshold varies, so does the trade-off between sensitivity and specificity; due to the imbalance between positives and negatives in the training dataset, we used the maximal geometric mean accuracy (geometric mean of sensitivity and specificity) to determine the optimized threshold (0.27 for Design C). Figure 3C shows the ROC curve for Design C, with the value of AUC and the point corresponding to the optimized threshold. Finally, Figure 3D shows the actual and Design C-predicted probability of TdP for individual drugs.
To evaluate the optimized logistic regression model among Designs A to C, we also implemented internal cross-validation using the "leave-one-out" approach and external validation using Dataset II, and calculated their sensitivity, specificity, and AUC of ROC. The detailed results of internal cross-validation and external validation of Designs A to C are presented in Table S6, Figure S3 and S4¹. Figure 4 summarizes the sensitivity, specificity, and AUC of ROC of each model, along with the analogous results from internal cross-validation and external validation. For comparison, the available evaluation metrics from the model developed by Blinova et al. (2018) are also included. Design C had the highest sensitivity across all results, with a value of 0.89 for the model overall, 0.83 based on internal cross-validation, and 0.79 based on external validation. In terms of specificity, Design A performed the best overall (0.95) and under external validation (0.89), but Design C performed the best under internal cross-validation (0.84). Finally, for the AUC of ROC, Designs A and C performed almost equally well and were highest overall and under internal cross-validation, while Design C was slightly higher (0.83 versus 0.78) under external validation. Designs A and C performed similarly to or better than the previously published model by Blinova et al. (2018). Because for environmental chemicals there is greater tolerance for false positives than for false negatives, we decided to apply Design C, which was based on a population of hiPSC-CMs from 5 donors, for making screening predictions for proarrhythmic potential. The distributions of PODs for the 5 functional phenotypes in the training (drugs in Dataset I) and test (non-drugs in Dataset I and all chemicals in Dataset II) sets are presented in Figure S5¹. The distributions of PODs for the training set are generally wider than the distributions for the test set, suggesting that the test set data are within the applicability domain of the model. The results of the sensitivity analysis using down-sampling to understand the effect of dataset imbalance are shown in Figure S6¹ (for Design C only). The results show that data imbalance does not affect the sensitivity, specificity, and accuracy of predicting proarrhythmic potential, but the AUC of ROC of the model was slightly lower when the full (imbalanced) dataset was used to construct it.

to have potential to cause TdP. Chemicals with positive proarrhythmic potential mainly belong to the pesticide, microbiocide, herbicide, and chemical intermediate classes. Since the model prediction of proarrhythmic potential is based on hazard alone, we further characterized risk for those environmental chemicals predicted to be positive for TdP using the MOE approach, comparing the most sensitive POD with predicted exposure from ExpoCast, converted to concentration units using IVIVE. Figure 6 shows the resulting MOEs based on each chemical's median (right-most point) and 95th percentile upper limit (left-most point) prediction for population median exposure. The most sensitive endpoint for the largest share of chemicals (48 out of 105) was QT prolongation (increasing decay-rise ratio), followed by negative chronotropy (decreasing peak frequency).
The lower confidence bounds for the MOEs of eleven chemicals were lower than 1, suggesting that these chemicals are of particular concern for TdP (four chemical intermediates (ammonium perfluorooctanoate, 2,5-di-tert-butylbenzene-1,4-diol, 2-anisidine and caprolactam), a dye (rhodamine 6G), three food flavor fragrances (coumarin, benzophenone and nicotinic acid), an herbicide (dinoseb), a microbiocide (cyazofamid), a pesticide (hexachlorophene), and a surfactant (1-dodecyl-2-pyrrolidinone)) (Fig. 6). For 35 additional chemicals, the lower confidence bounds on the MOE were between 1 and 100, meaning there is potential concern as to TdP (several chemical intermediates (2,6-diethylaniline, 1,3-diphenylguanidine, tetrabromophenolphthalein ethyl ester, 1,5-naphthalenediamine, 2,4-dinitrotoluene, 3,3'-dimethylbenzidine, 2,4-di-tert-butylphenol and 2,2'-methylenebis(4-methyl-6-tert-butylphenol)), one dye (basic blue 7), one flame retardant (triphenyl phosphate), one food flavor fragrance (gen-
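The two screening steps described in this paper, choosing the classification threshold that maximizes the geometric mean of sensitivity and specificity, and characterizing predicted positives by their margin of exposure, can be sketched as below. This is a minimal illustration in Python rather than the authors' R code. The function names, the convention that a probability at or above the threshold counts as positive, the handling of the boundary values 1 and 100, and all numbers in the toy example are assumptions; the MOE definition (lowest POD divided by Css) and the three concern bands follow the text.

```python
import math


def sens_spec(probs, labels, threshold):
    """Sensitivity and specificity when prob >= threshold is called positive.

    Assumes `labels` contains at least one 1 and at least one 0.
    """
    pairs = list(zip(probs, labels))
    tp = sum(1 for p, y in pairs if p >= threshold and y == 1)
    fn = sum(1 for p, y in pairs if p < threshold and y == 1)
    tn = sum(1 for p, y in pairs if p < threshold and y == 0)
    fp = sum(1 for p, y in pairs if p >= threshold and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(probs, labels):
    """Pick the candidate threshold maximizing sqrt(sensitivity * specificity).

    Candidates are the observed predicted probabilities; ties resolve to the
    lowest candidate because the candidates are scanned in ascending order.
    """
    def gmean(t):
        se, sp = sens_spec(probs, labels, t)
        return math.sqrt(se * sp)

    return max(sorted(set(probs)), key=gmean)


def margin_of_exposure(pods_uM, css_uM):
    """MOE = most sensitive (lowest) POD divided by steady-state plasma conc."""
    return min(pods_uM) / css_uM


def moe_category(moe):
    """Map an MOE to the three concern bands used in the text."""
    if moe < 1:
        return "likely concern"
    if moe <= 100:
        return "potential concern"
    return "no concern"


# Toy example with made-up predicted probabilities and known labels.
toy_probs = [0.1, 0.2, 0.3, 0.6, 0.8]
toy_labels = [0, 0, 1, 0, 1]
toy_threshold = best_threshold(toy_probs, toy_labels)

# Toy chemical with made-up PODs (uM) for three phenotypes and a made-up Css.
toy_moe = margin_of_exposure([12.0, 3.0, 40.0], css_uM=0.5)
toy_band = moe_category(toy_moe)
```

Maximizing the geometric mean rather than raw accuracy keeps the threshold from drifting toward the majority (negative) class, which is why it suits a training set with roughly twice as many negatives as positives.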
such methods to the tens of thousands of environmental chemicals to which people are exposed (Guyton et al., 2009).
To address these limitations and increase the throughput of environmental chemical cardiotoxicity testing for proarrhythmic potential, our study proposes an in vitro-in silico approach combining high-throughput hiPSC-CM testing, PODs from Bayesian concentration-response modeling, and logistic regression modeling to predict proarrhythmic potential.Our results demonstrate that this approach can predict proarrhythmic potential with accuracy and precision comparable to other, lower-throughput testing methods.Additionally, we found that the accuracy of the logistic regression model can be improved by using concentration-response data from a population of donors, as compared to using data from single or multiple replicates of a single donor.
Our approach has several important strengths and advantages over existing models.The first advantage of our study resulted from the technology of recording the electrophysiological pattern of cardiomyocytes.Our screening approach uses only Ca 2+ flux and high-content imaging of unpaced cardiomyocytes, which are less resource-intensive than the methods of voltage-sensing dyes, multi-electrode arrays, and pacing used by previous studies.The other advantage is that our study used automated peak parameter analysis and Bayesian concentration-response modeling to analyze high-throughput data and obtain PODs as predictors, rather than using manual counting of arrhythmia-like events by instrument operators.The tradeoff, however, is that increasing accuracy and precision to levels comparable to existing models requires screening hiPSC-CMs from multiple donors, as there are more false negatives when testing only a single donor.
Our in vitro-in silico model for proarrhythmic potentials predicted a total of 156 out of the 953 environmental chemicals tested to have potential to cause TdP.Among different classes of environmental chemicals, we found that pesticides, herbicides, and microbiocides have more chemicals predicted to be positive for TdP.In comparison, Krishna et al. (2021) used Tox21/ToxCast highthroughput screening data, none of them using cardiomyocytes, to predict cardiotoxicity.A strength of the study by Krishna et al. (2021) is that they analyzed a broad range of chemicals, including pharmaceutical and environmental compounds, at the same time, and examined multiple endpoints for cardiotoxicity.However, the Tox21/ToxCast data were not originally designed with cardiotoxicity in mind, and the biological relevance of those assays to proarrhythmic potential is not clear.Our study, on the other hand, shows the feasibility of high-throughput screening for proarrhythmic potential in hundreds if not thousands of chemicals, similar to the capacity of Tox21/ToxCast, but utilizing the more biologically relevant model of iPSC-derived cardiomyocytes.
Our approach has a number of limitations that require additional research to address. First, the drug datasets used to calibrate the model were imbalanced, with many more negative compounds than positive compounds. Although the use of geometric mean accuracy for determining the classification threshold could partially address this issue and the sensitivity analysis (Fig. S6) demonstrated that the imbalanced dataset would not significantly affect model performance, future studies of hiPSC-CM testing could test a wider range of compounds with positive TdP effect, perhaps

Reproducibility evaluation
A total of 87 compounds were tested in both datasets, including 33 drugs (9 drugs labeled positive for TdP in CredibleMeds® and 24 drugs labeled negative) and 54 environmental chemicals. This replication enabled us to evaluate the reproducibility of our determinations of proarrhythmic potential. Table S7 summarizes the number of drugs and environmental chemicals in the different combinations of risk prediction in the two datasets, as well as the percentages of consistency for predicting the proarrhythmic potential. For the category of drugs positive for TdP, 8 out of 9 drugs were predicted correctly based on both datasets, with a consistency of 89%; for drugs negative for TdP, the consistency was 71%. For environmental chemicals, the consistency was 78%. Additionally, for the 12 pairs of intra-dataset replicates, the consistency was 75%. These values are similar to the range of evaluation metrics for sensitivity, specificity, and AUC of the ROC curve, suggesting that our proposed approach of integrating in vitro hiPSC-CM testing, Bayesian population concentration-response, and logistic modeling possesses good reproducibility.
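The consistency percentages above can be illustrated with a minimal sketch. It assumes that consistency is the share of compounds receiving the same call in both datasets; the function name and the example calls are hypothetical and are not taken from Table S7.

```python
def percent_consistency(calls_a, calls_b):
    """Percentage of compounds with the same positive/negative call in two datasets.

    calls_a, calls_b: equal-length lists of booleans (True = predicted positive).
    This definition is an assumption made for illustration.
    """
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("inputs must be non-empty and of equal length")
    agree = sum(1 for a, b in zip(calls_a, calls_b) if a == b)
    return 100.0 * agree / len(calls_a)


# Hypothetical calls for 9 TdP-positive drugs: 8 agree, 1 disagrees.
dataset_1 = [True] * 9
dataset_2 = [True] * 8 + [False]
consistency = percent_consistency(dataset_1, dataset_2)
print(f"{consistency:.0f}%")
```

With 8 of 9 calls in agreement, the sketch reproduces the 89% figure reported for TdP-positive drugs.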

Discussion
Development of effective in vitro approaches for cardiac safety assessment, especially for the potentially fatal arrhythmia TdP, has been a crucial need in drug development. The International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) published the ICH S7B/E14 concept paper, which is intended to standardize the development of methods including in vitro human primary and hiPSC cardiomyocyte testing, multi-ion channel assays, and in silico computational models (ICH, 2018; Pognan et al., 2023). However, results of these preclinical tests for drugs are ultimately verified (or not) in in vivo clinical trials. For environmental chemicals, not only is there rarely sufficient cardiotoxicity information, but human data can only be obtained after the fact in observational studies. Daley et al. (2023) recently reviewed how the integration of in vitro and in silico approaches can potentially provide promising methods for cardiac safety assessment of environmental chemicals.
Recent approaches have focused on hazard and risk based on identifying the most sensitive in vitro phenotype, including both functional phenotypes such as QT prolongation as well as viability phenotypes. However, for drugs, there has been a concern that QT prolongation may not be adequately specific for identifying severe risks, such as TdP. To date, however, the combination of hiPSC-CM testing and computational modeling to examine the specific issue of proarrhythmic potential has been confined to drugs (Blinova et al., 2018; Patel et al., 2019). These studies are challenging to translate to high-throughput screening of environmental chemicals because they require low- or medium-throughput testing platforms as well as manual categorization of the electrophysiological traces from hiPSC-CM testing (Blinova et al., 2018; Patel et al., 2019). While these approaches are feasible for pharmaceutical testing where the drug candidates are often relatively few (no more than a dozen or so), it is not feasible to apply

biological and experimental variability, applying a classification model built from one experiment to the results of another experiment will inevitably degrade performance. Fortunately, the high-throughput nature of our suggested experimental design means that it should be feasible to include enough positive and negative controls to build a robust model.
In conclusion, we have demonstrated how high-throughput, population-based hiPSC-CM testing can be integrated with Bayesian concentration-response modeling and a logistic classification model to predict proarrhythmic potential. This approach possesses comparable performance in terms of accuracy and sensitivity to previous studies, but, unlike previous studies, utilizes an in vitro-in silico design that can be implemented in high-throughput screening with fully computational data analysis. Thus, we expect that developing a large-scale screening program using this approach, along with improved predictions for exposure and IVIVE, will fill a critical data gap in the characterization of proarrhythmic risks for environmental chemicals.
drawing on the QT/TdP drug list published on CredibleMeds® for chemical selection. Also, there is a lack of systematic validation data for TdP for environmental chemicals, so the logistic regression model could only be built based on drug data, which occupy a narrower range of chemical space. Thus, this limitation may introduce unquantified uncertainties when applying the model to predict the proarrhythmic potential of environmental chemicals.
As for risk characterization for drugs, comparisons of in vitro PODs with the therapeutic dose/concentration have been reported previously (Blanchette et al., 2020; Burnett et al., 2021a). Specifically, these studies calculated the margins of safety based on the maximum blood concentration of a drug when administered as intended (Cmax) and the in vitro POD of the given drug's critical endpoint (the lowest value of POD) (Blanchette et al., 2020; Burnett et al., 2021a). As expected, many cardiotoxic drugs had Cmax values that overlapped with the distribution of in vitro PODs, particularly CiPA and other high-TdP-risk drugs. Quantitatively, Blanchette et al. (2019) previously showed that the in vitro concentration-response for QT prolongation is consistent with in vivo data.
For environmental chemicals, even though we also calculate the MOE for risk characterization based on the POD of the most sensitive endpoint, the exposure estimates tend to be highly uncertain. Additionally, from a regulatory point of view, it is worth examining whether existing toxicity values are protective of cardiotoxicity. We sourced reference doses (RfDs) from EPA's Regional Screening Levels (RSLs) table, converted them to in vivo Css using the IVIVE approach, and then compared these values to the in vitro POD of the most sensitive endpoint. As shown in Figure S7A, most of the Css values associated with RfDs are lower than the in vitro PODs, but there are still a few chemicals (mostly pesticides) for which the RfD may not be fully protective of cardiotoxicity. We also used the predicted exposure from ExpoCast to calculate two hazard quotients (HQs): (i) one using predicted exposure and the RfD, and (ii) one using predicted exposure converted to Css and the in vitro POD of the most sensitive endpoint, and then compared them to examine the consistency of risk characterization. Figure S7B shows that the two HQs are highly correlated for most chemicals and that the HQs for most of the chemicals whose RfDs are not fully protective of cardiotoxicity are lower than 1, indicating lower concern. Two exceptions are hexachlorophene and isopropalin, for which the HQ based on cardiotoxicity was greater than 1; these chemicals may therefore warrant additional investigation of their exposures and risks.
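The two hazard quotients and the RfD protectiveness check described above can be sketched as follows. This is an illustration only: it assumes a linear IVIVE in which Css scales proportionally with dose through a chemical-specific factor, and all function names and numeric values are hypothetical rather than taken from the study.

```python
def css_from_dose(dose_mg_kg_day, css_per_unit_dose_uM):
    """Steady-state concentration (uM) for a given oral dose.

    Assumes Css scales linearly with dose; css_per_unit_dose_uM is the Css
    produced by 1 mg/kg/day (a hypothetical IVIVE output).
    """
    return dose_mg_kg_day * css_per_unit_dose_uM


def hq_from_rfd(exposure_mg_kg_day, rfd_mg_kg_day):
    """HQ (i): predicted exposure divided by the reference dose."""
    return exposure_mg_kg_day / rfd_mg_kg_day


def hq_from_pod(exposure_mg_kg_day, css_per_unit_dose_uM, pod_uM):
    """HQ (ii): exposure converted to Css, divided by the in vitro POD."""
    return css_from_dose(exposure_mg_kg_day, css_per_unit_dose_uM) / pod_uM


def rfd_is_protective(rfd_mg_kg_day, css_per_unit_dose_uM, pod_uM):
    """True when the Css associated with the RfD is below the in vitro POD."""
    return css_from_dose(rfd_mg_kg_day, css_per_unit_dose_uM) < pod_uM


# Hypothetical chemical: values chosen only to exercise the functions.
exposure, rfd, css_factor, pod = 0.001, 0.01, 2.0, 0.05
print(hq_from_rfd(exposure, rfd))
print(hq_from_pod(exposure, css_factor, pod))
print(rfd_is_protective(rfd, css_factor, pod))
```

An HQ above 1 under either definition would flag the chemical for closer examination, mirroring the comparison shown in Figure S7B.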
Our results suggest a number of considerations when designing future high-throughput screening experiments for assessing proarrhythmic potential. First, in order to achieve at least ~80% sensitivity, it appears that screening needs to be conducted using multiple donors, as opposed to a single donor as is commonly the case. Second, our results for internal cross-validation vs. external validation suggest that the optimal design would be to include a large number of positive and negative controls during screening in order to build an experiment-specific classification model. Due to

Fig. 2: Concentration-response modeling of representative chemicals for a given phenotype using QT prolongation (i.e., increasing decay-rise ratio) as example, based on three study designs with (Design A) single replicate of a single donor, (Design B) five replicates of a single donor, and (Design C) single replicates of a population of five separate donors. The gray line represents the median estimate of concentration-response. The different colors of points represent different donors.

Fig. 3: Logistic regression model calibration based on the study design with single replicates of a population of five separate donors (Design C). (A) Results of curve fitting, (B) relationships among thresholds, sensitivity, specificity, true and geometric accuracy, (C) ROC curve, and (D) predicted probability of positive proarrhythmic potential for drugs in Dataset I with the threshold (red dashed line) determined by the maximum geometric accuracy. The red points represent the drugs with a positive QT/TdP risk category based on the CredibleMeds® database, and the blue points represent the drugs considered negative. The solid points represent correct predictions, and the open points represent incorrect predictions.
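The threshold selection shown in panels (B) and (D) can be sketched as below. The sketch assumes that geometric accuracy is the geometric mean of sensitivity and specificity, which is the usual definition for imbalanced data; the function names and the example probabilities and labels are hypothetical.

```python
import math


def geometric_accuracy(probs, labels, threshold):
    """Geometric mean of sensitivity and specificity at a given threshold.

    probs: predicted probabilities of positive proarrhythmic potential.
    labels: 1 for TdP-positive drugs, 0 for TdP-negative drugs.
    A compound is called positive when its probability is >= threshold.
    """
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= threshold)
    fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < threshold)
    tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < threshold)
    fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= threshold)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def best_threshold(probs, labels):
    """Candidate threshold (taken from the observed probabilities) with the
    highest geometric accuracy; ties resolve to the lowest candidate."""
    candidates = sorted(set(probs))
    return max(candidates, key=lambda t: geometric_accuracy(probs, labels, t))


# Hypothetical imbalanced calibration set: 3 positives, 4 negatives.
probs = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1, 0.05]
labels = [1, 1, 1, 0, 0, 0, 0]
threshold = best_threshold(probs, labels)
print(threshold, geometric_accuracy(probs, labels, threshold))
```

Because the geometric mean drops to zero when either sensitivity or specificity is zero, it penalizes thresholds that simply favor the majority (negative) class, which is why it suits an imbalanced calibration set.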

Fig. 4: Bar charts comparing the sensitivity, specificity, and AUC of ROC among previous studies (Blinova et al., 2018) and the three study designs (Design A, single replicate of single donor; Design B, five replicates of single donor; Design C, single replicates of five donors) predicting proarrhythmic potential in the training dataset (Dataset I), internal cross-validation (Dataset I), and external validation (Dataset II).

Fig. 5: Bar charts summarizing the number of environmental chemicals (A: Dataset I, B: Dataset II), by chemical class, predicted to be positive or negative for proarrhythmic potential. The numbers above bars represent the number of environmental chemicals predicted to be positive for proarrhythmic potential out of the total number of each chemical class.

Fig. 6: Margin of exposure (MOE) for chemicals predicted to have proarrhythmic potential under Design C, based on the most sensitive cardio phenotype and median and upper-confidence bound exposure estimates. Chemicals are separated by chemical class and ordered from lowest to highest MOE in each class. The detailed information of chemical name, CAS number, and classes is listed in Table S4. The color band gradients show the regions where exposure is considered "unsafe" (MOE < 1), of "potential concern" (MOE between 1 and 100), and "safe" (MOE > 100), from darker to lighter yellow shades.
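The MOE bands in this figure can be expressed as a short sketch. It assumes the MOE is the in vitro POD divided by the exposure expressed as a steady-state concentration in the same units; the function names are hypothetical, and assigning the boundary values 1 and 100 to the "potential concern" band is a choice made here, not one stated in the caption.

```python
def margin_of_exposure(pod_uM, exposure_css_uM):
    """MOE as the ratio of the in vitro POD to the exposure-derived Css (same units)."""
    if exposure_css_uM <= 0:
        raise ValueError("exposure must be positive")
    return pod_uM / exposure_css_uM


def moe_band(moe):
    """Map an MOE to the three bands used in the figure.

    Boundary values (exactly 1 or exactly 100) are placed in
    "potential concern" as an assumption.
    """
    if moe < 1:
        return "unsafe"
    if moe <= 100:
        return "potential concern"
    return "safe"


# Hypothetical POD and exposure pairs (uM), for illustration only.
for pod, css in [(1.0, 2.0), (5.0, 0.5), (10.0, 0.05)]:
    moe = margin_of_exposure(pod, css)
    print(f"MOE = {moe:g}: {moe_band(moe)}")
```

Running the sketch with both the median and the upper-confidence bound exposure for a chemical shows whether uncertainty in exposure alone is enough to move it between bands.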