Application of machine-learning to predict early spontaneous preterm birth among nulliparous non-Hispanic black and white women


Accepted Manuscript

Ann Weber, Gary L. Darmstadt, Susan Gruber, Megan E. Foeller, Suzan L. Carmichael, David K. Stevenson, Gary M. Shaw

PII: S1047-2797(18)30494-0

DOI: 10.1016/j.annepidem.2018.08.008

Reference: AEP 8495

To appear in: Annals of Epidemiology

Received Date: 26 May 2018

Revised Date: 11 August 2018

Accepted Date: 20 August 2018

Please cite this article as: Weber A, Darmstadt GL, Gruber S, Foeller ME, Carmichael SL, Stevenson DK, Shaw GM, Application of machine-learning to predict early spontaneous preterm birth among nulliparous non-Hispanic black and white women, Annals of Epidemiology (2018), doi: 10.1016/j.annepidem.2018.08.008. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Ann Weber^a, Gary L. Darmstadt^a, Susan Gruber^b, Megan E. Foeller^c, Suzan L. Carmichael^a, David K. Stevenson^a, and Gary M. Shaw^a

^a March of Dimes Prematurity Center, Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305

^b Putnam Data Sciences, Cambridge, MA 02139

^c Department of Obstetrics and Gynecology, Stanford University School of Medicine, Stanford, CA 94305

For correspondence: Ann Weber

Department of Pediatrics

Division of Neonatal and Developmental Medicine

Stanford University, School of Medicine

Medical School Office Building

1265 Welch Road, X1C21

Stanford, CA 94305

Tel: 650-724-1322

Email: [email protected]


Abstract

Purpose: Spontaneous preterm birth is a leading cause of perinatal mortality in the U.S., occurring disproportionately among non-Hispanic black women compared to other race-ethnicities. Clinicians lack tools to identify first-time mothers at risk for spontaneous preterm birth. This study assessed prediction of early (<32 weeks) spontaneous preterm birth among non-Hispanic black and white women by applying state-of-the-art machine-learning to multilevel data from a large birth cohort.

Methods: Data from birth certificate and hospital discharge records for 336,214 singleton births to nulliparous women in California from 2007-2011 were used in cross-validated regressions, with multiple imputation for missing covariate data. Residential census tract information was overlaid for 281,733 births. Prediction was assessed with areas under the receiver operator characteristic curves (AUCs).

Results: Cross-validated AUCs were low (0.62 [min = 0.60, max = 0.63] for non-Hispanic blacks and 0.63 [min = 0.61, max = 0.65] for non-Hispanic whites). Combining racial-ethnic groups improved prediction (cross-validated AUC = 0.67 [min = 0.65, max = 0.68]), approaching what others have achieved using biomarkers. Census tract-level information did not improve prediction.

Conclusions: The resolution of administrative data was inadequate to precisely predict individual risk for early spontaneous preterm birth despite the use of advanced statistical methods.

Key words: machine learning; prediction; spontaneous preterm birth; race-ethnicity; disparity


List of abbreviations:

Preterm birth (PTB)

Spontaneous preterm birth (sPTB)

Non-Hispanic (NH)

Cross-validation/cross-validated (cv)

Receiver operating characteristics (ROC)

Area under the receiver operating characteristics curve (AUC)

Predictive value positive (PVP)

Predictive value negative (PVN)

SuperLearner (SL)


Introduction

Spontaneous preterm birth (sPTB) is a leading cause of perinatal mortality and child morbidity globally, [1] occurring disproportionately in the U.S. among non-Hispanic (NH) black women (16.3%) as compared to NH white women (10.2%). [2] Although prior preterm birth is recognized as a strong predictor for future preterm birth among multiparous women (women having at least one prior birth), [3] few indicators of risk for preterm birth exist in nulliparous women (women without a prior birth). A non-invasive screening measure to quantify the risk for sPTB among nulliparous women at the onset of pregnancy could help guide clinical decisions regarding proper pregnancy surveillance and management.

Maternal demographic [4,5] and health characteristics [6] early in pregnancy have lacked power to predict individual risk despite their statistical association with PTB at a population level. The screening performance of traditional regression algorithms that include these factors is only a little better than chance, with an area under the receiver operating characteristic curve (AUC) on the order of 0.61 to 0.63 for predicting early sPTB (< 32 weeks). [4] The addition of maternal biomarkers (e.g., cervical length, fetal fibronectin, and maternal serum analytes) improves prediction, but only marginally (e.g., AUCs of 0.67 and 0.70). [7,8] The application of machine-learning techniques may uncover more sensitive and specific screening algorithms than is possible with traditional methods.

It is becoming increasingly common to use machine-learning algorithms to learn from “big data” for more reliable predictions. Many popular machine-learning algorithms relax the modeling assumptions and restrictions of traditional regression (e.g., that factors are linearly associated with the risk of PTB) by including semi- and non-parametric models (e.g., non-linear models or decision trees) and parametric models with multiple higher-order terms (e.g., squared terms or two-way interactions). Disparate machine-learning algorithms are easily compared using cross-validation, a technique used to assess how well a prediction model performs outside of the sample with which the model was fit. The original sample is partitioned into a training set to fit the model and a testing set to evaluate the goodness of the fit. In this way, the generalizability of results is assessed while minimizing the risk of over-fitting the model to the available data.

In this paper, we used machine-learning methods to explore how well a set of demographic and maternal characteristics that could be known at the first prenatal visit could predict early sPTB and whether prediction differed for NH black and white women. We restricted the analysis to nulliparous women, for whom the etiologies of early sPTB are perhaps the least well understood. [9] We assessed whether prediction could be improved by overlaying information about residential census tract, including indicators of social disadvantage (e.g., poverty) [10] and of air, water, and soil pollutants (from the CalEnviroScreen 2.0). [11] Finally, we tested for statistically significant differences in the magnitudes of association of factors with early sPTB for NH black vs. white women that might help explain the black-white disparity in prevalence of early sPTB.

Methods

Study population

The study’s source population consisted of ~2.7 million singleton births in the state of California from 2007-2011 with individual-level birth records obtained from Vital Statistics for the State of California. Birth record data were linked with hospital discharge ICD-9 codes at the time of delivery from the Office of Statewide Health Planning and Development (OSHPD). The algorithm used to assemble these data is accurate and has been described previously. [12] Just under one million births were to nulliparous women, 2.5% of whom were dropped due to missing gestational age, race-ethnicity, or indicator for spontaneous birth (as opposed to medically indicated birth). Because we were interested in the black-white disparity in preterm birth, we included only women who self-identified as either NH black or white (n=356,269). We further restricted our sample to births that were either early sPTB (≥ 20 and < 32 weeks, preceded by spontaneous onset of labor [ICD-9-CM code 644] or rupture of membranes [ICD-9-CM code 658.1 or birth certificate complication of labor/delivery code 10]) or term deliveries (≥ 37 weeks). The final analytic sample included 336,214 singleton births to nulliparous NH black (n=54,084) and NH white (n=282,130) women.
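The outcome definition above can be sketched as a small classification rule. This is a minimal illustration under stated assumptions, not the authors' code: the function name `classify_birth` is hypothetical, and the boolean `spontaneous_onset` stands in for the presence of any of the spontaneous labor or membrane rupture indicators named in the text.

```python
from typing import Optional


def classify_birth(gest_weeks: float, spontaneous_onset: bool) -> Optional[str]:
    """Return 'early_sPTB', 'term', or None (excluded from the analytic sample).

    Hypothetical helper: `spontaneous_onset` is assumed to be True when the record
    carries any of the spontaneous labor / membrane rupture indicators.
    """
    if 20 <= gest_weeks < 32 and spontaneous_onset:
        return "early_sPTB"
    if gest_weeks >= 37:
        return "term"
    return None  # e.g. 32-36 weeks, or early preterm birth without spontaneous onset


# Illustrative records: (gestational age in weeks, spontaneous onset flag)
births = [(30, True), (39, False), (34, True), (28, False)]
labels = [classify_birth(w, s) for w, s in births]
```

Under this rule, births between 32 and 36 weeks and medically indicated early preterm births fall out of the comparison, which is why the analytic sample is smaller than the 356,269 NH black and white births.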

Predictors

Candidate predictor variables were selected in a multi-step process. First, we started with over 1,000 demographic and maternal characteristic variables obtained from linked hospital discharge records and birth certificates. Indicators for pre-existing medical conditions, including hypertension, diabetes mellitus, thyroid dysfunction and asthma, were derived from ICD-9 codes (rolled up to the 4th digit) and birth certificates. In a second step, the variables were grouped as demographic or clinical indicators for the mother or for the baby. Next, we retained all demographic factors and those clinical indicators for the mother that could be known at first prenatal visit, including those considered important predictors of preterm birth by a group of co-author obstetricians, pediatricians and epidemiologists who study the etiologies of preterm birth and how to prevent it. Factors for which the diagnosis was uncertain (e.g., the timing of anemia) or the coding was ambiguous (e.g., “other complications”) were excluded as interpretation would be uncertain. Finally, we collapsed highly correlated variables (r > 0.8) and excluded variables with no variation. The final factors included in the analyses are shown in Table 1.
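The final screening step can be illustrated with a short sketch. It is an assumption-laden simplification: the text says correlated variables were "collapsed", whereas this example simply keeps the first variable of any pair with |r| > 0.8 and drops the other. The names `pearson`, `screen_predictors`, and the toy columns are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def screen_predictors(columns, threshold=0.8):
    """Drop constant columns, then drop the later member of any pair with |r| > threshold."""
    varying = {k: v for k, v in columns.items() if len(set(v)) > 1}
    kept = []
    for name in varying:
        if all(abs(pearson(varying[name], varying[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data (hypothetical variable names and values)
data = {
    "age": [22, 25, 31, 28, 35, 40],
    "age_months": [264, 300, 372, 336, 420, 480],  # perfectly correlated with age
    "bmi": [30, 22, 27, 21, 29, 24],
    "singleton": [1, 1, 1, 1, 1, 1],               # no variation
}
kept = screen_predictors(data)
```

Here `age_months` is removed as redundant with `age`, and `singleton` is removed because it never varies.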

Maternal residence based on street address at the time of birth was obtained from birth certificates. Street addresses were geocoded using SAS software in combination with the Census Bureau’s Topologically Integrated Geographic Encoding and Referencing (TIGER) shape and lookup files [13] to obtain mothers’ residential census tracts. For 84% of women (n=281,733) with non-missing residential data, we overlaid census tract-level information (from the 2010 census and 2007-2011 American Community Survey), [10] as well as environmental indicators of air, water, and soil pollution (from the CalEnviroScreen 2.0). [11] In our analysis, we used percentile scores of the CalEnviroScreen indicators, which are rank-ordered pollution scores (ordered from highest to lowest) for the entire state.

Statistical Analyses

To avoid omitting observations with incomplete data, we replaced missing values of the predictor variables with a set of plausible values using multiple imputation (i.e., Multiple Imputation by Chained Equations or MICE). [14] We created multiple imputed datasets to account for the statistical uncertainty in predicting the missing values (in contrast to single imputation, which does not and leads to underestimating the true variance). The multiple datasets were analyzed separately and the results combined for inference. Due to the computational burden of imputation with a very large dataset, only three imputations were performed.

An approach known as super learning (SL) [15] was used to evaluate the ability of different machine-learning algorithms to discriminate between early sPTB and full-term births. Separate prediction models were developed using data on NH blacks only, NH whites only, both groups with (and without) race-ethnicity interacted with all other factors, and the subset of both groups with non-missing census-tract data. We used SL to compare the performance of ordinary logistic regression, random forest, [16] k-nearest neighbors, [17] generalized additive models (GAM), [18] lasso regression, ridge regression, and an elastic net with mixing parameter 0.5. [19] The last three are variants of penalized logistic regression modeling, so named because they add a “penalty” parameter to reduce the variance that can occur from including collinear variables in the model, but at a cost of added bias. These models effectively perform covariate selection, and can consider pre-specified interaction terms. GAM can discover non-linear relationships between predictors and outcome. Random forest and k-nearest neighbors are non-parametric classification algorithms, which can be used to predict to which of a set of categories or classes an observation belongs.
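For reference, the three penalized models can be written with a single objective. The form below is the one used by the glmnet software cited in [19]; that the fits here follow exactly this parameterization is an assumption, and the symbols are introduced only for illustration ($\ell$ is the binomial log-likelihood, $\lambda \ge 0$ the penalty strength, $\alpha$ the mixing parameter).

```latex
\[
(\hat\beta_0, \hat\beta) \;=\; \arg\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\,\ell(\beta_0, \beta)
  \;+\; \lambda \left[ \frac{1-\alpha}{2}\,\lVert \beta \rVert_2^2
  \;+\; \alpha\,\lVert \beta \rVert_1 \right]
\]
% alpha = 1   : lasso (L1 penalty only)
% alpha = 0   : ridge (L2 penalty only)
% alpha = 0.5 : the elastic net described in the text
```

Setting $\lambda = 0$ recovers ordinary logistic regression, which is one way to see why the penalized fits trade a small bias for reduced variance.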


SL used five-fold cross-validation (cv) to estimate the area under the receiver operating characteristics (ROC) curve (cv-AUC) for each algorithm. For the 5-fold cross-validation, data were split into 5 parts of equal size; the model was fitted to 4 of the parts, and error was estimated by predicting the remaining part. This procedure was repeated 5 times, so that each part served once as the held-out set. CV-AUC provides a reliable measure of the predictive power of the model applied to novel data on a similar study population.
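The cross-validation procedure can be sketched in a few lines. This is an illustrative sketch only: it is not the SuperLearner implementation, the one-feature "direction" model is a deliberately trivial stand-in for logistic regression, and all names and the synthetic data are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random non-case (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def cv_auc(X, y, fit, predict, k=5, seed=0):
    """Return (mean, min, max) of the AUC over k held-out folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k parts of (near) equal size
    fold_aucs = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores = [predict(model, X[i]) for i in test]
        fold_aucs.append(auc(scores, [y[i] for i in test]))
    return sum(fold_aucs) / k, min(fold_aucs), max(fold_aucs)


def fit(X_train, y_train):
    """Toy stand-in model: learn only the direction of association of one feature."""
    cases = [x[0] for x, t in zip(X_train, y_train) if t == 1]
    others = [x[0] for x, t in zip(X_train, y_train) if t == 0]
    return 1.0 if sum(cases) / len(cases) >= sum(others) / len(others) else -1.0


def predict(direction, x):
    return direction * x[0]


# Synthetic data: cases have a feature value shifted upward by 3
y = [1 if i % 2 == 0 else 0 for i in range(200)]
X = [[i % 10 + (3 if y[i] else 0)] for i in range(200)]
mean_auc, lo, hi = cv_auc(X, y, fit, predict)
```

Reporting the minimum and maximum fold AUC alongside the mean mirrors the "cv-AUC [min, max]" format used in the Results.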

For each model, we calculated the sensitivity, specificity, predictive value positive (PVP), and predictive value negative (PVN) using the optimal cut-point for the models. The optimal cut-point on the ROC was defined as the point closest to the true positive rate of 1 and false positive rate of 0, giving equal weight to sensitivity and specificity. Unless otherwise specified, we report pooled results from the imputed datasets based on Rubin’s rules. [20] Regression coefficients were pooled by taking the average of the coefficient estimates from all imputed datasets. Standard errors of the coefficients were pooled by combining the within-imputation variance and the between-imputation variance.
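The cut-point rule described above can be sketched as follows. This is a minimal illustration on made-up scores, not the authors' code; the function names are hypothetical, and scores at or above the cut-point are assumed to be called positive.

```python
import math


def confusion(scores, labels, cut):
    """Counts (tp, fp, tn, fn) when scores >= cut are called positive."""
    tp = sum(s >= cut and y == 1 for s, y in zip(scores, labels))
    fp = sum(s >= cut and y == 0 for s, y in zip(scores, labels))
    tn = sum(s < cut and y == 0 for s, y in zip(scores, labels))
    fn = sum(s < cut and y == 1 for s, y in zip(scores, labels))
    return tp, fp, tn, fn


def optimal_cut(scores, labels):
    """Cut-point whose ROC point lies closest to (false positive rate 0, true positive rate 1)."""
    best = None
    for cut in sorted(set(scores)):
        tp, fp, tn, fn = confusion(scores, labels, cut)
        sens, spec = tp / (tp + fn), tn / (tn + fp)
        dist = math.hypot(1 - sens, 1 - spec)  # equal weight to both error rates
        if best is None or dist < best[0]:
            best = (dist, cut, sens, spec, tp, fp, tn, fn)
    _, cut, sens, spec, tp, fp, tn, fn = best
    pvp = tp / (tp + fp) if tp + fp else float("nan")
    pvn = tn / (tn + fn) if tn + fn else float("nan")
    return {"cut": cut, "sensitivity": sens, "specificity": spec, "PVP": pvp, "PVN": pvn}


# Made-up predicted probabilities and outcomes (1 = case)
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
labels = [0, 0, 0, 1, 0, 1, 0, 1]
result = optimal_cut(scores, labels)
```

In this toy example the selected cut-point is 0.6, giving sensitivity 2/3 and specificity 0.8.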

Analyses were run in R version 3.3.2 [21] using the SuperLearner package. [22]

Results

The demographic and maternal characteristics included in the models are shown in Table 1 separately for NH black (n=54,084) and NH white (n=282,130) nulliparous women with non-imputed data. Early sPTB occurred 2.6 times more frequently among NH blacks as compared to whites. NH black women were more than twice as likely to have used MediCal for payment of the delivery as white women (54.6% vs. 22.7%). In addition, NH black women had lower educational attainment than white women (14.6% vs. 45.6% had a bachelor’s degree or higher), as did the fathers (10.6% vs. 38.6%). Paternal education and paternal race-ethnicity were the most frequently co-occurring missing data in the sample (see table of missingness in the supplementary materials) and were missing more frequently for NH blacks (25% and 19.3% for paternal education and race-ethnicity, respectively) as compared to whites (7.5% and 5.3%). Maternal pre-pregnancy body mass index was the next most common missing factor (9.2% of NH black and 5.9% of NH white). Other factors were missing for 5% or less of the sample.


Predictive ability of models

In preliminary analyses we found that all algorithms had similar performance, as measured by the cv-AUC (results not shown). Because of this, we present results from ordinary logistic regression. The predictive accuracy of the demographic and maternal characteristics was not sensitive to choice of imputed datasets. Using the second imputed dataset and logistic regression, the cv-AUC for NH black women was 0.62 (min = 0.60, max = 0.63), and for NH white women the cv-AUC was nearly identical at 0.63 (min = 0.61, max = 0.65) (Table 2 and Figure 1).

Combining the women from both groups into a single model with a race-ethnicity indicator improved the cv-AUC to 0.67 (min = 0.65, max = 0.68), indicating race-ethnicity is a clear predictor of PTB (Table 2 and Figure 1). The distribution of predicted probabilities of early sPTB among NH black women was shifted higher compared to those for whites (see distribution plot in the supplementary materials). When we add the two distributions together, we add proportionately more early sPTB cases than we add term births, so the AUC increases. The AUC did not improve further when race-ethnicity was considered simultaneously with the other factors.

Sensitivity, specificity, PVP, and PVN for the above models are shown in Table 2. The highest sensitivity was 0.61, with a specificity of 0.64, for the model with both race-ethnicity groups combined. Due to the low sensitivity and low prevalence of early sPTB (~1%), the PVP was negligible (0.012 – 0.031) and the PVN was high (0.985 – 0.994) for both groups.

Overlaying census tract data


NH black women as compared to white women were residing in census tracts with higher proportions of unmarried mothers, unemployment, and children under 5 years living below the poverty level. NH black women also resided in census tracts with higher rates of emergency department visits for asthma as compared to white women, and in tracts ranking higher in numbers of hazardous waste facilities and concentrations of modeled chemical releases to air. The AUC did not improve (cv-AUC = 0.67 [min = 0.67, max = 0.68]) when adding the census tract information to the prediction model with both groups combined. The sensitivity and specificity were 0.62 and 0.65, respectively.
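To see why predictive values behave as they do at this prevalence, the sensitivity and specificity just reported can be combined with Bayes' rule. The worked example below is an illustration, not a reported result: it assumes a prevalence of exactly 1% (the text gives only "~1%"), and the symbols Se, Sp and p are introduced here for sensitivity, specificity and prevalence.

```latex
\[
\mathrm{PVP} = \frac{\mathrm{Se}\,p}{\mathrm{Se}\,p + (1-\mathrm{Sp})(1-p)}
             = \frac{0.62 \times 0.01}{0.62 \times 0.01 + 0.35 \times 0.99}
             = \frac{0.0062}{0.3527} \approx 0.018
\]
\[
\mathrm{PVN} = \frac{\mathrm{Sp}\,(1-p)}{\mathrm{Sp}\,(1-p) + (1-\mathrm{Se})\,p}
             = \frac{0.65 \times 0.99}{0.65 \times 0.99 + 0.38 \times 0.01}
             = \frac{0.6435}{0.6473} \approx 0.994
\]
```

With so few cases, false positives from the 99% of term births swamp the true positives, so even a moderately specific screen yields a PVP of only a few percent.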


Magnitudes of association by race-ethnicity

To compare the magnitudes of association of factors with early sPTB by race-ethnicity, we graphed the coefficients (in pooled log odds units) from the logistic regression models for NH blacks against those for NH whites (Figure 2). A linear fit of the data points (black dotted line) close to the line of perfect concordance (grey dashed line) was found, indicating that most factors had similar associations with early sPTB in both groups.

The pooled odds ratios (OR) from the prediction models are shown in Table 3 for three models: NH blacks only, NH whites only and both groups combined (without interaction terms). The factors with the strongest positive associations for both groups were: a history of prior stillbirth (in utero fetal death delivered at 20 or more weeks’ gestation), self-pay for delivery (as compared to private insurance coverage), in-vitro fertilization, chronic hypertension, and diabetes mellitus. The factors with the strongest negative (protective) associations for both groups were the mother and father having achieved a bachelor’s degree or higher education (as compared to achieving less than a high school degree). These two sets of factors are circled in Figure 2.

Two factors showed statistically significant differences in association with early sPTB by maternal race-ethnicity based on pooled interaction terms in the combined model: pre-pregnancy smoking and thyroid dysfunction. Pre-pregnancy smoking among NH blacks demonstrated an increased risk for sPTB (pooled Odds Ratio (OR) = 1.43 [95% Confidence Interval (CI) = 0.96, 2.14]) but was associated with reduced risk for NH whites (pooled OR = 0.73 [95% CI = 0.56, 0.94]). The association of thyroid dysfunction with early sPTB was higher for NH blacks (pooled OR = 2.13 [95% CI = 1.39, 3.28]) as compared to NH whites (pooled OR = 1.31 [95% CI = 1.05, 1.63]), with a statistically significant difference between the groups (pooled OR = 1.63 [95% CI = 1.01, 2.64]).
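As a consistency check on these figures, an interaction odds ratio can be read as the ratio of the two group-specific odds ratios. The arithmetic below assumes the stratified and interaction models are close enough for this identity to hold approximately; the subscripts are introduced here for illustration.

```latex
\[
\mathrm{OR}_{\text{interaction}}
  \;\approx\; \frac{\mathrm{OR}_{\text{NH black}}}{\mathrm{OR}_{\text{NH white}}}
  \;=\; \frac{2.13}{1.31} \;\approx\; 1.63
  \qquad \text{(thyroid dysfunction)}
\]
```

The result agrees with the pooled interaction OR of 1.63 reported for thyroid dysfunction.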

In two of the three imputed datasets, MediCal payment for delivery and paternal race-ethnicity showed statistically significant differences by maternal race-ethnicity, but the pooled effect estimates on the interaction terms were not significant at the 5% level.
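Rubin's rules show how a term can be significant in individual imputed datasets yet not after pooling, because the between-imputation variance inflates the pooled standard error. The sketch below uses invented coefficients and standard errors (not values from this study) and a normal approximation with a 1.96 threshold, which is a simplification of the t-based reference distribution normally used with Rubin's rules.

```python
import math


def pool_rubin(estimates, std_errors):
    """Pool m point estimates and standard errors with Rubin's rules."""
    m = len(estimates)
    q_bar = sum(estimates) / m                                     # pooled estimate
    within = sum(se ** 2 for se in std_errors) / m                 # within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)   # between-imputation variance
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)


# Hypothetical log odds ratios for one interaction term in three imputed datasets
log_or = [0.30, 0.36, 0.18]
se = [0.14, 0.14, 0.14]
pooled, pooled_se = pool_rubin(log_or, se)

z_each = [b / s for b, s in zip(log_or, se)]   # per-dataset z statistics
z_pooled = pooled / pooled_se                  # pooled z statistic
```

In this made-up example two of the three per-dataset z statistics exceed 1.96 while the pooled one does not, which is the same pattern described in the paragraph above.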

Discussion

Prediction models using machine-learning methods with population-based cohort data were marginally predictive of early sPTB as compared to term birth, and performed similarly for NH black and NH white women in California. Predictive power improved when the two race-ethnicity groups were combined, such that the AUC values surpassed those reported by others [4] and approached those combining maternal characteristics with biological markers (e.g., serum analytes). [7,8] Census tract-level socio-economic and pollution indicators did not improve the AUC over and above individual-level factors.
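The gain in AUC from combining the two groups can be reproduced with a toy example in which neither group's scores discriminate at all on their own. The numbers are invented purely for illustration: group B's scores are shifted upward and group B contributes proportionately more cases, mimicking the pattern described for NH black women.

```python
def auc(scores, labels):
    """AUC as the probability that a random case outscores a random non-case (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Group W: 1 case among 5 births; group B: 2 cases among 5 births, scores shifted up
w_scores, w_labels = [1.0, 2.0, 3.0, 4.0, 2.5], [0, 0, 0, 0, 1]
b_scores, b_labels = [3.2, 4.2, 5.2, 4.7, 3.7], [0, 0, 0, 1, 1]

auc_w = auc(w_scores, w_labels)
auc_b = auc(b_scores, b_labels)
auc_pooled = auc(w_scores + b_scores, w_labels + b_labels)
```

Within each toy group the AUC is 0.5, yet the pooled AUC is 12/21 (about 0.57), because group membership itself carries information about risk.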

The magnitude of association between pre-pregnancy smoking and early sPTB was significantly different by race-ethnicity, with an increased risk for NH blacks and a reduced risk for whites. However, the reduced risk for NH whites may be explained by bias similar to the “Birthweight paradox,” in which smoking appears to be protective against infant mortality among low birthweight infants. [23] Here, pre-pregnancy smoking is associated with sPTB as well as other factors (e.g., hypertension), which are also associated with sPTB. Therefore, an unmeasured common cause of sPTB and one of these factors would bias the association between pre-pregnancy smoking and sPTB. In simple bivariate logistic models, pre-pregnancy smoking is a risk for sPTB, for both NH whites and blacks (OR = 1.63 [1.29-2.08] and OR = 1.20 [1.03-1.38] respectively), as we would expect.

The risk of sPTB associated with maternal thyroid dysfunction was also significantly different by race-ethnicity, with an increased risk among NH blacks compared to whites. However, the prevalence of thyroid dysfunction was higher among NH whites (2.55% vs. 0.78% for NH blacks), such that the population attributable risk percent is nearly identical for both groups (0.8%, assuming OR ≈ RR). Therefore, even if exposure from thyroid dysfunction could be eliminated, the difference in prevalence of early PTB between NH blacks and whites would not change.
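The population attributable risk calculation can be sketched as below. The text does not say which formula was used, so Levin's formula is an assumption here, as is plugging in the rounded prevalences and pooled ORs quoted above in place of relative risks; the function name is hypothetical.

```python
def par_percent(prevalence, rr):
    """Levin's population attributable risk percent for an exposure prevalence and relative risk."""
    excess = prevalence * (rr - 1)
    return 100 * excess / (1 + excess)


# Thyroid dysfunction: prevalence and pooled OR (used as an approximation to RR)
par_black = par_percent(0.0078, 2.13)  # NH blacks
par_white = par_percent(0.0255, 1.31)  # NH whites
```

With these rounded inputs the two values come out at roughly 0.87% and 0.78%, close to each other and to the 0.8% quoted above: the higher odds ratio among NH blacks is offset by the lower prevalence of the exposure.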

Although no other statistically significant differences were found in the magnitudes of association between sPTB and other factors by race-ethnicity, differences exist in the prevalence of factors associated with poverty, which might explain some of the black-white disparity. In a recent study of the impact of social disadvantage on PTB, the authors found that the majority of variability in the black-white disparity could not be explained by differences in the prevalence of individual-level and census tract-level social disadvantage variables (i.e., maternal education, payment type for delivery, census tract poverty and census tract income inequality) among NH blacks and whites. [24] Although the overall prevalence of early sPTB would be reduced if multiple social disadvantage variables were set to ‘favorable’ values for everyone, the authors estimated that the disparity in early sPTB would increase by 8.8%.

A major strength of this study was the use of a unique population-based cohort of registered singleton births in the state of California from 2007-2011, with linked data ranging from individual hospital discharge records to residential census tract information. We introduced additional methodological strengths compared to other studies, including cross-validation to reduce the risk of over-fitting the data and missing data imputation to minimize bias from excluding cases with incomplete information. We explored three penalized regression algorithms offering variance-bias tradeoffs for better prediction, as well as a semi-parametric generalized additive model that allowed for a non-linear relation between the continuous variables (e.g., age, BMI) and sPTB. All regression models performed comparably, with nearly identical cv-AUC for both NH black and white women. However, we cannot overlook the fact that machine-learning algorithms not tested might improve the predictive performance of these data. We plan to investigate alternate algorithms in future research.

A limitation of this study is the use of a binary outcome for sPTB, which is subject to potential bias from incorrect categorization. However, given the large sample size and the high computational burden of running machine-learning with multi-fold cross-validation, we chose logistic regression with a binary indicator (one row per person) over the more demanding alternative of time-to-event analysis with gestational age in weeks (one row per week per person). In addition, the exclusion of medically-indicated early PTB (about 10% of the early PTB sample) may have resulted in selection bias if certain factors predicted both early sPTB and inclusion in the sample. Finally, a limitation of the census tract-level data is that these were based on women’s residence at time of delivery, rather than possible residence at other locations during gestation (e.g., if she moved). Also, residential tract does not account for exposures encountered in a woman’s broader environment, such as a place of employment.

Despite the use of state-of-the-art statistical techniques, there remains a great deal of heterogeneity that we were unable to explain with the basic demographic and maternal characteristics available in administrative data. Comparable results across prediction models suggest that we may be limited by the data, rather than the statistical tools with which to assess these data. For example, coding of clinical diagnoses from discharge records and birth certificates introduced uncertainty about specific diagnoses as well as the timing of diagnoses. In addition, the census tract data lack detailed individual exposure information and may be insufficient for predicting individual risk with precision, or we may be missing census tract data for factors that are actually predictive of sPTB. Importantly, these data sources do not include other individual risk factors, such as marital status and stress, [25] nor do they include risk exposure over the life course, which might be useful for predicting PTB. Future data collection efforts to improve the prediction models would ideally be informed by what is known of the etiology of sPTB. However, the power of prediction modeling may remain limited until we improve our understanding of the fundamental origins of PTB and racial disparities in PTB.

Acknowledgments & Funding

We are grateful for technical and data support from Jonathan A. Mayo and John W. Oehlert. This work was supported by the Stanford Child Health Research Institute and Stanford NIH-NCATS-CTSA (grant no. UL1 TR001085) and the March of Dimes Prematurity Research Center at Stanford (MOD PR625253). The funding sources had no involvement in study design, analysis and interpretation of data, writing of the report, or in the decision to submit the article for publication.

References

[1] Blencowe H, Cousens S, Chou D, Oestergaard M, Say L, Moller A-B, et al. Born Too Soon: The global epidemiology of 15 million preterm births. Reprod Health 2013;10:S2. doi:10.1186/1742-4755-10-S1-S2.

[2] Frey HA, Klebanoff MA. The epidemiology, etiology, and costs of preterm birth. Semin Fetal Neonatal Med 2016;21:68–73. doi:10.1016/j.siny.2015.12.011.

[3] Mercer BM, Goldenberg RL, Moawad AH, Meis PJ, Iams JD, Das AF, et al. The Preterm Prediction Study: Effect of gestational age and cause of preterm birth on subsequent obstetric outcome. Am J Obstet Gynecol 1999;181:1216–21. doi:10.1016/S0002-9378(99)70111-0.

[4] Smith GCS, Shah I, White IR, Pell JP, Crossley JA, Dobbie R. Maternal and biochemical predictors of spontaneous preterm birth among nulliparous women: a systematic analysis in relation to the degree of prematurity. Int J Epidemiol 2006;35:1169–77. doi:10.1093/ije/dyl154.

[5] Culhane JF, Goldenberg RL. Racial Disparities in Preterm Birth. Semin Perinatol 2011;35:234–9. doi:10.1053/j.semperi.2011.02.020.

[6] Kelly R. Psychiatric and substance use disorders as risk factors for low birth weight and preterm delivery. Obstet Gynecol 2002;100:297–304. doi:10.1016/S0029-7844(02)02014-8.

[7] Alleman BW, Smith AR, Byers HM, Bedell B, Ryckman KK, Murray JC, et al. A proposed method to predict preterm birth using clinical data, standard maternal serum screening, and cholesterol. Am J Obstet Gynecol 2013;208:472.e1-472.e11. doi:10.1016/j.ajog.2013.03.005.

[8] Esplin MS, Elovitz MA, Iams JD, Parker CB, Wapner RJ, Grobman WA, et al. Predictive Accuracy of Serial Transvaginal Cervical Lengths and Quantitative Vaginal Fetal Fibronectin Levels for Spontaneous Preterm Birth Among Nulliparous Women. JAMA 2017;317:1047. doi:10.1001/jama.2017.1373.

[9] Menon R. Spontaneous preterm birth, a clinical dilemma: Etiologic, pathophysiologic and genetic heterogeneities and racial disparity. Acta Obstet Gynecol 2008:590–600. doi:10.1080/00016340802005126.

[10] United States Census Bureau. 2007 – 2011 American Community Survey 2013.

[11] The Office of Environmental Health Hazard Assessment. California Communities Environmental Health Screening Tool. Version 2.0 (CalEnviroScreen 2.0) 2014.

[12] Herrchen B, Gould JB, Nesbitt TS. Vital Statistics Linked Birth/Infant Death and Hospital Discharge Record Linkage for Epidemiological Studies. Comput Biomed Res 1997;30:290–305. doi:10.1006/cbmr.1997.1448.

[13] Massengill D, Odom E, SAS Institute. PROC GEOCODE: Now with Street-Level Geocoding. 2010.

[14] Buuren S van, Groothuis-Oudshoorn K. mice: Multivariate Imputation by Chained Equations in R. J Stat Softw 2011;45. doi:10.18637/jss.v045.i03.

[15] Polley EC, van der Laan MJ. Super learner in prediction. UC Berkeley Div Biostat Work Pap Ser 2010:Paper 266.

[16] Liaw A, Wiener M. Classification and Regression by randomForest. R News 2002;2:18–22.

[17] Venables WN, Ripley BD. Modern Applied Statistics with S. 4th ed. New York: Springer; 2002.

[18] Hastie T. gam: Generalized Additive Models 2016.

[19] Friedman J, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J Stat Softw 2010;33:1–20. doi:10.18637/jss.v033.i01.

[20] Rubin DB. Multiple Imputation after 18+ years. J Am Stat Assoc 1996;91:473–89.

[21] R Core Team. R: A Language and Environment for Statistical Computing 2016.

[22] Polley E, LeDell E, Kennedy C, van der Laan M. SuperLearner: Super Learner Prediction 2017.

[23] Hernandez-Diaz S, Schisterman EF, Hernan MA. The Birth Weight “Paradox” Uncovered? Am J Epidemiol 2006;164:1115–20. doi:10.1093/aje/kwj275.

[24] Carmichael SL, Kan P, Padula AM, Rehkopf DH, Oehlert JW, Mayo JA, et al. Social disadvantage and the black-white disparity in spontaneous preterm delivery among California births. PLoS One 2017;12:e0182862. doi:10.1371/journal.pone.0182862.

[25] Copper RL, Goldenberg RL, Das A, Elder N, Swain M, Norman G, et al. The preterm prediction study: Maternal stress is associated with spontaneous preterm birth at less than thirty-five weeks’ gestation. Am J Obstet Gynecol 1996;175:1286–92. doi:10.1016/S0002-9378(96)70042-X.

Legends for Tables & Figures

Table 1: Parental demographics and maternal characteristics by maternal race-ethnicity

Table 2: Prediction of separate models

Figure 1: Receiver operator characteristic curves for NH black women only, NH white women only, and both race-ethnicity groups combined.

Figure 2: Comparison of the magnitude of logistic regression coefficients predicting early (≤ 32 weeks) spontaneous preterm birth for non-Hispanic black (NHB) vs. non-Hispanic white (NHW) first-time mothers

Table 1: Parental demographics and maternal characteristics by maternal race-ethnicity

                                            NH blacks       NH whites
                                            (n=54,084)      (n=282,130)
Early spontaneous PTB (n (%))               1130 (2.1)      2305 (0.8)
Maternal age (y) (mean (SD))                23.43 (5.78)    28.03 (6.1)
Paternal age (y) (mean (SD))                27.25 (7.74)    30.84 (6.98)
  Missing (n (%))                           1187 (2.2)      1366 (0.5)
Maternal education
  Less than high school (n (%))             9980 (18.4)     16224 (5.7)
  High school to AA degree (n (%))          35068 (64.8)    132196 (46.9)
  BA or above (n (%))                       7893 (14.6)     128739 (45.6)
  Missing (n (%))                           1143 (2.1)      4971 (1.8)
Paternal education
  Less than high school (n (%))             6046 (11.2)     14728 (5.2)
  High school to AA degree (n (%))          28804 (53.3)    137497 (48.7)
  BA or above (n (%))                       5716 (10.6)     108845 (38.6)
  Missing (n (%))                           13518 (25)      21060 (7.5)
Paternal race-ethnicity
  NH white (n (%))                          2968 (5.5)      212427 (75.3)
  NH black (n (%))                          35895 (66.4)    7546 (2.7)
  Hispanic (n (%))                          3899 (7.2)      37585 (13.3)
  Asian, Pacific Islander or other (n (%))  840 (1.6)       9495 (3.4)
  Missing (n (%))                           10482 (19.4)    15077 (5.3)

Table 1 (continued)

                                            NH blacks       NH whites
                                            (n=54,084)      (n=282,130)
Mother born in US
  No (n (%))                                4713 (8.7)      35560 (12.6)
  Yes (n (%))                               49335 (91.2)    246453 (87.4)
  Missing (n (%))                           36 (0.1)        117 (0)
Delivery payment
  Private insurance (n (%))                 20807 (38.5)    205478 (72.8)
  Self-pay (n (%))                          762 (1.4)       2447 (0.9)
  MediCal (n (%))                           29511 (54.6)    63985 (22.7)
  Other government program (n (%))          1691 (3.1)      3940 (1.4)
  Other (n (%))                             1313 (2.4)      6280 (2.2)
Diabetes mellitus (n (%))                   444 (0.8)       1638 (0.6)
Chronic hypertension (n (%))                1554 (2.9)      5647 (2)
Thyroid dysfunction (n (%))                 422 (0.8)       7192 (2.5)
Asthma (n (%))                              4035 (7.5)      12659 (4.5)
Prior stillbirth (≥ 20 wk)
  No (n (%))                                53600 (99.1)    280676 (99.5)
  Yes (n (%))                               469 (0.9)       1426 (0.5)
  Missing (n (%))                           15 (0)          28 (0)
Early fetal loss (< 20 wk)
  None (n (%))                              48200 (89.1)    246159 (87.2)
  One (n (%))                               4540 (8.4)      27528 (9.8)
  Two or more (n (%))                       1340 (2.5)      8425 (3)
  Missing (n (%))                           4 (0)           18 (0)

Table 1 (continued)

                                            NH blacks       NH whites
                                            (n=54,084)      (n=282,130)
In-vitro fertilization (n (%))              85 (0.2)        2186 (0.8)
Elderly nulliparous (n (%))                 1309 (2.4)      14344 (5.1)
Smoked pre-pregnancy
  No (n (%))                                50575 (93.5)    258226 (91.5)
  Yes (n (%))                               2206 (4.1)      20837 (7.4)
  Missing (n (%))                           1303 (2.4)      3067 (1.1)
Smoked in 1st trimester
  No (n (%))                                51394 (95)      266797 (94.6)
  Yes (n (%))                               1395 (2.6)      12435 (4.4)
  Missing (n (%))                           1295 (2.4)      2898 (1)
BMI (kg/m²) (mean (SD))                     26.14 (6.61)    24.55 (5.42)
  Missing (n (%))                           4982 (9.2)      16543 (5.9)
Height (inches) (mean (SD))                 64.71 (2.77)    65.17 (2.67)
  Missing (n (%))                           2576 (4.8)      9537 (3.4)

Table 2: Prediction of separate models^a

Early PTB vs. Term
                                        cv-AUC (min, max)    Sensitivity^b   Specificity^b   PVP^c    PVN^c
NH Black women only                     0.62 (0.60, 0.63)    0.56            0.63            0.031    0.985
NH White women only                     0.63 (0.61, 0.65)    0.56            0.62            0.012    0.994
Both groups combined                    0.67 (0.65, 0.68)    0.61            0.64            0.017    0.994
Both groups + interaction terms         0.67 (0.65, 0.68)    0.61            0.64            0.017    0.994
Both groups + census tract variables    0.67 (0.67, 0.68)    0.62            0.65            0.018    0.994

^a Cross-validated AUC, sensitivity, specificity, PVP and PVN are reported for a single imputed dataset using logistic regression.
^b Sensitivity and specificity were calculated for an optimal cut-point for logistic regression with equal weights for sensitivity and specificity.
^c PVP = predicted value positive; PVN = predicted value negative
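Footnotes b and c of Table 2 describe a cut-point chosen with equal weights for sensitivity and specificity, and report PVP and PVN. The Python sketch below illustrates one common reading of that rule, maximizing sensitivity + specificity (Youden's J), together with the standard conversion from sensitivity, specificity and prevalence to PVP and PVN. The function names, the toy data and the 1% prevalence are illustrative assumptions, not the authors' code or data; the manuscript does not state which criterion or software routine was actually used.

```python
from typing import Sequence, Tuple


def sens_spec(y_true: Sequence[int], scores: Sequence[float],
              cut: float) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= cut are called positive.

    Assumes y_true contains at least one case (1) and one non-case (0).
    """
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= cut)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < cut)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < cut)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= cut)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cut_point(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Cut-point maximizing sensitivity + specificity - 1 (Youden's J).

    Assumption: "equal weights" is read here as Youden's J.
    """
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        sens, spec = sens_spec(y_true, scores, cut)
        j = sens + spec - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut


def predictive_values(sens: float, spec: float,
                      prevalence: float) -> Tuple[float, float]:
    """PVP and PVN from sensitivity, specificity and prevalence (Bayes' rule)."""
    p = prevalence
    pvp = sens * p / (sens * p + (1.0 - spec) * (1.0 - p))
    pvn = spec * (1.0 - p) / (spec * (1.0 - p) + (1.0 - sens) * p)
    return pvp, pvn


# Toy predicted risks (hypothetical, not study data).
toy_outcome = [0, 0, 0, 1, 0, 1, 1]
toy_risk = [0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.80]
cut = youden_cut_point(toy_outcome, toy_risk)

# Sensitivity and specificity from the combined-groups row of Table 2,
# with an assumed (illustrative) outcome prevalence of 1%.
pvp, pvn = predictive_values(0.61, 0.64, 0.01)
```

With sensitivity 0.61, specificity 0.64 and an assumed prevalence of 1%, the conversion gives a PVP of about 0.017 and a PVN of about 0.994, in line with the combined-groups row of Table 2. The low PVP reflects the rarity of the outcome more than the choice of cut-point.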

Table 3: Pooled^a odds ratios for early spontaneous PTB (≤ 32 weeks) versus term birth using ordinary logistic regression

Pooled OR [95% confidence intervals]
                                    NH black             NH white             Both
Maternal race-ethnicity
  NH white                                                                    Reference
  NH black                                                                    2.15 [1.89, 2.44]
Maternal age (y)                    1.03 [1.01, 1.04]    1.03 [1.02, 1.04]    1.03 [1.02, 1.04]
Paternal age (y)                    1.01 [1, 1.02]       1.00 [0.99, 1.01]    1.00 [1, 1.01]
Maternal education
  Less than high school             Reference            Reference            Reference
  High school to AA degree          0.95 [0.78, 1.15]    0.80 [0.67, 0.95]    0.88 [0.77, 1]
  BA or above                       0.75 [0.55, 1.01]    0.62 [0.5, 0.76]     0.68 [0.57, 0.8]
Paternal education
  Less than high school             Reference            Reference            Reference
  High school to AA degree          0.90 [0.7, 1.16]     0.84 [0.69, 1.02]    0.88 [0.75, 1.03]
  BA or above                       0.64 [0.43, 0.96]    0.64 [0.51, 0.8]     0.65 [0.53, 0.79]
Paternal race-ethnicity
  NH white                          Reference            Reference            Reference
  NH black                          0.98 [0.71, 1.36]    1.28 [1.02, 1.59]    1.11 [0.97, 1.27]
  Hispanic                          0.87 [0.61, 1.24]    1.15 [1.02, 1.29]    1.13 [1.01, 1.25]
  Asian, Pacific Islander or other  1.68 [0.82, 3.43]    1.23 [0.82, 1.85]    1.4 [0.93, 2.1]

Table 3 (continued)

Pooled OR [95% confidence intervals]
                                    NH black             NH white             Both
Mother born in US
  No                                Reference            Reference            Reference
  Yes                               1.38 [1.07, 1.77]    1.25 [1.09, 1.44]    1.27 [1.12, 1.43]
Delivery payment
  Private insurance                 Reference            Reference            Reference
  Self-pay                          2.90 [2.07, 4.08]    2.57 [1.9, 3.47]     2.76 [2.21, 3.45]
  MediCal                           0.94 [0.82, 1.08]    1.12 [1, 1.26]       1.05 [0.97, 1.15]
  Other government program          0.75 [0.5, 1.12]     0.99 [0.69, 1.42]    0.88 [0.67, 1.15]
  Other                             1.12 [0.77, 1.63]    1.05 [0.79, 1.39]    1.09 [0.87, 1.36]
Diabetes mellitus
  No                                Reference            Reference            Reference
  Yes                               1.98 [1.33, 2.93]    2.96 [2.25, 3.89]    2.55 [2.04, 3.2]
Chronic hypertension
  No                                Reference            Reference            Reference
  Yes                               2.52 [2.01, 3.18]    2.62 [2.19, 3.13]    2.6 [2.26, 2.99]
Thyroid dysfunction
  No                                Reference            Reference            Reference
  Yes                               2.13 [1.39, 3.28]    1.31 [1.05, 1.63]    1.43 [1.18, 1.74]
Asthma
  No                                Reference            Reference            Reference
  Yes                               1.10 [0.89, 1.36]    1.42 [1.21, 1.67]    1.28 [1.13, 1.46]

Table 3 (continued)

Pooled OR [95% confidence intervals]
                                    NH black             NH white             Both
Prior stillbirth (≥ 20 wk)
  No                                Reference            Reference            Reference
  Yes                               2.95 [2.06, 4.24]    2.73 [1.97, 3.78]    2.89 [2.27, 3.67]
Early fetal loss (< 20 wk)
  None                              Reference            Reference            Reference
  One                               1.39 [1.15, 1.67]    1.14 [1, 1.3]        1.22 [1.1, 1.36]
  Two or more                       1.72 [1.3, 2.28]     1.25 [1.02, 1.53]    1.39 [1.18, 1.64]
In-vitro fertilization
  No                                Reference            Reference            Reference
  Yes                               2.79 [1.18, 6.61]    2.32 [1.71, 3.16]    2.33 [1.74, 3.1]
Elderly nulliparous
  No                                Reference            Reference            Reference
  Yes                               1.03 [0.74, 1.44]    1.17 [0.98, 1.4]     1.13 [0.96, 1.32]
Smoked pre-pregnancy
  No                                Reference            Reference            Reference
  Yes                               1.43 [0.96, 2.14]    0.73 [0.56, 0.94]    0.88 [0.71, 1.09]
Smoked in 1st trimester
  No                                Reference            Reference            Reference
  Yes                               1.01 [0.62, 1.63]    1.54 [1.15, 2.08]    1.4 [1.09, 1.8]
BMI (kg/m²)                         1.01 [1.01, 1.02]    1.01 [1, 1.02]       1.01 [1.01, 1.02]
Height (inches)                     0.96 [0.94, 0.98]    0.95 [0.93, 0.97]    0.95 [0.94, 0.97]

^a The odds ratios are pooled from coefficients obtained from three imputed datasets based on Rubin’s rules [20].
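The pooling described in footnote a of Table 3 can be sketched as follows. Under Rubin's rules [20], the pooled log-odds coefficient is the mean of the per-imputation coefficients, and its total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance, where m is the number of imputed datasets. The Python sketch below is a minimal illustration under stated assumptions: the function name and input numbers are hypothetical, and a normal quantile (1.96) is used for the interval for simplicity, whereas Rubin's rules use a t reference distribution whose degrees of freedom depend on m and the variance ratio. It is not the authors' implementation.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float], std_errors: Sequence[float],
               z: float = 1.96) -> Tuple[float, float, float]:
    """Pool log-odds coefficients from m imputed datasets.

    Returns (odds ratio, lower limit, upper limit) on the odds-ratio scale.
    Assumption: normal-approximation interval (z = 1.96) instead of the
    t-based interval of the full Rubin's rules.
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors, m >= 2")

    q_bar = mean(estimates)                        # pooled coefficient
    within = mean([se ** 2 for se in std_errors])  # mean within-imputation variance
    between = variance(estimates)                  # between-imputation variance (n - 1)
    total = within + (1.0 + 1.0 / m) * between     # total variance

    half_width = z * math.sqrt(total)
    return (math.exp(q_bar),
            math.exp(q_bar - half_width),
            math.exp(q_bar + half_width))


# Hypothetical coefficients and standard errors from three imputed datasets.
odds_ratio, lower, upper = pool_rubin([0.70, 0.80, 0.75], [0.10, 0.10, 0.10])
```

When the three imputed datasets give nearly identical coefficients, the between-imputation term is close to zero and the pooled interval is essentially that of a single fit.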

[Figure 1: ROC curves, true positive rate plotted against false positive rate. Both groups: AUC 0.67; NH white only: AUC 0.63; NH black only: AUC 0.62.]

[Figure 2: Scatter plot of logistic regression coefficients, Early vs. Term. Axes: Coefficients for NHB and Coefficients for NHW. Legend: Early vs. Term; Concordance; Linear (Early vs. Term). Labeled points: ICD for diabetes; ICD for thyroid; pre-pregnancy smoking; BA or above mother & father education; prior stillbirth, self pay, IVF, ICD for hypertension.]