Background. In 2009, the Pediatric Emergency Care Applied Research Network (PECARN) published evidence-based guidelines for the risk stratification of children at risk of clinically important traumatic brain injury (ciTBI). Machine learning approaches may risk-stratify patients with higher diagnostic accuracy, allowing for decreased use of computed tomography (CT) in low-risk patients. Methods. We performed a secondary analysis of a public-use dataset from a multicenter prospective study conducted by PECARN between 2004 and 2006 in 25 North American emergency departments. The study included patients who presented within 24 hours of trauma; we retained only those with Glasgow Coma Scale scores of 14-15. Patients who were missing outcome data or who were paralyzed, intubated, or sedated at the time of physician examination were excluded. Our outcome was ciTBI, defined as death from traumatic brain injury, neurosurgery, intubation >24 h, or hospital admission ≥2 nights. We split patients into derivation and validation cohorts (50% each) using a stratified random split, so that each cohort contained an equal number of patients with the outcome. We then employed the Super Learner, an ensemble machine learning approach that determines the optimal weights for combining the predictions from a collection of machine learning algorithms. Nine algorithms were assessed (Bayesian generalized linear models, K-nearest neighbor, support vector machine, recursive partitioning, extreme gradient boosting, multivariate adaptive regression splines, least absolute shrinkage and selection operator (LASSO), random forest, and stepwise logistic regression based on Akaike information criterion). Models were aggregated using a method that maximizes the area under the receiver operating characteristic curve (AUROC). Approaches were validated in the validation cohort, and a threshold cutoff was identified using a 1:250 misclassification cost ratio. We report measures of diagnostic accuracy.
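As an illustration of the threshold step described in the Methods, the following is a minimal sketch of picking a cutoff under an asymmetric misclassification cost. It assumes that the 1:250 ratio means a false negative is penalized 250 times as heavily as a false positive; the function name `pick_threshold` and the toy scores and labels are hypothetical and are not taken from the study.

```python
def pick_threshold(scores, labels, fn_cost=250.0, fp_cost=1.0):
    """Return (cutoff, total_cost) for the cheapest cutoff.

    A patient is predicted positive when score >= cutoff.
    Assumption: a false negative costs fn_cost, a false positive fp_cost.
    """
    # Every observed score is a candidate; +inf means "predict no one positive".
    candidates = sorted(set(scores)) + [float("inf")]
    best_cutoff, best_cost = None, float("inf")
    for cutoff in candidates:
        fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
        cost = fn_cost * fn + fp_cost * fp
        if cost < best_cost:  # strict "<" keeps the lowest cutoff on ties
            best_cutoff, best_cost = cutoff, cost
    return best_cutoff, best_cost


# Hypothetical predicted risks and outcomes (1 = ciTBI), for illustration only.
scores = [0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.30, 0.60]
labels = [0, 0, 0, 1, 0, 0, 1, 1]

cutoff, cost = pick_threshold(scores, labels)
print(cutoff, cost)
```

With a penalty this lopsided, the chosen cutoff sits at or below the lowest-scoring true case, which trades specificity for sensitivity.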
Results. Of 43,399 patients in the study dataset, 42,055 were retained (96.9%; 62.3% male; median age 5.6 years, interquartile range 1.9-12.0 years). Of these, 372 (0.9%) had ciTBI. After stratification into derivation and validation cohorts (21,027 patients in each cohort, 186 of whom had the outcome), the algorithm with the lowest rate of error was LASSO, which had an importance of 43.3% in the ensemble. On the derivation cohort, the Super Learner achieved an AUROC of 91.6% (95% confidence interval [CI] 89.8-93.4%), which was greater than that of any individual algorithm (Figure). On the validation cohort, it achieved a sensitivity of 96.2% (95% CI 92.4-98.5%) and a specificity of 66.3% (95% CI 65.6-66.9%) (Table). Overall, the algorithm had 9 false negatives (7 in the validation cohort, 2 in the derivation cohort). Conclusion. Although computationally complex and in need of external validation, an aggregated machine learning approach retains the high sensitivity of the PECARN risk stratification strategy while demonstrating higher specificity, allowing for reduced use of head CT.
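The reported accuracy measures can be related to a 2x2 table. The sketch below is an approximate reconstruction: the true-positive and false-negative counts follow from the text (186 outcomes, 7 false negatives in the validation cohort), while the true-negative count is an assumption back-calculated from the rounded 66.3% specificity. The Table remains the authoritative source for the exact values, and `diagnostic_metrics` is a hypothetical helper name.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic accuracy measures from 2x2 counts."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "lr_pos": sens / (1 - spec),     # positive likelihood ratio
        "lr_neg": (1 - sens) / spec,     # negative likelihood ratio
    }


# From the text: 186 validation patients with ciTBI, 7 of them missed.
tp, fn = 186 - 7, 7

# Assumption: negatives and true negatives are reconstructed from the
# cohort size and the rounded specificity, so they are approximate.
negatives = 21027 - 186
tn = round(0.663 * negatives)
fp = negatives - tn

metrics = diagnostic_metrics(tp, fn, fp, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The sensitivity computed this way is 179/186, which rounds to the reported 96.2%.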

Step AIC, stepwise logistic regression based on Akaike information criterion; LASSO, least absolute shrinkage and selection operator; MARS, multivariate adaptive regression splines; XG Boost, extreme gradient boosting; Bayesian GM, Bayesian generalized linear model

Numbers in parentheses represent 95% confidence intervals. PPV, positive predictive value; NPV, negative predictive value; LR(+), positive likelihood ratio; LR(-), negative likelihood ratio; AUROC, area under the receiver operating characteristic curve; Bayesian GM, Bayesian generalized linear model; MARS, multivariate adaptive regression splines; LASSO, least absolute shrinkage and selection operator; Step AIC, stepwise logistic regression based on Akaike information criterion

Figure

Receiver operating characteristic curves for each method on the (a) derivation and (b) validation datasets.

Table

Measures of diagnostic accuracy on the derivation and validation cohorts (186/21,027 with outcome in each) from individual and aggregate (Super Learner) models.
