A recent study used machine-learning models to analyze medical records in an effort to identify risk factors for out-of-hospital cardiac arrests (OHCA). The findings, which appeared in the journal Circulation, not only revealed significant non-medical indicators of OHCA risk but showed that the models used were better at predicting OHCA than current approaches.
“While previous studies primarily focused on cardiovascular factors associated with OHCA, our findings show that OHCA risk is complex and includes sociodemographic and non-cardiovascular factors. This insight opens the door to future public health measures that can mitigate the risk. In addition, the machine learning methods used in this research offered considerable improvement over simpler models based on previously published covariates associated with sudden cardiac death,” said Ali Shojaie, professor of biostatistics at the University of Washington School of Public Health and one of the study’s co-authors.
Sociodemographic and non-cardiovascular predictors of OHCA identified by the research include demographic factors (single marital status, underrepresented race), substance abuse disorder, fluid and electrolyte disorder, and alcohol abuse.
The study cross-referenced a King County, Wash., registry of OHCA cases with UW Medicine electronic health records (EHR) and found 2,366 patients who had suffered cardiac arrest. These records were then analyzed by three machine-learning models for common medical and non-medical factors.
Shojaie, along with Associate Professor Noah Simon and UW researcher Jessica Perry, played a pivotal role in the data analysis required for the research.
“Aside from the challenges of matching the UW EHR data to the King County EMS system, careful statistical analysis was crucial in this project. First, the analysis of EHR data is complicated by the presence of missing values. Second, OHCA is very rare in the general population (0.1% prevalence), which complicates the development of reliable machine learning models. Third, in addition to developing predictive models, a key goal of the research was to identify factors associated with OHCA risk. This required the use of explainable machine learning models,” said Shojaie. Explainable models reveal how data is used to draw conclusions, as opposed to black-box models, whose operations and algorithms are not transparent.
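To illustrate why a 0.1% prevalence makes reliable prediction difficult, the short Python sketch below works through the arithmetic. Only the prevalence figure comes from the article; the 90% sensitivity and specificity values, and the function name, are hypothetical choices made for illustration and do not describe the study's models or results.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Fraction of flagged patients who truly have the condition (Bayes' rule)."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


PREVALENCE = 0.001   # 0.1% prevalence, as quoted in the article
SENSITIVITY = 0.90   # hypothetical value, not from the study
SPECIFICITY = 0.90   # hypothetical value, not from the study

ppv = positive_predictive_value(PREVALENCE, SENSITIVITY, SPECIFICITY)

# A trivial model that never predicts OHCA is "right" for everyone
# who does not have an arrest, so its accuracy equals 1 - prevalence.
baseline_accuracy = 1.0 - PREVALENCE

print(f"Share of flagged patients who are true cases: {ppv:.2%}")
print(f"Accuracy of a model that never predicts OHCA: {baseline_accuracy:.1%}")
```

Under these assumed numbers, fewer than 1 in 100 flagged patients would be a true case, while a model that never predicts OHCA would still score 99.9% accuracy. That gap is one reason plain accuracy is a poor yardstick for rare events.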
Shojaie noted that the study required close collaboration between biostatisticians/statisticians, bioinformaticians, epidemiologists and cardiovascular researchers, and that such interdisciplinary collaborations are key to advancing biomedical and public health science.
“The research also points to the potential of explainable machine learning methods to provide tools to not only identify those at risk of complex diseases, but also provide new insight into underlying mechanisms of complex diseases. There are many opportunities waiting for biostatisticians to engage in similar projects,” said Shojaie.