Predicting Cardiac Arrest in the Pediatric Intensive Care Unit Using Machine Learning
Video Transcription
So I'm Adam Kennett, a recent graduate of the Johns Hopkins Biomedical Engineering undergraduate program, and I'm excited to talk about my project on predicting cardiac arrest in the pediatric ICU using machine learning. No real disclosures, except that one of our advisors, Dr. Rakti, is an employee of Nihon Kohden.

I've included some learning objectives, which also serve as a nice outline of today's talk. First, understand how to generate features from physiological data collected in the PICU for use in machine learning models. Second, how to train, validate, and test these machine learning models in order to predict in-hospital cardiac arrest before onset. Finally, how to create a risk score over time from the models' output in order to identify high-risk patients.

In-hospital cardiac arrest occurs in approximately 15,000 pediatric patients each year in the US, about 6,000 of which occur in the PICU itself, yet only one in four of these patients survives to hospital discharge. Early prediction of in-hospital cardiac arrest could therefore improve patient outcomes. Our goal is to use machine learning to predict in-hospital pediatric cardiac arrest within three hours of onset, using all the data collected for patients in the PICU: numerical EHR data, vital signs data, and high-frequency ECG waveform data.

Our patients range from newborns to young adults admitted to the Johns Hopkins PICU between January 2020 and July 2021. During this timeframe, approximately 2,000 patients were admitted to the Hopkins PICU, but only 1,145 had a complete set of data for use in our models. As you can see here, I've broken down the age distribution as well as the gender distribution within each age group. A limitation of this study is that it was a single-center study, so we had only 15 patients who experienced cardiac arrest and also had a complete set of vital signs, EHR, and ECG data.

We divided all the data into five-minute windows and then labeled those windows in order to train and test our models. For patients who did not experience cardiac arrest, we labeled the windows in the period from one to two hours before discharge as negative for cardiac arrest. We ignored the final hour before discharge, since patients may be moving around or disconnected from the monitors during that time and so would not have a complete set of data; the period from one to two hours before discharge is when those patients should look most normal. For patients who did experience cardiac arrest, windows within three hours of the arrest were labeled positive, as this was our prediction timeframe. The next hour, from three to four hours before arrest, was ignored, acting as a buffer for the negatively labeled windows in the period from four to five hours before arrest. A minimal sketch of this labeling scheme is shown below.
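The labeling rules above are mechanical enough that a short sketch may help. Here is a minimal sketch in Python, assuming one pandas DataFrame row per five-minute window; the column names (window_end, event_time, had_arrest) are hypothetical, since the study's actual code was not shown.

```python
import pandas as pd

def label_window(hours_before_event: float, had_arrest: bool):
    """Label one 5-minute window: 1 = positive, 0 = negative, None = ignored."""
    if hours_before_event < 0:                 # window after the event: ignore
        return None
    if had_arrest:
        if hours_before_event <= 3.0:          # within 3 h of arrest: positive
            return 1
        if 4.0 <= hours_before_event <= 5.0:   # 4-5 h before arrest: negative
            return 0
        return None                            # 3-4 h buffer (and earlier): ignored
    if 1.0 <= hours_before_event <= 2.0:       # 1-2 h before discharge: negative
        return 0
    return None                                # final hour before discharge: ignored

def label_windows(windows: pd.DataFrame) -> pd.DataFrame:
    """One row per window; event_time is the arrest time for arrest patients
    and the discharge time for controls (hypothetical schema)."""
    hours = (windows["event_time"] - windows["window_end"]).dt.total_seconds() / 3600.0
    labels = [label_window(h, a) for h, a in zip(hours, windows["had_arrest"])]
    return windows.assign(label=labels).dropna(subset=["label"])
```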
Once all the windows were labeled, we performed feature engineering on each window, using everything that is collected from patients in the PICU: high-frequency 240 Hz ECG waveforms, 0.5 Hz vital signs time-series data, and static medications data. From the high-frequency ECG waveforms, we used the Python toolkit from Aura Healthcare to generate 23 different heart rate variability metrics, such as the high-frequency normalized component or the number of successive RR-interval differences greater than 20 milliseconds. From the vital signs time series, we created summary statistics of the 12 vital signs within each five-minute window; the vital signs included arterial blood pressure, respiratory rate, heart rate, SpO2, and so on, yielding 96 summary metrics in total. Finally, from the static medications data, we created binary indicators, yes or no, of whether a drug from each of 46 therapeutic drug classes was administered. Altogether, this gave 166 features for our models.

Because we were working with pediatric patients, we then needed to normalize the data by patient age group, since normal vital signs vary with a pediatric patient's age: a normal heart rate for a newborn, for example, would not be considered normal for a young adult.

After normalization, we trained, validated, and tested our machine learning models. We split the data by patient into training, validation, and testing sets so that we could see whether we could predict cardiac arrest before onset. We trained the models on the training set, fine-tuned and improved them on the validation set, and finally tested them on a held-out testing set that was not used at all for training or validation, to see how well the models were doing. All the results in the next few slides are on this held-out testing set. Once the models were trained, we analyzed feature importances to improve the interpretability of our results.

Here we can see the performance of our models, again on the held-out testing data. On the left, the receiver operating characteristic (ROC) curve shows the overall performance of each model, and on the right, the precision-recall (PR) curve shows performance on the positive class, the prediction of cardiac arrest. All the models perform well, with an AUROC greater than 0.85, and the PR curves look similarly strong. Our best-performing model was the XGBoost model, with an area under the ROC curve of 0.971 and an area under the PR curve of 0.797. The XGBoost model also had the fewest false negatives, which was important to us in determining the best model. It achieved 99.5% sensitivity, correct prediction of the positive class, and 69.6% specificity, correct prediction of the negative class. However, as the PR curve shows, the positive predictive value is only around 20% to 30%, so future studies will need to improve the PPV before implementation in the PICU. A minimal sketch of the feature-extraction and modeling steps described here is shown below.
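Here is a minimal end-to-end sketch of those steps, assuming the Aura Healthcare toolkit is the hrv-analysis package (whose public API includes get_time_domain_features and get_frequency_domain_features). The hyperparameters, summary statistics, and data layout are illustrative only, and the validation split used for tuning is omitted for brevity; this is not the study's actual pipeline.

```python
import numpy as np
import pandas as pd
import xgboost as xgb
from hrvanalysis import get_time_domain_features, get_frequency_domain_features
from sklearn.model_selection import GroupShuffleSplit
from sklearn.metrics import roc_auc_score, average_precision_score

def hrv_features(rr_intervals_ms):
    """HRV metrics for one window's RR intervals (ms), derived from the ECG."""
    feats = {}
    feats.update(get_time_domain_features(rr_intervals_ms))       # sdnn, nni_20, ...
    feats.update(get_frequency_domain_features(rr_intervals_ms))  # hf, hfnu, lf/hf, ...
    return feats

def vitals_features(vitals: pd.DataFrame):
    """Summary statistics of each vital-sign column within one window."""
    stats = vitals.agg(["mean", "std", "min", "max"])
    return {f"{col}_{stat}": stats.at[stat, col]
            for col in vitals.columns for stat in stats.index}

def train_and_evaluate(X, y, groups):
    """X: (n_windows, n_features) age-normalized feature array; y: window
    labels; groups: patient IDs, so no patient appears in more than one set."""
    splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
    train_idx, test_idx = next(splitter.split(X, y, groups))
    model = xgb.XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss")
    model.fit(X[train_idx], y[train_idx])
    prob = model.predict_proba(X[test_idx])[:, 1]
    print("AUROC:", roc_auc_score(y[test_idx], prob))   # overall performance
    print("AUPRC:", average_precision_score(y[test_idx], prob))  # positive class
    return model
```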
From our best-performing model, the XGBoost model, we also generated a risk score: the predicted probability of cardiac arrest. Shown here on the upper left is the risk score across the five hours before arrest for one five-month-old patient, with the same coloring scheme as our labeling method. The risk score is low, around zero, during the negatively labeled period, rises during the buffer region, and then remains elevated for the duration of the positive prediction timeframe, before the patient eventually did go into cardiac arrest.

Shown below, for this patient, are some of the top features as determined by SHAP analysis. These include respiratory rate and heart rate variability metrics, such as the high-frequency normalized component and the coefficient of variation of successive differences of RR intervals. Notably, there isn't one single vital sign or feature whose change drives the increase in risk score, which is consistent with our hypothesis that these machine learning models can identify hidden trends in the data that are indicative of cardiac arrest. Finally, on the right, we see the average risk score over time for all 15 patients who experienced cardiac arrest, and the same trend appears: low during the negatively labeled green phase, rising during the buffer region, and remaining elevated during the three hours before the arrest, until the patient eventually experienced cardiac arrest. A sketch of how such a risk score and SHAP feature ranking could be computed appears after this transcript.

To conclude, we've shown that machine learning models can predict cardiac arrest up to three hours before onset. Key indicators include heart rate variability metrics, a low respiratory rate, and several therapeutic drug classes. Additionally, our models can identify hidden trends in the data and alert clinicians that a patient may be at high risk for cardiac arrest before they might otherwise realize it. This early warning could enable clinicians to intervene sooner and prepare for the arrest, increasing the patient's chance of survival.

With that, I'd like to thank Dr. Hunt and Dr. Duval-Arnould from the Hopkins School of Medicine, who played a key role in ensuring that the cardiac arrest events were labeled accurately. I'd also like to thank my co-authors and collaborators from the Johns Hopkins Precision Care Medicine course, where most of this work was completed, including Dr. Fackler in the audience and the other faculty and TAs. Thanks again for sticking around for the last presentation of the day; I hope everyone enjoyed it and enjoys the rest of the conference. Thank you.
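As referenced in the transcript, here is a minimal sketch of how a per-window risk score and a SHAP-based feature ranking could be computed for a single patient, assuming the trained XGBoost classifier from the pipeline sketch above. The names `model`, `X_patient`, and `feature_names` are hypothetical; the study's actual code was not shown.

```python
import numpy as np
import shap

def risk_score_over_time(model, X_patient):
    """Predicted probability of cardiac arrest for each consecutive
    5-minute window of one patient, in chronological order."""
    return model.predict_proba(X_patient)[:, 1]

def top_features(model, X_patient, feature_names, k=5):
    """Rank features for this patient by mean absolute SHAP value."""
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X_patient)   # shape: (n_windows, n_features)
    importance = np.abs(shap_values).mean(axis=0)    # average attribution magnitude
    top = np.argsort(importance)[::-1][:k]
    return [(feature_names[i], float(importance[i])) for i in top]
```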
Video Summary
In this video, Adam Kennett discusses his project on predicting cardiac arrest in a pediatric ICU using machine learning. He explains how they collected data from patients in the PICU, including EHR data, vital signs data, and EKG waveform data. They labeled specific time windows before discharge or cardiac arrest as positive or negative for cardiac arrest. They then performed feature engineering on the data to create 166 different features for their models. They trained, validated, and tested their machine learning models, with the XGBoost model performing the best with high sensitivity and specificity. They also generated a risk score for cardiac arrest and found that their models could identify hidden trends in the data indicative of cardiac arrest.
Asset Subtitle
Pediatrics, Cardiovascular, 2023
Asset Caption
Type: star research | Star Research Presentations: Research Enrichment, Adult and Pediatric (SessionID 30002)
Meta Tags
Content Type: Presentation
Knowledge Areas: Pediatrics, Cardiovascular
Membership Levels: Professional, Select
Tags: Pediatrics, Cardiac Arrest
Year: 2023
Keywords
predicting cardiac arrest, pediatric ICU, machine learning, EHR data, vital signs data