Identifying Biomarker-Based Pediatric ARDS Subphenotypes Using Machine Learning and Clinical Data
Video Transcription
Hi, everyone. My name is Dan Balcarcel. I'm a second-year PICU fellow at the Children's Hospital of Philadelphia, and today I'll be talking about identifying biomarker-based pediatric ARDS subphenotypes using machine learning and clinical data. I completed this work with my primary mentor, Dr. Nadir Yehya, as well as my co-authors and mentors, Drs. Blanca Himes, Nelson Sanchez-Pinto, Mark Mai, and Sanjeev Mehta. I have nothing to disclose.
Our learning objectives for today are as follows. We'll start with a brief review of the problem of heterogeneity in pediatric ARDS. Next, we'll look into a potential solution to this problem by summarizing literature that supports the use of biomarker-based inflammatory subphenotypes in ARDS. Our final learning objective is to learn about a method to accurately identify ARDS inflammatory subphenotypes using machine learning and readily available, routinely collected clinical data.
To start with a brief background, pediatric acute respiratory distress syndrome is a common condition affecting 6% of mechanically ventilated children admitted to pediatric intensive care units, and it carries a high risk of mortality, approaching 33% in severe cases. Despite a very high burden of disease, no specific pharmacotherapies exist, and management of ARDS remains largely supportive, with a focus on limiting ventilator-induced lung injury, avoiding fluid overload, and optimizing sedation. However, even these supportive measures have very limited evidence to support them. The lack of effective therapies for pediatric ARDS is partially due to the reality that ARDS represents a heterogeneous group of patients with a wide range of inciting factors, diverse clinical outcomes, and divergent clinical courses. By placing all of these patients into one bucket, we've made it really difficult to identify effective therapies.
One strategy to overcome this is to identify more homogeneous groups, or subtypes, within ARDS, which could allow us to create more precise management strategies and aid trials through prognostic and predictive enrichment. Subtyping of ARDS is really nothing new and has been attempted for over 50 years with varying levels of success. While physiologic factors such as P/F ratio and clinical factors such as direct versus indirect insult have helped isolate groups of patients with greater or lesser risk of morbidity and mortality, they have not been useful in identifying specific management strategies. Omics-derived subtyping has recently been attempted but is still in the very early stages of research and development. The most promising subtyping strategy to date has been biologic classification through the use of serum-based endothelial biomarkers and inflammatory cytokines.
Research into biologic subtyping of ARDS really took off in 2014, when Dr. Carolyn Calfee from UCSF published this seminal paper in the Lancet. Dr. Calfee used latent class analysis and data from two ARDS randomized controlled trials, which included inflammatory cytokines, to identify two distinct subphenotypes of ARDS. One group, which she termed hyperinflammatory, was found to have high levels of inflammatory cytokines such as IL-6, IL-8, and TNF receptor 1, as well as more clinical evidence of shock, including lower serum bicarbonate levels and higher vasoactive needs. The other group, which she labeled hypoinflammatory, had lower levels of inflammatory cytokines and less clinical evidence of shock. The hyperinflammatory group was also found to have an increased risk of mortality. What is most promising about this subtyping strategy is that inflammatory subphenotype appears to correlate with response to specific therapies. Dr. 
Calfee reanalyzed three negative ARDS randomized controlled trials, including the ALVEOLI trial, which compared high versus low PEEP, the FACTT trial, which compared liberal versus conservative fluid strategy, and the HARP-2 trial, which compared simvastatin versus placebo. She actually found that the hyperinflammatory phenotype had improved mortality with high PEEP, liberal fluid, and simvastatin therapy, while on the other hand, the hypoinflammatory group had improved mortality with low PEEP and conservative fluid management. While these were retrospective analyses, the use of randomized controlled trial data does reduce the risk of bias. These results make this particular subtyping strategy a promising direction for future study.
But the major barrier to using this subtyping strategy is that it requires measuring inflammatory cytokines, such as IL-8 and IL-6, which takes a significant amount of time and money and really limits the clinical utility. Dr. Sinha out of Wash U in St. Louis identified a potential solution to this problem in 2020, when he published this paper in which he described using a machine learning classifier model to predict the biomarker-based ARDS subphenotype using just readily available clinical data, such as labs, vital signs, ventilator data, and demographics. Dr. Sinha's model had an area under the curve of 0.94, suggesting that ARDS inflammatory subphenotype could be identified without the need to send inflammatory cytokines. While this phenotyping strategy has almost exclusively been studied in adults, last year Dr. Mary Dahmer and Dr. Heidi Flori out of Michigan used latent class analysis on a cohort of children diagnosed with ARDS and found that the hyperinflammatory and hypoinflammatory subphenotypes also seemed to be present in children. 
With that background, the objective of our study was to determine if machine learning algorithms could identify latent class analysis-derived, biomarker-based ARDS subphenotypes using readily available clinical data alone in the first 24 hours of ARDS diagnosis. The goal was essentially to see if we could find a quicker, cheaper alternative for subphenotyping children with ARDS. The project had three steps. The first was to use latent class analysis on a CHOP ARDS cohort to determine if the hyperinflammatory and hypoinflammatory phenotypes existed in our patient population. The second step was to select clinical predictor variables, which could be used as part of a machine learning classifier model. And the final step was to use these clinical variables to train and validate a machine learning model to predict ARDS subphenotype.
The first step of the project was completed by my mentor, Dr. Nadir Yehya, who will be presenting the full details of this work later on in this session. Briefly, using his cohort of 333 children admitted to the CHOP PICU between 2014 and 2019, who were diagnosed with ARDS and had inflammatory cytokines collected, he ran a latent class analysis and found a two-class model to be the best fit. In examining those two classes, about a third fit into the previously described hyperinflammatory subphenotype and two-thirds into the hypoinflammatory phenotype, in similar proportions to prior studies. Now that our patients had subphenotype labels, the next step was to select clinical predictor variables. We selected a total of 165 variables that could easily be extracted from the electronic health record. 
We included demographic data, such as age; diagnoses from the problem list, including asthma; therapies patients were exposed to, including total doses of vasoactives and FiO2; vital signs such as heart rate, which we then expanded to include mean, median, maximum, and minimum over the first 24 hours of diagnosis; and finally, laboratory data, such as serum bicarbonate levels, which we again expanded to include mean, median, maximum, and minimum over the first 24 hours of diagnosis.
The final step was to create a machine learning classifier model. We first obtained all relevant clinical data, cleaned and organized it, verified its accuracy, and assessed for missingness. There were missing laboratory variables, since many of the labs had not been collected on all of the patients, so that missing data was imputed with random forest. Otherwise, we had full demographic, diagnostic, therapeutic, and vital sign data. We combined the clinical data and each patient's subphenotype label and used the package XGBoost in R to create a machine learning classifier model. Due to the small sample size, k-fold cross-validation was used to both train and validate the model. Finally, we compared the machine learning-predicted ARDS subphenotype with the latent class analysis-derived, biomarker-based subphenotype.
And now on to our results. Using 24 hours of clinical data, our model was able to achieve an area under the receiver operating characteristic curve of 0.91 in predicting the ARDS subphenotype, with a 95% confidence interval of 0.87 to 0.95. It had a sensitivity of 91% and a specificity of 70% for predicting the hyperinflammatory phenotype. When we expanded our model to include all clinical data available in the first 36 hours of diagnosis, our AUC was 0.92, which was not a significant improvement. However, when we included clinical data from just the first 12 hours of diagnosis, we did notice a slight decrease in the accuracy of our model, to an AUC of 0.88. 
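The metrics quoted above (AUC, sensitivity, specificity) can be illustrated with a short sketch. This is not the study's code: the talk describes XGBoost in R, whereas the following uses plain Python with no model at all. The labels and scores are hypothetical toy values standing in for pooled out-of-fold predictions, the 0.45 threshold is arbitrary, and the helper names (`roc_auc`, `sensitivity_specificity`, `k_fold_indices`) are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Call a patient positive when score >= threshold, then compare with the labels."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Assign indices 0..n-1 to k folds round-robin (no shuffling, for reproducibility)."""
    return [list(range(i, n, k)) for i in range(k)]


# Hypothetical toy data: 1 = hyperinflammatory, 0 = hypoinflammatory.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
# Stand-in for predicted probabilities collected while each fold was held out.
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.2, 0.1]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.45)
# Each fold would be held out once while a model is trained on the other folds.
folds = k_fold_indices(len(labels), k=4)
```

In a real analysis the threshold would be chosen deliberately, since moving it trades sensitivity against specificity without changing the AUC.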
In the interest of simplifying our model, which included 165 variables, we decided to look at each category of variables to see what was most powerful in predicting ARDS subphenotype. We found that demographic data, diagnoses, therapies, and vital sign data alone did not demonstrate adequate accuracy. Laboratory data, however, had an AUC of 0.90 for predicting ARDS subphenotype, which was not significantly worse than all of the data combined. So we went a step further, and we looked at the relative importance of each lab variable and pared our model down to just the top five most predictive variables, which included median INR, minimum white blood cell count, minimum lactate, mean CRP, and median PT. We found that a model using just the top five variables had nearly the same accuracy as the full model, with an AUC of 0.90. So we went a step further and pared our model down to just the top four variables and still had an AUC of 0.90. However, we did start to see a drop in accuracy when we cut our model down to the top three variables, with an AUC of 0.86.
In conclusion, machine learning offers the potential for inexpensive identification of latent class analysis-derived, biomarker-based ARDS subphenotypes in children within the first 24 hours of ARDS diagnosis. And routinely collected lab data alone may offer sufficient predictive power. Our study does have several limitations. First, we had a relatively small sample size of 333 children to both train and validate our model, which does raise concerns for overfitting. Second, we used just single-center data, which limits our generalizability. And finally, given the retrospective observational nature of the study, we were not able to examine treatment effects related to the subphenotypes, which is ultimately the main reason for studying this subtyping strategy. In the future, we hope to externally validate the clinical classifier model and test it prospectively on additional cohorts of children with ARDS. 
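Predictors such as "median INR" or "minimum lactate" are summaries of repeated measurements taken during the first 24 hours after diagnosis. As a minimal sketch of how such summaries might be derived, assuming each lab is stored as (hours since diagnosis, value) pairs, the following Python uses only the standard library. The function name `window_summaries`, the data layout, the inclusive window boundary, and the lactate values are all hypothetical and are not taken from the study.

```python
from statistics import mean, median
from typing import Dict, List, Optional, Tuple


def window_summaries(
    measurements: List[Tuple[float, float]],
    name: str,
    window_hours: float = 24.0,
) -> Dict[str, Optional[float]]:
    """Summarize (hours_since_diagnosis, value) pairs that fall inside the window.

    Every statistic is None when nothing was measured in the window, which
    leaves the gap visible for a later imputation step.
    """
    # Assumption: the window is inclusive at both ends (0 <= t <= window_hours).
    values = [v for t, v in measurements if 0.0 <= t <= window_hours]
    stats = {"mean": mean, "median": median, "max": max, "min": min}
    return {
        f"{name}_{stat}": (fn(values) if values else None)
        for stat, fn in stats.items()
    }


# Hypothetical lactate values (mmol/L) keyed by hours since ARDS diagnosis.
lactate = [(1.0, 4.0), (6.0, 3.0), (18.0, 2.0), (30.0, 1.0)]
features = window_summaries(lactate, "lactate")
# The 30-hour value lies outside the 24-hour window and is ignored.

# A lab that was never drawn yields None for every summary.
missing = window_summaries([], "crp")
```

Changing `window_hours` to 12 or 36 would mirror the alternative time windows mentioned in the results.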
Here are my citations. And I want to thank again my mentor, Dr. Nadir Yehya, as well as my co-authors listed here. And thank you all for coming to listen today. I'll be happy to take any questions.
Video Summary
In this video, Dan Balcarcel discusses identifying biomarker-based pediatric ARDS subphenotypes using machine learning and clinical data. ARDS is a common condition in pediatric intensive care units with a high mortality rate and no specific pharmacotherapies. Subtyping ARDS could help develop more precise management strategies and aid in trials. Previous subtyping strategies using inflammatory cytokines have shown promise but require time-consuming and expensive testing. Balcarcel discusses a potential solution using a machine learning classifier model that can predict ARDS subphenotype using readily available clinical data alone. Their model achieved an area under the curve of 0.91, suggesting that subphenotyping children with ARDS could be done quicker and at a lower cost.
Asset Subtitle
Pulmonary, Pediatrics, 2023
Asset Caption
Type: star research | Star Research Presentations: Biomarkers I, Pediatrics (SessionID 30007)