SCCM Resource Library
Data Science
Video Transcription
All right, just to wrap things up, thanks very much both for inviting me to talk and for including data science in the year in review. As Mark said, my name's Adam Dziorny, I'm at the University of Rochester, and I don't have any financial disclosures to report. Very briefly, for methodology, with an approach similar to the others, I searched across both pediatric critical care and data science. Data science is a rather large topic, so we broke it up into several subtopics, drawing both from sentinel article reviews and from content expertise, reviewing the tables of contents of some of our pediatric critical care journals as well as some of our informatics and data science journals, including JAMIA and ACI, and then grouping by topic and sorting by relevance and Altmetric score. The QR code there is the full query for anybody who's interested. From that, about 500 articles resulted, which were screened down to about 80 that met the criteria. I grouped those 80 into the categories you can see here, which were taken from Tell Bennett et al.'s 2019 article, Data Science for Child Health. We're going to cover, very briefly, an article or two from each of these categories. You can see that the most prominent category is predictive analytics, machine learning, and artificial intelligence. These were broken down further into subcategories, including the methodology, the clinical context, the maturity of the model presented, and the reporting quality. Again, the QR code on this slide has all of the 80 or so screened articles, and I encourage anyone interested in these areas to take a look; they were all excellent articles. So we'll begin with reviews, summaries, and perspectives. The first two articles I wanted to highlight are educational pieces that help us understand the use of machine learning and artificial intelligence, and what machine learning is.
These are great starter articles for anyone who wants to become more interested or involved in this field. Both describe common artificial intelligence and machine learning techniques used in numerous articles published in our field. The first, by Neil Shaw and Ken Remy at WashU and Case Western, also includes descriptions of common measures of model performance, which is useful as we review and critique articles in our field to understand which model performance characteristics are important. That article also links these concepts to clinical decision support tools and focuses on some of the limitations in this area. The second article, by Orkin and others at the Cleveland Clinic, also includes a summary of selected recent studies involving machine learning. Their figure one, shown at the bottom here, covers some of the common machine learning techniques, such as logistic regression, support vector machines, clustering, neural networks, and decision trees. The third article I wanted to highlight involves how we, as faculty, train our trainees to understand more about these topics. These are topics that I think we can agree are becoming more and more important over the years, and we really need to make sure that trainees coming through our programs have a good understanding, both of the concepts foundational to this work and of how to critically appraise machine learning systems. In this article, the authors defined the problem, that clinicians and trainees specifically lack foundational machine learning principles, and provided a roadmap they developed based on Kern's six-step framework for curricular development. I'm going to highlight a couple of the objectives they focused on.
You can see the curriculum objectives here: foundational machine learning concepts, development to deployment, understanding the ethical and legal considerations in clinical practice, understanding the proper usage of both electronic health record and biomedical data, and understanding how to critically appraise machine learning systems. From these, they developed a number of enabling competencies. Some challenges potentially arise with these educational strategies: as you can see, multiple objectives require interprofessional discussions with experts in data science, computer science, informatics, and engineering, which can be difficult without local or remote expertise. This is potentially an avenue for, for example, pediatric SCCM groups to organize and provide some of that remote expertise to areas that might not have it locally. Next, I'd like to jump to big data studies, which contrast with locally sourced, artisanal data. The first study I want to highlight in the big data group is from the National COVID Cohort Collaborative, or N3C. This is a study funded through NCATS that provides electronic health record abstracted data from over 65 different institutions. This particular study, by Blake Martin and his mentor, Tell Bennett, at Children's Colorado, focuses on the characteristics, outcomes, and severity risk factors of children with SARS-CoV-2 infection. It is not specifically a PICU study, but it certainly includes a number of PICU patients. Six percent of children were hospitalized, and 14% of those had severe disease; what they examined in this study were the characteristics and outcomes of these children over time.
A couple of the results I want to highlight: they noted increased severity of disease among males, obese patients, patients identifying as Black or African American, and patients with several PCCC subcategories. They also made the distinction between MIS-C and acute COVID, noting that MIS-C patients had fewer PCCC subcategories but more inflammatory lab profiles, which I'll show on the next slide. This also happened to be the study with the highest Altmetric score that I reviewed, which blew everyone else out of the water at 350. This is the figure from their paper, showing MIS-C in the first column in light blue and acute COVID in orange. To highlight: the proportion with at least one PCCC subcategory is greater in acute COVID than in MIS-C, while the inflammatory laboratory profile is certainly much more pronounced in the MIS-C group than in the acute COVID group. The other big data study I wanted to focus on makes use of a different data set, one we now commonly see published in some of our journals: the Pediatric Health Information System, or PHIS, data set. For those not familiar, this is an administrative, or billing, data set whose main advantage is that it spans the country, but it does have some limitations. This is work by Chris Horvat and colleagues at the University of Pittsburgh looking at the link between pediatric TBI mortality and median family income in the United States. To highlight a couple of the challenges with the PHIS data set: because it is administrative, cohort identification must be done with ICD billing codes, and for this particular study they relied on zip code mapping, which lacks granularity compared to census tract mapping. Those are the limitations of that data set.
Using this, they found that children from lower income zip codes, not surprisingly to anyone here, were more likely to sustain ballistic TBI and, unfortunately, also more likely to die from those injuries. You can see from the figures in their paper, with median family income on the x-axis and percent mortality and percent ballistic injury on the y-axes, that as median family income increases, percent mortality decreases and the percent of ballistic injuries also decreases. Similarly, grouping by regions of the country, we see the same pattern: median household income on the x-axis, percent ballistic TBI on the y-axis, and as median household income increases, percent ballistic TBI decreases. Next, I'd like to move on to predictive models and machine learning and cover two articles that I think represent how to both do and present this type of work. The first is by Anoop Mayampurath and Matthew Churpek at the University of Wisconsin-Madison, and it looked at not only development but also external validation of a machine learning model for prediction of potential transfer to the PICU. I highlight that because many of the models we're seeing published now are developed only at a single site, and it's really important, when thinking about these models, to externally validate at a second site; both of the studies I'm sharing here have done just that. In this paper, the group did an observational cohort study at two sites, the first to develop and internally validate and the second to test and externally validate. They used multiple models, the takeaway being that it's not necessarily about the exact type of model used, but more about the setup of the study and the features that were extracted.
As is inherent in many problems in pediatric critical care, they had a case-control imbalance: a much higher percentage of controls compared to patients who were actually transferred to the unit. With a high case-control imbalance, it is challenging for any model, even one with a high degree of accuracy, to achieve an acceptable positive predictive value. They report model performance as the area under the receiver operating characteristic curve, and they discuss the trade-off between sensitivity and number needed to alert, which is the inverse of the positive predictive value. If we have a low positive predictive value, we have a high number of alerts before we find a true positive. They compared this to a standard bedside PEWS. Figure one from their paper, on the left-hand side for site one, plots sensitivity versus the number needed to alert, again the inverse of the positive predictive value. What we can see is that as the sensitivity increases, the positive predictive value decreases, which means the number needed to alert increases: sensitivity goes up, number needed to alert goes up. At the second site, on the right-hand side, we see the same plot, with the same challenge: as sensitivity increases, you need to alert more times to find a true positive. Note that the y-axes on these figures are slightly different, so the external validation, as it usually does, resulted in slightly decreased real-world performance. The second study in this category that I wanted to highlight briefly is from the Swiss Pediatric Sepsis Study: prediction of recovery from multiple organ dysfunction syndrome in pediatric sepsis patients.
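The relationship between number needed to alert and positive predictive value described above can be sketched in a few lines of Python. The alert counts here are hypothetical, chosen only to illustrate the trade-off, and are not taken from the paper:

```python
def number_needed_to_alert(true_positives: int, false_positives: int) -> float:
    """Number needed to alert (NNA) is the inverse of the
    positive predictive value (PPV): how many alerts fire,
    on average, for each true positive the model catches."""
    ppv = true_positives / (true_positives + false_positives)
    return 1 / ppv

# Hypothetical counts at two operating points: lowering the alert
# threshold raises sensitivity (more transfers caught) but floods
# the bedside team with false positives, driving NNA up.
strict = number_needed_to_alert(true_positives=20, false_positives=80)    # PPV 0.20 -> NNA 5
loose = number_needed_to_alert(true_positives=35, false_positives=315)    # PPV 0.10 -> NNA 10
print(strict, loose)
```

This is why a model with an impressive area under the ROC curve can still be impractical at the bedside when events are rare.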
Once again, to highlight: this study both developed and then externally validated, at a second site, a model to predict recovery from MODS in children with blood-culture-confirmed bacteremia. Again, two sites, one to cross-validate and the second to test. They also used multiple models and reported both the area under the receiver operating characteristic curve and the area under the precision-recall curve. Recall that precision is just positive predictive value and recall is just sensitivity; those are the machine learning terms for the same exact things. The other thing I wanted to highlight with this study: at the bottom you can see the GitHub link for their work. One of the nice things about this group is that they shared all of their data and all of their analysis software for this work in a publicly available repository. As we move toward more open science, and certainly with the new NIH data sharing requirements, I think this is going to become more prevalent and, hopefully, more useful to those trying to replicate the work. In this study, we can see the two sites: site one is panels A and B and site two is panels C and D. For each site, they report both the receiver operating characteristic curve, in A and C, and the precision-recall curve, in B and D. As we're aware, the receiver operating characteristic curve is a plot of the true positive rate, or sensitivity, versus the false positive rate, and the goal is to bring that curve as close to the (0, 1) point as possible, encompassing as many true positives as we can. Similarly, with the precision-recall curve, the goal is to bring that curve as close to the (1, 1) point as possible. They concluded from this study that, having externally validated this model with reasonable measures of accuracy, it has the potential to be included in electronic health record systems and contribute to patient assessment.
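The clinical-to-machine-learning vocabulary mapping above can be made concrete with a small sketch. The confusion matrix counts are hypothetical, purely for illustration:

```python
def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute the metrics behind ROC and precision-recall curves
    from a 2x2 confusion matrix, labeling each with both its
    machine learning name and its clinical equivalent."""
    return {
        "precision (PPV)": tp / (tp + fp),          # y-axis of the PR curve
        "recall (sensitivity)": tp / (tp + fn),     # y-axis of ROC, x-axis of PR
        "false positive rate": fp / (fp + tn),      # x-axis of the ROC curve
    }

# Hypothetical counts for a single operating threshold of a
# recovery-prediction model: 40 true positives, 10 false alarms,
# 20 missed recoveries, 130 true negatives.
m = confusion_metrics(tp=40, fp=10, fn=20, tn=130)
print(m)
```

Sweeping the decision threshold and recomputing these values at each point traces out the ROC and precision-recall curves shown in the paper's panels.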
Which leads us nicely into the informatics and decision support section, where again I want to briefly highlight two articles. The first is a scoping review of early detection of sepsis among pediatric, neonatal, and maternal inpatients. The reason I wanted to highlight this in particular is that of the 13 pediatric studies they reviewed, only two reported on the usability outcomes of these systems. That's a really important category that looks not just at how effective the model was, but at how effective it's going to be when you actually put it into clinical practice. It's one thing to derive an excellent model with excellent characteristics, but it's a whole other thing to actually get it to the bedside and make it usable. One of those studies was in the neonatal population and one was in the pediatric population. Following along those lines, I want to share work by Christina Cifra and Hardeep Singh, done when they were at Iowa and UT Southwestern; Christina is now at Boston Children's. This work reported outcomes of pediatric intensive care unit patients back to referring hospitals via an EHR-based feedback system, and its focus was really to understand the feasibility, usability, and relevance of this particular semi-automated system. This ties in nicely with the human factors approach that Erin was discussing earlier. They used a combination of qualitative and other mixed-methods approaches: first, to understand the socio-technical environment, basically what was going on in the workplace, and second, to assess the usability of the system using quantitative as well as qualitative outcomes. They reported that this human factors engineering approach, fitting appropriately into the workflow process and demonstrating appropriate quantitative and qualitative usability, suggests the system could, in next steps, have an impact on patient and clinical outcomes.
Importantly, they did not start by trying to show that this had an impact on patient and clinical outcomes; they started by establishing that it was usable and feasible for the clinicians and the patients. This infographic shows their approach, with the three boxes they used: first, analysis of the socio-technical system; second, cooperative design; and third, iterative evaluation. Again, many of the methods they used were qualitative, including interviews, focus groups, and other qualitative evaluation methods. They did formative and summative usability testing to make sure the system they developed was usable. And lastly, they used quantitative methods, examining EHR access logs, to understand actual use. This multimodal approach should be the standard for implementing systems into our clinical workflow. Lastly, I want to finish up with one study in the omics section. Unfortunately, I didn't have time to include any studies around bias in data science, not because it's not an important topic, it certainly is, but I simply ran out of time; there are more on the QR code, if you jump back to it, to see a few of those studies. So, briefly, an omics study that utilized both an omics approach and machine learning. This is a study by Nader and others at CHOP looking at differentiating children with sepsis and acute respiratory distress syndrome using a proteomics and machine learning approach. In this study, they had two cohorts of patients: patients with ARDS without sepsis, matched to patients with ARDS with sepsis. They used anti-double-stranded DNA co-immunoprecipitation and mass spectrometry, selected proteins by relative predictor importance, and then used a random forest classifier.
Using this random forest classifier, the goal was to take the top dimensions of the random forest, separate these two populations, and then predict, against double-blinded samples, which population those samples fell into. What you can see from this figure, figure two in their paper, showing the top two dimensions of this particular random forest classifier, is that they were able to accurately separate these two populations and then accurately predict where the blinded samples fell. The point is not necessarily to make a bedside test available, but really to support the molecular definition of ARDS and potentially identify target proteins of interest. So, in summary, thank you for allowing me to talk so quickly. Data science certainly incorporates many topics relevant to the clinical researcher, the basic scientist, the QI expert, and the intensivist. There are important reviews and educational position papers that suggest opportunities for us to grow in the field. It's really important to push beyond "just another model," both by externally validating and by implementing the last mile of decision support, and to think about all of these categorizations at the end. Thank you.
Video Summary
In this video transcript, Adam Dziorny from the University of Rochester discusses the use of data science in pediatric critical care. He explains his methodology, which involved reviewing articles from various journals and categorizing them by topics such as predictive analytics, machine learning, and artificial intelligence. He highlights several articles, including educational articles on machine learning techniques and the use of clinical decision support tools. He also discusses studies on big data analysis, predictive models, and machine learning, as well as informatics and decision support systems. One study focuses on predicting potential transfer to the pediatric intensive care unit, while another looks at predicting recovery from multiple organ dysfunction syndrome. Dziorny also mentions a scoping review on early detection of sepsis and a study on differentiating children with sepsis and acute respiratory distress syndrome using proteomics and machine learning. He stresses the importance of external validation and implementation of decision support systems in clinical practice.
Asset Subtitle
Research, Quality and Patient Safety, 2023
Asset Caption
Type: year in review | Year in Review: Pediatrics (SessionID 2000008)
Meta Tag
Content Type
Presentation
Knowledge Area
Research
Knowledge Area
Quality and Patient Safety
Membership Level
Professional
Membership Level
Select
Tag
Clinical Research Design
Tag
Evidence Based Medicine
Year
2023
Keywords
data science
pediatric critical care
predictive analytics
machine learning
artificial intelligence