CURE ID: Leveraging Real World Data in COVID-19 and Beyond
Video Transcription
Hello, and welcome to this webcast. I'm going to go through the introductions a little bit slowly because I think there are people still trickling in. Today's webcast is titled CURE ID: Leveraging Real-World Data in COVID-19 and Beyond. I'm Dr. Smith Hevner. I'm Scientific Director at the Cure Drug Repurposing Collaboratory with the Critical Path Institute in Tucson, and I'll be moderating the webcast today. Just a note to everyone: there will be a recording of this webcast available in about five to seven business days. You can access it by logging into your mysccm.org account and navigating to the My Learning tab. We have a few housekeeping items before we get started. There will be a Q&A at the end of the presentation, but feel free to drop your questions throughout the presentations into the question box located in your control panel. You can see a screenshot of that on the screen here. We'll answer what we need to as things are going through, and then we'll save most of those questions for the Q&A at the end. We do want to make a disclaimer that this presentation is for educational purposes only. The material is intended to represent an approach, view, statement, or opinion of the presenter that may be helpful to others. The views and opinions expressed herein are those of the presenters and do not necessarily reflect the opinions or views of SCCM. SCCM does not recommend or endorse any specific test, physician, product, procedure, opinion, or other information that may be mentioned. Just so we can get an idea of our audience, we have a couple of questions we wanted to ask y'all. You'll see a poll screen pop up. First, we want to know what people's primary roles are. Let us know: are you a physician, an advanced practice provider, a nurse, respiratory therapist, or other bedside clinical professional? Are you a researcher, a scientist, or an informaticist? Those responses are coming in. 
Great. Looks like about half of our group here are physicians, which is awesome. We've got 5% advanced practice providers and good representation of nurses. I'm a nurse myself, so I always like to see us here. Then, of course, as we would expect, a number of research scientists. I don't see anybody whose primary role is an informaticist, so hopefully this will be a very helpful webinar for people to get some understanding of the tool and the work that we are doing. In the interest of time, we'll pop right onto the next question. We want to know what EHRs y'all are using: Epic, Cerner, Meditech, Allscripts, something else? Oh, and we should have made a longer list of others. 17% are using some other EHR than what we represented, but as we expected, it's a large portion of Epic users. We know that they have a good portion of the market share. Glad to see a large number of Cerner users as well. And then our last question: we want to get a sense of how familiar people are with common data models. So our response options there are: What is that? Do you have any reference for what a common data model is? Have you heard of them? Have you worked with common data models in research or in your institution? Or are you actively involved in one of the communities? For example, the OHDSI community is a pretty active group around common data models. We'll give it just a few seconds for responses to trickle in. Oh, that's a nice spread. I think Paul and Matt will be very happy with this. We've got a number of people that are pretty familiar, but a good chunk of people that have just heard of common data models and don't necessarily have the expertise. So I want to get right into introducing our speakers. I want to make sure that they have all the time that we can give them and save some time for that Q&A. Up first, Heather Stone is a health science policy analyst at the FDA in Silver Spring, Maryland, in the United States. Then we have Dr. 
Paul Nagy, the director of education, biomedical informatics, and data science at the Johns Hopkins University in Baltimore, Maryland. And Dr. Matthew Robinson, an assistant professor in the Division of Infectious Diseases at the Johns Hopkins University in Baltimore, Maryland. Very happy to have all of our presenters here with us. And as you saw on the slide, there are no disclosures. I'm going to hand it off to Heather as our first presenter. Please go ahead. Great. Thank you, Smitty. It's a pleasure to be here and to be discussing this really very exciting project, the EDGE tool, which has been developed as a collaboration between FDA, NIH, the Critical Path Institute, Hopkins, SCCM, and many other partners. So I'm going to begin by just giving you a broad overview of the project as a whole and how we got to this point. So CURE ID was developed as a platform to capture novel uses of existing drugs. We wanted to understand how drugs were being repurposed to treat infectious diseases where there might be no adequate approved therapies. And so we wanted to see, well, what if we could collect case reports directly from clinicians of how they were using drugs off-label to treat these infections, and then share that information with the community in an open-source manner so that clinicians could learn from each other's experiences. We developed a case reporting platform, first a website and then a mobile app that can be used globally, where a clinician in about three minutes can report what drug they used for which disease, why they used that drug in a new way, what the outcome of the treatment was from their perspective, and whether there were any adverse events. And then, of course, we collect additional demographic information and that kind of thing. As you can imagine, there have been challenges in getting clinician-entered case reports, which, of course, was magnified during the pandemic. 
And so we started to look at other ways of collecting similar information as well. But the goal of CURE ID has remained the same, which is to enhance our understanding of new uses of approved medical products. I think it's really important to note that we don't want to leave it at just the anecdotal collection of reports. I think that there's a lot that can be learned from case reports, but there are also obvious limitations. And so the goal was to use this platform as hypothesis-generating and then to have a more robust process in place to facilitate clinical trials and drug development for these areas of high unmet medical need, while also recognizing the ability of this platform to serve as a resource for physicians to share information. We recognized early on that this was not an effort that could be led by FDA or NCATS alone, and so we enlisted the help of the Critical Path Institute, which in June of 2020 convened a public-private partnership called the Cure Drug Repurposing Collaboratory. Given its timing, it started with a pilot focused on drug development for COVID through use of the CURE ID platform, but with very much a view towards sustainable data and trial infrastructure. And the goal of this collaboratory was to demonstrate how data shared from clinicians in real time could be used to inform ongoing and future clinical trials, and then eventually, based on those trial results, potentially drug labeling. We have been fortunate to have many incredible partners be a part of the Cure Drug Repurposing Collaboratory. These include the Society of Critical Care Medicine, obviously, which is why we're here presenting today, the Infectious Diseases Data Observatory at Oxford, Johns Hopkins, Mayo Clinic, Emory School of Medicine, as well as many others, NIH and IDSA and so forth. 
We really make an effort to include researchers, clinicians, and patients so that they all have a voice in this process, and we seek to collaborate with each other and to find synergies to be able to accomplish our shared goals. So, as I mentioned at the beginning, there are challenges in getting case reports from clinicians. Actually, I remember years ago when we first presented this, Matt sort of laughed at me at the idea that we were going to get a clinician to enter a case report, and he wasn't wrong, you know. So as unfortunate, of course, as the pandemic has been, it did present an opportunity for us to think on a larger scale about the ability to get data and to automate the extraction of information that was already being collected in the electronic health records. So we were very fortunate last year to be awarded a large grant from the Department of Health and Human Services, ASPE, the Assistant Secretary for Planning and Evaluation, the Office of the Secretary's Patient-Centered Outcomes Research Trust Fund. Quite the mouthful. But we were awarded this grant to build the EDGE tool, which you will hear about from Matt and Paul shortly, which is a tool to help automate the extraction of de-identified data from electronic health records into CURE ID. And importantly, before it does that, it actually maps the entire EHR to OMOP, the common data model. And so it really is potentially opening up an opportunity for many more institutions to participate in sharing their information than have been able to in the past. Just for those who are unaware, the Patient-Centered Outcomes Research Trust Fund is a very interesting program and was established in December of 2010 through the Patient Protection and Affordable Care Act, with the goal of expanding comparative effectiveness research through patient-centered outcomes research. 
So as I mentioned, we have four primary institutional partners, or PIPs, who have really been leading the charge in this effort: Johns Hopkins, which has been building the EDGE tool; Emory School of Medicine, which has been really critical in our clinical trials efforts; and the Society of Critical Care Medicine's VIRUS Registry, without whose crucial efforts and willingness to share data early on none of this would have been possible, which was so appreciated. And we're really excited to also be incorporating a global perspective with our partners at the Infectious Diseases Data Observatory (IDDO) at Oxford and thinking about ways that we can have data from many different sources internationally as well. So the EDGE tool output is ultimately a high-level case report that I describe as being sort of translated from the electronic health record, or EHR, into the case report form that we use in the CURE ID platform. And this all occurs at the institutional site. The data is completely de-identified patient-level data, and it's approximately 40 variables that are then made openly accessible via the CURE ID platform. One of the critical aspects of the application, and one of the critical aspects of the ASPE grant, is the open-access nature of this data and making it available to as many people as possible while, of course, respecting patients' privacy and institutional privacy. A larger, more detailed data set of approximately 180 variables will be made available to researchers who submit a research proposal via the IDDO platform. For those of you who are part of SCCM, you will be able to submit a research proposal through the VIRUS Registry as well. But to get the data incorporating the ISARIC data from the UK and the IDDO data in that 180-variable data set, one could apply to IDDO. So once completed, the expanded CURE ID platform will house tens or hundreds of thousands of COVID case reports. 
And this large collection of cases should enable the clinical research and regulatory communities to hopefully identify signals of potentially safe and effective COVID treatments from amongst the armamentarium of existing therapeutics approved by FDA and similar regulatory authorities. So while the data infrastructure being built is specific to COVID-19, it is being designed in a sustainable manner so that it can be promptly deployed for future uses, including outbreaks of existing or emerging infectious diseases, like monkeypox, for example, as well as other diseases with high unmet medical needs. And we're exploring the possible utility in a number of different areas that are very exciting. This should provide real-time access to the global clinical experience with repurposed drugs, where there's an immediate need to identify potential existing treatments in the absence of dedicated drug development, as well as inform ongoing and future clinical trials. So I'll just review the grant objectives so you understand why we've been doing this and how we've been approaching it. The objectives include: to develop partnerships with clinical consultants, technical consultants, and data providers to capture the most critical treatment and patient outcomes data from electronic health records and registries; and to build the infrastructure, technology, and methodology needed to extract and aggregate targeted clinical data from many global sources. This includes the electronic health records and registries, but also our clinician-submitted cases, as well as published cases; we have a team that extracts the information from published case reports and adds it to CURE ID. So the hope is to make it sort of a one-stop shop of information on all the available data for how a repurposed drug is being used. 
To do this, we need to customize the CURE ID data fields to specifically accommodate COVID information that is available in the electronic health records, which we've done with the help of the partners here, and then make large quantities of de-identified patient-level data on COVID treatments from many different sources rapidly and openly available. In addition, we wish to expand clinician engagement and the creation of global treatment networks, increase patient involvement in the platform, and we are exploring the opportunity for a patient portal, including a patient case report form. Then, ultimately, we want to identify promising drugs, combinations, or regimens for COVID and other diseases with inadequate therapy from the data in CURE ID that could then be further studied in robust clinical trials. Finally, we hope to ensure the sustainability and use of the CURE ID platform EHR registry expansion beyond COVID, and this would include the open dissemination of work products from the platform, such as providing open-source access to the software code, which is available on GitHub and will continue to be updated, the methodological learnings through publications, and the tools developed for automated extraction so that they can be used by other interested parties. This is just the flow of data. So the Society of Critical Care Medicine's VIRUS sites will contribute data through both manual extraction as well as the EDGE tool automation. They will send 180 data variables, which through the EDGE tool have been mapped to OMOP, the common data model. That data will go to the VIRUS Registry, where it will be added to data from Johns Hopkins and Emory, which also extracted data via the EDGE tool. Then this data set of 180 variables will be sent in OMOP from SCCM to IDDO, where it will be mapped to CDISC by IDDO, as that's the data standard they use. Data not collected via the EDGE tool will be added from the ISARIC platform. 
And then the VIRUS data set of just 40 variables will be sort of cut off and sent to NCATS, where it will be hosted on the CURE ID platform and made fully available as well as fully anonymized. We're currently undergoing a major revision of the CURE ID platform, including improved data visualization, searching, filtering, and other utilities. The new release is expected on June 15th. So while you're welcome to go to the website or download the app now, I highly encourage you to check back in about three weeks, because there are really very major improvements being launched then. We're very excited to note that the first pilot site has successfully mapped their entire EHR, which did not use a common data model, to OMOP and extracted the required tables, and that this was able to occur in a fraction of the time that was normally required. I'm sure Matt and Paul will go into more detail about that. To do that, they were also able to successfully implement the cohort definition to correctly identify patients for inclusion. It's just more complicated than you would think it would be. So currently there are 16 VIRUS sites in the hopper to begin deploying the EDGE tool now that the pilot is complete; they're being assessed by Hopkins and contracts are being put in place. And we welcome interest from other parties that would like to be EDGE tool sites. We are also building a patient portal and improved data visualization tools. And then, importantly, we're exploring how we could funnel potential findings from the observational data into more robust and sustainable platform randomized controlled trials to really test whether those findings hold up under randomization or not. And that is it for me. Thank you so much. Matt, over to you. Thank you, Heather, for getting us started on this conversation. Presenting with Paul here, I'm the clinician, and Paul is the informaticist. We work together on this. 
So the things that we're going to try to cover today are a little bit about what the CURE ID project is, although I think Heather covered that pretty well already; what a common data model is; what OHDSI is and why we want to use OHDSI for registries; why we need tools to assemble these common data models; and then, more specifically, this EDGE tool that we've been building to help enable this. So I'll start with the beginning of the COVID pandemic. We've thought that real-world data can be really helpful to evaluate repurposed drugs. It can provide a lot of insight, but it can also lead to a lot of confusion. The example of hydroxychloroquine, I think, is great for this. At the beginning of the pandemic, there was some idea that using hydroxychloroquine would help with COVID, and some observational studies were done that seemed to suggest it was helpful. It was quickly realized that there were some limitations in the methods for those studies, and subsequent randomized controlled trials showed that there was no benefit. So this just shows us the problems and the pitfalls of using real-world data in real time in a pandemic. You really need robust methods to address this. So let's say we have a patient A who's sitting there with an oxygen saturation of 95%. He gets drug X and he survives. And then we say, all right, well, I want to compare him to another patient who's similarly ill. So we've got patient B, also sitting there with COVID with an oxygen saturation of 95%. He did not get drug X and he dies. So we might conclude, oh, drug X saves lives. This is great. And if we were using very limited information from electronic medical records, we could come to this conclusion. But let's say we tell you that patient A is the one on the left, who's sitting there on room air doing okay, and patient B is on a ventilator; both have an oxygen saturation of 95%. 
But you can see how there may be other reasons for patient A surviving and patient B not. So when you want to use observational data to understand drug efficacy, you really need to use robust causal inference models that match similarly ill patients, so that you're actually isolating the impact of the therapeutic itself on the clinical outcome. When you use incomplete registry data, you can really go down roads that lead you to misinterpreting findings. And this is sort of the impetus for us to try to do better with a clinical registry, for us to really understand drug efficacy. So this CURE ID project, which Heather's already talked about in a lot of detail, is a large project and gives us an opportunity to build a larger set of tools to help extract the relevant data from electronic health records for us to do these sorts of analyses. And if we were to create a registry manually, it's extremely resource intensive. This article here is showing you, for a traditional trauma data registry, how much effort is required to maintain it. For every 200 to 300 patients in a traditional registry, where data is entered manually and curated manually, you need about half an FTE, which is just truly unsustainable when you have a common disease and you need a lot of data to assemble enough patients that are similarly ill to do this sort of causal inference study. So traditional data management is really difficult to scale. Usually when we go to do a research project, we come up with a schema to represent the data that we need for that project. So let's say for project one on the left here, we decide that we want to know something about neutrophils and asthma, and we decide that we're going to come up with a few variable names, which you see on the left. Let's say we'll look for absolute neutrophil count. 
We'll give it a certain variable name, and we'll capture a history of asthma on our REDCap form, for example. Project two might be doing something very similar, but they might collect information on neutrophil percentage. They may group COPD and asthma history together. And they're going to use different variable names with different formats. And so every time you do a new study like this, it's a time-consuming effort to figure out which variables we need, how we should name them, and what the conventions should be for formatting. And then let's say a researcher wanted to take data from project one and project two and analyze it together. They have to go through this whole process of figuring out, all right, let's match up the variables by their names. Do they actually represent the same thing? Oftentimes they represent similar but not exactly the same concepts. And so this really limits the ability to combine data from multiple research projects. Common data models offer a better way to do this. But what is a common data model? Before I got started on this, I hadn't thought much about standard terminologies. But if anyone has ever billed a patient, you've probably come across ICD-10 codes or CPT-4 codes and just thought of them as generally being annoying. And they are, but they do attempt, at least, to codify in a standard way between different sites and providers what we're actually talking about. So common data models use these standard terminologies; we'll talk more about that later. A common data model organizes data from multiple different sources and stores the data in a standard set of tables that can easily be combined from multiple sites. And, also importantly, the relationships between these concepts are clearly defined, so that we know how one variable relates to another. So I'm going to turn it over to Paul. Yeah, that's great, thank you. So, OHDSI stands for Observational Health Data Sciences and Informatics. 
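The two-project harmonization problem described above can be sketched in a few lines of code. This is a minimal illustration, not any project's actual schema: the variable names and the `CONCEPT_MAP` lookup table are hypothetical stand-ins for the kind of local-name-to-standard-concept mapping a common data model formalizes.

```python
# Two projects captured overlapping data with different variable names.
project1_rows = [{"anc_abs": 1.8, "hx_asthma": "Y"}]       # absolute count
project2_rows = [{"neut_pct": 62.0, "copd_asthma": 1}]     # percentage

# Local-name -> (standard concept name, unit). Note the map makes explicit
# that "neutrophil count" and "neutrophil percentage" are similar but NOT
# identical concepts, and that project two merged asthma with COPD.
CONCEPT_MAP = {
    "anc_abs":     ("absolute_neutrophil_count", "10^3/uL"),
    "neut_pct":    ("neutrophil_percentage", "%"),
    "hx_asthma":   ("history_of_asthma", None),
    "copd_asthma": ("history_of_asthma_or_copd", None),
}

def to_common(rows):
    """Rewrite project-specific rows into (concept, value, unit) facts."""
    facts = []
    for row in rows:
        for local_name, value in row.items():
            concept, unit = CONCEPT_MAP[local_name]
            facts.append({"concept": concept, "value": value, "unit": unit})
    return facts

# Once both projects speak in concepts, their data can be pooled safely.
combined = to_common(project1_rows) + to_common(project2_rows)
```

The payoff is that the decision about whether two variables mean the same thing is made once, in the mapping table, rather than rediscovered by every researcher who tries to combine the data sets.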
This is an effort that's been going on for 10 years. It came out of the FDA as an effort to do drug surveillance in the real world. But really this community has grown to become a very large international open science community trying to study electronic medical record data. It has become the de facto common data model. And so what does that mean? The first thing I wanted to share is that today your data is in a proprietary format. It is in Epic or Cerner or whichever EMR you're using. And that data is stored in literally thousands of different tables sitting within your system. Oh, do you want to hit advance? And the first step is that there's a Rosetta Stone step. What we've done is we've created a very simplified data model for a patient, and we've created the model just enough so that we can answer basic research questions around it. And we've taken the major elements of the medical record and converted them into this common model. So instead of it being in Epic or Cerner or another vendor's format, this is now in a neutral standard model. Another consideration is that these vendor data models are often proprietary: you're not allowed to use them for research purposes, and you cannot share them. They're also considered intellectual property. And so it's very hard to share our code or our analysis on proprietary data without the permission of that EMR vendor. So one big advantage of the OMOP/OHDSI community is that they've created this Rosetta Stone for simple medical record data, to be able to convert from all of these different EMR types around the world into a common format. And I'll share that format. But OMOP and OHDSI are much more than just a common data model. It has also been an ecosystem where literally hundreds of software developers have been building tools off of that common data model. 
Once you have a common data model, you can start building all kinds of analytics tools and visualizations and different ways of doing survival analysis. And so it's actually been a very large open-source ecosystem that has now contributed over 13 million lines of code to open-source repositories on GitHub. And this means that there are very powerful tools that come with this data model. It's not just the data model itself: these tools can help you analyze data, can help you really understand it, pose questions and answer them, and help you map proprietary data into the OMOP format. And this has grown. In this community, we now have over 350 databases around the world. We've mapped over 800 million unique patient records into OMOP. And we can now do large-scale evidence generation, where we do an analysis and share the analysis and the code between the sites without having to necessarily share the data. And this can be very advantageous in certain types of architectures. We've had a large study where over 100 million patients were studied through the same analysis by sharing the analysis across different institutions. OK, so this really came out of Columbia University with George Hripcsak. And this is really about creating an accessible, reliable way of repeatedly building reproducible evidence off of medical records. Medical records have a lot of observational bias. Patients are seeking care because they're sick. And so understanding all the different sources of bias is really an important part of this. Trying to build this network to collect all this data really provides a valuable avenue for looking at real-world evidence. And so the data model does what you would think it would do. Basically, it looks for things like conditions. What are the symptoms of a patient? It tries to describe the disease a patient might have up until a certain point. 
It looks at what interventions have been done, from procedures and medications. And then it looks for what measurements might be available to describe that disease. And everything in OMOP is mapped to a concept. I think one of the really smart things about OMOP is they didn't pick just one controlled vocabulary. ICD-10 is great for diagnosis codes. CPT-4 is good for procedure codes. But there are other coding systems, and it's really important to be able to take advantage of them, like RxNorm for medications and LOINC for labs. And so what they did was they said, you know what? We're just going to make everything a concept. We're actually going to bring in 150 controlled vocabularies, bring them together, and define things as concepts. And so at least everything will have a definition and will be tied to a concept that you can then map. And so this allows us to be able to do different types of analysis. The other key part is that everything has a timestamp as well. It is tied to an event. Because we really want to know what diseases the patient had before they were hospitalized, so we can look at them as comorbidities. We want to know what medications they were taking beforehand, to see the impact of this additional medication. So we're really trying to look at this as a way of doing time-to-event types of analysis in this data model. And so here is the data model. Instead of the thousands of tables you might see in an EMR, we've really boiled it down to about 12 important clinical tables that describe a patient. A patient's going to have multiple visits with a health system, whether it's telemedicine, outpatient, or inpatient. And in those visits, they will then have medications and procedures and diagnosis codes and measurements. And that's generally how we look at the event analysis. And then from here, we can map things into standard concepts. 
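To make the table-and-timestamp idea concrete, here is a toy sketch of a few OMOP-shaped rows. The table and column naming follows the OMOP CDM convention, but the concept IDs shown are illustrative placeholders rather than verified vocabulary entries, and a real instance would of course live in a database, not Python dictionaries.

```python
from datetime import datetime

# A person, one inpatient visit, and clinical events tied to that person.
person = {"person_id": 1, "year_of_birth": 1955, "gender_concept_id": 8507}

visit_occurrence = {
    "visit_occurrence_id": 10, "person_id": 1,
    "visit_concept_id": 9201,                        # inpatient visit (placeholder)
    "visit_start_datetime": datetime(2021, 1, 5),
}

condition_occurrence = [
    # Diagnosed years before the visit -> a comorbidity, not a complication.
    {"person_id": 1, "condition_concept_id": 201826,  # diabetes (placeholder)
     "condition_start_datetime": datetime(2015, 3, 2)},
]

drug_exposure = [
    {"person_id": 1, "drug_concept_id": 1234567,      # placeholder drug concept
     "drug_exposure_start_datetime": datetime(2021, 1, 6),
     "visit_occurrence_id": 10},
]

def comorbidities(conditions, visit):
    """Conditions recorded before the index visit count as comorbidities."""
    return [c for c in conditions
            if c["condition_start_datetime"] < visit["visit_start_datetime"]]

prior = comorbidities(condition_occurrence, visit_occurrence)
```

Because every row carries both a concept ID and a timestamp, the same simple filter distinguishes "disease the patient brought with them" from "event that happened during the stay", which is exactly the time-to-event framing described above.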
And so this is what's considered the common data model for OHDSI. And what's nice is that everything is tied to a concept, tied to a common terminology. And these common terminologies are really defined by the international community, so this is done in a very open, repeatable way. So the goal is now I can say: I can define a cohort of patients who were hospitalized for COVID who went on a ventilator. And I can do that as an analysis and then share that code with my colleagues at another university. They can run the same analysis. And very quickly, we can aggregate not just the patient row-level data, but the results around that. And so the goal, I think the real benefit here, is how do we take advantage of the OMOP common data model for registry data? It's been shown to create a lot of efficiencies. The biggest goal, of course, is trying to reduce the provider workload of having to do manual chart abstraction to collect the data manually and then submit it, when a lot of this data is already coded in the EMR. So we have a lot of the base facts. We have the medications, labs, procedures, and conditions. And so the question is then just how do we bring in the extra pieces that are critical for the registry, around respiratory management for COVID, to make sure that we can really capture everything we need for that registry? So the opportunity is great. We've seen other societies moving towards OMOP for clinical registries so that they can really harness a lot more data by not being rate-limited by that half an FTE for 300 patients. At Johns Hopkins, we've converted 2.5 million patient records over the last five years into OMOP. And so I can now really easily scale a lot more for different types of registries. And so the goal in this project is that we really want to reduce the time and cost to convert your EMR data into OMOP. 
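The shareable-cohort idea above can be sketched as plain code over OMOP-shaped rows: because every site stores the same tables with the same concept IDs, the identical function runs unchanged on anyone's data. The concept IDs here are illustrative placeholders, not verified OMOP vocabulary IDs, and real cohort definitions are usually built in tools like Atlas rather than hand-written.

```python
COVID_CONCEPT = 37311061      # COVID-19 condition (placeholder)
VENT_CONCEPT = 4230167        # mechanical ventilation procedure (placeholder)
INPATIENT_VISIT = 9201        # inpatient visit (placeholder)

def cohort(condition_rows, procedure_rows, visit_rows):
    """Return person_ids hospitalized with COVID who were ventilated."""
    covid = {r["person_id"] for r in condition_rows
             if r["condition_concept_id"] == COVID_CONCEPT}
    vented = {r["person_id"] for r in procedure_rows
              if r["procedure_concept_id"] == VENT_CONCEPT}
    inpatient = {r["person_id"] for r in visit_rows
                 if r["visit_concept_id"] == INPATIENT_VISIT}
    return covid & vented & inpatient        # must satisfy all three criteria

# Toy data: two COVID inpatients, but only patient 1 was ventilated.
conditions = [{"person_id": 1, "condition_concept_id": COVID_CONCEPT},
              {"person_id": 2, "condition_concept_id": COVID_CONCEPT}]
procedures = [{"person_id": 1, "procedure_concept_id": VENT_CONCEPT}]
visits = [{"person_id": 1, "visit_concept_id": INPATIENT_VISIT},
          {"person_id": 2, "visit_concept_id": INPATIENT_VISIT}]

members = cohort(conditions, procedures, visits)
```

Sharing this function (rather than the rows it runs over) is the point: each site executes it locally and only the aggregate results travel, which is what makes the federated, no-data-sharing study design possible.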
And once you have it in OMOP, it's in a common format. And then we help you maintain that instance and use that data to help you deliver value for your health system. So with this common data, now that you can share analysis, you can use this for quality measures. You can use this for doing other types of analysis around your health system, as well as really supporting clinical research and clinical registries. And so this really gives you an opportunity to have that data in a format where you can then do all kinds of longitudinal analysis. And so what is the goal of our EDGE node? Our goal is to assist 60 health systems in converting their data from their EMRs into OMOP and to really track how much time it's taking on their end. And part of it is also not just helping them convert their data, but helping them deploy some of the tools in OHDSI, which we'll be showing you, that are open source and that can really help you with your ETL, can help you with managing your system, and also help you conduct clinical analysis on your data. And so our goal is really to build the capacity within these health systems, at a very low cost and in a short period of time, to help them take advantage of this common data model. And so the components of the EDGE tool are going to be the software that we've taken from the OHDSI community, put into a package that we can then deploy within a health system. And so you can scroll through these slides. So the goals are: we want to have a web-based decision support tool to help the site with their custom concept mapping. Some data is going to be in flowsheets or smart data elements, and so we wanted to build some tools that really aid you in guiding that mapping. We also want to leverage the fact that a lot of the EMR sites really have a base configuration that we can take advantage of. 
We found that by taking scripts from Hopkins and deploying them at Prisma, there was about a 70% overlap with the base of what we'd done, and they were able to take advantage of that and radically reduce their time to do the ETL. Then we help them with configuration management of all of their transformations from their data into OMOP, and with managing that process. You want to run this on a monthly basis, or based on what your research needs are for clinical registries. We have some prospective clinical registries at Johns Hopkins where we would like to have that data projected on a weekly basis. Another big part of this is helping deploy the data quality tooling; because OMOP holds observational data, OHDSI puts a lot of work into data quality. We run over 3,000 data quality tests looking at conformance, completeness, and plausibility of the data, to evaluate whether there are pieces missing in your ETL process. And then, with the Edge tool, we can bring that software with all the key components needed to define the patient population for the VIRUS registry, perform de-identification on that data, and submit it to the VIRUS registry. So we're trying to make it easier for the site to convert their data, manage their data, and then use that data to submit to the VIRUS registry. So this is what it looks like. From a technical perspective, we did a partnership with Microsoft. This is all open source software. We've worked with them on the Azure platform, and we've built containers, which are basically clusters of software put together, and we deploy those containers to handle the different pieces at a site.
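The three families of checks mentioned above can be illustrated with toy versions. This is a sketch of the idea only, not the OHDSI Data Quality Dashboard itself; the field names and the plausibility thresholds here are made up for illustration.

```python
# Toy versions of the three check families described in the talk:
# conformance (right shape), completeness (populated), plausibility (in range).
rows = [
    {"person_id": 1, "concept_id": 3027018, "value": 72},   # plausible heart rate
    {"person_id": 2, "concept_id": 3027018, "value": 410},  # implausible value
    {"person_id": 3, "concept_id": None,    "value": 80},   # missing concept
]

def conformance(rows):
    """Rows that fail the shape check: concept_id must be an integer."""
    return [r for r in rows if not isinstance(r["concept_id"], int)]

def completeness(rows, field):
    """Fraction of rows where the field is populated."""
    return sum(r[field] is not None for r in rows) / len(rows)

def plausibility(rows, lo, hi):
    """Rows whose value falls outside a clinically plausible range."""
    return [r for r in rows if not (lo <= r["value"] <= hi)]

print(len(conformance(rows)))            # 1 row fails conformance
print(completeness(rows, "concept_id"))  # 2 of 3 rows populated
print(len(plausibility(rows, 20, 250)))  # 1 implausible heart rate
```

The real dashboard runs thousands of such tests and, as noted in the talk, surfaces the failing rows along with the query that found them.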
And so this is what's nice about cloud computing: you can stand this up now. If you have an Azure relationship, you can have a one-click deployment that helps with the configuration and stand-up. While OHDSI is a phenomenal environment, there's a lot of software, built in R or JavaScript or Java, and putting all the pieces together can take a significant amount of time. We wanted to make it as easy as possible to do a one-click deployment of this technology. So one of the pieces is a tool called the Documentation Engine. What this does is take all of the transformation logic from your Epic or Cerner data into OMOP and turn it into a webpage that all your clinical researchers can look at. So if a researcher has a question about a data element, why it's getting strange values, or doesn't understand the units, they can easily see the ETL you're using from the source data. This is a great way of finding bugs, but also of maintaining provenance, so researchers know where the data is coming from within your EMR. Next slide. This is the data quality dashboard that comes with the Edge node. It creates a website that helps you explore all the tests it runs, looks at atemporal and temporal plausibility, and gives you a quick dashboard of which tests fail. It will actually give you the code so you can run it yourself and find exactly which rows are failing which tests. This is really helpful in accelerating your time validating your data and turning it into a first-rate research resource. And so the second container, sorry, go to the next slide. The next container is one of the main tools that comes with OHDSI: the Atlas tool. This is a web-based tool that helps you define a cohort.
And so here I'm defining a condition diagnosis code for a patient with diabetes, or maybe taking a type 2 medication, and then building all these inclusion criteria. In this use case, I'm following the graph on the right-hand side, the algorithm for the type 2 diabetes phenotype, where I exclude all the type 1 diabetes patients but then also look for either an abnormal lab, a medication, or a diagnosis code. When you start building these out, you really need complex phenotypes, because this is how you get as close as possible to what a chart review by clinical professionals would produce. This can become very complicated; on the right-hand side it becomes a thousand lines of code, but on the left-hand side you can see it through a user interface, understand what it's doing, and manage it. We've found that clinical researchers at Johns Hopkins without training in SQL, R, or Python are able to use Atlas to ask and answer clinical research questions. And so this is, I feel, a really important step forward: taking all of this raw data, converting it into a common data model, then providing these data-science-as-a-service tools on top of it to help you explore it and ask questions. Next slide. This is all built on Azure, where we've built a specific instance with Azure SQL Server and vocabulary management for setting up the environment. This whole project will be open source, and other cloud providers can emulate it. Our goal is that this becomes very easy to deploy, not just the data model but the software that comes with it and helps you manage it effectively. That's really one of the key parts of the Edge tool: packaging the OHDSI software into a more maintainable, supportable environment. Next slide. And so this is actually what it looks like.
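The inclusion/exclusion logic just described can be reduced to a few lines of ordinary code. This is a deliberately simplified stand-in for the type 2 diabetes phenotype sketched in Atlas, not the published algorithm; the code labels and the HbA1c cutoff are illustrative assumptions.

```python
# Simplified stand-in for the T2DM phenotype logic described above:
# exclude type 1 diagnoses, then require any one of a T2DM diagnosis,
# a T2DM medication, or an abnormal HbA1c lab. Labels are made up.
def in_t2dm_cohort(patient):
    codes = set(patient.get("codes", []))
    if "T1DM_dx" in codes:                 # exclusion criterion first
        return False
    has_dx_or_rx = bool(codes & {"T2DM_dx", "T2DM_rx"})
    abnormal_lab = patient.get("hba1c", 0.0) >= 6.5  # assumed cutoff
    return has_dx_or_rx or abnormal_lab

print(in_t2dm_cohort({"codes": ["T2DM_rx"]}))             # True
print(in_t2dm_cohort({"codes": ["T1DM_dx", "T2DM_rx"]}))  # False: excluded
print(in_t2dm_cohort({"codes": [], "hba1c": 7.1}))        # True: lab only
```

The point of Atlas is that this same branching logic is composed graphically, so the combinatorial SQL it generates never has to be written or read by hand.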
This is how the Microsoft team has architected the different clusters. They deploy this as a secure network within your environment. They use authentication within the local site, and they adhere to all the security policies of the health system hosting the environment. Most health systems already have a contract with Amazon, Google, or Microsoft, and our goal is that whichever one you already have, you can deploy the OHDSI stack with the registry to help you manage that data. And here's the visual, web-based tool that helps you map raw data from your clinical environment to concept codes. There are some really useful tools that do text similarity matching for flow sheet data, to help you work out what the variable names might be. Because in Epic and Cerner, it's just going to be a variable name that you've added to a flow sheet or a form, and we have to map it to an actual concept like a blood pressure, to a unit, and to an actual clinical terminology. So this is the graphical tool that does that text similarity matching to find the right concepts for the data you've provided to the system. And so, specific to CURE ID, we've worked collaboratively to identify the variables of greatest interest for looking at repurposed drugs and their impact on COVID care. As Heather mentioned, there are about 40 core variables, and about 180 in the larger set, and we've condensed them into a smaller number of OMOP concepts. We're not creating new variable names; we're simply identifying the ones that already exist within the OMOP data model, and then helping sites map their electronic health record data to those existing concepts.
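The text-similarity matching described above can be sketched with the standard library. This is a toy illustration of the idea, not the Edge tool's matcher: the concept names and IDs are placeholders, and real concept mapping would rank against the full standardized vocabulary.

```python
# Sketch: ranking standard concepts by string similarity to a raw flow
# sheet label, to suggest candidate mappings. Concept IDs are placeholders.
import difflib

CONCEPTS = {
    "systolic blood pressure": 3004249,
    "diastolic blood pressure": 3012888,
    "oxygen saturation": 40762499,
    "respiratory rate": 3024171,
}

def suggest_concepts(flowsheet_name, top_n=2):
    """Return the top_n candidate concepts for a raw flow sheet label."""
    q = flowsheet_name.lower().replace("_", " ")  # crude normalization
    scored = [(difflib.SequenceMatcher(None, q, name).ratio(), name, cid)
              for name, cid in CONCEPTS.items()]
    scored.sort(reverse=True)
    return [(name, cid) for _, name, cid in scored[:top_n]]

# A site-specific label like "SYSTOLIC_BP" has no exact match, but
# similarity ranking still surfaces the right standard concept first.
print(suggest_concepts("SYSTOLIC_BP"))
```

A production mapper would add unit checking and human review of each suggestion, which is why the talk describes this as a decision support tool rather than a fully automatic one.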
And what we found is that most existing registries go for the easier-to-map things, like ICD-10 codes or procedure codes, but then they miss the granular clinical information that I really think you need to understand how sick a patient is, particularly for COVID. A lot of our work now is diving into that minutiae: how much supplemental oxygen was the patient receiving? What drugs did they receive, and when? That's the bigger effort that's going to require a lot of collaboration with sites, but we're hoping to use tools to make it much easier. And the last thing: another key part of this is that once we've converted to the OMOP format, we know it's all coded data, so it's actually very straightforward to algorithmically de-identify the data. We randomly reassign the patient IDs and the visit IDs, and then do a shift and truncate on the date of birth, applying the shift to all the subsequent events for that patient as well. This has been published in the peer-reviewed literature; it's a standard algorithm for de-identification of EMR data. So this is another algorithm that can be run, and that data can then be submitted safely to the Society of Critical Care Medicine. We've worked with a bunch of sites already to get this started, and we're excited to move to the next phase, bring on more sites, and actually start working with the data. Thank you. Thank you all very much. We do have a couple of questions that have popped into the question field. Of course, if anyone has other questions, feel free to add those. We'll do them in order. The first one came in while Heather was presenting, and I think she'd be the best person to answer it. They say: you mentioned drugs as a focus.
Is there any consideration for devices which received emergency use authorization and approval for COVID-19 treatment? Yeah, thanks. That's a great question. At this stage, no, it's been exclusively focused on drugs. I think the Edge tool as a whole probably does have the capacity to expand quickly in new directions, things like devices, and that would certainly be an interesting area to study. That's one of the benefits of this approach: it's scalable in lots of different ways, and because it's all open source, anyone can take it, run with it, and adapt it to their own uses. Yeah, and if I can add, in OHDSI there are quite a few device researchers; there's even a device research working group. Great, thanks. Matt, anything to add? Just that with the approach we're taking, we're going down to the flow sheet level. So if you see in the EHR that a patient is on a specific device, like a ventilator or a BiPAP machine, that is data we can capture with the Edge tool. So the next question has a few typos, but I think what they're intending to ask is: is the CURE ID platform available only in the United States, or is it available internationally? Internationally, yeah. It is hosted on GitHub, it is open source, and it is freely available. Yeah, that's right. And actually, the CURE ID platform itself was developed for use around the world, including in resource-constrained settings. We've had some very interesting adventures trying to test it in remote parts of India, Peru, South Africa, all sorts of places. And it works; it's available offline. You can download the information, have offline access, and then upload things when you have internet. But I think one of the key features of the platform that goes beyond some of the other fantastic efforts throughout the government and elsewhere is its international focus.
And there's a great benefit and value to be had from looking at data that is contributed internationally, because there are such differences in practice in different places that you get a much wider range of experiences. Challenges as well, of course, with making that data comparable, but yes, it's open to any healthcare provider internationally. And Paul and Matt, are there any limitations around the Edge tool being deployed outside of the US? There are not. There are some vocabularies that have intellectual property restrictions, like CPT-4, but a lot of OHDSI relies on SNOMED, and we've mapped CPT-4 and ICD-10 to SNOMED so that we can use it across international borders. I think the biggest limitation is just that many sites internationally don't have EHRs. We started this with neglected tropical diseases, and there aren't many EHRs in those settings, so that can be a bit of a challenge. Absolutely. And then the next question is: how do these models, and I'm assuming we're talking about a hybrid model, relate to qualified clinical data registries, or QCDRs? Would it replace them? Yeah, so for QCDRs and quality measures, there are HEDIS measures and a lot of CMS measures for quality, and there's been a big move to turn them into electronic clinical quality measures, eCQMs. What's interesting here is that there are a lot of opportunities. Since we have things in a common data model, you can actually use it for certain quality measures; there's a good overlap with measuring quality within OMOP. What OMOP doesn't have: it generally has only one timestamp per event, because it's built for research. We want to know if something was performed, if a medication entered the body. But a lot of quality measures are process-oriented, looking at the time between when something was ordered and when it was administered, say, for tPA in stroke.
And so we generally can't be used for those operations-research types of quality measures, but there's a good subset we certainly can be used for. And I'll just add that I think this is a topic of a great deal of interest to other partner organizations, in terms of thinking about how a deployment of the Edge tool could benefit them, so I think it's an exciting area to explore further. Couldn't find my unmute button. Thank you. I don't see any other questions popping in, so in our last few minutes I wanted to ask: how do people get involved with these tools and resources? Specifically for Paul and Matt, what resources does someone need to have to start implementing a tool like this? Well, we certainly want you to join the CURE ID registry, and that's a great way to be involved, because we've built some support groups and we're here helping you map your data. The keys are: we work with each of the health organizations, and ideally the organization already has a cloud provider that can help them stand up these tools; that makes it much easier to manage and deploy. It's also helpful to have some resources on the data side within the organization, some data analysts. For example, at Johns Hopkins, when we originally went to OMOP a few years ago, we put a lot of time into it. We had to learn a lot of the conventions, and it took us over 2,000 hours to map all of our EMR data into OMOP. At Prisma, by guiding their data analysts, we were able to do it in just under 200 hours. So we feel that by helping you through that step, by giving you tools and making sure you don't hit too many walls or potholes, we can accelerate that process. Our goal is to get it under a hundred hours, where we really reduce that cost.
So if we can reduce that cost, then hopefully, and I'm hoping this supports more than just the VIRUS registry, many registries can be supported off of OMOP, and that way you get a lot of value out of doing this conversion once and using it for many different purposes. Yeah, and if I remember correctly, you're supporting about a hundred registries at Johns Hopkins with an OMOP instance, right? So yeah, we actually have 160 clinical registries at Johns Hopkins, and we have over 60 FTEs doing chart abstraction for those registries. So we are very interested in helping support clinical research, but it has to be done in a new, automated way, so that we have reliability in the data and lower latency, acquiring it faster by taking advantage of all the coding already being done as part of clinical care. Thank you. And I just dropped into the chat, I believe correctly, the email where people can reach out for more information. In our last few minutes, any closing thoughts from you, Heather or Matt? No, I just want to thank everyone for joining the presentation. I learn so much every time Matt and Paul present on this topic, so it was great to have it all come together in one place. As I said, I really encourage people to reach out. If you're interested in being involved, we welcome additional collaboration. Think about the ways this could benefit your institutions beyond CURE ID and the VIRUS registry, although certainly within that space as well. We look forward to the opportunity to hopefully work with you all. Matt, Paul? Yeah, I hadn't worked in this space until COVID, and it's just really interesting that we have so much potential to use electronic health record data for observational research. We're really hoping to lower the barriers for people to do that exciting work.
Paul, any final thoughts from you? No, this is a really exciting project, and I hope it sets the standard for how we move registries forward. Great. Well, thank you to our panelists. Thank you, Heather, Paul, and Matt, for taking time out of your day. Once again, there will be a recording available within about five business days on your mysccm.org account. Feel free to reach out to SCCM if you need help navigating that. I also want to give a plug that we will have a second webinar around this. We're working on setting the date, but it will be late June or early July, where we'll dig a little deeper into the implications of these kinds of resources for observational research and causal inference research. So please keep an eye out for announcements about those dates; we'll have another good discussion. We appreciate everyone taking your lunch break, or your afternoon or morning break, depending on where you are in the world. Thank you so much for joining us today. Thank you, everyone. And thanks, Smitty, for being our host. Yeah, thank you. Bye.
Video Summary
The webinar titled "Cure-ID: Leveraging Real-World Data in COVID-19 and Beyond" discussed the use of real-world data to understand the repurposing of drugs for infectious diseases. The Cure-ID platform was developed to collect case reports from clinicians about off-label use of drugs to treat infectious diseases, with the goal of sharing this information with the medical community. The platform allows clinicians to report information about the drugs used, the outcomes of treatment, and any adverse events. The data collected is then mapped to the OMOP common data model, which organizes the data in a standard set of tables that can be easily combined from multiple sites. The data is de-identified and made openly accessible via the Cure-ID platform. The webinar also introduced the Edge tool, which automates the extraction of data from electronic health records into the Cure-ID platform and maps the data to the OMOP common data model. The Edge tool allows for the scalability of data collection and can facilitate the participation of more institutions in sharing their information. The webinar highlighted the international focus of the Cure-ID platform and the potential for it to be used in other research areas beyond COVID-19. Overall, the webinar showcased the benefits of real-world data analysis and the importance of standardizing and sharing this data for the advancement of medical knowledge.
Asset Subtitle
Research, Quality and Patient Safety, 2022
Asset Caption
The Society of Critical Care Medicine is offering this free webcast to discuss its partnership with CURE ID. Join Heather Stone, MPH, from the U.S. Food and Drug Administration (FDA), and Paul Nagy, PhD, and Matthew Robinson, MD, from Johns Hopkins University as they describe the automated extraction of real-world data from the electronic health record. CURE ID is an FDA initiative designed to capture real-world data, and the edge tool is a scalable solution for data harmonization and extraction.
Meta Tag
Content Type: Webcast
Knowledge Area: Research; Quality and Patient Safety
Knowledge Level: Foundational; Intermediate; Advanced
Membership Level: Select; Professional; Associate; Nonmember
Tag: Research; Evidence Based Medicine
Year: 2022
Keywords
Cure-ID, real-world data, COVID-19, repurposing drugs, infectious diseases, case reports, off-label use, OMOP common data model, adverse events, Society of Critical Care Medicine