Tuesday, March 1, 2022
1:00pm Eastern Time
Rebecca A. Hubbard, PhD
Professor of Biostatistics
University of Pennsylvania Perelman School of Medicine
Abstract: Opportunities to use real-world data (RWD), including electronic health records (EHR) and medical claims data, have exploded over the past decade. The COVID-19 pandemic has provided a particularly dramatic illustration of the potential value of RWD for advancing medical research as well as the high risk of bias inherent to many such studies. Using data sources that were not collected for research purposes comes at a cost, and naïve use of these data without considering their complexity and imperfect quality can lead to biased inference. While RWD offer the opportunity to generate timely evidence grounded in real-world populations and clinical practice, issues of information bias and confounding create serious threats to the validity of these studies. The statistician is faced with a quandary: how to effectively utilize RWD to advance research without compromising best practices for principled data analysis. In this talk I will use examples from my research on methods for the analysis of EHR-derived data to illustrate how an understanding of the EHR data-generating mechanism can inform selection of appropriate study questions and development and application of statistical methods that minimize the risk of bias. The overarching goal of this presentation is to raise awareness of challenges associated with the analysis of RWD and demonstrate that valid evidence generation can be grounded in an understanding of the scientific context and data-generating process.
Dr. Hubbard’s research focuses on the development and application of methods to improve analyses using real-world data sources, including electronic health records (EHR) and claims data. The data science era demands novel analytic methods to transform the wealth of data created as a byproduct of our digital interactions into valid and generalizable knowledge. Dr. Hubbard’s research emphasizes statistical methods designed to meet this challenge by addressing the messiness and complexity of real-world data, including informative observation schemes, phenotyping error, and error and missingness in confounders. Her methods have been applied through the use of EHR and claims data to advance a broad range of research areas, including health services research, cancer epidemiology, aging and dementia, and pharmacoepidemiology.