A major concern in many social sciences is that it is often impossible to study the causal impact of a policy intervention or treatment through a randomized controlled trial. Instead, often only observational data are available for this task. Over the past two decades, regression discontinuity (RD) designs have become widely used in empirical economics for estimating causal effects from observational data. In these designs, units are assigned to the treatment group based on whether a special covariate, called the running variable, exceeds a specific cutoff value. When certain conditions are met, units near the cutoff can be considered as if they were randomly assigned to receive the treatment. This simple comparison of units just above and just below the cutoff helps identify the causal effect of the treatment.
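The comparison of units near the cutoff can be illustrated with a minimal sketch of a sharp RD estimate based on local linear regression, a classical technique in the RD literature. Everything in it is an assumption made for illustration and not part of the project's own methods: the function names, the triangular kernel, the fixed bandwidth of 0.5, the convention that units at or above the cutoff are treated, and the simulated data with a true jump of 2.0.

```python
import random


def weighted_intercept(xs, ys, ws):
    """Intercept of a weighted least-squares line y = a + b * x."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    return my - (sxy / sxx) * mx


def sharp_rd_estimate(running, outcome, cutoff=0.0, bandwidth=0.5):
    """Difference between the right and left local linear fits at the cutoff.

    Units with running >= cutoff are treated (an assumption of this sketch).
    Weights come from a triangular kernel; units outside the bandwidth are dropped.
    """
    left = ([], [], [])   # centered running variable, outcome, kernel weight
    right = ([], [], [])
    for r, y in zip(running, outcome):
        d = r - cutoff
        w = 1.0 - abs(d) / bandwidth
        if w <= 0:
            continue
        side = right if d >= 0 else left
        side[0].append(d)
        side[1].append(y)
        side[2].append(w)
    return weighted_intercept(*right) - weighted_intercept(*left)


# Hypothetical simulated data: linear trend plus a treatment jump of 2.0 at zero.
random.seed(0)
running = [random.uniform(-1, 1) for _ in range(2000)]
outcome = [
    1.0 + 0.5 * r + (2.0 if r >= 0 else 0.0) + random.gauss(0, 0.1)
    for r in running
]
estimate = sharp_rd_estimate(running, outcome)
print(round(estimate, 2))  # should land near the simulated jump of 2.0
```

In applied work the bandwidth would be chosen in a data-driven way and accompanied by a valid confidence interval; the sketch fixes it only to keep the example short.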
The goal of this project is to expand the tools available to researchers working with RD designs. It involves both methodological research in econometrics and statistics and practical work with real data. The project has developed more accurate and easy-to-use methods for estimation and inference, addressing common issues that affect the validity of RD analysis.
Part I focuses on covariates and group structures in RD designs. The research has led to two new approaches that improve estimation precision by making use of high-dimensional covariates. This is achieved by combining classical nonparametric regression techniques commonly used in the RD literature with modern machine learning methods that help trace out the most relevant covariate information.
Part II develops methods for RD designs when the running variable takes on only a moderate number of distinct values. The research shows that the commonly used method of clustering standard errors at the level of the running variable to account for its discreteness has very poor properties and should not be used. Instead, the project recommends using recently proposed "bias-aware" methods that are well-suited for dealing with discrete covariates. The project also extends the "bias-aware" methodology to fuzzy RD designs with discrete covariates and other forms of irregular support.
Part III introduces methods that account for manipulation of the running variable in RD designs. The project has obtained new partial identification results for general models of manipulation and proposed flexible methods for estimating the resulting bounds. Additionally, the project has made progress in understanding Donut RD designs, a widely used approach that involves excluding units near the cutoff to address concerns about manipulation of the running variable.
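The basic Donut RD idea of excluding units near the cutoff can be sketched as follows. This is only an illustration of the generic approach, not of the project's results: the function names, the use of ordinary least squares on each side, the bandwidth of 0.5, the donut radius of 0.1, and the noiseless data with contaminated outcomes just above the cutoff are all assumptions made for the example.

```python
def ols_intercept(xs, ys):
    """Intercept of an ordinary least-squares line y = a + b * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return my - (sxy / sxx) * mx


def donut_rd_estimate(running, outcome, cutoff=0.0, bandwidth=0.5, donut=0.1):
    """Fit a line on each side using only units whose distance to the cutoff
    is at least `donut` and at most `bandwidth`, then extrapolate both lines
    to the cutoff and return their difference."""
    left_x, left_y, right_x, right_y = [], [], [], []
    for r, y in zip(running, outcome):
        d = r - cutoff
        if abs(d) < donut or abs(d) > bandwidth:
            continue  # inside the donut hole or outside the estimation window
        if d >= 0:
            right_x.append(d)
            right_y.append(y)
        else:
            left_x.append(d)
            left_y.append(y)
    return ols_intercept(right_x, right_y) - ols_intercept(left_x, left_y)


# Hypothetical noiseless data: the true jump is 2.0, but outcomes of units
# just above the cutoff (0 <= r < 0.1) are shifted by 1.0 to mimic sorting.
running = [i / 100 for i in range(-50, 51) if i != 0]
outcome = [
    1.0 + 0.5 * r + (2.0 if r >= 0 else 0.0) + (1.0 if 0 <= r < 0.1 else 0.0)
    for r in running
]
naive = donut_rd_estimate(running, outcome, donut=0.0)      # keeps contaminated units
donut_est = donut_rd_estimate(running, outcome, donut=0.1)  # drops them
print(round(naive, 2), round(donut_est, 2))
```

In this example the donut estimate recovers the jump only because the outcome is exactly linear away from the cutoff. In general, dropping units near the cutoff means the fitted lines must be extrapolated across the excluded region, so the result depends on functional form assumptions.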
Given the popularity of RD designs and its practical focus, the project has the potential to make a significant impact on empirical economic research in various policy-relevant areas such as education and public finance. Its findings can also benefit other fields such as sociology and epidemiology, where researchers must commonly work with observational data to infer causal relationships. By expanding the range of available methods, this project contributes to advancing the field of economics and promoting progress in other scientific disciplines, ultimately leading to positive outcomes for society.