Research on Microeconometrics: Identification, Inference, and Applications

Periodic Reporting for period 4 - ROMIA (Research on Microeconometrics: Identification, Inference, and Applications)

Reporting period: 2020-07-01 to 2021-12-31

This research project is motivated by three observations regarding recent trends in empirical economics using micro-level data. First, researchers are increasingly aware of the trade-off between credibility and the strength of the assumptions maintained. This trend has led to recent intensive research in partial identification. Second, applied empirical research is increasingly based on data collected for study by individual researchers, quite often through laboratory or field experiments. Third, high-dimensional data are more readily available than ever before, and have received growing attention in economics.

Generally speaking, the purposes of econometrics are (i) to help empirical researchers understand under what conditions interesting features of an econometric model can be identified from the population; (ii) to develop suitable corresponding methods for estimation and inference; and (iii) to learn about parameters of interest, such as those governing mechanisms behind economic behaviours, impacts of social policy, and predicted outcomes under counterfactual exercises. Textbook econometrics implicitly assumes that (i) objects of interest are point identified, and (ii) datasets possess a small number of variables relative to sample size. In other words, textbook treatments of econometrics do not pay careful attention to identification problems, do not explicitly consider the research stage of data collection, and presume that the sample size is sufficiently large relative to the number of variables. Therefore, there is a call for research to improve standard econometric practice by facing identification problems upfront, by providing econometrically sound guidelines for data collection, and by making use of the increasing availability of high-dimensional data without sacrificing the credibility of econometric methods.

This research project aims to contribute to advances in microeconometrics by considering the issues of identification, data collection, and high-dimensional data carefully. The proposed research builds on semiparametric and nonparametric approaches to increase the credibility of proposed econometric methods. The key objectives are as follows.

(1) To develop identification results of practical value and to characterize optimal data collection for applied researchers.

(2) To make advances in estimation, inference, and testing in a variety of microeconometric models.

(3) To produce credible evidence in applied microeconometric research.

(4) To develop computer software that implements newly available microeconometric techniques.
The PI and researchers have produced a large number of academic articles. During the project period, a total of 46 papers were written. Among these, 30 papers are either published or in press, including four publications in general interest journals (three in Econometrica, one in American Economic Review). The remaining 16 working papers are completed and under review for publication. All research papers are available in the public domain as cemmap working papers or arXiv working papers.

In addition, a conference entitled "Econometrics for public policy, methods and applications" took place in London on 14-16 April 2016 (jointly sponsored with cemmap). About 50 academic participants attended this event. Another conference entitled "Conference on optimisation and machine learning in economics" took place in London on 8-9 March 2018 (jointly sponsored with cemmap). More than 80 academic participants attended this event.

The research is of a basic nature and has potential applications not only in economics but also in other social sciences. Furthermore, the methodology developed can influence statistics and machine learning.
Overall, the project has been successful since it started in January 2016. Its progress has been steady and the high productivity continued to the end of the project in December 2021. The project achieved its original objectives, and the main results of the project include several key publications:

Lee, S. and Weidner, M. (2021), Bounding Treatment Effects by Pooling Limited Information across Observations, arXiv Working paper, arXiv:2111.05243 [econ.EM]

Carneiro, P., Lee, S. and Wilhelm, D. (2020), Optimal data collection for randomized control trials. The Econometrics Journal, 23: 1-31.

Lee, S. and Salanié, B. (2018), Identifying Effects of Multivalued Treatments. Econometrica, 86: 1939-1963.