Inference in Microeconometric Models

Periodic Reporting for period 5 - MiMo (Inference in Microeconometric Models)

Reporting period: 2021-05-01 to 2022-12-31

Unobserved differences between economic agents are an important driver behind the differences in their economic outcomes such as schooling decisions, wages, and employment durations. Allowing for such unobserved heterogeneity in economic modelling equips the specification with an additional dimension of realism but presents major challenges for econometric practice. Hence, reconciling heterogeneity in the data with econometric models is an issue of utmost importance.
The aim of this project was to develop inference methods for models with unobserved heterogeneity, with a focus on longitudinal (panel) data and network data.
The first block dealt with inference in linear and nonlinear models and enhanced the performance of statistical hypothesis tests. We have provided new bootstrap-based procedures to perform inference in general nonlinear panel data problems, new bias-corrected estimators for the distribution of unobserved heterogeneity, new standard errors that are robust in high-dimensional problems, and several new tests for serial correlation in panel data.
The second block made progress on the estimation of models for network data. The importance of social and economic connections is well established but few formal results are available. We exploited the fact that network data can be seen as a type of panel data to derive such results. Most notably, we have provided the first theoretical analysis of fixed-effect regressions on general network data, constructed new estimators for network-formation models and general nonlinear models of dyadic interaction, and derived a general approach to deal with the self-selection of reference groups in linear-in-means problems.
The third block used panel data to non-parametrically estimate dynamic discrete-choice models with unobserved type heterogeneity and/or latent state variables. Such results are important because dynamic discrete-choice models are a workhorse tool in labour economics and industrial organization.
The project has delivered 21 research papers and 6 Stata routines for implementation. For each of the three work packages we have been able to perform all the work that was planned and have achieved all scientific objectives that were set out.

WP1 delivered a variety of new methods to perform inference in panel data problems:

- Bootstrap inference for fixed-effect models (under revision for Econometrica)
- Fixed-T estimation of linear panel data models with interactive effects (2022)
- Inference on a distribution from noisy draws (Econometric Theory, 2021+)
- Heteroskedasticity-robust inference in linear regression models with many covariates (Journal of the American Statistical Association, 2022)
- Bias in instrumental-variable estimators of fixed-effect models for count data (Economics Letters, 2022)
- A portmanteau test for correlation in short panels (Econometric Theory, 2020)
- Testing for correlation in error-component models (Journal of Applied Econometrics, 2020)
- A portmanteau test for serial correlation in a linear panel model (Stata Journal, 2020)
- A note on sufficiency in binary panel models (Econometrics Journal, 2017)

WP2 derived new results on inference with network data:

- Peer effects and endogenous social interactions (Journal of Econometrics, 2022+)
- Instrumental-variable estimation of exponential regression models with two-way fixed effects, with an application to gravity equations (Journal of Applied Econometrics, 2022)
- Testing random assignment to peer groups (Journal of Applied Econometrics, 2022+)
- Fitting exponential regression models with two-way fixed effects (Stata Journal, 2020)
- Fixed-effect regressions on network data (Econometrica, 2019)
- Likelihood corrections for two-way models (Annals of Economics and Statistics, 2019)
- Semiparametric analysis of network formation (Journal of Business & Economic Statistics, 2018)

WP3 obtained new identification results for mixture models in panel data:

- Learning Markov processes with latent variables from longitudinal data (2022)
- Identification of mixtures of dynamic discrete choices (resubmitted to Journal of Econometrics, 2022)
- Joint approximate asymmetric diagonalization by non-orthogonal matrices (2021)
- Nonparametric estimation of non-exchangeable latent variable models (Journal of Econometrics, 2017)

These papers were presented at a variety of international conferences (for example the International Panel Data Conference, the Econometric Society summer conferences, and the conference of the International Association of Applied Econometrics) and university seminars (for example at Penn, UC San Diego, UCL, Brown, Yale, Oxford).
Inference in grouped data is plagued by bias introduced by the presence of many group-specific parameters. We provide improvements both through the development of new point estimators and through the construction of alternative standard errors and bootstrap-based procedures. One substantial improvement on the state of the art in panel data is that we aim to estimate the entire distribution of marginal effects, whereas current results are limited to averages. Similarly, our new standard errors for regression models with many control variables can be applied in more general situations than existing alternatives. One example is models with many dummy variables, which typically lead to a highly unbalanced regressor design. Also, our bootstrap procedure for nonlinear fixed-effect models provides a revaluation of maximum likelihood as a unifying approach to estimation and inference.

Although there are many economic applications of network data, there is very little theoretical work on how to perform valid inference in such settings, especially when the network is rather sparse. Our results give formal sufficient conditions for conventional inference to be valid in the linear regression model and provide easy-to-verify diagnostics. In a large student-teacher data set, for example, our results show that standard inference procedures dramatically overestimate the importance of teacher value-added to student achievement. We have also provided simple estimators for dyadic data and for a model of network formation; these make it possible to estimate models for which no estimator with attractive statistical properties was previously available. In addition, we have provided a theory-consistent solution to the problem of estimating gravity equations with endogenous choice variables. We have further developed a flexible method to deal with the endogenous formation of peer groups in the linear-in-means model of social interactions, together with a statistical test for random assignment. The linear-in-means model is the workhorse model in the analysis of peer effects, and the problem of self-selected peers is pervasive in practice.

Finally, we have provided general and constructive identification results for finite mixtures of dynamic discrete choices. These models are a cornerstone in structural applied work but little was known about their identification. Our work fills this important gap and our papers present the first complete identification analysis of such models to date.