Periodic Reporting for period 3 - PANEDA (High-Dimensional Inference for Panel and Network Data)
Reporting period: 2022-07-01 to 2023-12-31
The trend toward increased data availability in the social sciences has accelerated in the past decades: better computing and storage capabilities make it possible to record and manage much larger datasets and to access them more easily. We all create digital data footprints on a daily basis, from bank transactions to social network data. New machine learning methods make it possible to quantify everything from satellite images to legal documents, thus creating structured information out of unstructured raw data. And of course, scientists and policymakers have made many conscious efforts to collect larger and better datasets.
These modern datasets often have a more complicated internal structure that cannot be accurately classified as simply cross-sectional data, time-series data, or panel data. In particular, they often have a network structure, and the precision of statistical inference is crucially linked to the structure of the underlying network. The main goal of this research project is to develop robust inference methods for such modern panel and network datasets. This requires establishing a mathematical representation of the network that formalizes the connection between the network and the precision of statistical inference. In addition, new bias correction and robust standard error estimation methods will be developed that account for the sparsity structure of the data. The new statistical methods developed in this project will help to analyze modern datasets in the social sciences more robustly and more reliably.
For the future of this ERC project, significant scientific progress is expected from combining ideas from the paper "Fixed-effect regressions on network data" by Koen Jochmans and Martin Weidner (Econometrica 2019) with the methods of the papers "Minimizing Sensitivity to Model Misspecification" and "Posterior Average Effects" by Stephane Bonhomme and Martin Weidner (accepted for publication in Quantitative Economics and the Journal of Business & Economic Statistics, respectively). The latter two papers are about robustness to model misspecification, but mismeasurement of nuisance parameters can also be interpreted as a type of model misspecification. Combined with the first paper, this interpretation leads to a completely novel approach to tackling the high-dimensional inference problems in sparse network models that this ERC project is aimed at. This is work in progress, but it is one of the most exciting and most promising research directions for achieving the goals of this ERC project in the remaining years of the grant.