Periodic Reporting for period 1 - MACROML (Machine Learning Macroeconometric Methods for Dynamic Causal Inference)
Reporting period: 2023-04-01 to 2025-03-31
The primary goal of the MACROML research project is to put forward theory-driven methods for dynamic causal inference based on models typically used in the macroeconometrics literature, bridging the gap between machine learning and macroeconometric modelling. What distinguishes the project from state-of-the-art methods is its focus on heavy-tailed and highly persistent time series data, a critical feature that has been largely overlooked in the literature.
In particular, the research project will investigate:
I. accurate and theoretically valid econometric techniques for estimation and inference in general high-dimensional time series models;
II. a general methodology for high-dimensional local projection estimators that allows the dynamic causal relationships between economic time series to be studied.
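As an illustration of the local projection idea in item II, the following minimal sketch estimates an impulse response by running one regression per horizon h of the outcome at time t+h on a shock at time t and a set of controls. It uses plain OLS on low-dimensional data; it is not the project's high-dimensional estimator, and the function name `local_projection_irf` and its arguments are hypothetical.

```python
import numpy as np


def local_projection_irf(y, x, controls, max_h):
    """Estimate impulse responses of y to x with one OLS regression per horizon.

    For each horizon h, regress y[t+h] on a constant, x[t], and controls[t];
    the coefficient on x[t] is the estimated response at horizon h.
    Plain OLS is an assumption of this sketch (no regularization).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    controls = np.asarray(controls, dtype=float)
    n = len(y)
    irf = np.empty(max_h + 1)
    for h in range(max_h + 1):
        # Align regressors dated t with the outcome dated t + h.
        Z = np.column_stack([np.ones(n - h), x[: n - h], controls[: n - h]])
        coef, *_ = np.linalg.lstsq(Z, y[h:], rcond=None)
        irf[h] = coef[1]  # coefficient on x[t]
    return irf
```

In a high-dimensional setting, the OLS step would typically be replaced by a regularized estimator applied to a large set of controls; that step is not shown here.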
The project enlarged policymakers’ toolbox for analysing macroeconomic and financial data, allowing different dynamic causal hypotheses to be assessed in a flexible and accurate way, which makes the work highly policy-relevant. In addition, new estimation methods for machine learning time series models allow practitioners to apply ML techniques to time series data in a data-driven way. The project also delivered several empirical applications.
During the first phase of the project, substantial progress was made in the formulation and analysis of factor-augmented sparse regression methods. A comprehensive framework was developed to investigate the statistical properties of estimators applied to high-dimensional data. New theoretical results extend the classical low-dimensional setting by introducing regularization techniques to handle large datasets, yielding rigorous conditions under which sparse-plus-dense models can be reliably estimated in high-dimensional environments. Extensive Monte Carlo simulations were performed to validate the theoretical developments and showed better performance than existing unregularized approaches.
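The sparse-plus-dense idea behind factor-augmented sparse regression can be sketched as follows, under assumptions that are not taken from the project: factors are extracted by principal components, the number of factors is supplied by the user, and a simple coordinate-descent LASSO is applied to the idiosyncratic components only. All function names are hypothetical and the code is an illustration, not the estimator analysed in the project.

```python
import numpy as np


def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1 (no intercept)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]  # partial residual excluding coordinate j
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]
    return beta


def factor_augmented_lasso(X, y, n_factors, lam):
    """Dense part: unpenalized PCA factors. Sparse part: LASSO on idiosyncratics."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    T = Xc.shape[0]
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    F = np.sqrt(T) * U[:, :n_factors]   # estimated factors, normalized so F'F / T = I
    E = Xc - F @ (F.T @ Xc) / T         # idiosyncratic components, orthogonal to F
    gamma = F.T @ yc / T                # unpenalized factor coefficients
    # Because E is orthogonal to F, penalizing only the idiosyncratic coefficients
    # reduces to a LASSO of the factor-residualized outcome on E.
    beta = lasso_cd(E, yc - F @ gamma, lam)
    return gamma, beta
```

The split reflects the modelling idea in the text: the factors carry the dense common variation, while the penalty selects a small number of individual series that matter beyond the factors.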
Concurrently, focused efforts were directed towards enhancing regularized estimation techniques for time series regressions. In particular, methods for selecting optimal regularization parameters in LASSO regressions were advanced using innovative bootstrap approaches specifically tailored to dependent data structures. This work resulted in novel bootstrap-based estimation procedures that not only improved the accuracy of coefficient estimates but also provided more reliable inference in the presence of complex data dependencies. The outcomes of these activities have been encapsulated in a series of working papers, one of which has already been published in a major econometrics journal, while another remains under review.
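To make the idea of bootstrap-based tuning under dependence concrete, here is a generic sketch that is not the project's procedure: a moving block bootstrap resamples contiguous blocks of observations to preserve serial dependence, a LASSO is fitted on each bootstrap sample, and the regularization parameter with the lowest average out-of-bag prediction error is selected. The block length, the out-of-bag criterion, and all function names are assumptions made for illustration.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1 (no intercept)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]  # partial residual excluding coordinate j
            rho = X[:, j] @ resid / n
            z = np.sign(rho) * max(abs(rho) - lam, 0.0)
            beta[j] = z / col_sq[j]
            resid -= X[:, j] * beta[j]
    return beta


def moving_block_indices(n, block_len, rng):
    """Draw overlapping blocks of consecutive time indices until n are collected."""
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    idx = np.concatenate([np.arange(s, s + block_len) for s in starts])
    return idx[:n]


def select_lambda_block_bootstrap(X, y, lambdas, block_len=10, n_boot=20, seed=0):
    """Pick the lambda with the lowest average out-of-bag squared error."""
    rng = np.random.default_rng(seed)
    n = len(y)
    errors = np.zeros(len(lambdas))
    used = 0
    for _ in range(n_boot):
        idx = moving_block_indices(n, block_len, rng)
        oob = np.setdiff1d(np.arange(n), idx)  # observations not resampled
        if oob.size == 0:
            continue
        for k, lam in enumerate(lambdas):
            beta = lasso_cd(X[idx], y[idx], lam)
            errors[k] += np.mean((y[oob] - X[oob] @ beta) ** 2)
        used += 1
    mean_errors = errors / max(used, 1)
    return lambdas[int(np.argmin(mean_errors))], mean_errors
```

Resampling blocks rather than single observations is what distinguishes this from an i.i.d. bootstrap; the block length controls how much of the serial dependence is retained in each resample.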
Moreover, an application was designed for nowcasting economic recessions using logistic regressions estimated with sparse-group LASSO and factor-augmented high-dimensional MIDAS (mixed-data sampling) regressions. This tool has shown significant promise in delivering timely and precise recession forecasts, a result of considerable interest for both academic research and policy-making.
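The sparse-group LASSO penalty combines an elementwise LASSO term with a groupwise term, so that whole groups of coefficients (for instance, all lags of one predictor) can be dropped while individual coefficients within retained groups can still be set to zero. The sketch below shows the proximal step for one common parametrization, lam * (alpha * ||b||_1 + (1 - alpha) * sum over groups of ||b_g||_2), with unweighted groups. The parametrization, the function name, and the absence of group-size weights are assumptions of this sketch, not details taken from the project.

```python
import numpy as np


def sparse_group_lasso_prox(beta, groups, lam, alpha, step=1.0):
    """Proximal operator of step * lam * (alpha*||b||_1 + (1-alpha)*sum_g ||b_g||_2)."""
    beta = np.asarray(beta, dtype=float)
    groups = np.asarray(groups)
    # Step 1: elementwise soft-thresholding (LASSO part).
    u = np.sign(beta) * np.maximum(np.abs(beta) - step * lam * alpha, 0.0)
    out = np.zeros_like(u)
    # Step 2: groupwise shrinkage of what survives (group LASSO part).
    for g in np.unique(groups):
        mask = groups == g
        norm = np.linalg.norm(u[mask])
        if norm > 0.0:
            scale = max(0.0, 1.0 - step * lam * (1.0 - alpha) / norm)
            out[mask] = scale * u[mask]
    return out
```

Inside a proximal gradient loop for a logistic loss, this step would be applied after each gradient update; setting alpha to 1 recovers the plain LASSO and setting it to 0 recovers the group LASSO.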
Complementing the theoretical and empirical achievements, the project also resulted in practical tools to facilitate the broader adoption of the new methodologies. An open-source R package was developed that encapsulates the core methods and simulation routines from the project. This software package has been rigorously tested and disseminated within the research community, ensuring that the novel approaches are accessible for further academic inquiry and practical implementation.
Additional significant outcomes build on the bootstrap-based procedures for selecting regularization parameters in LASSO-based time series regressions, which have been particularly effective in managing the challenges posed by data dependencies and non-sparse configurations. Empirically, the project has addressed key questions in asset pricing, notably by comparing the predictive capabilities of sparse and dense model structures. The sparse-group LASSO nowcasting application for economic recessions has also provided promising early results that improve the timeliness and precision of economic forecasts.
The potential impacts of these results are multifaceted. Technically, the methodological innovations provide a robust foundation for further research in causal inference and high-dimensional econometrics. The enhanced estimation techniques and forecasting tools are expected to influence both academic research and practical applications in finance and policy analysis. By improving the reliability of causal inference and forecasting in economic data, the project promises significant advancements in decision-making processes within financial markets and central banking.
To ensure the further uptake and success of these methodologies, several key needs must be addressed. Future research should focus on extending the current framework to handle more complex data structures, such as heavy-tailed distributions and panel data, and on integrating these methods within broader data analytics platforms. Demonstration projects and pilot studies in collaboration with industry partners and policymakers will be crucial to validate the practical utility of the new techniques in real-world environments. In parallel, establishing robust intellectual property rights (IPR) support and accessing markets and finance through strategic partnerships will help commercialise and further disseminate the methodologies developed in this project.
Internationalisation also represents a key area of impact. Collaborative efforts across borders, including joint research initiatives and participation in global standardisation frameworks, are essential to promote an international exchange of ideas and standards. A supportive regulatory environment and standardisation framework, particularly in the context of financial modelling and big data analytics, will further facilitate the widespread adoption of these innovative methods.
In summary, the MACROML project has not only delivered groundbreaking technical results but has also set the stage for significant scientific, economic, and societal impacts. The rigorous theoretical and empirical contributions, combined with the development of accessible tools such as an open-source R package, ensure that the project outcomes will continue to influence future research and practice. Addressing the key needs for further research, demonstration, access to markets, and international collaboration will be pivotal for the enduring success and uptake of the project’s innovations.