CORDIS - EU research results

Bayesian Neural Networks for Bridging the Gap Between Machine Learning and Econometrics

Periodic Reporting for period 1 - BNNmetrics (Bayesian Neural Networks for Bridging the Gap Between Machine Learning and Econometrics)

Reporting period: 2020-10-01 to 2022-09-30

Bayesian methods are relevant in applications across disparate domains because they naturally deal with, and focus on, uncertainty. In high-risk and decisional domains (such as economics or medicine), the uncertainties associated with a model's forecasts and with the values of its estimated parameters are certainly not negligible. Unfortunately, Bayesian methods, though very attractive from a theoretical standpoint, are known to be difficult to apply in general. Only a few classes of problems can be easily tackled in a Bayesian way; shifting a non-Bayesian model toward a Bayesian prescription is typically challenging. It is thus not surprising that the applicability of Bayesian principles in machine learning has long remained almost inaccessible.
The overall objective of the action, and of the research line I followed, is to devise feasible solutions for performing Bayesian inference in complex models characterized by a high number of parameters, such as machine learning models, and to analyse to what extent such models compare with traditional ones.
The application domain of the Action involves financial and economic data for estimation and forecasting with standard statistical models (regressions), econometric models (volatility models), and neural networks under a Bayesian approach.
In particular, the action addresses the following two major points.
1. Establish (i) whether a Bayesian approach to complex financial models is in the first place feasible, (ii) whether and to what extent it outperforms analogous models estimated with non-Bayesian techniques, (iii) how predictive distributions can be used to analyze the errors and uncertainties associated with the estimated parameters and the model's forecasts, and (iv) to what extent the Bayesian dimension provides decisional advantages with respect to standard non-Bayesian estimation.
2. Develop and extend existing methods for Bayesian inference in complex models to ease their implementation, computational requirements, and applicability.
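As a minimal illustration of the Bayesian treatment of parameter uncertainty the action targets, consider conjugate Bayesian linear regression with known noise variance, where the parameter posterior is available in closed form. The setup below is a hypothetical toy example, not one of the action's models; all values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical data: n observations, 2 regressors, known noise variance
n, sigma2, tau2 = 50, 0.25, 1.0            # sample size, noise var, prior var
X = rng.normal(size=(n, 2))
beta_true = np.array([1.0, -0.5])
y = X @ beta_true + np.sqrt(sigma2) * rng.normal(size=n)

# With prior beta ~ N(0, tau2 * I), the posterior is N(m, V) in closed form
V = np.linalg.inv(X.T @ X / sigma2 + np.eye(2) / tau2)
m = V @ X.T @ y / sigma2

print(m)                     # posterior mean: close to beta_true
print(np.sqrt(np.diag(V)))   # posterior stds: uncertainty on each coefficient
```

In this conjugate case the posterior is exact; the methodological challenge addressed by the action is precisely that such closed forms do not exist for complex models like neural networks, where approximate schemes (variational inference, sampling) are needed.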
A first manuscript entitled "Bayesian Bilinear Neural Network for Predicting the Mid-price Dynamics in Limit-Order Book Markets" addresses the first major research question.
The paper explores the applicability of a Bayesian version of a Neural network parametrized by a Temporal Augmented Bilinear Layer (TABL), for forecasting mid-price movements in limit order books.
We adopt the state-of-the-art VOGN optimizer for the Bayesian training of the TABL parameters, and the standard non-Bayesian ADAM optimizer as a benchmark. We show that the Bayesian training of the 464 parameters characterizing our model is indeed feasible and satisfactory. We then evaluate and compare the performance of the two approaches, showing that, although the results are similar, the Bayesian TABL appears to generalize better on unseen test data. We discuss how to construct forecasts from the predictive distribution and how to interpret the probabilities associated with the forecasted price movements, and argue that, from a practical perspective, these probabilities (measures of the uncertainty associated with the model's point predictions) are useful in making investment-related decisions.
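The construction of forecasts from a predictive distribution can be sketched as follows: average the class probabilities produced by many draws from the (approximate) posterior over the weights. The model below is a hypothetical linear-softmax stand-in, not the TABL architecture, and the posterior draws are simulated rather than produced by VOGN.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical 3-class mid-price problem (down / stationary / up): a linear
# score model x @ W, with S posterior draws of W standing in for samples
# from a variational posterior fitted by a VOGN-style optimizer.
S, n_features, n_classes = 200, 5, 3
W_mean = rng.normal(size=(n_features, n_classes))
W_samples = W_mean + 0.1 * rng.normal(size=(S, n_features, n_classes))

x = rng.normal(size=n_features)     # one feature vector from the order book
probs = softmax(x @ W_samples)      # (S, 3): one class distribution per draw
predictive = probs.mean(axis=0)     # Bayesian model average over the posterior

print(predictive, predictive.argmax())
```

The resulting vector is a full probability distribution over price movements rather than a single label, which is what makes uncertainty-aware investment decisions possible (e.g. acting only when the top probability exceeds a threshold).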

In "Quasi Black-Box Variational Inference with Natural Gradients for Bayesian Learning" we develop an algorithm that merges elements of the black-box variational inference (BBVI) framework and the natural-gradient VI (NGVI) update. Indeed, the majority of VI algorithms require the computation of the model's gradients, which is, in general, an expensive task. Black-box methods, on the other hand, can perform the optimization without the model's gradients, via function queries only: the BB framework is readily applicable without requiring model-specific gradients or derivations. The NGVI setup, in turn, devises the use of natural gradients for performing SGD-like updates within a theoretical foundation that enormously simplifies their computation. Natural gradients have been extensively studied and shown to be preferable to common Euclidean gradients in Bayesian optimization. On this basis, we develop an optimization routine capable of performing VI in a black-box fashion by relying on natural-gradient computations. We test our optimizer on a number of datasets and statistical-econometric models, showing alignment with maximum-likelihood results and excellent accordance both with the true posteriors approximated via Markov chain Monte Carlo sampling and with a variety of existing methods available in the literature.
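The combination of the two ingredients can be illustrated on a toy problem: fit a Gaussian variational posterior to a known Gaussian target using only function queries of the log-posterior (score-function gradients), preconditioned by the inverse Fisher matrix of the variational family. This is a minimal sketch of the black-box-plus-natural-gradient idea, not the QBVI algorithm itself; the target and all step sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical black-box target: an un-normalised log-posterior we may only
# query. Here it is N(2, 0.5^2), so the optimum of q is known exactly.
def log_p(theta):
    return -0.5 * ((theta - 2.0) / 0.5) ** 2

mu, log_sigma = 0.0, 0.0
lr, S = 0.05, 400

for _ in range(1500):
    sigma = np.exp(log_sigma)
    theta = mu + sigma * rng.normal(size=S)          # samples from q
    log_q = -0.5 * ((theta - mu) / sigma) ** 2 - np.log(sigma)
    f = log_p(theta) - log_q                         # ELBO integrand (queries only)
    f = f - f.mean()                                 # baseline to reduce variance
    # score functions of the Gaussian variational family
    s_mu = (theta - mu) / sigma**2
    s_ls = ((theta - mu) / sigma) ** 2 - 1.0
    g_mu, g_ls = np.mean(f * s_mu), np.mean(f * s_ls)
    # natural gradients: precondition by the inverse Fisher,
    # F = diag(1/sigma^2, 2) in the (mu, log sigma) parametrisation
    mu += lr * sigma**2 * g_mu
    log_sigma += lr * g_ls / 2.0

print(mu, np.exp(log_sigma))   # should approach 2.0 and 0.5
```

Note that no derivative of `log_p` is ever computed: the model enters through function evaluations only, while the Fisher preconditioning is a property of the variational family and is available in closed form.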

Besides QBVI, we developed an alternative black-box procedure based on manifold optimization. We build on the so-called Manifold Gaussian Variational Bayes (MGVB) method (Tran, 2019), capable of explicitly addressing the symmetric positive-definite constraint on the covariance matrix of the Gaussian variational distribution. Our optimizer uses exact natural-gradient computations: whereas MGVB relies on some approximate results, the same theoretical results used in QBVI turn out to be adaptable to this context, allowing for exact natural gradients. As opposed to QBVI and earlier methods, our EMGVB approach, by virtue of the constrained manifold optimization, guarantees positive-definite updates of the variational covariance matrix, resulting in a robust and efficient procedure that is less susceptible to the specification of the hyper-parameters. We validate our approach, along with several alternative ones, over different economic datasets, and perform extensive analyses over different variants of the GARCH volatility model.
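The mechanism by which manifold optimization preserves the covariance constraint can be sketched with a standard second-order retraction on the manifold of symmetric positive-definite (SPD) matrices. This is an illustration of the general idea used in manifold Gaussian VB schemes, under invented numbers; the exact retraction and step sizes used in EMGVB may differ.

```python
import numpy as np

rng = np.random.default_rng(2)

def retract(Sigma, Xi):
    """Second-order SPD retraction: R_Sigma(Xi) = Sigma + Xi + 0.5 Xi Sigma^-1 Xi.
    Algebraically equal to 0.5*Sigma + 0.5*(Sigma+Xi) Sigma^-1 (Sigma+Xi),
    a sum of a PD and a PSD term, hence always positive definite."""
    inv = np.linalg.inv(Sigma)
    return Sigma + Xi + 0.5 * Xi @ inv @ Xi

Sigma = np.eye(3)                              # current variational covariance
Xi = -2.0 * np.eye(3) + 0.1 * rng.normal(size=(3, 3))
Xi = 0.5 * (Xi + Xi.T)                         # a large symmetric tangent step

naive = Sigma + Xi                             # plain Euclidean update
retracted = retract(Sigma, Xi)

print(np.linalg.eigvalsh(naive).min())         # negative: constraint violated
print(np.linalg.eigvalsh(retracted).min())     # positive: still on the manifold
```

The Euclidean step produces an indefinite "covariance" matrix and would break the algorithm, while the retracted update stays positive definite regardless of the step, which is what makes the procedure robust to the hyper-parameter specification.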

In a final manuscript, we review state-of-the-art Bayesian machine learning techniques. For a reader not acquainted with the literature in the area, we present a multitude of approaches and algorithms feasible for Bayesian inference in machine learning applications. Besides describing standard methods such as Monte Carlo samplers and Monte Carlo dropout, we focus on algorithms for variational inference, including those developed within this action. As of now, an algorithm-oriented literature review on Bayesian methods for ML is not available: our contribution is expected to promote such methods to a wider audience and to provide a step-by-step introduction to the different approaches developed in the last decade.
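Among the standard methods the review covers, Monte Carlo dropout is perhaps the simplest to sketch: dropout is kept active at test time, and repeated stochastic forward passes yield a predictive mean and an uncertainty proxy. The network below is a hypothetical two-layer regressor with fixed, pretend-trained weights, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical two-layer regression network with fixed (pretend-trained) weights
W1 = rng.normal(size=(4, 16)); b1 = np.zeros(16)
W2 = rng.normal(size=(16, 1)); b2 = np.zeros(1)

def forward(x, p=0.5):
    h = np.maximum(x @ W1 + b1, 0.0)          # ReLU hidden layer
    mask = rng.random(h.shape) < (1 - p)      # dropout kept active at test time
    h = h * mask / (1 - p)                    # inverted-dropout rescaling
    return (h @ W2 + b2).squeeze()

x = rng.normal(size=4)
draws = np.array([forward(x) for _ in range(1000)])
print(draws.mean(), draws.std())              # predictive mean and spread
```

Each forward pass samples a different sub-network, so the spread of the draws approximates the predictive uncertainty without any change to the training procedure.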
The progress beyond the state of the art is twofold. First, this action proved that Bayesian methods in ML are feasible, useful, and capable of providing decisional advantages and improved out-of-sample performance with respect to standard non-Bayesian approaches in economic and financial problems. Methodologically, we advance the theory of black-box methods by boosting the algorithms with natural-gradient computations and by guaranteeing theoretical constraints on the estimated variational covariance matrix through manifold optimization.
The impact of the above is to promote the use of VI in ML across wide classes of economic and financial problems, with algorithms that are general and do not require problem-specific computations and adjustments (such as the use of the model's gradients, as opposed to the black-box rationale). At a more abstract level, the action proved that econometrics and finance can make great use of Bayesian ML methods, endowing the typically non-probabilistic ML models with the probabilistic elements typically found in the econometric literature, bridging the gap between the two practices.
Representation of the probabilistic dimension associated with the outputs of a Bayesian ML model
Illustration of marginal posterior densities estimated with several methods (Bayesian and not)