Periodic Reporting for period 3 - NBEB-SSP (Nonparametric Bayes and empirical Bayes for species sampling problems: classical questions, new directions and related issues)
Reporting period: 2022-03-01 to 2023-08-31
RT1) the study of nonparametric Bayes and nonparametric empirical Bayes methodologies for classical species sampling problems, for generalized species sampling problems emerging in the biological and physical sciences, and for related questions in the context of the optimal design of species inventories (a minimal sketch of a classical estimator is given after this list);
RT2) the use of recent mathematical tools from the theory of differential privacy to study the fundamental tradeoff between privacy protection of information, which requires releasing only partial data, and Bayesian learning in species sampling problems, which requires accurate data for inference.
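As a point of reference for the classical questions in RT1, the following is a minimal sketch, in plain NumPy, of the Good-Turing estimator of the missing mass, i.e. the probability that the next observation belongs to a species not yet recorded; the function name and the simulated sample are purely illustrative assumptions, not the project's code.

```python
import numpy as np

def good_turing_missing_mass(sample):
    """Good-Turing estimate of the missing mass: n1 / n, where n1 is
    the number of species observed exactly once and n the sample size."""
    _, counts = np.unique(sample, return_counts=True)
    n1 = np.sum(counts == 1)  # number of singleton species
    return n1 / len(sample)

# Illustrative sample: 200 draws over 500 hypothetical species labels
rng = np.random.default_rng(0)
sample = rng.choice(500, size=200)
print(good_turing_missing_mass(sample))  # estimated probability of a new species
```

Within the project's themes, nonparametric empirical Bayes provides a principled derivation of such classical estimators, while the Bayesian nonparametric approach yields full posterior uncertainty quantification for the same quantities.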
With regard to RT2, we considered the following research lines (RL):
RL7) the development of a nonparametric Bayes methodology and a nonparametric empirical Bayes methodology for disclosure risk assessment (i.e. the risk of re-identification), which is at the basis of some modern privacy-preserving mechanisms;
RL8) the study, and the development, of species sampling problems within the framework of global differential privacy and the framework of local differential privacy, with respect to suitable perturbation mechanisms, e.g. Laplace and Gaussian noise addition, generalized randomized response and bit flipping (a minimal sketch of the Laplace mechanism follows this list);
RL9) the development of a comprehensive theory for goodness-of-fit tests, with emphasis on the study of the power of the test, under global differential privacy and local differential privacy;
RL10) the study of novel mechanisms for releasing private data, based on the use of synthetic data generated from nonparametric posterior distributions, with applications to species sampling problems;
RL11) the development of a computational framework (Markov chain Monte Carlo) for Bayesian nonparametric estimation and clustering under local differential privacy and global differential privacy.
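To fix ideas on the perturbation mechanisms mentioned in RL8, here is a minimal sketch, under simplifying assumptions, of the Laplace mechanism applied to a vector of species frequency counts: since adding or removing one individual changes a single count by one, the L1 sensitivity of the histogram is 1, and adding independent Laplace(1/epsilon) noise to each count yields epsilon-differential privacy. The function and the toy counts are illustrative, not the project's code.

```python
import numpy as np

def laplace_privatize(counts, epsilon, sensitivity=1.0, rng=None):
    """Release frequency counts under epsilon-differential privacy by
    adding independent Laplace(sensitivity / epsilon) noise to each count."""
    rng = rng or np.random.default_rng()
    return counts + rng.laplace(scale=sensitivity / epsilon, size=counts.shape)

true_counts = np.array([120.0, 45.0, 7.0, 1.0, 1.0])  # frequencies of 5 species
private_counts = laplace_privatize(true_counts, epsilon=1.0)
```

The tradeoff studied in RT2 is visible already here: a smaller epsilon (stronger privacy) inflates the noise scale 1/epsilon, degrading any downstream estimate, such as the number of rare species, computed from the released counts.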
In addition to RT1 and RT2, we started a new research theme (RT3) on deep Bayesian neural networks, which are very popular in statistics and machine learning. Several results have been produced on the large-width and large-depth behaviour of feedforward deep neural networks, also in terms of contraction rates, under both Gaussian random weights and Stable random weights for the network (a minimal numerical illustration of the large-width Gaussian behaviour is sketched after this paragraph). Quantitative central limit theorems for large-width Gaussian neural networks have also been obtained by relying on the Stein-Malliavin calculus and second-order Poincaré inequalities. Other results concern the fundamental problem of training feedforward neural networks through gradient descent, leading to interesting generalizations of the popular notion of the neural tangent kernel and establishing a link with estimation in kernel regression.
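As a minimal numerical illustration of the large-width Gaussian behaviour studied in RT3, the following sketch samples many one-hidden-layer networks with i.i.d. N(0,1) weights and shows that, as the width grows, the distribution of the output at a fixed input stabilizes, consistent with the Gaussian process limit; the architecture, activation and parameter names are illustrative assumptions, not the project's models.

```python
import numpy as np

def shallow_net_outputs(x, width, n_nets=5000, rng=None):
    """Outputs f(x) = width**-0.5 * sum_j v_j * tanh(w_j * x) of n_nets
    independent one-hidden-layer networks with i.i.d. N(0,1) weights."""
    rng = rng or np.random.default_rng(0)
    w = rng.standard_normal((n_nets, width))  # input-to-hidden weights
    v = rng.standard_normal((n_nets, width))  # hidden-to-output weights
    return (v * np.tanh(w * x)).sum(axis=1) / np.sqrt(width)

for width in (2, 16, 1024):
    out = shallow_net_outputs(x=1.0, width=width)
    # mean tends to 0 and std stabilizes as width grows (Gaussian limit)
    print(width, round(out.mean(), 3), round(out.std(), 3))
```

Quantitative central limit theorems of the kind mentioned above make this convergence precise, bounding the distance between the finite-width output distribution and its Gaussian limit as a function of the width.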