Privacy and Utility Allied

Periodic Reporting for period 1 - HYPATIA (Privacy and Utility Allied)

Reporting period: 2019-10-01 to 2021-03-31

With the ever-increasing use of internet-connected devices, such as computers, smart grids, IoT appliances and GPS-enabled equipment, personal data are collected in ever larger amounts, and then stored and processed for the most diverse purposes. Undeniably, big-data technology provides enormous benefits to industry, individuals and society, ranging from improving business strategies and boosting quality of service to enhancing scientific progress. On the other hand, the collection and manipulation of personal data raise alarming privacy issues. Not only experts but also the population at large are becoming increasingly aware of the risks, due to the repeated cases of violations and leaks that keep hitting the headlines.

The objective of this project is to develop the theoretical foundations, methods and tools to protect the privacy of individuals while allowing their data to be collected and used for statistical purposes. In particular, we aim at developing mechanisms that can be applied and controlled directly by the user, thus avoiding the need for a trusted party; that are robust with respect to the combination of information from different sources; and that provide an optimal trade-off between privacy and utility.
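The user-controlled, trusted-party-free setting described above is the local model of privacy, whose textbook instance is k-ary randomized response: each individual sanitizes their own value before it ever leaves their device. The following minimal sketch illustrates the idea (function and parameter names are ours, not HYPATIA's):

```python
import math
import random

def randomized_response(true_value, domain, epsilon):
    """k-ary randomized response, applied locally by each user.

    With probability e^eps / (e^eps + k - 1) the true value is reported;
    otherwise a uniformly random *other* value from the domain is sent.
    No trusted curator ever sees the raw data.
    """
    k = len(domain)
    e = math.exp(epsilon)
    p_true = e / (e + k - 1)  # probability of reporting the truth
    if random.random() < p_true:
        return true_value
    others = [v for v in domain if v != true_value]
    return random.choice(others)
```

Because the noise is injected on the user's side, the privacy guarantee holds regardless of what the data collector does with the reports.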
The PI and her team have achieved the following HYPATIA objectives:

1) We have advanced towards the development of a framework for designing optimal privacy mechanisms. In particular, we have carried out a study of the refinement relation between various mechanisms, based on their information leakage. Furthermore, we have developed a logical characterization of d-privacy, a variant of differential privacy in which the distinguishability of two inputs is bounded in terms of an arbitrary metric d on the input domain.
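A standard example of a d-privacy mechanism on the integers (with d the Euclidean metric) is the two-sided geometric mechanism. The sketch below, which is illustrative and not taken from the project's artifacts, numerically checks the defining inequality Pr[M(x1)=z] &lt;= e^(eps*d(x1,x2)) * Pr[M(x2)=z]:

```python
import math

def geometric_pmf(x, z, epsilon):
    """Two-sided geometric mechanism centred at the true value x:
    Pr[M(x) = z] = (1 - a) / (1 + a) * a^|z - x|, where a = exp(-epsilon)."""
    a = math.exp(-epsilon)
    return (1 - a) / (1 + a) * a ** abs(z - x)

def satisfies_d_privacy(x1, x2, epsilon, zs):
    """Check the d-privacy inequality pointwise over the outputs zs,
    with d(x1, x2) = |x1 - x2| (Euclidean metric on the integers).
    A tiny multiplicative slack absorbs floating-point rounding."""
    bound = math.exp(epsilon * abs(x1 - x2))
    return all(
        geometric_pmf(x1, z, epsilon)
        <= bound * geometric_pmf(x2, z, epsilon) * (1 + 1e-9)
        for z in zs
    )
```

For adjacent integers (d = 1) the bound coincides with standard epsilon-differential privacy, which is how d-privacy generalizes it.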

2) We have developed a method for the reconstruction of the original distribution from individually sanitized data collections. This method, which we call Generalised Bayesian Update, is based on the statistical Expectation-Maximization principle and allows different individuals to use different sanitization mechanisms. We have experimented with the k-Randomized-Response and Geometric mechanisms, validating the method from both the correctness and the performance standpoints.
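The single-mechanism special case of this idea is the classical iterative Bayesian update: starting from a uniform prior, EM alternately computes the posterior over true values given each noisy report and re-estimates the distribution. The sketch below shows it for one k-RR channel (it is our own simplified illustration; the Generalised Bayesian Update of the project additionally handles heterogeneous mechanisms across individuals):

```python
import numpy as np

def krr_channel(k, epsilon):
    """k-RR channel matrix: C[x, z] = Pr[report z | true value x]."""
    e = np.exp(epsilon)
    C = np.full((k, k), 1.0 / (e + k - 1))
    np.fill_diagonal(C, e / (e + k - 1))
    return C

def iterative_bayesian_update(q, C, iters=2000):
    """EM reconstruction of the original distribution from the empirical
    distribution q of sanitized reports through channel C.

    Each round: joint[x, z] = C[x, z] * p[x]; normalise over x to get the
    posterior Pr[x | z]; the new estimate averages posteriors under q.
    """
    k = C.shape[0]
    p = np.full(k, 1.0 / k)  # uniform prior
    for _ in range(iters):
        joint = C * p[:, None]
        posterior = joint / joint.sum(axis=0, keepdims=True)
        p = posterior @ q  # E-step and M-step collapsed into one line
    return p
```

If q is exactly the push-forward of the true distribution through C, the true distribution is a fixed point of the update, which is why the estimate converges to it on noiseless statistics.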

3) We have developed a method, called MILES, for the black-box measurement of information leakage via machine learning, and we have released a publicly available tool based on it. Furthermore, we have developed a method based on Generative Adversarial Networks (GANs) to compute an approximation of an optimal obfuscation mechanism.
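The core idea behind black-box leakage measurement can be sketched as follows: train a classifier to guess the secret from the observable on one half of the sampled (secret, observation) pairs, measure its accuracy on the other half (an empirical lower bound on the adversary's Bayes vulnerability), and divide by the prior vulnerability to estimate multiplicative Bayes leakage. The toy "classifier" below is a plain frequency table over a discrete observable; MILES itself uses richer machine-learning models, so this is only an illustrative sketch with names of our own choosing:

```python
import numpy as np
from collections import Counter, defaultdict

def estimate_bayes_leakage(secrets, observations, train_frac=0.5, seed=0):
    """Estimate multiplicative Bayes leakage from sampled pairs.

    Returns (test accuracy of the fitted guesser) / (prior vulnerability),
    i.e. how much better the adversary guesses after seeing the observable.
    """
    rng = np.random.default_rng(seed)
    n = len(secrets)
    idx = rng.permutation(n)
    cut = int(n * train_frac)
    tr, te = idx[:cut], idx[cut:]

    # "Training": remember the most frequent secret for each observation.
    table = defaultdict(Counter)
    for i in tr:
        table[observations[i]][secrets[i]] += 1
    prior_mode = Counter(secrets[i] for i in tr).most_common(1)[0][0]

    def guess(o):
        return table[o].most_common(1)[0][0] if table[o] else prior_mode

    acc = np.mean([guess(observations[i]) == secrets[i] for i in te])
    prior_v = Counter(secrets[i] for i in te).most_common(1)[0][1] / len(te)
    return acc / prior_v
```

A leakage estimate near 1 means the observable tells the adversary essentially nothing; for a uniformly distributed binary secret that is fully revealed, the estimate approaches 2.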
We plan to further investigate the optimality issue in local mechanisms for privacy protection. More specifically, we intend to develop a compositional method for privacy-preserving federated learning that also optimizes the trade-off between privacy and two kinds of utility, namely quality of service and preservation of statistical information (three-way optimality). We also plan to bring fairness into the picture: we intend to study data-sanitization mechanisms that achieve privacy while at the same time removing bias from training data and preserving the accuracy of the model.
A compositional framework for privacy protection in federated learning