Periodic Reporting for period 1 - PRIMAL (Private Machine Learning)
Reporting period: 2020-06-01 to 2022-05-31
The primary objective of this research is to develop privacy-preserving techniques with an emphasis on applications in machine-learning algorithms and data mining. Our results, however, evolved to prioritize foundational work on privacy-preserving techniques with broader applicability beyond machine learning. Our research encompasses a diverse spectrum, delving into fundamental aspects of secure computation and drawing insights from disciplines such as algorithms, data structures, and distributed computing.
* The project speeds up secure computation, in terms of both communication complexity and round complexity. Specifically, we focus on information-theoretic security.
* Along the way, we also show an efficiency improvement for the fundamental task of broadcast: one party wishes to broadcast a message, and all parties have to agree on the message that has been sent. We improve the state of the art by multiple orders of magnitude.
* We show a novel approach to minimizing the trust assumption in Non-Interactive Zero-Knowledge, a primitive that allows one to prove, non-interactively, the validity of a statement without revealing why the statement is true. Specifically, it is known that non-interactive zero knowledge requires some trusted setup, and security does not hold when the setup is compromised. We address this security gap by introducing a new notion that provides a form of accountability: if the authority generating the setup is compromised, it can be held accountable. The hope is that this notion will deter the authority from cheating.
* We show several concrete protocols for statistical analysis on multiple databases. Our protocols allow private queries to a joint private database, such as "which products were bought by people earning this much per annum?", without revealing the query. They support different variants of JOIN and GROUP-BY operations, which are fundamental for extracting information from databases.
* We also show a novel protocol for secure sorting and demonstrate its importance in data analysis, particularly for the heavy hitters problem. This allows companies to improve their products by collecting, for instance, popular URLs, application usage patterns, or other performance data.
* We also show several advances in the oblivious RAM model, which allows outsourcing storage to a remote server and accessing the data without the server learning what has been accessed. We show new techniques for oblivious RAM that are optimal in terms of overhead, and our experimental evaluation shows that they outperform previously known constructions.
Dissemination: As mentioned, all publications were peer-reviewed and presented at top academic conferences. They are also archived and open-access.
For instance, for the problem of heavy hitters (privately collecting popular URLs that users visit, privately collecting application usage patterns or other performance data), our results are about 130-200 times faster than previous works. Our techniques have the potential to make secure computation more accessible, scale to more tasks, and reap the benefits of recent technological advances without compromising our privacy.
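To illustrate why sorting is a useful building block for heavy hitters, the following is a minimal plaintext sketch, not the project's protocol. Once the reported values are sorted, equal values sit next to each other, so counting becomes a single pass over consecutive runs. In a secure setting, the sort would presumably be replaced by a secure sorting protocol over hidden inputs; that part is not shown here. The function name `heavy_hitters`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
from itertools import groupby


def heavy_hitters(items, threshold):
    """Return {value: count} for every value that occurs at least `threshold` times.

    Plaintext illustration only: a private protocol would run the sort and
    the counting on hidden (e.g. secret-shared) inputs, which is assumed
    here and not implemented.
    """
    # Sorting places identical values in adjacent positions.
    ordered = sorted(items)

    result = {}
    # groupby yields one group per run of consecutive equal values.
    for value, run in groupby(ordered):
        count = sum(1 for _ in run)
        if count >= threshold:
            result[value] = count
    return result


# Hypothetical sample: URLs reported by users.
reports = [
    "a.example", "b.example", "a.example",
    "c.example", "a.example", "b.example",
]
print(heavy_hitters(reports, threshold=2))
```

With this sample input, only the values reported at least twice are returned; `c.example`, reported once, is dropped.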