
On intelligenCE And Networks

Periodic Reporting for period 1 - OCEAN (On intelligenCE And Networks)

Reporting period: 2023-03-01 to 2024-08-31

Machine learning and artificial intelligence (AI) have made major strides in the last two decades. This progress has been driven by a dramatic increase in data and computing capacity, within a centralized paradigm that requires aggregating data in a single location where massive computing resources can be brought to bear. This fully centralized machine learning paradigm is, however, increasingly at odds with real-world use cases, for reasons that are both technological and societal. In particular, centralized learning risks exposing user privacy, makes inefficient use of communication resources, creates data processing bottlenecks, and may lead to a concentration of economic and political power.

It thus appears timely to develop the theory and practice of a new form of machine learning that targets heterogeneous, massively decentralised networks involving self-interested agents who expect to receive value (rewards or incentives) for their participation in data exchanges. In response to these challenges, OCEAN aims to develop statistical and algorithmic foundations for systems involving multiple incentive-driven learning and decision-making agents, including uncertainty quantification, predominantly from a Bayesian perspective. To achieve these goals, we need to develop new statistical and machine-learning methodologies, together with algorithms for sampling and optimisation that are both scalable to large problems and come with provable theoretical guarantees.
The first year of the OCEAN project has been highly productive. The consortium has made significant progress in all work packages, and we are generally on track with our planned research agenda.

First, OCEAN produced several results in the "optimization core" work package. Michael I. Jordan's team carried out work improving optimization methods for computing Nash equilibria in quantum zero-sum games, and led three further projects studying extra-gradient methods for separable stochastic variational inequalities, for general variational inequalities with general constraints, and for monotone equations in continuous time, respectively.
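To make the extra-gradient scheme referenced here concrete, below is a minimal, self-contained Python sketch (not taken from the project's papers): it applies the classical extra-gradient iteration to a toy unconstrained bilinear zero-sum game, whose operator is monotone. The payoff matrix, step size and iteration count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))            # illustrative payoff matrix of the toy game

def F(x, y):
    """Monotone operator of the bilinear game min_x max_y x^T A y."""
    return A @ y, -A.T @ x

def extra_gradient(steps=20000, gamma=0.05):
    x, y = np.ones(5), np.ones(5)
    for _ in range(steps):
        gx, gy = F(x, y)                              # evaluate at the current point
        xh, yh = x - gamma * gx, y - gamma * gy       # extrapolation (half) step
        gx, gy = F(xh, yh)                            # re-evaluate at the extrapolated point
        x, y = x - gamma * gx, y - gamma * gy         # actual update
    return x, y

x_star, y_star = extra_gradient()
# For an invertible A the unique equilibrium is the origin; both norms shrink toward zero.
print(np.linalg.norm(x_star), np.linalg.norm(y_star))
```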

Second, OCEAN made significant progress in the "Bayesian inference and sampling core" work package. The Warwick team and associated partners published several studies designing novel MCMC-related methods, such as divide-and-conquer approaches. Particular emphasis has been placed on Bayesian methods for heterogeneous and distributed inference, such as Bayesian fusion.
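As context for the divide-and-conquer theme, here is a hedged, minimal sketch in the consensus-Monte-Carlo style: each worker samples its own "subposterior" (its shard's likelihood combined with a fractionated prior) and the draws are recombined by precision weighting. The toy Gaussian model, shard count and weighting rule are illustrative; the consortium's divide-and-conquer and Bayesian fusion algorithms are more general and are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(loc=2.0, scale=1.0, size=1000)   # toy observations, known unit variance
shards = np.array_split(data, 4)                   # S = 4 workers

def subposterior_draws(shard, n_draws=5000, S=4, prior_var=100.0):
    """Exact conjugate draws from the shard's subposterior with prior N(0, prior_var)^(1/S)."""
    n = len(shard)
    prec = n + 1.0 / (S * prior_var)               # likelihood precision + fractional prior precision
    mean = shard.sum() / prec
    return rng.normal(mean, np.sqrt(1.0 / prec), size=n_draws)

draws = [subposterior_draws(s) for s in shards]
weights = [1.0 / d.var() for d in draws]           # precision weight per shard
combined = sum(w * d for w, d in zip(weights, draws)) / sum(weights)
print(combined.mean())                             # close to the full-data posterior mean (about 2.0)
```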

Likewise, advances were made in the "Federated Learning" work package in 2024. Eric Moulines supervised a project aimed at understanding and quantifying a heterogeneity bias that arises in federated procedures. An algorithm, SCAFFLSA, was also proposed to correct this bias in the context of linear stochastic approximation and TD learning. Prof. Dieuleveut also contributed to the work package by studying compression schemes for federated learning that produce a specific error distribution. Additionally, Gareth Roberts carried out two projects related to Bayesian fusion, an approach particularly well suited to decentralized learning.
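To illustrate the kind of compression scheme mentioned here, below is a hedged Python sketch (with made-up data, step sizes and a standard rand-k compressor) of a federated-averaging loop in which client updates are sparsified before being sent to the server. It is not the scheme analysed by Prof. Dieuleveut's team, nor does it include the SCAFFLSA bias correction; it only shows where compression enters the pipeline.

```python
import numpy as np

rng = np.random.default_rng(2)

def rand_k(update, k):
    """Unbiased rand-k sparsification: keep k random coordinates, rescale by d/k."""
    d = update.size
    mask = np.zeros(d)
    mask[rng.choice(d, size=k, replace=False)] = 1.0
    return (d / k) * mask * update

def local_update(model, client_data, lr=0.01, steps=10):
    """A few local SGD steps on a least-squares objective; returns the model delta."""
    X, y = client_data
    w = model.copy()
    for _ in range(steps):
        i = rng.integers(len(y))
        w -= lr * (X[i] @ w - y[i]) * X[i]
    return w - model

d, n_clients = 20, 5
w_true = rng.standard_normal(d)
clients = []
for _ in range(n_clients):
    X = rng.standard_normal((50, d))
    clients.append((X, X @ w_true + 0.1 * rng.standard_normal(50)))

server = np.zeros(d)
for _ in range(200):
    deltas = [rand_k(local_update(server, c), k=5) for c in clients]
    server += np.mean(deltas, axis=0)          # average the compressed client updates
print(np.linalg.norm(server - w_true))         # distance to w_true shrinks over the rounds
```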

Significant results have also been obtained in the "Privacy" work package. Teams at Warwick and associated universities published a study that incorporates privacy into posterior sampling. They relied on Huber contamination with heavy-tailed distributions and showed that their method enjoys desirable asymptotic properties. Christian Robert and his team focused on laying the foundations of a novel decision-theoretic framework for privacy, as an alternative to the differential privacy paradigm. Finally, the UC Berkeley node also contributed to this work package, with one study establishing that privacy may arise endogenously in strategic environments and another characterizing the impact of privacy on market segmentation.
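For background on the paradigm this work package offers an alternative to, here is a minimal sketch of the standard Gaussian mechanism from differential privacy; the sensitivity, epsilon and delta values and the income data are illustrative, and the project's decision-theoretic framework and Huber-contamination posterior-sampling method are not reproduced.

```python
import numpy as np

def gaussian_mechanism(query_value, sensitivity, epsilon, delta, rng):
    """Release query_value plus calibrated Gaussian noise, giving (epsilon, delta)-DP."""
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return query_value + rng.normal(0.0, sigma)

rng = np.random.default_rng(3)
incomes = rng.lognormal(mean=10.0, sigma=0.5, size=1000)
clipped = np.clip(incomes, 0, 1e5)                 # clipping bounds the per-record sensitivity
private_mean = gaussian_mechanism(clipped.mean(),
                                  sensitivity=1e5 / len(clipped),
                                  epsilon=1.0, delta=1e-5, rng=rng)
print(clipped.mean(), private_mean)                # true mean vs. privately released mean
```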

Similarly, OCEAN's efforts in the "Data market and economic value of data" work package proved successful. A study co-supervised by Eric Moulines and Michael Jordan characterizes the effect of adverse selection on collaborative learning and shows that it may fall victim to unravelling. In addition, the UC Berkeley team produced several projects addressing participation incentives for collaborative learning, in particular when participants have divergent strategic interests.

Additionally, the "Strategic experimentation" work package has also been addressed during this first year. For instance, a project carried out by two PhD students at Polytechnique studies a principal-agent problem in a bandit environment. An explicit algorithm, along with theoretical guarantees, has been designed to solve the principal's problem. A follow-up paper generalizes this idea to a multi-agent setting with externalities and adapts the former algorithm to recover the celebrated Coase theorem.
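As a hedged illustration of the bandit building block underlying such principal-agent settings, here is a standard UCB1 learner in Python with made-up arm means and horizon; the incentive and contract layer designed in the project is not shown.

```python
import numpy as np

def ucb1(arm_means, horizon, rng):
    """Standard UCB1: play each arm once, then pick the arm with the highest upper bound."""
    k = len(arm_means)
    counts, sums = np.zeros(k), np.zeros(k)
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1
        else:
            ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
            arm = int(np.argmax(ucb))
        reward = rng.normal(arm_means[arm], 1.0)
        counts[arm] += 1
        sums[arm] += reward
    return counts

rng = np.random.default_rng(4)
pulls = ucb1(arm_means=[0.2, 0.5, 0.8], horizon=5000, rng=rng)
print(pulls)   # pulls concentrate on the best arm (mean 0.8)
```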

Finally, we will start working on WP7 in 2026, as planned in our prospective research agenda.
The OCEAN research project has the potential to make significant contributions to both society and the economy by addressing some of the core limitations of current centralized machine learning models.

First, our research should lead to enhanced privacy protection and fairness in AI. By moving away from centralized data collection and processing, OCEAN’s decentralized framework could reduce the risk of exposing sensitive user data. This would be particularly beneficial in sectors such as healthcare, finance, and personal services, where privacy is a primary concern. Users could retain control over their own data, fostering greater trust in AI systems and increasing user engagement.

Second, our research should help the democratization of AI. The current centralized model often benefits organizations with the most data and computing resources, contributing to the concentration of economic and political power. OCEAN’s focus on decentralized, incentive-driven learning could distribute AI development more evenly across different sectors and stakeholders. This could level the playing field, enabling smaller companies, startups, and individuals to participate in AI development and innovation.

Finally, OCEAN’s research should reduce inefficiency in large machine learning pipelines. Indeed, the current centralized machine learning paradigm involves aggregating large volumes of data in a single location, often requiring significant computational and communication resources. This model, while effective in many cases, creates several inefficiencies and bottlenecks, particularly as the volume of data continues to grow exponentially. By promoting decentralized learning, OCEAN could help improve the scalability of machine learning systems, reducing latency and ensuring more efficient resource use, particularly in large-scale applications such as smart cities, Internet of Things (IoT) networks, and global supply chains.