Mechanism Design for Data Science

Periodic Reporting for period 4 - MDDS (Mechanism Design for Data Science)

Reporting period: 2023-01-01 to 2024-06-30

The problem addressed in this project is the essential need to introduce incentives into the framework of fundamental data science algorithms. This is done by considering four central themes: revisiting search and information retrieval, competition in machine learning (segmentation / regression), revisiting on-line learning and explore & exploit mechanisms, and incentive-compatible diffusion / influencer selection in social networks. The common theme is a game-theoretic / mechanism design approach to the most basic challenges in data science.

The problem addressed is crucial to our society as it moves toward AI systems controlled by interested parties. While much work has been carried out on issues such as security and spamming, our on-line systems remain fragile because of the incentives of honest strategic parties aiming to optimize their own utility. Our data science / AI algorithms have not been built to address the dynamics these incentives create, leading to poor social welfare. By bridging game theory / economic mechanism design and data science / AI, we contribute directly to addressing this major social problem.

Our success in achieving the project goals has been remarkable given the great challenge. The group of seven PhD students, together with several master's students and two post-docs, was instrumental to this success. Moreover, we managed to attract quite a few faculty members to engage and collaborate closely with us on the topics of the project, which led to remarkable outcomes. This created a second-to-none group, whose structure and achievements are detailed at mdds.net.technion.ac.il.

As planned, the project had four central themes. The common theme is a game-theoretic / mechanism design approach to the most basic challenges in data science. In all four parts we established pioneering outcomes. Moreover, some of these are due to synergies between the different parts of the project, creating a whole framework of mechanism design for data science. Overall, we published 52 papers in competitive forums and significantly influenced the creation of a new area. Interestingly, the published articles are split equally between economics and computation (EC) outlets and Artificial Intelligence (AI) outlets, which signals the unique contribution.

Our work on the relevance ranking / search part started from a novel observation, initiated in a JAIR paper, showing that the Probability Ranking Principle (PRP), which is central to current search systems, fails in adversarial/strategic environments. To tackle this, we introduced the study of both an agent perspective, dealing with how content owners promote their content, and a mediator perspective, dealing with how search systems can be modified to achieve high social welfare when document authors are strategic. This led to quite a few results and, in 2022, to a SIGIR perspective paper that sums up this revolutionary angle in the main information retrieval forum. In addition, as part of this work we established the first controlled experimental setup for the study of strategic information retrieval.

Our work on on-line recommendations introduced two novel paradigms: a game-theoretic approach to recommendation ecosystems, and individually rational / incentive-compatible on-line explore and exploit. The first part, initiated in a NeurIPS 2018 paper, showed that we must take publishers’ game-theoretic incentives into account if we aim at stable, fair, welfare-maximizing outcomes. This work had several follow-ups and had remarkable success in influencing industry. The complementary part of our work on on-line recommendations deals with the fact that in modern explore and exploit systems, exploration is done by the participants themselves and should therefore be aligned with their incentives. One example of a fundamental contribution is an ICML paper written jointly with an ERC visitor.

Our work on recommendation systems also revealed a remarkable synergy that we did not envision at all at the outset: as part of the project we initiated the study of language-based games, focusing on on-line recommendation/persuasion. This led to JAIR and TACL papers that pioneer a novel bridge between game theory and natural language processing.

The work on segmentation/clustering focused both on the use of these methods in strategic predictions and on connecting them to social aspects. In a NeurIPS paper we showed how predictors can successfully take competing parties’ predictions into account, and we later extended this in an EC paper to study how a platform can offer useful predictors to competitors. We established a connection between privacy, incentives, and segmentation in an MOR paper, where we show that truthful, privacy-preserving segmentation can be obtained in a system consisting of many users. In an AAAI paper, we show a deep connection between game-theoretic modelling and fair clustering. More specifically, we show that without an actual economic treatment of welfare, fair classification leads to inferior results, and we show how to correct for this.

The work on network analysis/effects started as an independent track and led to two breakthroughs that arose from insightful connections to other parts of the project. One of the major observations of our work on search/relevance ranking is that authors tend to mimic highly ranked authors. This led us to work on influencing social herding in data science. Another social aspect that we realized is crucial is sybil attacks on economic mechanisms, which originate from the fact that the on-line social environment allows for creating multiple copies of participants. In an EC paper, we have shown when and how one can tolerate such attacks in the context of regression and segmentation.

As the project has ended, I have decided to answer this question by pointing to some major achievements that may have significant ramifications for future research.

Our work on strategic / competitive search is definitely a breakthrough. It targets a major failure in a standard aspect of our on-line activity that is taken for granted as “solved” but, as it turns out, is treated in a way that might result in low social welfare, and it offers ways to address this failure. I would rank similarly the work on recommendation ecosystems, whose ideas I believe have influenced the way groups in industry think about the problem, although here we did not establish the first experimental framework as we did in search. The work on language-based games was not envisioned at the beginning of the project. It is definitely pioneering work, which I believe will create a breakthrough in the way we analyse AI ecosystems. It is indeed puzzling that celebrated work in theoretical and experimental economics did not apply NLP considerations, and that non-cooperative game theory has not been bridged to NLP so far.

I started this project as an expert in game theory and algorithmic game theory, isolating some issues in game theory that seemed to me fundamentally missing from data science. I conclude this project in a situation where my closest collaborators are world leaders in data science who absorb game-theoretic ideas and push this synergy as their main research agenda, in close collaboration with my research team. I believe one cannot hope for a better influence from a research grant.