Project description
Building trustworthy generative information retrieval
Traditional information retrieval relies on indexing and searching. A new generative information retrieval approach encodes entire document collections directly into a model. Retrieval is then formalised as a sequence-to-sequence learning problem, which allows for the optimisation of broader goals like fairness, diversity and long-term impact. Funded by the ERC, the UNITE project aims to build trustworthy generative retrieval systems. It will combine generative models with reinforcement learning. It will also ensure accuracy, reliability and resilience, while exploring explainability, reproducibility and safety. Applications will include news search, recommendation systems and climate-related information retrieval. UNITE promises to reshape how information retrieval is conceived, evaluated and optimised for societal benefit.
Objective
Generative information retrieval is an emerging paradigm that replaces the traditional index-then-retrieve pipeline by encoding all information about a document collection in the parameters of a model. The retrieval task is then formalized as a sequence-to-sequence learning problem, making it possible to optimize the system end-to-end. This enables optimization towards a broad range of goals: not just short-term utility but also broader long-term objectives, such as fairness and diversity.
Trustworthiness is a prerequisite for the development, deployment, and use of AI-based systems. With UNITE, I propose a technical agenda to understand how we can build warranted trust in generative information retrieval while realizing the potential of this paradigm to optimize for goals beyond short-term utility.
My methodological innovations will build on advancing the foundations of generative information retrieval and on a synthesis of generative information retrieval with reinforcement learning that captures the sequential and interactive nature of retrieval, thus offering a principled way to deal with long-term goals. These advances will be pursued along three lines where generative information retrieval needs to uphold verifiable guarantees: accuracy, including well-defined and explained contexts of usage; reliability, including exhibiting parity with respect to sensitive attributes; and resilience to distributional shifts and adversarial examples. I will also study ways to probe generative information retrieval methods to aid explainability, reproducibility, and safety. I will demonstrate the utility of these new methodologies on tasks of great societal value: news search and recommendation, and information retrieval for climate impact.
While adventurous, UNITE has great algorithmic significance. It may lead to a fundamental re-assessment of how the field conceptualizes, evaluates and optimizes the success of information retrieval methods.
Fields of science (EuroSciVoc)
CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.
This project has not yet been classified with EuroSciVoc.
Keywords
Project’s keywords as indicated by the project coordinator. Not to be confused with the EuroSciVoc taxonomy (Fields of science)
Programme(s)
Multi-annual funding programmes that define the EU’s priorities for research and innovation.
HORIZON.1.1 - European Research Council (ERC)
MAIN PROGRAMME
Topic(s)
Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.
Funding Scheme
Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.
HORIZON-ERC - HORIZON ERC Grants
Call for proposal
Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.
ERC-2024-ADG
Host institution
Net EU financial contribution: the sum of money that the participant receives, minus the EU contribution to its linked third party. It takes into account how the EU financial contribution is distributed between direct beneficiaries of the project and other types of participants, such as third-party participants.
1012WX Amsterdam
Netherlands
The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.