CORDIS - EU research results

Retrieval-Augmented VIsion-Language Models for Open-vocabulary LocalizatIon

Project description

Solution for improved segmentation operations for vision-language models

The recent and increasingly widespread use of large language models (LLMs) and vision-language models (VLMs) has introduced new features, capabilities, and possibilities across various services. However, these advancements have also driven up operational costs, as such models are often expensive, complex, and time-consuming to develop. In particular, segmentation (an essential component in applications such as autonomous vehicles and medical imaging) faces challenges when adapting to new or complex domains and classes. Supported by the Marie Skłodowska-Curie Actions programme, the RAVIOLI project aims to develop a scalable and robust fusion model designed for VLM segmentation. This solution will improve adaptability, accuracy, and the granularity of segmentation operations, enhancing the overall performance of VLM-based systems.

Objective

The proposed research project, RAVIOLI (Retrieval-Augmented VIsion-Language Models for Open-vocabulary LocalizatIon), aims to significantly advance the field of segmentation by integrating retrieval-based predictions from a memory with the original predictions of a vision-language model (VLM) through a learnable fusion model. Addressing a critical gap in existing methods, which often struggle to adapt to new or complex classes and domains, RAVIOLI seeks to enhance the accuracy, adaptability, and granularity of segmentation tasks across various applications, from autonomous vehicles to medical imaging. Importantly, there has been no similar attempt to learn a fusion model with these properties in any open-vocabulary dense task, such as segmentation, making our approach truly pioneering. The ambitious scope of this project lies in its aim to create a tailored, flexible, robust, and scalable solution that will redefine the capabilities of vision-language models, setting a new standard in the field of open-vocabulary segmentation. The project will be hosted by the Visual Recognition Group (VRG) at the Czech Technical University in Prague (CTU) under the supervision of Prof. Giorgos Tolias. The fellow, Bill Psomas, has a strong background in computer vision (CV) and deep learning (DL) and is well equipped to lead this research, which will be further supported by a secondment at AImageLab, University of Modena and Reggio Emilia (UNIMORE), working with Prof. Rita Cucchiara.
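
The general idea of fusing retrieval-based predictions with a VLM's own predictions can be illustrated with a toy sketch. The code below is not the RAVIOLI method, which the project description does not specify. It assumes, purely for illustration, that each pixel has a feature vector, that the VLM prediction is a softmax over pixel-to-text cosine similarities, that the memory is a list of (feature, label) pairs queried by nearest neighbours, and that the "learnable fusion model" is reduced to a single gate parameter. All function names and values are hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def vlm_probs(pixel_feat, text_embeds, temperature=0.1):
    """Class probabilities from pixel-to-text similarity (stand-in for a VLM)."""
    return softmax([cosine(pixel_feat, t) / temperature for t in text_embeds])

def retrieval_probs(pixel_feat, memory, num_classes, k=2):
    """Class probabilities voted by the k most similar memory entries."""
    ranked = sorted(memory, key=lambda e: cosine(pixel_feat, e[0]), reverse=True)
    votes = [0.0] * num_classes
    for feat, label in ranked[:k]:
        votes[label] += max(cosine(pixel_feat, feat), 0.0)
    total = sum(votes)
    if total == 0.0:
        # No usable neighbour: fall back to a uniform distribution.
        return [1.0 / num_classes] * num_classes
    return [v / total for v in votes]

def fuse(p_vlm, p_ret, gate_logit):
    """Convex combination of the two predictions.

    gate_logit is the only learnable parameter in this toy version (assumption).
    """
    alpha = 1.0 / (1.0 + math.exp(-gate_logit))
    return [alpha * r + (1.0 - alpha) * v for v, r in zip(p_vlm, p_ret)]

# Hypothetical two-class example with 2-D features.
text_embeds = [[1.0, 0.0], [0.0, 1.0]]                      # classes 0 and 1
memory = [([0.9, 0.1], 0), ([0.1, 0.9], 1), ([0.2, 0.8], 1)]
pixel = [0.3, 0.7]

p_vlm = vlm_probs(pixel, text_embeds)
p_ret = retrieval_probs(pixel, memory, num_classes=2)
fused = fuse(p_vlm, p_ret, gate_logit=0.0)                  # equal weighting
label = max(range(2), key=lambda c: fused[c])
```

In a real system the gate would typically be predicted per pixel by a trained network rather than fixed, and the memory would hold features extracted from annotated images; both choices are outside what the project description states.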

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.

This project has not yet been classified with EuroSciVoc.

Programme(s)

Multi-annual funding programmes that define the EU’s priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Funding Scheme

Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.

HORIZON-TMA-MSCA-PF-EF - HORIZON TMA MSCA Postdoctoral Fellowships - European Fellowships

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

HORIZON-MSCA-2024-PF-01

Coordinator

CESKE VYSOKE UCENI TECHNICKE V PRAZE
Net EU contribution

Net EU financial contribution: the sum of money that the participant receives, after deducting the EU contribution to its linked third parties. It reflects how the EU financial contribution is distributed between the direct beneficiaries of the project and other types of participants, such as third-party participants.

€ 191 918,16
Address
JUGOSLAVSKYCH PARTYZANU 1580/3
160 00 PRAHA
Czechia

Region
Česko Praha Hlavní město Praha
Activity type
Higher or Secondary Education Establishments
Total cost

The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.

No data

Partners (1)
