Retrieval-Augmented VIsion-Language Models for Open-vocabulary LocalizatIon

Project description

Solution for improved segmentation operations for vision-language models

The recent and increasingly widespread use of large language models (LLMs) and vision-language models (VLMs) has introduced new features, capabilities, and possibilities across various services. However, these advancements have also driven up operational costs, as such models are often expensive, complex, and time-consuming to develop. In particular, segmentation (an essential component in applications such as autonomous vehicles and medical imaging) faces challenges when adapting to new or complex domains and classes. Supported by the Marie Skłodowska-Curie Actions programme, the RAVIOLI project aims to develop a scalable and robust fusion model designed for VLM segmentation. This solution will improve adaptability, accuracy, and the granularity of segmentation operations, enhancing the overall performance of VLM-based systems.

Objective

The proposed research project, RAVIOLI (Retrieval-Augmented VIsion-Language Models for Open-vocabulary LocalizatIon), aims to significantly advance the field of segmentation by integrating retrieval-based predictions from a memory with the original predictions of a vision-language model (VLM) through a learnable fusion model. Existing methods often struggle to adapt to new or complex classes and domains. RAVIOLI addresses this gap and seeks to enhance the accuracy, adaptability, and granularity of segmentation across various applications, from autonomous vehicles to medical imaging. Importantly, there has been no similar attempt to learn a fusion model with these properties in any open-vocabulary dense task, such as segmentation, making our approach truly pioneering. The project aims to create a tailored, flexible, robust, and scalable solution that will redefine the capabilities of vision-language models and set a new standard in open-vocabulary segmentation. The project will be hosted by the Visual Recognition Group (VRG) at the Czech Technical University in Prague (CTU) under the supervision of Prof. Giorgos Tolias. The fellow, Bill Psomas, has a strong background in computer vision (CV) and deep learning (DL) and is well-equipped to lead this research, which will be further supported by a secondment at AImageLab, University of Modena and Reggio Emilia (UNIMORE), working with Prof. Rita Cucchiara.
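The idea of fusing a VLM's prediction with a retrieval-based prediction from a memory can be illustrated with a small sketch. The objective does not specify the fusion architecture, so everything below is an assumption made for illustration: the function names (`retrieval_probs`, `fuse`), the cosine-similarity k-nearest-neighbour lookup, the temperature, and the single scalar gate `gate_logit`, which stands in for a learnable fusion model and whose training is omitted. It handles one pixel at a time in pure Python and is not the project's method.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def retrieval_probs(feature, memory, num_classes, k=2, temperature=0.1):
    """Class distribution voted by the k memory entries most similar to `feature`.

    `memory` is a list of (feature_vector, class_index) pairs (hypothetical layout).
    """
    if not memory:
        # No evidence in memory: fall back to a uniform distribution.
        return [1.0 / num_classes] * num_classes
    scored = sorted(((cosine(feature, f), c) for f, c in memory), reverse=True)[:k]
    weights = softmax([s / temperature for s, _ in scored])
    probs = [0.0] * num_classes
    for w, (_, c) in zip(weights, scored):
        probs[c] += w
    return probs

def fuse(vlm_probs, mem_probs, gate_logit):
    """Convex combination of the two distributions.

    alpha = sigmoid(gate_logit) is the weight given to the memory prediction;
    in a trained system gate_logit would be learned (assumption, not shown here).
    """
    alpha = 1.0 / (1.0 + math.exp(-gate_logit))
    return [(1.0 - alpha) * p + alpha * q for p, q in zip(vlm_probs, mem_probs)]

# Toy example: two classes, 2-D pixel features.
memory = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([0.9, 0.1], 0)]
pixel_feature = [1.0, 0.05]
vlm_probs = softmax([0.2, 0.4])  # the VLM alone slightly prefers class 1
mem_probs = retrieval_probs(pixel_feature, memory, num_classes=2)
fused = fuse(vlm_probs, mem_probs, gate_logit=0.0)  # equal weighting
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

In this toy case the two nearest memory entries both belong to class 0, so the fused distribution overrides the VLM's slight preference for class 1. A dense segmentation model would apply the same step to every pixel or patch, and a learned fusion model could make the weighting depend on the input rather than on one scalar.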

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.

This project has not yet been classified with EuroSciVoc.

Coordinator

CESKE VYSOKE UCENI TECHNICKE V PRAZE

Net EU contribution

€ 191 918,16

Address

JUGOSLAVSKYCH PARTYZANU 1580/3
160 00 PRAHA
Czechia

Region

Česko Praha Hlavní město Praha

Activity type

Higher or Secondary Education Establishments

Total cost

No data

Partners (1)

Partner

UNIVERSITA DEGLI STUDI DI MODENA E REGGIO EMILIA

Italy

Net EU contribution

€ 0,00

Programme(s)

Multi-annual funding programmes that define the EU’s priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Funding Scheme

Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.
