Project description
Tackling bias and errors in AI for software engineering
Large language models (LLMs) are powerful tools, but their use in software engineering (SE) faces challenges like errors and bias. Issues such as ‘hallucinations’, where AI generates incorrect information, seem unavoidable due to how these models work. Increasing the size and complexity of LLMs alone hasn’t solved the problem. However, research shows that collaboration among multiple AI agents could improve accuracy and reliability. The EU-funded MOSAICO project is working towards turning this idea into reality. To that end, it will create a platform where AI agents work together, guided by systems for communication, quality checks and decision-making. MOSAICO’s open-source framework promises to make AI in SE faster, more accurate and widely accessible.
Objective
Applying LLM-based agents reliably to SE requires a substantial increase in their accuracy and a reduction of their bias. While LLMs continue to grow in size and performance, phenomena such as hallucinations in a single agent appear largely unavoidable, since they are linked to the fundamental inference mechanism of generative models. On the other hand, evidence is accumulating that the required performance can be achieved through collaboration and debate among groups of agents.
As among humans, the quality of work increases with the specialisation of workers on tasks, organised collaboration, and discussion among workers with different backgrounds. Unlike with humans, instantiating the required AI agents and having them collaborate and discuss is very fast and cheap, which makes this approach particularly convenient.
MOSAICO proposes the theoretical and technical framework to implement this approach and to scale it to very large groups of collaborating agents, i.e. AI-agent communities. The developed solutions are composed into an integrated MOSAICO platform, handling communication, orchestration, governance, quality assessment, benchmarking and reuse of AI agents. MOSAICO is integrated with existing development environments, to present the results to software engineers, and allow expert users to intervene in the AI decisions.
The performance and reliability of MOSAICO technologies and tools on given software engineering tasks are assessed in four use-case scenarios drawn from the immersive technologies, banking/finance, aerospace and Internet of Things sectors.
The long-term adoption of MOSAICO results and technologies will be ensured by open-sourcing the code and fostering open collaboration, for example through open-source initiatives, to enhance user engagement in the MOSAICO community.
Fields of science (EuroSciVoc)
CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.
Keywords
Project’s keywords as indicated by the project coordinator. Not to be confused with the EuroSciVoc taxonomy (Fields of science)
Programme(s)
Multi-annual funding programmes that define the EU’s priorities for research and innovation.
HORIZON.2.4 - Digital, Industry and Space
MAIN PROGRAMME
HORIZON.2.4.2 - Key Digital Technologies
Topic(s)
Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.
Funding Scheme
Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.
HORIZON-RIA - HORIZON Research and Innovation Actions
Call for proposal
Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.
HORIZON-CL4-2024-DIGITAL-EMERGING-01
Coordinator
Net EU financial contribution. The sum of money that the participant receives, deducted by the EU contribution to its linked third party. It considers the distribution of the EU financial contribution between direct beneficiaries of the project and other types of participants, like third-party participants.
91120 Palaiseau
France
The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.