CORDIS - EU research results

Controlling Large Language Models

Project description

Interpreting and controlling large language models

Large language models (LMs) have rapidly become the backbone of most AI systems, driving cutting-edge advancements across various tasks and applications. However, these benefits come with notable drawbacks, as AI systems often exhibit flaws related to their underlying LMs, such as biased behaviour, confabulations, flawed reasoning and outdated information. These issues have become increasingly difficult to address due to the black-box nature of LMs. The ERC-funded Control-LM project will develop a framework to overcome this opacity, elucidating the internal mechanisms of LMs and enabling safer, more efficient control and interpretation of these models.

Objective

Large language models (LMs) are quickly becoming the backbone of many artificial intelligence (AI) systems, achieving state-of-the-art results in many tasks and application domains. Despite the rapid progress in the field, AI systems suffer from multiple flaws inherited from the underlying LMs: biased behavior, out-of-date information, confabulations, flawed reasoning, and more.
If we wish to control these systems, we must first understand how they work, and develop mechanisms to intervene, update, and repair them. However, the black-box nature of LMs makes them largely inaccessible to such interventions. In this proposal, our overarching goal is to:

*Develop a framework for elucidating the internal mechanisms in LMs and for controlling their behavior in an efficient, interpretable, and safe manner.*

To achieve this goal, we will work through four objectives. First, we will dissect the internal mechanisms of information storage and recall in LMs, and develop ways to update and repair such information.
Second, we will illuminate the mechanisms behind higher-level capabilities of LMs, such as reasoning and simulation, and repair problems stemming from alignment steps. Third, we will investigate how the training processes of LMs affect their emergent mechanisms and develop methods for fine-grained control over training. Finally, we will establish a standard benchmark for mechanistic interpretability of LMs to consolidate disparate efforts in the community.
Taken as a whole, we expect the proposed research to empower different stakeholders and ensure a safe, beneficial, and responsible adoption of LMs in AI technologies by our society.

Host institution

TECHNION - ISRAEL INSTITUTE OF TECHNOLOGY
Net EU contribution
€ 1 500 000,00
Address
SENATE BUILDING TECHNION CITY
32000 Haifa
Israel
Activity type
Higher or Secondary Education Establishments
Total cost
€ 1 500 000,00

Beneficiaries (1)