Periodic Reporting for period 1 - CONVEY (Conveying Agent Behavior to People: A User-Centered Approach to Explainable AI)
Reporting period: 2023-01-01 to 2025-06-30
Explainable AI methods aim to support users by making the behavior of AI systems more transparent. However, the state of the art in explainable AI is lacking in several key aspects. First, the majority of existing methods focus on providing "local" explanations of one-shot decisions made by machine learning models. They are not adequate for conveying the behavior of agents that act over extended time horizons in large state spaces.
Second, most existing methods do not consider the context in which explanations are deployed, including the specific needs and characteristics of users. Finally, most methods are not interactive, limiting users' ability to gain a thorough understanding of the agents.
The overarching objective of this project is to develop adaptive and interactive methods for conveying the behavior of agents and multi-agent teams operating in sequential decision-making settings.
To tackle this challenge, the proposed research will draw on insights and methodologies from AI and human-computer interaction. It will develop algorithms that determine what information about agents' behavior to share with users, tailored to users' needs and characteristics, and interfaces that allow users to proactively explore agents' capabilities.
Aim 1: Developing algorithms for conveying agent behavior to users
We have developed several new explainability methods: (1) counterfactual summaries that demonstrate an agent's alternative actions, helping users better discern the agent's goals and reasoning; (2) policy summaries that support human operators in determining when to intervene and take over control from an agent; and (3) textual policy summaries that use large language models to describe the main patterns in an agent's behavior. We conducted user studies to assess each of these approaches, showing promising results.
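To give a concrete sense of how a policy summary can be extracted, the sketch below follows a common importance-based recipe (in the spirit of earlier summarization methods such as HIGHLIGHTS): rank states by how much the choice of action matters there, measured as the spread of the agent's Q-values, and cut short clips around the top-ranked states. The function names, the importance measure, and the parameters are illustrative assumptions, not the exact methods developed in the project.

```python
import numpy as np

def state_importance(q_values: np.ndarray) -> float:
    # Importance of a state as the gap between the best and worst
    # action values: when the gap is large, acting well matters here,
    # making the state a good candidate for inclusion in a summary.
    return float(q_values.max() - q_values.min())

def summarize_policy(trajectory, q_function, k=5, context=10):
    # Rank states by importance and return short clips around the
    # top-k states. `trajectory` is a list of states; `q_function(s)`
    # returns an array of action values. Names are illustrative.
    scores = [state_importance(q_function(s)) for s in trajectory]
    top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
    return [trajectory[max(0, i - context):i + context + 1] for i in sorted(top)]
```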
In addition, we began an interdisciplinary project exploring the use of large language models to support the training of scientists in science communication. Here, the model generates explanations for scientists that suggest ways to communicate their research more effectively.
Aim 2: Designing interactive interfaces for exploration of agent behavior
We developed ASQ-IT, an interactive explanation system that presents video clips of the agent acting in its environment in response to user queries describing temporal properties of behaviors of interest. User studies show that end users can understand and formulate queries in ASQ-IT, and that using ASQ-IT helps users identify faulty agent behaviors.
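As a rough illustration of the kind of retrieval such a system performs, the sketch below scans recorded trajectories for sub-clips satisfying a simple "starts in A, eventually reaches B" pattern. This is a deliberately simplified stand-in, assuming user-supplied state predicates; ASQ-IT's actual query language and matching machinery over agent traces are richer.

```python
def matches_pattern(clip, start_pred, end_pred):
    # A toy temporal pattern: the clip begins in a state satisfying
    # `start_pred` and eventually reaches a state satisfying `end_pred`.
    # ASQ-IT's real query language is considerably more expressive.
    return bool(clip) and start_pred(clip[0]) and any(end_pred(s) for s in clip[1:])

def retrieve_clips(trajectories, start_pred, end_pred, window=20):
    # Scan recorded trajectories in fixed-size windows and collect the
    # sub-clips that satisfy the pattern; these would back the video
    # clips shown to the user.
    results = []
    for traj in trajectories:
        for i in range(0, max(len(traj) - window + 1, 1), window):
            clip = traj[i:i + window]
            if matches_pattern(clip, start_pred, end_pred):
                results.append(clip)
    return results
```

For instance, in a grid-world domain, start_pred might test "the agent is next to a wall" and end_pred "the agent has reached the goal".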
We are currently exploring ways to use large language models to allow users to pose queries in a more natural way, building on our work on textual summaries.
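One possible shape for this natural-language front end, sketched under loud assumptions: prompt a language model to translate the user's free-form request into the structured fields the query interface already understands. The prompt, the two-field schema, and the `llm` callable below are hypothetical placeholders rather than the project's actual design.

```python
import json

PROMPT_TEMPLATE = (
    "Translate the user's request into a query with two fields, "
    '"start_condition" and "end_condition", describing the beginning '
    "and end of the behavior of interest. Respond with JSON only.\n\n"
    "User request: {request}"
)

def nl_to_query(llm, request: str) -> dict:
    # `llm` is any callable taking a prompt string and returning the
    # model's text response; a placeholder, not a specific provider's
    # API. Real use would validate the output against the query schema.
    return json.loads(llm(PROMPT_TEMPLATE.format(request=request)))
```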
The introduction of large language models also poses new explainability challenges, as these models are typically highly complex and often proprietary.