
Teaching Robots Interactively

Periodic Reporting for period 2 - TERI (Teaching Robots Interactively)

Reporting period: 2020-08-01 to 2022-01-31

Programming and re-programming robots is extremely time-consuming and expensive, which presents a major bottleneck for new industrial, agricultural, care, and household robot applications. My goal is to realize a scientific breakthrough in enabling robots to learn how to perform manipulation tasks from few human demonstrations, based on novel interactive machine learning techniques.
Current robot learning approaches focus either on imitation learning (mimicking the teacher’s movement) or on reinforcement learning (self-improvement by trial and error). Learning even moderately complex tasks in this way still requires infeasibly many iterations or task-specific prior knowledge that needs to be programmed into the robot. To render robot learning fast, effective, and efficient, we propose to incorporate intermittent robot-teacher interaction, which so far has been largely ignored in robot learning although it is a prominent feature of human learning. This project will deliver a completely new and better approach: robot learning will no longer rely on initial demonstrations only, but will use additional user feedback to continuously optimize task performance. It will enable the user to directly perceive and correct undesirable behavior and to quickly guide the robot toward the target behavior. The three-fold challenge of this project is to develop techniques that are theoretically sound, intuitive for the user, and efficient in real-world applications.
The novel framework will be validated with generic real-world robotic force-interaction tasks related to handling and (dis)assembly. The potential of the newly developed teaching framework will be demonstrated with challenging bi-manual tasks and a final study evaluating how well novice human operators can teach novel tasks to a robot.
During this project, our research has focused on developing methods that let users teach robots to perform tasks in an intuitive way. We have developed methods that allow non-expert users to shape complex behaviors through a teacher-learner interaction based on relative corrections applied to the actions executed by the robot, all while the teacher observes the task execution. The research has focused on learning state representations with specific neural network architectures and cost functions that at the same time improve the data efficiency of teaching. We also evaluated other forms of complexity and developed appropriate methods: simultaneously teaching movements and forces (pushing, cleaning, unplugging), as well as teaching trajectories and velocity in a decoupled manner (picking up objects very rapidly). Additionally, these techniques have been extended to scenarios in which the user does not need to understand the domain of the robot actions, but only what can be observed through human perception, i.e. partial states of the robot, which reduces the user's cognitive effort. For this, the robot needs to learn to adapt to the corrections advised by the teacher and, at the same time, to learn which actions produce the desired effect in the state space.
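The idea of teaching through relative corrections can be illustrated with a minimal sketch. It is not the project's published algorithm: the linear policy, the simulated teacher, and all names and parameter values (`magnitude`, `lr`, `tolerance`, `true_w`) are assumptions made for illustration. The teacher only signals the direction in which an executed action should change, and the policy is nudged toward the corrected action.

```python
import random


def predict(weights, state):
    """Linear policy: one action value per row of `weights`."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def apply_correction(weights, state, correction, magnitude=0.05, lr=0.5):
    """Nudge the policy output for `state` along the teacher's relative correction.

    `correction` holds one entry in {-1, 0, +1} per action dimension.
    """
    action = predict(weights, state)
    # The corrected action is the executed action shifted by a fixed magnitude.
    target = [a + magnitude * h for a, h in zip(action, correction)]
    # One gradient step on 0.5 * (target - W s)^2 for each action dimension.
    for i, row in enumerate(weights):
        err = target[i] - action[i]
        for j in range(len(row)):
            row[j] += lr * err * state[j]
    return weights


def simulated_teacher(action, desired, tolerance=0.05):
    """Hypothetical stand-in for a human: reports only the direction of the error."""
    return [0 if abs(d - a) < tolerance else (1 if d > a else -1)
            for a, d in zip(action, desired)]


def teach(steps=2000, seed=0):
    rng = random.Random(seed)
    true_w = [[0.8, -0.3]]   # hidden target behavior, known only to the teacher
    weights = [[0.0, 0.0]]   # the robot starts with no knowledge of the task
    for _ in range(steps):
        state = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        action = predict(weights, state)
        desired = predict(true_w, state)
        correction = simulated_teacher(action, desired)
        apply_correction(weights, state, correction)
    return weights


if __name__ == "__main__":
    print(teach())
```

In this toy setting the teacher never specifies the correct action itself, only whether the executed action should increase or decrease, which mirrors the low-effort feedback described above.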
Another focus was on using interactive feedback to reduce the user's burden of supplying demonstrations, and on extracting as much information as possible about the user's intention from the provided inputs, so that the learning algorithm is less likely to interpret them ambiguously. We investigated ambiguities in three different scenarios: 1) learning a complex trajectory when the goal depends on different reference frames (attached to objects in the environment, for example) in different segments of the movement; 2) teaching complex movements where both trajectories and stiffness properties need to be learned; 3) learning controllers that need to rely on different sensor modalities in different situations. We proposed methods that use priors and interactive feedback to resolve these ambiguities without any explicit programming and with a reduced number of complete demonstrations.
The developed approaches were benchmarked against state-of-the-art approaches, evaluated with human volunteers, and demonstrated in various tasks on real robot arms and mobile robots.
We have developed and published new learning methods that outperform the state of the art and/or enable interaction settings that were previously infeasible.
Currently, the research is focused on learning to predict the evaluative feedback that is implicit in the corrections a user provides within the action/state-space domain. With this evaluative prediction, it is possible to augment sparse reward functions within reinforcement learning processes, such that convergence is sped up. All these methods aim to reduce the amount of agent-environment interaction while also reducing the workload of the users participating in the learning loop. We have started extending the methods to bi-manual tasks, integrating the methods into one overarching interactive teaching framework, learning tasks interactively from video demonstrations, integrating learning and planning, and investigating how novice human operators prefer to teach robots. Finally, we are working on a survey on interactive imitation learning in robotics.
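As a rough illustration of augmenting a sparse reward with predicted evaluative feedback, the sketch below makes several assumptions that are not taken from the project: states and actions are scalars, each correction is converted into one negative and one positive evaluative sample, and a simple kernel-weighted average stands in for a learned predictor. The names `labels_from_correction`, `FeedbackPredictor`, `augmented_reward`, and the weight `beta` are hypothetical.

```python
import math


def labels_from_correction(state, action, correction, magnitude=0.1):
    """Turn one relative correction into two evaluative samples.

    Assumption of this sketch: the executed action is labelled -1 and the
    corrected action +1.
    """
    corrected = action + magnitude * correction
    return [(state, action, -1.0), (state, corrected, 1.0)]


class FeedbackPredictor:
    """Kernel-weighted average of stored labels; a stand-in for a learned model."""

    def __init__(self, bandwidth=0.05):
        self.samples = []
        self.bandwidth = bandwidth

    def add(self, samples):
        self.samples.extend(samples)

    def predict(self, state, action):
        num = 0.0
        den = 0.0
        for s, a, label in self.samples:
            dist2 = (s - state) ** 2 + (a - action) ** 2
            weight = math.exp(-dist2 / (2 * self.bandwidth ** 2))
            num += weight * label
            den += weight
        # Far from every stored sample there is no evidence, so predict neutral feedback.
        return num / den if den > 1e-12 else 0.0


def augmented_reward(env_reward, predictor, state, action, beta=0.5):
    """Sparse environment reward plus a weighted predicted-feedback term."""
    return env_reward + beta * predictor.predict(state, action)


if __name__ == "__main__":
    predictor = FeedbackPredictor()
    # The teacher asked for a larger action at state 0.0 after the robot executed 0.0.
    predictor.add(labels_from_correction(state=0.0, action=0.0, correction=1))
    print(augmented_reward(0.0, predictor, 0.0, 0.1))  # positive: near the corrected action
    print(augmented_reward(0.0, predictor, 0.0, 0.0))  # negative: the action that was corrected
```

The added term gives the reinforcement learner a dense signal near states where the teacher has intervened and leaves the environment reward unchanged elsewhere.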
A novel method for resolving ambiguities: move relative to the cup or to the coaster?
A novel method for interactive imitation learning in state-space