During this project, our research focused on developing methods that let users teach robots to perform tasks in an intuitive way. We developed methods that allow non-expert users to shape complex behaviors through a teacher-learner interaction based on relative corrections applied to the actions executed by the robot, all while the teacher observes the task execution. The research focused on learning state representations, based on specific neural network architectures and cost functions, that simultaneously improve the data efficiency of teaching. We also evaluated other forms of complexity and developed appropriate methods: simultaneously teaching movements and forces (pushing, cleaning, unplugging), teaching trajectories and velocity in a decoupled manner (picking up objects very rapidly), and bi-manual manipulation. Additionally, these techniques have been extended to scenarios in which the user does not need to understand the domain of the robot actions, but only what can be observed through human perception, i.e. partial states of the robot, which reduces the cognitive effort required of the user. For this, the robot needs to learn to adapt to the corrections advised by the teacher and, at the same time, to learn which actions produce the desired effect in the state space.
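To make the idea of relative corrections concrete, the following is a minimal sketch and not the project's actual implementation. It assumes a linear policy and a teacher who signals only the direction in which each action dimension should change (+1, -1, or 0). The class name `CorrectivePolicy` and the parameters `error_magnitude` and `learning_rate` are hypothetical choices made for this illustration.

```python
import numpy as np


class CorrectivePolicy:
    """Hypothetical linear policy shaped by relative corrections."""

    def __init__(self, state_dim, action_dim, error_magnitude=0.5, learning_rate=0.1):
        # Weights start at zero, so the initial policy outputs zero actions.
        self.weights = np.zeros((action_dim, state_dim))
        # Assumed size of the change requested by one correction signal.
        self.error_magnitude = error_magnitude
        self.learning_rate = learning_rate

    def act(self, state):
        # Linear policy: action = W @ state.
        return self.weights @ np.asarray(state, dtype=float)

    def correct(self, state, feedback):
        """Apply one relative correction given while the action is executed.

        feedback holds one value per action dimension:
        +1 (increase), -1 (decrease), or 0 (no correction).
        """
        state = np.asarray(state, dtype=float)
        feedback = np.asarray(feedback, dtype=float)
        action = self.act(state)
        # The correction is relative: it shifts the executed action.
        target = action + self.error_magnitude * feedback
        # One gradient step on the squared error between target and action.
        self.weights += self.learning_rate * np.outer(target - action, state)
        return target


if __name__ == "__main__":
    policy = CorrectivePolicy(state_dim=2, action_dim=1)
    state = [1.0, 0.0]
    for _ in range(5):
        # The teacher repeatedly asks for a larger action in this state.
        policy.correct(state, [+1])
    print(policy.act(state))
```

In this sketch the teacher never specifies the desired action itself, only the direction of change, which is what keeps the interaction accessible to non-expert users.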
Another focus was on using interactive feedback to reduce the burden on the user of supplying demonstrations, and on extracting as much information as possible about the user's intention from the provided inputs, so as to reduce ambiguous interpretations by the learning algorithm. We investigated ambiguities in three different scenarios: 1) learning complex trajectories where the goal depends on different reference frames (attached to objects in the environment, for example) in different segments of the movement; 2) teaching complex movements where both trajectories and stiffness properties need to be learned; and 3) learning controllers that need to rely on different sensor modalities in different situations. We proposed methods that use priors and interactive feedback to resolve the ambiguity in inferring these choices without any explicit programming and with a reduced number of complete demonstrations. We evaluated different feedback modalities, ranging from corrections to ratings, as well as different interaction modalities, ranging from the robot asking for advice to the human teacher being the driving force, and also combined multiple modalities in one joint framework.
The developed approaches were benchmarked against state-of-the-art approaches, evaluated with human volunteers, and demonstrated in various tasks on real robot arms and mobile robots.
The project has resulted in numerous novel methods published in scientific papers. We also published an extensive survey on interactive imitation learning. The ideas have been disseminated through invited talks at conferences, workshops, companies, and universities, by organizing scientific workshops on the topic, and by applying the methods in competitions and more application-oriented projects.