Periodic Reporting for period 3 - GOAL-Robots (Goal-based Open-ended Autonomous Learning Robots)
Reporting period: 2019-05-01 to 2021-04-30
Why is the project important for society? Many applications will require future robots to learn autonomously how to solve multiple tasks in unstructured environments. The project aims to address this challenge. It follows a previous European project, IM-CLeVeR, which investigated how intrinsic motivations can support autonomous learning in biological and artificial agents. Intrinsic motivations, related to novelty, surprise, and competence acquisition, are most apparent in children at play. Driven by intrinsic motivations, children explore and interact with objects in the environment and thereby learn how to manipulate them. The central idea of GOAL-Robots is that intrinsic motivations can fully support open-ended learning of skills only if a further critical ingredient is considered: goals. A goal is an internal representation of a set of states of the world that the agent can activate internally even when it is not currently perceiving those states.
Experiments in infants. The project carried out experiments with infants to understand the mechanisms that support their open-ended learning. These experiments showed that learning in children relies strongly on “sensorimotor contingencies”: the consequences of actions, internally represented as goals, lead infants to acquire the actions that reliably cause those consequences. Contingency-based learning was translated into algorithms that allow robots to generate goals autonomously.
Autonomous goal formation. A second achievement of the project was the development of controllers that allow robots to self-generate goals to guide autonomous open-ended learning. What really counts for humans, and similarly for robots, is acquiring the capacity to change the surrounding world in desired ways. This led to a specific idea, translated into algorithms: a robot can self-generate goals from the states that result when its actions cause novel changes in the environment.
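The goal-formation idea can be illustrated with a minimal Python sketch. It is not the project's actual controller: the function names, the representation of states as tuples of numbers, and the two thresholds are assumptions made only for illustration. A state becomes a goal when it follows an action that changed the environment and when it differs enough from the goals already stored.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length state vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def form_goals(transitions, change_threshold=0.1, novelty_threshold=0.5):
    """Return the states adopted as goals (hypothetical helper).

    transitions: (state_before, state_after) pairs, one per executed action.
    Both thresholds are illustrative assumptions, not values from the project.
    """
    goals = []
    for before, after in transitions:
        # Ignore actions that left the environment essentially unchanged.
        if distance(before, after) < change_threshold:
            continue
        # The change is novel if the resulting state is far from every stored goal.
        if all(distance(after, g) >= novelty_threshold for g in goals):
            goals.append(tuple(after))
    return goals

# Toy object positions on a table before and after four actions.
transitions = [
    ((0.0, 0.0), (0.0, 0.0)),  # no change: ignored
    ((0.0, 0.0), (1.0, 0.0)),  # novel change: becomes a goal
    ((1.0, 0.0), (1.2, 0.0)),  # change, but close to an existing goal
    ((1.2, 0.0), (1.2, 2.0)),  # novel change: becomes a goal
]
goals = form_goals(transitions)
print(goals)  # [(1.0, 0.0), (1.2, 2.0)]
```

In this toy version novelty is simply distance from the stored goals; a real system would need a learned measure of novelty over its sensory input.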
Autonomous skill learning. For full autonomy, the self-generated goals should then guide robots to acquire the skills needed to accomplish them. The robots can use the goals and skills acquired in this way to accomplish tasks relevant to their human users. For example, a robot can use goals and skills for moving objects to desired places to tidy up a kitchen. As a third achievement, the project produced new robot architectures, encompassing several functions, that can acquire large repertoires of goals and skills based on these principles.
Robot demonstrators. The project also built a number of increasingly challenging simulated and real robot demonstrators involving a camera-arm-gripper robot that displaces objects on a table. In a first, long “intrinsic phase”, the robot was required to learn “intrinsic goals” and skills to displace the objects on the table in a fully autonomous way. In a later “extrinsic phase”, the robot was required to re-use the acquired goals and skills to arrange the objects in multiple desired configurations (“extrinsic goals”). In the easier version of the demonstrators, the robots were helped by being given the positions of the objects and a parameterised “macro-action” directly. In a more complex version, the robots had to learn autonomously what the objects are, and where they are located, from raw pixel images. In an even more challenging demonstrator, the robots also had to learn autonomously to manipulate objects by controlling the joints of the arm and gripper.
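The two-phase protocol of the easier demonstrator can be sketched in Python as follows. This is a deliberately simplified toy, not the project's demonstrator code: the one-dimensional table, the `push` macro-action, and the functions `intrinsic_phase` and `extrinsic_phase` are all hypothetical. The robot first explores on its own and stores which macro-action parameter leads to which outcome, then re-uses that repertoire when it is given a desired object position.

```python
def push(position, displacement):
    """Toy parameterised macro-action: move an object along a table spanning [0, 1]."""
    return min(1.0, max(0.0, position + displacement))

def intrinsic_phase(displacements, start=0.0):
    """Explore without any assigned task and store outcome -> action parameter.

    Each outcome acts as an intrinsic goal, and the parameter that produced
    it acts as the skill for reaching that goal.
    """
    skills = {}
    for d in displacements:
        outcome = push(start, d)
        skills.setdefault(outcome, d)  # keep the first action found for each outcome
    return skills

def extrinsic_phase(skills, target, start=0.0):
    """Re-use the learned repertoire to approach an externally given target."""
    nearest_goal = min(skills, key=lambda g: abs(g - target))
    return push(start, skills[nearest_goal])

# Intrinsic phase: try eleven evenly spaced pushes from the start position.
skills = intrinsic_phase([i / 10 for i in range(11)])

# Extrinsic phase: the user asks for the object to be placed at 0.42.
final_position = extrinsic_phase(skills, target=0.42)
print(final_position)  # 0.4, the closest outcome discovered during exploration
```

The accuracy in the extrinsic phase is limited by how densely the intrinsic phase covered the space of outcomes, which is one reason the intrinsic phase is long.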
Dissemination. The project also carried out a number of dissemination activities. In particular, the project produced about 77 scientific publications in conference proceedings and journals; organised two journal special issues; organised two workshops on “Intrinsic Motivation and Open-ended Learning” and two workshops on the “Ethics and Future of Artificial Intelligence”; trained several master's and PhD students; held tens of seminars and keynote speeches on open-ended learning; prepared and maintained a website; produced three professional videos on the demonstrators; and created a repository of 23 software packages hosting the project models.
Exploitation. The project also carried out a number of exploitation activities. In particular, it developed a “wearable intelligent companion” called “PlusMe”: a teddy-bear robot usable as a smart toy to stimulate the development of social intelligence in typically developing and autistic children. The PlusMe is now being engineered, and its further development towards the market is being taken on by two new projects funded by the European Innovation Council: the “PlusMe” Launchpad project and the IM-TWIN Pathfinder project. The technology for open-ended learning of object manipulation is being further developed in two new projects funded by the European Commission: the GROW HBP project and the GROW Launchpad project. In particular, the technology is being developed for the manipulation of objects in industrially relevant scenarios.