Goal-based Open-ended Autonomous Learning Robots

Periodic Reporting for period 3 - GOAL-Robots (Goal-based Open-ended Autonomous Learning Robots)

Reporting period: 2019-05-01 to 2021-04-30

Context. Imagine your friend asks you to tidy up her kitchen, which is full of furniture and objects. She shows you a photo of how the kitchen should look once it has been tidied up. This task would be boring for you, but you would nevertheless be able to carry it out with ease. Indeed, when you were a child you played with all sorts of objects and so, driven by curiosity, you learned many flexible sensorimotor skills to manipulate them at will. Unlike you, current robots are ill-suited to these types of challenges. They still have important limitations when facing unstructured environments and situations unforeseen at design time, which require autonomy, flexibility, and adaptation.

Why is the project important for society? For many applications, future robots should be able to learn how to solve multiple tasks in unstructured environments in an autonomous way. The project aims to address this challenge. It follows a previous European project called IM-CLeVeR, which investigated how intrinsic motivations can support autonomous learning in biological and artificial agents. Intrinsic motivations, related to novelty, surprise, and competence acquisition, are maximally apparent in children at play. Driven by intrinsic motivations, children explore and interact with objects in the environment, thus getting to know how to manipulate them. The central idea of GOAL-Robots is that intrinsic motivations can fully support open-ended learning of skills only if a further critical ingredient is considered: goals. A goal is an internal representation of a set of states of the world, which the agent can activate internally even when it is not currently perceiving those states.
The project had three main objectives: (1) advancing our understanding of how goals are formed and underlie skill learning in children; (2) developing novel robot architectures able to self-generate goals based on intrinsic motivations, and to use such goals to learn skills; (3) demonstrating the project concept with increasingly challenging demonstrators. These objectives led to the achievement of the following results.

Experiments in infants. The project carried out experiments with infants to understand the mechanisms through which they undergo open-ended learning. These experiments showed that learning in children is strongly based on “sensorimotor contingencies”: the consequences of actions are internally represented as goals, and these goals lead infants to acquire the actions that reliably cause such consequences. Contingency-based learning was translated into algorithms employed in robots to allow them to autonomously generate goals.

Autonomous goal formation. A second achievement of the project was the development of controllers allowing robots to self-generate goals to guide autonomous open-ended learning. What really counts for humans, and similarly for robots, is the acquisition of the capacity to change the surrounding world in desired ways. This led to the specific idea, translated into algorithms, that one way to self-generate goals is to treat as goals the states that result from robot actions causing novel changes in the environment.
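The goal-formation idea above can be illustrated with a minimal sketch. This is not the project's algorithm: the toy environment, the function names (`toy_env_step`, `discover_goals`), the distance-based novelty test, and the `radius` threshold are all assumptions made for illustration. The sketch stores a state as a new goal only when an action changed the environment and the resulting state is far from every goal already stored.

```python
import random


def state_changed(before, after, tol=1e-6):
    """Return True if any coordinate differs by more than tol."""
    return any(abs(a - b) > tol for a, b in zip(before, after))


def is_novel(state, goals, radius=0.5):
    """Return True if state is farther than radius from every stored goal."""
    return all(
        sum((s - g) ** 2 for s, g in zip(state, goal)) ** 0.5 > radius
        for goal in goals
    )


def toy_env_step(state, action):
    """Hypothetical environment: only 'push' moves the object, by 1 along x."""
    x, y = state
    if action == "push":
        return (x + 1.0, y)
    return (x, y)


def discover_goals(steps=50, seed=0):
    """Act at random and keep, as goals, novel states caused by an action."""
    rng = random.Random(seed)
    state = (0.0, 0.0)
    goals = []
    for _ in range(steps):
        action = rng.choice(["push", "wait"])
        next_state = toy_env_step(state, action)
        # A goal is formed only if the action changed the world
        # and the outcome has not been seen before.
        if state_changed(state, next_state) and is_novel(next_state, goals):
            goals.append(next_state)
        state = next_state
    return goals
```

In this toy setting, actions that leave the object where it is never produce goals, so the goal set grows only with outcomes the agent actually caused.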

Autonomous skill learning. To have full autonomy, the self-generated goals should then guide robots to acquire the skills to accomplish them. The robots can use the goals and skills so acquired to accomplish tasks relevant for their human users. For example, the robots can use goals and skills for moving objects to desired places to tidy up a kitchen by putting the objects in proper places. As a third achievement, the project produced new robot architectures able to acquire ample repertoires of goals and skills based on such principles encompassing several functions.

Robot demonstrators. The project also built a number of increasingly challenging simulated and real robot demonstrators involving a camera-arm-gripper robot engaged in the displacement of objects on a table. In a first, long “intrinsic phase”, the robot was required to learn “intrinsic goals” and skills to displace the objects on the table in a fully autonomous way. In a later “extrinsic phase”, the robot was required to re-use the acquired goals and skills to arrange the objects in multiple desired configurations (“extrinsic goals”). In the easier version of the demonstrators, the robots were directly given the positions of the objects and a parameterised “macro-action”. In a more complex version, the robots had to autonomously learn what the objects are, and where they are located, on the basis of raw pixel images. In an even more challenging demonstrator, the robots also had to autonomously learn to manipulate objects by controlling the joints of the arm and gripper.
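The two-phase structure of the demonstrators can be sketched in a deliberately simplified form. The code below is a toy under stated assumptions, not the demonstrators' software: the one-dimensional object position, the `push` macro-action, and the nearest-effect lookup are hypothetical choices made for illustration. In the intrinsic phase the agent tries random macro-action parameters and records their effects; in the extrinsic phase it re-uses the recorded skill whose effect best matches an externally given target.

```python
import random


def push(position, displacement):
    """Hypothetical parameterised macro-action: shift a 1-D object position."""
    return position + displacement


def intrinsic_phase(trials=200, seed=0):
    """Explore random parameters and store (observed effect, parameter) pairs."""
    rng = random.Random(seed)
    skills = []
    for _ in range(trials):
        param = rng.uniform(-1.0, 1.0)
        start = 0.0
        end = push(start, param)
        skills.append((end - start, param))
    return skills


def extrinsic_phase(skills, start, target):
    """Re-use the stored skill whose effect is closest to the requested change."""
    wanted = target - start
    _, param = min(skills, key=lambda skill: abs(skill[0] - wanted))
    return push(start, param)


skills = intrinsic_phase()
final_position = extrinsic_phase(skills, start=0.2, target=0.7)
```

No new learning happens in the extrinsic phase of this sketch: the quality of the result depends entirely on how well the intrinsic phase covered the space of possible effects.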

Dissemination. The project also carried out a number of dissemination activities. In particular, the project produced about 77 scientific publications in conference proceedings and journals; organised two journal special issues; organised two workshops on “Intrinsic Motivation and Open-ended Learning” and two workshops on the “Ethics and Future of Artificial Intelligence”; trained several master's and PhD students; held tens of seminars and keynote speeches on open-ended learning; prepared and maintained a website; produced 3 professional videos on the demonstrators; and created a repository of 23 software packages hosting the project models.

Exploitation. The project also carried out a number of exploitation activities. In particular, it developed a “wearable intelligent companion” called “PlusMe”, that is, a teddy-bear robot usable as a smart toy to stimulate the development of social intelligence in typically developing and autistic children. The PlusMe is now being engineered, and its further development towards the market is being taken on by two new projects funded by the European Innovation Council: the “PlusMe” Launchpad project, and the IM-TWIN Pathfinder project. The technology for open-ended learning of object manipulation is being further developed in two new projects funded by the European Commission: the GROW HBP project, and the GROW Launchpad project. In particular, the technology is being developed for the manipulation of objects in industrially relevant scenarios.
The project's main result is the development of fundamental principles and architectures to control open-ended learning robots. These principles allow robots to autonomously acquire skills without any guidance, in particular on the basis of the capacity for self-generating goals. The project in particular succeeded in building robots that are able to acquire goals and skills to displace objects in space from scratch, by relying only on raw pixel images and joint control. The architectures developed by the project represent an important contribution to the construction of future service robots addressing relevant societal needs. Examples of applications involve tidying up indoor and outdoor environments, accomplishing tasks useful for humans in unstructured environments, and entertaining humans with educational and rehabilitation activities.
The kitchen scenario involving two robot arms and objects to be set in a desired configuration
iCub humanoid robot performing a goal-based object retrieval with a tool