Robots Understanding Their Actions by Imagining Their Effects

Periodic Reporting for period 3 - IMAGINE (Robots Understanding Their Actions by Imagining Their Effects)

Reporting period: 2019-09-01 to 2021-02-28

Today's robots are good at executing programmed motions, but they cannot automatically generalize them to novel situations or recover from failures. IMAGINE seeks to enable robots to understand the structure of their environment and how it is affected by the robot's actions. "Understanding" here means the ability of the robot to determine the applicability of an action along with parameters to achieve the desired effect, to discern to what extent an action succeeded, and to infer possible causes of failure and generate recovery actions.

The core functional element is a generative model based on an association engine and a physics simulator. "Understanding" is given by the robot's ability to predict the effects of its actions. This allows the robot to choose actions and parameters based on their simulated performance, and to monitor their progress by comparing observed to simulated behavior.
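The choose-by-simulation and monitor-by-comparison loop described above can be sketched in a few lines of Python. The simulator stub, the action names, and the tolerance below are hypothetical placeholders for illustration, not the project's actual interfaces.

```python
import random

def simulate(action, params, n_rollouts=10):
    """Hypothetical stand-in for the physics simulator: returns the
    mean predicted success score of an action over noisy rollouts."""
    random.seed(hash((action, params)) % 2**32)
    return sum(random.random() for _ in range(n_rollouts)) / n_rollouts

def choose_action(candidates):
    """Pick the (action, params) pair with the best simulated outcome."""
    return max(candidates, key=lambda c: simulate(*c))

def monitor(observed, predicted, tol=0.2):
    """Flag success while observed behaviour stays close to the simulation."""
    return abs(observed - predicted) <= tol

# Invented candidate actions for a disassembly step.
candidates = [("lever_up", 0.1), ("lever_up", 0.3), ("suction", 0.5)]
best = choose_action(candidates)
```

The same `simulate` function serves both roles: ranking candidate actions before execution and providing the predicted trace that `monitor` compares against observations during execution.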

This scientific objective was pursued in the context of recycling of electromechanical appliances. Current recycling practices do not automate disassembly, which exposes humans to hazardous materials, encourages illegal disposal, and creates significant threats to the environment and to health, often in third countries. IMAGINE developed a TRL-5 prototype that can autonomously disassemble prototypical classes of devices, generate and execute disassembly actions for unseen instances of similar devices, and recover from certain failures.

IMAGINE raised the ability level of robotic systems in core areas of the work programme, including adaptability, manipulation, perception, decisional autonomy, and cognitive ability. Since only one-third of EU e-waste is currently recovered, IMAGINE addressed an area of high economic and ecological impact.
We devised a visual intelligence scheme that can analyze a disassembly scene and extract relevant geometric information about parts inside a device. It combines DCNN-powered modules with classical computer vision methods to provide the predicates required for affordance detection and planning, and it generalizes to reasonably similar, unknown devices.

Our action descriptors incorporate insights from decades of lessons learned in robotics on what information is important for the robot to successfully plan and execute actions in a consistent manner. Their definition is grounded in psychology, philosophy and neuroscience.

We developed Conditional Neural Movement Primitives (CNMP) that address several important problems in Learning from Demonstration in a unified manner. CNMP can learn joint distributions of complex, temporal, multi-modal, sensorimotor trajectories non-linearly conditioned on external parameters and goals, modeling the associations between high-dimensional sensorimotor spaces and complex motions.
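CNMP builds on the conditional neural process idea: encode a set of observed (time, value) pairs, aggregate the encodings into a permutation-invariant latent, and decode a predictive mean and variance at query times. The untrained toy network below is only a structural sketch of that forward pass; the layer sizes and random weights are made up and would be trained end to end in a real CNMP.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w1, b1, w2, b2):
    # Two-layer perceptron with tanh hidden units.
    return np.tanh(x @ w1 + b1) @ w2 + b2

# Untrained toy weights (hypothetical sizes: 2-D inputs, 8-D latent).
d_in, d_h, d_r = 2, 16, 8
enc = (rng.normal(size=(d_in, d_h)), np.zeros(d_h),
       rng.normal(size=(d_h, d_r)), np.zeros(d_r))
dec = (rng.normal(size=(d_r + 1, d_h)), np.zeros(d_h),
       rng.normal(size=(d_h, 2)), np.zeros(2))

def cnmp_forward(obs_t, obs_y, query_t):
    """Condition on (time, value) observation pairs, average their latent
    encodings (permutation-invariant), and decode a predictive mean and
    log-variance for each query time."""
    r = mlp(np.stack([obs_t, obs_y], axis=1), *enc).mean(axis=0)
    q = np.concatenate([query_t[:, None],
                        np.tile(r, (len(query_t), 1))], axis=1)
    out = mlp(q, *dec)
    return out[:, 0], out[:, 1]  # mean, log-variance
```

Because the latent `r` is an average over observations, the model can be conditioned on any number of observed trajectory points, which is what lets CNMP-style models adapt motions to external parameters and goals.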

To simulate different actions, we designed novel physics-based models of specific mechanical phenomena. Our interactive model of suction uses the Finite-Element Method (FEM) and a constraint-based formulation to simulate pressure variations inside deformable suction cups in contact with rigid or deformable objects. The results allow simulation of the gripper's actions and also show promise for various applications in soft robotics and computer animation.

The IMAGINE planner can cope with uncertainty in the outcome of actions. The disassembly problem is broken into small, manageable Markov Decision Processes that are then solved using determinization.
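The decompose-and-determinize idea can be illustrated on a toy disassembly MDP; the states, actions, and probabilities below are invented for illustration. Determinization keeps only each action's most likely outcome, turning probabilistic planning into ordinary graph search, with replanning when execution deviates from the plan.

```python
# Each (state, action) pair maps to a distribution over successor states.
# Toy transition model, not from the actual project.
TRANSITIONS = {
    ("closed", "unscrew"): [("lid_loose", 0.8), ("closed", 0.2)],
    ("lid_loose", "lever_up"): [("open", 0.7), ("lid_loose", 0.3)],
}

def determinize(transitions):
    """Replace each stochastic outcome by its most likely successor."""
    return {k: max(v, key=lambda o: o[1])[0] for k, v in transitions.items()}

def plan(det, start, goal, actions=("unscrew", "lever_up"), max_depth=10):
    """Depth-first search in the determinized model; in practice, the
    executor replans whenever the real outcome diverges from the plan."""
    frontier = [(start, [])]
    seen = set()
    while frontier:
        state, path = frontier.pop()
        if state == goal:
            return path
        if state in seen or len(path) >= max_depth:
            continue
        seen.add(state)
        for a in actions:
            nxt = det.get((state, a))
            if nxt is not None:
                frontier.append((nxt, path + [a]))
    return None

det = determinize(TRANSITIONS)
print(plan(det, "closed", "open"))  # → ['unscrew', 'lever_up']
```

Splitting the full disassembly problem into small MDPs like this keeps each search tractable even when the overall device has many parts.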

We developed a novel multi-functional robotic tool consisting of a parallel gripper equipped with a small robot arm (third finger) to perform a set of actions needed for disassembly tasks. These actions are learned from human demonstration based on a novel, robot-assisted kinesthetic teaching approach and are represented as via-point movement primitives that adapt to new situations.
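As a rough illustration of via-point adaptation (not the project's actual movement-primitive formulation), a demonstrated trajectory can be locally deformed with a Gaussian bump so that it passes exactly through a new via-point while leaving the rest of the motion nearly unchanged:

```python
import numpy as np

def adapt_via_points(t, demo, via_points, width=0.1):
    """Deform a demonstrated 1-D trajectory so it passes through new
    (time, value) via-points, using a Gaussian bump centred at each
    via-point time. A simplified stand-in for via-point primitives."""
    traj = demo.copy()
    for t_v, y_v in via_points:
        bump = np.exp(-0.5 * ((t - t_v) / width) ** 2)
        i = int(np.argmin(np.abs(t - t_v)))
        # Scale the bump so the trajectory hits y_v exactly at index i.
        traj = traj + (y_v - traj[i]) * bump / bump[i]
    return traj
```

Because each bump decays quickly away from its via-point, the adapted motion stays close to the demonstration except where the new situation requires a change.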

We implemented a demonstrator targeting the disassembly of computer hard drives and similar devices. A real scene is analyzed, opportunities for actions are detected, and an action plan is reactively generated using information from simulated and real interaction. Robot actions are performed by the multi-functional gripper.

The results have been widely disseminated in academic and industrial circles as well as to the general public. The project has given rise to a follow-up H2020 project and offers opportunities for further development into a product.
To our knowledge, we completed the first visual intelligence scheme that demonstrates the capabilities required for fully automated disassembly, and the first that leverages deep learning for disassembly. Its generalization to unknown devices paves the way toward fully autonomous recycling lines.

Our action descriptors contribute to the formalization of an otherwise imprecise concept, and connect the different levels of a complex autonomous robot. A prototype learning system allows the robot's action repertoire to be extended dynamically by a naive user.

We developed multiple deep neural network architectures that predict affordances and effects of actions on groups of articulated objects with different geometries. We extended Dynamic Movement Primitives (DMPs) with a novel force-feedback term, developed CNMP, a state-of-the-art learning-from-demonstration framework, and extended it with a novel RL component, improving sample efficiency over existing methods by an order of magnitude.

For simulation of the different actions of the gripper, novel, accurate physics-based models were designed, most importantly a suction model used to simulate the gripper's suction cup. To perform simulations of real scenes observed by a camera, we devised a method for registering the acquired point clouds with the robust, high-quality 3D meshes needed by physics-based simulation. This allows the system to launch multiple simulations to refine the action parameters as well as the material properties. These advances pave the way for simulating and understanding robot actions in many scenarios that involve complex mechanical phenomena.
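The core step of rigid registration between an acquired point cloud and a mesh's vertices can be sketched with the Kabsch algorithm, which computes the best-fit rotation and translation for known point correspondences; a full ICP loop would re-estimate correspondences around this step. This is a generic sketch, not the project's registration pipeline.

```python
import numpy as np

def kabsch(src, dst):
    """Best-fit rotation R and translation t aligning corresponding
    3-D points so that dst ≈ R @ src + t (the inner step of ICP)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    # Guard against reflections: force det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t
```

The SVD-based solution is exact for noise-free correspondences and least-squares optimal otherwise, which is why it is the standard building block of point-cloud-to-mesh alignment.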

We made substantial progress in planning for devices of high complexity under perceptual and outcome uncertainty, through multiple small-scale MDPs with determinization, novel algorithms for learning symbolic actions, and assumption-based planning and Monte Carlo Tree Search for planning over state beliefs.

We introduced the novel concept of a multi-functional gripper that presents a flexible and affordable solution for disassembly tasks for the recycling of electromechanical devices. Our gripper is designed to perform complex manipulation actions that require high precision, multiple tools and adaptation to devices. The evaluation results demonstrate robustness, accuracy and intuitive programming of disassembly tasks.
Figure captions:
- State estimation, its modules, and the relations to other work packages
- Physics-based simulation of suction based on FEM coupled with a constraint-based formulation
- Conditional Neural Movement Primitives
- Problem solving by decomposition into manageable Markov Decision Processes and determinization
- The IMAGINE multi-functional gripper