All day long, our fingers touch, grasp, and move objects in various media such as air, water, and oil. We do this almost effortlessly: it feels as if we spend no time planning or reflecting on what our hands and fingers do, or on how the continuous integration of sensory modalities such as vision, touch, proprioception, and hearing helps us outperform any other biological system in the variety of interaction tasks we can execute. Largely overlooked, and perhaps most fascinating, is the ease with which we perform these interactions, which fosters the belief that they are also easy to accomplish in artificial systems such as robots. Humans acquire physical interaction skills from birth and continue to advance them throughout their lifetime. This development is driven by the interplay between perception, planning, and control, together with training and some innate knowledge.
A vision for the future is of systems that perform complex tasks safely and robustly in interaction with humans and the environment. To assist humans both at home and in industrial environments, robots need to manipulate objects: pick them up, place them in a particular position, move them, and even perform more complex tasks such as cutting food, packing bags, and dressing humans. These objects can have different shapes and physical properties as well as different degrees of deformability. Although recent deep reinforcement learning algorithms have demonstrated that actions can be learned directly from raw sensor data, they usually require large training datasets that are hard or impossible to collect in robotics applications. The questions of how to model, or represent, the object under consideration, the robot itself, and the environment in which the robot operates are therefore fundamental for robotic manipulation planning and control.
The main scientific objective of this project was to create new, informative, and compact representations of deformable objects that incorporate both analytical and learning-based approaches and encode geometric, topological, and physical information about the robot, the object, and the environment. We pursued this objective in the context of challenging multimodal, bimanual object interaction tasks.