Learn to learn human learning process from teleoperated demonstrations

Project information

L3TD

Grant agreement ID: 101030691

Project website

DOI

10.3030/101030691

Closed project

EC signature date: 12 April 2021

Start date: 1 July 2021

End date: 31 December 2022

Funded under

EXCELLENT SCIENCE - Marie Skłodowska-Curie Actions

Total cost

€ 168 700,32

EU contribution

€ 168 700,32

Coordinated by

UNIVERSITY OF THE WEST OF ENGLAND, BRISTOL
United Kingdom

Periodic Reporting for period 1 - L3TD (Learn to learn human learning process from teleoperated demonstrations)

Reporting period: 2021-07-01 to 2022-12-31

Learning from demonstration (LfD) is a paradigm in which robots learn to perform new tasks from human demonstrations. It bridges robotics and AI techniques to improve robot manipulability and make robot programming more feasible. This project addresses two problems of LfD technology: 1) physical differences between a robotic arm/gripper system and the human arm/hand in manipulation, for example that most grippers are not as soft and deformable as human fingertips; 2) learning skills with failure reasoning and incremental learning capabilities. From the perspective of neural motor theory, humans have stronger perceptual, cognitive, and muscular adaptive abilities: they can sense and recognise the temperature, pressure, vibration, texture, and shape of a touched object and adjust muscle impedance to changes in the environment, which is difficult for a robot. These differences between humans and robots are challenging, but advanced robotics and AI techniques offer a way to reduce them to some extent.

L3TD aims to address these two problems by developing a new tele-demonstration interface, proposing new theoretical approaches to incremental learning and few-shot learning, and applying them to robot manipulation in fields such as the nuclear industry and medical assistance. The project has five separate objectives, which can be summarised in the following three aspects:

Aspect 1 (Objective 1 and Objective 2): Establish a teleoperation interface with a new facility to obtain a human demonstration dataset. The interface operates in a "human-in-the-loop" control mode that allows the human to make immediate decisions from a first-person perspective. The multimodal demonstration data is collected and stored in a demonstration dataset, and managed in a condensed form using hierarchical labels with primitive capabilities.

Aspect 2 (Objective 3 and Objective 4): Learning primitive skills (PS) and programming primitive skills for few-shot tasks. New theories of PS learning and PS programming are explored, based on learning methods such as improved meta-learning, reinforcement learning, and broad learning, to achieve failure reasoning and adaptation to zero-shot/few-shot tasks, respectively.

Aspect 3 (Objective 5): Experimental verification. Typical actions, such as grasping objects and approaching targets in medical assistance scenes, are chosen to verify the effectiveness of the proposed methods using data collected from the demonstration system.

The work done in this project can be split into two main parts: i) design of a wearable human exoskeleton, construction of a data acquisition and processing system, and creation of a demonstration database and a primitive skill library; ii) skill learning and programming of few-shot tasks and experimental verification.

Several results have been achieved in the first part. The basic result is the development of a wearable tool-like exoskeleton to measure the movements (positions and gestures) of the arms and fingers, and an action measurement platform to collect data. The exoskeleton's end grippers use the same material, size, and range of motion as the robotic grippers, to minimize the physical differences between human and robot manipulation. A conference paper on this design has been published and a journal article is under review. For the teleoperation platform, we have investigated adaptive control methods to improve system performance. We also created a primitive skill dataset that includes typical manipulation actions from various scenarios, such as grasping, picking up and putting down, reaching targets, and obstacle avoidance.
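As a rough illustration of how a primitive skill dataset with hierarchical labels might be organised, the sketch below stores demonstrations under label tuples and retrieves them by label prefix. It is not the project's actual data format: the class names `Demonstration` and `SkillLibrary`, the field names, and the example labels are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple


@dataclass
class Demonstration:
    """One recorded demonstration. All field names are hypothetical."""
    label: Tuple[str, ...]              # hierarchical label, e.g. ("grasp", "pick_up")
    joint_positions: List[List[float]]  # one row of joint values per time step
    tactile: List[List[float]] = field(default_factory=list)  # optional tactile rows


class SkillLibrary:
    """Groups demonstrations by hierarchical label (illustrative only)."""

    def __init__(self) -> None:
        self._by_label: Dict[Tuple[str, ...], List[Demonstration]] = defaultdict(list)

    def add(self, demo: Demonstration) -> None:
        self._by_label[demo.label].append(demo)

    def query(self, prefix: Sequence[str]) -> List[Demonstration]:
        """Return every demonstration whose label starts with `prefix`."""
        prefix = tuple(prefix)
        return [
            demo
            for label, demos in self._by_label.items()
            if label[:len(prefix)] == prefix
            for demo in demos
        ]


# Usage: store three demonstrations, then fetch everything under "grasp".
library = SkillLibrary()
library.add(Demonstration(("grasp", "pick_up"), [[0.0, 0.1], [0.1, 0.2]]))
library.add(Demonstration(("grasp", "put_down"), [[0.2, 0.3]]))
library.add(Demonstration(("reach",), [[0.4, 0.5]]))
grasp_demos = library.query(("grasp",))
```

Querying by prefix means a coarse label such as "grasp" returns all of its finer-grained variants, which is one plausible reading of managing the data in a condensed form.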

In the second part, few-shot learning from the dataset is used to update robot skills and to reason about and correct failures in the learned results. A representative piece of work proposes an incremental learning network for robot skills. After the initial demonstrations, the learned skills are adjusted by adding new human demonstrations, so that previously inappropriate or erroneous robot actions are corrected through the continuous human learning process. Three journal articles have been published on this topic, and further articles on robot grasping based on meta-learning, reinforcement learning, and graph neural networks (GNN), aimed at few-shot learning and skill transfer from humans to robots, are under review. The effectiveness of the proposed algorithms is verified on typical manipulation tasks such as grasping and moving objects, approaching targets, and using tools.
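The idea of updating a learned skill as new demonstrations arrive can be illustrated with a deliberately simple stand-in. The sketch below is not the incremental learning network described in this report: it represents a skill as a mean trajectory and updates it with a running average, one demonstration at a time, without storing earlier demonstrations. The class name `IncrementalSkill` and its parameters are hypothetical.

```python
from typing import List


class IncrementalSkill:
    """Mean trajectory updated one demonstration at a time.

    A running average, used here only as the simplest possible stand-in
    for an incremental skill model; it is not the project's network.
    """

    def __init__(self, n_steps: int, n_dims: int) -> None:
        self.count = 0
        self.mean: List[List[float]] = [[0.0] * n_dims for _ in range(n_steps)]

    def update(self, trajectory: List[List[float]]) -> List[List[float]]:
        """Fold one new demonstration into the stored mean trajectory."""
        if len(trajectory) != len(self.mean):
            raise ValueError("trajectory has the wrong number of time steps")
        if any(len(p) != len(m) for p, m in zip(trajectory, self.mean)):
            raise ValueError("trajectory has the wrong number of dimensions")
        self.count += 1
        for t, point in enumerate(trajectory):
            for d, value in enumerate(point):
                # Incremental mean: new = old + (x - old) / n
                self.mean[t][d] += (value - self.mean[t][d]) / self.count
        return self.mean


# Usage: an initial demonstration followed by a corrective one.
skill = IncrementalSkill(n_steps=2, n_dims=1)
skill.update([[0.0], [2.0]])
skill.update([[2.0], [4.0]])
```

Each corrective demonstration shifts the stored trajectory towards the new data, which mirrors, in a much reduced form, the way later demonstrations correct an earlier skill.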

Several aspects extend beyond the state of the art. The exoskeleton design has been updated and equipped with a pair of new tactile sensors so that it also provides tactile information; this will be summarized in an international conference publication. On the theoretical side, we propose an incremental robot learning from demonstration framework in which learned skills can be updated with newly added demonstrations to make reactions and corrections, together with multiphase programming of skills in tool use, which offers a new solution and ideas for related research areas. The related research received the Best Poster Award at the 9th Annual Conference of the Marie Curie Alumni Association, held in Lisbon.

The impact of this project is a low-cost, easy-to-build adaptive human demonstration system that provides a common platform for academic research as well as an applicable solution for robotic fabrication and manipulation scenarios. Some freely available videos and databases of passive observations of human hand/arm manipulation exist for teaching robots, but from such data it is difficult to describe accurate relative motions and interactions with objects and the environment, especially force and tactile information. In further research, we will continue to improve the system's performance, such as its applicability, usability, and accuracy. In the future, we plan to organize and process the data and publish a freely available version (including the mechanical and electrical design, a quick reference guide, and the data) to expand the influence of the research results.

Designed wearable human exoskeleton for demonstration

Best poster award at the MCAA Annual Conference

Data collection and measuring platform
