Data-Efficient Scalable Reinforcement Learning for Practical Robotic Environments

Project Information

DESlRE

Grant agreement ID: 798321

DOI

10.3030/798321

Project closed

EC signature date 19 March 2018

Start date 1 April 2018

End date 31 March 2020

Funded under

EXCELLENT SCIENCE - Marie Skłodowska-Curie Actions

Total cost

€ 159 460,80

EU contribution

€ 159 460,80

Coordinated by

MAX-PLANCK-GESELLSCHAFT ZUR FORDERUNG DER WISSENSCHAFTEN EV
Germany

Periodic Reporting for period 1 - DESlRE (Data-Efficient Scalable Reinforcement Learning for Practical Robotic Environments)

Reporting period: 2018-04-01 to 2020-03-31

The initial aim of the project was to develop algorithms suitable for challenging control tasks. Current algorithms that perform well in simulation typically transfer poorly to real-world tests or robots. For example, one active research topic is sim-to-real transfer, which aims to carry the feats that algorithms achieve in computer simulation over to test-time performance. If we can understand how to perform control in a highly stochastic environment, many problems in social decision-making could also be addressed. The overall objectives are to advance our understanding of why applying such algorithms is difficult and to develop new ones.

The fellow has developed several new optimization and control algorithms designed to handle hybrid and stochastic environments. Our theory connects to the theory of robust optimization and robust control, as well as to the robustness of machine learning algorithms such as kernel methods. One of the most significant research outcomes is the newly proposed framework of kernel distributionally robust optimization, which combines principled robust optimization theory with kernel machine learning.

The fellow's work presents new insights that combine the theory of robust convex optimization with reproducing kernel Hilbert spaces (RKHS). It shows that the theory of kernel methods can be used to make robust decisions in general decision-making problems. The work contributes to both the distributionally robust optimization (DRO) literature and the kernel methods literature.
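As a rough illustration of how an RKHS can describe distributional ambiguity, the sketch below estimates the maximum mean discrepancy (MMD) between two samples and checks whether a candidate sample lies within a given RKHS distance of the nominal data. It is a minimal sketch under stated assumptions, not the project's algorithm: the Gaussian kernel, the bandwidth, the radius, and the function names (`gaussian_kernel`, `mmd_squared`, `in_ambiguity_set`) are all hypothetical choices made for this example.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    return float(k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean())


def in_ambiguity_set(nominal, candidate, radius, bandwidth=1.0):
    """True if the candidate sample is within `radius` (in MMD) of the nominal one."""
    mmd = np.sqrt(max(mmd_squared(nominal, candidate, bandwidth), 0.0))
    return bool(mmd <= radius)


# Illustrative data only: a nominal sample and a copy shifted by 2.
rng = np.random.default_rng(0)
nominal = rng.normal(0.0, 1.0, size=(200, 1))
shifted = nominal + 2.0

print(in_ambiguity_set(nominal, nominal, radius=0.1))
print(in_ambiguity_set(nominal, shifted, radius=0.1))
```

A distributionally robust decision would then be chosen to perform well against every distribution inside such a set, rather than against the nominal sample alone. The radius controls how much distribution shift is hedged against.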

From a practical perspective, we have proposed easy-to-implement algorithms. As discussed in our recent work, one strength of our methods is their wide applicability. Many of today's learning tasks suffer from some form of distributional ambiguity. We believe that practitioners in industry and business who wish to gain robustness in their learning or decision-making tasks can apply our kernel distributionally robust optimization algorithms.

We aim to design optimization and control algorithms that can hedge against distribution shift.
