Provably Efficient Algorithms for Large-Scale Reinforcement Learning

Project information

SCALER

Grant agreement ID: 950180

DOI

10.3030/950180

EC signature date: 3 September 2020

Start date: 1 October 2021

End date: 30 September 2026

Funded under

EXCELLENT SCIENCE - European Research Council (ERC)

Total cost

€ 1 493 990,00

EU contribution

€ 1 493 990,00

Coordinated by

UNIVERSIDAD POMPEU FABRA
Spain

Deliverables

Publications

Offline Primal-Dual Reinforcement Learning for Linear MDPs

Authors: G. Gabbianelli, G. Neu, N. Okolo, M. Papini
Published in: Proceedings of the Twenty-seventh International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Publisher: Proceedings of Machine Learning Research

Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees

Authors: Tirinzoni A.; Papini M.; Touati A.; Lazaric A.; Pirotta M.
Published in: Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022
Publisher: NeurIPS foundation
DOI: 10.48550/arxiv.2210.13083

Efficient Global Planning in Large MDPs via Stochastic Primal-Dual Optimization

Authors: Gergely Neu, Nneka Okolo
Published in: Proceedings of The 34th International Conference on Algorithmic Learning Theory (ALT 2023), 2023
Publisher: Proceedings of Machine Learning Research

Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits

Authors: Gergely Neu, Julia Olkhovskaya, Matteo Papini, Ludovic Schwartz
Published in: Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022
Publisher: NeurIPS foundation

Nonstochastic Contextual Combinatorial Bandits

Authors: L. Zierahn, D. van der Hoeven, N. Cesa-Bianchi, G. Neu
Published in: Proceedings of the Twenty-sixth International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Publisher: Proceedings of Machine Learning Research

Optimistic Information-Directed Sampling

Authors: G. Neu, M. Papini, L. Schwartz
Published in: Proceedings of the 36th Annual Conference on Learning Theory (COLT), 2024
Publisher: Proceedings of Machine Learning Research

Dealing with Unbounded Gradients in Stochastic Saddle-Point Optimization

Authors: G. Neu, N. Okolo
Published in: Proceedings of the 41st International Conference on Machine Learning (ICML), 2024
Publisher: Proceedings of Machine Learning Research

Proximal Point Imitation Learning

Authors: Luca Viano, Angeliki Kamoutsi, Gergely Neu, Igor Krawczuk, Volkan Cevher
Published in: Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022
Publisher: NeurIPS foundation

Importance-Weighted Offline Learning Done Right

Authors: G. Gabbianelli, G. Neu, M. Papini
Published in: Proceedings of the 34th International Conference on Algorithmic Learning Theory (ALT), 2024
Publisher: Proceedings of Machine Learning Research

Online learning with off-policy feedback

Authors: Germano Gabbianelli, Matteo Papini, Gergely Neu
Published in: Proceedings of The 34th International Conference on Algorithmic Learning Theory (ALT 2023), 2023
Publisher: Proceedings of Machine Learning Research

First- and Second-Order Bounds for Adversarial Linear Contextual Bandits

Authors: J. Olkhovskaya, J. Mayo, T. van Erven, G. Neu, C.-Y. Wei
Published in: Advances in Neural Information Processing Systems 36 (NeurIPS), 2023
Publisher: NeurIPS foundation

Optimistic Planning by Regularized Dynamic Programming

Authors: Antoine Moulin, Gergely Neu
Published in: International Conference on Machine Learning (ICML 2023), 2023
Publisher: Proceedings of Machine Learning Research

Generalization bounds via convex analysis

Authors: Gabor Lugosi, Gergely Neu
Published in: Proceedings of Thirty Fifth Conference on Learning Theory (COLT 2022), 2022
Publisher: Proceedings of Machine Learning Research

Adversarial Contextual Bandits Go Kernelized

Authors: G. Neu, J. Olkhovskaya, S. Vakili
Published in: Proceedings of the 34th International Conference on Algorithmic Learning Theory (ALT), 2024
Publisher: Proceedings of Machine Learning Research

Smoothing policies and safe policy gradients

Authors: Matteo Papini; Matteo Pirotta; Marcello Restelli
Published in: Machine Learning, Issue 111, 2022, Page(s) 4081–4137, ISSN 1573-0565
Publisher: Springer
DOI: 10.1007/s10994-022-06232-6

A note on regularised NTK dynamics with an application to PAC-Bayesian training

Authors: Clerico, Eugenio; Guedj, Benjamin
Published in: Transactions on Machine Learning Research, 2024, ISSN 2835-8856
Publisher: Transactions on Machine Learning Research
DOI: 10.48550/arxiv.2312.13259
