CORDIS - EU research results
Content archived on 2024-05-29

Reinforcement learning via supervised learning

Objective

The field of machine learning develops learning paradigms and algorithms that allow systems to acquire a desired functionality on their own. Supervised learning is learning with a teacher: an authoritative source provides a finite set of correct examples, and the learner generalises from them to a function that is correct over the entire input space. An example from human learning is learning correct spelling by observing correctly spelled words.

Reinforcement learning, on the other hand, is learning by trial and error: there is no teacher, and the learner interacts directly with its environment to acquire information. The learner makes decisions arbitrarily and occasionally receives a numerical score (the reinforcement signal) for its overall behaviour. This score does not label individual actions as correct or incorrect, but it can be used to reinforce good decision-making and discourage bad decision-making. An example from human learning is learning how to balance and ride a bicycle, where falls incur negative scores.

These two fields have been researched mostly independently. Recent advances in supervised learning have demonstrated outstanding, near-optimal generalisation performance, whereas reinforcement learning has not reached the same level of applicability to real-world problems. This research proposal investigates the potential of using supervised learning technology to advance reinforcement learning. Supervised learning algorithms can be incorporated within the inner loops of several reinforcement learning algorithms, thereby reducing one problem to the other. This synergy opens the door to a variety of promising combinations. The proposed research will establish the criteria under which this reduction is possible, investigate viable combinations, propose novel algorithms, assess their potential, and apply them to real problems of practical interest to demonstrate their effectiveness.
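As an illustration of what placing a supervised learner inside the inner loop of a reinforcement learning algorithm can look like, the sketch below uses fitted Q-iteration, one well-known instance of this idea; the record does not say which algorithms the project actually studied. Everything in it is an assumption made for illustration: the six-state chain environment, the names `step`, `fit_regressor` and `fitted_q_iteration`, and the 1-nearest-neighbour regressor standing in for any supervised regression method.

```python
# Illustrative sketch only: fitted Q-iteration on a hypothetical chain world.
# The environment, function names and the 1-NN regressor are assumptions,
# not taken from the project description.

N_STATES = 6        # states 0..5; state 5 is the terminal goal
ACTIONS = (-1, 1)   # move left / move right
GAMMA = 0.9         # discount factor


def step(state, action):
    """Deterministic chain: reward 1.0 only when the goal state is reached."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def fit_regressor(examples):
    """Supervised learner: 1-nearest-neighbour regression on (state, action)."""
    data = list(examples)

    def predict(x):
        nearest = min(
            data,
            key=lambda ex: (ex[0][0] - x[0]) ** 2 + (ex[0][1] - x[1]) ** 2,
        )
        return nearest[1]

    return predict


def fitted_q_iteration(transitions, n_iters=20):
    """Each iteration turns the RL problem into one supervised regression task."""

    def q(x):
        return 0.0  # initial Q-function: zero everywhere

    for _ in range(n_iters):
        examples = []
        for s, a, r, s2, done in transitions:
            # Regression target: one-step Bellman backup under the current Q.
            target = r if done else r + GAMMA * max(q((s2, b)) for b in ACTIONS)
            examples.append(((s, a), target))
        q = fit_regressor(examples)  # the supervised learning step
    return q


# Batch of experience: every action tried once in every non-terminal state.
transitions = []
for s in range(N_STATES - 1):
    for a in ACTIONS:
        s2, r, done = step(s, a)
        transitions.append((s, a, r, s2, done))

q = fitted_q_iteration(transitions)
policy = [max(ACTIONS, key=lambda a: q((s, a))) for s in range(N_STATES - 1)]
print(policy)
```

The reinforcement learning loop never inspects how `fit_regressor` works, so under these assumptions any regression method with the same fit-then-predict interface could be substituted for it.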

Call for proposal

FP6-2004-MOBILITY-12

Funding scheme

IRG - Marie Curie actions-International re-integration grants

Coordinator

TECHNICAL UNIVERSITY OF CRETE
EU contribution
No data
Total cost
No data