Reinforcement learning (RL) characterizes how we adaptively learn, by trial and error, to select actions that maximize the occurrence of rewards and minimize the occurrence of punishments. The behavioral, computational and neurobiological features of reinforcement learning have been extensively studied in humans and other animals, but mostly in a standard context where the decision-maker faces only one outcome (usually a reward) associated with the option they chose. As a consequence, little is known about how we prioritize, filter or value richer and more complex outcome information in RL, or about how we subjectively evaluate the quality of the information that supports our decisions. This project aims to address this gap. It hypothesizes that humans do learn from complex outcome information (multiple samples), but that computational limitations and affective biases curb information integration.
The prioritization, filtering and biased integration of the information carried by the outcomes of our decisions may underpin critical (and undesirable) behavioral phenomena such as confirmation bias and overconfidence, and ultimately complex social phenomena such as political polarization.
The objective of our project is to investigate these cognitive processes in a well-controlled laboratory environment, in order to decipher the behavioral, computational and neurobiological aspects of information integration in reinforcement learning, together with its biases and limitations.