LUCS seminar: 2023-10-03@10:15: Mehdi Khamassi: Model-based and model-free reinforcement learning mechanisms in brains and robots
Reinforcement learning (RL) theory provides a framework for an artificial agent to learn actions that maximize reward in its environment. It has been successfully applied in neuroscience to account for animal neural and behavioral processes in simple laboratory tasks, such as Pavlovian and instrumental conditioning and single-step economic decision-making. It has moreover become very popular because it accounts for dopamine reward prediction error signals. However, more complex multi-step tasks, such as navigation and social interaction, expose its computational limitations.
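The reward prediction error mentioned above is usually formalized as a temporal-difference (TD) error. A minimal sketch, with illustrative parameter values not taken from the talk:

```python
# Temporal-difference (TD) reward prediction error, the quantity that
# dopamine signals are thought to encode. Parameter values and state
# names are illustrative assumptions.

GAMMA = 0.9   # discount factor (assumed value)
ALPHA = 0.1   # learning rate (assumed value)

# Current value estimates for a state s and its successor s'
V = {"s": 0.0, "s_next": 1.0}

def td_error(reward, v_s, v_s_next, gamma=GAMMA):
    """delta = r + gamma * V(s') - V(s): positive when the outcome is
    better than predicted, negative when it is worse."""
    return reward + gamma * v_s_next - v_s

# An unexpected reward of 1 in state s yields a positive error...
delta = td_error(1.0, V["s"], V["s_next"])
# ...which is used to nudge the value estimate of s upward.
V["s"] += ALPHA * delta
```

A positive delta thus both reports a surprise and drives learning, which is why the same signal can account for both conditioning behavior and dopamine firing patterns.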
In parallel, research in engineering, and in robotics in particular, has emphasized the complementarity between different learning strategies when facing complex tasks, and has explored solutions for combining them. One central distinction is between model-based and model-free reinforcement learning strategies: in the former case, an agent learns a statistical model of the effects of its actions on the environment, and then uses this model to plan sequences of actions towards desired goals. In contrast, model-free strategies are relevant when the environment's statistics are too noisy to learn a good internal model. In that case, RL agents can instead learn local action values and adapt reactively in each state of the environment.
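The two strategies can be contrasted on a toy problem. The sketch below uses a tiny deterministic chain environment of my own invention, not one from the talk: the model-free agent learns action values from sampled transitions, while the model-based agent first learns the transition model and then plans over it.

```python
# Model-free vs model-based RL on a tiny deterministic chain:
# states 0..3, moving right each step, reward 1 on reaching state 3.
# The environment and parameters are illustrative assumptions.

N_STATES, GOAL = 4, 3
GAMMA, ALPHA = 0.9, 0.5

def step(state):
    """Environment dynamics: move one step right; reward at the goal."""
    nxt = min(state + 1, GOAL)
    return nxt, 1.0 if nxt == GOAL else 0.0

# --- Model-free: learn local values reactively from experience ---
Q = [0.0] * N_STATES
for _ in range(50):                      # repeated episodes
    s = 0
    while s != GOAL:
        nxt, r = step(s)
        Q[s] += ALPHA * (r + GAMMA * Q[nxt] - Q[s])   # TD update
        s = nxt

# --- Model-based: learn a one-step model, then plan over it ---
model = {s: step(s) for s in range(GOAL)}            # learned model
V = [0.0] * N_STATES
for _ in range(50):                                  # value iteration
    for s in range(GOAL):
        nxt, r = model[s]
        V[s] = r + GAMMA * V[nxt]
```

On this noiseless task both converge to the same values (0.81, 0.9, 1.0 along the chain); the trade-off appears when the model is hard to learn (favoring model-free control) or when flexible replanning is needed (favoring model-based control).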
In this presentation, I will show a series of studies in which we used a coordination of model-based and model-free reinforcement learning to account for a diversity of behavioral and neural observations in humans, non-human primates, and rodents across different paradigms: navigation, instrumental conditioning, and Pavlovian conditioning. I will moreover present recent robotics results in which the same algorithm, with the same parameters, produces optimal performance in simple navigation and social interaction tasks, at a drastically reduced computational cost compared to classical methods. Finally, I will show how the patterns of mental simulation within such internal models can mimic experimentally observed reactivations of the rodent hippocampus in spatial cognition tasks, and raise new predictions for future experiments.
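One standard computational analogue of such mental simulation is Dyna-style replay, where the agent interleaves real experience with offline updates drawn from its learned internal model. The following is a minimal sketch under that assumption, reusing the toy chain environment from above; it is not the specific algorithm presented in the talk.

```python
# Dyna-style "mental simulation": after each real step, the agent
# replays stored transitions from its internal model, offline.
# Environment, parameters, and replay scheme are illustrative.
import random

random.seed(0)
N_STATES, GOAL = 4, 3
GAMMA, ALPHA, N_REPLAY = 0.9, 0.5, 10

def step(state):
    nxt = min(state + 1, GOAL)
    return nxt, 1.0 if nxt == GOAL else 0.0

Q = [0.0] * N_STATES
model = {}                               # learned one-step model
for _ in range(20):                      # episodes of real experience
    s = 0
    while s != GOAL:
        nxt, r = step(s)                 # real transition
        model[s] = (nxt, r)              # store it in the model
        Q[s] += ALPHA * (r + GAMMA * Q[nxt] - Q[s])
        # mental simulation: replay remembered transitions
        for _ in range(N_REPLAY):
            ps = random.choice(list(model))
            pn, pr = model[ps]
            Q[ps] += ALPHA * (pr + GAMMA * Q[pn] - Q[ps])
        s = nxt
```

Replay propagates value information along the chain much faster than real experience alone, which is one reason such simulation patterns are a natural candidate mechanism for hippocampal reactivations.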