"This study is a reference implementation of Keramati, Dezfouli, and Piray (2011), which proposed an arbitration mechanism between a goal-directed strategy and a habitual strategy, used to model the behavior of rats in instrumental conditioning tasks. The habitual strategy is the Kalman Q-learning of Geist, Pietquin, and Fricout (2009). We replicate the results of the first task, i.e. the devaluation experiment with two states and two actions. ..."
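To make the habitual component concrete, here is a minimal sketch of a tabular, Kalman-filter-style Q-value update for a two-state / two-action task of the kind described above. This is not the authors' code: the constants (`GAMMA`, `OBS_NOISE`, `PROC_NOISE`) and the scalar per-(state, action) filter are illustrative assumptions, not the full Kalman Q-learning of Geist, Pietquin, and Fricout (2009). The point it shows is that each Q-value carries both a mean and a variance, and the variance shrinks with experience, which an arbitration scheme can read as a measure of the habitual system's certainty.

```python
import numpy as np

# Illustrative constants, not taken from the paper.
N_STATES, N_ACTIONS = 2, 2
GAMMA = 0.95        # discount factor (assumed)
OBS_NOISE = 1.0     # observation noise R (assumed)
PROC_NOISE = 1e-3   # process noise; keeps learning from freezing (assumed)

q_mean = np.zeros((N_STATES, N_ACTIONS))  # Q-value estimates
q_var = np.ones((N_STATES, N_ACTIONS))    # uncertainty about each estimate

def kalman_q_update(s, a, r, s_next, terminal=False):
    """One scalar Kalman-filter update of Q(s, a) toward the TD target."""
    target = r if terminal else r + GAMMA * q_mean[s_next].max()
    q_var[s, a] += PROC_NOISE                       # predict: inflate variance
    gain = q_var[s, a] / (q_var[s, a] + OBS_NOISE)  # Kalman gain in [0, 1)
    q_mean[s, a] += gain * (target - q_mean[s, a])  # correct the mean
    q_var[s, a] *= (1.0 - gain)                     # shrink the variance
```

After repeated rewarded trials for one (state, action) pair, the mean converges toward the reward while the variance decays roughly as 1/n, so a well-trained habit is both accurate and confident.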
Region(s) or Organism(s): Basal ganglia
Model Concept(s): Action Selection/Decision Making; Reinforcement Learning; Learning
Simulation Environment: Python (web link to model)
Implementer(s): Viejo, Guillaume [guillaume.viejo at isir.upmc.fr]; Girard, Benoit [girard at isir.upmc.fr]; Khamassi, Mehdi
References:
Keramati M, Dezfouli A, Piray P (2011). Speed/accuracy trade-off between the habitual and the goal-directed processes. PLoS Computational Biology, 7. [PubMed]
Girard B, Khamassi M, Viejo G (2016). [Re] Speed/accuracy trade-off between the habitual and the goal-directed processes. ReScience, 2(1).