"This study is a reference implementation of Keramati, Dezfouli, and Piray (2011), which proposed an arbitration mechanism between a goal-directed strategy and a habitual strategy, used to model the behavior of rats in instrumental conditioning tasks. The habitual strategy is the Kalman Q-learning of Geist, Pietquin, and Fricout (2009). We replicate the results of the first task, i.e. the devaluation experiment with two states and two actions. ..."
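To make the habitual component concrete, here is a minimal sketch of a tabular, Kalman-filter-style Q-value update for a two-state / two-action task of the kind described above. This is not the authors' code: the constants (`GAMMA`, `OBS_NOISE`, `PROC_NOISE`) and the scalar per-(state, action) filter are illustrative assumptions, not the full Kalman Q-learning of Geist, Pietquin, and Fricout (2009). The point it shows is that each Q-value carries both a mean and a variance, and the variance shrinks with experience, which an arbitration scheme can read as a measure of the habitual system's certainty.

```python
import numpy as np

# Illustrative constants, not taken from the paper.
N_STATES, N_ACTIONS = 2, 2
GAMMA = 0.95        # discount factor (assumed)
OBS_NOISE = 1.0     # observation noise R (assumed)
PROC_NOISE = 1e-3   # process noise; keeps learning from freezing (assumed)

q_mean = np.zeros((N_STATES, N_ACTIONS))  # Q-value estimates
q_var = np.ones((N_STATES, N_ACTIONS))    # uncertainty about each estimate

def kalman_q_update(s, a, r, s_next, terminal=False):
    """One scalar Kalman-filter update of Q(s, a) toward the TD target."""
    target = r if terminal else r + GAMMA * q_mean[s_next].max()
    q_var[s, a] += PROC_NOISE                       # predict: inflate variance
    gain = q_var[s, a] / (q_var[s, a] + OBS_NOISE)  # Kalman gain in [0, 1)
    q_mean[s, a] += gain * (target - q_mean[s, a])  # correct the mean
    q_var[s, a] *= (1.0 - gain)                     # shrink the variance
```

After repeated rewarded trials for one (state, action) pair, the mean converges toward the reward while the variance decays roughly as 1/n, so a well-trained habit is both accurate and confident.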
Region(s) or Organism(s): Basal ganglia
Model Concept(s): Action Selection/Decision Making; Reinforcement Learning; Learning
Simulation Environment: Python (web link to model)
Implementer(s): Viejo, Guillaume [guillaume.viejo at isir.upmc.fr]; Girard, Benoit [girard at isir.upmc.fr]; Khamassi, Mehdi
References:
Keramati M, Dezfouli A, Piray P (2011). Speed/accuracy trade-off between the habitual and the goal-directed processes. PLoS Computational Biology, 7. [PubMed]
Girard B, Khamassi M, Viejo G (2016). [Re] Speed/accuracy trade-off between the habitual and the goal-directed processes. ReScience, 2(1).