ModelDB: Paper information

Sakai Y, Fukai T. (2008). The actor-critic learning is behind the matching law: matching versus optimal behaviors. Neural computation. 20 [PubMed]

References and models cited by this paper

Abbott LF, Dayan P. (2001). Theoretical Neuroscience. Computational and Mathematical Modeling of Neural Systems.

Barraclough DJ, Conroy ML, Lee D. (2004). Prefrontal cortex and decision making in a mixed-strategy game. Nature neuroscience. 7 [PubMed]

Barto AG, Sutton RS. (1998). Reinforcement learning: an introduction.

A reinforcement learning example (Sutton and Barto 1998) [Model]

Baum WM. (1981). Optimization and the matching law as accounts of instrumental behavior. Journal of the experimental analysis of behavior. 36 [PubMed]

Baum WM, Rachlin HC. (1969). Choice as time allocation. Journal of the experimental analysis of behavior. 12 [PubMed]

Breiter HC, Aharon I, Kahneman D, Dale A, Shizgal P. (2001). Functional imaging of neural responses to expectancy and experience of monetary gains and losses. Neuron. 30 [PubMed]

Davison M, McCarthy D. (1987). The matching law: A research review.

Daw ND, Touretzky DS. (2002). Long-term reward prediction in TD models of the dopamine system. Neural computation. 14 [PubMed]

Dayan P, Balleine BW. (2002). Reward, motivation, and reinforcement learning. Neuron. 36 [PubMed]

DeCarlo LT. (1985). Matching and maximizing with variable-time schedules. Journal of the experimental analysis of behavior. 43 [PubMed]

Doya K. (2000). Complementary roles of basal ganglia and cerebellum in learning and motor control. Current opinion in neurobiology. 10 [PubMed]

Gallistel CR, Mark TA, King AP, Latham PE. (2001). The rat approximates an ideal detector of changes in rates of reward: implications for the law of effect. Journal of experimental psychology. Animal behavior processes. 27 [PubMed]

Haruno M et al. (2004). A neural correlate of reward-based behavioral learning in caudate nucleus: a functional magnetic resonance imaging study of a stochastic decision task. The Journal of neuroscience : the official journal of the Society for Neuroscience. 24 [PubMed]

Herrnstein RJ, Heyman GM. (1979). Is matching compatible with reinforcement maximization on concurrent variable interval variable ratio? Journal of the experimental analysis of behavior. 31 [PubMed]

Herrnstein RJ, Rachlin H, Laibson DI. (1997). The matching law: papers in psychology and economics.

Herrnstein RJ, Vaughan WJ. (1980). Melioration and behavioral allocation. Limits to action: the allocation of individual behavior.

Heyman GM. (1979). A Markov model description of changeover probabilities on concurrent variable-interval schedules. Journal of the experimental analysis of behavior. 31 [PubMed]

Heyman GM, Monaghan MM. (1994). Reinforcer magnitude (sucrose concentration) and the matching law theory of response strength. Journal of the experimental analysis of behavior. 61 [PubMed]

Houk JC, Beiser DG, Davis JL. (1995). Models of Information Processing in the Basal Ganglia.

Houston AI, McNamara J. (1981). How to maximize reward rate on two variable-interval paradigms. Journal of the experimental analysis of behavior. 35 [PubMed]

Jacobs EA, Hackenberg TD. (1996). Humans' choices in situations of time-based diminishing returns: effects of fixed-interval duration and progressive-interval step size. Journal of the experimental analysis of behavior. 65 [PubMed]

Knutson B, Adams CM, Fong GW, Hommer D. (2001). Anticipation of increasing monetary reward selectively recruits nucleus accumbens. The Journal of neuroscience : the official journal of the Society for Neuroscience. 21 [PubMed]

Mazur JE. (1981). Optimization theory fails to predict performance of pigeons in a two-response situation. Science (New York, N.Y.). 214 [PubMed]

Mazur JE. (2005). Learning and behavior (6th ed).

McClure SM, Berns GS, Montague PR. (2003). Temporal prediction errors in a passive learning task activate human striatum. Neuron. 38 [PubMed]

Montague PR, Berns GS. (2002). Neural economics and the biological substrates of valuation. Neuron. 36 [PubMed]

Montague PR, Dayan P, Sejnowski TJ. (1996). A framework for mesencephalic dopamine systems based on predictive Hebbian learning. The Journal of neuroscience : the official journal of the Society for Neuroscience. 16 [PubMed]

Morris G, Arkadir D, Nevet A, Vaadia E, Bergman H. (2004). Coincident but distinct messages of midbrain dopamine and striatal tonically active neurons. Neuron. 43 [PubMed]

Platt ML, Glimcher PW. (1999). Neural correlates of decision variables in parietal cortex. Nature. 400 [PubMed]

Rachlin H, Green L, Kagel J, Battalio R. (1976). Economic demand theory and psychological studies of choice. The psychology of learning and motivation. 10

Sakagami T, Hursh SR, Christensen J, Silberberg A. (1989). Income maximizing in concurrent interval-ratio schedules. Journal of the experimental analysis of behavior. 52 [PubMed]

Samejima K, Ueda Y, Doya K, Kimura M. (2005). Representation of action-specific reward values in the striatum. Science (New York, N.Y.). 310 [PubMed]

Savastano HI, Fantino E. (1994). Human choice in concurrent ratio-interval schedules of reinforcement. Journal of the experimental analysis of behavior. 61 [PubMed]

Schultz W. (2004). Neural coding of basic reward terms of animal learning theory, game theory, microeconomics and behavioural ecology. Current opinion in neurobiology. 14 [PubMed]

Schultz W, Dayan P, Montague PR. (1997). A neural substrate of prediction and reward. Science (New York, N.Y.). 275 [PubMed]

Seung HS. (2003). Learning in spiking neural networks by reinforcement of stochastic synaptic transmission. Neuron. 40 [PubMed]

Silberberg A, Thomas JR, Berendzen N. (1991). Human choice on concurrent variable-interval variable-ratio schedules. Journal of the experimental analysis of behavior. 56 [PubMed]

Staddon JE, Hinson JM. (1983). Optimization: a result or a mechanism? Science. 221

Stubbs DA, Pliskoff SS, Reid HM. (1977). Concurrent schedules: a quantitative relation between changeover behavior and its consequences. Journal of the experimental analysis of behavior. 27 [PubMed]

Sugrue LP, Corrado GS, Newsome WT. (2004). Matching behavior and the representation of value in the parietal cortex. Science (New York, N.Y.). 304 [PubMed]

Tanaka SC et al. (2004). Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nature neuroscience. 7 [PubMed]

Touretzky DS, Daw ND. (2001). Operant behavior suggests attentional gating of dopamine system inputs. Neurocomputing. 38

Vyse SA, Belke TW. (1992). Maximizing versus matching on concurrent variable-interval schedules. Journal of the experimental analysis of behavior. 58 [PubMed]

Wang XJ. (2002). Probabilistic decision making by slow reverberation in cortical circuits. Neuron. 36 [PubMed]

References and models that cite this paper