Barto AG, Sutton RS. (1998). Reinforcement learning: an introduction.
Bayer HM, Glimcher PW. (2005). Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron. 47
Bayer HM, Lau B, Glimcher PW. (2007). Statistics of midbrain dopamine neuron spike trains in the awake primate. Journal of neurophysiology. 98
Björklund A, Dunnett SB. (2007). Dopamine neuron systems in the brain: an update. Trends in neurosciences. 30
Boeijinga PH, Mulder AB, Pennartz CM, Manshanden I, Lopes da Silva FH. (1993). Responses of the nucleus accumbens following fornix/fimbria stimulation in the rat. Identification and long-term potentiation of mono- and polysynaptic pathways. Neuroscience. 53
Bolam JP, Pissadaki EK. (2012). Living on the edge with too many mouths to feed: why dopamine neurons die. Movement disorders : official journal of the Movement Disorder Society. 27
Bromberg-Martin ES, Matsumoto M, Hikosaka O. (2010). Dopamine in motivational control: rewarding, aversive, and alerting. Neuron. 68
Calabresi P, Maj R, Pisani A, Mercuri NB, Bernardi G. (1992). Long-term synaptic depression in the striatum: physiological and pharmacological characterization. The Journal of neuroscience : the official journal of the Society for Neuroscience. 12
Doya K. (2000). Complementary roles of basal ganglia and cerebellum in learning and motor control. Current opinion in neurobiology. 10
Fiorillo CD, Tobler PN, Schultz W. (2003). Discrete coding of reward probability and uncertainty by dopamine neurons. Science (New York, N.Y.). 299
Gerfen CR, Surmeier DJ. (2011). Modulation of striatal projection systems by dopamine. Annual review of neuroscience. 34
Gershman SJ. (2014). Dopamine ramps are a consequence of reward prediction errors. Neural computation. 26
Glimcher PW. (2011). Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis. Proceedings of the National Academy of Sciences of the United States of America. 108 Suppl 3
Gustafsson B, Asztely F, Hanse E, Wigström H. (1989). Onset characteristics of long-term potentiation in the guinea-pig hippocampal CA1 region in vitro. The European journal of neuroscience. 1
Hardt O, Nader K, Nadel L. (2013). Decay happens: the role of active forgetting in memory. Trends in cognitive sciences. 17
Hardt O, Nader K, Wang YT. (2014). GluA2-dependent AMPA receptor endocytosis and the decay of early and late long-term potentiation: possible mechanisms for forgetting of short- and long-term memories. Philosophical transactions of the Royal Society of London. Series B, Biological sciences. 369
Hart AS, Rutledge RB, Glimcher PW, Phillips PE. (2014). Phasic dopamine release in the rat nucleus accumbens symmetrically encodes a reward prediction error term. The Journal of neuroscience : the official journal of the Society for Neuroscience. 34
Howe MW, Tierney PL, Sandberg SG, Phillips PE, Graybiel AM. (2013). Prolonged dopamine signalling in striatum signals proximity and value of distant rewards. Nature. 500
Ito M, Doya K. (2011). Multiple representations and algorithms for reinforcement learning in the cortico-basal ganglia circuit. Current opinion in neurobiology. 21
Kawagoe R, Takikawa Y, Hikosaka O. (2004). Reward-predicting activity of dopamine and caudate neurons--a possible mechanism of motivational control of saccadic eye movement. Journal of neurophysiology. 91
Laughlin SB. (2001). Energy as a constraint on the coding and processing of sensory information. Current opinion in neurobiology. 11
Matsuzaki M, Honkura N, Ellis-Davies GC, Kasai H. (2004). Structural basis of long-term potentiation in single dendritic spines. Nature. 429
Montague PR, Dayan P, Sejnowski TJ. (1996). A framework for mesencephalic dopamine systems based on predictive Hebbian learning. The Journal of neuroscience : the official journal of the Society for Neuroscience. 16
Montague PR, Hyman SE, Cohen JD. (2004). Computational roles for dopamine in behavioural control. Nature. 431
Morita K. (2014). Differential cortical activation of the striatal direct and indirect-pathway cells: reconciling the anatomical and optogenetic results by a computational method. Journal of neurophysiology. 21
Morita K, Morishima M, Sakai K, Kawaguchi Y. (2012). Reinforcement learning: computing the temporal difference of values via distinct corticostriatal pathways. Trends in neurosciences. 35
Morita K, Morishima M, Sakai K, Kawaguchi Y. (2013). Dopaminergic control of motivation and reinforcement learning: a closed-circuit account for reward-oriented behavior. The Journal of neuroscience : the official journal of the Society for Neuroscience. 33
Morris G, Nevet A, Arkadir D, Vaadia E, Bergman H. (2006). Midbrain dopamine neurons encode decisions for future action. Nature neuroscience. 9
Niranjan M, Rummery GA. (1994). On-line Q-learning using connectionist systems. Technical Report CUED/F-INFENG/TR 166.
Niv Y. (2013). Neuroscience: Dopamine ramps up. Nature. 500
Niv Y, Daw ND, Dayan P. (2006). Choice values. Nature neuroscience. 9
Niv Y, Daw ND, Joel D, Dayan P. (2007). Tonic dopamine: opportunity costs and the control of response vigor. Psychopharmacology. 191
Niv Y, Duff MO, Dayan P. (2005). Dopamine, uncertainty and TD learning. Behavioral and brain functions : BBF. 1
O'Doherty JP, Hampton A, Kim H. (2007). Model-based fMRI and its application to reward learning and decision making. Annals of the New York Academy of Sciences. 1104
Pennartz CM, Ito R, Verschure PF, Battaglia FP, Robbins TW. (2011). The hippocampal-striatal axis in learning, prediction and goal-directed behavior. Trends in neurosciences. 34
Pissadaki EK, Bolam JP. (2013). The energy cost of action potential propagation in dopamine neurons: clues to susceptibility in Parkinson's disease. Frontiers in computational neuroscience. 7
Potjans W, Diesmann M, Morrison A. (2011). An imperfect dopaminergic error signal can drive temporal-difference learning. PLoS computational biology. 7
Rangel A, Camerer C, Montague PR. (2008). A framework for studying the neurobiology of value-based decision making. Nature reviews. Neuroscience. 9
Reynolds JN, Hyland BI, Wickens JR. (2001). A cellular mechanism of reward-related learning. Nature. 413
Roesch MR, Calu DJ, Schoenbaum G. (2007). Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nature neuroscience. 10
Samejima K, Ueda Y, Doya K, Kimura M. (2005). Representation of action-specific reward values in the striatum. Science (New York, N.Y.). 310
Schultz W, Dayan P, Montague PR. (1997). A neural substrate of prediction and reward. Science (New York, N.Y.). 275
Shen W, Flajolet M, Greengard P, Surmeier DJ. (2008). Dichotomous dopaminergic control of striatal synaptic plasticity. Science (New York, N.Y.). 321
Steinberg EE et al. (2013). A causal link between prediction errors, dopamine neurons and learning. Nature neuroscience. 16
Threlfell S et al. (2012). Striatal dopamine release is triggered by synchronized activity in cholinergic interneurons. Neuron. 75
Ungerstedt U. (1971). Stereotaxic mapping of the monoamine pathways in the rat brain. Acta physiologica Scandinavica. Supplementum. 367
Watabe-Uchida M, Zhu L, Ogawa SK, Vamanrao A, Uchida N. (2012). Whole-brain mapping of direct inputs to midbrain dopamine neurons. Neuron. 74
Watkins CJCH. (1989). Learning from delayed rewards. Unpublished doctoral dissertation.
Xiao MY, Niu YP, Wigström H. (1996). Activity-dependent decay of early LTP revealed by dual EPSP recording in hippocampal slices from young rats. The European journal of neuroscience. 8
Kato A, Morita K. (2016). Forgetting in Reinforcement Learning Links Sustained Dopamine Signals to Motivation. PLoS computational biology. 12