Bakker B. (2002). Reinforcement learning with long short-term memory. Neural Information Processing Systems.
Barto AG, Sutton RS. (1998). Reinforcement learning: an introduction.
Belavkin RV, Huyck CR. (2011). Conflict resolution and learning probability matching in a neural cell-assembly architecture. Cognitive Systems Research. 12
Boerlin M, Denève S. (2011). Spike-based population coding and working memory. PLoS computational biology. 7 [PubMed]
Doya K. (2002). Metalearning and neuromodulation. Neural networks : the official journal of the International Neural Network Society. 15 [PubMed]
Doya K, Otsuka M, Elfwing S, Uchibe E. (2010). Free-energy based reinforcement learning for vision-based navigation with high-dimensional sensory inputs. Neural Information Processing. Theory and Algorithms.
Doya K, Yoshimoto J, Otsuka M. (2008). Robust population coding in free-energy-based reinforcement learning. Proceedings of the International Conference on Artificial Neural Networks (ICANN).
Doya K, Yoshimoto J, Otsuka M. (2010). Free-energy-based reinforcement learning in a partially observable environment. European Symposium on Artificial Neural Networks (ESANN).
Florian RV. (2007). Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity. Neural computation. 19 [PubMed]
Freedman DJ, Assad JA. (2006). Experience-dependent representation of visual categories in parietal cortex. Nature. 443 [PubMed]
Freedman DJ, Riesenhuber M, Poggio T, Miller EK. (2001). Categorical representation of visual stimuli in the primate prefrontal cortex. Science (New York, N.Y.). 291 [PubMed]
Hinton GE. (2002). Training products of experts by minimizing contrastive divergence. Neural computation. 14 [PubMed]
Hinton GE, Sallans B. (2004). Reinforcement learning with factored states and actions. Journal of Machine Learning Research. 5
Izhikevich EM. (2007). Solving the distal reward problem through linkage of STDP and dopamine signaling. Cerebral cortex (New York, N.Y. : 1991). 17 [PubMed]
Jimenez Rezende D, Gerstner W. (2014). Stochastic variational learning in recurrent spiking networks. Frontiers in computational neuroscience. 8 [PubMed]
Kistler WM, Gerstner W. (2002). Spiking neuron models.
Kwee I, Hutter M. (2001). Market-based reinforcement learning in partially observable worlds. Proceedings of the International Conference on Artificial Neural Networks (ICANN).
Matsuda W et al. (2009). Single nigrostriatal dopaminergic neurons form widely spread and highly dense axonal arborizations in the neostriatum. The Journal of neuroscience : the official journal of the Society for Neuroscience. 29 [PubMed]
Miller EK, Freedman DJ, Wallis JD. (2002). The prefrontal cortex: categories, concepts and cognition. Philosophical transactions of the Royal Society of London. Series B, Biological sciences. 357 [PubMed]
Potjans W, Morrison A, Diesmann M. (2009). A spiking neural network model of an actor-critic learning agent. Neural computation. 21 [PubMed]
Reynolds JN, Hyland BI, Wickens JR. (2001). A cellular mechanism of reward-related learning. Nature. 413 [PubMed]
Roberts PD, Santiago RA, Lafferriere G. (2008). An implementation of reinforcement learning based on spike timing dependent plasticity. Biological cybernetics. 99 [PubMed]
Saeb S, Weber C, Triesch J. (2009). Goal-directed learning of features and forward models. Neural networks : the official journal of the International Neural Network Society. 22 [PubMed]
Samejima K, Ueda Y, Doya K, Kimura M. (2005). Representation of action-specific reward values in the striatum. Science (New York, N.Y.). 310 [PubMed]
Schmidhuber J. (2014). Deep learning in neural networks: An overview. CoRR abs/1404.7828.
Schultz W, Dayan P, Montague PR. (1997). A neural substrate of prediction and reward. Science (New York, N.Y.). 275 [PubMed]
Szatmary B, Izhikevich EM. (2010). Spike-timing theory of working memory. PLoS Computational Biology.
Trappenberg T, Hollensen P, Hartono P. (2011). Topographic RBM as robot controller. The 21st Annual Conference of the Japanese Neural Network Society.
Whitehead SD, Lin LJ. (1995). Reinforcement learning of non-Markov decision processes. Artificial Intelligence. 73
Rössert C, Dean P, Porrill J. (2015). At the Edge of Chaos: How Cerebellar Granular Layer Network Dynamics Can Provide the Basis for Temporal Filters. PLoS computational biology. 11 [PubMed]
Wilson CJ, Beverlin B, Netoff T. (2011). Chaotic desynchronization as the therapeutic mechanism of deep brain stimulation. Frontiers in systems neuroscience. 5 [PubMed]