Chadderdon GL, Neymotin SA, Kerr CC, Lytton WW. (2012). Reinforcement learning of targeted movement in a spiking neuronal model of motor cortex. PLoS ONE. 7
Legenstein R, Pecevski D, Maass W. (2008). A learning theory for reward-modulated spike-timing-dependent plasticity with application to biofeedback. PLoS computational biology. 4
Mozafari M, Kheradpisheh SR, Masquelier T, Nowzari-Dalini A, Ganjtabesh M. (2018). First-Spike-Based Visual Categorization Using Reward-Modulated STDP. IEEE Transactions on Neural Networks and Learning Systems.
Neymotin SA, Chadderdon GL, Kerr CC, Francis JT, Lytton WW. (2013). Reinforcement learning of two-joint virtual arm reaching in a computer model of sensorimotor cortex. Neural computation. 25
Richmond P, Buesing L, Giugliano M, Vasilaki E. (2011). Democratic population decisions result in robust policy-gradient learning: a parametric study with GPU simulations. PLoS ONE. 6