Bartlett PL, Baxter J, Weaver L. (1999). Direct gradient-based reinforcement learning: II. Gradient ascent algorithms and experiments. Tech. Rep., Australian National University, Research School of Information Sciences and Engineering.

References and models that cite this paper

Florian RV. (2007). Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity. Neural Computation. 19

Legenstein R, Pecevski D, Maass W. (2008). A learning theory for reward-modulated spike-timing-dependent plasticity with application to biofeedback. PLoS Computational Biology. 4
