Bartlett PL, Baxter J. (2001). Infinite-horizon policy-gradient estimation. J Artif Intell Res. 15


References and models that cite this paper

Baras D, Meir R. (2007). Reinforcement learning, spike-time-dependent plasticity, and the BCM rule. Neural computation. 19

Fiete IR, Fee MS, Seung HS. (2007). Model of birdsong learning based on gradient estimation by dynamic perturbation of neural conductances. Journal of neurophysiology. 98

Florian RV. (2007). Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity. Neural computation. 19

Roelfsema PR, van Ooyen A. (2005). Attention-gated reinforcement learning of internal representations for classification. Neural computation. 17
