Andrieu C, Doucet A, Godsill S. (2000). On sequential Monte Carlo sampling methods for Bayesian filtering. Stat Comput. 10
Barto AG, Sutton RS. (1998). Reinforcement learning: an introduction.
Barto AG, Sutton RS, Anderson CW. (1983). Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans Systems Man Cybern. 13
Crassidis JL, Junkins JL. (2004). Optimal estimation of dynamic systems.
Doya K. (2000). Reinforcement learning in continuous time and space. Neural computation. 12
Doya K, Morimoto J. (2002). Development of an observer by using reinforcement learning. Proc 12th Annual Conf Jpn Neural Network Soc.
Erdogmus D, Principe JC, Genc AU. (2002). A neural network perspective to extended Luenberger observers. Institute of Measurement and Control. 35
Erdogmus D, Principe JC, Xu JW. (2005). Minimum error entropy Luenberger observer. Proc Am Control Conf.
Gordon N, Arulampalam MS, Maskell S, Clapp T. (2002). A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Trans Signal Process. 50
Goswami A, Thuilot B, Espiau B. (1996). Compass-like biped robot part I: Stability and bifurcation of passive gaits. Tech Rep No RR-2996, INRIA.
Kaelbling LP, Littman ML, Cassandra AR. (1995). Learning policies for partially observable environments: Scaling up. Proc 12th Intl Conf Mach Learn.
Kaelbling LP, Meuleau N, Kim KE. (2001). Exploration in gradient-based reinforcement learning. Tech Rep, MIT.
Kaelbling LP, Meuleau N, Peshkin L, Kim KE. (2000). Learning finite state controllers for partially observable environments. Proc 15th Ann Conf Uncertainty in Artificial Intelligence.
Kalman RE, Bucy RS. (1961). New results in linear filtering and prediction theory. Trans ASME J Basic Eng. 83
Kobayashi S, Kimura H. (1998). An analysis of actor-critic algorithms using eligibility traces: Reinforcement learning with imperfect value functions. Proc 15th Intl Conf Mach Learn.
Luenberger DG. (1971). An introduction to observers. IEEE Trans Automatic Control. 16
McCallum RA. (1995). Reinforcement learning with selective perception and hidden state Unpublished doctoral dissertation, University of Rochester.
Moore AW, Baird LC. (1999). Gradient descent for general reinforcement learning. Advances in neural information processing systems. 11
Porter LL, Passino KM. (1995). Genetic adaptive observers. Engineering Applications of Artificial Intelligence. 8
Raghavan IR, Hedrick JK. (1994). Observer design for a class of nonlinear systems. International Journal of Control. 59
Singh SP, Jordan MI, Jaakkola T. (1995). Reinforcement learning algorithm for partially observable Markov decision problems. Advances in neural information processing systems. 7
Thau FE. (1973). Observing the state of nonlinear dynamic systems. Intl J Control. 17
Thrun S. (2000). Monte Carlo POMDPs. Advances in neural information processing systems. 12
Wan EA, Nelson AT. (1997). Dual Kalman filtering methods for nonlinear prediction, smoothing, and estimation. Advances in neural information processing systems. 9
van der Merwe R, Wan EA. (2000). The unscented Kalman filter for nonlinear estimation. Proc IEEE Symposium.