Amari S. (1982). Differential geometry of curved exponential families-curvatures and information loss Ann Stat. 10
Amari S. (1991). Dualistic geometry of the manifold of higher-order neurons Neural Netw. 4
Amari S. (1995). Information geometry of the EM and em algorithms for neural networks Neural Netw. 8
Amari S, Ikeda S, Shimokawa H. (2001). Information geometry and mean field approximation: The α-projection approach Advanced mean field methods: Theory and practice.
Amari S, Kurata K, Nagaoka H. (1992). Information geometry of Boltzmann machines. IEEE Transactions on Neural Networks. 3
Amari S, Nagaoka H. (2000). Methods of information geometry.
Amari S, Takeuchi J. (2004). Parallel prior and its properties IEEE Transactions on Information Theory (submitted).
Amari SI. (1985). Differential-geometrical methods in statistics.
Bauschke HH, Borwein JM, Combettes PL. (2002). Bregman monotone optimization algorithms CECM Preprint 02:184 (Available online: http://www.cecm.sfu.ca/preprints/2002pp.html).
Bauschke HH, Combettes PL. (2002). Iterating Bregman retractions CECM Preprint 02:186 (Available online: http://www.cecm.sfu.ca/preprints/2002pp.html).
Bregman LM. (1967). The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming USSR Computational Mathematics and Mathematical Physics. 7
Chentsov NN. (1982). Statistical decision rules and optimal inference.
Csiszar I. (1967). On topological properties of f-divergence Studia Scientiarum Mathematicarum Hungarica. 2
Eguchi S. (1983). Second order efficiency of minimum contrast estimators in a curved exponential family Ann Stat. 11
Eguchi S. (1992). Geometry of minimum contrast Hiroshima Mathematical Journal. 22
Eguchi S. (2002). U-boosting method for classification and information geometry Paper presented at the SRCCS International Statistical Workshop.
Hardy G, Littlewood JE, Polya G. (1952). Inequalities.
Kass RE, Vos PW. (1997). Geometrical foundations of asymptotic inference.
Kurose T. (1994). On the divergences of 1-conformally flat statistical manifolds Tohoku Mathematical Journal. 46
Lafferty J, Lebanon G. (2002). Boosting and maximum likelihood for exponential models Advances in neural information processing systems. 14
Lauritzen S. (1987). Statistical manifolds Differential geometry in statistical inference.
Matsuzoe H. (1998). On realization of conformally-projectively flat statistical manifolds and the divergences Hokkaido Mathematical Journal. 27
Matsuzoe H. (1999). Geometry of contrast functions and conformal geometry Hiroshima Mathematical Journal. 29
Matumoto T. (1993). Any statistical manifold has a contrast function-On the C3-functions taking the minimum at the diagonal of the product manifold Hiroshima Mathematical Journal. 23
Mihoko M, Eguchi S. (2002). Robust blind source separation by beta divergence. Neural Computation. 14
Nakahara H, Amari S, Ikeda S. (1999). Convergence of the wake-sleep algorithm Advances in neural information processing systems. 11
Rao CR. (1987). Differential metrics in probability spaces Differential geometry in statistical inference.
Rockafellar RT. (1970). Convex analysis.
Sejnowski TJ, Ackley DH, Hinton GE. (1985). A learning algorithm for Boltzmann machines. Cognitive Sci. 9
Shima H. (1978). Compact locally Hessian manifolds Osaka Journal Of Mathematics. 15
Uohashi K, Ohara A, Fujii T. (2000). 1-Conformally flat statistical submanifolds Osaka Journal Of Mathematics. 37
Yagi K, Shima H. (1997). Geometry of Hessian manifolds Differential Geometry And Its Applications. 7
Zhu HY, Rohwer R. (1995). Bayesian invariant measurements of generalization Neural Processing Letters. 2
Zhu HY, Rohwer R. (1997). Measurements of generalisation based on information geometry Mathematics of neural networks: Models, algorithms and applications.
Della Pietra S, Lafferty J, Della Pietra V. (1997). Statistical learning algorithms based on Bregman distances Proceedings of 1997 Canadian Workshop on Information Theory.
Della Pietra S, Della Pietra V, Lafferty J. (2002). Duality and auxiliary functions for Bregman distances Tech. Rep. No. CMU-CS-01-109.
Amari S. (2007). Integration of stochastic models by minimizing alpha-divergence. Neural Computation. 19