Akaho S, Kappen HJ. (2000). Nonmonotonic generalization bias of Gaussian mixture models. Neural computation. 12 [PubMed]
Akaike H. (1974). A new look at the statistical model identification IEEE Trans Automat Contr. 19
Amari S. (1967). Theory of adaptive pattern classifiers IEEE Trans. 16
Amari S. (1977). Dynamics of pattern formation in lateral-inhibition type neural fields. Biological cybernetics. 27 [PubMed]
Amari S. (1987). Differential geometry of a parametric family of invertible linear systems-Riemannian metric, dual affine connections and divergence Mathematical Systems Theory. 20
Amari S. (1998). Natural gradient works efficiently in learning Neural Comput. 10
Amari S. (2003). New consideration on criteria of model selection Neural networks and soft computing (Proceedings of the Sixth International Conference on Neural Networks and Soft Computing).
Amari S, Burnashev MV. (2003). On some singularities in parameter estimation problems Problems Of Information Transmission. 39
Amari S, Murata N. (1993). Statistical theory of learning curves under entropic loss criterion Neural Comput. 5
Amari S, Nagaoka H. (2000). Methods of information geometry.
Amari S, Nakahara H. (2005). Difficulty of singularity in population coding. Neural computation. 17 [PubMed]
Amari S, Ozeki T. (2001). Differential and algebraic geometry of multilayer perceptrons IEICE Trans. 84
Amari S, Park H, Fukumizu K. (2000). Adaptive method of realizing natural gradient learning for multilayer perceptrons. Neural computation. 12 [PubMed]
Amari S, Park H, Ozeki T. (2001). Statistical inference in nonidentifiable and singular statistical models J Of The Korean Statistical Society. 30
Amari S, Park H, Ozeki T. (2002). Geometrical singularities in the neuromanifold of multilayer perceptrons Advances in neural information processing systems. 14
Amari S, Park H, Ozeki T. (2003). Learning and inference in hierarchical models with singularities Systems And Computers In Japan. 34
Amari S, Rattray M, Saad D. (1998). Natural gradient descent for on-line learning Phys Rev Lett. 81
Brockett RW. (1976). Some geometric questions in the theory of linear systems IEEE Trans On Automatic Control. 21
Dacunha-Castelle D, Gassiat E. (1997). Testing in locally conic models, and application to mixture models Probability And Statistics. 1
Fukumizu K. (1999). Generalization error of linear neural networks in unidentifiable cases Algorithmic learning theory: Proceedings of the 10th International Conference on Algorithmic Learning Theory (ALT99).
Fukumizu K. (2003). Likelihood ratio of unidentifiable models and multilayer neural networks Annals Of Statistics. 31
Fukumizu K, Amari S. (2000). Local minima and plateaus in hierarchical structures of multilayer perceptrons. Neural networks : the official journal of the International Neural Network Society. 13 [PubMed]
Hagiwara K. (2002). On the problem in model selection of neural network regression in overrealizable scenario. Neural computation. 14 [PubMed]
Hagiwara K. (2002). Regularization learning, early stopping and biased estimator Neurocomputing. 48
Hagiwara K, Hayasaka T, Toda N, Usui S, Kuno K. (2001). Upper bound of the expected training error of neural network regression for a Gaussian noise sequence. Neural networks : the official journal of the International Neural Network Society. 14 [PubMed]
Hartigan JA. (1985). A failure of likelihood asymptotics for normal mixtures Proc Berkeley Conf in Honor of J Neyman and J Kiefer. 2
Hotelling H. (1939). Tubes and spheres in n-spaces, and a class of statistical problems Amer J Math. 61
Inoue M, Okada M, Park H. (2003). On-line learning theory of soft committee machines with correlated hidden units-Steepest gradient descent and natural gradient descent J Phys Soc Jpn. 72
Inoue M, Okada M, Park H. (2003). On-line learning dynamics of multilayer perceptrons with unidentifiable parameters J Phys A Math Gen. 36
Kang K, Oh JH, Kwon S, Park Y. (1993). Generalization in a two-layer neural network Phys Rev E. 48
Kurkova V, Kainen PC. (1994). Functionally equivalent feedforward neural networks Neural Comput. 6
Liu X, Shao Y. (2003). Asymptotics for likelihood ratio tests under loss of identifiability Annals Of Statistics. 31
Lu H, Chen AM, Hecht-Nielsen R. (1993). On the geometry of feedforward neural network error surfaces Neural Comput. 5
Minsky M. (1969). Perceptrons.
Murata N, Yoshizawa S, Amari S. (1994). Network information criterion-determining the number of hidden units for an artificial neural network model. IEEE transactions on neural networks. 5 [PubMed]
Park H, Amari SI, Fukumizu K. (2000). Adaptive natural gradient learning algorithms for various stochastic models. Neural networks : the official journal of the International Neural Network Society. 13 [PubMed]
Rattray M, Saad D. (1999). Analysis of natural gradient descent for multilayer neural networks Phys Rev E. 59
Riegler P, Biehl M. (1995). On-line backpropagation in two-layered neural networks J Phys A Math Gen. 28
Rissanen J. (1986). Stochastic complexity and modeling Ann Statist. 14
Rosenblatt F. (1962). Principles Of Neurodynamics.
Ruger SM, Ossen A. (1997). The metric of weight space Neural Processing Letters. 5
Rumelhart DE, Hinton GE, Williams RJ. (1986). Learning internal representations by error propagation Parallel Distributed Processing. 1
Saad D, Solla SA. (1995). On-line learning in soft committee machines. Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics. 52 [PubMed]
Schwarz G. (1978). Estimating the dimension of a model Ann Stat. 6
Sussmann HJ. (1992). Uniqueness of the weights for minimal feedforward nets with a given input-output map Neural Netw. 5
Usui S, Hagiwara K, Toda N. (1993). On the problem of applying AIC to determine the structure of a layered feed-forward neural network Proceedings Of IJCNN. 3
Usui S, Hayasaka T, Toda N, Kitahara M. (2000). On the statistical properties of least squares estimators of layered neural networks (in Japanese) IEICE Transactions. 86
Watanabe S. (2001). Algebraic analysis for nonidentifiable learning machines. Neural computation. 13 [PubMed]
Watanabe S. (2001). Algebraic geometrical methods for hierarchical learning machines. Neural networks : the official journal of the International Neural Network Society. 14 [PubMed]
Watanabe S. (2001). Algebraic information geometry for learning machines with singularities Advances in neural information processing systems. 13
Watanabe S, Amari S. (2003). Learning coefficients of layered models when the true distribution mismatches the singularities Neural Comput. 15
Watanabe S, Yamazaki K. (2002). A probabilistic algorithm to calculate the learning curves of hierarchical learning machines with singularities Trans on IEICE. 85
Weyl H. (1939). On the volume of tubes Amer J Math. 61
Wu S, Amari S, Nakahara H. (2002). Population coding and decoding in a neural field: a computational study. Neural computation. 14 [PubMed]
Wu S, Nakahara H, Amari S. (2001). Population coding with correlation and an unfaithful model. Neural computation. 13 [PubMed]
Yamazaki K, Watanabe S. (2003). Singularities in mixture models and upper bounds of stochastic complexity. Neural networks : the official journal of the International Neural Network Society. 16 [PubMed]
Hiratani N, Fukai T. (2018). Redundancy in synaptic connections enables neurons to learn optimally. Proceedings of the National Academy of Sciences of the United States of America. 115 [PubMed]
Nakajima S, Watanabe S. (2007). Variational Bayes solution of linear neural networks and its generalization performance. Neural computation. 19 [PubMed]