Andrews D, Mallows C. (1974). Scale mixtures of normal distributions. J Royal Stat Soc. 36
Atick JJ, Redlich AN. (1992). What does the Retina Know about Natural Scenes? Neural Comput. 4
Bell AJ, Sejnowski TJ. (1995). An information-maximization approach to blind separation and blind deconvolution. Neural computation. 7 [PubMed]
Bell AJ, Sejnowski TJ. (1997). The "independent components" of natural scenes are edge filters. Vision research. 37 [PubMed]
Haussler D, Freund Y. (1992). Unsupervised learning of distributions of binary vectors using 2-layer networks. Advances in neural information processing systems.
Heskes T. (1998). Selecting weighting factors in logarithmic opinion pools. Advances in neural information processing systems.
Hinton G, Carreira-Perpinan M. (2005). On contrastive divergence learning. Proceedings of the Society for Artificial Intelligence and Statistics.
Hinton G, Teh Y. (2001). Discovering multiple constraints that are frequently approximately satisfied. Proceedings of the Conference on Uncertainty in Artificial Intelligence.
Hinton G, Teh Y, Welling M, Osindero S. (2003). Energy-based models for sparse overcomplete representations. Journal Of Machine Learning Res [special Issue]. 4
Hinton G, Welling M, Osindero S. (2002). Learning sparse topographic representations with products of Student-t distributions. Advances in neural information processing systems. 15
Hinton G, Zemel R, Welling M. (2002). Self-supervised boosting. Advances in neural information processing systems. 15
Hinton G, Zemel R, Welling M. (2003). A tractable probabilistic model for projection pursuit. Proceedings of the Conference on Uncertainty in Artificial Intelligence.
Hinton GE. (2002). Training products of experts by minimizing contrastive divergence. Neural computation. 14 [PubMed]
Hoyer PO, Hyvärinen A. (2000). Independent component analysis applied to feature extraction from colour and stereo images. Network (Bristol, England). 11 [PubMed]
Hyvärinen A, Hoyer PO. (2001). A two-layer sparse coding model learns simple and complex cell receptive fields and topography from natural images. Vision research. 41 [PubMed]
Hyvärinen A, Hoyer PO, Inki M. (2001). Topographic independent component analysis. Neural computation. 13 [PubMed]
Karklin Y, Lewicki MS. (2003). Learning higher-order structures in natural images. Network (Bristol, England). 14 [PubMed]
Karklin Y, Lewicki MS. (2005). A hierarchical Bayesian model for learning nonlinear statistical regularities in nonstationary natural signals. Neural computation. 17 [PubMed]
Lewicki MS, Sejnowski TJ. (2000). Learning overcomplete representations. Neural computation. 12 [PubMed]
Marks TK, Movellan JR. (2001). Diffusion networks, products of experts, and factor analysis. Tech. Rep. UCSD MPLab TR 2001.02.
Olshausen BA, Field DJ. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature. 381 [PubMed]
Olshausen BA, Field DJ. (1997). Sparse coding with an overcomplete basis set: a strategy employed by V1? Vision research. 37 [PubMed]
Portilla J, Strela V, Wainwright MJ, Simoncelli EP. (2003). Image denoising using scale mixtures of Gaussians in the wavelet domain. IEEE transactions on image processing : a publication of the IEEE Signal Processing Society. 12 [PubMed]
Ringach DL. (2002). Spatial structure and symmetry of simple-cell receptive fields in macaque primary visual cortex. Journal of neurophysiology. 88 [PubMed]
Simoncelli E. (1997). Statistical models for images: Compression, restoration and synthesis. Proceedings of the 31st Asilomar Conference on Signals, Systems and Computers.
Simoncelli EP, Wainwright MJ. (2000). Scale mixtures of Gaussians and the statistics of natural images. Advances in neural information processing systems. 12
Simoncelli EP, Wainwright MJ, Willsky AS. (2000). Random cascades of Gaussian scale mixtures and their use in modeling natural images with application to denoising. Proceedings of the 7th International Conference on Image Processing.
Smolensky P. (1986). Information processing in dynamical systems: Foundations of harmony theory. Parallel distributed processing: Explorations in the microstructure of cognition. 1
Welling M, Zemel RS, Hinton GE. (2004). Probabilistic sequential independent components analysis. IEEE transactions on neural networks. 15 [PubMed]
Williams C, Agakov F, Felderhof S. (2001). Products of Gaussians. Advances in neural information processing systems. 14
Williams C, Welling M, Agakov F. (2003). Extreme components analysis. Advances in neural information processing systems. 16
Williams CKI, Agakov F. (2002). An analysis of contrastive divergence learning in Gaussian Boltzmann machines. Tech. Rep. EDI-INF-RR-0120.
Willmore B, Tolhurst DJ. (2001). Characterizing the sparseness of neural codes. Network (Bristol, England). 12 [PubMed]
Wu YN, Mumford D, Zhu SC. (1998). Filters, random fields and maximum entropy (FRAME): Towards a unified theory for texture modeling. International Journal Of Computer Vision. 27
Yuille A. (2004). A comment on contrastive divergence. Tech Rep.
van Hateren JH, van der Schaaf A. (1998). Independent component filters of natural images compared with simple cells in primary visual cortex. Proceedings. Biological sciences. 265 [PubMed]
Schwartz O, Sejnowski TJ, Dayan P. (2006). Soft mixer assignment in a hierarchical generative model of natural scene statistics. Neural computation. 18 [PubMed]
Turner R, Sahani M. (2007). A maximum-likelihood interpretation for slow feature analysis. Neural computation. 19 [PubMed]