Anthony M, Bartlett PL. (1999). Neural network learning: Theoretical foundations.
Aronszajn N. (1950). Theory of reproducing kernels. Transactions of the American Mathematical Society, 68.
Barron AR. (1990). Complexity regularization with applications to artificial neural networks. Nonparametric Functional Estimation.
Bartlett PL. (1998). The sample complexity of pattern classification with neural networks: The size of the weights is more important than the size of the network. IEEE Trans Inform Theory, 44.
Bousquet O, Blanchard G, Massart P. (2004). Statistical performance of support vector machines. Unpublished manuscript.
Bousquet O, Elisseeff A. (2002). Stability and generalization. J Mach Learn Res, 2.
Chen DR, Wu Q, Ying Y, Zhou DX. (2004). Support vector machine soft margin classifiers: Error analysis. J Mach Learn Res, 5.
Devroye L, Györfi L, Lugosi G. (1996). A probabilistic theory of pattern recognition.
Girosi F, Niyogi P. (1996). On the relationship between generalization error, hypothesis complexity, and sample complexity for radial basis functions. Neural Comput, 8.
Jordan MI, Bartlett PL, McAuliffe JD. (2003). Convexity, classification, and risk bounds. Unpublished manuscript.
Kecman V, Hadzic I. (2000). Support vector selection by linear programming. Proc IJCNN, 5.
Lugosi G, Vayatis N. (2004). On the Bayes-risk consistency of regularized boosting methods. Ann Stat, 32.
Mangasarian OL, Bradley PS. (2000). Massive data discrimination via linear support vector machines. Optimization Methods and Software, 13.
Mendelson S. (2002). Improving the sample complexity using global data. IEEE Trans Inform Theory, 48.
Murata N, Pedroso JP. (2001). Support vector machines with different norms: Motivation, formulations and results. Pattern Recognition Letters, 22.
Niyogi P. (1998). The informational complexity of learning.
Poggio T, Evgeniou T, Pontil M. (2000). Regularization networks and support vector machines. Adv Comp Math, 13.
Poggio T, Rifkin R, Mukherjee S. (2002). Regression and classification with regularization. Nonlinear Estimation and Classification.
Pontil M. (2003). A note on different covering numbers in learning theory. J Complexity, 19.
Rosasco L, De Vito E, Caponnetto A, Piana M, Verri A. (2004). Are loss functions all the same? Neural Computation, 16.
Shawe-Taylor J, Cristianini N. (2000). An introduction to support vector machines.
Cucker F, Smale S. (2001). On the mathematical foundations of learning. Bull Amer Math Soc, 39.
Smale S, Zhou DX. (2003). Estimating the approximation error in learning theory. Anal Appl, 1.
Smale S, Zhou DX. (2004). Shannon sampling and function reconstruction from point values. Bull Amer Math Soc, 41.
Steinwart I. (2002). Support vector machines are universally consistent. J Complexity, 18.
Steinwart I, Scovel C. (2005). Fast rates for support vector machines. Learning Theory.
Tsybakov AB. (2004). Optimal aggregation of classifiers in statistical learning. Ann Stat, 32.
Vapnik V. (1998). Statistical Learning Theory.
Cortes C, Vapnik V. (1995). Support-vector networks. Mach Learn, 20.
Boser BE, Guyon I, Vapnik V. (1992). A training algorithm for optimal margin classifiers. Proceedings of the Fifth Annual Workshop on Computational Learning Theory, 5.
Wahba G. (1990). Spline models for observational data.
Wu Q, Ying Y, Zhou DX. (2007). Multi-kernel regularized classifiers. Journal of Complexity, 23.
Wu Q, Zhou DX. (2004). Analysis of support vector machine classification. Manuscript submitted for publication.
Zhang T. (2002). Covering number bounds of certain regularized linear function classes. J Mach Learn Res, 2.
Zhang T. (2004). Statistical behavior and consistency of classification methods based on convex risk minimization. Ann Stat, 32.
Zhou DX. (2002). The covering number in learning theory. J Complexity, 18.
Zhou DX. (2003). Capacity of reproducing kernel spaces in learning theory. IEEE Trans Inform Theory, 49.
van der Vaart AW, Wellner JA. (1996). Weak convergence and empirical processes.