Aronszajn N. (1950). Theory of reproducing kernels. Trans. Amer. Math. Soc. 68.
Haussler D, Ben-David S, Alon N, Cesa-Bianchi N. (1993). Scale-sensitive dimensions, uniform convergence, and learnability. Proceedings of the 34th Annual IEEE Symposium on Foundations of Computer Science.
Lugosi G, Vayatis N. (2004). On the Bayes-risk consistency of regularized boosting methods. Ann. Stat. 32.
Poggio T, Evgeniou T, Pontil M. (2000). Regularization networks and support vector machines. Adv. Comput. Math. 13.
Poggio T, Girosi F, Jones M. (1995). Regularization theory and neural network architectures. Neural Comput. 7.
Pontil M. (2003). A note on different covering numbers in learning theory. J. Complexity 19.
Rockafellar RT. (1970). Convex Analysis.
Rosasco L, Caponnetto A, Piana M, De Vito E. (2003). Notes on the use of different loss functions. Tech. Rep. No. DISI-TR-03-07.
Shawe-Taylor J, Cristianini N. (2000). An Introduction to Support Vector Machines.
Smale S, Cucker F. (2001). On the mathematical foundations of learning. Bull. Amer. Math. Soc. 39.
Smale S, Cucker F. (2002). Best choices for regularization parameters in learning theory: On the bias-variance problem. Found. Comput. Math. 2.
Tibshirani R, Hastie T, Friedman J. (2001). The Elements of Statistical Learning.
Vapnik V. (1995). The Nature of Statistical Learning Theory.
Vapnik V. (1998). Statistical Learning Theory.
Wahba G. (1990). Spline Models for Observational Data.
Zhang H, Lee Y, Lin Y, Wahba G. (2003). Statistical properties and adaptive tuning of support vector machines. Mach. Learn. 48.
Zhang T. (2004). Statistical behavior and consistency of classification methods based on convex risk minimization. Ann. Stat. 32.
Zhou DX. (2002). The covering number in learning theory. J. Complexity 18.
Wu Q, Zhou DX. (2005). SVM soft margin classifiers: Linear programming versus quadratic programming. Neural Comput. 17.