Amari S, Nagaoka H. (2000). Methods of information geometry.
Bartlett P, Freund Y, Schapire R, Lee W. (1998). Boosting the margin: A new explanation for the effectiveness of voting methods. Ann Stat. 26
Bartlett PL, Baxter J, Mason L, Frean M. (1999). Boosting algorithms as gradient descent. Advances in neural information processing systems. 11
Bertsekas DP. (1999). Nonlinear programming (2nd ed).
Blanchard G, Schafer C, Rozenholc Y, Muller KR. (2005). Optimal dyadic decision trees (Tech. rep.). Berlin: Fraunhofer FIRST. Available online at http://ida.first.fraunhofer.de/blanchard/publi/index.html.
Breiman L. (1994). Bagging predictors (Tech. rep. 421). Berkeley: Statistics Department, University of California, Berkeley.
Copas J. (1988). Binary regression models for contaminated data. J Royal Statist Soc B. 50
Eguchi S, Murata N, Takenouchi T, Kanamori T. (2004). The most robust loss function for boosting. Neural Information Processing: 11th International Conference.
Freund Y, Schapire R. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Sys Sci. 55
Friedman JH, Breiman L, Olshen RA, Stone CJ. (1984). Classification and regression trees.
Grunwald PD, Dawid AP. (2004). Game theory, maximum entropy, minimum discrepancy, and robust Bayesian decision theory. Ann Stat. 32
Halmos PR. (1974). Measure theory.
Hampel FR, Rousseeuw PJ, Ronchetti EM, Stahel WA. (1986). Robust statistics: The approach based on influence functions.
Jordan MI, Bartlett PL, McAuliffe JD. (2003). Convexity, classification, and risk bounds. Unpublished manuscript.
Kalai A, Servedio RA. (2003). Boosting in the presence of noise. STOC '03: Proceedings of the Thirty-Fifth Annual ACM Symposium on Theory of Computing.
Lafferty J, Lebanon G. (2002). Boosting and maximum likelihood for exponential models. Advances in neural information processing systems. 14
McCullagh P, Nelder JA. (1989). Generalized linear models.
McLachlan G. (1992). Discriminant analysis and statistical pattern recognition.
Meir R, Ratsch G. (2003). An introduction to boosting and leveraging. Advanced lectures on machine learning. Available online: http://www.boosting.org/papers/MeiRae03.ps.gz.
Muller KR, Ratsch G, Onoda T. (2001). Soft margins for AdaBoost. Mach Learn. 42
Murata N, Takenouchi T, Kanamori T, Eguchi S. (2004). Information geometry of U-Boost and Bregman divergence. Neural Computation. 16
Ratsch G. (2001). Robust boosting via convex optimization. Unpublished doctoral dissertation.
Ratsch G, Demiriz A, Bennett K. (2002). Sparse regression ensembles in infinite and finite hypothesis spaces. Mach Learn. 48
Rosset S. (2005). Robust boosting and its relation to bagging. Proc 11th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining.
Scholkopf B et al. (2000). Robust ensemble learning.
Scholkopf B, Smola AJ. (2001). Learning with kernels: Support vector machines, regularization, optimization, and beyond.
Servedio R. (2003). Smooth boosting and learning with malicious noise. J Mach Learn Res. 4
Shawe-Taylor J, Bennett KP, Demiriz A. (2002). Linear programming boosting via column generation. Mach Learn. 46
Takenouchi T, Eguchi S. (2004). Robustifying AdaBoost by adding the naive error rate. Neural Computation. 16
Tibshirani R, Hastie T, Friedman J. (2000). Additive logistic regression: A statistical view of boosting. Ann Stat. 28
Vapnik V. (1998). Statistical Learning Theory.
Vapnik V, Cortes C. (1995). Support-vector networks. Mach Learn. 20
Victoria-Feser MP. (2002). Robust inference with binary data. Psychometrika. 67
Watanabe O, Domingo C. (2000). MadaBoost: A modification of AdaBoost. Proceedings of the 13th Conference on Computational Learning Theory.
van der Vaart A. (1998). Asymptotic statistics.