Coder Social home page Coder Social logo

skboost's Introduction

skboost

Build Status

Boosting Algorithms compatible with scikit-learn.

Boosting Algorithms

The skboost package contains implementations of some boosting algorithms that are outside the scope of scikit-learn.

The main point of interest is the MILBoost algorithm, which performs boosting with a Multiple Instance Learning formulation.

MILBoost

See [1], [2] and [4].

GentleBoost

See [3].

LogitBoost

See [3].

Datasets

This repository includes a vendored copy of the MUSK datasets ([5]), both version 1 and version 2. These are used for multiple instance learning benchmarks:

This dataset describes a set of 92 molecules of which 47 are judged by human experts to be musks and the remaining 45 molecules are judged to be non-musks. The goal is to learn to predict whether new molecules will be musks or non-musks. However, the 166 features that describe these molecules depend upon the exact shape, or conformation, of the molecule. Because bonds can rotate, a single molecule can adopt many different shapes. To generate this data set, the low-energy conformations of the molecules were generated and then filtered to remove highly similar conformations. This left 476 conformations. Then, a feature vector was extracted that describes each conformation.

This many-to-one relationship between feature vectors and molecules is called the "multiple instance problem". When learning a classifier for this data, the classifier should classify a molecule as "musk" if ANY of its conformations is classified as a musk. A molecule should be classified as "non-musk" if NONE of its conformations is classified as a musk.

References

[1] B. Babenko, P. Dollar, Z. Tu, and S. Belongie. Simultaneous learning and alignment: Multi-instance and multi-pose learning. In Faces in Real-Life Images, October 2008.

[2] Babenko, B.; Ming-Hsuan Yang; Belongie, S., "Robust Object Tracking with Online Multiple Instance Learning," in Pattern Analysis and Machine Intelligence, IEEE Transactions on , vol.33, no.8, pp.1619-1632, Aug. 2011 doi: 10.1109/TPAMI.2010.226

[3] Friedman, Jerome, Hastie, Trevor, Tibshirani, Robert & others (2000). Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors). The annals of statistics, 28, 337-407.

[4] Paul Viola, John C. Platt, and Cha Zhang. Multiple instance boosting for object detection. In In NIPS 18, pages 1419–1426. MIT Press, 2006.

[5] Lichman, M. (2013). UCI Machine Learning Repository http://archive.ics.uci.edu/ml. Irvine, CA: University of California, School of Information and Computer Science.

Other references

[6] C.M. Bishop. Pattern recognition and machine learning. Information science and statistics. Springer, 2006.

[7] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, March 2004.

[8] Thomas G. Dietterich and Richard H. Lathrop. Solving the multiple- instance problem with axis-parallel rectangles. Artificial Intelligence, 89:31–71, 1997.

[9] Yoav Freund and Robert E. Schapire. A short introduction to boosting, 1999.

[10] Jerome H. Friedman. Stochastic gradient boosting. Computational Statistics and Data Analysis, 38:367–378, 1999.

[11] Jerome H. Friedman. Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29:1189–1232, 2000.

[12] James D. Keeler, David E. Rumelhart, and Wee Kheng Leow. Integrated segmentation and recognition of hand-printed numerals. In NIPS’90, pages 557–563, 1990.

[13] Llew Mason, Jonathan Baxter, Peter Bartlett, and Marcus Frean. Boosting algorithms as gradient descent in function space, 1999.

[14] William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. Numerical Recipes 3rd Edition: The Art of Sci- entific Computing. Cambridge University Press, New York, NY, USA, 3 edition, 2007.

[15] Vladimir N. Vapnik. The nature of statistical learning theory. Springer- Verlag New York, Inc., New York, NY, USA, 1995.

[16] Paul Viola and Michael Jones. Robust real-time object detection. International Journal of Computer Vision, 57(2):137–154, 2002.

skboost's People

Contributors

hbldh avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.