
yzhao062 / combo


(AAAI '20) A Python Toolbox for Machine Learning Model Combination

Home Page: https://pycombo.readthedocs.io

License: BSD 2-Clause "Simplified" License

Python 100.00%
machine-learning data-mining data-science ensemble-learning python model-combination aggregation pipeline-framework machine-learning-pipelines

combo's People

Contributors

yzhao062


combo's Issues

predict method is disabled in ClustererEnsemble class

I saw the message "predict function is currently disabled for clustering due to inconsistent behaviours." in the predict method of the ClustererEnsemble class. Could you explain what "inconsistent behaviours" means here? Many thanks for your work; I look forward to your reply.

Circular dependency for installation.

There is a circular installation dependency between combo, pyod, and suod. Can you explain why it exists? Are there any plans to remove this cycle?
Dependency details:

  1. Package name: combo
    Required-by: pyod
  2. Package name: pyod
    Required-by: combo, suod
  3. Package name: suod
    Required-by: pyod
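
The cycle in the listing above can be checked mechanically. The following is a minimal sketch, assuming only the three required-by entries reported in this issue; `build_dependency_graph` and `find_cycle` are hypothetical helper names, not part of combo, pyod, suod, or pip.

```python
from typing import Dict, List, Optional

# "X is required by Y" in the report means that Y depends on X.
REQUIRED_BY: Dict[str, List[str]] = {
    "combo": ["pyod"],
    "pyod": ["combo", "suod"],
    "suod": ["pyod"],
}


def build_dependency_graph(required_by: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert a required-by mapping into package -> packages it depends on."""
    depends_on: Dict[str, List[str]] = {name: [] for name in required_by}
    for package, dependents in required_by.items():
        for dependent in dependents:
            depends_on.setdefault(dependent, []).append(package)
    return depends_on


def find_cycle(graph: Dict[str, List[str]], start: str) -> Optional[List[str]]:
    """Return one dependency cycle reachable from start, or None."""
    path: List[str] = []
    on_path = set()
    visited = set()

    def visit(node: str) -> Optional[List[str]]:
        if node in on_path:
            # Found a back edge: slice the current path into the cycle.
            return path[path.index(node):] + [node]
        if node in visited:
            return None
        visited.add(node)
        path.append(node)
        on_path.add(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle is not None:
                return cycle
        path.pop()
        on_path.discard(node)
        return None

    return visit(start)


graph = build_dependency_graph(REQUIRED_BY)
cycle = find_cycle(graph, "combo")
print(" -> ".join(cycle) if cycle else "no cycle")  # combo -> pyod -> combo
```

Starting from suod instead reports the same combo/pyod loop, since suod itself depends on pyod.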

Thanks for your work, and a question about the schedule for GPU acceleration

Hi, I am a first-year postgraduate student at Zhejiang University of Finance & Economics in China, and I appreciate your work in the machine learning domain. When I use the sklearn library, training on large amounts of data takes a long time on CPUs, and GPU computing might improve this. So I am very excited to learn that the combo library will support GPU computing. I am looking forward to it. Do you have a development schedule so that I can follow your progress? Thanks very much.

SimpleDetectorAggregator can not be used for novelty detection

Currently, the implementation of SimpleDetectorAggregator does not support novelty detection.

The method _create_scores(self, X) standardizes the scores using statistics computed from the tensor of scores X itself. The data evaluated for novelty detection are therefore not transformed the same way as the data used to fit the SimpleDetectorAggregator, and the threshold defined at fitting cannot be applied to determine whether or not a point is a novelty.

One notable consequence is that running SimpleDetectorAggregator(...).predict(X[0, :]) (novelty detection on a single point) will always output a score of 0.

SimpleDetectorAggregator should instead keep the scalers used when fitting and use them to process the data when creating new scores. This would add support for novelty detection.
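
To illustrate the proposal, here is a minimal sketch of the difference between re-standardizing new scores on their own and reusing statistics saved at fit time. `FittedStandardizer` is a hypothetical helper written for this example, and the score values are made up; it is not combo's implementation of _create_scores.

```python
import numpy as np


class FittedStandardizer:
    """Hypothetical helper: z-score columns using statistics saved at fit time."""

    def fit(self, scores):
        scores = np.asarray(scores, dtype=float)
        self.mean_ = scores.mean(axis=0)
        std = scores.std(axis=0)
        # Guard against zero variance so constant columns map to 0, not NaN.
        self.scale_ = np.where(std == 0.0, 1.0, std)
        return self

    def transform(self, scores):
        scores = np.atleast_2d(np.asarray(scores, dtype=float))
        return (scores - self.mean_) / self.scale_


# Rows are samples, columns are the raw scores of two base detectors.
train_scores = np.array([[0.1, 10.0], [0.2, 12.0], [0.3, 14.0], [0.2, 12.0]])
new_point = np.array([[0.9, 30.0]])

# Re-standardizing the new point on its own: each column holds a single
# value, which equals its own mean, so the result is all zeros.
refit = FittedStandardizer().fit(new_point).transform(new_point)

# Reusing the training statistics keeps the point on the training scale,
# so a threshold chosen at fit time remains meaningful.
reused = FittedStandardizer().fit(train_scores).transform(new_point)

print(refit)
print(reused)
```

In this sketch the single point collapses to zero when it is standardized against itself, which matches the symptom described above, while the reused statistics place it far above the training scores.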

Let me know if I am missing something, as I am new to combo. If help is wanted, I might have bandwidth to contribute to this cool project!

Plan to integrate Numba

Hi there!

I'm currently using combo for my projects, and it appears to me that the library is quite slow. Is there any plan to add numba, as mentioned in the README, to speed it up?

I have some experience with numba and will be more than happy to help!

Opening a PR in either English or Chinese is fine [FYI]

The combo dev team accepts PRs written in either English or Chinese. However, English is preferred so that a broader audience can follow. Other languages may be fine if they are easy to understand, although there is no guarantee that Google Translate will work. Thanks for your understanding :)

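We support issue reports and PRs in both Chinese and English. If possible, though, please ask in English, so that the vast majority of people can understand and benefit. Thanks :)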

combo.models.score_comb.median() doesn't work properly with pd.DataFrame's

combo.models.score_comb.median() does not work properly with a pd.DataFrame as input X: it seems to compute the medians over columns rather than rows (as axis=1 works differently for np.ndarray and pd.DataFrame).

As far as I know, average() and maximization() do not have this problem, but I have not tried aom() or moa() yet.
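
One defensive pattern for this kind of report is to convert the input to a plain ndarray before reducing. This is a minimal sketch of that pattern. `row_median` is a hypothetical function written for illustration, not combo's median(), and the root cause in combo has not been confirmed here.

```python
import numpy as np


def row_median(scores):
    """Hypothetical combiner: median across estimators (columns) per sample."""
    # np.asarray accepts array-likes, including objects that implement
    # __array__, and returns a plain ndarray with NumPy axis semantics.
    scores = np.asarray(scores, dtype=float)
    if scores.ndim != 2:
        raise ValueError("scores must be 2-D: (n_samples, n_estimators)")
    return np.median(scores, axis=1)


scores = [[1.0, 2.0, 9.0],
          [4.0, 5.0, 6.0]]
print(row_median(scores))  # one value per row: [2. 5.]
```

A caller holding a DataFrame can also pass `X.to_numpy()` explicitly, which sidesteps any difference in axis handling.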

API for Sequential Combination

Hi @yzhao062 !

I am interested in implementing the sequential combination algorithm listed among the advanced methods in the main figure. Could you please point me to the API, or an example, for implementing this?

Thanks in advance!
-- Shreyas

EAC doesn't support n_clusters=2

Hey there! When trying to use the Evidence Accumulation Clustering class with n_clusters=2, I get an error stating that the number of clusters passed to the class is not supported, and that it only supports values within (2, 2147483647) (notice the parentheses instead of brackets, meaning the bounds are excluded).

It seems that the error comes from the call check_parameter(n_clusters, low=2, param_name='n_clusters'). By default, check_parameter does not include the limits passed as parameters, as stated in its documentation. So to get this working we should instead pass include_left=True or low=1.
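
The boundary behaviour described above can be sketched with a small stand-in. `in_range` is a hypothetical function written for this example; only the names `low` and `include_left` come from the report, while `high` and `include_right` are assumptions, and the real check_parameter may differ in signature and behaviour.

```python
def in_range(value, low, high, include_left=False, include_right=False):
    """Hypothetical stand-in for a bounds check with optional inclusive ends."""
    left_ok = value >= low if include_left else value > low
    right_ok = value <= high if include_right else value < high
    return left_ok and right_ok


MAX_INT = 2147483647

print(in_range(2, low=2, high=MAX_INT))                     # False
print(in_range(2, low=2, high=MAX_INT, include_left=True))  # True
print(in_range(2, low=1, high=MAX_INT))                     # True
```

Both proposed fixes accept n_clusters=2 in this sketch; include_left=True keeps the stated lower bound of 2 readable in the error message, whereas low=1 relies on the exclusive bound.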

Is this project still being maintained? If so I would be more than happy to create a PR for this if you need me to, but figured I would ask first.

Thank you so much for working on the project!

Question about boosting

Hello, first of all I'd like to say that you've done great work with combo. I would like to know whether it implements the AdaBoost algorithm for boosting ensembles. I couldn't find any reference to this in the docs.

issue with heavily unbalanced data

Hi,
Surprisingly, combo fails with unbalanced data, probably because the 'split_datasets' function in the classifier_stacking module cannot do stratified splitting. In real-world tasks, heavily unbalanced data occurs very often.
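
For reference, this is a minimal sketch of what stratified splitting means, using only the standard library. `stratified_folds` is a hypothetical helper written for this example and is not combo's split_datasets; scikit-learn's StratifiedKFold offers a maintained implementation of the same idea.

```python
from collections import defaultdict
import random


def stratified_folds(labels, n_folds, seed=0):
    """Hypothetical helper: assign sample indices to folds, class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# 96 negatives and 4 positives: a heavily unbalanced label vector.
labels = [0] * 96 + [1] * 4
folds = stratified_folds(labels, n_folds=4)
for fold in folds:
    positives = sum(labels[i] for i in fold)
    print(len(fold), positives)  # 25 samples, 1 positive in each fold
```

With a purely random split of the same data, some folds could end up with no positive samples at all, which is one plausible way a stacking step fails on unbalanced data.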
