yzhao062 / combo
(AAAI' 20) A Python Toolbox for Machine Learning Model Combination
Home Page: https://pycombo.readthedocs.io
License: BSD 2-Clause "Simplified" License
I saw the message "predict function is currently disabled for clustering due to inconsistent behaviours." in the predict method of the ClustererEnsemble class. Could you explain this further, and what "inconsistent behaviours" means? Many thanks for your work; I look forward to your reply.
There is a circular installation dependency between combo, pyod, and suod. Could you explain why? Are there any plans to remove this cyclic dependency?
Dependency details :
Hi, I'm a first-year postgraduate student at Zhejiang University of Finance & Economics in China, and I appreciate your work in the machine learning domain. When I use the sklearn library, training on large amounts of data takes a long time on CPUs, and GPU computing might improve this. So I'm very excited to know that the combo library will support GPU computing. I'm looking forward to it. Do you have a development schedule so that I can follow your progress? Thanks very much.
Currently, the implementation of SimpleDetectorAggregator does not allow for novelty detection.
The method _create_scores(self, X) applies standardization based on the score tensor X it receives. The data evaluated for novelty detection are therefore not transformed the same way as the data used to fit the SimpleDetectorAggregator, and thus the threshold defined at fitting cannot be applied to determine whether or not a point is a novelty.
One notable consequence is that running SimpleDetectorAggregator(...).predict(X[0, :]) (novelty detection on a single point) will always output a score of 0.
SimpleDetectorAggregator should instead keep the scalers used when fitting and use them to process the data when creating new scores. This would add support for novelty detection.
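The proposal above can be sketched in a few lines. This is only an illustration of the idea, not combo's implementation: `FittedScoreStandardizer` is a hypothetical helper invented here, and the score values are made up. It shows why restandardizing a single new point yields 0, and how reusing fit-time statistics avoids that.

```python
import numpy as np


class FittedScoreStandardizer:
    """Hypothetical helper (not part of combo): remembers fit-time statistics."""

    def fit(self, scores):
        scores = np.asarray(scores, dtype=float)
        self.mean_ = scores.mean(axis=0)
        std = scores.std(axis=0)
        # Guard against constant columns to avoid division by zero.
        self.std_ = np.where(std == 0, 1.0, std)
        return self

    def transform(self, scores):
        scores = np.atleast_2d(np.asarray(scores, dtype=float))
        return (scores - self.mean_) / self.std_


# Rows are samples, columns are base detectors (illustrative values).
train_scores = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaler = FittedScoreStandardizer().fit(train_scores)

new_point = np.array([[5.0, 50.0]])

# Refitting the statistics on the single new point collapses every score to 0.
refit = FittedScoreStandardizer().fit(new_point).transform(new_point)

# Reusing the fit-time statistics keeps the point comparable to the training data.
reused = scaler.transform(new_point)
combined = reused.mean(axis=1)  # simple average across detectors
```

With the statistics stored at fit time, a threshold learned on the training scores stays meaningful for new points.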
Let me know if I am missing something, as I am new to combo. If some help is wanted, I might have the bandwidth to contribute to this cool project!
Hi there!
I'm currently using combo for my projects, and it appears to me that the library is quite slow. Is there any plan to add numba, as mentioned in the README, to speed it up?
I have some experience with numba and would be more than happy to help!
The combo dev team accepts PRs written in either English or Chinese. However, English is preferred for a broader audience. Other languages may be fine if they are easy to understand, although there is no guarantee that Google Translate will work. Thanks for your understanding :)
We support issue reports and PRs in both Chinese and English. If possible, though, please ask in English so that the vast majority of people can understand and benefit. Thanks :)
combo.models.score_comb.median() only works properly with a pd.DataFrame as input X; otherwise it seems to compute the medians over columns rather than rows (as axis=1 works differently for np.ndarray and pd.DataFrame).
As far as I know, average() and maximization() don't have this problem, but I haven't tried aom() or moa() yet.
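One way to make the behaviour independent of the input type is to convert to a NumPy array first and then take the median explicitly along the detector axis. The sketch below is an assumption about the intended semantics (one median per sample, i.e. per row); `rowwise_median` is a hypothetical name and this is not combo's code.

```python
import numpy as np


def rowwise_median(scores):
    """Hypothetical helper: one median per sample (row) across detectors (columns)."""
    # np.asarray accepts array-likes, so the same code path handles every input type.
    scores = np.asarray(scores, dtype=float)
    if scores.ndim != 2:
        raise ValueError("scores must be 2-D: (n_samples, n_detectors)")
    return np.median(scores, axis=1)


# Two samples scored by three detectors (illustrative values).
scores = np.array([[0.1, 0.9, 0.5],
                   [0.2, 0.4, 0.3]])
per_sample = rowwise_median(scores)
```

The result has one entry per sample, which is the shape a score combination function would be expected to return.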
Hi @yzhao062!
I am interested in implementing the sequential combination algorithm listed under the advanced methods in the main figure. Could you please point me to the API for implementing this, or to an example?
Thanks in advance!
-- Shreyas
Hey there! When trying to use the Evidence Accumulation Clustering class, I get an error stating that the number of clusters passed to the class is not supported, and that it only supports values within (2, 2147483647) (notice the use of parentheses instead of brackets).
The error seems to come from the call check_parameter(n_clusters, low=2, param_name='n_clusters') here. The thing is, by default, check_parameter doesn't include the limits passed as parameters, as seen in its documentation. So to get this working we should instead pass include_left=True or low=1.
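The difference between exclusive and inclusive bounds can be illustrated with a small standalone check. `in_range` below is a hypothetical stand-in written for this example; it only mirrors the behaviour described above (limits excluded by default) and is not the library's check_parameter.

```python
MAX_INT = 2147483647


def in_range(value, low, high, include_left=False, include_right=False):
    """Hypothetical range check: bounds are exclusive unless explicitly included."""
    left_ok = value >= low if include_left else value > low
    right_ok = value <= high if include_right else value < high
    return left_ok and right_ok


# With exclusive bounds, n_clusters=2 is rejected when low=2.
exclusive = in_range(2, low=2, high=MAX_INT)

# Either fix suggested above accepts n_clusters=2.
inclusive = in_range(2, low=2, high=MAX_INT, include_left=True)
lowered = in_range(2, low=1, high=MAX_INT)
```

Both fixes accept the same integer values, since the only integer strictly greater than 1 and not greater than 2 is 2 itself.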
Is this project still being maintained? If so I would be more than happy to create a PR for this if you need me to, but figured I would ask first.
Thank you so much for working on the project!
Hello, first of all I'd like to say that you've done great work with combo. I would like to know whether it implements the AdaBoost algorithm for boosting ensembles. I couldn't find any reference to it in the docs.
Hi, any plans to support regression?
Hi,
Surprisingly, combo fails with unbalanced data, probably because the 'split_datasets' function in the classifier_stacking module can't handle stratified splitting. In real-world tasks, heavily unbalanced data occurs very often.
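For reference, a stratified split partitions the indices class by class so that a rare class appears on both sides. The sketch below assumes nothing about how 'split_datasets' works internally; `stratified_split` is a hypothetical function and the labels are made up.

```python
import numpy as np


def stratified_split(y, test_fraction=0.5, seed=0):
    """Hypothetical sketch: split indices so each class appears on both sides."""
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for label in np.unique(y):
        # Shuffle the indices of this class only.
        idx = rng.permutation(np.flatnonzero(y == label))
        if len(idx) > 1:
            # Keep at least one sample of the class in each part.
            n_test = int(round(len(idx) * test_fraction))
            n_test = min(max(n_test, 1), len(idx) - 1)
        else:
            # A single sample cannot be shared; keep it in the training part.
            n_test = 0
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.sort(train_idx), np.sort(test_idx)


# Heavily unbalanced labels: 18 negatives and 2 positives.
y = np.array([0] * 18 + [1] * 2)
train_idx, test_idx = stratified_split(y, test_fraction=0.5)
```

A purely random split of the same labels can leave one part without any positive sample, which is the kind of failure the report describes.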
The "Example of Outlier Detector Combination" in the README on the home page shows "from combo.models.detector combination import SimpleDetectorAggregator", where "detector combination" should be "detector_comb". Please verify it.
Hi there,
Is there a specific reason why the code here only supports an even number of detectors or buckets? Thanks!
For example, the early stopping parameters of xgb/lgb
AttributeError: 'LGBMRanker' object has no attribute 'fitted_'