yzhao062 / combo

(AAAI '20) A Python Toolbox for Machine Learning Model Combination

Home Page: https://pycombo.readthedocs.io

License: BSD 2-Clause "Simplified" License

Python 100.00%

machine-learning data-mining data-science ensemble-learning python model-combination aggregation pipeline-framework machine-learning-pipelines

combo's Issues

predict method is disabled in ClustererEnsemble class

I saw the message "predict function is currently disabled for clustering due to inconsistent behaviours." in the predict method of the ClustererEnsemble class. Could you explain this in more detail, in particular what "inconsistent behaviours" means? Many thanks for your work; I look forward to your reply.

Circular dependency for installation.

There is a circular dependency at installation time between combo, pyod, and suod. Could you please explain why? Are there any plans to remove this cyclic dependency?
Dependency details:

Package name: combo
Required-by: pyod
Package name: pyod
Required-by: combo, suod
Package name: suod
Required-by: pyod
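
The "Required-by" listing above can be read as a small dependency graph. A minimal sketch, assuming only the three relationships reported in this issue (the `requires` dictionary and the `find_cycles` helper are illustrative and not part of combo, pyod, or suod), shows where the cycles sit:

```python
# Dependency graph transcribed from the "Required-by" listing in this issue:
# "X is required by Y" becomes an edge Y -> X.
requires = {
    "combo": ["pyod"],
    "pyod": ["combo", "suod"],
    "suod": ["pyod"],
}


def find_cycles(graph):
    """Return every simple cycle of a small directed graph, each reported once.

    Hypothetical helper for illustration only. It runs a depth-first search
    and anchors each cycle at its alphabetically smallest node so that the
    same cycle is not reported from several starting points.
    """
    cycles = []

    def visit(start, node, path):
        for dep in graph.get(node, []):
            if dep == start:
                # Walked back to the starting package: a cycle is closed.
                cycles.append(path + [start])
            elif dep > start and dep not in path:
                visit(start, dep, path + [dep])

    for start in sorted(graph):
        visit(start, start, [start])
    return cycles


cycles = find_cycles(requires)
for cycle in cycles:
    print(" -> ".join(cycle))
```

Under these assumptions the graph contains two two-package cycles, both passing through pyod, so making pyod's dependency on the other two optional (or the reverse) would be enough to break both.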

Thanks for your work, and a question about the schedule for GPU acceleration

Hi, I'm a first-year postgraduate student at Zhejiang University of Finance & Economics in China, and I appreciate your work in the machine learning domain. When I use the sklearn library, training on large amounts of data takes a long time on CPUs, and GPU computing might improve this. So I'm very excited to learn that the combo library will support GPU computing. I'm looking forward to it. Do you have a development schedule so that I can follow your progress? Thanks very much.

SimpleDetectorAggregator cannot be used for novelty detection

Currently, the implementation of SimpleDetectorAggregator does not allow for novelty detection.

The method _create_scores(self, X) applies standardization computed from the tensor of scores X that it receives. Data evaluated for novelty detection are therefore not transformed in the same way as the data used to fit the SimpleDetectorAggregator, so the threshold defined at fitting cannot be applied to determine whether or not a point is a novelty.

One notable consequence is that running SimpleDetectorAggregator(...).predict(X[0, :]) (novelty detection on a single point) will always output a score of 0.

SimpleDetectorAggregator should instead keep the scalers used when fitting and use them to process the data when creating new scores. This would add support for novelty detection.
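
A minimal sketch of that idea, using only the standard library. The `StoredScaler` class, the `restandardize` function, and the score values are made up for illustration and are not combo's API; the point is only to contrast re-standardizing each incoming batch with reusing statistics saved at fit time:

```python
from statistics import mean, pstdev


class StoredScaler:
    """Z-score scaler that remembers the statistics seen at fit time.

    Hypothetical helper for illustration; not part of combo.
    """

    def fit(self, scores):
        self.mean_ = mean(scores)
        # Fall back to 1.0 when the spread is zero so the division is defined.
        self.std_ = pstdev(scores) or 1.0
        return self

    def transform(self, scores):
        return [(s - self.mean_) / self.std_ for s in scores]


def restandardize(scores):
    """Standardize a batch using only that batch (the behaviour reported above)."""
    return StoredScaler().fit(scores).transform(scores)


# Raw outlier scores from one base detector (made-up numbers).
train_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
new_score = [9.0]

scaler = StoredScaler().fit(train_scores)

refit = restandardize(new_score)      # a batch of one is its own mean
reused = scaler.transform(new_score)  # measured against the training scores

print(refit, reused)
```

With several base detectors, one such scaler per detector would be fitted and stored, and the fitted threshold would then be compared against scores on the same scale as the training data.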

Let me know if I am missing something, as I am new to combo. If help is wanted, I might have the bandwidth to contribute to this cool project!

Plan to integrate Numba

Hi there!

I'm currently using combo for my projects, and it appears to me that the library is quite slow. Is there any plan to add numba, as mentioned in the README, to speed it up?

I have some experience with numba and would be more than happy to help!

Opening a PR in either English or Chinese is fine [FYI]

The combo dev team accepts PRs written in either English or Chinese. However, English is preferred for a broader audience. Other languages may be fine if they are easy to understand, although there is no guarantee that Google Translate will work. Thanks for your understanding :)

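We accept issue reports and PRs in both Chinese and English. If possible, though, please ask in English so that the vast majority of people can understand and benefit. Thanks :)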

combo.models.score_comb.median() doesn't work properly with pd.DataFrames

combo.models.score_comb.median() does not work properly with a pd.DataFrame as input X: it seems to compute the medians over columns rather than rows (as axis=1 works differently for np.ndarray and pd.DataFrame).

As far as I know, average() and maximization() don't have this problem, but I haven't tried aom() or moa() yet.

API for Sequential Combination

Hi @yzhao062 !

I am interested in implementing the sequential combination algorithm listed among the advanced methods in the main figure. Could you please point me to the API, or an example, for implementing this?

Thanks in advance!
-- Shreyas

EAC doesn't support n_clusters=2

Hey there! When trying to use the Evidence Accumulation Clustering class, I get an error stating that the number of clusters passed to the class is not supported, and that it only supports values within (2, 2147483647) (notice the use of parentheses instead of brackets).

It seems that the error comes from calling check_parameter(n_clusters, low=2, param_name='n_clusters') here. The thing is, by default, check_parameter doesn't include the limits passed as parameters, as seen in its documentation. So to get this working we should instead pass include_left=True or low=1.
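
To make the open-versus-closed interval point concrete, here is a minimal sketch. The `in_range` function is a hypothetical, simplified stand-in written for this example; it is not combo's check_parameter and only borrows the `low` and `include_left` names mentioned above (the `high` and `include_right` names are assumptions):

```python
def in_range(value, low, high, include_left=False, include_right=False):
    """Return True if value lies in the interval from low to high.

    Hypothetical stand-in for illustration: both bounds are exclusive
    unless include_left / include_right are set.
    """
    left_ok = value >= low if include_left else value > low
    right_ok = value <= high if include_right else value < high
    return left_ok and right_ok


MAX_INT = 2147483647  # upper limit quoted in the error message

exclusive = in_range(2, low=2, high=MAX_INT)                     # (2, MAX_INT)
inclusive = in_range(2, low=2, high=MAX_INT, include_left=True)  # [2, MAX_INT)
lowered = in_range(2, low=1, high=MAX_INT)                       # (1, MAX_INT)

print(exclusive, inclusive, lowered)
```

Either proposed change admits n_clusters=2. For integer inputs the two are equivalent, while include_left=True keeps the stated lower bound of 2 visible in the call.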

Is this project still being maintained? If so I would be more than happy to create a PR for this if you need me to, but figured I would ask first.

Thank you so much for working on the project!