remykarem / mixed-naive-bayes

Naive Bayes with support for categorical and continuous data

Home Page: https://mixed-naive-bayes.readthedocs.io

License: MIT License

Python 100.00%
categorical-data machine-learning naive-bayes-algorithm

mixed-naive-bayes's People

Contributors

bharatr21, remykarem

mixed-naive-bayes's Issues

Meaning of feature distributions

Can you please clarify what you mean by "feature distributions"? The scikit-learn documentation for CategoricalNB describes:

discrete features that are categorically distributed. The categories of each feature are drawn from a categorical distribution.
https://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.CategoricalNB.html#sklearn.naive_bayes.CategoricalNB

It seems you mean that the distributions are estimated from the data?
That is, for a given categorical feature, are the probabilities and the conditional probabilities (of values given the target) calculated from the data?

As mentioned in
https://datascience.stackexchange.com/questions/58720/naive-bayes-for-categorical-features-non-binary
Some people recommend using MultinomialNB, which according to me doesn't make sense because it considers feature values to be frequency counts.
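
To make the question concrete, here is a minimal sketch of how a categorical naive Bayes model is usually fitted: the conditional probabilities P(value | class) are simply smoothed relative frequencies counted from the training data. This is not taken from the mixed-naive-bayes source; the helper name estimate_categorical_likelihoods and the alpha smoothing parameter are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def estimate_categorical_likelihoods(feature, target, n_categories, alpha=1.0):
    """Estimate P(feature = k | class = c) from data with Laplace smoothing.

    `feature` holds integer category codes in range(n_categories).
    Hypothetical helper for illustration; not the library's API.
    """
    counts = defaultdict(Counter)
    for x, y in zip(feature, target):
        counts[y][x] += 1

    likelihoods = {}
    for c, counter in counts.items():
        # Add alpha to every category so unseen values keep nonzero probability.
        total = sum(counter.values()) + alpha * n_categories
        likelihoods[c] = [(counter[k] + alpha) / total for k in range(n_categories)]
    return likelihoods


feature = [0, 1, 1, 2, 0, 2, 2]
target = [0, 0, 0, 0, 1, 1, 1]
probs = estimate_categorical_likelihoods(feature, target, n_categories=3)
```

Each row of probs sums to 1 within a class, which is what distinguishes this from MultinomialNB, where feature values are treated as counts rather than category labels.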

Question about code lines 268-278: probabilities p and t calculated by different NB models, and the logic for normalizing them

@remykarem Hi Remy,
Thank you for the fantastic work! I notice that you calculate the probabilities p and t of the response being in a given class using different NB models, and then combine p and t into a final probability in lines 268-278 of the code. The usage of p and t confused me a little: do they need to be weighted here? An extreme case would be one categorical variable and 100+ continuous ones, where p and t would offset the impact of the continuous variables. Also, could you provide more detailed guidance on setting self.priors?
Thank you very much for sharing.
B.R.
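
For reference, a minimal sketch of the standard naive Bayes factorization, under which no extra weighting between the categorical and continuous parts is applied: every feature contributes exactly one log-likelihood term, so 100 continuous features contribute 100 terms and one categorical feature contributes one. This is an illustration of the textbook formula, not a copy of lines 268-278; the function names and argument layout are assumptions.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def joint_log_likelihood(log_prior, cat_log_probs, x_cont, means, variances):
    """log P(c) + sum_j log P(x_j | c) over categorical and Gaussian features.

    `cat_log_probs` holds log P(x_j | c) for each categorical feature of the
    sample; `means` and `variances` are the per-feature Gaussian parameters
    for class c. Hypothetical signature for illustration only.
    """
    total = log_prior + sum(cat_log_probs)
    for x, m, v in zip(x_cont, means, variances):
        total += gaussian_log_pdf(x, m, v)
    return total


# One class, one categorical feature and two continuous features.
jll = joint_log_likelihood(
    log_prior=math.log(0.5),
    cat_log_probs=[math.log(0.25)],
    x_cont=[0.0, 1.0],
    means=[0.0, 1.0],
    variances=[1.0, 1.0],
)
```

The prior enters once as log P(c); if no priors are supplied, a common default is the class frequencies in the training data.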

predict_proba bug

Hello!

I hope you are doing well.
I'm learning Data Science and found your Mixed Naive Bayes implementation on pip. I tested the model and everything worked fine, until I noticed that the method predict_proba is not working correctly. The probabilities for the different categories do not add up to 1.
I've discussed this situation with my colleagues and they all ran into the same problem. Do you know if it is an error in the implementation of that method?
I am leaving attachments to show you the bug, and I also sent you an email with the Jupyter Notebook and the dataset in case you want to see in detail what I am talking about.

[Attached screenshots: Model, Bug]

Thank you for the effort you put into building this model! If you happen to have a solution and/or explanation for the problem I ran into, I would really appreciate it.

Good luck!
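
As a point of comparison, here is a minimal sketch of how per-class joint log-likelihoods are commonly turned into probabilities that sum to 1, using the log-sum-exp trick for numerical stability. It does not reproduce the library's predict_proba; the function name normalize_log_scores is an assumption.

```python
import math


def normalize_log_scores(joint_log_likelihoods):
    """Convert per-class joint log-likelihoods into probabilities summing to 1."""
    top = max(joint_log_likelihoods)
    # Subtract the maximum before exponentiating to avoid underflow to zero.
    log_norm = top + math.log(sum(math.exp(v - top) for v in joint_log_likelihoods))
    return [math.exp(v - log_norm) for v in joint_log_likelihoods]


proba = normalize_log_scores([-1050.0, -1052.0, -1049.5])
```

A quick check on any implementation is to assert that each row of the predict_proba output sums to 1 within floating-point tolerance.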

Integrating GMM

Hi @remykarem, thanks for this library.

Do you anticipate integrating mixture models rather than single Gaussians per feature? This has a lightweight implementation.
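
To illustrate what the request would change, here is a minimal sketch of scoring a value under a one-dimensional Gaussian mixture instead of a single Gaussian. Only the density evaluation is shown; fitting the weights, means and variances per class and per feature (for example by EM) is left out. The function names are assumptions and nothing here comes from the library.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def mixture_log_pdf(x, weights, means, variances):
    """Log density of a 1-D Gaussian mixture, computed with log-sum-exp.

    `weights` must be positive and sum to 1. Hypothetical helper.
    """
    terms = [
        math.log(w) + gaussian_log_pdf(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# A bimodal class-conditional density that a single Gaussian would fit poorly.
log_density = mixture_log_pdf(0.0, weights=[0.5, 0.5], means=[-2.0, 2.0], variances=[1.0, 1.0])
```

In a naive Bayes setting, this log density would replace the single-Gaussian term for the corresponding feature and class.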

Sklearn interface

Hello,

I have one suggestion/request. When designing a scikit-like interface, it would be extremely beneficial to implement a scikit-compliant classifier by inheriting from the scikit base classes.

You can find how to do it here: https://scikit-learn.org/stable/developers/develop.html

The reason for this request is that sklearn sometimes demands various properties (like the classes_ property), which you can get for free by inheriting from their base classes.

Unfortunately, right now the model is not suitable for things like sklearn ensembling.
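
A minimal sketch of the pattern described in the scikit-learn developer guide, using a deliberately trivial classifier rather than the library's model. PriorOnlyClassifier and its alpha parameter are hypothetical. Note that the base classes supply get_params, set_params and score, while classes_ is still set explicitly in fit.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class PriorOnlyClassifier(ClassifierMixin, BaseEstimator):
    """Toy classifier that ignores X and predicts from class frequencies."""

    def __init__(self, alpha=1.0):
        # Store constructor arguments unchanged so get_params and clone work.
        self.alpha = alpha

    def fit(self, X, y):
        X, y = check_X_y(X, y)
        # Fitted attributes end with an underscore by convention.
        self.classes_, counts = np.unique(y, return_counts=True)
        smoothed = counts + self.alpha
        self.class_prior_ = smoothed / smoothed.sum()
        return self

    def predict_proba(self, X):
        check_is_fitted(self)
        X = check_array(X)
        return np.tile(self.class_prior_, (X.shape[0], 1))

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]


X = np.zeros((6, 2))
y = np.array([0, 0, 0, 0, 1, 1])
clf = PriorOnlyClassifier().fit(X, y)
params = clone(clf).get_params()
```

Because the estimator can be cloned and exposes classes_, it can be passed to scikit-learn utilities that copy and refit estimators, which is what ensembling relies on.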

Bug report - priors validation

Hello,

I found your lib on StackOverflow and think it is awesome. There is just a little bug which makes it unusable when trying to play with the priors param.

When you validate that the priors sum to 1, you actually raise an error when they are valid.

Please see: https://github.com/remykarem/mixed-naive-bayes/blob/master/mixed_naive_bayes/mixed_naive_bayes.py#L156

Line 156 should actually be:

if not np.isclose(self.priors.sum(), 1.0):

May I ask you to fix it?

Anyway thanks for your time to put this lib together! It is really helpful :-)

David
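
A minimal standalone sketch of the check proposed above, written as a free function rather than as the library's method. The name validate_priors and the extra shape and sign checks are assumptions; only the np.isclose comparison comes from the report. Using a tolerance matters because floating-point priors such as ten values of 0.1 may not sum to exactly 1.0.

```python
import numpy as np


def validate_priors(priors, n_classes):
    """Return priors as a float array, raising ValueError if they are invalid."""
    priors = np.asarray(priors, dtype=float)
    if priors.shape != (n_classes,):
        raise ValueError("priors must contain one value per class")
    if np.any(priors < 0):
        raise ValueError("priors must be non-negative")
    # Raise only when the sum is NOT close to 1, as suggested in the report.
    if not np.isclose(priors.sum(), 1.0):
        raise ValueError("priors must sum to 1")
    return priors


valid = validate_priors([0.1] * 10, n_classes=10)
```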
