
pycorels's Introduction

Pycorels


Welcome to the Python binding of the Certifiably Optimal RulE ListS (CORELS) algorithm!

Overview

CORELS (Certifiably Optimal RulE ListS) is a custom discrete optimization technique for building rule lists over a categorical feature space. Using algorithmic bounds and efficient data structures, our approach produces optimal rule lists on practical problems in seconds.

The CORELS pipeline is simple. Given a dataset matrix of size n_samples x n_features and a labels vector of size n_samples, it will compute a rulelist (similar to a series of if-then statements) to predict the labels with the highest accuracy.
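To make the "series of if-then statements" idea concrete, here is a minimal sketch in plain Python of how an ordered rule list could be applied to binary samples. It is not the CORELS implementation: the `predict_rule_list` helper and the example rules are hypothetical and were written by hand only for illustration.

```python
# Illustrative sketch only: applies a hand-written rule list to binary samples.
# `predict_rule_list` and the rules below are hypothetical, not part of corels.

def predict_rule_list(rules, default, sample):
    """Return the prediction of the first rule whose features are all 1."""
    for feature_indices, prediction in rules:
        if all(sample[i] == 1 for i in feature_indices):
            return prediction
    # No rule matched, so fall through to the default prediction.
    return default

# Each rule: (indices of features that must all be 1, predicted label)
rules = [
    ((0, 2), 1),  # if feature 0 and feature 2 then predict 1
    ((1,), 0),    # else if feature 1 then predict 0
]
default = 0       # else predict 0

# 4 samples, 3 binary features
X = [[1, 0, 1], [0, 0, 0], [1, 1, 0], [0, 1, 0]]
predictions = [predict_rule_list(rules, default, row) for row in X]
print(predictions)
```

The order of the rules matters: a sample is labeled by the first rule that captures it, and only samples captured by no rule receive the default prediction.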

Here's an example: (image not available)

More information about the algorithm can be found here

Dependencies

CORELS uses Python, NumPy, and GMP. GMP (the GNU Multiple Precision library) is optional but highly recommended because it improves performance. Without it, CORELS still works but runs slower.

Installation

CORELS is available on PyPI and can be installed with pip install corels.

To install from this repo, simply run pip install . or python setup.py install from the corels/ directory.

Here are some detailed examples of how to install all the dependencies needed, followed by corels itself:

Ubuntu

sudo apt install libgmp-dev
pip install corels

Mac

# Install Homebrew (skip this step if it is already installed), then a C++ compiler and gmp
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install gcc gmp

pip install corels

Windows

Note: Python 2 is currently NOT supported on Windows.

pip install corels

Troubleshooting

  • If you see an error saying that Python version >=3.5 is required, run pip install numpy first and then run pip install corels again.
  • If pip does not successfully install corels, try using pip3.

Documentation

The docs for this package are hosted here: https://pycorels.readthedocs.io/

Tests

After installing corels, run pytest (you may have to install it with pip install pytest first) from the tests/ folder, where the tests are located.

Examples

Large dataset, loaded from this file

from corels import *

# Load the dataset
X, y, _, _ = load_from_csv("data/compas.csv")

# Create the model, with 10000 as the maximum number of iterations 
c = CorelsClassifier(n_iter=10000)

# Fit, and score the model on the training set
a = c.fit(X, y).score(X, y)

# Print the model's accuracy on the training set
print(a)

Toy dataset (See picture example above)

from corels import CorelsClassifier

# ["loud", "samples"] is the most verbose setting possible
C = CorelsClassifier(max_card=2, c=0.0, verbosity=["loud", "samples"])

# 4 samples, 3 features
X = [[1, 0, 1], [0, 0, 0], [1, 1, 0], [0, 1, 0]]
y = [1, 0, 0, 1]
# Feature names
features = ["Mac User", "Likes Pie", "Age < 20"]

# Fit the model
C.fit(X, y, features=features, prediction_name="Has a dirty computer")

# Print the resulting rulelist
print(C.rl())

# Predict on the training set
print(C.predict(X))

More examples are in the examples/ directory

Questions?

Email the maintainer at: [email protected]

pycorels's People

Contributors

digitalpoetry, fingoldin, ssaamm


pycorels's Issues

Numpy Version Issue: np.bool deprecated

I think this is related to #21.

Getting:

AttributeError: module 'numpy' has no attribute 'bool'.
`np.bool` was a deprecated alias for the builtin `bool`. To avoid this error in existing code, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

Full messages:

AttributeError                            Traceback (most recent call last)
Cell In[5], line 5
      3 defendants = defendants[defendants.is_recid != -1]
      4 y, X = patsy.dmatrices("is_recid ~ sex + age + juv_fel_count + priors_count", defendants)
----> 5 CorelsClassifier().fit(X, y.squeeze())

File ~/opt/miniconda3/lib/python3.10/site-packages/corels/corels.py:157, in CorelsClassifier.fit(self, X, y, features, prediction_name)
    154 if not isinstance(prediction_name, str):
    155     raise TypeError("Prediction name must be a string, got: " + str(type(prediction_name)))
--> 157 label = check_array(y, ndim=1)
    158 labels = np.stack([ np.invert(label), label ])
    159 samples = check_array(X, ndim=2)

File ~/opt/miniconda3/lib/python3.10/site-packages/corels/utils.py:17, in check_array(x, ndim)
     13 if ndim and ndim != x.ndim:
     14     raise ValueError("Array must be " + str(ndim) + "-dimensional in shape, got " + str(x.ndim) +
     15                      " dimensions instead")
---> 17 asbool = x.astype(np.bool)
     19 if not np.array_equal(x, asbool):
     20     raise ValueError("Array must contain only binary members (0 or 1), got " + str(x));

File ~/opt/miniconda3/lib/python3.10/site-packages/numpy/__init__.py:305, in __getattr__(attr)
    300     warnings.warn(
    301         f"In the future `np.{attr}` will be defined as the "
    302         "corresponding NumPy scalar.", FutureWarning, stacklevel=2)
    304 if attr in __former_attrs__:
--> 305     raise AttributeError(__former_attrs__[attr])
    307 # Importing Tester requires importing all of UnitTest which is not a
    308 # cheap import Since it is mainly used in test suits, we lazy import it
    309 # here to save on the order of 10 ms of import time for most users
    310 #
    311 # The previous way Tester was imported also had a side effect of adding
    312 # the full `numpy.testing` namespace
    313 if attr == 'testing':
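The traceback points at `x.astype(np.bool)` in `corels/utils.py`. As the NumPy message itself says, the builtin `bool` can be used in place of the old alias. Below is a minimal sketch of the same binary check written with `bool`. The `is_binary` helper name is hypothetical and is not the library's function; it only mirrors the two lines visible in the traceback.

```python
import numpy as np

def is_binary(x):
    """Return True if every element of x is 0 or 1 (hypothetical helper)."""
    x = np.asarray(x)
    # The builtin `bool` replaces the deprecated `np.bool` alias here.
    asbool = x.astype(bool)
    # Casting to bool maps any nonzero value to True, so the arrays only
    # compare equal when the input contained nothing but 0 and 1.
    return bool(np.array_equal(x, asbool))

print(is_binary([0, 1, 1, 0]))
print(is_binary([0, 2, 1]))
```

Until a fix is available in the package, pinning an older NumPy release that still provides the alias is another possible workaround, at the cost of the version conflicts reported in other issues.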

Doesn't include requirements on install

Running pip install corels in a clean environment fails because the package does not declare numpy as a requirement.

Can you update the setup.py to look something more like:

from setuptools import setup

with open("README", 'r') as f:
    long_description = f.read()

setup(
   name='foo',
   version='1.0',
   description='A useful module',
   license="MIT",
   long_description=long_description,
   author='Man Foo',
   author_email='[email protected]',
   url="http://www.foopackage.com/",
   packages=['foo'],  #same as name
   install_requires=['bar', 'greek'], #external packages as dependencies
   scripts=[
            'scripts/cool',
            'scripts/skype',
           ]
)

can you add version parameter to main class

Can you add a version parameter to the main class? For reference, this is the current attribute list:
dir(C)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'estimator_type', 'ablation', 'c', 'fit', 'get_params', 'load', 'map_type', 'max_card', 'min_support', 'n_iter', 'policy', 'predict', 'rl', 'rl', 'save', 'score', 'set_params', 'verbosity']

from corels import CorelsClassifier

# ["loud", "samples"] is the most verbose setting possible
C = CorelsClassifier(max_card=2, c=0.8, verbosity=["loud", "samples"])

# 4 samples, 3 features
X = [[1, 0, 1], [0, 0, 0], [1, 1, 0], [0, 1, 0]]
y = [1, 0, 0, 1]

# Feature names
features = ["Mac User", "Likes Pie", "Age < 20"]

# Fit the model
C.fit(X, y, features=features, prediction_name="Has a dirty computer")

# Print the resulting rulelist
print(C.rl())

# Predict on the training set
print("Prediction: " + str(C.predict(X)))

can you share examples for each parameter pls

There are many parameters for corels, but the examples use only a few of them.
Can you share examples for each parameter, please?
Ideally these would demonstrate how each parameter influences the solution and give recommendations on how to choose the values.

class CorelsClassifier:
"""Certifiably Optimal RulE ListS classifier.

This class implements the CORELS algorithm, designed to produce human-interpretable, optimal
rulelists for binary feature data and binary classification. As an alternative to other
tree based algorithms such as CART, CORELS provides a certificate of optimality for its 
rulelist given a training set, leveraging multiple algorithmic bounds to do so.

To run the algorithm, create an instance of the `CorelsClassifier` class, 
providing any necessary parameters in its constructor, and then call `fit` to generate
a rulelist. `rl` returns the generated rulelist (which can be printed), while `predict` provides
classification predictions for a separate test dataset with the same features. To determine 
the algorithm's accuracy, run `score` on an evaluation dataset with labels.
To save a generated rulelist to a file, call `save`. To load it back from the file, call `load`.

Attributes
----------
c : float, optional (default=0.01)
    Regularization parameter. Higher values penalize longer rulelists.

n_iter : int, optional (default=1000)
    Maximum number of nodes (rulelists) to search before exiting.

map_type : str, optional (default="prefix")
    The type of prefix map to use. Supported maps are "none" for no map,
    "prefix" for a map that uses rule prefixes for keys, "captured" for
    a map with a prefix's captured vector as keys.

policy : str, optional (default="lower_bound")
    The search policy for traversing the tree (i.e. the criterion with which
    to order nodes in the queue). Supported criteria are "bfs", for breadth-first
    search; "curious", which attempts to find the most promising node; 
    "lower_bound" which is the objective function evaluated with that rulelist
    minus the default prediction error; "objective" for the objective function
    evaluated at that rulelist; and "dfs" for depth-first search.

verbosity : list, optional (default=["rulelist"])
    The verbosity levels required. A list of strings, it can contain any
    subset of ["rulelist", "rule", "label", "minor", "samples", "progress", "mine", "loud"].

    - "rulelist" prints the generated rulelist at the end.
    - "rule" prints a summary of each rule generated.
    - "label" prints a summary of the class labels.
    - "minor" prints a summary of the minority bound.
    - "samples" produces a complete dump of the rules, label, and/or minor data. You must also provide at least one of "rule", "label", or "minor" to specify which data you want to dump, or "loud" for all data. The "samples" option often spits out a lot of output.
    - "progress" prints periodic messages as corels runs.
    - "mine" prints debug information while mining rules, including each rule as it is generated.
    - "loud" is the equivalent of ["progress", "label", "rule", "mine", "minor"].

ablation : int, optional (default=0)
    Specifies additional parameters for the bounds used while searching. Accepted
    values are 0 (all bounds), 1 (no antecedent support bound), and 2 (no
    lookahead bound).

max_card : int, optional (default=2)
    Maximum cardinality allowed when mining rules. Can be any value greater than
    or equal to 1. For instance, a value of 2 would only allow rules that combine
    at most two features in their antecedents.

min_support : float, optional (default=0.01)
    The fraction of samples that a rule must capture in order to be used. 1 minus
    this value is also the maximum fraction of samples a rule can capture.
    Can be any value between 0.0 and 0.5.
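As a rough illustration of how the `max_card` and `min_support` descriptions above could interact, the sketch below enumerates candidate antecedents on the toy dataset from the README. It is not the CORELS mining code: `mine_rules` is a hypothetical helper, and it only considers conjunctions of features that equal 1, which is a simplifying assumption.

```python
from itertools import combinations

def mine_rules(X, max_card=2, min_support=0.01):
    """Enumerate feature conjunctions with at most `max_card` features whose
    captured fraction lies in [min_support, 1 - min_support].

    Hypothetical helper for illustration; not part of corels.
    """
    n_samples = len(X)
    n_features = len(X[0])
    kept = []
    for card in range(1, max_card + 1):
        for combo in combinations(range(n_features), card):
            # A sample is captured when every feature in the conjunction is 1.
            captured = sum(all(row[i] == 1 for i in combo) for row in X)
            support = captured / n_samples
            if min_support <= support <= 1 - min_support:
                kept.append((combo, support))
    return kept

# 4 samples, 3 features (same toy data as the README example)
X = [[1, 0, 1], [0, 0, 0], [1, 1, 0], [0, 1, 0]]
for combo, support in mine_rules(X, max_card=2, min_support=0.3):
    print(combo, support)
```

In this sketch, raising `max_card` grows the candidate set combinatorially, while raising `min_support` prunes rules that capture very few (or nearly all) samples. Both choices change how many candidates a search would have to consider.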

ERROR: Failed building wheel for corels

I can't run pip or pip3 install corels on my laptop (Windows 10)

Is there any dependency relating to C++ that is missing?

Finished generating code
LINK : fatal error LNK1158: cannot run 'rc.exe'
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\BIN\\amd64\\link.exe' failed with exit status 1158

Checking for presence of features in fit()

The fit method checks whether features was provided by way of if features:. This raises an error unless features is a list (for example, when it is a numpy array). Shouldn't the check be if features is not None for a more robust implementation?
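The difference between the two checks can be reproduced without corels. The sketch below shows the truthiness test failing on a NumPy array and one possible explicit check; `has_features` is a hypothetical helper written for this illustration, not code from the package.

```python
import numpy as np

features = np.array(["Mac User", "Likes Pie", "Age < 20"])

# `if features:` asks NumPy for the truth value of a multi-element array,
# which is ambiguous and raises ValueError.
try:
    if features:
        pass
    truthiness_error = None
except ValueError as exc:
    truthiness_error = exc

print(type(truthiness_error).__name__)

def has_features(features):
    """Explicit check that works for lists, tuples, and arrays alike."""
    return features is not None and len(features) > 0

print(has_features(features))
print(has_features(None))
```

An explicit `is not None` test also keeps the distinction between "no names given" and "an empty sequence of names" visible, which a bare truthiness test hides.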

Can't pip install on Macbook

Hi,
thanks for sharing your code!
I'm trying to install this package on a Macbook Pro, but neither pip install corels nor cloning the repo and running pip install . works.

Here is the output. As you can see, I'm using miniconda3, with a clean environment. I already tried to install g++ and gmp (brew install g++ gmp).

(corels) berga@Lucas-MacBook-Pro pycorels % pip install .
Processing /Users/berga/PycharmProjects/pycorels
  Preparing metadata (setup.py) ... done
Collecting numpy
  Using cached numpy-1.24.2-cp311-cp311-macosx_11_0_arm64.whl (13.8 MB)
Building wheels for collected packages: corels
  Building wheel for corels (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [73 lines of output]
      /Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/installer.py:27: SetuptoolsDeprecationWarning: setuptools.installer is deprecated. Requirements should be satisfied by a PEP 517 installer.
        warnings.warn(
      Traceback (most recent call last):
        File "/Users/berga/PycharmProjects/pycorels/setup.py", line 87, in <module>
          install(True)
        File "/Users/berga/PycharmProjects/pycorels/setup.py", line 61, in install
          setup(
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/__init__.py", line 86, in setup
          _install_setup_requires(attrs)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/__init__.py", line 80, in _install_setup_requires
          dist.fetch_build_eggs(dist.setup_requires)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/dist.py", line 874, in fetch_build_eggs
          resolved_dists = pkg_resources.working_set.resolve(
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/pkg_resources/__init__.py", line 789, in resolve
          dist = best[req.key] = env.best_match(
                                 ^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/pkg_resources/__init__.py", line 1075, in best_match
          return self.obtain(req, installer)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/pkg_resources/__init__.py", line 1087, in obtain
          return installer(requirement)
                 ^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/dist.py", line 944, in fetch_build_egg
          return fetch_build_egg(self, req)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/installer.py", line 87, in fetch_build_egg
          wheel.install_as_egg(dist_location)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/wheel.py", line 110, in install_as_egg
          self._install_as_egg(destination_eggdir, zf)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/wheel.py", line 118, in _install_as_egg
          self._convert_metadata(zf, destination_eggdir, dist_info, egg_info)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/wheel.py", line 162, in _convert_metadata
          os.rename(dist_info, egg_info)
      OSError: [Errno 66] Directory not empty: '/Users/berga/PycharmProjects/pycorels/.eggs/numpy-1.24.2-py3.11-macosx-11.1-arm64.egg/numpy-1.24.2.dist-info' -> '/Users/berga/PycharmProjects/pycorels/.eggs/numpy-1.24.2-py3.11-macosx-11.1-arm64.egg/EGG-INFO'
      
      During handling of the above exception, another exception occurred:
      
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/Users/berga/PycharmProjects/pycorels/setup.py", line 89, in <module>
          install(False)
        File "/Users/berga/PycharmProjects/pycorels/setup.py", line 61, in install
          setup(
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/__init__.py", line 86, in setup
          _install_setup_requires(attrs)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/__init__.py", line 80, in _install_setup_requires
          dist.fetch_build_eggs(dist.setup_requires)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/dist.py", line 874, in fetch_build_eggs
          resolved_dists = pkg_resources.working_set.resolve(
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/pkg_resources/__init__.py", line 789, in resolve
          dist = best[req.key] = env.best_match(
                                 ^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/pkg_resources/__init__.py", line 1075, in best_match
          return self.obtain(req, installer)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/pkg_resources/__init__.py", line 1087, in obtain
          return installer(requirement)
                 ^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/dist.py", line 944, in fetch_build_egg
          return fetch_build_egg(self, req)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/installer.py", line 87, in fetch_build_egg
          wheel.install_as_egg(dist_location)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/wheel.py", line 110, in install_as_egg
          self._install_as_egg(destination_eggdir, zf)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/wheel.py", line 118, in _install_as_egg
          self._convert_metadata(zf, destination_eggdir, dist_info, egg_info)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/wheel.py", line 162, in _convert_metadata
          os.rename(dist_info, egg_info)
      OSError: [Errno 66] Directory not empty: '/Users/berga/PycharmProjects/pycorels/.eggs/numpy-1.24.2-py3.11-macosx-11.1-arm64.egg/numpy-1.24.2.dist-info' -> '/Users/berga/PycharmProjects/pycorels/.eggs/numpy-1.24.2-py3.11-macosx-11.1-arm64.egg/EGG-INFO'
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for corels
  Running setup.py clean for corels
  error: subprocess-exited-with-error
  
  × python setup.py clean did not run successfully.
  │ exit code: 1
  ╰─> [73 lines of output]
      /Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/installer.py:27: SetuptoolsDeprecationWarning: setuptools.installer is deprecated. Requirements should be satisfied by a PEP 517 installer.
        warnings.warn(
      Traceback (most recent call last):
        File "/Users/berga/PycharmProjects/pycorels/setup.py", line 87, in <module>
          install(True)
        File "/Users/berga/PycharmProjects/pycorels/setup.py", line 61, in install
          setup(
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/__init__.py", line 86, in setup
          _install_setup_requires(attrs)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/__init__.py", line 80, in _install_setup_requires
          dist.fetch_build_eggs(dist.setup_requires)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/dist.py", line 874, in fetch_build_eggs
          resolved_dists = pkg_resources.working_set.resolve(
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/pkg_resources/__init__.py", line 789, in resolve
          dist = best[req.key] = env.best_match(
                                 ^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/pkg_resources/__init__.py", line 1075, in best_match
          return self.obtain(req, installer)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/pkg_resources/__init__.py", line 1087, in obtain
          return installer(requirement)
                 ^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/dist.py", line 944, in fetch_build_egg
          return fetch_build_egg(self, req)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/installer.py", line 87, in fetch_build_egg
          wheel.install_as_egg(dist_location)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/wheel.py", line 110, in install_as_egg
          self._install_as_egg(destination_eggdir, zf)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/wheel.py", line 118, in _install_as_egg
          self._convert_metadata(zf, destination_eggdir, dist_info, egg_info)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/wheel.py", line 162, in _convert_metadata
          os.rename(dist_info, egg_info)
      OSError: [Errno 66] Directory not empty: '/Users/berga/PycharmProjects/pycorels/.eggs/numpy-1.24.2-py3.11-macosx-11.1-arm64.egg/numpy-1.24.2.dist-info' -> '/Users/berga/PycharmProjects/pycorels/.eggs/numpy-1.24.2-py3.11-macosx-11.1-arm64.egg/EGG-INFO'
      
      During handling of the above exception, another exception occurred:
      
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/Users/berga/PycharmProjects/pycorels/setup.py", line 89, in <module>
          install(False)
        File "/Users/berga/PycharmProjects/pycorels/setup.py", line 61, in install
          setup(
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/__init__.py", line 86, in setup
          _install_setup_requires(attrs)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/__init__.py", line 80, in _install_setup_requires
          dist.fetch_build_eggs(dist.setup_requires)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/dist.py", line 874, in fetch_build_eggs
          resolved_dists = pkg_resources.working_set.resolve(
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/pkg_resources/__init__.py", line 789, in resolve
          dist = best[req.key] = env.best_match(
                                 ^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/pkg_resources/__init__.py", line 1075, in best_match
          return self.obtain(req, installer)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/pkg_resources/__init__.py", line 1087, in obtain
          return installer(requirement)
                 ^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/dist.py", line 944, in fetch_build_egg
          return fetch_build_egg(self, req)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/installer.py", line 87, in fetch_build_egg
          wheel.install_as_egg(dist_location)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/wheel.py", line 110, in install_as_egg
          self._install_as_egg(destination_eggdir, zf)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/wheel.py", line 118, in _install_as_egg
          self._convert_metadata(zf, destination_eggdir, dist_info, egg_info)
        File "/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/wheel.py", line 162, in _convert_metadata
          os.rename(dist_info, egg_info)
      OSError: [Errno 66] Directory not empty: '/Users/berga/PycharmProjects/pycorels/.eggs/numpy-1.24.2-py3.11-macosx-11.1-arm64.egg/numpy-1.24.2.dist-info' -> '/Users/berga/PycharmProjects/pycorels/.eggs/numpy-1.24.2-py3.11-macosx-11.1-arm64.egg/EGG-INFO'
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed cleaning build dir for corels
Failed to build corels
Installing collected packages: numpy, corels
  Running setup.py install for corels ... error
  error: subprocess-exited-with-error
  
  × Running setup.py install for corels did not run successfully.
  │ exit code: 1
  ╰─> [37 lines of output]
      running install
      /Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
        warnings.warn(
      running build
      running build_py
      creating build
      creating build/lib.macosx-11.1-arm64-cpython-311
      creating build/lib.macosx-11.1-arm64-cpython-311/corels
      copying corels/corels.py -> build/lib.macosx-11.1-arm64-cpython-311/corels
      copying corels/__init__.py -> build/lib.macosx-11.1-arm64-cpython-311/corels
      copying corels/utils.py -> build/lib.macosx-11.1-arm64-cpython-311/corels
      copying corels/VERSION -> build/lib.macosx-11.1-arm64-cpython-311/corels
      running build_ext
      building 'corels._corels' extension
      creating build/temp.macosx-11.1-arm64-cpython-311
      creating build/temp.macosx-11.1-arm64-cpython-311/corels
      creating build/temp.macosx-11.1-arm64-cpython-311/corels/src
      creating build/temp.macosx-11.1-arm64-cpython-311/corels/src/corels
      creating build/temp.macosx-11.1-arm64-cpython-311/corels/src/corels/src
      clang -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /Users/berga/miniconda3/envs/corels/include -arch arm64 -fPIC -O2 -isystem /Users/berga/miniconda3/envs/corels/include -arch arm64 -Icorels/src/ -Icorels/src/corels/src -I/Users/berga/miniconda3/envs/corels/include/python3.11 -I/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/numpy/core/include -c corels/_corels.cpp -o build/temp.macosx-11.1-arm64-cpython-311/corels/_corels.o -Wall -O3 -std=c++11 -DGMP
      corels/_corels.cpp:220:12: fatal error: 'longintrepr.h' file not found
        #include "longintrepr.h"
                 ^~~~~~~~~~~~~~~
      1 error generated.
      running install
      /Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
        warnings.warn(
      running build
      running build_py
      running build_ext
      building 'corels._corels' extension
      clang -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /Users/berga/miniconda3/envs/corels/include -arch arm64 -fPIC -O2 -isystem /Users/berga/miniconda3/envs/corels/include -arch arm64 -Icorels/src/ -Icorels/src/corels/src -I/Users/berga/miniconda3/envs/corels/include/python3.11 -I/Users/berga/miniconda3/envs/corels/lib/python3.11/site-packages/numpy/core/include -c corels/_corels.cpp -o build/temp.macosx-11.1-arm64-cpython-311/corels/_corels.o -Wall -O3 -std=c++11
      corels/_corels.cpp:220:12: fatal error: 'longintrepr.h' file not found
        #include "longintrepr.h"
                 ^~~~~~~~~~~~~~~
      1 error generated.
      error: command '/usr/bin/clang' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> corels

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

Can you help me? Thanks :)

Conflict numpy<=1.16

I am installing corels with poetry and there is a numpy conflict with other libraries: corels needs numpy<=1.16, while other libraries require numpy>=1.20.

  SolverProblemError

  Because no versions of corels match >1.1.29,<2.0.0
   and corels (1.1.29) depends on numpy (<=1.16), corels (>=1.1.29,<2.0.0) requires numpy (<=1.16).
  And because pacmap (0.5.5) depends on numpy (>=1.20), corels (>=1.1.29,<2.0.0) is incompatible with pacmap (0.5.5).
  So, because plato depends on both pacmap (0.5.5) and corels (^1.1.29), version solving failed.

what can be done for unbalanced data: oversampling has strange behavior ?

What can be done for unbalanced data?
For example:
number of target "yes" rows is 200
number of target "no" rows is 500000

Oversampling (replicating the records with target "yes") helps a little bit when it is applied once.
The data then has:
number of target "yes" rows is 400
number of target "no" rows is 500000

Surprisingly, replicating a second time does not help any further compared to the first replication.

So with:
number of target "yes" rows is 600
number of target "no" rows is 500000

the performance is the same as with:
number of target "yes" rows is 400
number of target "no" rows is 500000

The questions are:
1. Do you remove identical rows?
2. Do you support weighting for particular rows?
For example, rows with target "yes" could have more influence than rows with target "no".
Could row weighting then be used for unbalanced data?

Thanks
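One way to reason about these numbers, offered as an assumption rather than a confirmed diagnosis: if the objective being minimized has the form (misclassification rate) + c × (number of rules), then with the default c = 0.01 from the docstring, a rule is only worth adding when it reduces the error rate by more than 0.01. The arithmetic below is a sketch under that assumption; the `objective` function is hypothetical and is not taken from the package.

```python
def objective(n_errors, n_samples, n_rules, c=0.01):
    """Misclassification rate plus a per-rule penalty (assumed form)."""
    return n_errors / n_samples + c * n_rules

n_no = 500000
for n_yes in (200, 400, 600):
    n = n_yes + n_no
    # Rule list with no rules: always predict "no", so every "yes" row is wrong.
    empty = objective(n_yes, n, n_rules=0)
    # Best case for a one-rule list: zero errors, but it still pays c once.
    one_rule = objective(0, n, n_rules=1)
    print(n_yes, round(empty, 5), one_rule, empty < one_rule)
```

Under this assumption, the "yes" rows make up well under 1% of the data in all three cases, so even a perfect single rule would score worse than predicting "no" everywhere. A smaller c, or oversampling until the minority fraction clearly exceeds c, would change that comparison.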

RuleList says prediction=True but predict() function returns False

I came across some weird bug making the predict() function return wrong predictions while the rule list is learned correctly. Why does this happen?

Code to reproduce

from corels import CorelsClassifier
import pandas as pd

x=pd.DataFrame([[1,0,0,0]]*200+[[0,1,0,0]]*200)
y=pd.Series([True]*390+[False]*10)

model=CorelsClassifier(verbosity=[])
model.fit(x,y)

print(model.rl())
print()
print(pd.value_counts(model.predict(x)))

Tested on Windows with pandas 1.0.5 (installed via conda) and corels 1.1.29 (installed via pip), and python 3.

Actual result

The code prints:

RULELIST:
prediction = True

False 400
dtype: int64

Apparently all predictions are False.

Expected result

The rule list is created as expected. The predictions are expected to be all True, just like the rule list states. The expected output consequently is:

RULELIST:
prediction = True

True 400
dtype: int64

only one rule list for pretty complicated data

I get a rulelist with only one rule for pretty complicated data.
data_matrix_train.shape
(1420, 18)
That is 18 features and 1420 observations.
The data comes from one-hot encoding, so it is sparse.

sum(data_matrix_train)
array([223, 253, 211, 1, 59, 79, 104, 92, 225, 179, 227, 117, 80,
79, 246, 214, 207, 215], dtype=uint8)

What can be done to get more rules?
