kmc2's Introduction

Fast and Provably Good Seedings for k-Means using k-MC^2 and AFK-MC^2

Introduction

The package provides a Cython implementation of the algorithms k-MC^2 and AFK-MC^2 described in the two papers:

Approximate K-Means++ in Sublinear Time. Olivier Bachem, Mario Lucic, S. Hamed Hassani and Andreas Krause. In Proc. Conference on Artificial Intelligence (AAAI), 2016.

Fast and Provably Good Seedings for k-Means. Olivier Bachem, Mario Lucic, S. Hamed Hassani and Andreas Krause. To appear in Neural Information Processing Systems (NIPS), 2016.

The implementation is compatible with Python 2.7.

Installation

First make sure that numpy is installed by running

pip install numpy

The following command will then install kmc2 from PyPI:

pip install kmc2

To install kmc2 locally from this repository, you may use

pip install .

Quickstart

The kmc2 function may be used to run the algorithm and obtain a seeding. The data should be provided in a Numpy array or a Scipy CSR matrix.

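import numpy as np
import kmc2
X = np.random.RandomState(0).rand(1000, 10)  # placeholder data; substitute your own (n, d) array
seeding = kmc2.kmc2(X, 5)  # Run the seeding with k=5 (AFK-MC2 by default)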

The seeding can then be refined using MiniBatchKMeans of scikit-learn:

from sklearn.cluster import MiniBatchKMeans
model = MiniBatchKMeans(5, init=seeding).fit(X)
new_centers = model.cluster_centers_
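
To check whether the refinement actually improved on the seeding, one option is to compare the k-means cost (sum of squared distances to the nearest center) before and after. The helper below is a minimal NumPy sketch; the name kmeans_cost is hypothetical and not part of the kmc2 package, and it assumes a dense array rather than a CSR matrix.

```python
import numpy as np

def kmeans_cost(X, centers):
    """Sum of squared Euclidean distances from each point to its nearest center.

    Hypothetical helper for illustration; not part of kmc2. Assumes dense inputs.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # (n, k) matrix of squared distances between every point and every center
    sq_dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return sq_dists.min(axis=1).sum()
```

With the variables from the snippets above, kmeans_cost(X, seeding) and kmeans_cost(X, new_centers) can then be compared directly. The (n, k, d) intermediate array makes this suitable only for small to moderate data sizes.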

Detailed Usage / API

The kmc2 module exposes a single function kmc2(...) with all the functionality:

def kmc2(X, k, chain_length=200, afkmc2=True, random_state=None, weights=None):
    """Cython implementation of k-MC2 and AFK-MC2 seeding

    Args:
      X: (n,d)-shaped np.ndarray with data points (or scipy CSR matrix)
      k: number of cluster centers
      chain_length: length of the MCMC chain
      afkmc2: Whether to run AFK-MC2 (if True) or vanilla K-MC2 (if False)
      random_state: numpy.random.RandomState instance or integer to be used as seed
      weights: n-sized np.ndarray with weights of data points (default: uniform weights)


    Returns:
      (k, d)-shaped numpy.ndarray with cluster centers
    """
    ...
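
To make the roles of chain_length and afkmc2 more concrete, here is a pure-NumPy sketch of the sampling scheme as described in the two papers. It is an illustrative reimplementation, not the package's Cython code: the name kmc2_sketch and the seed parameter are assumptions, it supports dense arrays only, ignores weights, and assumes the data points are not all identical.

```python
import numpy as np

def kmc2_sketch(X, k, chain_length=200, afkmc2=True, seed=None):
    """Illustrative k-MC2 / AFK-MC2 seeding (hypothetical helper, not the kmc2 API)."""
    rng = np.random.RandomState(seed)
    n = X.shape[0]
    # The first center is sampled uniformly at random.
    centers = [X[rng.randint(n)]]

    if afkmc2:
        # AFK-MC2 proposal: half D^2 mass w.r.t. the first center, half uniform.
        d1 = ((X - centers[0]) ** 2).sum(axis=1)
        q = 0.5 * d1 / d1.sum() + 0.5 / n
    else:
        # Vanilla k-MC2 proposal: uniform over the data points.
        q = np.full(n, 1.0 / n)

    def sq_dist(i):
        # Squared distance from point i to its closest current center.
        return min(((X[i] - c) ** 2).sum() for c in centers)

    for _ in range(k - 1):
        candidates = rng.choice(n, size=chain_length, p=q)
        u = rng.random_sample(chain_length)
        x, d_x = candidates[0], sq_dist(candidates[0])
        for j in range(1, chain_length):
            y = candidates[j]
            d_y = sq_dist(y)
            # Metropolis-Hastings acceptance test, written without division:
            # accept y if (d_y * q[x]) / (d_x * q[y]) > u[j].
            if d_y * q[x] > u[j] * d_x * q[y]:
                x, d_x = y, d_y
        # The final state of the chain becomes the next center.
        centers.append(X[x])
    return np.array(centers)
```

A longer chain brings the sampled centers closer to the k-means++ distribution at the cost of more distance evaluations per center; each center needs chain_length candidate evaluations rather than a full pass over the data.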

Tests

To run the unit tests, run nose from the package directory:

nosetests

Feedback / Citation

Please send any feedback to Olivier Bachem.

If you would like to cite this implementation, please reference the two original papers.

License

The software is released under the MIT License as detailed in kmeans.pyx.

Acknowledgments

This research was partially supported by ERC StG 307036, a Google Ph.D. Fellowship and an IBM Ph.D. Fellowship.

kmc2's People

Contributors

obachem, vmarkovtsev

kmc2's Issues

Cannot install kmc2

I ran the command "pip install kmc2" and got the following output:
Collecting kmc2
Using cached kmc2-0.1.tar.gz (102 kB)
Preparing metadata (setup.py) ... done
Requirement already satisfied: numpy in c:\users\admin\anaconda3\envs\python38-pytorch18\lib\site-packages (from kmc2) (1.20.2)
Requirement already satisfied: scipy in c:\users\admin\anaconda3\envs\python38-pytorch18\lib\site-packages (from kmc2) (1.6.2)
Requirement already satisfied: scikit-learn in c:\users\admin\anaconda3\envs\python38-pytorch18\lib\site-packages (from kmc2) (1.1.1)
Requirement already satisfied: nose in c:\users\admin\anaconda3\envs\python38-pytorch18\lib\site-packages (from kmc2) (1.3.7)
Requirement already satisfied: joblib>=1.0.0 in c:\users\admin\anaconda3\envs\python38-pytorch18\lib\site-packages (from scikit-learn->kmc2) (1.1.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in c:\users\admin\anaconda3\envs\python38-pytorch18\lib\site-packages (from scikit-learn->kmc2) (3.1.0)
Building wheels for collected packages: kmc2
Building wheel for kmc2 (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [12 lines of output]
running bdist_wheel
running build
running build_ext
building 'kmc2' extension
creating build
creating build\temp.win-amd64-3.8
creating build\temp.win-amd64-3.8\Release
C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30037\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -I. -Ic:\users\admin\anaconda3\envs\python38-pytorch18\lib\site-packages\numpy\core\include -Ic:\users\admin\anaconda3\envs\python38-pytorch18\include -Ic:\users\admin\anaconda3\envs\python38-pytorch18\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30037\include" /Tckmc2.c /Fobuild\temp.win-amd64-3.8\Release\kmc2.obj -O3
cl : Command line warning D9002 : ignoring unknown option '-O3'
kmc2.c
c:\users\admin\anaconda3\envs\python38-pytorch18\include\pyconfig.h(59): fatal error C1083: Cannot open include file: 'io.h': No such file or directory
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30037\bin\HostX86\x64\cl.exe' failed with exit status 2
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for kmc2
Running setup.py clean for kmc2
Failed to build kmc2
Installing collected packages: kmc2
Running setup.py install for kmc2 ... error
error: subprocess-exited-with-error

× Running setup.py install for kmc2 did not run successfully.
│ exit code: 1
╰─> [12 lines of output]
running install
running build
running build_ext
building 'kmc2' extension
creating build
creating build\temp.win-amd64-3.8
creating build\temp.win-amd64-3.8\Release
C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30037\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -I. -Ic:\users\admin\anaconda3\envs\python38-pytorch18\lib\site-packages\numpy\core\include -Ic:\users\admin\anaconda3\envs\python38-pytorch18\include -Ic:\users\admin\anaconda3\envs\python38-pytorch18\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30037\include" /Tckmc2.c /Fobuild\temp.win-amd64-3.8\Release\kmc2.obj -O3
cl : Command line warning D9002 : ignoring unknown option '-O3'
kmc2.c
c:\users\admin\anaconda3\envs\python38-pytorch18\include\pyconfig.h(59): fatal error C1083: Cannot open include file: 'io.h': No such file or directory
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30037\bin\HostX86\x64\cl.exe' failed with exit status 2
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> kmc2

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

A problem about build

Hello, when I ran the command "pip install setup.py", an error occurred. Where should "kmc2.c" be placed? Please help me. Thank you very much.

Cannot install due to internal problems

I tried to install it through PyCharm and got these errors:

kmc2.c(18737): error C2039: "exc_type": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18738): error C2039: "exc_value": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18739): error C2039: "exc_traceback": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18751): error C2039: "exc_type": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18752): error C2039: "exc_value": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18753): error C2039: "exc_traceback": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18754): error C2039: "exc_type": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18755): error C2039: "exc_value": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18756): error C2039: "exc_traceback": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18799): error C2039: "exc_type": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18800): error C2039: "exc_value": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18801): error C2039: "exc_traceback": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18802): error C2039: "exc_type": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18803): error C2039: "exc_value": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18804): error C2039: "exc_traceback": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18826): error C2039: "exc_type": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18827): error C2039: "exc_value": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18828): error C2039: "exc_traceback": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18829): error C2039: "exc_type": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18830): error C2039: "exc_value": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
kmc2.c(18831): error C2039: "exc_traceback": is not a member of "_ts".
C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.2800.0_x64__qbz5n2kfra8p0\include\cpython/pystate.h(51): note: see declaration of "_ts"
