behrouz-babaki / minsizekmeans

A Python implementation of k-means clustering with a minimum cluster size constraint (Bradley et al., 2000)

License: GNU General Public License v3.0

Languages: Python 92.59%, Shell 7.41%

Topics: constrained-clustering, kmeans-clustering, clustering-algorithm, minimum-size-constraint

minsizekmeans's Issues

Run on CSV with multiple columns

Hi,

How should I run this on a CSV file with multiple columns?

For example, I want to cluster data from a CSV that contains:

  1. Column One - Latitude
  2. Column Two - Longitude

Thanks!
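
One way to approach this, as a minimal sketch rather than the project's documented usage, is to read the two columns into a list of [latitude, longitude] rows. The helper name parse_points, the header-row assumption, and the column positions are hypothetical. Whether the resulting list can be passed directly as the data argument of minsize_kmeans should be checked against the repository's README.

```python
import csv


def parse_points(lines, cols=(0, 1), has_header=True):
    """Parse CSV text lines into a list of [float, float, ...] rows.

    `lines` is any iterable of strings, e.g. an open file.
    `cols` picks the column positions to keep (assumed: 0 = latitude,
    1 = longitude). `has_header` skips the first row (an assumption).
    """
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # discard the header row if present
    points = []
    for row in reader:
        if not row:  # skip blank lines
            continue
        points.append([float(row[c]) for c in cols])
    return points


# Example with in-memory lines; for a real file use:
#   with open("points.csv", newline="") as f:   # hypothetical file name
#       data = parse_points(f)
sample = ["Latitude,Longitude", "52.52,13.40", "48.85,2.35"]
data = parse_points(sample)
print(data)
```

Extra columns in the file are ignored as long as cols names the two positions of interest. Note that plain Euclidean distance on latitude and longitude is only an approximation of distance on the ground.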

Cannot implement in Python 3.5 Conda Environment

Hi all,

I cloned this repo, made a fresh install of Anaconda (ver. 4.7.12), and created a new environment using the following command:

conda create -n py35 python=3.5 anaconda

I then activated py35:

conda activate py35

After creating this environment, I ran the code, which threw the following error:

Requested Python version (3.5) is not installed

This happens even though I have confirmed that the Python version is correct:

python --version
Python 3.5.5

It still won't run. Any advice would be great. I am currently running Windows 7, and this is my full conda info output:

active environment : py35
active env location : C:\Users\Doug\Anaconda3\envs\py35
shell level : 2
user config file : C:\Users\Doug\.condarc
populated config files : C:\Users\Doug\.condarc
conda version : 4.7.12
conda-build version : 3.18.9
python version : 3.7.4.final.0
virtual packages :
base environment : C:\Users\Doug\Anaconda3 (writable)
channel URLs : https://conda.anaconda.org/conda-forge/win-64
               https://conda.anaconda.org/conda-forge/noarch
               https://repo.anaconda.com/pkgs/main/win-64
               https://repo.anaconda.com/pkgs/main/noarch
               https://repo.anaconda.com/pkgs/r/win-64
               https://repo.anaconda.com/pkgs/r/noarch
               https://repo.anaconda.com/pkgs/msys2/win-64
               https://repo.anaconda.com/pkgs/msys2/noarch
package cache : C:\Users\Doug\Anaconda3\pkgs
                C:\Users\Doug\.conda\pkgs
                C:\Users\Doug\AppData\Local\conda\conda\pkgs
envs directories : C:\Users\Doug\Anaconda3\envs
                   C:\Users\Doug\.conda\envs
                   C:\Users\Doug\AppData\Local\conda\conda\envs
platform : win-64
user-agent : conda/4.7.12 requests/2.22.0 CPython/3.7.4 Windows/7 Windows/6.1.7601
administrator : False
netrc file : None
offline mode : False
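
A small check that may help narrow this down is to print which interpreter actually executes a script. This is a diagnostic sketch, not a fix. The wording of the error matches the message printed by the Python launcher for Windows (py.exe), so one possibility, which is an assumption about this setup, is that the script was started through the launcher rather than through the environment's python. Note also that the "python version" line in the conda info output reports the Python that conda itself runs on, not the Python of the active environment. The function name interpreter_info is hypothetical.

```python
import sys


def interpreter_info():
    """Return the path and version of the interpreter running this code."""
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
    }


if __name__ == "__main__":
    info = interpreter_info()
    print("executable:", info["executable"])
    print("version:", info["version"])
```

Running this with an explicit python prefix inside the activated environment should report a path under envs\py35. If it does, starting the project script the same way (python followed by the script name) is worth trying.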

Write best cluster algorithm

best = None
best_clusters = None
for i in range(args.NUM_ITER):
    clusters, centers = minsize_kmeans(data, args.k, args.min_size, args.max_size)
    if clusters:
        quality = compute_quality(data, clusters)
        if not best or (quality < best):
            best = quality
            best_clusters = clusters

if best:
    if args.OUTFILE:
        with open(args.OUTFILE, 'w') as f:
            print('\n'.join(str(i) for i in clusters), file=f)
    else:
        print('cluster assignments:')
        for i in range(len(clusters)):
            print('%d: %d'%(i, clusters[i]))
    print('sum of squared distances: %.4f'%(best))
else:
    print('no clustering found')

Just a question, and maybe I am wrong: if a best cluster configuration was found, shouldn't the variable best_clusters be written to the file instead of the last cluster configuration?

Regards,
Lukas
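
The concern raised here can be illustrated with a minimal sketch of best-of-N selection that keeps the winning assignment alongside its score. The function name best_of and the (quality, clusters) pair format are hypothetical stand-ins for the quoted loop, not the repository's code. The sketch also compares against None explicitly, since a test like "not best" would treat a quality of 0.0 as if nothing had been found.

```python
def best_of(runs):
    """Return (best_quality, best_clusters) from (quality, clusters) pairs.

    Lower quality (e.g. sum of squared distances) is better. Runs whose
    clusters are None or empty are treated as failed and skipped.
    """
    best = None
    best_clusters = None
    for quality, clusters in runs:
        if not clusters:
            continue
        if best is None or quality < best:
            best = quality
            best_clusters = clusters
    return best, best_clusters


# Three made-up runs; the last run is not the best one.
runs = [(5.0, [0, 0, 1]), (2.0, [0, 1, 1]), (3.0, [1, 1, 0])]
best, best_clusters = best_of(runs)

if best is not None:
    # Write best_clusters, not the assignment left over from the last run.
    print('\n'.join(str(i) for i in best_clusters))
    print('sum of squared distances: %.4f' % best)
else:
    print('no clustering found')
```

With these made-up runs the selected assignment comes from the second run, whereas printing the loop's last clusters value would have reported the third.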
