MNIST-K-Means-Clustering

Using K-Means Clustering to Identify Handwritten Digits

Extract the .tar.gz archive to get the digits.base64.json dataset, which the notebook requires. (tar -xzvf digits.base64.json.tar.gz)
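If you would rather do the extraction from Python, something like the sketch below should be equivalent to the tar command. Note that the record layout in `parse_record` (one JSON object per line, with a `label` field and a base64-encoded `data` field holding raw pixel bytes) is an assumption on my part, as is the archive holding a single top-level digits.base64.json file. Both function names are made up for illustration, so check them against the actual file before relying on this.

```python
import base64
import json
import tarfile
from pathlib import Path


def extract_dataset(archive="digits.base64.json.tar.gz", dest="."):
    """Extract the archive, roughly `tar -xzf <archive> -C <dest>`."""
    with tarfile.open(archive, mode="r:gz") as tar:
        tar.extractall(path=dest)
    # Assumption: the archive contains digits.base64.json at its top level.
    return Path(dest) / "digits.base64.json"


def parse_record(line):
    """Decode one line of the dataset.

    Assumed (hypothetical) layout: {"label": <int>, "data": "<base64 bytes>"},
    where the decoded bytes are the pixel intensities (0-255) of one digit.
    """
    record = json.loads(line)
    pixels = list(base64.b64decode(record["data"]))
    return record["label"], pixels
```

Calling `extract_dataset()` with no arguments extracts into the current directory, and the returned path can then be read line by line and passed to `parse_record`.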

Design decision: the clustering algorithm is designed to train on labelled data. However, I've written it so that it's easy to switch to unlabelled data. I considered making it modular for labelled and unlabelled data, but the more I think about it, the less I'm convinced of the utility of having a k-means clustering algorithm for unlabelled training data. (If your data is unlabelled, you can just place a dummy label on every datapoint.)
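To make the labelled/unlabelled point concrete, here is a minimal sketch of the general idea, not the notebook's actual code. It runs plain Lloyd's-algorithm k-means and then gives each cluster the majority label of its members. The function names, the NumPy implementation, and the majority-vote rule are all assumptions chosen for illustration. Labels are only used after clustering, which is why a dummy label on every datapoint works for unlabelled data.

```python
import numpy as np


def assign(points, centroids):
    """Index of the nearest centroid (squared Euclidean distance) per point."""
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns the k centroids."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct data points.
    start = rng.choice(len(points), size=k, replace=False)
    centroids = points[start].astype(float)
    for _ in range(iterations):
        assignment = assign(points, centroids)
        for j in range(k):
            members = points[assignment == j]
            if len(members):  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    return centroids


def label_clusters(points, labels, centroids):
    """Give each cluster the most common label among its members."""
    assignment = assign(points, centroids)
    cluster_labels = []
    for j in range(len(centroids)):
        members = labels[assignment == j]
        if len(members) == 0:
            cluster_labels.append(None)  # no members, so no label
            continue
        values, counts = np.unique(members, return_counts=True)
        cluster_labels.append(values[counts.argmax()].item())
    return cluster_labels


def predict(point, centroids, cluster_labels):
    """Label of the cluster whose centroid is nearest to `point`."""
    return cluster_labels[assign(point[None, :], centroids)[0]]
```

For unlabelled data, passing `np.zeros(len(points), dtype=int)` as `labels` leaves the clustering itself unchanged; every cluster simply ends up with the dummy label 0.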

Inspired by a homework assignment in John Lafferty's Large-Scale Data Analysis course, which I took at UChicago in the spring of 2015. I collaborated with Elliott Ding on that assignment. In the class, we used distributed systems via AWS and Apache Spark, parallelized our code, and did most of the analysis using map-reduce. To make the computational statistics more accessible, I've rewritten this notebook so that it does not use distributed techniques.


See my blog post on this project here.

Contributors

datamine
