Coder Social home page Coder Social logo

Comments (4)

okaneco avatar okaneco commented on August 16, 2024

I'm glad this project could be an inspiration. 😄

I came across your GPU image filter blog a few weeks ago and enjoyed it. I wanted to try GPU compute for similar problems but I found learning it to be a bit overwhelming. I decided to focus on other projects but I'll definitely check in on your project.

It was fun to implement the algorithm and fortunately the papers are fairly approachable for the kmeans algorithms and kmeans++. In case you're interested, the resources on this page were helpful when writing the library and implementing one of the optimizations with the triangle inequality.
https://cs.baylor.edu/~hamerly/software/kmeans

The slides from the presentation and Accelerating Lloyd's algorithm for k-means clustering were helpful for finding out about possible optimizations and visualizing them.
https://cs.baylor.edu/~hamerly/software/fast_kmeans_talk_20140515.pdf
https://www.ccs.neu.edu/home/radivojac/classes/2021fallcs6220/hamerly_bookchapter_2014.pdf

For kmeans++, the main algorithm is 1a, 1b, and 1c in 2.2
http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf

Good luck.

from kmeans-colors.

redwarp avatar redwarp commented on August 16, 2024

I found learning it to be a bit overwhelming

I totally feel you there! I'm still mainly struggling to understand all the implication of running stuff with shared memory. I'm more tinkering than being in control. But when it works, it's really rewarding :-) I'm happy that you enjoyed the blog post! It means a lot.

The rust lib wgpu is also a good abstraction to have something that works on all the platforms, without having to choose your poison (should I target metal, vulkan, directx, ...? that is hell)

Thanks a lot for the links! I'm impressed that you can simply read the paper and understand them all: I'm always faced with my lack of mathematical background, when I can't even read a simple formula like.

Screenshot 2022-03-14 175121

I probably should go back to the basics... But I will read each of those links anyway!

For kmeans++, I'm thinking of implementing this one: https://www.geeksforgeeks.org/ml-k-means-algorithm/
It totally drops the probability aspect of picking next centroids, simplifying it to "pick the point that has maximum distance from closest centroid, forgo probability altogether". And that, I think I know how to parallelize some of this work.

I'm curious what the impact will be!

from kmeans-colors.

okaneco avatar okaneco commented on August 16, 2024

I'm impressed that you can simply read the paper and understand them all

I struggle with them too, it's not always easy 🙂. It takes a lot of patience, and practice to get used to reading papers and understanding them. The main thing to try doing is figuring out the definitions of what's being discussed. Finding the works they reference or other papers in the field can help better understand the content, and some papers are much better at explaining topics than others. There's almost always something I have to try learning for the first time or brush up on from the past so don't worry if you don't understand on the first few passes.

From that link, that seems like a good way to go as I've read/seen other implementations do that. I'm also curious what the impact will be. Dropping the probabilities and extra iteration from how I've done it would be a faster init. The only real downside I can think of is that it can be overly affected by outlier points, resulting in taking longer to converge or converging on less than optimal centers. For colors though, that might not be enough to matter.

from kmeans-colors.

redwarp avatar redwarp commented on August 16, 2024

I actually have a hard time stabilizing my kmeans++ implementation. I mean by that that, comparing the chosen initial centroid from run to run, there are sometimes minor difference between each, and I'm not one 100% sure why. For now, I blame that on "float is not an exact science" and call it a day.

Still, with this, I get colors that are closer to what your lib produce, and the output colors are better. So it's definitely a win!
In terms of convergence, it's true that it does take time.

But it still converges faster that what I was doing before (just random colors from the image lol).

Side note, I have dithering now!
https://github.com/redwarp/k-means-gpu/blob/main/gfx/tokyo-mix-dither-lab-k6.png

It is not, and will never be I suspect, on par with your rscolorq ;-)

from kmeans-colors.

Related Issues (13)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.