Coder Social home page Coder Social logo

Comments (5)

zxdev avatar zxdev commented on May 27, 2024

See my comment from Jan 5, "low load fill and missing items".

The TL;DR is that this code has a fundamental design flaw in the remodel/kick routine that causes items to be lost, I proved it if you look at my comments and analysis, and data, so unless you want to write a proper remodeler and cyclic detection, this code is useless.

One way around this might be to just use more indexes, but once the remodel events happen, you will loose items with this code in it's current state. This flaw is also is the reason you can't get good load/fill levels and it crashes well below the expected 95% fill.

Also, to answer your other question, this code has a HUGH memory footprint 7x larger than necessary because it does not use pointers or static sized types (eg. uses type fingerprint []byte instead of [2]byte) so the fingerprint is 16 in memory even if it only uses 2 for a fingerprint len()=2 and cap()=16, plus every manipulation causes memory allocation and causes copies, designed this way, so there is no memory reuse so you have to wait for garbage collector, so tons of memory usage. I use activity monitor on my Mac to see how much ram is used bt the script, rough but works.

I wound up writing my own, as I said, but it is architected differently with a counter, so still cookoo hash, but counting cookoo hash, for 5MM entries hash+int64 counter consumes 80MG, not 5GB, so I can't really give you much help with this code other than you can try incorporating my cyclic node detection code, but you have to write a proper remodel loop to not loose anything during remodel and fix the hash methods:

// cyclic detects node repeats up to max iterations
func cyclic(max int) func(i int64, b []byte) bool {

	type cyclic struct {
		i int64
		h string
	}

	hits := make(map[cyclic]int)

	visited := func(i int64, b []byte) bool {
		node := cyclic{i, string(b)}
		hits[node]++
		if len(hits) > max || hits[node] > 1 {
			return false
		}
		return true
	}

	return visited
}

use like so:

// note: I suppose this could be a recursion, ... if you wanted to do it that way
for {
cy := cyclic(maxCyclic)
  for {
     ... select
    if !cy(i, k[:]) { break } 
    ...swap
       for {
         ...insert
    }
  }
}

from cfilter.

irfansharif avatar irfansharif commented on May 27, 2024

so unless you want to write a proper remodeler and cyclic detection, this code is useless.

I'm sorry you feel this way though I can sympathize, I wrote this when initially getting started with Go about a year ago and have not had occasion to revisit it since. The large memory footprint & "fundamental" design flaw (this isn't apparent to me as yet; not that I've had a chance to investigate) is evidence of this, yes.

@ajaybodhe: In light of this I encourage you to instead try https://github.com/seiflotfy/cuckoofilter in the interim, or perhaps @g-w-p's implementation if he/she is willing to share. Keeping this and #5 open for when I do get around to addressing them.

from cfilter.

zxdev avatar zxdev commented on May 27, 2024

I wound up writing my own, as I said, but it is architected differently with counters, but still cookoo hash style, so I can't really give you much help with this code other than you can try to fix it, but you have to write a proper remodel loop to not loose anything during remodel and fix the hash methods ... that was where the problem was when I was playing with this.

from cfilter.

zxdev avatar zxdev commented on May 27, 2024

This code had potential, but needs work. These were my analysis tools, what I used when I was working on my cookoo hash architecture, might help you if you choose to revisit. Dump the image and dump the data to catch the remodel events and you will or should find the error in the code.

/*
	kind of handy and cool debug 
*/

// Img creates a hash map to inspect the hashed data container randomness of distrobution
// Note: the width*depth must equal container size, it doesn't calculate this
func (c *Cache) Img(width, height int) {
	img := image.NewRGBA(image.Rectangle{image.Point{0, 0}, image.Point{width, height}})
	var w1, h1 int
	for ib := range c.data {
		for ik := range c.data[ib] {
			if c.data[ib][ik] != *Empty {
				img.Set(w1, h1, color.RGBA{255, 0, 0, 255})
			}
			w1++
			if w1 > width {
				w1 = 0
				h1++
			}
		}
	}
	f, _ := os.Create("data/hash.png")
	defer f.Close()
	png.Encode(f, img)
}

/* debug purpose code */

func (c *Cache) Dump() {
	for ib := range c.data {
		fmt.Printf("%6d ", ib)
		for ik := range c.data[ib] {
			fmt.Print(hex.EncodeToString(c.data[ib][ik][:2]), " ")
		}
		fmt.Println()
		//if ib > 24 {
		//	fmt.Println("...")
		//	break
		//}
	}
}

from cfilter.

irfansharif avatar irfansharif commented on May 27, 2024

No longer maintaining this.

from cfilter.

Related Issues (8)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.