Comments (9)
tested with https://github.com/spaolacci/murmur3
result
entries: 12417, evacuated: 0, overwritten: 0
avg_value_len: 43.480773
bytes per item: 84.44680679713296 - 43.480773 - 16 = 24.966033797132965
and is about (non-scientifically) 20% faster than md5
from freecache.
Hm... actually tested it some more. And while a better hash function does solve the issue in single-threaded case, it's still very bad in multithreaded case.
With 10 goroutines, all inserting same keys, but slightly different values (so a zillion of evacuates happens) - overhead per item still grows to ridiculous values.
for example:
same code as above gist, but changed the following params
const cache_size = 10 * 1024 * 1024 // was 1 megabyte
n_keys := 10 * 1000 * 1000 // was 1 million
here's the result with: const n_routines = 10
entries: 42078, evacuated: 120957842, overwritten: 77988935
avg_value_len: 43.49973858
bytes per item: 249.1981558058843 - 43.49973858 - 16 = 189.69841722588433
vs const n_routines = 1
entries: 125323, evacuated: 0, overwritten: 0
avg_value_len: 43.4929523
bytes per item: 83.6698770377345 - 43.4929523 - 16 = 24.176924737734495
from freecache.
Thank you for your intensive testing on it.
So it seems like the memory utilization issue is not caused by hash function.
The difference of entry overhead seems too big for the difference of hash function distribution quality.
from freecache.
I think this is normal.
If you set n_routines to one, then every key is new, there is no overwrites, the entry length is exactly (24+keyLen+valueLen).
If you set n_routines to 10, the entry can be overwritten.
If the new value is larger than the old entry's capacity, it won't be updated in place, then the old entry's space is not utilized.
If the new value is smaller than the old entry's capacity, then there would be an in-place update, (capacity - valueLen) of space is not utilized.
Whenever the capacity is smaller than the new value, the capacity is doubled, so when you keep increasing the size of the new value, it won't keep writing new entries, most of the update would be in place update.
from freecache.
Even in single threaded case with fnvhash - utilization is poor. With zero overwrites/evictions. might be wrong stats i guess ? since with avg value of 43 bytes and key 16 bytes, we should be able to fit all the values into 1 meg.
from freecache.
I will try to find the answer tomorrow, you can print out every segment's entry count to see how it is distributed .
from freecache.
hacked it together for 1 mil items, 1 goroutine, 1 meg cache
data gathered with
n := atomic.LoadInt64(&cache.segments[i].entryCount)
fmt.Printf("%d %d\n", i, n)
fnvhash
antoxa@antoxa-suse:~/_Dev/go/src/antoxa> cat 1.txt | awk '{ print $2 }' | sort -n | uniq -c
384 0
2 470
4 479
6 480
4 481
2 482
10 483
2 484
6 485
4 486
6 487
14 488
12 489
6 490
12 491
2 492
8 494
4 495
4 496
6 497
4 499
4 501
4 503
2 508
murmur3
antoxa@antoxa-suse:~/_Dev/go/src/antoxa> cat 1.txt | awk '{ print $2 }' | sort -n | uniq -c
4 471
2 473
2 475
6 477
8 478
12 479
12 480
20 481
6 482
26 483
14 484
20 485
24 486
24 487
18 488
40 489
30 490
24 491
18 492
44 493
44 494
18 495
18 496
14 497
20 498
10 499
12 500
2 501
10 502
4 503
2 504
2 505
2 508
from freecache.
Thank you for the great work to address this issue. The hash function definitely need to change.
from freecache.
Changed to MD5 in the latest commit 28070c0
from freecache.
Related Issues (20)
- freecache why not introduce singleflight for hot key issue? HOT 3
- Set support time.Duration as expire data type
- why `Get` sometimes become slow HOT 4
- HI, is it to delete element in iterator? HOT 1
- How to get total cache size in use? HOT 1
- 您好,请问freecache如何解决哈希倾斜问题 HOT 3
- a big bug? HOT 4
- Freecache with Disk persistence?
- GetOrSet doesn't return cache miss message HOT 3
- why entry use array instead of golang map for storage HOT 2
- TTL - difference between "no expiration" and 0 expiration. HOT 2
- NewIterator Output ExpiresAt HOT 5
- Incorrect cache duration due to a precision issue. HOT 7
- How can I get notice when a key expired? HOT 1
- EntryCount didn't update? HOT 8
- Coocoo may God grant your heart desire give you long life prosperity
- Extended Documentation request
- Need clarity around memory profile
- Peek() returns expired values HOT 2
- A little question about ringbuf
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from freecache.