Comments (3)
Hi @ben-manes , thank you for the detailed post!
I think it's a good idea to add the support of hill climbing window size change into W-TinyLFU. It's unlikely that we can implement it any time soon since TinyLFU does not have a lot of use case in Facebook. But we are more than happy to review if you can send a pull request. We can also help supply traces for evaluation in cachebench.
Regarding the additional suggestions:
- CacheLib supports swappable CMS implementations down to 8 bits. For 4 bits, it is quite a bit of complexity compared to the saved memory (of course this varies by systems). On the other hand I'm curious about bloom filter. Is there evidence showing bloom filter can be as good as, or achieves a good tradeoff compared to CMS? CacheLib does have a bloom filter implemented and we can add a type parameter to have the user specify whether they want to use bloom filter or which CMS.
- The way CacheLib DRAM LRUs handle hot updates is to set an interval of lruRefreshTime. Accesses on the same key occurring within this time interval won't update the policy more than once. I think a ring buffer can complement this but we are able to get around with this time interval.
- Currently CacheLib DRAM cache is slab based. So to adopt to size-aware memory management would probably be a very long term change that we need to look carefully into. Or did I misunderstand your point here?
- I think this could be added as an option in TinyLFU.
- Absolutely!
It's nice to see you here Ben! I also worked at Addepar a while back on Greenbaums team and had the chance to take over a lot of your code in the financial graph.
from cachelib.
It's unlikely that we can implement it any time soon... But we are more than happy to review if you can send a pull request.
Unfortunately I am not much of a C++ programmer, so I'd leave this to someone else if interest. What is nice is that this avoids needing developers to think about the eviction policy, since those differences mostly goes away.
On that note, your 2Q policy is misnamed. The algorithm by Johnson & Shasha uses a non-resident queue, whereas cachelib implemented OpenBSD's TU-Q policy named after Ted Unangst.
- I'm curious about bloom filter. Is there evidence showing bloom filter can be as good as, or achieves a good tradeoff compared to CMS?
This is actually in addition to the CMS. The idea is to avoid polluting the CMS with one-valued counters, which increases the error rate and requires more counters to compensate. Instead the CMS counters are only incremented if found in the BF. For large caches this reduces the CMS space, which at 100Ms+ entries could be a benefit. Caffeine doesn't use this trick, though, because as an application cache it is usually small enough to not be worth the trouble.
- The way CacheLib DRAM LRUs handle hot updates is to set an interval of lruRefreshTime.
Ahh, thanks for pointing that out. This is what memcached does too. A flaw is that it is coupled to the entry, so operations like row scans would trigger many tryLock attempts. The docs do discuss lock contention problems, so if that does become a problem again then that is a technique to consider.
- Currently CacheLib DRAM cache is slab based. So to adopt to size-aware memory management would probably be a very long term change that we need to look carefully into. Or did I misunderstand your point here?
Nope, thanks for pointing that out.
It's nice to see you here Ben! I also worked at Addepar a while back on Greenbaums team and had the chance to take over a lot of your code in the financial graph.
ha, what a mess. I hope you don't hold that against me, I inherited much of it. My "starter project" was to make the graph distributed instead of running all of Addepar on one big machine. I created a transaction log, replayed to "catch-up" the nodes to allow for snapshot reads, a persistent data structure for concurrent readers, and used select-for-update to exclusively row-level lock the graph during a write transaction. Since everything was dependent on that logic in a large code base, but which was designed as a SPOF, that solution gave enough breathing room to allow for an eventual rebuild. I hope by now that situation is a lot better. Anyway, small world! 🙂
from cachelib.
I'll leave this open then.
The trick of bloom filter on top of CMS is interesting. We do have the interest to reduce the overhead of how access history is tracked in another application of CMS.
Anyway, small world! 🙂
Sure it is! AMP is a piece of art. Let's leave it there.
from cachelib.
Related Issues (20)
- Fail to build dependency fbthrift (with errors reported in fmt) HOT 5
- make clean option for contrib/build.sh HOT 1
- build error about fizz on ubuntu22.04 HOT 2
- CDN trace expected behavior HOT 2
- Enable FDP for CacheBench HOT 26
- qDepth Support for NVM Cache HOT 6
- Questions about trace files when running cachebench HOT 2
- Running simple-cache-example gives an error, flag 'v' was defined more than once HOT 6
- OSS build broken as of May, 2024 -> PRs are all blocked HOT 7
- No build support for Fedora37 OS HOT 4
- failed to build CacheLib following document HOT 2
- Build fails on debian-10 HOT 2
- Segmentation fault while fetching refcount HOT 2
- Minimum Limit For Cache Allocation? HOT 1
- build failed when building dependency 'fbthrift' HOT 3
- Build issue with CacheLib with missing source files HOT 3
- [Seeking Volunteers] Add new builds to CacheLib HOT 2
- Build is failing with error: ‘fmt::v10::detail::type_is_unformattable_for<const facebook::cachelib::navy::Status, char> _’ has incomplete type 1600 | type_is_unformattable_for<T, typename Context::char_type> _; | ^ HOT 4
- NavySetup should not involve the MockDevice
- How to configure folly for logging CacheLib
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from cachelib.