Comments (14)
Thanks. I won't be able to take a look at this until tonight.
Most of the unit tests use a Ticker
to manipulate time. Can you verify that it fails with that, too? If not then it would indicate an unexpected visibility problem.
from caffeine.
Oh, this is the async cache. Sorry, I need to play with the code to get a sense of what is going on. I just reproduced it failing and see it runs if I use a direct executor but fails on a single-thread executor. So definitely some race, but not sure what yet.
from caffeine.
Yes, that's async cache. Not sure how it behaves on a blocking cache as we do not use it.
Also whether the future is immediate or real one does not make any difference as we have the same issue with true futures.
So as far as I understood you do not need ticker based test anymore?
from caffeine.
Nope, that was my misreading. Since it runs fine with a direct executor and fails with a single-threaded, its a race. I have to verify your code is valid because sleeping and waking are not exact time durations. The big question is whether the race is valid or not.
So my first step will be to strengthen the visibility to see if it passes. Right now reads and writes to the timestamp piggy back on other volatiles generating our memory barriers to avoid excessive cpu stalls.
from caffeine.
I took a look and here's my current guess. I haven't been able to validate it yet, so hopefully I'm not too far off base and will have a patch released by the end of the weekend.
When a write to key1 occurs, it triggers a cache maintenance cycle (on the executor) to evict/expire/reorder entries in the policy's linked lists. The expiration check is racy, as key2 could have expired but then be resurrected by a concurrent update. The atomic removal (via computeIfPresent
) doesn't validate that the entry is still expired, though it probably should. This race was assumed to be rare, so one or two extra evictions wouldn't be noticeable. That may have been a bad assumption if this happened often enough for you to investigate and write a test that triggered the race reliably.
I verified that the tests pass if the maintenance cycle is disabled. I haven't been able to get a hack that resolves the race to work reliably yet, which would let me then fix it properly. So I might either be tired or there is a different issue at play. Regardless I hope to figure it out, fix it, and have a release by the end of the weekend.
from caffeine.
Thank you for looking at it promptly. Yes, the issue seems to be common at least with our use case, where we deal with a few keys used side by side within a short period of time. We noticed it by observing our Cassandra queries being way too more frequent than expected, which immediately raised a question about cache layer not working properly.
from caffeine.
I was exhausted last night and flipped a gt / lt sign which is why my quick hack of a fix didn't work. It was Yom Kippur and I was out late when we broke the fast.
So the above rational is correct. If I hack BoundedLocalCache#evictEntry
to use the following your tests pass. I should be able to clean this up, wrap your code into an integration test, and have a release out in the timeframe I promised.
Thanks for doing the detective work to isolate the bug and report it.
data.computeIfPresent(node.getKeyReference(), (k, n) -> {
if (actualCause == RemovalCause.EXPIRED) {
long now = expirationTicker().read();
if (expiresAfterWrite()) {
long expirationTime = now - expiresAfterWriteNanos();
if (n.getWriteTime() >= expirationTime) {
resurrect[0] = true;
return n;
}
}
}
if (n == node) {
writer.delete(key, value, actualCause);
removed[0] = true;
return null;
}
return n;
});
if (resurrect[0]) {
return;
}
from caffeine.
I have the fix ready and refactored your tests a little. Because this relies on exact timings it can't run reliably as part of the build / CI yet. There is enough other work (concurrent tests, shared CI) that the timing thresholds can be exceeded. I'll fix this later by isolating the test in its own build step, though that will probably only run locally because the CI is too heavily loaded.
I'll get the next release put together tonight.
from caffeine.
v1.3.3 released
from caffeine.
Oh, also note that the git history looks a little weird since I had to rebase master on top. That was to isolate it as a direct patch to the previous release. The current work has been on v2 which will introduce an new eviction policy. I didn't want that to go out prematurely
from caffeine.
Maven Central sync is broken MVNCENTRAL-840...
from caffeine.
all good. cache is now behaving ok with v1.3.3.
from caffeine.
great! Thanks for the update
from caffeine.
@yurgis2
If you're using size based eviction, too, it would be interesting to hear your feedback when upgrading to v2's W-TinyLfu policy.
from caffeine.
Related Issues (20)
- Some `Expiry` implementations return `currentTime + desiredLength` instead of `desiredLength` HOT 7
- RemovalListener not always called in unit test. HOT 5
- Mark cache entries as non-evictable HOT 2
- refreshAll feature for CacheLoader HOT 3
- Is the version that supports JDK8 no longer updated? HOT 1
- notifyRejected: com.github.benmanes.caffeine.cache.BoundedLocalCache$BoundedLocalLoadingCache HOT 2
- Too many simple repetition of "multi put, multi get" results in immediate item eviction HOT 5
- Version 3.1.8 is not compatible with jdk21 HOT 2
- schedule() is not running exactly after expiry of each entry, it works only for first entry expiry HOT 4
- Race condition in computing write timestamp HOT 2
- Values currently being loaded shows up as `null` entries in `synchronous().asMap()` HOT 9
- Oldest Eviction option HOT 2
- When maximumValue and expireAfterAccessTime is used which is honored. HOT 1
- Compute BenchMark results comparison between Caffeine and ConcurrentHashMap HOT 3
- The GraalVM image building is successful, however, the execution fails HOT 2
- CaffeineCache.get results in SIGSEGV error HOT 1
- can support subkey like the hash of redis? HOT 1
- AsyncCache uses the configured ticker instead of the system ticker for stats when unbounded HOT 11
- Exception from weigher is ignored in an AsyncCache HOT 2
- [fr]can support the nested key HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from caffeine.