
Comments (12)

ben-manes commented on June 18, 2024

I had forgotten about that detail myself when reading the Guava bug. There are two options I can think of.

  1. Use AsyncLoadingCache. The load is immediate from the map's perspective since it stores a CompletableFuture as the value. An invalidation will not block, and the listener will be invoked after the future's computation has completed.
  2. Store the loading keys in an auxiliary set. This could be done internally using the compute methods or externally with a CacheLoader. It may be too error-prone to store the full key set, because then we'd have to ensure that compute is used to mimic and instrument puts, removes, etc.

You might be able to get away with #1 or implement #2 as a decorator. I'm hesitant to implement #2 internally, at least until it's proven problematic for enough people.
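
For reference, a minimal sketch of option #1, assuming a plain Caffeine builder; slowLookup and onRemoved are hypothetical stand-ins for your own loader and listener:

import com.github.benmanes.caffeine.cache.AsyncLoadingCache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.RemovalCause;
import java.util.concurrent.CompletableFuture;

// slowLookup and onRemoved are placeholders for your own code.
AsyncLoadingCache<String, String> cache = Caffeine.newBuilder()
    .removalListener((String key, String value, RemovalCause cause) ->
        onRemoved(key, value, cause))
    .buildAsync(key -> slowLookup(key));

// The map stores the CompletableFuture as soon as the load begins, so this
// does not block on an in-flight computation; the listener fires once the
// future completes.
cache.synchronous().invalidateAll();

// Callers can still obtain and wait on a pending or fresh future.
CompletableFuture<String> value = cache.get("some-key");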

Thoughts?

txshtkckr commented on June 18, 2024

Obviously it only matters when there is a listener, but we will always have one for our use case because we are mapping events back to another API that allows dynamic registration of its listeners. Some options that occur to me:

  1. Use a read/write lock. This is what I'm using to work around the issue in my spike: the cache loader acquires the read lock, and a finally block around the get releases it if the current thread holds it. The wrapper for invalidateAll acquires the write lock. I can do this without changing the underlying implementation, so that's what I'm going with for my spike (see the sketch after this list). The main drawback is that it's pretty sloppy to have the cache loader acquire a read lock that code outside the get has to release. This works, but it is gross, and I don't know how well it will handle high lazy-load concurrency.
  2. Decorate map items with a version number and bump that version number in removeAll. The decorated loader would read the version number before calling its delegate, and the cache would return values that include the version so that get(K) can verify it. The drawback here is that the underlying datatype being stored has to change to something that includes the version; otherwise only the thread performing the load() call will know what version applies to the loaded value, while other threads that merge into the computation may see the newer version number but still get the stale value.
  3. Hot-swap the cache's underlying storage by making data an AtomicReference. The implementation of removeAll would create a new ConcurrentHashMap and use getAndSet to replace the current value of data before starting the cleanup process. The code that does the loading would have to check that the data reference hasn't changed before returning, because if it has, then it will need to send a removal notification for the value that it just loaded. It seems hard to avoid races here, but since both removeAll and the failed-to-put-back cases should be rare, using heavier synchronization around just those paths could be doable.
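
A simplified sketch of the read/write-lock decorator from option #1. Here the read lock guards the whole get rather than being taken inside the loader, which sidesteps the lock-ordering subtleties; GuardedCache is an invented name and the Caffeine Cache.get(key, mappingFunction) usage is illustrative:

import com.github.benmanes.caffeine.cache.Cache;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

// GuardedCache is an illustrative name, not part of any library.
final class GuardedCache<K, V> {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true); // fair
  private final Cache<K, V> delegate;
  private final Function<K, V> loader;

  GuardedCache(Cache<K, V> delegate, Function<K, V> loader) {
    this.delegate = delegate;
    this.loader = loader;
  }

  V get(K key) {
    lock.readLock().lock(); // a load in flight holds this until the get completes
    try {
      return delegate.get(key, loader);
    } finally {
      lock.readLock().unlock();
    }
  }

  void invalidateAll() {
    lock.writeLock().lock(); // waits for in-flight gets/loads to drain
    try {
      delegate.invalidateAll();
    } finally {
      lock.writeLock().unlock();
    }
  }
}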

I haven't looked at the async option, yet. If other threads can wait on that CompletableFuture then that sounds promising.

ben-manes commented on June 18, 2024

For the decorator approach, I was thinking something like,

Set<K> loadingKeys = ConcurrentHashMap.newKeySet();

invalidateAll() {
  delegate.invalidateAll();
  for (Iterator<K> it = loadingKeys.iterator(); it.hasNext();) {
    delegate.invalidate(it.next()); // blocks
    it.remove();
  }
}

// CacheLoader or other computes (atomic)
loadingKeys.add(key);
V value = /* compute */;
loadingKeys.remove(key);
return value;

Regarding the async cache, it has many nice properties that could make this simpler. If you don't need to block until each pending load completes, just call invalidate through the LoadingCache view. If you need to wait, then that would require additional methods on the AsyncLoadingCache view, which can be added.

txshtkckr commented on June 18, 2024

Ahhh, I didn't think about whether or not invalidate would block on the currently loading keys, but I suppose an explicit removal on a key works regardless. That raises a few more questions, though:

  1. How would the explicit secondary ConcurrentHashMap for tracking the loading keys compare to the read/write lock in terms of overhead? My understanding is that ReentrantReadWriteLock uses ThreadLocal storage to keep track of read-lock ownership, and having additional entries in the ThreadLocal map for every one of these might be unacceptable.
  2. Is it strictly true that the loader is never entered from more than one thread for a given key, or can interactions between load and remove lead to multiple threads loading the same key concurrently, in which case the loadingKeys set wouldn't be good enough?
  3. Most importantly, what you've described still has a small race between loadingKeys.remove(key) and the reservation node getting updated. If a removeAll runs during that window, it will not see the loading-key entry even though the value being loaded is stale and will survive. It wouldn't be safe to call loadingKeys.remove(key) until after computeIfAbsent has returned.

Now that I think about it, though, it occurs to me that you are already tracking what I need in order to know whether to call remove on the loading keys: to record statistics, you already pass in a boolean[1] called missed so it can record whether or not the loader got used, and that's exactly what I need to know to decide whether to unlock, remove from loadingKeys, or whatever...

ben-manes commented on June 18, 2024
  1. The key set avoids latency fluctuations, which are more evident with a fair r/w lock, where new loads must wait for the writer's turn to acquire exclusive access.
  2. Yes, for a particular cache instance. This is ensured by ConcurrentHashMap. Unfortunately, this can't be guaranteed by ConcurrentMap, which is one reason why the map is not configurable.
  3. You're right; I forgot that even issuing another delegate.invalidateAll() to catch the race wouldn't work. Moving the key removal outside of the loader is the only safe approach.

I had to do some similar hacks for the JCache adapter (see CacheProxy) due to the requirement that listeners receive events in order for an entry. The spec was vague regarding whether a removal followed by insertion of a key had to be in order, though, so for simplicity I didn't enforce that. The other JCache implementations use fat locks around the cache and invoke the listeners synchronously (whereas I have dispatch queues via EventDispatcher). I can't tell whether any of that is similar to how your internal system behaves, but it may be worth a quick glance.

txshtkckr commented on June 18, 2024

In my spike (and in my workaround for Guava as well) I went the fair R/W lock route. While this does mean that new loads have to wait, this should be offset by the rarity of doing a removeAll in the first place -- after all, if removeAll is something you're doing frequently, then why bother with the cache?

But I agree that the ConcurrentMap approach with loadingKeys.remove moved outside is worth a look, though instead of using it like a set, I think it would have to be a ConcurrentMap<K,Long>, with removal done via loadingKeys.remove(key, Thread.currentThread().getId()). I don't love the idea of throwing an extra native call in there, but the R/W lock would do one as well, and the point is that you need to know whether or not this thread is the one that started the request. The main case I worry about is this rather unfortunate scheduling coincidence:

Thread 1                     Thread 2
                             computeIfAbsent
                             loadingKeys.add
computeIfAbsent
computeIfAbsent returns
loadingKeys.remove
invalidate
computeIfAbsent
loadingKeys.put
                             computeIfAbsent returns
                             loadingKeys.remove
                             invalidateAll

Thread 1's concurrent load doesn't get invalidated because Thread 2 removed a loadingKeys entry that didn't belong to it. sigh
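
For concreteness, the thread-id variant would look roughly like this (a sketch, with String keys just for illustration and key assumed to be in scope):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

ConcurrentMap<String, Long> loadingKeys = new ConcurrentHashMap<>();

// At the start of the load (inside the compute), record which thread began it.
loadingKeys.put(key, Thread.currentThread().getId());

// After computeIfAbsent returns: the two-argument remove(key, value) only
// succeeds if the mapping still belongs to this thread, so a stalled thread
// cannot discard an entry written by a later load.
loadingKeys.remove(key, Thread.currentThread().getId());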

ben-manes commented on June 18, 2024

Makes sense. Sorry, I should have been more careful when writing up the loadingKeys idea. There are many subtle details to work around, such as the remove being stalled long enough for an evict/load cycle (which I think is what you are trying to show). For your case a r/w lock is simplest. If I were pushed to handle this internally, I'd have to fully explore the key approach to see whether it made more sense.

I'm glad you have a working solution.

txshtkckr commented on June 18, 2024

Another detail....

invalidateAll() {
  delegate.invalidateAll();
  for (Iterator<K> it = loadingKeys.iterator(); it.hasNext();) {
    delegate.invalidate(it.next()); // blocks
    it.remove();
  }
}

For correctness, I think you have to do this in the other order -- that is, you should call delegate.invalidateAll() after iterating, not before it.
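
That is, roughly:

invalidateAll() {
  for (Iterator<K> it = loadingKeys.iterator(); it.hasNext();) {
    delegate.invalidate(it.next()); // blocks until the pending load settles
    it.remove();
  }
  delegate.invalidateAll();
}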

txshtkckr commented on June 18, 2024

No worries. It's admittedly a very hard problem, and the fact that all of the technologies we have already been using (Guava, EhCache, and Hazelcast) also get this concurrency case wrong in one way or another should be some indication of just how hard it is to get these things right.

ben-manes commented on June 18, 2024

Yep, I saw that too. Sorry, I was throwing the idea out, not an implementation, so it was off-the-cuff rather than a validated alternative.

ben-manes commented on June 18, 2024

I started to look into this and think about how CacheWriter would interact with this improvement. Unfortunately all of ConcurrentHashMap's internal iterator methods, like removeIf and forEachKey, skip over computing entries.

I think this can only be done safely using a secondary set of all present keys. This is just the loadingKeys discussion without the race because the keys are not actively discarded.

The key set is only modified within a compute block, so all insertions and removals must be done through a compute method. This avoids the races mentioned earlier by always relying on the cache's hash-table locks. The key is added to the set at the beginning of a load and, if that load fails, the key is removed before completing.

An invalidateAll() walks our key set instead of data.keySet(), using a compute-based removal. If a race means the entry no longer exists, nothing extra is needed; otherwise a removal notification and CacheWriter.delete() are triggered.

The cost of this approach is a slight overhead for writes and the extra memory used to retain an additional ConcurrentHashMap-based set. Neither of those is prohibitive, though it seems unfortunate to need an extra map just to cover a race condition. Still, I think that's better than the alternatives.
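
A sketch of the shape I have in mind, using a plain ConcurrentHashMap to stand in for the cache's hash table (the names here are illustrative, not the real internals):

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class PresentKeysSketch<K, V> {
  // Mirrors the cache's key universe; only mutated inside compute blocks,
  // so the hash table's per-bin locks order it against loads and removals.
  private final Set<K> presentKeys = ConcurrentHashMap.newKeySet();
  private final ConcurrentHashMap<K, V> data = new ConcurrentHashMap<>();
  private final Function<K, V> loader;

  PresentKeysSketch(Function<K, V> loader) {
    this.loader = loader;
  }

  V get(K key) {
    return data.computeIfAbsent(key, k -> {
      presentKeys.add(k); // visible to invalidateAll while still loading
      try {
        return loader.apply(k);
      } catch (RuntimeException e) {
        presentKeys.remove(k); // a failed load removes the key
        throw e;
      }
    });
  }

  void invalidateAll() {
    for (K key : presentKeys) {
      // compute blocks on an in-flight load of this key, then discards it
      data.compute(key, (k, v) -> {
        presentKeys.remove(k);
        if (v != null) {
          // fire the removal notification / CacheWriter.delete() here
        }
        return null; // remove the mapping
      });
    }
  }
}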

ben-manes commented on June 18, 2024

When I caught up with Doug last year, his response to this was:

The desire to do this is racy, even using clear, which non-atomically traverses. So there's only weak consistency wrt arriving/ongoing/completed computeIfAbsent's etc. In principle, CHM could offer a method with similar properties, but it still wouldn't provide a simple-to-state guarantee.

The only solution I know is for users themselves to arrange quiescent points if they need a strict demarcation. (And I wonder if this desire about invalidateAll obscures assumptions they might have about consistency.)

Given his hesitation and my own regarding strengthening invalidateAll, I think the current behavior is the best approach and should be better documented, since it can be adequately worked around in user code, as you've demonstrated. I'll close this when I've improved the documentation.
