ben-manes / caffeine
A high performance caching library for Java
License: Apache License 2.0
Caffeine: 2.1.0
JUnit: 4.11
Gradle: 2.4
Java: 1.8.0_72
Apologies if I am doing something peculiar or silly, but I've been tripping over the following cache error for a while now in a unit test:
--setting up cache--
Cache<String, Profile> profileDefinitionCache =
    Caffeine.newBuilder()
        .maximumSize(1)
        .build();
and later...
profileDefinitionCache.put(PROFILES_CONFIG_KEY, new Profile("Some text"));
throws the following stack:
java.lang.ArrayIndexOutOfBoundsException: 1
at com.github.benmanes.caffeine.cache.Ticker.disabledTicker(Ticker.java:52) ~[caffeine-2.1.0.jar:na]
at com.github.benmanes.caffeine.cache.BoundedLocalCache.expirationTicker(BoundedLocalCache.java:336) ~[caffeine-2.1.0.jar:na]
at com.github.benmanes.caffeine.cache.BoundedLocalCache.putFast(BoundedLocalCache.java:1337) ~[caffeine-2.1.0.jar:na]
at com.github.benmanes.caffeine.cache.BoundedLocalCache.put(BoundedLocalCache.java:1296) ~[caffeine-2.1.0.jar:na]
at com.github.benmanes.caffeine.cache.LocalManualCache.put(LocalManualCache.java:64) ~[caffeine-2.1.0.jar:na]
I realise my cache definition is quite spare, but I just wanted a simple, thread safe cache containing one item that never expires. Any ideas?
NOTE: this doesn't fail in the IDE, only on the command line.
There is a common usage pattern for AsyncCaches that is currently not well supported: returning incomplete results if the computation can't finish for all keys within some timeout. Getting 100% of the data in 100ms is best, but it is often better to get 90% of the data in 100ms than to wait 1 second for 100% of the data. Since the computation is expensive, though, you still want to finish it and cache the result for future accesses.
Currently I solve that with ugliness like this in the loader (V is the cache's value type):

public CompletableFuture<Map<String, V>> asyncLoadAll(
    final Iterable<? extends String> keys, final Executor executor) {
  return CompletableFuture.supplyAsync(() -> {
    final CompletableFuture<Map<String, V>> future = bulkGet(keys);
    try {
      return future.get(timeout, MILLISECONDS);
    } catch (InterruptedException | ExecutionException | TimeoutException e) {
      LOG.error("Error while fetching data for the cache", e);
      // let the computation finish in the background and cache the late results
      future.whenComplete((map, ex) -> {
        if (map != null) {
          localCache.synchronous().putAll(map);
        }
      });
      return Collections.emptyMap();
    }
  }, executor);
}
but I can't help but feel like there should be a better solution.
This:
@Test
public void throwsNullPointerException() {
  LoadingCache<String, Object> cache = Caffeine.newBuilder()
      .maximumSize(10000)
      .build(key -> null);
  cache.get("NullKey");
}
Throws:
Exception in thread "ForkJoinPool.commonPool-worker-1" java.lang.NullPointerException
at com.github.benmanes.caffeine.cache.BoundedLocalCache$RemovalTask.run(BoundedLocalCache.java:1131)
at com.github.benmanes.caffeine.cache.BoundedLocalCache.drainWriteBuffer(BoundedLocalCache.java:1025)
at com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:904)
at com.github.benmanes.caffeine.cache.BoundedLocalCache.cleanUp(BoundedLocalCache.java:888)
at com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$4/1020154737.run(Unknown Source)
at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
Disconnected from the target VM, address: '127.0.0.1:49629', transport: 'socket'
on shutdown. Is this desired behaviour or have I done something wrong above?
compile 'com.github.ben-manes.caffeine:caffeine:2.0.1'
Should the check on BoundedLocalCache line 1812:
if (node == null) {
afterWrite(null, new RemovalTask(removed[0]), now);
return null;
}
perhaps also check that removed[0] is not null? I.e.
if (node == null && removed[0] != null) {
afterWrite(null, new RemovalTask(removed[0]), now);
return null;
}
Decide on which approach to take and implement it. The arena should be lazily initialized and grow on contention to reduce the footprint in uncontested usages.
Optimistic w/ combiner mode
Once a producer can combine, it can no longer offer back to the arena on a CAS failure to the queue. The advantage is that no garbage is created as the arena holds only unlinked nodes.
Optimistic with dynamic mode
A producer can pivot between a combiner or transferrer on each failed CAS loop to the queue. A batch insert requires knowing both the first and last nodes forming a linked list to be appended to the queue. The arena slot must hold both references.
Linearizable
In this mode, a producer that transferred its node must wait for it to be added before completing the call. This requires a wrapper with a flag that the waiter spins on and that the combiner sets upon insertion.
I am getting the following exception adding items to cache. Is this an actionable error?
WARNING: Exception thrown when submitting maintenance task
java.util.concurrent.RejectedExecutionException: Queue capacity exceeded
at java.util.concurrent.ForkJoinPool$WorkQueue.growArray(ForkJoinPool.java:884)
at java.util.concurrent.ForkJoinPool.externalSubmit(ForkJoinPool.java:2354)
at java.util.concurrent.ForkJoinPool.externalPush(ForkJoinPool.java:2416)
at java.util.concurrent.ForkJoinPool.execute(ForkJoinPool.java:2645)
at com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:816)
at com.github.benmanes.caffeine.cache.BoundedLocalCache.afterWrite(BoundedLocalCache.java:799)
at com.github.benmanes.caffeine.cache.BoundedLocalCache.putFast(BoundedLocalCache.java:1360)
at com.github.benmanes.caffeine.cache.BoundedLocalCache.put(BoundedLocalCache.java:1296)
at com.github.benmanes.caffeine.cache.LocalManualCache.put(LocalManualCache.java:64)
The last line never terminates. Tested on Caffeine 2.2.1 and 2.2.2. Java 8u60 on Mac, El Capitan.
Cache<String, String> cache = Caffeine.newBuilder().build();
Function<String, String> loader2 = (String t) -> "3";
Function<String, String> loader1 = (String t) -> cache.get("2", loader2);
System.out.println(cache.get("1", loader1));
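The hang above is consistent with a re-entrant computation: Cache.get(key, fn) runs the mapping function inside a hash-table computation, much like ConcurrentHashMap.computeIfAbsent, so a loader that calls back into the cache can block on the bin it already holds. A minimal sketch of the hazard and the safe ordering using the plain JDK map (names here are illustrative):

```java
import java.util.concurrent.ConcurrentHashMap;

public class RecursiveLoadDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();

        // Hazard: computeIfAbsent("1", k -> map.computeIfAbsent("2", ...))
        // re-enters the map while a computation is in progress and may hang
        // or fail, just like the nested cache.get calls above.
        // Safe pattern: resolve the dependency before the outer computation.
        String inner = map.computeIfAbsent("2", k -> "3");
        String outer = map.computeIfAbsent("1", k -> inner);

        System.out.println(outer); // prints "3"
    }
}
```

The general rule is that a loader should never call back into the same cache; compute the dependency first and pass the result in.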
I am putting together some new Java Caching benchmarks. During a benchmark run I got a strange OOM exception with Caffeine (Version 2.1):
java.lang.OutOfMemoryError: GC overhead limit exceeded
at com.github.benmanes.caffeine.SingleConsumerQueue$$Lambda$4/2122912182.apply(Unknown Source)
at com.github.benmanes.caffeine.SingleConsumerQueue.offer(SingleConsumerQueue.java:231)
at com.github.benmanes.caffeine.SingleConsumerQueue.add(SingleConsumerQueue.java:255)
at com.github.benmanes.caffeine.cache.BoundedLocalCache.afterWrite(BoundedLocalCache.java:796)
at com.github.benmanes.caffeine.cache.BoundedLocalCache.putFast(BoundedLocalCache.java:1360)
at com.github.benmanes.caffeine.cache.BoundedLocalCache.put(BoundedLocalCache.java:1296)
at com.github.benmanes.caffeine.cache.LocalManualCache.put(LocalManualCache.java:64)
at org.cache2k.benchmark.thirdparty.CaffeineCacheFactory$MyBenchmarkCacheAdapter.put(CaffeineCacheFactory.java:92)
at org.cache2k.benchmark.thirdparty.CaffeineCacheFactory$MyBenchmarkCacheAdapter.put(CaffeineCacheFactory.java:69)
at org.cache2k.benchmark.jmh.suite.noEviction.symmetrical.PopulateParallelOnceBenchmark.populateChunkInCache(PopulateParallelOnceBenchmark.java:47)
at org.cache2k.benchmark.jmh.suite.noEviction.symmetrical.generated.PopulateParallelOnceBenchmark_populateChunkInCache_jmhTest.populateChunkInCache_ss_jmhSt
at org.cache2k.benchmark.jmh.suite.noEviction.symmetrical.generated.PopulateParallelOnceBenchmark_populateChunkInCache_jmhTest.populateChunkInCache_SingleSh
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:430)
at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:412)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
The interesting thing is that the heap limit was 2G, while the memory consumption when everything is inserted successfully is only about 900MB. So there is no real reason for an OOM.
The setup used is described here: cruftex.net: Java Caching Benchmarks 2016 - Part 1. The benchmark that failed was the "Populate Parallel Once" benchmark.
The full log of JMH is available at: http://cruftex.net/2016/03/16/result-thirdparty.CaffeineCacheFactory-ReadOnlyBenchmark-4.out.gz
It could be difficult to reproduce. During my runs this effect happened once. I will do regular runs with the latest Caffeine and JVM versions in a couple of days and report back.
Caffeine's default expiration policy doesn't conform to the JSR107 standard, but despite that, I'd prefer to use it over the JSR107 compliant one.
It looks like, currently, the CacheFactory implementation, whilst resolving, will not configure a cache from the application.conf unless it is specifically named in its own config object.
Is there a way to get a cache using the Caffeine expiration policy using only jsr107 methods during construction?
Can you give some examples of the strategy for the config file? You refer to https://github.com/eishay/jvm-serializers/wiki for 3rd-party serializers.
For example, if we use Protobuf/Protostuff etc., which classes can we use? Or do we need to implement something?
Any example would be good.
If you do not return a value from a test method, the JVM is free to perform dead-code elimination optimizations.
All the benchmarks in the codebase that I have seen are subject to dead-code elimination.
For example :
https://github.com/ben-manes/caffeine/blob/master/caffeine/src/jmh/java/com/github/benmanes/caffeine/cache/FrequencySketchBenchmark.java#L57
Hi,
I'm trying to do:
Map<K,V> map = Caffeine.newBuilder()
.maximumSize(10)
.build()
.asMap()
but the compiler is not happy.
It works only if I split the operation in two: instantiate the cache, then call .asMap() on it.
I'm using 1.8.0_45
Tks
How does your own single consumer queue implementation compare to the two variants in https://github.com/real-logic/Agrona ?
As per metamx/druid-cache-caffeine#7, being able to properly unit test whether a Weigher's eviction is behaving as expected is not straightforward. This ask is to add documentation of best practices for how to ensure a project's cache Weigher is behaving as intended when used in the Caffeine cache.
Currently refreshAfterWrite and LoadingCache.refresh(key) are performed as blocking writes (computations). This was a simple approach to implement that does not obstruct reads, avoids duplicate in-flight calls, and postpones write expiration. However, it has a few drawbacks that a fully non-blocking version would not:
- ConcurrentHashMap locking at the bin level
- asyncReload(key, executor) does not use the provided executor
A fully asynchronous refresh must handle conflicts and notify the removal listener appropriately. A conflict occurs when the refresh completes and the mapping has changed (different or absent value). This is detectable because a reload computes a new value based on the key and old value. The refresh cannot know whether the change is benign or whether it would insert stale data.
In Guava the refresh clobbers or inserts the entry, but due to races dropping is the only safe default choice. That default is potentially wasteful, so the user should have the option to resolve the conflict instead. This can be done by adding a new default method to AsyncCacheLoader:
// Placeholder name, suggestions welcome!
default boolean retainOnRefreshConflict(K key, V currentValue, V oldValue, V newValue) {
return false;
}
The removal listener must be notified to avoid resource leaks. In Guava the removal cause is replaced, but that is intended as a user action rather than an automatic one. As we need to signal whether a reload replaced or dropped the value, to avoid confusion a refreshed cause should be added to the enum.
The post-processing of a reload will be implemented by attaching a whenComplete to the future returned by asyncReload. This avoids the unnecessary-thread case described above for AsyncLoadingCache. It is worth noting that cancelling the refreshing future when the entry is removed was rejected: unlike Future, in CompletableFuture the task is not interrupted to hint that it may stop, and the removal listener would still need to be notified if the value materialized.
The final detail is whether an in-flight refresh should postpone expiration. This is currently the case for expireAfterWrite (but not expireAfterAccess) due to reusing the entry's writeTime to avoid an extra field indicating that a refresh was triggered. A stricter expireAfterWrite would require an additional per-entry field and is more prone to a redundant load. For now we'll keep it as is, but be willing to change it if requested.
The above is from brainstorming with Etienne Houle @ Stingray (@ehoule).
@abatkin, @yrfselrahc, @lowasser might have opinions on this design.
Can you please add information to the benchmarks wiki page how to run the benchmarks?
I did try:
./gradlew jmh
And get:
...
FAILURE: Build failed with an exception.
Where:
Script '/home/jeans/ideaWork/caffeine/caffeine/gradle/jmh.gradle' line: 32
What went wrong:
Execution failed for task ':caffeine:jmh'.
jmh: includePattern expected
First of all thanks for this awesome project.
I may just be approaching this wrong, but I was wondering if there was any way to get computeIfAbsent-like behavior with Cache. I see the methods available in LocalCache, but I can't seem to get to them. I would like to compute the value for the cache only once in a highly concurrent environment, especially with expensive computations.
I could always put a guard on the cache object itself but I was hoping for something at the api level.
Thanks again!
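For what it's worth, the manual Cache does expose this via get(key, mappingFunction), which performs the computation atomically so concurrent callers for the same key block and the value is computed at most once. A small sketch (the size and names are illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class ComputeOnceDemo {
    public static void main(String[] args) {
        Cache<String, String> cache = Caffeine.newBuilder()
            .maximumSize(10_000)
            .build();
        AtomicInteger computations = new AtomicInteger();

        // get(key, fn) runs fn under the entry's computation, so the second
        // call finds the cached value and never invokes the function.
        String first = cache.get("k", key -> {
            computations.incrementAndGet();
            return "expensive-" + key;
        });
        String second = cache.get("k", key -> {
            computations.incrementAndGet();
            return "expensive-" + key;
        });

        System.out.println(first + " / " + second
            + " / computed " + computations.get() + " time(s)");
    }
}
```

The same atomicity is also reachable through cache.asMap().computeIfAbsent(key, fn), since the map view is a ConcurrentMap.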
As part of a benchmark I'd like to test the efficiency of the eviction algorithm. However, it seems Caffeine isn't strictly adhering to the maximumSize setting. E.g. one test runs with a random pattern of 1000 keys and a maximum cache size of 500, and yields a phenomenal hit rate of 97%. The benchmark is CaffeineBenchmark.benchmarkTotalRandom1000_500 at https://github.com/cache2k/cache2k-benchmark.
I tried cleanUp, but it seems not to trigger the eviction. Is there a way to force eviction? Any other ideas for doing a reasonable efficiency comparison?
Hi,
I think there is a bug with expirations.
When I create a cache and don't access the elements, expiration never happens.
But when I access the elements, it works.
public class TestExpiration {
  public static void main(String[] args) throws InterruptedException {
    RemovalListener<String, String> listener = new RemovalListener<String, String>() {
      @Override
      public void onRemoval(@Nullable String key, @Nullable String value, @Nonnull RemovalCause cause) {
        System.out.println(key + " " + value + " " + cause);
      }
    };
    int initialCapacity = 2;
    Cache<String, String> store = Caffeine.newBuilder()
        .expireAfterWrite(10, TimeUnit.SECONDS)
        .executor(ForkJoinPool.commonPool())
        .removalListener(listener)
        .recordStats()
        .build();
    while (true) {
      System.out.println(store.stats());
      Thread.sleep(1000);
    }
  }
}
The System.out.println(key + " " + value + " " + cause) in the removal listener is never called.
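Expiration in Caffeine is performed lazily during maintenance, which normally piggybacks on cache activity; with no reads or writes, nothing sweeps the expired entries. A sketch of forcing the sweep explicitly (the custom ticker below is just a test seam to avoid sleeping, and the same-thread executor makes the notification deterministic):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class ExpirationSweepDemo {
    public static void main(String[] args) {
        AtomicLong nanos = new AtomicLong(); // controllable clock
        Cache<String, String> store = Caffeine.newBuilder()
            .expireAfterWrite(10, TimeUnit.SECONDS)
            .ticker(nanos::get)      // read time from our clock
            .executor(Runnable::run) // deliver removal notifications synchronously
            .removalListener((String key, String value,
                com.github.benmanes.caffeine.cache.RemovalCause cause) ->
                    System.out.println(key + " " + value + " " + cause))
            .build();

        store.put("key", "value");
        nanos.addAndGet(TimeUnit.SECONDS.toNanos(11)); // move past the TTL
        store.cleanUp(); // force the maintenance cycle; the listener now fires
    }
}
```

In the polling loop above, calling store.cleanUp() alongside store.stats() would have the same effect.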
Using the cache in its ConcurrentMap form (.asMap()), when time-limited entries expire the .get() method returns null as it should, but the .putIfAbsent() method returns the old, expired value.
Consider the following example:
Cache<Integer, String> cache1 = Caffeine.newBuilder()
.expireAfterWrite(1, TimeUnit.SECONDS)
.build();
ConcurrentMap<Integer, String> cache2 = cache1.asMap();
cache2.put(1, "A");
Thread.sleep(2000);
System.out.println(cache2.get(1)); // Expect to see null (and see null)
System.out.println(cache2.putIfAbsent(1, "B")); // Expect to see null (and see "A")
Instead of chasing a moving target as producers add to the queue we can justify (IMO) representing a snapshot of the size at the time the request is made:
public int size() {
  LinkedQueueNode<E> curr = lvConsumerNode();
  final LinkedQueueNode<E> currProducerNode = lvProducerNode();
  int size = 0;
  // must chase the nodes all the way to the producer node, but there's no
  // need to chase a moving target
  while (curr != currProducerNode && size < Integer.MAX_VALUE) {
    LinkedQueueNode<E> next;
    while ((next = curr.lvNext()) == null) {
      // spin until the next link becomes visible
    }
    curr = next;
    size++;
  }
  return size;
}
As a simple example:
Caffeine<String, String> builder = Caffeine.newBuilder().recordStats();
builder = builder.weigher((String key, String val) -> key.length() + val.length());
The first builder assignment fails with: incompatible types: com.github.benmanes.caffeine.cache.Caffeine<java.lang.Object,java.lang.Object> cannot be converted to com.github.benmanes.caffeine.cache.Caffeine<java.lang.String,java.lang.String>
Caffeine<Object, Object> builder = Caffeine.newBuilder().recordStats();
builder = builder.weigher((String key, String val) -> key.length() + val.length());
fails with
incompatible types: inference variable K1 has incompatible bounds
equality constraints: java.lang.Object
upper bounds: java.lang.String,java.lang.Object
Caffeine builder = Caffeine.newBuilder().recordStats();
builder = builder.weigher((String key, String val) -> key.length() + val.length());
fails weigher assignment with incompatible types: incompatible parameter types in lambda expression
The following behaves as expected with assigning a weigher:
@SuppressWarnings("unchecked")
Caffeine<String, String> builder = (Caffeine)Caffeine.newBuilder().recordStats();
builder = builder.weigher((String key, String val) -> key.length() + val.length());
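A cast-free alternative is to supply the weigher in the same chained expression, so the builder's K/V parameters are inferred once for the whole chain rather than being fixed to <Object, Object> by an intermediate assignment. A sketch (note that a weigher also requires maximumWeight before build):

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class WeigherInferenceDemo {
    public static void main(String[] args) {
        // The explicit lambda parameter types pin the key/value types to
        // String, and the single expression carries them through to build().
        Cache<String, String> cache = Caffeine.newBuilder()
            .recordStats()
            .maximumWeight(1_000)
            .weigher((String key, String val) -> key.length() + val.length())
            .build();

        cache.put("a", "bc"); // weight 3, well under the maximum
        System.out.println(cache.estimatedSize());
    }
}
```

This compiles because weigher(...) narrows Caffeine<Object, Object> to Caffeine<String, String> in one inference pass; splitting the chain across assignments is what forces the incompatible intermediate types shown above.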
The manifest contains an error in the Import-package
statement.
What it is : Import-Package: sun.misc,resolution:=optional
What should be: Import-Package: sun.misc;resolution:=optional
note the comma vs semicolon. It currently requires sun.misc
, and optionally requests the default
package, and OSGi blows up over it. I'd submit a PR, but I couldn't figure out where the manifest was getting generated.
Consider improving the CacheLoader interface for the case where the underlying code/service has an async API.
Currently the async builder naturally expects a lambda (returning a value):
AsyncLoadingCache<Key, Graph> cache = Caffeine.newBuilder()
.maximumSize(10_000)
.expireAfterWrite(10, TimeUnit.MINUTES)
.buildAsync(key -> createExpensiveGraph(key));
The buildAsync method then calculates the value on the provided or default executor. However, there is also the case where the value is already available as a CompletableFuture from some underlying service.
Currently we can use future values if we implement a new class, but it could be done in a more user-friendly way.
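As a stopgap that avoids parking a thread just to adapt the future, the loader's asyncLoad can be overridden to hand back the service's future directly. A sketch, where serviceLookup stands in for the underlying async API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

import com.github.benmanes.caffeine.cache.AsyncLoadingCache;
import com.github.benmanes.caffeine.cache.CacheLoader;
import com.github.benmanes.caffeine.cache.Caffeine;

public class AsyncLoaderDemo {
    static CompletableFuture<String> serviceLookup(String key) {
        // stand-in for a service that already returns futures
        return CompletableFuture.completedFuture("value-" + key);
    }

    public static void main(String[] args) {
        AsyncLoadingCache<String, String> cache = Caffeine.newBuilder()
            .maximumSize(10_000)
            .buildAsync(new CacheLoader<String, String>() {
                @Override public String load(String key) {
                    throw new UnsupportedOperationException("async only");
                }
                @Override public CompletableFuture<String> asyncLoad(String key, Executor executor) {
                    return serviceLookup(key); // reuse the service's future directly
                }
            });

        System.out.println(cache.get("graph").join());
    }
}
```

A dedicated buildAsync overload accepting such a loader lambda would make this first-class, which is what this issue is asking for.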
At a high level, I want to cache a set of expensive computations for a set period of time (time-based, expireAfterWrite). I would also like to be able to proactively refresh keys (well, their values) that have been used "recently" (whatever that means) but are about to be expired/evicted. I don't want to refresh everything in the cache, since then the cache would grow forever.
For example: if I'm happy with ~1hr stale values, I can use LoadingCache and expireAfterWrite(60, TimeUnit.MINUTES). Everything works well, except that every hour or so, all of the "hot" keys suddenly take a long time to return (since they have been expired and now need to be recomputed the next time they are accessed).
This type of setup would be valuable when, for example, some of your keys are accessed frequently and repeatedly while others are only rarely used, and you want to avoid a short period of slow access to those "hot" values when they expire. In my case this is particularly bad: the common access pattern is such that most of the keys get loaded when the system initially starts up (i.e. all at the same time), so they all expire at the same time and then need to be reloaded at the same time. The startup delay for the initial population is long, and now I have to pay that same cost every N time units (where N is my expireAfterWrite() time) instead of being proactive about it.
One potential generic solution would be to add a pre-expire event which would run at a configurable time offset before any item is to be expired/evicted. The event would need access to the "last read timestamp" (and possibly other metadata?) of the key (in my case, so it could decide if it wanted to refresh the value). You could do anything you wanted. In my case, I would just call refresh(key)
whenever the last read timestamp was recent enough (i.e. this is up to my own business logic).
There are a bunch of other potential solutions to this problem that I can think of, but this one is the most generic that I have found (so hopefully other people can find other good uses).
I think there would need to be a few rules around how this API would work:
The next logical step would be to make this a "bulk" API; then you would need to specify some granularity for how often to sweep for things that might expire. You might want to only check every X time units, which means your callback gets a collection of all keys/values that will expire during that X-time-unit period. For my use case, I'd also need a bulk refresh() API (similar to the way you can override loadAll() to load values in bulk), which doesn't currently exist either.
Thoughts?
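Much of this is approximated today by combining refreshAfterWrite with expireAfterWrite: a refresh is only triggered when a stale entry is actually read, so hot keys are reloaded in the background while idle keys simply expire. A sketch (the times and the loader are illustrative):

```java
import java.util.concurrent.TimeUnit;

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

public class HotKeyRefreshDemo {
    static String recompute(String key) {
        return "value-" + key; // stand-in for the expensive computation
    }

    public static void main(String[] args) {
        LoadingCache<String, String> cache = Caffeine.newBuilder()
            .refreshAfterWrite(50, TimeUnit.MINUTES) // hot keys reload asynchronously
            .expireAfterWrite(60, TimeUnit.MINUTES)  // cold keys drop out entirely
            .build(HotKeyRefreshDemo::recompute);

        System.out.println(cache.get("k"));
    }
}
```

This differs from the proposed pre-expire event in that it is read-triggered rather than timer-triggered, so a key not read between the refresh and expiry thresholds still pays the reload cost; the bulk pre-expire callback described above would close that gap.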
We just discovered that while observing a lot of cache misses in our logs.
I isolated the problem down to a simple unit test. However, it is intermittent and is not reproducible when run under Gradle testing. So I put multiple attempts inside a loop and also recommend using the built-in Eclipse tester to reproduce it. I am also able to reproduce it as a simple app with main() on Linux and Mac.
The unit tests are provided below. There are two scenarios: one test that uses only one cache key works fine; another test that uses the same logic but with two keys fails on an unexpected cache miss.
package com.thomsonreuters.onep.pito.utils;
import static org.junit.Assert.assertEquals;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;
import org.junit.Test;
import com.github.benmanes.caffeine.cache.AsyncLoadingCache;
import com.github.benmanes.caffeine.cache.CacheLoader;
import com.github.benmanes.caffeine.cache.Caffeine;
public class CaffeineTest {
private static final int ATTEMPTS = 100;
private static final int TTL = 100;
private static final int EPSILON = 10;
@Test
public void testWithCoupleOfKeys() throws Exception {
testExpiration(true);
}
@Test
public void testSameWithSingleKey() throws Exception {
testExpiration(false);
}
private void testExpiration(boolean withPairOfKey) throws Exception {
final ConcurrentMap<String,String> source = new ConcurrentHashMap<>();
final ConcurrentMap<String,Date> lastLoad = new ConcurrentHashMap<>();
class Loader implements CacheLoader<String,String> {
private void reportCacheMiss(String key) {
Date now = new Date();
Date last = lastLoad.get(key);
System.out.print(new SimpleDateFormat("HH:mm:ss.SSS").format(new Date()) + ": cache miss for key: " + key);
if (last != null) {
System.out.println(" - " + (now.getTime() - last.getTime()) + "ms after last load");
} else {
System.out.println();
}
lastLoad.put(key, now);
}
@Override
public String load(String key) {
reportCacheMiss(key);
return source.get(key);
}
@Override
public CompletableFuture<String> asyncLoad(String key, Executor executor) {
reportCacheMiss(key);
return CompletableFuture.completedFuture(source.get(key));
}
}
for (int i = 0; i < ATTEMPTS; i++) {
System.out.println("Attempt #" + i);
//initial value
source.put("foo", "foo0");
if (withPairOfKey) {
source.put("bar", "bar0");
}
lastLoad.clear();
AsyncLoadingCache<String,String> cache = Caffeine.newBuilder()
.expireAfterWrite(TTL, TimeUnit.MILLISECONDS)
.buildAsync(new Loader());
assertEquals("should serve initial value", "foo0", cache.get("foo").get());
if (withPairOfKey) {
assertEquals("should serve initial value", "bar0", cache.get("bar").get());
}
//first update
source.put("foo", "foo1");
if (withPairOfKey) {
source.put("bar", "bar1");
}
assertEquals("should serve cached initial value", "foo0", cache.get("foo").get());
if (withPairOfKey) {
assertEquals("should serve cached initial value", "bar0", cache.get("bar").get());
}
Thread.sleep(EPSILON); //sleep for less than expiration
assertEquals("should still serve cached initial value", "foo0", cache.get("foo").get());
if (withPairOfKey) {
assertEquals("should still serve cached initial value", "bar0", cache.get("bar").get());
}
Thread.sleep(TTL + EPSILON); //sleep until expiration
assertEquals("should now serve first updated value", "foo1", cache.get("foo").get());
if (withPairOfKey) {
assertEquals("should now serve first updated value", "bar1", cache.get("bar").get());
}
//second update
source.put("foo", "foo2");
if (withPairOfKey) {
source.put("bar", "bar2");
}
assertEquals("should still serve cached first updated value", "foo1", cache.get("foo").get());
if (withPairOfKey) {
assertEquals("should still serve cached first updated value", "bar1", cache.get("bar").get());
}
Thread.sleep(EPSILON); //sleep for less than expiration
//it never fails here for the first key
assertEquals("should still serve cached first updated value", "foo1", cache.get("foo").get());
if (withPairOfKey) {
//it fails here after a few attempts because of cache miss
assertEquals("should still serve cached first updated value", "bar1", cache.get("bar").get());
}
}
}
}
I'm particularly interested whether it has any relation to https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/util/concurrent/BoundedConcurrentHashMap.java.
If it doesn't, would it be possible to include this in your benchmarks?
There should be a possibility to clear all stats for a cache. As far as I can tell there is no such option; at least I could not find one in the current API. This is useful for building some kind of poor man's "cache manager" that displays basic information for the caches in use and enables invalidation of certain caches.
Invalidating a cache via cache.invalidateAll() does remove the objects from the cache, but there should also be a "reset all stats" option, since the stats apply only to the old state.
Currently the Javadoc for EliminationStack.java says:
This stack orders elements LIFO (last-in-last-out).
This should probably say
This stack orders elements LIFO (last-in-first-out).
The GetPutBenchmark operates with a cache capacity that is bigger than the used key-value range.
This yields a 100% hit rate for all benchmarks.
We use this very nice library in the implementation of the Rascal meta-programming language (see http://www.rascal-mpl.org). However, I was hit by the following: when a cache callback function returns null, this is not registered in the cache, and the next time around the same callback function will be called with the same arguments. The net effect is that no caching is done, despite a lot of redundant calling activity.
If this is intended behavior, please add a warning in the documentation. Otherwise: change this behavior.
Here is a code snippet, where getCompanionDefaultsFunction may return null.
private Cache<String, Function> companionDefaultFunctionCache = Caffeine.newBuilder().build();
public Function getCompanionDefaultsFunction(String name, Type ftype){
String key = name + ftype;
return companionDefaultFunctionCache.get(key, k -> rvm.getCompanionDefaultsFunction(name, ftype));
}
For example, when measuring queue throughput it is not enough to benchmark the poll method as follows:
@Benchmark
public Object poll() {
  return q.poll();
}
This benchmark groups the failure case with the success case. It would make a queue with very low throughput look great as long as it failed to deliver values quickly. The throughput of the queue is the number of successful polls. There are several ways to achieve this measurement: either by using the Control object and spinning until a value is available, or by using AuxCounters and counting the success and fail metrics separately. I personally prefer the latter, as the different cases can be quite telling.
This same issue is visible in the ReadBufferBenchmark. I assume the number of drain() calls is immaterial if they deliver nothing. In this case one must use AuxCounters to reflect the number of elements drained.
It would seem logical and architecturally cleaner to make the loading part of an AsyncCache populate itself from a CompletionStage instead of a CompletableFuture. CompletionStages are designed to be the 'read' side of a logical promise. There are a few concurrency bridging libraries (the most prominent of which is the Scala Java 8 compatibility layer) that map this properly; in fact, you cannot convert a Scala Future to a CompletableFuture, only to a CompletionStage.
Of course, you can always bypass this by creating a new, unfulfilled CompletableFuture and completing it with the result of the scala future, but this just seems like a lot of overhead.
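For completeness, the JDK's own bridge is CompletionStage.toCompletableFuture(); most stages return the underlying future without copying, but an implementation is permitted to throw UnsupportedOperationException, which is exactly the interop friction described above:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

public class StageBridgeDemo {
    public static void main(String[] args) {
        CompletionStage<String> stage = CompletableFuture.completedFuture("value");
        // toCompletableFuture() is the standard conversion point; a minimal
        // CompletionStage (e.g. one bridged from another library) may not support it.
        CompletableFuture<String> future = stage.toCompletableFuture();
        System.out.println(future.join()); // value
    }
}
```

Accepting CompletionStage in the API and calling toCompletableFuture() internally would therefore cover the common case while still failing loudly for stages that refuse the conversion.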
The EliminationStack is able to cancel opposing operations (push - pop). This works well in a balanced workload to reduce contention on the stack.
In an unbalanced workload, the producers contend and the stack degrades to its original form. There is also a penalty of scanning the arena for a potential mate. If combining is supported, then both of these problems are eliminated.
This is unlikely, but there's nothing to stop the unbounded queue from growing beyond Integer.MAX_VALUE elements, and so:
@Override
public int size() {
  Node<E> h = head;
  Node<E> cursor = h.getNextRelaxed();
  int size = 0;
  while (cursor != null) { // should be: while ((cursor != null) && (size < Integer.MAX_VALUE))
    cursor = cursor.getNextRelaxed();
    size++;
  }
  return size;
}
Hello @ben-manes,
I would like to say that from a bird's-eye view your work looks amazing: nice benchmarks/results, a very good test suite, and high code quality. I haven't dug in that deeply, but from what I see there is no non-concurrent version of the cache. Do you have intentions to implement such a version, or are you going to maintain only a concurrent version (which may be lock-free, but still is neither wait-free nor will it have the same latency characteristics as a data structure with all the concurrency-related code removed)?
Hi,
I came across your work through your comment on the lock-free mailing list. I am the main developer on JCTools (https://github.com/JCTools/JCTools), which is an effort with some overlap with yours. I was wondering if you have considered using JCTools?
We currently support SPSC/MPSC/SPMC/MPMC bounded queues and a more limited selection of unbounded queues. There's also an ongoing effort towards offheap data channels and other data structures.
I'd be interested in hearing your feedback and helping each other out if possible,
Thanks,
Nitsan
This Guava issue identified an expected optimization not being implemented. A getAll where some of the entries should be refreshed due to refreshAfterWrite schedules each key as an independent asynchronous operation. Since the provided CacheLoader supports bulk loads, it is reasonable to expect that the refresh is performed as a single batch operation.
This optimization may be invasive and deals with complex interactions for both the synchronous and asynchronous cache implementations. Or, it could be as simple as using getAllPresent
and enhancing it to support batch read post-processing.
This might not be an issue that needs solving.
My project uses the maven-shade-plugin to build itself into a fat jar. When I do this, I get warnings that caffeine and tracing-api define overlapping classes:
[WARNING] caffeine-1.2.0.jar, tracing-api-1.2.0.jar define 7 overlappping classes:
[WARNING] - com.github.benmanes.caffeine.cache.tracing.TraceEvent$Action
[WARNING] - com.github.benmanes.caffeine.cache.tracing.TracerIdGenerator
[WARNING] - com.github.benmanes.caffeine.cache.tracing.TraceEvent
[WARNING] - com.github.benmanes.caffeine.cache.tracing.DisabledTracer
[WARNING] - com.github.benmanes.caffeine.cache.tracing.TracerHolder
[WARNING] - com.github.benmanes.caffeine.cache.tracing.TraceEventFormats
[WARNING] - com.github.benmanes.caffeine.cache.tracing.Tracer
[WARNING] maven-shade-plugin has detected that some .class files
[WARNING] are present in two or more JARs. When this happens, only
[WARNING] one single version of the class is copied in the uberjar.
[WARNING] Usually this is not harmful and you can skeep these
[WARNING] warnings, otherwise try to manually exclude artifacts
[WARNING] based on mvn dependency:tree -Ddetail=true and the above
[WARNING] output
[WARNING] See http://docs.codehaus.org/display/MAVENUSER/Shade+Plugin
If I exclude the tracing-api dependency like this, I don't get those warnings:
<dependency>
<groupId>com.github.ben-manes.caffeine</groupId>
<artifactId>caffeine</artifactId>
<version>${caffeine.version}</version>
<exclusions>
<exclusion>
<groupId>com.github.ben-manes.caffeine</groupId>
<artifactId>tracing-api</artifactId>
</exclusion>
</exclusions>
</dependency>
But I'm not actually using any tracing features (that I know of), so I don't know if I'm breaking Caffeine by doing that.
Why does caffeine declare classes that overlap with tracing-api? Why doesn't it instead (a) not declare those classes, or (b) not depend on tracing-api?
Are the two sets of classes equivalent? Said another way, do I need to be concerned with which versions of those classes maven-shade-plugin chooses to place in the resulting .jar file?
I noticed that concurrentlinkedhashmap originally planned on having LIRS eviction for v2.x, according to the Google Code page [1].
Is this still planned for caffeine, the successor to concurrentlinkedhashmap?
I am also quite happy to see you have custom weighting for entries!
I was looking into Caffeine to see if I could use it to address the problem Guava has with invalidation concurrent with loads. While Caffeine appears to address this quite solidly for invalidate(K), it fails in the invalidateAll() case if there is a removal listener.
That appears to be due to this:
@Override
public void clear() {
  if (!hasRemovalListener() && !Tracer.isEnabled()) {
    data.clear();
    return;
  }
  for (K key : data.keySet()) {
    remove(key);
  }
}
So if there is a removal listener, it explicitly removes each key rather than using the underlying ConcurrentMap's clear method, so that it can send notifications for them. Unfortunately, that means it misses the loading entries, which are suppressed from the keySet() view.
I don't see any obvious way to solve that problem, and I haven't checked the bounded implementation, but it seemed worth reporting.
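The gap can be shown with a toy model (this is not Caffeine's code, just a minimal illustration of the mechanism): when loading entries are hidden from the keySet() view and clear() iterates that view to fire notifications, an in-flight load is neither notified nor removed.

```java
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model of the reported gap: clear() notifies per key, but loading
 *  entries are hidden from keySet(), so they escape both notification
 *  and removal. */
public class ClearModel {
  static final Object LOADING = new Object();   // sentinel for an in-flight load
  final Map<String, Object> data = new ConcurrentHashMap<>();
  final List<String> notified = new ArrayList<>();

  Set<String> keySet() {                        // view suppresses loading entries
    Set<String> keys = new LinkedHashSet<>();
    data.forEach((k, v) -> { if (v != LOADING) keys.add(k); });
    return keys;
  }

  void clear() {                                // mirrors the removal-listener path
    for (String key : keySet()) {
      data.remove(key);
      notified.add(key);                        // removal notification
    }
  }
}
```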
"Retrieves and removes the head of this queue, or returns null if this queue is empty."
But I believe SingleConsumerQueue::poll can return null when the queue is not empty.
This is due to:
/** Adds the linked list of nodes to the queue. */
void append(Node<E> first, Node<E> last) {
  for (;;) {
    Node<E> t = tail;
    if (casTail(t, last)) {
      // Imagine the producer thread is interrupted here
      t.next = first;
      for (;;) {
        first.complete();
        if (first == last) {
          return;
        }
        first = first.getNextRelaxed();
      }
    }
  }
}
This will cause a 'bubble' in the queue. In poll we'll get:
public E poll() {
  Node<E> next = head.getNextRelaxed();
  if (next == null) {
    // next is null, but head != tail ...
    return null;
  }
  lazySetHead(next);
  E e = next.value;
  next.value = null;
  return e;
}
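The usual remedy in MPSC queues of this shape (e.g. in JCTools) is for the consumer to treat head != tail as proof of non-emptiness and spin until the producer's link store becomes visible. A self-contained sketch of that idea (not SingleConsumerQueue's actual code; a simplified queue built for illustration):

```java
import java.util.concurrent.atomic.AtomicReference;

/** Minimal MPSC linked queue sketching the common fix for the 'bubble':
 *  when head != tail but the next link is not yet published, poll() spins
 *  until the producer's link store is visible instead of returning null. */
public class SpinPollQueue<E> {
  static final class Node<E> {
    volatile E value;
    volatile Node<E> next;
    Node(E value) { this.value = value; }
  }

  Node<E> head = new Node<>(null);                 // sentinel, consumer-owned
  final AtomicReference<Node<E>> tail = new AtomicReference<>(head);

  public void offer(E e) {
    Node<E> node = new Node<>(e);
    Node<E> prev = tail.getAndSet(node);           // winner of the race...
    prev.next = node;                              // ...then publishes the link
  }

  public E poll() {
    Node<E> next = head.next;
    if (next == null) {
      if (head == tail.get()) {
        return null;                               // truly empty
      }
      do {                                         // bubble: a producer swung the
        next = head.next;                          // tail but has not linked yet;
      } while (next == null);                      // wait out the gap
    }
    head = next;
    E e = next.value;
    next.value = null;
    return e;
  }
}
```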
I've been wishing that Guava could be used as a light-weight JCache provider. Does it make sense for Caffeine to be a JCache provider? Or is it too early for consideration?
It looks like Ticker is documented to return only non-negative nanoseconds, but it returns System.nanoTime() directly. That call's timestamp is measured from an arbitrary origin which, as described in the javadocs, may be in the future, resulting in a negative return value.
In addition, even if the value starts out positive, it can overflow.
http://docs.oracle.com/javase/8/docs/api/java/lang/System.html#nanoTime--
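Because only the difference between two nanoTime() readings is meaningful, the robust idiom is to compare elapsed time by subtraction, which stays correct across negative values and overflow. A small illustration (the method name and parameters are mine, not the library's):

```java
/** System.nanoTime() values may be negative and may overflow; only the
 *  difference between two readings is meaningful. Compare elapsed time
 *  with subtraction, never by comparing the raw values directly. */
public class NanoTimeSketch {
  static boolean isExpired(long writeTime, long now, long ttlNanos) {
    return now - writeTime >= ttlNanos; // correct even when nanoTime wraps
    // WRONG: now >= writeTime + ttlNanos (the addition can overflow)
  }
}
```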
I would like to use this cache in my open source project sdfs, but am stuck because data expires asynchronously. This causes an out-of-memory issue for me: I flush expiring data to disk when it is aged out, which takes time and can exhaust memory if there are many pending writes. Would it be possible to add a synchronous option to the builder that would cause new inserts to block until data is expired, similar to Guava?
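In the meantime, one way to get backpressure is to bound the number of evicted-but-unflushed values so that producers block when the flusher falls behind. This is a hypothetical workaround sketch, not Caffeine API; it assumes the removal listener hands evicted values to this writer and a background thread drains it to disk:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical backpressure sketch: cap the number of pending flushes so
 *  callers stall instead of exhausting memory. */
public class BoundedFlushWriter<V> {
  private final BlockingQueue<V> pending;

  public BoundedFlushWriter(int maxPending) {
    pending = new ArrayBlockingQueue<>(maxPending);
  }

  /** Called from the removal listener; blocks once maxPending flushes queue up. */
  public void enqueue(V value) throws InterruptedException {
    pending.put(value); // blocking put is the backpressure
  }

  /** Called by a background thread that writes values to disk. */
  public V takeForFlush() throws InterruptedException {
    return pending.take();
  }
}
```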
Found a weird issue in 2.0.2 (I didn't check any older version, and this could be JDK related).
We have a test case that shows a LoadingCache looping forever inside computeIfPresent() when invalidate() is called (line 74 of LocalManualCache) -> remove() (line 1579 of BoundedLocalCache); it then loops in the computeIfPresent method of java.util.ConcurrentHashMap.
This is JDK 1.8.0u66.
This is due to the same conditions described in issue #9
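The underlying hazard is general to ConcurrentHashMap: the remapping function passed to the compute methods must not mutate the map re-entrantly, or the operation can spin forever. The safe idiom for conditional removal is to return null from the remapping function, which removes the entry atomically. A small illustration (the helper method is mine, for demonstration only):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Demonstrates the safe idiom: return null from computeIfPresent's
 *  remapping function to remove the entry atomically, instead of
 *  calling map.remove(...) from inside the function. */
public class ComputeRemoveSketch {
  static void removeIfNegative(Map<String, Integer> map, String key) {
    map.computeIfPresent(key, (k, v) -> (v < 0) ? null : v); // null => remove
    // WRONG: calling map.remove(k) inside the lambda re-enters the map
  }
}
```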
The cache.tracing package is not exported and is therefore not visible when creating e.g. a LocalCache:
java.lang.NoClassDefFoundError: com/github/benmanes/caffeine/cache/tracing/Tracer
at com.github.benmanes.caffeine.cache.LocalCache.tracer(LocalCache.java:64)
at com.github.benmanes.caffeine.cache.BoundedLocalCache.<init>(BoundedLocalCache.java:139)
at com.github.benmanes.caffeine.cache.LocalCacheFactory$SS.<init>(LocalCacheFactory.java:12966)
at com.github.benmanes.caffeine.cache.LocalCacheFactory$SSMS.<init>(LocalCacheFactory.java:23034)
at com.github.benmanes.caffeine.cache.LocalCacheFactory$SSMSA.<init>(LocalCacheFactory.java:23079)
at com.github.benmanes.caffeine.cache.LocalCacheFactory.newBoundedLocalCache(LocalCacheFactory.java:795)
at com.github.benmanes.caffeine.cache.BoundedLocalCache$BoundedLocalManualCache.<init>(BoundedLocalCache.java:1779)
at com.github.benmanes.caffeine.cache.BoundedLocalCache$BoundedLocalManualCache.<init>(BoundedLocalCache.java:1775)
at com.github.benmanes.caffeine.cache.Caffeine.build(Caffeine.java:814)
Can I use it with Java 1.7?
When switching to v1.3.0 the exception below is thrown; this is resolved only if tracing-async is added as a dependency to the project.
Caused by: java.lang.ClassNotFoundException: com.github.benmanes.caffeine.cache.tracing.Tracer
at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1313) ~[catalina.jar:8.0.24]
at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1164) ~[catalina.jar:8.0.24]
at com.github.benmanes.caffeine.cache.LocalCache.tracer(LocalCache.java:64) ~[caffeine-1.3.0.jar:?]
at com.github.benmanes.caffeine.cache.BoundedLocalCache.<init>(BoundedLocalCache.java:140) ~[caffeine-1.3.0.jar:?]
Can you please also provide a jar with each release, for easy download?