cachetower's Introduction


Cache Tower

An efficient multi-layered caching system for .NET


Overview

Computers have multiple layers of caching from L1/L2/L3 CPU caches to RAM or even disk caches, each with a different purpose and performance profile.

Why don't we do this with our code?

Cache Tower isn't a single type of cache; it's a multi-layer solution to caching, with each layer stacked on top of another. A multi-layer cache provides the performance benefits of a fast cache like in-memory with the resilience of a file, database or Redis-backed cache.

This library was inspired by a blog post by Nick Craver about how Stack Overflow does caching. Stack Overflow uses a custom two-layer caching solution with in-memory and Redis.

📋 Features

๐Ÿค Licensing and Support

Cache Tower is licensed under the MIT license. It is free to use in personal and commercial projects.

There are support plans available that cover all active Turner Software OSS projects. Support plans provide private email support, expert usage advice for our projects, priority bug fixes and more. These support plans help fund our OSS commitments to provide better software for everyone.

📖 Getting Started

You will need the CacheTower package on NuGet - it provides the core infrastructure for Cache Tower as well as an in-memory cache layer. To add additional cache layers, you will need to install the appropriate packages as listed below.
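For example, installing the core package via the Package Manager Console (the same style of command shown for the other packages below):

PM> Install-Package CacheTower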

  • CacheTower: The core library with in-memory and file caching support.
  • CacheTower.Extensions.Redis: Provides distributed locking & eviction via Redis.
  • CacheTower.Providers.Database.MongoDB: Provides a cache layer for MongoDB.
  • CacheTower.Providers.Redis: Provides a cache layer for Redis.
  • CacheTower.Serializers.NewtonsoftJson: Provides a JSON serializer using Newtonsoft.Json.
  • CacheTower.Serializers.SystemTextJson: Provides a JSON serializer using System.Text.Json.
  • CacheTower.Serializers.Protobuf: Provides a Protobuf serializer using protobuf-net.

At its most basic level, caching is designed to prevent reprocessing of data by storing the result somewhere. In turn, preventing the reprocessing of data makes our code faster and more scalable. Depending on the method of storage or transportation, the performance profile can vary drastically. Not only that, the limitations of different types of caches can affect what you can do with your application.


In-memory Caching

✔ Pro: The fastest cache you can possibly have!

❌ Con: Only lasts the lifetime of the application.

❌ Con: Memory capacity is more limited than other types of storage.

File-based Caching

✔ Pro: Caching huge amounts of data is not just possible, it is usually cheap!

✔ Pro: Resilient to application restarts!

❌ Con: Even with fast SSDs, it can be 1500x slower than in-memory!

Database Caching

✔ Pro: Database can run on the local machine OR a remote machine!

✔ Pro: Resilient to application restarts!

✔ Pro: Can support multiple systems at the same time!

❌ Con: Performance is only as good as the database provider itself. Don't forget network latency either!

Redis Caching

✔ Pro: Redis can run on the local machine OR a remote machine!

✔ Pro: Resilient to application restarts!

✔ Pro: Can support multiple systems at the same time!

✔ Pro: High performance (faster than file-based, slower than in-memory).

❌ Con: Linux only. *

* On Windows, Memurai is your best Redis-compatible alternative - just need to list some sort of con for Redis and what it ran on was all I could think of at the time.


An ideal caching solution should be fast, flexible, resilient and scale with your usage. It is through combining these different cache types that this can be achieved.

Cache Tower supports n layers of caching, with the flexibility to make your own. You "stack" the cache layers from fastest to slowest for your particular usage.

For example, you might have:

  1. In-memory cache
  2. File-based cache

With this setup, you have:

  • A fast first-layer cache
  • A resilient second-layer cache

If your application restarts and your in-memory cache is empty, your second-layer cache will be checked. If a valid cache entry is found, that will be returned.

Which combination of cache layers you use to build your cache stack is up to you and what is best for your application.
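As a rough sketch, that two-layer setup could be registered with the same call shown later in this README (UserContext and the cache directory are placeholders; check the builder methods against your installed version):

// In-memory first (fastest), file-based second (resilient).
services.AddCacheStack<UserContext>((provider, builder) => builder
	.AddMemoryCacheLayer()
	.AddFileCacheLayer(new FileCacheLayerOptions("./cache", NewtonsoftJsonCacheSerializer.Instance))
	.WithCleanupFrequency(TimeSpan.FromMinutes(5))
);

With this in place, a restart clears the memory layer, but subsequent requests fall through to the file layer.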

ℹ Don't need a multi-layer cache right now?
Multi-layer caching is only one part of Cache Tower. If you only need one layer of caching, you can still leverage the different types of caches available and take advantage of background refreshing. If later on you need to add more layers, you only need to change the configuration of your cache stack!

Cache Tower has a number of officially supported cache layers that you can use.

MemoryCacheLayer

Bundled with Cache Tower

builder.AddMemoryCacheLayer();

Allows for fast, local memory caching. The data is kept as a reference in memory and not serialized. It is strongly recommended to treat the cached instance as immutable: modifying an in-memory cached value won't propagate the change to other cache layers.

FileCacheLayer

Bundled with Cache Tower

builder.AddFileCacheLayer(new FileCacheLayerOptions("~/", NewtonsoftJsonCacheSerializer.Instance));

Provides a basic file-based caching solution using your choice of serializer. It stores each serialized cache item into its own file and uses a singular manifest file to track the status of the cache.

MongoDbCacheLayer

PM> Install-Package CacheTower.Providers.Database.MongoDB
builder.AddMongoDbCacheLayer(/* MongoDB Connection */);

Allows caching through a MongoDB server. Cache entries are serialized to BSON using MongoDB.Bson.Serialization.BsonSerializer.

RedisCacheLayer

PM> Install-Package CacheTower.Providers.Redis
builder.AddRedisCacheLayer(/* Redis Connection */, new RedisCacheLayerOptions(ProtobufCacheSerializer.Instance));

Allows caching of data in Redis using your choice of serializer.

The FileCacheLayer and RedisCacheLayer support custom serializers for caching data. Different serializers have different performance profiles as well as different tradeoffs for configuration.

NewtonsoftJsonCacheSerializer

PM> Install-Package CacheTower.Serializers.NewtonsoftJson

Uses Newtonsoft.Json to perform serialization.

SystemTextJsonCacheSerializer

PM> Install-Package CacheTower.Serializers.SystemTextJson

Uses System.Text.Json to perform serialization.

ProtobufCacheSerializer

PM> Install-Package CacheTower.Serializers.Protobuf

The use of protobuf-net requires decorating the class you want to cache with attributes [ProtoContract] and [ProtoMember].

Example with Protobuf Attributes

[ProtoContract]
public class UserProfile
{
	[ProtoMember(1)]
	public int UserId { get; set; }
	[ProtoMember(2)]
	public string UserName { get; set; }

	...
}

Additionally, as the Protobuf format doesn't have a way to represent an empty collection, these will be returned as null. While this can be inconvenient, using Protobuf ensures high performance and low allocations for serializing.

You can create your own cache layer by implementing ICacheLayer. With it, you could implement caching layers that talk to SQL databases or cloud-based storage systems.

When making your own cache layer, you will need to keep in mind that your implementation should be thread safe. The cache stack prevents multiple threads calling the value factory at once; it does not prevent multiple threads from accessing the cache layer itself.
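As a hedged sketch of what an implementation can look like (the member names below match recent versions of the library, but verify them against the ICacheLayer interface in your installed package), here is a trivially thread-safe layer backed by a ConcurrentDictionary:

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class ConcurrentDictionaryCacheLayer : ICacheLayer
{
	// ConcurrentDictionary gives us the required thread safety for free.
	private readonly ConcurrentDictionary<string, object> store = new();

	public ValueTask<CacheEntry<T>?> GetAsync<T>(string cacheKey)
		=> new(store.TryGetValue(cacheKey, out var entry) ? entry as CacheEntry<T> : null);

	public ValueTask SetAsync<T>(string cacheKey, CacheEntry<T> cacheEntry)
	{
		store[cacheKey] = cacheEntry;
		return default;
	}

	public ValueTask EvictAsync(string cacheKey)
	{
		store.TryRemove(cacheKey, out _);
		return default;
	}

	public ValueTask CleanupAsync()
	{
		// Drop expired entries; enumerating a ConcurrentDictionary is thread safe.
		foreach (var (key, value) in store)
		{
			if (value is CacheEntry entry && entry.Expiry < DateTime.UtcNow)
			{
				store.TryRemove(key, out _);
			}
		}
		return default;
	}

	public ValueTask FlushAsync()
	{
		store.Clear();
		return default;
	}

	// An in-process layer is always available; a remote layer would check connectivity here.
	public ValueTask<bool> IsAvailableAsync(string cacheKey) => new(true);
}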

In this example, UserContext is a type added to the service collection. It will be retrieved from the service provider every time a cache refresh is required.

Create and configure your CacheStack, this is the backbone for Cache Tower.

services.AddCacheStack<UserContext>((provider, builder) => builder
	.AddMemoryCacheLayer()
	.AddRedisCacheLayer(/* Your Redis Connection */, new RedisCacheLayerOptions(ProtobufCacheSerializer.Instance))
	.WithCleanupFrequency(TimeSpan.FromMinutes(5))
);

The cache stack will be injected into constructors that accept ICacheStack<UserContext>. Once you have your cache stack, you can call GetOrSetAsync - this is the primary way to access the data in the cache.

var userId = 17;

await cacheStack.GetOrSetAsync<UserProfile>($"user-{userId}", async (old, context) => {
	return await context.GetUserForIdAsync(userId);
}, new CacheSettings(TimeSpan.FromDays(1), TimeSpan.FromMinutes(60)));

This call to GetOrSetAsync is configured with a cache expiry of 1 day and an effective stale time after 60 minutes. A good stale time is extremely useful for high performance scenarios where background refreshing is leveraged.

A high-performance cache needs to keep throughput high. Having a cache miss because of expired data stalls the potential throughput.

Rather than only having a cache expiry, Cache Tower supports specifying a stale time for the cache entry. If there is a cache hit on an item and the item is considered stale, it will perform a background refresh. By doing this, it avoids blocking the request on a potential cache miss later.

await cacheStack.GetOrSetAsync<MyCachedType>("my-cache-key", async (oldValue) => {
	return await DoWorkThatNeedsToBeCachedAsync();
}, new CacheSettings(timeToLive: TimeSpan.FromMinutes(60), staleAfter: TimeSpan.FromMinutes(30)));

In the example above, the cache would expire in 60 minutes time (timeToLive). However, in 30 minutes, the cache will be considered stale (staleAfter).

Example Flow of Background Refreshing

  • You request an item from the cache
    • No entry is found (cache miss)
    • Your value factory is called
    • The value is cached and returned
  • You request the item again later (after the staleAfter time but before timeToLive)
    • The non-expired entry is found
    • It is checked if it is stale (it is)
    • A background refresh is started
    • The non-expired (stale) entry is returned
  • You request the item again later (after the background refresh has finished)
    • The non-expired entry is found
    • It is checked if it is stale (it isn't)
    • The non-expired non-stale entry is returned

Picking A Good Stale Time

There is no one-size-fits-all staleAfter value - it will depend on what you're caching and why. That said, a reasonable rule of thumb would be to have a stale time no less than half of the timeToLive.

The shorter you make the staleAfter value, the more frequent background refreshing will happen.
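For example, applying that half-of-TTL rule of thumb to a one-day cache entry (the values are illustrative):

new CacheSettings(timeToLive: TimeSpan.FromDays(1), staleAfter: TimeSpan.FromHours(12));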

⚠ Warning: Avoid setting a stale time that is too short!

This is called "over refreshing", whereby background refreshing happens far more frequently than is useful. Over refreshing is at its worst with stale times shorter than a few minutes for cache entries that are frequently hit.

This has two effects:

  1. Frequent refreshes would increase load on the factory that provides the data to cache, potentially degrading its performance.
  2. Background refreshing, while efficient, has a non-zero cost when invoked, putting additional pressure on the application where the refreshes are triggered.

With this in mind, it is not advised to set your staleAfter time to 0. This effectively means the cache is always stale, triggering a background refresh on every cache hit.

Avoiding Disposed Contexts

With stale refreshes happening in the background, it is important to not reference potentially disposed objects and contexts. Cache Tower can help with this by providing a context into the GetOrSetAsync method.

await cacheStack.GetOrSetAsync<MyCachedType>("my-cache-key", async (oldValue, context) => {
	return await DoWorkThatNeedsToBeCachedAsync(context);
}, new CacheSettings(timeToLive: TimeSpan.FromMinutes(60), staleAfter: TimeSpan.FromMinutes(30)));

The type of context is established at the time of configuring the cache stack.

services.AddCacheStack<MyContext>((provider, builder) => builder
	.AddMemoryCacheLayer()
	.WithCleanupFrequency(TimeSpan.FromMinutes(5))
);

Cache Tower will resolve the context from the same service collection the AddCacheStack call was added to. A scope will be created and context resolved every time there is a cache refresh.

You can use this context to hold any of the other objects or properties you need for safe access in a background thread, avoiding the possibility of accessing disposed objects like database connections.
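As an illustrative sketch (MyContext and its AppDbContext dependency are hypothetical types, not part of Cache Tower):

// A context type resolved fresh, in a new scope, for each background refresh.
public class MyContext
{
	public AppDbContext Database { get; }
	public MyContext(AppDbContext database) => Database = database;
}

// Register it in the same service collection the AddCacheStack call was added to.
services.AddScoped<MyContext>();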

ℹ Need a custom context resolving solution?
You can specify your own context activator via builder.CacheContextActivator by implementing a custom ICacheContextActivator. To see a complete example, see this integration for SimpleInjector.

You might not always want a single large CacheStack shared between all your code - perhaps you want an in-memory cache with a Redis layer for one section and a file cache for another. Cache Tower supports named CacheStack implementations via ICacheStackAccessor/ICacheStackAccessor<MyContext>.

This follows a similar pattern to how IHttpClientFactory works, allowing you to fetch the specific CacheStack implementation you want within your own class.

services.AddCacheStack<MyContext>("MyAwesomeCacheStack", (provider, builder) => builder
	.AddMemoryCacheLayer()
	.WithCleanupFrequency(TimeSpan.FromMinutes(5))
);

public class MyController
{
	private readonly ICacheStack<MyContext> cacheStack;
	
	public MyController(ICacheStackAccessor<MyContext> cacheStackAccessor)
	{
		cacheStack = cacheStackAccessor.GetCacheStack("MyAwesomeCacheStack");
	}
}

To allow more flexibility, Cache Tower uses an extension system to enhance functionality. Some of these extensions rely on third party libraries and software to function correctly.

Automatic Cleanup

Bundled with Cache Tower

builder.WithCleanupFrequency(TimeSpan.FromMinutes(5));

The cache layers themselves, for the most part, don't directly manage the co-ordination of when they need to delete expired data. While the RedisCacheLayer does handle cache expiration directly via Redis, none of the other official cache layers do. Unless you are only using the Redis cache layer, you will want to include this extension in your cache stack.

Distributed Locking via Redis

PM> Install-Package CacheTower.Extensions.Redis
builder.WithRedisDistributedLocking(/* Your Redis connection */);

The RedisLockExtension uses Redis as a shared lock between multiple instances of your application. Using Redis in this way can avoid cache stampedes where multiple different web servers are refreshing values at the same instant.

If you are only running one web server/instance of your application, you won't need this extension.

Distributed Eviction via Redis

PM> Install-Package CacheTower.Extensions.Redis
builder.WithRedisRemoteEviction(/* Your Redis connection */);

The RedisRemoteEvictionExtension extension uses the pub/sub feature of Redis to co-ordinate cache invalidation across multiple instances of your application. This works in the situation where one web server has refreshed a key and wants to let the other web servers know their data is now old.

Cache Tower has been built from the ground up for high performance and low memory consumption. Across a number of benchmarks against other caching solutions, Cache Tower performs similarly or better than the competition.

What Cache Tower makes up in speed, it may lack in the variety of features common amongst other caching solutions. It is important to weigh both the feature set and performance when deciding on a caching solution.

Performance Comparisons to Cache Tower Alternatives

Flushing the Cache

There are times where you want to clear all cache layers - whether to help with debugging an issue or to force fresh data on subsequent calls to the cache. This type of action is available in Cache Tower, however it is somewhat hidden to prevent accidental use. Please only flush the cache if you know what you're doing and what it would mean!

If you have injected ICacheStack or ICacheStack<UserContext> into your current method or class, you can cast to IFlushableCacheStack. This interface exposes the method FlushAsync.

await (myCacheStack as IFlushableCacheStack).FlushAsync();

For the MemoryCacheLayer, the backing store is cleared. For file cache layers, all cache files are removed. For MongoDB, all documents are deleted in the cache collection. For Redis, a FlushDB command is sent.

Combined with the RedisRemoteEvictionExtension, a call to FlushAsync will additionally be sent to all connected CacheStack instances.

cachetower's People

Contributors

danielmarbach, dependabot-preview[bot], jodydonetti, mgoodfellow, renovate[bot], rgueldenpfennig, teo-tsirpanis, turnerj, vp89


cachetower's Issues

Investigate performance improvements with WaitingKeyRefresh tasks

See: https://blog.scooletz.com/2020/10/19/improving-Azure-Functions-performance#sharedqueuewatcher-redesigned

Specifically whether an array-first approach is smarter than an enumerable concat approach, for both memory allocation and performance.

eg.

lock (WaitingKeyRefresh)
{
	var waitList = new[] { delayedResultSource };
	if (WaitingKeyRefresh.TryGetValue(cacheKey, out var oldList))
	{
		WaitingKeyRefresh[cacheKey] = oldList.Concat(waitList);
	}
	else
	{
		WaitingKeyRefresh[cacheKey] = waitList;
	}
}

Adding generic support for ICacheStack with ICacheContext

Currently there is an empty interface to reference the cache context. While that is useful, having CacheStack support a generic parameter which allows setting the type of cache context could be more useful.

That said, it really all depends how CacheStack is consumed. Via dependency injection, it is probably a lot easier consuming it as ICacheStack rather than ICacheStack<MyTypeOfContext>.

Something to think about for a future version.

Investigate allocations from CacheStack

I've already got a hunch that most of the allocations are from the async/await logic but there is definitely a need to bring down the allocations when using CacheStack.

The overhead here probably isn't too noticeable for most cache layers but for the MemoryCacheLayer, it represents about 10x the actual allocations of the cache layer itself.

Possible thoughts:

  • Look again at ValueTask?
  • Try eliding on Set path?
  • Try For-loop (afaik, this won't affect allocations but ¯\_(ツ)_/¯ )
  • ...

Least Recently Used (LRU) Eviction Strategy

Issue created from #53 (comment)

Currently the only eviction strategy is one of absolute time. If there was a method to define a "size" of cache, it would be worth exploring a "Least Recently Used" system to ensure that the hottest items stay in the cache.

Thoughts:

  • Implement an extension (eg. AutoEvict/FixedCapacity) that manages the more advanced eviction strategy
  • New extension tracks number of items in the cache, cache keys, expiry dates and last access dates
    • Could use a LinkedList to track most recently used or a timestamp directly
    • Probably easier with a ConcurrentDictionary with cache keys as the key and a custom type as the value
  • Once a criteria has been hit (eg. X number of items in the cache), use what it knows to evict locally
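A rough sketch of what that ConcurrentDictionary-based tracking structure could look like (all names here are hypothetical; this is not part of Cache Tower):

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Hypothetical per-key usage tracking for an LRU eviction extension.
public record CacheUsage(DateTime LastAccessed, DateTime Expiry);

public class FixedCapacityTracker
{
	private readonly ConcurrentDictionary<string, CacheUsage> usage = new();
	private readonly int maxItems;

	public FixedCapacityTracker(int maxItems) => this.maxItems = maxItems;

	// Called from the extension hooks on every cache access.
	public void RecordAccess(string cacheKey, DateTime expiry)
		=> usage[cacheKey] = new CacheUsage(DateTime.UtcNow, expiry);

	// Once over capacity, pick the least recently used keys to evict locally.
	public IEnumerable<string> KeysToEvict()
		=> usage.Count <= maxItems
			? Enumerable.Empty<string>()
			: usage.OrderBy(pair => pair.Value.LastAccessed)
				.Take(usage.Count - maxItems)
				.Select(pair => pair.Key);
}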

Challenges:

  • Keeping the extension up-to-date with cache data
    • Can use the various extension hooks but still is a bit fiddly
  • Only evicting locally
    • Can do what I already do for RedisRemoteEvictionExtension (pass the cache layers in directly) but it is pretty cumbersome

Notes:
To handle a distributed LRU would be excessively complex and likely not useful. It would mean that every access to a cache item (including local) would have to broadcast that access back through the cache system and whatever distributed backends are used.

[Performance] RedisRemoteEvictionExtension evicts just-added key from all running instances

What does the bug do?

Even though the key was just added to the cache, eviction is called on all currently running instances, which makes startup performance really bad when a lot of new keys are added to the cache.
So when calling GetOrSetAsync, even though the key is not present in any layer of the cache, it still gets evicted from the memory layer of all other instances, adding a lot of overhead when new keys are added (especially at startup).

How can it be reproduced?

You can reproduce the bug by running this code...

var redisCacheLayer = new RedisCacheLayer(redisConnection);
var memoryCacheLayer = new MemoryCacheLayer();
var cacheStack = new CacheStack(new ICacheLayer[]
{
	memoryCacheLayer,
	redisCacheLayer
}, new ICacheExtension[]
{
	new RedisRemoteEvictionExtension(redisConnection, new ICacheLayer[] { memoryCacheLayer })
});

cacheStack.GetOrSetAsync(...); // Even though it's the first call for this specific key, it still calls evict on all other instances.

.NET 5 Dependencies

There's no good way to handle this currently, but I'd caution against bumping Microsoft dependencies to version 5.0.0 yet. I had this problem with some of our enterprise projects. We have an Azure Functions app, and the Functions SDK doesn't yet support any of the 5.0.0 dependencies, even if only targeting .NET Standard 2.0 / .NET Core 3.1. There are open GitHub issues you can find regarding this. If any of our function dependencies try to use the v5 packages, the app fails to initialize. The only solution is to stay at v3.1.10.

With the fixes you've released I'd like to give cachetower another go, but the v5 deps are a non-starter. I understand MS has created this problem, so do what is best for your library, they'll have to support v5 eventually.

Question - Local Cache With Redis

I was wondering if the RedisRemoteEvictionExtension will notify the local in-memory cache that an item has been invalidated? The use case I am looking at: I get an item from the cache and it is stored in the local cache, then the item is updated in Redis. Will the local in-memory cache be notified and remove the item from its cache?

Adding Async-suffix

Currently Cache Tower doesn't use the Async suffix (which is fairly commonplace in .NET) across any of the API surface. To be a good citizen, it should...

Tuning JSON File Cache Layer

Currently I haven't messed around with any settings in Json.NET in terms of tuning performance or allocations - there might be something worthwhile in it.

Some basic things I am thinking include making sure indentation is disabled.

Perhaps even look at how data compression might lower allocations from reading the file. While compression is more CPU work, if the IO reduction is dramatic, it could even help performance.

Support non-string keys (eg. tuples)

What problem does the feature solve?

Currently all keys must be of type string. This works fine but can cause unnecessary allocations, especially on data retrieval, when one would combine values together to form a string - eg. $"namespace:{someInteger}:{someOtherValue}".

They could instead use a Tuple like ("namespace", someInteger, someOtherValue) which would avoid allocations and overall increase performance.

Here is a basic benchmark of two dictionaries - one with a tuple and one without:

[SimpleJob(RuntimeMoniker.NetCoreApp50), MemoryDiagnoser]
public class TestBenchmark
{
	private Dictionary<string, int> DataA {get;set;} = new();
	private Dictionary<(string, int), int> DataB {get;set;} = new();
	
	[Benchmark]
	public void String()
	{
		DataA.Clear();
		for (var i = 0; i < 10000; i++)
		{
			DataA.Add($"hello:{i}", i);
		}
	}

	[Benchmark]
	public void Tuple()
	{
		DataB.Clear();
		for (var i = 0; i < 10000; i++)
		{
			DataB.Add(("hello", i), i);
		}
	}
}
Method Mean Error StdDev Gen 0 Gen 1 Gen 2 Allocated
String 1,169.8 μs 13.60 μs 12.06 μs 152.3438 58.5938 - 695.31 KB
Tuple 525.4 μs 10.25 μs 11.40 μs 79.1016 22.4609 - 312.5 KB

The thing is, we can't easily use a tuple internally. If we said "all keys are tuples", we could use the ITuple interface; however, that creates issues with boxing. It would still save on allocations, but not as many.

We would ideally need a generic argument for keys, defined per CacheStack. This starts working against us again because while you might have a key like (string, int, int) in one place, you might have another being (string, int), Guid or anything else. We would hit boxing issues again if we wanted to support all variations at all times.

Many things would need to be worked out but in the mean time, this would be a good placeholder issue.

How would you use/interact with the feature? (if applicable)

Something like:

cacheStack.GetOrSetAsync(("namespace", 14), (old, ctx) => /* do whatever */, new CacheSettings(TimeSpan.FromMinutes(30)))`

A CacheStack might need to be defined as CacheStack<TKey> and CacheStack<TKey, TContext>.

Additional constraints include:

  • How do external cache calls handle the different keys?
  • How do the layers handle it?

Add support for Injected-at-Runtime Contexts

For every refresh, it would (optionally) create a newly scoped/transient context. This way, you could utilise more complex contexts derived from dependency injection without worrying about a memory build-up.

This would work by creating a new scope around the refresh itself.
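A minimal sketch of that idea using Microsoft.Extensions.DependencyInjection (the wrapping method is illustrative, not the library's actual internals):

using Microsoft.Extensions.DependencyInjection;

// Create a DI scope around each refresh so a scoped/transient context is
// resolved fresh and disposed with the refresh, avoiding memory build-up.
private static async Task<T> RefreshWithScopedContextAsync<T, TContext>(
	IServiceProvider provider, Func<TContext, Task<T>> getter) where TContext : notnull
{
	using var scope = provider.CreateScope();
	var context = scope.ServiceProvider.GetRequiredService<TContext>();
	return await getter(context);
}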

Add EntityFramework Provider

Look into adding an EntityFramework caching provider. Possible options:

  • On one hand, it would be a specific table and structure as determined by the provider with some sort of serialization.
  • On the other hand, have some way where the setting is propagated to some other handler which then directly controls how the fetch and retrieve is done.

Note: If the latter makes more sense, maybe it then makes more sense as a "generic" thing?

Lots of things to think about...

Feature Request : method to flush all caches

Would like to have a method to fully flush all cache layers. Right now I'm doing it directly through my redis connection, and then disposing / recreating the full cache stack to clear the memory layer. I also have to implement the pub/sub myself to get remote flush to work.

Ability to dump all data in cache

Is there a way to dump all data in the cache? Or maybe just get a list of keys so I can iterate over the cache and access each value?

Add support for Open Telemetry

Open Telemetry has reached v1 and may be worth directly supporting in Cache Tower.

See: https://medium.com/opentelemetry/opentelemetry-specification-v1-0-0-tracing-edition-72dd08936978

  • Tracing: Not sure where this would fit in - may relate closely with logging
  • Logging: Relates to #150
  • Metrics: May be worth capturing cache hits/misses, background refreshes and the interaction of different layers
    • May still want to capture the cache hit count of individual cache entries (this can be seen in #56, though with other functionality added too)

Further investigation required.

Re-evaluating ValueTask and async-only operation

Currently with multiple layers, the logic requires frustrating looping and if-checking for the right type of cache layer method to call. We would be able to mostly remove instructions for handling both sync-and-async without gaining additional allocations.

There may be a small regression for in-memory caching however other cache layers may slightly improve in allocations.

Before Benchmarks

CacheStack Benchmark

Method WorkIterations Mean [ns] Error [ns] StdDev [ns] Gen 0 Gen 1 Gen 2 Allocated [B]
SetupAndTeardown 1 313.2 ns 6.00 ns 6.42 ns 0.2904 - - 912 B
Set 1 671.5 ns 10.74 ns 10.04 ns 0.3157 - - 992 B
Set_TwoLayers 1 903.5 ns 15.07 ns 13.36 ns 0.5608 - - 1760 B
Evict 1 754.6 ns 10.38 ns 8.67 ns 0.3157 - - 992 B
Evict_TwoLayers 1 1,066.6 ns 18.13 ns 15.14 ns 0.5608 - - 1760 B
Cleanup 1 1,130.0 ns 15.63 ns 14.62 ns 0.3567 - - 1120 B
Cleanup_TwoLayers 1 1,472.6 ns 22.04 ns 20.62 ns 0.6008 - - 1888 B
GetMiss 1 394.0 ns 6.98 ns 6.53 ns 0.2904 - - 912 B
GetHit 1 745.1 ns 9.19 ns 8.59 ns 0.3157 - - 992 B
GetOrSet_NeverStale 1 1,272.9 ns 20.19 ns 17.90 ns 0.4196 - - 1320 B
GetOrSet_AlwaysStale 1 1,282.1 ns 12.01 ns 10.65 ns 0.4196 - - 1320 B
GetOrSet_TwoSimultaneous 1 32,048,672.6 ns 366,587.14 ns 306,116.87 ns - - - 2448 B
GetOrSet_FourSimultaneous 1 32,448,376.2 ns 639,241.65 ns 627,820.78 ns - - - 2610 B
SetupAndTeardown 100 345.1 ns 11.62 ns 23.48 ns 0.2904 - - 912 B
Set 100 34,067.8 ns 655.45 ns 828.93 ns 1.2817 - - 4160 B
Set_TwoLayers 100 40,025.3 ns 943.27 ns 1,883.81 ns 1.5259 - - 4928 B
Evict 100 45,558.0 ns 1,421.49 ns 2,805.89 ns 2.8076 - - 8912 B
Evict_TwoLayers 100 57,106.9 ns 1,304.48 ns 2,385.32 ns 4.5776 - - 14432 B
Cleanup 100 54,180.1 ns 1,028.07 ns 961.66 ns 8.5449 - - 26808 B
Cleanup_TwoLayers 100 96,244.3 ns 1,847.18 ns 1,814.17 ns 5.6152 - - 17728 B
GetMiss 100 6,573.5 ns 99.06 ns 92.66 ns 0.2899 - - 912 B
GetHit 100 7,453.0 ns 87.04 ns 77.16 ns 0.3128 - - 992 B
GetOrSet_NeverStale 100 25,816.0 ns 506.45 ns 473.74 ns 0.3967 - - 1320 B
GetOrSet_AlwaysStale 100 96,324.5 ns 1,877.02 ns 1,927.56 ns 7.4463 - - 23496 B
GetOrSet_TwoSimultaneous 100 32,647,390.2 ns 642,369.00 ns 713,991.44 ns - - - 17081 B
GetOrSet_FourSimultaneous 100 32,677,946.7 ns 652,984.81 ns 801,923.99 ns - - - 28307 B

MemoryCacheLayer Benchmark

Method WorkIterations Mean [ns] Error [ns] StdDev [ns] Gen 0 Gen 1 Gen 2 Allocated [B]
Overhead 1 196.8 ns 3.90 ns 3.83 ns 0.2270 - - 712 B
GetMiss 1 237.0 ns 4.48 ns 4.19 ns 0.2270 - - 712 B
GetHit 1 488.5 ns 8.62 ns 8.06 ns 0.2518 - - 792 B
SetExisting 1 696.2 ns 13.49 ns 12.62 ns 0.2623 - - 824 B
EvictMiss 1 253.3 ns 5.04 ns 5.17 ns 0.2270 - - 712 B
EvictHit 1 515.0 ns 9.77 ns 11.63 ns 0.2518 - - 792 B
Cleanup 1 926.2 ns 18.33 ns 18.01 ns 0.2928 - - 920 B
Overhead 100 194.9 ns 3.33 ns 3.27 ns 0.2270 - - 712 B
GetMiss 100 2,722.7 ns 40.64 ns 38.02 ns 0.2251 - - 712 B
GetHit 100 3,400.4 ns 56.32 ns 52.69 ns 0.2518 - - 792 B
SetExisting 100 23,858.8 ns 323.26 ns 302.38 ns 1.2512 - - 3992 B
EvictMiss 100 4,821.1 ns 83.95 ns 74.42 ns 0.2213 - - 712 B
EvictHit 100 30,345.6 ns 490.47 ns 458.79 ns 2.7466 - - 8712 B
Cleanup 100 49,275.6 ns 968.67 ns 994.76 ns 12.0850 - - 37992 B

Investigate simplifying dependency tree around AsyncEx

Currently we are referencing a core package to bring in AsyncReaderWriterLock for the FileCacheLayerBase class - it would be good if we could reference a more specific package or even change the locking system entirely.

RedisRemoteEvictionExtension evicts key on SetAsync

Hi,

Thanks for the great library.

Maybe I'm missing something but I have come across an issue that is confusing me:

Consider:

var layers = new ICacheLayer[]
{
	new RedisCacheLayer(cacheLayerMultiplexer, databaseIndex: 1),
};

var myCacheStack = new CacheStack(
	layers,
	new ICacheExtension[]
	{
		new RedisRemoteEvictionExtension(
			cacheLayerMultiplexer,
			new ICacheLayer[] { }), // Do not evict any layers for this example
	});

// Add something to cache stack
await myCacheStack.SetAsync("hello:world", 123, TimeSpan.FromMinutes(5));

// Check redis, no key found

var result = await myCacheStack.GetAsync<int>("hello:world");

if (result == null)
{
      Console.WriteLine("No cache entry");
}

I wasn't sure on the setup, originally I also had a MemoryCacheLayer as well, and I figured you would pass layersToEvict as:

layersToEvict = layers.Where(x => x.GetType() != typeof(RedisCacheLayer)).ToArray();

And pass this into the RedisRemoteEvictionExtension - however, whatever method I use to configure this extension, it seems to evict the key I add to Redis, from Redis, the moment it's added.

Let me know if I have misunderstood something here! Thanks!

Support for a Counter-caching API

Similar to discussions in dotnet/extensions#709 & stefanprodan/AspNetCoreRateLimit#83, add support for a new counter cache stack. This is a unique cache stack type with atomic actions by design.

public interface ICounterCacheStack<TContext>
{
    ValueTask<long> GetAsync(string cacheKey);
    ValueTask SetAsync(string cacheKey, long value, CacheSettings settings);
    ValueTask<long> IncrementAsync(string cacheKey, CacheSettings settings);
    ValueTask<long> DecrementAsync(string cacheKey, CacheSettings settings);
    ValueTask<long> IncrementByAsync(string cacheKey, long value, CacheSettings settings);
    ValueTask<long> DecrementByAsync(string cacheKey, long value, CacheSettings settings);
}
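A hypothetical usage of the proposed interface, for something like rate limiting (counterCache and clientId are placeholders; none of this exists yet):

// Count requests per client within a one-minute window.
var settings = new CacheSettings(TimeSpan.FromMinutes(1));
var requestCount = await counterCache.IncrementAsync($"requests:{clientId}", settings);
if (requestCount > 100)
{
	// Over the rate limit for this window; reject the request.
}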

Providers:

  • In-memory: Need to research atomic operations for memory storage
  • Redis: INCR, INCRBY, DECR & DECRBY. May need to even go further and use Lua scripts in Redis like this example.
  • MongoDB: Should be possible with custom commands though not sure if it is even necessary. MongoDB would likely never perform fast enough to be used in this manner.

Other thoughts:

  • Still likely need to support auto-evicting etc, at least for the in-memory version. That is unless this new cache stack type has a custom memory provider different from the main one.
  • Do I need to support remote evicting too (provided through my Redis extension)?
  • Does this actually need a TContext?

Protobuf errors with redis provider

I'm getting protobuf serializer errors using the Redis provider when caching custom types. Is this expected? Do I need to follow the protobuf docs to create serializers, or pre-serialize my data into a string, for example?

I don't have an error example at the moment, I can get one later for you if you need to see it. But it's basically "serializer not available for model (default)".

Wrong dependency on CacheTower 1.0.0

What does the bug do?

Prevents installation of CacheTower.Providers.Redis (and maybe others).

How can it be reproduced?

Install CacheTower.Providers.Redis version 0.10.0. It has a declared dependency on CacheTower version 1.0.0, which doesn't exist, so it fails with the error:

error NU1102: Unable to find package CacheTower with version (>= 1.0.0)

Environment

  • NuGet Package Version: 0.10.0
  • .NET Runtime Version: .NET 5.0.302
  • Operating System: Windows

Drop the ICacheContext type requirement

As no internal part of Cache Tower is actually dependent on ICacheContext, the type isn't necessary. It is up to consumers what type they actually want to supply.

Core Requirements

Inspired by the multi-level caching in Nick Craver's caching blog post: https://nickcraver.com/blog/2019/08/06/stack-overflow-how-we-do-app-caching/

CacheTower goal:

  • Multi-level cache with custom providers
  • Generic support for ahead-of-time refresh of caching items
  • Generic support for multi-server cache invalidation and refreshing

Potential API examples based from Nick's post:

[ProtoContract]
public class UserCounts
{
    [ProtoMember(1)] public int UserId { get; }
    [ProtoMember(2)] public int PostCount { get; }
    [ProtoMember(3)] public int CommentCount { get; }
}

public Dictionary<int, UserCounts> GetUserCounts() =>
    Current.SiteCache.GetSet<Dictionary<int, UserCounts>>("All-User-Counts", (old, ctx) =>
    {
        try
        {
            return ctx.DB.Query<UserCounts>(@"
  Select u.Id UserId, PostCount, CommentCount
    From Users u
         Cross Apply (Select Count(*) PostCount From Posts p Where u.Id = p.OwnerUserId) p
         Cross Apply (Select Count(*) CommentCount From PostComments pc Where u.Id = pc.UserId) pc")
                .ToDictionary(r => r.UserId);
        }
        catch(Exception ex)
        {
            Env.LogException(ex);
            return old; // Return the old value
        }
    }, 60, 5*60);

Initial version:

Potential cache provider interface:

Task Evict(string cacheKey);
Task Refresh(string cacheKey);
Task<T> Get(string cacheKey, Func<T, CacheContext, T> getter, CacheSettings settings);

NuGet: No package Id exists

Hey @miketimofeev - sorry to tag you this way but I can't comment on actions/runner-images#3038 (see screenshot below):

(screenshot omitted)

Given that you wanted to know if the issue was still happening, I'm still experiencing it in this repo, see the following action run: https://github.com/TurnerSoftware/CacheTower/runs/2351357022

Anyway, hopefully you get notified about this so you know the status of the fix. (Tried to see if there was another way to contact you but I couldn't find it easily)

Add Redis Support

As a separate library (and probably using SE.Redis), add a Redis caching layer.

Additionally look at having the pub/sub stuff Nick talks about in his blog post. This allows purging outdated data on other web servers.

Race condition and problem with GetOrAdd background refresh

Thank you for the latest updates with fixes to dependencies! I'm testing some things today and I think I found a small race condition, and a bug.

First the race, which only seems to happen with Redis (at least, it's the only place I can observe it). It's a really small one, probably unlikely to happen in the real world, because logically you wouldn't do this. Basically, if a cache entry exists in Redis but not in local memory, so for example if the app is restarted, and you call GetOrAddAsync immediately followed by EvictAsync on the same cache key, the entry is not evicted. I think there's a race between the background refresh in GetOrAdd and the eviction.

... After I wrote that I checked on something else, and while it is a race condition, it's caused by the issue below, which is kind of a big deal, keep reading...

Now the bug... In GetOrAdd, you check stale date to trigger a background refresh. In my case stale time is 0. Here's that check.

if (cacheEntry.GetStaleDate(settings) < currentTime)
{
}

//----

public DateTime GetStaleDate(CacheSettings cacheSettings)
{
	return Expiry - cacheSettings.TimeToLive + cacheSettings.StaleAfter;
}

But isn't Expiry - cacheSettings.TimeToLive always the date the entry was created? And therefore always less than currentTime? This triggers a background refresh on every access through GetOrAdd if stale time is 0 (at worst), and after staleTime if staleTime is greater than 0 (at best). But it basically ignores the expiry for refreshes.
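To make that concrete: with a TimeToLive of 60 minutes and StaleAfter of 0, an entry created at 12:00 has Expiry = 13:00, so GetStaleDate returns 13:00 - 60 minutes + 0 = 12:00. That is the creation time, which is always in the past, so every GetOrAdd call sees a "stale" entry and triggers a background refresh.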

I've confirmed this, because if I set a stale time, then I can no longer reproduce the race condition. And because the following always prints "refresh" in separate runs within the 2 minute expiration.

var result2 = await c.GetOrSetAsync<string>("getorsetasync", old =>
{
    Console.WriteLine("refresh");
    return Task.FromResult("hello world");
}, new CacheSettings(TimeSpan.FromMinutes(2)));

And this prints "refresh" twice.

var result2 = await c.GetOrSetAsync<string>("getorsetasync", old =>
{
    Console.WriteLine("refresh");
    return Task.FromResult("hello world");
}, new CacheSettings(TimeSpan.FromMinutes(2)));

await Task.Delay(500);

result2 = await c.GetOrSetAsync<string>("getorsetasync", old =>
{
    Console.WriteLine("refresh");
    return Task.FromResult("hello world");
}, new CacheSettings(TimeSpan.FromMinutes(2)));

Add Dependency Injection API

Add Microsoft Dependency Injection Abstraction to the various libraries and add extension methods where appropriate.

Create an all-round benchmark for CacheStack

Currently the cache layer comparison benchmark performs work across all the major APIs fairly evenly. It would be good to additionally include a CacheStack version to see how the overhead works in a more "real world"/less specific case.

Add File Cache Layer

Add a basic file cache layer - will need to work out what format is best and how best to serialize the data. Because files are persistent and a cache doesn't need to be, it might even be fine to have binary serialization where, if deserialization fails, it just calls the getter.

Wouldn't be the most popular layer but is vastly better than re-processing an expensive getter every call.

Cache fails to retrieve empty Dictionary from Redis

This is a weird one. I get a null result retrieving an empty Dictionary from Redis. A Dictionary with values works fine.

Here's a repro. The .Dump() method is from LINQPad.

async Task Main()
{
    var config = ConfigurationOptions.Parse("connect string");
    config.AllowAdmin = true;

    var conn = ConnectionMultiplexer.Connect(config);

    var lockOptions = new RedisLockOptions(databaseIndex: 7);
    var redisLayer = new RedisCacheLayer(conn, 8);

    var cache = new CacheStack(
        new ICacheLayer[]
        {
            redisLayer
        },
        new ICacheExtension[]
        {
            new AutoCleanupExtension(TimeSpan.FromMinutes(5)),
            new RedisLockExtension(conn, lockOptions)
        });

    await cache.EvictAsync("mykey");

    var d = await cache.GetOrSetAsync<Dictionary<string,string>>("mykey",
        old => Task.FromResult(new Dictionary<string,string>()),
        new CacheSettings(TimeSpan.FromMinutes(1), TimeSpan.FromMinutes(1)));
    
    d.Dump();

    d = await cache.GetOrSetAsync<Dictionary<string, string>>("mykey",
        old => Task.FromResult(new Dictionary<string, string>()),
        new CacheSettings(TimeSpan.FromMinutes(1), TimeSpan.FromMinutes(1)));

    // d is null here
    d.Dump();
}

Concept - Unpopular cache items don't forward propagate

https://blog.cloudflare.com/why-we-started-putting-unpopular-assets-in-memory/

For Cloudflare, they wanted to prevent unnecessary writes to SSDs to keep them around longer so they didn't want to have unpopular cache items in their SSD cache. On top of the lifetime benefits, it also increased overall performance as writes-while-reading slows down individual chips on an SSD.

While CacheTower is unaware of the storage medium, it would be interesting to implement a similar approach. If we track how often an item in the cache is hit, we can track whether we should even back populate it from in-memory to whatever other cache layers are configured.

This can prevent unnecessary writing and storage for "one-hit-wonders" in deeper layers of the cache stack. If you are using Redis, it would take pressure off the Redis server. If you are using Disk, it would take pressure off the disk.

At minimum we would need:

  • To track every cache access (As a uint perhaps?)
  • Update back populate logic to check if cache access is often or not
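A small sketch of what that back-population guard could look like (hypothetical names; not part of Cache Tower):

// Hypothetical hit counter guarding back-population to deeper layers.
private readonly ConcurrentDictionary<string, uint> accessCounts = new();

private bool ShouldBackPopulate(string cacheKey)
{
	// Track every cache access as a uint, as suggested above.
	var hits = accessCounts.AddOrUpdate(cacheKey, 1u, (_, count) => count + 1u);
	// Only write entries seen more than once back to slower layers,
	// filtering out "one-hit-wonders".
	return hits >= 2;
}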

Considerations:

  • Does this only work for in-memory cache?
  • If not, does this work just for the first cache layer?

Improving support for more complex IoC usecases

See #163 (comment)

At the core of the change is implementing the interfaces below:

public interface ICacheContextActivator
{
    ICacheContextScope BeginScope();
}

public interface ICacheContextScope : IDisposable
{
    T Resolve<T>();
}

And the CacheStack<TContext>.GetOrSetAsync to internally go:

return await GetOrSetAsync<T>(cacheKey, async (old) =>
{
    using var scope = CacheContextActivator.BeginScope();
    var context = scope.Resolve<TContext>();
    return await getter(old, context);
}, settings);

To be backwards compatible, we'd need 2 implementations of this new activator/scope system - one that fakes it to support the older Func<TContext> method and one that properly supports the Microsoft DI services types.

Improving the documentation/usability around Redis caching and extensions

From #163 (comment)

I think from this perspective adding the Redis setup examples would be good. The RedisLockExtension has RedisLockOptions and it isn't clear at first glance what the configurable options are for. Also, with the RedisRemoteEvictionExtension, I think it needs to be made clear that the layers you want to evict from will generally not be the Redis layer itself.

This aspect would be important to improve to help make onboarding to Cache Tower easier.

Add Complex Objects to Tests & Benchmarks

Currently the test suite is quite limiting to the types of objects it caches. To make a more real world example, more variety of types (including complex types) should be tested.

Similarly with benchmarking, more complex objects would put more strain on the different cache layers so their true real world performance would come through more accurately.

Backport #170 (IoC improvements) into existing release

Hi,

I notice master has been stuck on this custom serialiser work for a while, and in addition I see you are now stuck with a protobuf change that doesn't seem to be getting much traction over there. As I believe(?) we have had a partial merge of some of the custom serialiser work, master is effectively blocked.

Would you be happy to do a 0.9.1 release, or a 0.10.0 release (I guess 0.10.0 is more relevant from a semver perspective) where we fork from pre-master (we can fork the 0.9 tag) and cherry pick my #170 PR into it? (and I guess anything else that might be wanted in a fix release)

I'm happy to prep it as a branch, but I don't want to mess up your future release plans?

Many thanks!

Mike

`RedisCacheLayer` should not be allowed in `RedisRemoteEvictionExtension`

See #163 (comment)

While the main issue is that RedisCacheLayer shouldn't be allowed, it is reasonable to consider that no distributed caches should be allowed (though let's be honest, would someone use a different distributed cache if they used Redis for remote eviction?).

Solution 1

Loop over the cache layers and check if any are a RedisCacheLayer (eg. cacheLayer is RedisCacheLayer). Problem with this approach is that it is in a different library and referencing it would suck.

Solution 2

Have some sort of shared IDistributedCacheLayer in the main CacheTower library and use that as the check. This avoids the problem above with the only downside that we have an interface we don't really care about (all cache layers are treated the same otherwise).

GetOrSetAsync fails to write value to redis when any redis extensions are enabled

I'm trying a basic Redis-backed setup. SetAsync works, but GetOrSetAsync is not setting the value into Redis. It is setting the value into memory, however, and does return the expected value.

If I remove the Redis extensions, then it works and does set the value into Redis. However, I can't really use Redis without them, since I have scaled app services in Azure. Note I need to remove both Redis extensions; having either one will cause the problem.

Here's a basic reproducer I mocked up in linqpad.

var redisConnection = ConnectionMultiplexer.Connect("-azure-cnxn-string-");

var c = new CacheStack(
    new ICacheLayer[]
    {
        new MemoryCacheLayer(),
        new RedisCacheLayer(redisConnection, 0)
    },
    new ICacheExtension[]
    {
        new AutoCleanupExtension(TimeSpan.FromMinutes(5)),
        new RedisLockExtension(redisConnection, 0),
        new RedisRemoteEvictionExtension(redisConnection)
    });

// works
var result1 = c.SetAsync("setasync", "hello world", TimeSpan.FromHours(24)).GetAwaiter().GetResult();
// doesn't work
var result2 = c.GetOrSetAsync<string>("getorsetasync", old => Task.FromResult("hello world"), new CacheSettings(TimeSpan.FromMinutes(2))).GetAwaiter().GetResult();
result1.Dump();
result2.Dump();

Add Information on Requirements for Contributing

This is to, in particular, highlight what is needed for tests to pass on a local machine. Currently this project doesn't use containers for development, so the tools need to be installed on the local machine.

Currently there are two primary requirements:

  • MongoDB
  • Redis

Should go into detail about minimum versions and alternatives (like Memurai for doing Redis on Windows).

Bring your own serializer

Wanted to see how you feel about giving a user the ability to bring their own serializer to the RedisCacheLayer or potentially any cache layer

The current situation as I understand it is:

MongoDB -> BSON only by virtue of the library used
Filesystem -> JSON or Protobuf, provided as 2 distinct implementations of ICacheLayer
Redis -> Protobuf only

It's a good choice to have Proto be the default, but it may increase adoption of the library if people are able to use JSON encoding without needing to implement their own ICacheLayer for Redis

It seems to me there are different ways this could be achieved, but wanted to first see if you are even open to the idea

CacheSettings - default StaleAfter to TimeToLive if one isn't provided

Currently if you only provide a TTL to CacheSettings, then all entries are always considered stale. This forces the user to always engage with this additional complexity and I can imagine would be a pretty common way the library is accidentally misused.

I suggest that the default value for StaleAfter should be TimeToLive and not 0, so only users who actually need staleness to exist as a distinct concept have to deal with it. Curious to hear your thoughts on this.

This would break users who want StaleAfter to be 0 and aren't currently explicitly passing that in, which does complicate things.
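As a sketch, the proposed default could look like this (an illustrative constructor, not the current API):

// Hypothetical: StaleAfter falls back to TimeToLive when not provided,
// so entries are never implicitly "always stale".
public CacheSettings(TimeSpan timeToLive, TimeSpan? staleAfter = null)
{
	TimeToLive = timeToLive;
	StaleAfter = staleAfter ?? timeToLive;
}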

Is it possible to bypass a cache layer?

Is there a way to define different layers for different data types?
Say we have CacheTower set up with two layers, memory and Redis, but have some big objects we never want to store in memory. What would be the recommended way to go about this?

We could workaround it in a few ways like having different CacheTowers defined with different layers but I wonder if there is something already in place like bypassing?

Cheers

SetAsync and EvictAsync do not do remote eviction

I'm surprised I didn't notice this before, but SetAsync and EvictAsync don't do remote eviction through the Redis remote eviction extension. This leaves incorrect cache data in other instances' memory caches, for example.

This is a really critical feature for distributed cache.

Add FileCacheLayerBase benchmarks

These would be to determine the actual overhead from the current implementation of FileCacheLayerBase. The serialize/deserialize methods would effectively be a no-op.
