mcollina / async-cache-dedupe
Async cache with dedupe support
License: MIT License
As titled: we should be able to call an invalidation function as well as a clear function, supporting wildcards, for both memory and redis storage.
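A rough sketch of what glob-style invalidation could look like over an in-memory store (the class, method names, and '*' semantics here are hypothetical, not the library's actual API):

```javascript
// Toy in-memory store with wildcard invalidation; names are illustrative only.
class MemoryStore {
  constructor () { this.map = new Map() }
  set (key, value) { this.map.set(key, value) }
  get (key) { return this.map.get(key) }
  // Invalidate every key matching a glob-like pattern, e.g. 'user:*'
  invalidate (pattern) {
    const re = new RegExp('^' + pattern.split('*').map(escapeRegExp).join('.*') + '$')
    for (const key of this.map.keys()) {
      if (re.test(key)) this.map.delete(key)
    }
  }
  // Drop everything, regardless of pattern
  clear () { this.map.clear() }
}

function escapeRegExp (s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}
```
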
For some use cases where we don't necessarily need the latest data as soon as ttl is passed, it can be useful to be able to specify a separate interval for staleness so we can continue serving data that's considered stale while we trigger a refetch in the background.
This feature is similar in principle to #14, and can possibly share some implementation details.
Proposal:
Accept a new staleWhileRevalidate: number argument (or maybe staleWhileRefetch, since we're not necessarily going to be doing any revalidation here?), specifying the staleness interval.
Between the ttl and the staleness interval, data is considered stale, and any requests for it will resolve immediately with stale data while triggering a deduped refetch of the data in the background. After the interval, stale data must not be used and requests for the data must await on the refetch of the new data before being served.
Defaulting to 0 should preserve existing behavior where no stale data is ever served.
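The proposed three states can be sketched as a pure decision function (names and units are illustrative, not an implementation):

```javascript
// Sketch of the freshness decision for a proposed staleWhileRevalidate option.
// 'fresh'   -> serve from cache
// 'stale'   -> serve stale data, trigger a deduped refetch in the background
// 'expired' -> must await the refetch before serving
function freshness (storedAt, ttl, staleWhileRevalidate, now = Date.now()) {
  const age = (now - storedAt) / 1000 // seconds since the entry was stored
  if (age < ttl) return 'fresh'
  if (age < ttl + staleWhileRevalidate) return 'stale'
  return 'expired'
}
```

Note that with staleWhileRevalidate defaulting to 0, the 'stale' window collapses and the current behavior is preserved.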
Thoughts?
I have a bunch of higher priority customer-facing stuff I have to work on over the next few weeks before I'm going to be able to work on an optimization making use of this, but I'd be happy to take a crack at a PR if nobody has tackled it by then!
Good day,
When setting the log level to a level higher than debug in fastify, log.debug becomes undefined and causes a failure.
I can submit a PR that checks for the level's existence and skips the call if it isn't there.
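The check could be as simple as guarding the call (sketch only; the actual patch location in the codebase may differ):

```javascript
// Only call log.debug when the logger actually exposes it at the current level.
function safeDebug (log, msg) {
  if (log && typeof log.debug === 'function') {
    log.debug(msg)
  }
}
```
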
The memory storage module uses setImmediate, which is not implemented by modern browsers. The test suite adds a shim to work around this, but it's not otherwise mentioned anywhere.
I propose a solution that doesn't require externally providing setImmediate:
const { isServerSide } = require('../util')
const setImmediate = isServerSide ? globalThis.setImmediate : (fn, ...args) => setTimeout(fn, 0, ...args);
I've done this in a fork; it passes the tests and works in my actual application: https://github.com/dimfeld/async-cache-dedupe/tree/fix-setimmediate-in-browser
The main problem is that the code coverage check fails because it's not exercising the browser-only part of that line. I'm not sure how to deal with that.
What do you think? Happy to submit a PR if this sounds good to you and there's some way to make the test coverage happy.
Describe the bug
Using async-cache-dedupe in a simple web application, I'm getting the following error: setTimeout(...).unref is not a function.
To Reproduce
Steps to reproduce the behavior: create a cache object in a browser environment.
Reproduction
https://codesandbox.io/s/objective-firefly-25c5tu?file=/src/index.js
The unref method belongs to Node's Timeout class, so a good fallback would be to check whether the method is defined:
function now () {
if (_timer !== undefined) {
return _timer
}
_timer = Math.floor(Date.now() / 1000)
const timeout = setTimeout(_clearTimer, 1000)
// istanbul ignore next
if (typeof timeout.unref === 'function') timeout.unref()
return _timer
}
I think it can be done if cache.define returns the cache instance instead of void. If you are okay with it, I can prepare a PR.
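A minimal illustration of the change, using a toy class rather than the real Cache (returning this is all that's needed for chaining):

```javascript
// Toy illustration of a chainable define(): returning the instance lets
// callers register several cached functions in one expression.
class Cache {
  define (name, fn) {
    this[name] = fn
    return this // returning the instance instead of void enables chaining
  }
}
```
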
As mentioned here: #8, safe-stable-stringify is a bottleneck in key generation.
The safe-stable option is not always required. When the source of the keys is well known, the developer can choose to use the JSON.stringify function instead of the safe one.
A GraphQL server, for example, always makes the request in the same way, and many cache parameters are very simple, e.g. {"id": "abc"}.
Using the non-safe version can cause more values in the cache, but sometimes the result is better than having an over-optimised cache.
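A one-line demonstration of the trade-off: plain JSON.stringify is key-order sensitive, so two logically equal objects can produce two cache keys, which is the "more values in the cache" cost mentioned above:

```javascript
// Same logical object, different key order: JSON.stringify preserves
// insertion order, so the two serialized keys differ. A stable stringify
// would produce the same key for both.
const a = JSON.stringify({ id: 'abc', page: 1 })
const b = JSON.stringify({ page: 1, id: 'abc' })
const distinctKeys = a !== b // true: two cache entries for equal inputs
```
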
The ttl option can either be a number or a function that returns a number. Recently a stale option has been added that supports a stale-while-revalidate strategy; however, it can only be a static number. Should this mirror the ttl option and allow a function to determine the stale config (using the injected cache result, same as the ttl function)?
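A sketch of how the option could be resolved, mirroring the existing ttl behavior (resolveStale is a hypothetical helper, not library code):

```javascript
// Resolve a `stale` option that may be either a number or a function of
// the cached result, mirroring how `ttl` already works.
function resolveStale (stale, result) {
  return typeof stale === 'function' ? stale(result) : stale
}
```
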
Really like this project! Good job!
I was going through the source code but didn't find an option to renew the TTL on a hit. Is it possible to add that as an option?
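For illustration, such a sliding-ttl behavior could look like this toy sketch (not the library's API; the point is that a hit pushes the expiry forward):

```javascript
// Toy sliding ttl: every hit renews the entry's expiry by its original ttl.
const entries = new Map()

function set (key, value, ttlMs, now = Date.now()) {
  entries.set(key, { value, expiresAt: now + ttlMs, ttlMs })
}

function get (key, now = Date.now()) {
  const entry = entries.get(key)
  if (!entry || now >= entry.expiresAt) return undefined
  entry.expiresAt = now + entry.ttlMs // renew ttl on hit
  return entry.value
}
```
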
As titled: if the serializer function throws an error, an onError event should be emitted.
as titled
[11:22:07.302] ERROR (28): acd/storage/redis.clearReferences error
err: {
"type": "RangeError",
"message": "Invalid string length",
"stack":
RangeError: Invalid string length
at Object.write (/usr/app/node_modules/ioredis/built/Pipeline.js:310:29)
at EventEmitter.sendCommand (/usr/app/node_modules/ioredis/built/Redis.js:387:28)
at execPipeline (/usr/app/node_modules/ioredis/built/Pipeline.js:330:25)
at Pipeline.exec (/usr/app/node_modules/ioredis/built/Pipeline.js:282:5)
at StorageRedis.clearReferences (/usr/app/node_modules/async-cache-dedupe/src/storage/redis.js:323:56)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async StorageRedis._invalidateReferences (/usr/app/node_modules/async-cache-dedupe/src/storage/redis.js:227:5)
at async StorageRedis.invalidate (/usr/app/node_modules/async-cache-dedupe/src/storage/redis.js:193:16)
at async Cache.invalidateAll (/usr/app/node_modules/async-cache-dedupe/src/cache.js:185:5)
at async Promise.all (index 0)
}
When I try to invalidate an array of references, I catch this error.
I'm using chunks with a maximum array size of 10 when there are many references, but it even crashes with 4.
I think it may be because I'm operating with big data in redis, but the library should handle this.
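For reference, a chunking helper like the one described might look like this (the chunk size and name are illustrative; chunking alone does not fix the underlying pipeline size issue):

```javascript
// Split a large references array into bounded chunks so each Redis
// pipeline stays small enough to serialize.
function chunk (array, size) {
  const out = []
  for (let i = 0; i < array.length; i += size) {
    out.push(array.slice(i, i + size))
  }
  return out
}
```
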
Let's say we want to cache something like an access token where expiration is set from the server.
const cache = createCache();
cache.define('fetchSomething', fetchSomethingHandler);
async function fetchSomethingHandler() {
const data = { "token": "abc", "expiresInSeconds": 60 }
// something like this
cache.fetchSomething.ttl(data.expiresInSeconds)
return data;
}
expiresInSeconds changes on every function call.
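The behavior being asked for, boiled down to a toy cache where each entry's expiry is derived from the fetched data (illustrative only, not the library's API):

```javascript
// Toy cache with a per-entry ttl supplied at set() time, so expiry can be
// driven by the fetched data itself (e.g. a token's expiresInSeconds).
class TtlCache {
  constructor () { this.map = new Map() }
  set (key, value, ttlSeconds, now = Date.now()) {
    this.map.set(key, { value, expiresAt: now + ttlSeconds * 1000 })
  }
  get (key, now = Date.now()) {
    const entry = this.map.get(key)
    if (!entry) return undefined
    if (now >= entry.expiresAt) { this.map.delete(key); return undefined }
    return entry.value
  }
}
```
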
I find the difference between ttl and stale confusing. For me they are basically the same thing. Would be helpful to have a more in-depth explanation and/or example.
I wanted to try this library, started with my scaffolding based on TypeScript, and noticed it doesn't ship any type declarations at all, making my IDE and my compiler yell at me all the time.
I created a .d.ts file and included it in my tsconfig.json, yet I have some questions:
With the define property I'm missing the chance to correctly type my functions inside the cache, so something like the following is plausible and TypeScript wouldn't care:
import { createCache } from "async-cache-dedupe";
const cache = createCache({
ttl: 5, // seconds
storage: { type: "memory" },
});
cache.define("fetchSomething", async (k: any) => {
console.log("query", k);
// query 42
// query 24
return { k };
});
cache.fetchSomething();
Is there something I'm not understanding correctly, or isn't this library suitable for TS projects?
I would like to use async-cache-dedupe for HTTP request caching. However, I'm missing a way to set the stale value on a per-entry basis after the request is completed, i.e.
cache.define('request', async url => {
const { headers, body } = await undici.request(url)
if (headers['cache-control']) {
// Set the stale/ttl for the return value
}
return await body.json()
})
Any ideas whether this would be possible to add?
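If such a hook existed, deriving the ttl from the response header might look like this sketch (only the max-age directive is handled; ttlFromCacheControl is a hypothetical helper, not library code):

```javascript
// Extract a ttl (in seconds) from a Cache-Control header value.
// Returns 0 (no caching) when max-age is absent or the header is missing.
function ttlFromCacheControl (header) {
  const match = /max-age=(\d+)/.exec(header || '')
  return match ? Number(match[1]) : 0
}
```
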
A quick analysis with a flamegraph shows that safe-stable-stringify
is the major bottleneck for this cache.
We should investigate if we could come up with a faster algorithm for hashing objects but with the same properties.
I'm logging data with onHit.
I saw onHit being called with keyA and the result was dataA, sometime later I saw onHit being called again with keyA, but with result dataB.
I also was logging the storage set function and saw it is not being called in between the above two onHits.
dataA looks correct, but in order to get dataB, I have to add an extra query param which I didn't do in my codebase.
There are two problems here.
How is this possible? How can I debug it to find where is the problem?
I'm using memory as cache, but I'm going to try redis and see if that solves the problem
EDIT:
This problem doesn't happen in redis
As well as onDedupe, onHit, and onMiss, we could have an onError event.
Consider implementing also "stale on error" and related options for when an error occurs; for example, a first implementation could serve the latest response for a set amount of time.
It has been observed that the transformer does not function correctly when the references argument is set in cache.define. Specifically, transformer.serialize is not called, resulting in the cache never hitting.
node: 18.15.0
async-cache-dedupe: 1.10.2
Steps to reproduce:
1. Set the references argument in cache.define.
2. Run the cache[name] function created by cache.define.
3. Observe that transformer.serialize is not being called and the cache is not hitting.
// 1. Create a new cache, setting transformer options.
const cache = createCache({
ttl: 5,
storage: {
type: 'redis',
options: {
client: redis
}
},
onHit: (key) => console.log("HIT", key), // never hitting
onMiss: (key) => console.log("MISS", key),
transformer: {
serialize: (data: Object) => { // never calling serialize
console.log('serialize called');
return SuperJSON.serialize(data)
},
deserialize: (data: SuperJSONResult) => {
console.log('deserialize called');
return SuperJSON.deserialize(data)
}
}
}) as Cache & CachedFunctions;
// 2. Set the `reference` argument in `cache.define`
cache.define('fetchSomethingWithRef', {
references(args, key) {
return [`key`]
},
}, async (date) => {
return { date }
})
// 3. Run the `cache[name]` function created by `cache.define`
const main = async () => {
const p1 = await cache.fetchSomethingWithRef(new Date("2000-01-14"));
const p2 = await cache.fetchSomethingWithRef(new Date("2000-02-27"));
const p3 = await cache.fetchSomethingWithRef(new Date("2000-01-14"));
const p4 = await cache.fetchSomethingWithRef(new Date("2000-02-27"));
}
main();
// output:
// deserialize called
// MISS "2000-01-14T00:00:00.000Z"
// deserialize called
// MISS "2000-02-27T00:00:00.000Z"
// deserialize called
// MISS "2000-01-14T00:00:00.000Z"
// deserialize called
// MISS "2000-02-27T00:00:00.000Z"
Even when the references argument is set in cache.define, transformer.serialize should be called and the cache should hit. Instead, the cache never hits and transformer.serialize is not called when the references argument is set in cache.define.
This issue has been confirmed for both memory and redis storage types.
I want to get data from the cache synchronously on a hit, to avoid an extra event-loop tick, but the cache.get method always returns a promise.
Is there an existing workaround to achieve that? It would be really appreciated.
as titled
The global cache TTL takes precedence over the more specific function TTL when the function TTL is 0. This happens in async-cache-dedupe/src/cache.js (line 103 in afdf82b), where opts.ttl is falsy.
Please see the following example, where caching should be disabled for fetchSomething. As can be seen from the output, the function is called only once.
$ cat example.mjs
import { createCache } from 'async-cache-dedupe'
const cache = createCache({
ttl: 5,
storage: { type: 'memory' },
})
cache.define('fetchSomething', { ttl: 0 }, async (k) => {
console.log('query', k)
return { k }
})
await cache.fetchSomething(1)
await cache.fetchSomething(1)
await cache.fetchSomething(1)
$ node example.mjs
query 1
I've tested against version 1.2.2 of async-cache-dedupe.
$ cat package.json
{
"name": "zero-ttl",
"version": "1.0.0",
"dependencies": {
"async-cache-dedupe": "1.2.2"
}
}
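A likely mechanism, assuming the default is applied with a truthiness check (the exact expression in cache.js may differ), is that || discards an explicit 0, while nullish coalescing would keep it:

```javascript
// A truthiness check treats an explicit ttl of 0 as "unset" and falls back
// to the global ttl; nullish coalescing only falls back for null/undefined.
const globalTtl = 5
const falsyDefault = (ttl) => ttl || globalTtl   // buggy behavior
const nullishDefault = (ttl) => ttl ?? globalTtl // keeps an explicit 0
```
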
Hey guys,
I'm using a few different libraries together: prisma, prisma-redis-middleware (which uses this lib under the hood), and trpc. On the first call I receive a Date object, but on subsequent calls I receive a string. I believe this is because JSON.stringify and JSON.parse don't keep the data types intact; with superjson you could keep the types as-is.
I don't know the performance hit, and I have a strong feeling it would be worse than the current solution, since, as someone mentioned, the bottleneck is the JSON parsing.
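The type loss can be shown in one line: a Date survives a JSON.stringify/JSON.parse round trip only as a string, which is why a richer transformer such as superjson is being suggested:

```javascript
// A Date serialized with JSON.stringify comes back as an ISO string,
// not a Date instance, after JSON.parse.
const roundTripped = JSON.parse(JSON.stringify({ createdAt: new Date() }))
const typeAfter = typeof roundTripped.createdAt
```
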
Thanks
as titled
First of all, thanks for this great plugin.
I'm using it to cache some database searches, and I want to answer clients with an X-Cache-Status header that reports whether the data came from the cache or the database (header values Hit or Miss).
cache.define('fetchSomething', async (k) => {
return { k }
})
fastify.get('/foo', async function (request, reply) {
const p1 = await cache.fetchSomething(42)
// If p1 comes from a cache Hit, set reply header[X-Cache-Status] to Hit. Else Miss.
return reply.send({msg: 'Hello'});
});
Is there a way to tell if the result of fetchSomething comes from the cache (Hit) or from the database (Miss)? I've been struggling with the onHit, onMiss... events, but they only receive the key as a parameter.
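One possible workaround with the current hooks: record misses per key inside onMiss and check after awaiting the cached function (the Set bookkeeping is illustrative and assumes one in-flight request per key at a time):

```javascript
// Track which keys missed the cache; the onMiss hook only receives the key,
// so a side channel is needed to surface the status to the request handler.
const missed = new Set()
const onMiss = (key) => missed.add(key)

function cacheStatus (key) {
  const status = missed.has(key) ? 'Miss' : 'Hit'
  missed.delete(key) // reset for the next request on this key
  return status
}
```
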
Thanks!
When an error occurs, we could add a staleOnError option to serve the latest cached response, for example:
const cache = createCache({ ttl: 60 })
cache.define('fetchUser', {
staleOnError: 10
},
(id) => database.find({ table: 'users', where: { id }}))
Note the error is caused by the defined function, so in this example the database may not respond.
For the first version, I'd go with a simple time (in seconds); later we could add a function to allow some logic, but I'm not sure at the moment.
We must also renew the cache ttl for stale entries.
This logic should go here: https://github.com/mcollina/async-cache-dedupe/blob/main/src/cache.js#L247
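A sketch of the proposed behavior: when the defined function throws, serve the last cached entry if it is still inside the staleOnError window (names, shapes, and the synchronous style are all illustrative, not an implementation):

```javascript
// If the fetcher throws, fall back to the cached entry when it is within
// the staleOnError window (seconds); otherwise propagate the error.
function fetchWithStaleOnError (fetcher, entry, staleOnError, now = Date.now()) {
  try {
    return { value: fetcher(), storedAt: now }
  } catch (err) {
    const withinWindow = entry && now - entry.storedAt <= staleOnError * 1000
    if (withinWindow) return entry // serve the stale entry instead of failing
    throw err // outside the window: the caller sees the original error
  }
}
```
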
Hello,
I wanted to try adding support for the @upstash/redis HTTP client, for using this package in serverless functions on the edge.
The current work in progress can be viewed here: https://github.com/cemreinanc/async-cache-dedupe/tree/feature/upstash
I copied and modified the redis storage code and tests to adapt them to the upstash SDK, but the problem is that the upstash HTTP client connects to a replica set and only supports eventual consistency, not the strong consistency where we can read our own writes immediately. That's why half of the test suite fails with the new upstash storage.
Now I have a couple of questions:
And finally, do you think this support is achievable under these circumstances?
Similar to #55, the Cloudflare workerd runtime does not provide setImmediate.
Rather than using the isServerSide check, I propose simply checking for the existence of the setImmediate function. I'll submit a PR shortly!
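The proposed existence check, sketched (the variable name is illustrative):

```javascript
// Fall back to setTimeout only when setImmediate is actually missing,
// instead of branching on an isServerSide flag.
const setImmediateImpl = typeof setImmediate === 'function'
  ? setImmediate
  : (fn, ...args) => setTimeout(fn, 0, ...args)
```
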