mcollina / async-cache-dedupe
Async cache with dedupe support
License: MIT License
As titled: we should be able to call an invalidation function as well as a clear function, supporting wildcards, for both memory and redis storage.
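A rough sketch of what glob-style invalidation could look like over an in-memory store (the class, method names, and '*' semantics here are hypothetical, not the library's actual API):

```javascript
// Toy in-memory store with wildcard invalidation; names are illustrative only.
class MemoryStore {
  constructor () { this.map = new Map() }
  set (key, value) { this.map.set(key, value) }
  get (key) { return this.map.get(key) }
  // Invalidate every key matching a glob-like pattern, e.g. 'user:*'
  invalidate (pattern) {
    const re = new RegExp('^' + pattern.split('*').map(escapeRegExp).join('.*') + '$')
    for (const key of this.map.keys()) {
      if (re.test(key)) this.map.delete(key)
    }
  }
  // Drop everything, regardless of pattern
  clear () { this.map.clear() }
}

function escapeRegExp (s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}
```
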
For some use cases where we don't necessarily need the latest data as soon as ttl is passed, it can be useful to be able to specify a separate interval for staleness so we can continue serving data that's considered stale while we trigger a refetch in the background.
This feature is similar in principle to #14, and can possibly share some implementation details.
Proposal:
Accept a new staleWhileRevalidate: number argument (or maybe staleWhileRefetch, since we're not necessarily going to be doing any revalidation here?), specifying the staleness interval.
Between the ttl and the staleness interval, data is considered stale, and any requests for it will resolve immediately with stale data while triggering a deduped refetch of the data in the background. After the interval, stale data must not be used and requests for the data must await on the refetch of the new data before being served.
Defaulting to 0 should preserve existing behavior where no stale data is ever served.
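The proposed three states can be sketched as a pure decision function (names and units are illustrative, not an implementation):

```javascript
// Sketch of the freshness decision for a proposed staleWhileRevalidate option.
// 'fresh'   -> serve from cache
// 'stale'   -> serve stale data, trigger a deduped refetch in the background
// 'expired' -> must await the refetch before serving
function freshness (storedAt, ttl, staleWhileRevalidate, now = Date.now()) {
  const age = (now - storedAt) / 1000 // seconds since the entry was stored
  if (age < ttl) return 'fresh'
  if (age < ttl + staleWhileRevalidate) return 'stale'
  return 'expired'
}
```

Note that with staleWhileRevalidate defaulting to 0, the 'stale' window collapses and the current behavior is preserved.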
Thoughts?
I have a bunch of higher priority customer-facing stuff I have to work on over the next few weeks before I'm going to be able to work on an optimization making use of this, but I'd be happy to take a crack at a PR if nobody has tackled it by then!
Good day,
When setting the log level to a level higher than debug in fastify, log.debug becomes undefined and causes a failure.
I can submit a PR that checks for the level's existence and skips the call if it isn't there.
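The check could be as simple as guarding the call (sketch only; the actual patch location in the codebase may differ):

```javascript
// Only call log.debug when the logger actually exposes it at the current level.
function safeDebug (log, msg) {
  if (log && typeof log.debug === 'function') {
    log.debug(msg)
  }
}
```
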
The memory storage module uses setImmediate, which is not implemented by modern browsers. The test suite adds a shim to work around this, but it's not otherwise mentioned anywhere.
I propose a solution that doesn't require externally providing setImmediate:
const { isServerSide } = require('../util')
const setImmediate = isServerSide ? globalThis.setImmediate : (fn, ...args) => setTimeout(fn, 0, ...args);
I've done this in a fork; it passes the tests and works in my actual application: https://github.com/dimfeld/async-cache-dedupe/tree/fix-setimmediate-in-browser
The main problem is that the code coverage check fails because it's not exercising the browser-only part of that line. I'm not sure how to deal with that.
What do you think? Happy to submit a PR if this sounds good to you and there's some way to make the test coverage happy.
Describe the bug
Using async-cache-dedupe in a simple web application, I'm getting the following error: setTimeout(...).unref is not a function.
To Reproduce
Steps to reproduce the behavior: create a cache object in a browser environment.
Reproduction
https://codesandbox.io/s/objective-firefly-25c5tu?file=/src/index.js
The unref method belongs to Node's Timeout class, so a good fallback would be to check whether the method is defined:
function now () {
if (_timer !== undefined) {
return _timer
}
_timer = Math.floor(Date.now() / 1000)
const timeout = setTimeout(_clearTimer, 1000)
// istanbul ignore next
if (typeof timeout.unref === 'function') timeout.unref()
return _timer
}
I think it can be done if cache.define returns the cache instance instead of void. If you are okay with it, I can prepare a PR.
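A minimal illustration of the change, using a toy class rather than the real Cache (returning this is all that's needed for chaining):

```javascript
// Toy illustration of a chainable define(): returning the instance lets
// callers register several cached functions in one expression.
class Cache {
  define (name, fn) {
    this[name] = fn
    return this // returning the instance instead of void enables chaining
  }
}
```
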
As mentioned here: #8, safe-stable-stringify is a bottleneck in key generation.
The safe-stable option is not always required. When the source of the keys is well known, the developer can choose to use the JSON.stringify function instead of the safe one.
A GraphQL server, for example, always makes the request in the same way, and many cache parameters are very simple, e.g. {"id": "abc"}.
Using the non-safe version can cause more values in the cache, but sometimes the result is better than having an over-optimised cache.
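A one-line demonstration of the trade-off: plain JSON.stringify is key-order sensitive, so two logically equal objects can produce two cache keys, which is the "more values in the cache" cost mentioned above:

```javascript
// Same logical object, different key order: JSON.stringify preserves
// insertion order, so the two serialized keys differ. A stable stringify
// would produce the same key for both.
const a = JSON.stringify({ id: 'abc', page: 1 })
const b = JSON.stringify({ page: 1, id: 'abc' })
const distinctKeys = a !== b // true: two cache entries for equal inputs
```
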
The ttl option can either be a number or a function that returns a number. Recently a stale option has been added that supports a stale-while-revalidate strategy; however, it can only be a static number. Should this mirror the ttl option and allow a function to determine the stale config (using the injected cache result, same as the ttl function)?
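A sketch of how the option could be resolved, mirroring the existing ttl behavior (resolveStale is a hypothetical helper, not library code):

```javascript
// Resolve a `stale` option that may be either a number or a function of
// the cached result, mirroring how `ttl` already works.
function resolveStale (stale, result) {
  return typeof stale === 'function' ? stale(result) : stale
}
```
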
Really like this project! Good job!
I was going through the source code but didn't find an option to renew the TTL on a hit. Is it possible to add that as an option?
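For illustration, such a sliding-ttl behavior could look like this toy sketch (not the library's API; the point is that a hit pushes the expiry forward):

```javascript
// Toy sliding ttl: every hit renews the entry's expiry by its original ttl.
const entries = new Map()

function set (key, value, ttlMs, now = Date.now()) {
  entries.set(key, { value, expiresAt: now + ttlMs, ttlMs })
}

function get (key, now = Date.now()) {
  const entry = entries.get(key)
  if (!entry || now >= entry.expiresAt) return undefined
  entry.expiresAt = now + entry.ttlMs // renew ttl on hit
  return entry.value
}
```
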
As titled: if the serializer function throws an error, an onError event should be emitted.
as titled
[11:22:07.302] ERROR (28): acd/storage/redis.clearReferences error
err: {
"type": "RangeError",
"message": "Invalid string length",
"stack":
RangeError: Invalid string length
at Object.write (/usr/app/node_modules/ioredis/built/Pipeline.js:310:29)
at EventEmitter.sendCommand (/usr/app/node_modules/ioredis/built/Redis.js:387:28)
at execPipeline (/usr/app/node_modules/ioredis/built/Pipeline.js:330:25)
at Pipeline.exec (/usr/app/node_modules/ioredis/built/Pipeline.js:282:5)
at StorageRedis.clearReferences (/usr/app/node_modules/async-cache-dedupe/src/storage/redis.js:323:56)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async StorageRedis._invalidateReferences (/usr/app/node_modules/async-cache-dedupe/src/storage/redis.js:227:5)
at async StorageRedis.invalidate (/usr/app/node_modules/async-cache-dedupe/src/storage/redis.js:193:16)
at async Cache.invalidateAll (/usr/app/node_modules/async-cache-dedupe/src/cache.js:185:5)
at async Promise.all (index 0)
}
When I try to invalidate an array of references, I catch this error.
I'm using chunks with a maximum array size of 10 when there are many references, but it even crashes with 4.
I think it may be because I'm operating with big data in redis, but the library should handle this.
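For reference, a chunking helper like the one described might look like this (the chunk size and name are illustrative; chunking alone does not fix the underlying pipeline size issue):

```javascript
// Split a large references array into bounded chunks so each Redis
// pipeline stays small enough to serialize.
function chunk (array, size) {
  const out = []
  for (let i = 0; i < array.length; i += size) {
    out.push(array.slice(i, i + size))
  }
  return out
}
```
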
Let's say we want to cache something like an access token where expiration is set from the server.
const cache = createCache();
cache.define('fetchSomething', fetchSomethingHandler);
async function fetchSomethingHandler() {
const data = { "token": "abc", "expiresInSeconds": 60 }
// something like this
cache.fetchSomething.ttl(data.expiresInSeconds)
return data;
}
expiresInSeconds changes on every function call.
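The behavior being asked for, boiled down to a toy cache where each entry's expiry is derived from the fetched data (illustrative only, not the library's API):

```javascript
// Toy cache with a per-entry ttl supplied at set() time, so expiry can be
// driven by the fetched data itself (e.g. a token's expiresInSeconds).
class TtlCache {
  constructor () { this.map = new Map() }
  set (key, value, ttlSeconds, now = Date.now()) {
    this.map.set(key, { value, expiresAt: now + ttlSeconds * 1000 })
  }
  get (key, now = Date.now()) {
    const entry = this.map.get(key)
    if (!entry) return undefined
    if (now >= entry.expiresAt) { this.map.delete(key); return undefined }
    return entry.value
  }
}
```
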
I find the difference between ttl and stale confusing. For me they are basically the same thing. Would be helpful to have a more in-depth explanation and/or example.
I wanted to try this library, started with my scaffolding based on TypeScript, and noticed it doesn't ship any type declarations at all, making my IDE and my compiler yell at me all the time.
I created a .d.ts file and included it in my tsconfig.json, yet I have some questions:
With the define property I'm missing the chance to correctly type my functions inside the cache, so something like the following is plausible and TypeScript wouldn't care:
import { createCache } from "async-cache-dedupe";
const cache = createCache({
ttl: 5, // seconds
storage: { type: "memory" },
});
cache.define("fetchSomething", async (k: any) => {
console.log("query", k);
// query 42
// query 24
return { k };
});
cache.fetchSomething();
Is there something I'm not understanding correctly, or isn't this library suitable for TS projects?
I would like to use async-cache-dedupe for HTTP request caching. However, I'm missing a way to set the stale value on a per-entry basis after the request is completed, i.e.
cache.define('request', async url => {
const { headers, body } = await undici.request(url)
if (headers['cache-control']) {
// Set the stale/ttl for the return value
}
return await body.json()
})
Any ideas whether this would be possible to add?
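If such a hook existed, deriving the ttl from the response header might look like this sketch (only the max-age directive is handled; ttlFromCacheControl is a hypothetical helper, not library code):

```javascript
// Extract a ttl (in seconds) from a Cache-Control header value.
// Returns 0 (no caching) when max-age is absent or the header is missing.
function ttlFromCacheControl (header) {
  const match = /max-age=(\d+)/.exec(header || '')
  return match ? Number(match[1]) : 0
}
```
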
A quick analysis with a flamegraph shows that safe-stable-stringify
is the major bottleneck for this cache.
We should investigate if we could come up with a faster algorithm for hashing objects but with the same properties.
I'm logging data with onHit.
I saw onHit being called with keyA and the result was dataA, sometime later I saw onHit being called again with keyA, but with result dataB.
I also was logging the storage set function and saw it is not being called in between the above two onHits.
dataA looks correct, but in order to get dataB, I have to add an extra query param which I didn't do in my codebase.
There are two problems here.
How is this possible? How can I debug it to find where is the problem?
I'm using memory as cache, but I'm going to try redis and see if that solves the problem
EDIT:
This problem doesn't happen in redis
As well as onDedupe, onHit, and onMiss, we could have an onError event.
Consider implementing also "stale on error" and related options for when an error occurs; for example, a first implementation could serve the latest response for a set amount of time.
It has been observed that the transformer does not function correctly when the references argument is set in cache.define. Specifically, transformer.serialize is not called, resulting in the cache never hitting.
node: 18.15.0
async-cache-dedupe: 1.10.2
Steps to reproduce:
1. Set the references argument in cache.define.
2. Run the cache[name] function created by cache.define.
3. Observe that transformer.serialize is not being called and the cache is not hitting.
// 1. Create a new cache, setting transformer options.
const cache = createCache({
ttl: 5,
storage: {
type: 'redis',
options: {
client: redis
}
},
onHit: (key) => console.log("HIT", key), // never hitting
onMiss: (key) => console.log("MISS", key),
transformer: {
serialize: (data: Object) => { // never calling serialize
console.log('serialize called');
return SuperJSON.serialize(data)
},
deserialize: (data: SuperJSONResult) => {
console.log('deserialize called');
return SuperJSON.deserialize(data)
}
}
}) as Cache & CachedFunctions;
// 2. Set the `reference` argument in `cache.define`
cache.define('fetchSomethingWithRef', {
references(args, key) {
return [`key`]
},
}, async (date) => {
return { date }
})
// 3. Run the `cache[name]` function created by `cache.define`
const main = async () => {
const p1 = await cache.fetchSomethingWithRef(new Date("2000-01-14"));
const p2 = await cache.fetchSomethingWithRef(new Date("2000-02-27"));
const p3 = await cache.fetchSomethingWithRef(new Date("2000-01-14"));
const p4 = await cache.fetchSomethingWithRef(new Date("2000-02-27"));
}
main();
// output:
// deserialize called
// MISS "2000-01-14T00:00:00.000Z"
// deserialize called
// MISS "2000-02-27T00:00:00.000Z"
// deserialize called
// MISS "2000-01-14T00:00:00.000Z"
// deserialize called
// MISS "2000-02-27T00:00:00.000Z"
Even when the references argument is set in cache.define, transformer.serialize should be called and the cache should hit. Instead, the cache never hits and transformer.serialize is not called when the references argument is set in cache.define.
This issue has been confirmed for both memory and redis storage types.
I want to get data from the cache synchronously on a hit, to avoid an extra event-loop tick, but the cache.get method always returns a promise.
Is there an existing workaround to achieve that? It would be really appreciated.
as titled
The global cache TTL takes precedence over the more specific function TTL when the function TTL is 0. This happens in async-cache-dedupe/src/cache.js (line 103 in afdf82b), where opts.ttl is falsy.
Please see the following example, where caching should be disabled for fetchSomething. As can be seen from the output, the function is called only once.
$ cat example.mjs
import { createCache } from 'async-cache-dedupe'
const cache = createCache({
ttl: 5,
storage: { type: 'memory' },
})
cache.define('fetchSomething', { ttl: 0 }, async (k) => {
console.log('query', k)
return { k }
})
await cache.fetchSomething(1)
await cache.fetchSomething(1)
await cache.fetchSomething(1)
$ node example.mjs
query 1
I've tested against version 1.2.2 of async-cache-dedupe.
$ cat package.json
{
"name": "zero-ttl",
"version": "1.0.0",
"dependencies": {
"async-cache-dedupe": "1.2.2"
}
}
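A likely mechanism, assuming the default is applied with a truthiness check (the exact expression in cache.js may differ), is that || discards an explicit 0, while nullish coalescing would keep it:

```javascript
// A truthiness check treats an explicit ttl of 0 as "unset" and falls back
// to the global ttl; nullish coalescing only falls back for null/undefined.
const globalTtl = 5
const falsyDefault = (ttl) => ttl || globalTtl   // buggy behavior
const nullishDefault = (ttl) => ttl ?? globalTtl // keeps an explicit 0
```
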
Hey guys,
I'm using a few different libraries together: prisma, prisma-redis-middleware (which uses this lib under the hood), and trpc. On the first call I receive a Date object, but on subsequent calls I receive a string. I believe this is because JSON.stringify and JSON.parse don't keep the data types intact; with superjson you could keep the types as-is.
I don't know the performance hit, and I have a strong feeling it would be worse than the current solution, since, as someone mentioned, the bottleneck is the JSON parsing.
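The type loss can be shown in one line: a Date survives a JSON.stringify/JSON.parse round trip only as a string, which is why a richer transformer such as superjson is being suggested:

```javascript
// A Date serialized with JSON.stringify comes back as an ISO string,
// not a Date instance, after JSON.parse.
const roundTripped = JSON.parse(JSON.stringify({ createdAt: new Date() }))
const typeAfter = typeof roundTripped.createdAt
```
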
Thanks
as titled
First of all, thanks for this great plugin.
I'm using it to cache some database searches, and I want to answer clients with an X-Cache-Status header that reports whether the data came from the cache or the database (header values Hit or Miss).
cache.define('fetchSomething', async (k) => {
return { k }
})
fastify.get('/foo', async function (request, reply) {
const p1 = await cache.fetchSomething(42)
// If p1 comes from a cache Hit, set reply header[X-Cache-Status] to Hit. Else Miss.
return reply.send({msg: 'Hello'});
});
Is there a way to tell if the result of fetchSomething comes from the cache (Hit) or from the database (Miss)? I've been struggling with the onHit, onMiss... events, but they only receive the key as a parameter.
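One possible workaround with the current hooks: record misses per key inside onMiss and check after awaiting the cached function (the Set bookkeeping is illustrative and assumes one in-flight request per key at a time):

```javascript
// Track which keys missed the cache; the onMiss hook only receives the key,
// so a side channel is needed to surface the status to the request handler.
const missed = new Set()
const onMiss = (key) => missed.add(key)

function cacheStatus (key) {
  const status = missed.has(key) ? 'Miss' : 'Hit'
  missed.delete(key) // reset for the next request on this key
  return status
}
```
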
Thanks!
When an error occurs, we could add a staleOnError option to serve the latest cached response, for example:
const cache = createCache({ ttl: 60 })
cache.define('fetchUser', {
staleOnError: 10
},
(id) => database.find({ table: 'users', where: { id }}))
Note the error is caused by the defined function, so in this example the database may not respond.
For the first version, I'd go with a simple time (in seconds); later we could add a function to allow some logic, but I'm not sure at the moment.
We must also renew the cache ttl for stale entries.
This logic should go here: https://github.com/mcollina/async-cache-dedupe/blob/main/src/cache.js#L247
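A sketch of the proposed behavior: when the defined function throws, serve the last cached entry if it is still inside the staleOnError window (names, shapes, and the synchronous style are all illustrative, not an implementation):

```javascript
// If the fetcher throws, fall back to the cached entry when it is within
// the staleOnError window (seconds); otherwise propagate the error.
function fetchWithStaleOnError (fetcher, entry, staleOnError, now = Date.now()) {
  try {
    return { value: fetcher(), storedAt: now }
  } catch (err) {
    const withinWindow = entry && now - entry.storedAt <= staleOnError * 1000
    if (withinWindow) return entry // serve the stale entry instead of failing
    throw err // outside the window: the caller sees the original error
  }
}
```
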
Hello,
I wanted to try adding support for the @upstash/redis HTTP client, for using this package in serverless functions on the edge.
The current work in progress can be viewed here: https://github.com/cemreinanc/async-cache-dedupe/tree/feature/upstash
I copied and modified the redis storage code and tests to adapt them to the upstash SDK, but the problem is that the upstash HTTP client connects to a replica set and only supports eventual consistency, not the strong consistency where we can read our own writes immediately. That's why half of the test suite fails with the new upstash storage.
Now I have a couple of questions:
And finally, do you think this support is achievable under these circumstances?
Similar to #55, the Cloudflare workerd runtime does not provide setImmediate.
Rather than using the isServerSide check, I propose simply checking for the existence of the setImmediate function. I'll submit a PR shortly!
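The proposed existence check, sketched (the variable name is illustrative):

```javascript
// Fall back to setTimeout only when setImmediate is actually missing,
// instead of branching on an isServerSide flag.
const setImmediateImpl = typeof setImmediate === 'function'
  ? setImmediate
  : (fn, ...args) => setTimeout(fn, 0, ...args)
```
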