
async-cache-dedupe's People

Contributors

acburdine, agubler, ahmetuysal, cadienvan, dancastillo, dependabot[bot], dimfeld, dualbus, evanlucas, herrmannplatz, hmbrg, jmaroeder, liuhanqu, mateonunez, mcollina, mooyoul, pvogel1967, ramonmulia, seanghay, simone-sanfratello, skellla, smeijer, thelinuxlich, thomaspeklak, toriphes, udonc, zbo14


async-cache-dedupe's Issues

Stale-while-revalidate style cache strategy

For some use cases where we don't necessarily need the latest data as soon as the ttl has passed, it can be useful to specify a separate staleness interval, so we can continue serving data that's considered stale while triggering a refetch in the background.

This feature is similar in principle to #14, and can possibly share some implementation details.

Proposal:

Accept a new staleWhileRevalidate: number argument (or maybe staleWhileRefetch, since we're not necessarily doing any revalidation here) specifying the staleness interval.

Between the ttl and the staleness interval, data is considered stale: any request for it resolves immediately with the stale data while triggering a deduped refetch in the background. After the interval, stale data must not be used, and requests must await the refetch of new data before being served.

Defaulting to 0 preserves the existing behavior, where stale data is never served.
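
A minimal sketch of the intended lookup path (names like storage, refetch, and storedAt are hypothetical here, not the library's internals):

async function lookup (key) {
  const entry = await storage.get(key)
  if (!entry) return refetch(key) // miss: await the deduped fetch
  const age = now() - entry.storedAt
  if (age <= ttl) return entry.value // fresh: serve directly
  if (age <= ttl + staleWhileRevalidate) {
    refetch(key).catch(() => {}) // stale: trigger a deduped background refetch...
    return entry.value // ...but resolve immediately with the stale value
  }
  return refetch(key) // too old: callers must await fresh data
}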

Thoughts?

I have a bunch of higher-priority customer-facing work over the next few weeks before I can get to the optimization that would use this, but I'd be happy to take a crack at a PR if nobody has tackled it by then!

log.debug isn't defined

Good day,
when setting the log level in fastify to a level higher than debug, log.debug becomes undefined and causes a failure.
I can submit a PR that checks whether the level exists and skips logging if it doesn't.
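
A minimal sketch of the guard (the log binding and the message are illustrative, not the library's exact call site):

// skip debug logging when the configured logger doesn't expose the level
if (typeof log.debug === 'function') {
  log.debug({ key }, 'cache hit')
}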

"require is not defined"

When using an SSR framework like SvelteKit, the bundler tries to include this package and fails with "require is not defined", since it expects ESM:

[screenshot omitted]

Browsers do not implement setImmediate

The memory storage module uses setImmediate, which modern browsers do not implement. The test suite adds a shim to work around this, but it's not otherwise mentioned anywhere.

I propose a solution that doesn't require externally providing setImmediate:

// fall back to setTimeout when setImmediate is unavailable (i.e. in browsers)
const { isServerSide } = require('../util')
const setImmediate = isServerSide
  ? globalThis.setImmediate
  : (fn, ...args) => setTimeout(fn, 0, ...args)

I've done this in a fork and it passes tests as well as working in my actual application: https://github.com/dimfeld/async-cache-dedupe/tree/fix-setimmediate-in-browser

The main problem is that the code coverage check fails because it's not exercising the browser-only part of that line. I'm not sure how to deal with that.

What do you think? Happy to submit a PR if this sounds good to you and there's some way to make the test coverage happy.
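
For the coverage problem: the library already uses istanbul hints (see the unref snippet in the next issue), so one option may be to ignore the branch explicitly. This ignores the whole statement, which is coarser than ideal, but it keeps the check green; a sketch:

// istanbul ignore next
const setImmediate = isServerSide
  ? globalThis.setImmediate
  : (fn, ...args) => setTimeout(fn, 0, ...args)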

Undefined method: `setTimeout(...).unref`

Describe the bug
Using async-cache-dedupe in a simple web application, I'm getting the following error: setTimeout(...).unref is not a function.

To Reproduce
Steps to reproduce the behavior:

  1. Create a web application (Vite, Vanilla, etc)
  2. Create a new cache object
  3. Define a method

Screenshots

[screenshot omitted]

Desktop (please complete the following information):

  • Ubuntu 20.04
  • async-cache-dedupe v1.8.0

Reproduction

https://codesandbox.io/s/objective-firefly-25c5tu?file=/src/index.js

The unref method belongs to Node's Timeout class, so a good fallback would be to check whether the method is defined:

function now () {
  if (_timer !== undefined) {
    return _timer
  }
  // cache the current second and schedule the cached value to be cleared
  _timer = Math.floor(Date.now() / 1000)
  const timeout = setTimeout(_clearTimer, 1000)
  // istanbul ignore next
  // unref() exists only on Node's Timeout; browsers return a plain number
  if (typeof timeout.unref === 'function') timeout.unref()
  return _timer
}

Add an option to use JSON.stringify for the hashing

As mentioned here: #8

safe-stable-stringify is a bottleneck in key generation.

The safe, stable option is not always required. When the source of the keys is well known, the developer can choose to use plain JSON.stringify instead of the safe one.

A GraphQL server, for example, always makes requests in the same way, and many cache parameters are very simple,

e.g.

{"id": "abc"}

Using the non-safe version can produce more entries in the cache, but sometimes that result is better than an over-optimized cache.
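
For context, the trade-off is key stability: JSON.stringify is sensitive to property order, so logically identical arguments can produce distinct cache keys, while safe-stable-stringify sorts keys first. A quick illustration:

JSON.stringify({ id: 'abc', page: 1 }) // '{"id":"abc","page":1}'
JSON.stringify({ page: 1, id: 'abc' }) // '{"page":1,"id":"abc"}' -> a second, duplicate cache entry

When the caller always builds the argument object the same way, as a GraphQL server does, the keys never diverge and the cheaper function is safe.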

Should the `stale` configuration option support a function?

The ttl option can be either a number or a function that returns a number. Recently a stale option was added that supports a stale-while-revalidate strategy, but it can only be a static number. Should it mirror the ttl option and allow a function to determine the stale value (using the injected cache result, the same as the ttl function)?
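
A sketch of what the function form could look like (this is the proposal, not current API; fetchUserFromDb is a hypothetical fetcher):

cache.define('fetchUser', {
  ttl: (result) => result.expiresInSeconds, // already supported: computed from the result
  stale: (result) => result.expiresInSeconds / 2 // proposed: same shape for the stale window
}, async (id) => fetchUserFromDb(id))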

renew TTL onHit

Really like this project! Good job!

I was going through the source code but didn't find an option to renew the TTL on a hit. Would it be possible to add that as an option?

Invalid string length at StorageRedis.clearReferences

[11:22:07.302] ERROR (28): acd/storage/redis.clearReferences error
err: {
"type": "RangeError",
"message": "Invalid string length",
"stack":
RangeError: Invalid string length
at Object.write (/usr/app/node_modules/ioredis/built/Pipeline.js:310:29)
at EventEmitter.sendCommand (/usr/app/node_modules/ioredis/built/Redis.js:387:28)
at execPipeline (/usr/app/node_modules/ioredis/built/Pipeline.js:330:25)
at Pipeline.exec (/usr/app/node_modules/ioredis/built/Pipeline.js:282:5)
at StorageRedis.clearReferences (/usr/app/node_modules/async-cache-dedupe/src/storage/redis.js:323:56)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async StorageRedis._invalidateReferences (/usr/app/node_modules/async-cache-dedupe/src/storage/redis.js:227:5)
at async StorageRedis.invalidate (/usr/app/node_modules/async-cache-dedupe/src/storage/redis.js:193:16)
at async Cache.invalidateAll (/usr/app/node_modules/async-cache-dedupe/src/cache.js:185:5)
at async Promise.all (index 0)
}

I'm catching this error when I try to invalidate an array of references.
I'm batching in chunks of at most 10 references when there are many, but it even crashes with 4.
Maybe it's because I'm operating on big data in redis, but the library should handle this.
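
As a caller-side stopgap, the batches can be made smaller and sequential so each ioredis pipeline stays bounded. A sketch, assuming cache.invalidateAll accepts an array of references as the stack trace above suggests:

const chunk = (arr, size) =>
  Array.from({ length: Math.ceil(arr.length / size) }, (_, i) =>
    arr.slice(i * size, (i + 1) * size))

for (const batch of chunk(references, 10)) {
  await cache.invalidateAll(batch) // sequential, to avoid building one huge pipeline
}

The underlying fix probably belongs in StorageRedis.clearReferences, capping how many commands are written into a single pipeline.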

Dynamic TTL for each function call

Let's say we want to cache something like an access token, where the expiration is set by the server.

const cache = createCache();
cache.define('fetchSomething', fetchSomethingHandler);

async function fetchSomethingHandler() {
  const data = { token: 'abc', expiresInSeconds: 60 }

  // something like this (proposed API)
  cache.fetchSomething.ttl(data.expiresInSeconds)
  return data
}

expiresInSeconds changes on every function call.
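
Given that the ttl option already accepts a function of the cached result (see the stale-option issue above), this use case may already be expressible without a new API. A sketch:

cache.define('fetchSomething', {
  // derive this entry's ttl from the value the handler returned
  ttl: (data) => data.expiresInSeconds
}, fetchSomethingHandler)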

ttl vs stale

I find the difference between ttl and stale confusing; to me they are basically the same thing. A more in-depth explanation and/or example would be helpful.
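
As described in the stale-while-revalidate discussion above, the two options govern different phases of an entry's life. A sketch of the timeline (assuming stale is set alongside ttl at creation):

const cache = createCache({
  ttl: 60, // 0-60s after set: the entry is fresh and served directly
  stale: 30 // 60-90s: the entry is served immediately, but a background refetch is triggered
})
// after 90s the entry is expired: callers wait for a fresh fetch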

TypeScript support missing - probably impossible to achieve

I wanted to try this library, started from my TypeScript-based scaffolding, and noticed it doesn't ship any type declarations at all, making my IDE and my compiler yell at me all the time.
I created a .d.ts file and included it in my tsconfig.json, yet I have some questions:

  1. I'm forced to know how the library behaves internally in order to provide correct typings.
  2. With the define property I miss the chance to correctly type my functions inside the cache, so something like the following is plausible and TypeScript wouldn't care:
import { createCache } from "async-cache-dedupe";

const cache = createCache({
  ttl: 5, // seconds
  storage: { type: "memory" },
});

cache.define("fetchSomething", async (k: any) => {
  console.log("query", k);
  // query 42
  // query 24

  return { k };
});

cache.fetchSomething();

Is there something I'm not understanding correctly, or is this library unsuitable for TS projects?
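
One workaround, also used in the transformer issue later in this tracker (the `as Cache & CachedFunctions` cast): declare the defined functions yourself and cast the created cache to an intersection type. A sketch, assuming a minimal hand-written Cache interface since the package ships no declarations:

// types.d.ts (hand-written)
interface Cache {
  define (name: string, fn: (...args: any[]) => Promise<unknown>): void
}
declare module 'async-cache-dedupe' {
  export function createCache (opts: { ttl?: number, storage?: { type: string } }): Cache
}

// usage
import { createCache } from 'async-cache-dedupe'

interface CachedFunctions {
  fetchSomething: (k: number) => Promise<{ k: number }>
}

const cache = createCache({ ttl: 5, storage: { type: 'memory' } }) as Cache & CachedFunctions

cache.define('fetchSomething', async (k: number) => ({ k }))
// cache.fetchSomething(42) is now typed; cache.fetchSomething('x') is a compile error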

Using async cache dedupe for http request caching

I would like to use async-cache-dedupe for http request caching. However, I'm missing a way to set the stale/ttl value on a per-entry basis after the request completes, i.e.

cache.define('request', async url => {
  const { headers, body } = await undici.request(url)
  if (headers['cache-control']) {
    // Set the stale/ttl for the return value
  }
  return await body.json()
})

Any ideas whether this would be possible to add?
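
Since ttl can already be a function of the returned value (see the dynamic-TTL issue above), one workaround is to return the headers as part of the cached value and derive the ttl from them. A sketch (the cache-control parsing is deliberately simplified):

cache.define('request', {
  ttl: (result) => {
    const m = /max-age=(\d+)/.exec(result.headers['cache-control'] ?? '')
    return m ? Number(m[1]) : 0 // no max-age: don't cache this entry
  }
}, async (url) => {
  const { headers, body } = await undici.request(url)
  return { headers, body: await body.json() }
})

The cost is that callers get { headers, body } back instead of the bare JSON, and stale still can't be set per entry this way.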

Got data hits twice in a row from cache but returned different results

I'm logging data with onHit.

I saw onHit being called with keyA and the result was dataA; some time later I saw onHit called again with keyA, but the result was dataB.

I was also logging the storage set function, and it was not called between the two onHits.

dataA looks correct, but to get dataB I would have to add an extra query param, which I don't do anywhere in my codebase.

There are two problems here.

  1. dataB came out of nowhere: the first time I saw it was when it came from the cache. Data has to be fetched from the database before it can be stored in the cache, but nothing I can find shows dataB coming from a database query.
  2. dataB attached itself to keyA out of nowhere; I couldn't track how that happened.

How is this possible? How can I debug it to find where the problem is?

I'm using the memory storage, but I'm going to try redis and see if that solves the problem.

EDIT:
This problem doesn't happen with redis.

implement "onError" event listener

As well as onDedupe, onHit, and onMiss, we could have an onError event.

Consider also implementing "stale on error" and related options for when an error occurs; a first implementation could, for example, serve the latest cached response for an amount of time.

`transformer` not working properly when `references` argument is set in `cache.define`

Description

It has been observed that the transformer does not function correctly when the references argument is set in cache.define. Specifically, transformer.serialize is never called, so the cache never hits.

Environment

node: 18.15.0
async-cache-dedupe: 1.10.2

Steps to reproduce

  1. Create a new cache, setting transformer options.
  2. Set the references argument in cache.define.
  3. Run the cache[name] function created by cache.define.
  4. Observe that transformer.serialize is never called and the cache never hits.

Code example

// 1. Create a new cache, setting transformer options.
const cache = createCache({
  ttl: 5,
  storage: {
    type: 'redis',
    options: {
      client: redis
    }
  },
  onHit: (key) => console.log("HIT", key), // never hitting
  onMiss: (key) => console.log("MISS", key),

  transformer: {
    serialize: (data: Object) => { // never calling serialize
      console.log('serialize called');
      return SuperJSON.serialize(data)
    },
    deserialize: (data: SuperJSONResult) => {
      console.log('deserialize called');
      return SuperJSON.deserialize(data)
    }
  }
}) as Cache & CachedFunctions;

// 2. Set the `references` argument in `cache.define`
cache.define('fetchSomethingWithRef', {
  references(args, key) {
    return [`key`]
  },
}, async (date) => {
  return { date }
})

// 3. Run the `cache[name]` function created by `cache.define`
const main = async () => {
  const p1 = await cache.fetchSomethingWithRef(new Date("2000-01-14"));
  const p2 = await cache.fetchSomethingWithRef(new Date("2000-02-27"));
  const p3 = await cache.fetchSomethingWithRef(new Date("2000-01-14"));
  const p4 = await cache.fetchSomethingWithRef(new Date("2000-02-27"));
}

main();
// output:
// deserialize called
// MISS "2000-01-14T00:00:00.000Z"
// deserialize called
// MISS "2000-02-27T00:00:00.000Z"
// deserialize called
// MISS "2000-01-14T00:00:00.000Z"
// deserialize called
// MISS "2000-02-27T00:00:00.000Z"

Expected behavior

Even when the references argument is set in cache.define, transformer.serialize should be called and the cache should hit.

Current behavior

The cache never hits and transformer.serialize is never called when the references argument is set in cache.define.

Additional information

This issue has been confirmed for both memory and redis storage types.

Get data from cache synchronously

I want to get data from the cache synchronously on a hit, so as to avoid an extra event-loop turn. But it seems the cache.get method always returns a promise.

Is there an existing workaround to achieve that? It would be really appreciated.
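
One nuance: awaiting an already-resolved promise defers only by a microtask, not a full event-loop turn; timers and I/O don't get a chance to run in between. A quick demonstration:

const p = Promise.resolve('cached')
setTimeout(() => console.log('macrotask'), 0)
p.then((v) => console.log(v)) // 'cached' logs before 'macrotask'

So a memory-storage hit is still deferred, but only to the end of the current synchronous job, which may make the overhead acceptable.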

Global cache TTL overrides function TTL when the function TTL is 0

The global cache TTL takes precedence over the more specific function TTL when the function TTL is 0. This happens in:

const ttl = opts.ttl || this[kTTL]

since opts.ttl is falsy when it is 0.
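
A minimal fix sketch: nullish coalescing respects an explicit 0 and falls back to the global value only for undefined/null:

const ttl = opts.ttl ?? this[kTTL] // ttl: 0 now disables caching as intended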

Please see the following example, where caching should be disabled for fetchSomething. As can be seen from the output, the function is called only once.

$ cat example.mjs
import { createCache } from 'async-cache-dedupe'

const cache = createCache({
  ttl: 5,
  storage: { type: 'memory' },
})

cache.define('fetchSomething', { ttl: 0 }, async (k) => {
  console.log('query', k)
  return { k }
})

await cache.fetchSomething(1)
await cache.fetchSomething(1)
await cache.fetchSomething(1)
$ node example.mjs 
query 1

I've tested against version 1.2.2 of async-cache-dedupe.

$ cat package.json 
{
  "name": "zero-ttl",
  "version": "1.0.0",
  "dependencies": {
    "async-cache-dedupe": "1.2.2"
  }
}

Option to use superjson to keep further types (e.g. Dates)

Hey guys,

I'm using a few different libraries together: prisma, prisma-redis-middleware (which uses this lib under the hood), and trpc. On the first call I receive a Date object, but on subsequent calls I receive a string. I believe this is because JSON stringify and parse don't keep the data types intact; with superjson you could keep the types as-is.

I don't know the performance hit, and I have a strong feeling it would be worse than the current solution, since as someone mentioned the bottleneck is the JSON parsing.

Thanks
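
The transformer option shown in the references issue above appears to cover exactly this; a sketch wiring superjson in directly (note the open report above that it currently breaks when references is also set):

import SuperJSON from 'superjson'
import { createCache } from 'async-cache-dedupe'

const cache = createCache({
  ttl: 5,
  storage: { type: 'memory' },
  transformer: {
    serialize: (data) => SuperJSON.serialize(data), // preserves Date, Map, Set, BigInt...
    deserialize: (data) => SuperJSON.deserialize(data)
  }
})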

Is there a way to tell if the result comes from a cache hit or miss?

First of all thanks for this great plugin.

I'm using it to cache some database searches, and I want to answer clients with an X-Cache-Status header that says whether the data came from the cache or from the database (header values Hit or Miss).

cache.define('fetchSomething', async (k) => {
  return { k }
})

fastify.get('/foo', async function (request, reply) {
  const p1 = await cache.fetchSomething(42)

  // If p1 comes from a cache Hit, set reply header X-Cache-Status to Hit; else Miss.

  return reply.send({ msg: 'Hello' })
})

Is there a way to tell whether the result of fetchSomething came from the cache (Hit) or from the database (Miss)? I've been struggling with the onHit, onMiss... events, but they only receive the key as a parameter.

Thanks!
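
One workaround with the existing events, since onHit/onMiss do receive the key: record the last status per key and read it back after the call. A sketch (assuming the key for fetchSomething(42) serializes to '42'; note this is racy under concurrent requests for the same key):

const lastStatus = new Map()

const cache = createCache({
  ttl: 60,
  storage: { type: 'memory' },
  onHit: (key) => lastStatus.set(key, 'Hit'),
  onMiss: (key) => lastStatus.set(key, 'Miss')
})

cache.define('fetchSomething', async (k) => ({ k }))

fastify.get('/foo', async function (request, reply) {
  const p1 = await cache.fetchSomething(42)
  reply.header('X-Cache-Status', lastStatus.get('42') ?? 'Miss')
  return reply.send({ msg: 'Hello' })
})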

implement "stale on error"

When an error occurs, we could add a staleOnError option to serve the latest cached response, for example:

const cache = createCache({ ttl: 60 })

cache.define('fetchUser', {
  staleOnError: 10
},
(id) => database.find({ table: 'users', where: { id } }))

Note the error is raised by the defined function, so here, for example, the database may not be responding.

For the first version I'd go with a simple time (in seconds); we could then add a function for extra logic later, I'm not sure at the moment.

We must also renew the cache ttl for stale entries.

this logic should go here https://github.com/mcollina/async-cache-dedupe/blob/main/src/cache.js#L247

support @upstash/redis client

Hello,

I wanted to try to add support for the @upstash/redis http client, to use this package in serverless functions on the edge.
Current work in progress can be viewed here: https://github.com/cemreinanc/async-cache-dedupe/tree/feature/upstash

I copied and modified the redis storage code and tests to adapt to the upstash SDK, but the problem is that the upstash http client connects to a replica set and only supports eventual consistency, not strong consistency where we can read our own writes immediately. That's why half of the test suite fails with the new upstash storage.

Now I have a couple of questions:

  • Is strong consistency really needed? What happens otherwise?
  • Could we overcome this by adding an option to combine memory + remote storage? Reads would first check the in-memory data, then the remote redis data; if both miss, proceed with the actual query and write to both memory and redis again (see the sketch below). What are the downsides of this approach?

And finally, do you think this support is achievable under these circumstances?
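
A rough sketch of the tiered read path from the second question (the storage interface here is hypothetical, loosely modeled on the get/set shape the existing storages share):

async function tieredGet (key) {
  const local = await memoryStorage.get(key)
  if (local !== undefined) return local
  const remote = await upstashStorage.get(key)
  if (remote !== undefined) {
    await memoryStorage.set(key, remote, ttl) // warm the local tier
    return remote
  }
  return undefined // both tiers missed: run the real query, then write both tiers
}

The usual downside is the one of any L1/L2 cache: the memory tier can keep serving values the remote tier has already invalidated, so invalidation has to touch both tiers, and eventual consistency reappears one level up.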
