
Comments (17)

pvdlg avatar pvdlg commented on May 17, 2024 1

In the case of the GitHub API, X-RateLimit-Remaining is capped at X-RateLimit-Limit.
So GitHub doesn't "refill" the reservoir per se; it resets it to its max value at a certain frequency.

One thing we should consider is that other things (not managed by Bottleneck) may also be using the API. So when X-RateLimit-Remaining is reset to its max, we shouldn't assume all of those calls are available to us.

The event-based solution allows us to:

  • Check what's remaining in the API when the reservoir is exhausted (see the sketch after this list)
    • If there are calls remaining, set the reservoir to what's remaining
    • If there are no calls remaining, tell Bottleneck to "check again" (by firing the empty-reservoir event again, for example) at X-RateLimit-Reset
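
A rough sketch of that flow, using the depleted event and updateSettings() that come up later in this thread, plus a hypothetical fetchGithubRateLimit() helper:

limiter.on('depleted', () => {
  // fetchGithubRateLimit() is a hypothetical helper that returns GitHub's
  // X-RateLimit-Remaining, X-RateLimit-Reset and X-RateLimit-Limit values.
  fetchGithubRateLimit()
  .then(({ remaining, reset, limit }) => {
    if (remaining > 0) {
      // Some calls are still available to us: mirror them into the reservoir.
      return limiter.updateSettings({ reservoir: remaining });
    }
    // No calls left: check again once GitHub resets the limit.
    setTimeout(() => limiter.updateSettings({ reservoir: limit }), reset - Date.now());
  })
  .catch((err) => {
    // handle error here
  });
});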

from bottleneck.

pvdlg avatar pvdlg commented on May 17, 2024 1

Thanks for the update. I tested quickly and it works. I will do more in-depth tests shortly when I get some time.

from bottleneck.

pvdlg avatar pvdlg commented on May 17, 2024 1

Thanks! That's awesome! I'll let you know if I encounter other issues. But so far it looks good!

from bottleneck.

SGrondin avatar SGrondin commented on May 17, 2024

Maybe an event could be sent when the reservoir gets exhausted, providing a function to schedule its refill.

This is coming in v2.1, to be released within a week!

Another option would be to allow defining the frequency and amount of refills in the Bottleneck constructor. Bottleneck would pause execution when the reservoir gets exhausted, wait until the scheduled refill, refill by the given amount and continue processing tasks.

I also like this idea, but should it increment the reservoir value or simply set it to a specific value? The former has the potential to reach high values if the limiter isn't in use for a period of time, while the latter is capped at a specific max. I'm leaning towards the latter option.
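For what it's worth, the "set to a specific value" semantics can already be approximated in user code with a repeating timer; this is only a sketch of the idea, not a proposed constructor API:

// Reset the reservoir to a fixed maximum every minute (illustrative values),
// regardless of how many calls were actually consumed in the meantime.
setInterval(() => {
  limiter.updateSettings({ reservoir: 10 });
}, 60 * 1000);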

Thank you for your excellent feedback and suggestions, and for using Bottleneck!

from bottleneck.

SGrondin avatar SGrondin commented on May 17, 2024

I've just released v2.1.0 which adds the depleted event. It is only fired when the reservoir reaches 0.
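A minimal listener, with the refill value and strategy left to the user:

limiter.on('depleted', () => {
  // Fired when the reservoir value reaches 0; how and when to refill is up to you.
  limiter.updateSettings({ reservoir: 10 }); // illustrative refill
});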

from bottleneck.

pvdlg avatar pvdlg commented on May 17, 2024

Thanks, that was fast!

I realize now that a few things might still be missing in order to implement the GitHub rate limiting mentioned previously, though.

When the depleted event occurs I'd like to:

  • Call the GitHub API to get the current X-RateLimit-Remaining
  • If there are calls remaining in the API, set the reservoir with incrementReservoir(), which is an async function
  • If there are no calls remaining in the API, schedule setting the reservoir with incrementReservoir() at X-RateLimit-Reset

I see two potential problems with doing that:

  1. Can I easily run an async function in the depleted event handler?
  2. When the depleted event occurs and I have to schedule setting the reservoir at X-RateLimit-Reset, that will probably involve calling setTimeout. But if no more calls to the limiter are made (because, by chance, the user called the limiter exactly the number of times initially set in the reservoir), then we'll end up with a timeout that can take several minutes to resolve. As Node will wait for all timeouts to resolve before exiting gracefully, we'll keep the program hanging until this timeout resolves, even though there is no reason to wait since the user is not making any new calls to the limiter.

For 1. using a package like events-async in Bottleneck would solve the problem.

For 2. we would need to trigger the depleted event on the first call made on a depleted reservoir rather than on the call that consumes the last reservoir unit. This way we can be sure not to wait for nothing, as we know we have at least one more request to execute after the reservoir was depleted.

Or maybe there is a better way to implement this that I'm missing?

from bottleneck.

SGrondin avatar SGrondin commented on May 17, 2024

  1. Yes you can, just make sure to catch errors in that async function (or put a .catch() on it if it returns a promise).

You can also call limiter.updateSettings({ reservoir: newValue }).
You don't have to use incrementReservoir() if you simply want to set it to a specific value.

The reason incrementReservoir() exists is that it lets you increment without having to know the current value, which you would get through currentReservoir(), which is yet another async call, and thus the value could potentially be out of date.
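A quick sketch of the difference being described:

// Increment without having to read the current value first:
limiter.incrementReservoir(5);

// Versus read-then-write, where the value may already be stale by the time it's written back:
limiter.currentReservoir()
.then((current) => limiter.updateSettings({ reservoir: current + 5 }));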

There's no need for events-async; nothing prevents you from putting a promise chain within an event listener!

  2. You don't need to schedule() setting the reservoir; you can just do it inline. I didn't test this code, but it should do exactly what you want to accomplish:
limiter.on('depleted', () => {
  fetchGithubRemaining()
  .then((remaining) => {
    return limiter.updateSettings({ reservoir: remaining })
  })
  .catch((err) => {
    // handle error here
  })
});

from bottleneck.

pvdlg avatar pvdlg commented on May 17, 2024

Thanks for your answer!

I think I wasn't really clear regarding 2.
When I call the GitHub API to know how many calls are available the API will return:

  • X-RateLimit-Remaining: number of calls remaining
  • X-RateLimit-Reset: time at which the remaining calls will be reset to their max value
  • X-RateLimit-Limit: number of calls that will be available at the date in X-RateLimit-Reset

So there can be a situation in which X-RateLimit-Remaining is 0 and X-RateLimit-Reset is, for example, 2 minutes away.
So in the depleted handler I will have to request the API, then wait 2 minutes, then call limiter.updateSettings({ reservoir: remaining }).
Something like this:

limiter.on('depleted', () => {
  fetchGithubRemaining()
  .then((response) => {
    const timeToWait = response.rateLimitReset - Date.now(); // would be 2 minutes in that example
    return delay(timeToWait).then(() => limiter.updateSettings({ reservoir: response.maxValue })) 
  })
  .catch((err) => {
    // handle error here
  })
});

If users create a limiter with a reservoir of 10 and they use it 10 times, when depleted is triggered, that will call delay(timeToWait), which internally sets a timeout, in order to update the reservoir in 2 minutes, as the GitHub API will have new calls available only in 2 minutes.
The problem is that if the user doesn't make an 11th call, there is no point in calling delay(timeToWait), and doing so will prevent the program from exiting gracefully for the next 2 minutes. Inside the depleted handler I have no way to know whether the user is done or will call the limiter again in the future.

from bottleneck.

SGrondin avatar SGrondin commented on May 17, 2024

Oh! Did you know about .unref()? https://nodejs.org/api/timers.html#timers_timeout_unref

limiter.on('depleted', () => {
  fetchGithubRemaining()
  .then((response) => {
    const timeToWait = response.rateLimitReset - Date.now(); // would be 2 minutes in that example
    const timeout = setTimeout(() => {
      limiter.updateSettings({ reservoir: response.maxValue });
    }, timeToWait);
    timeout.unref();
  })
  .catch((err) => {
    // handle error here
  })
});

unref() tells Node to not wait for this specific timeout to complete before letting the program exit gracefully.
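A standalone illustration, independent of Bottleneck:

// Without unref(), this pending timer would keep the process alive for 60 seconds.
const timeout = setTimeout(() => console.log('refill time'), 60000);
// With unref(), Node is free to exit earlier if nothing else is keeping it busy.
timeout.unref();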

from bottleneck.

pvdlg avatar pvdlg commented on May 17, 2024

I didn't know about unref() :)

It doesn't seem to be the right solution, though. The event handler doesn't handle promises, so it doesn't wait for asynchronous calls to finish. And it seems that when the reservoir is empty, Bottleneck doesn't wait for it to be refilled either.

const Bottleneck = require('bottleneck');

const reservoir = 3;
const calls = 6;

const now = Date.now();
const limiter = new Bottleneck({reservoir});

const throttled = limiter.wrap(i => {
  const secDiff = ((Date.now() - now) / 1000).toFixed();
  return Promise.resolve(`${i}: ${secDiff}s`);
});

limiter.on('depleted', () => {
  const timeout = setTimeout(() => {
    limiter.updateSettings({reservoir: 10});
  }, 3000);
  timeout.unref();
});

const promises = [];
for (let i = 1; i <= calls; i++) {
  promises.push(throttled(i).then(console.log));
}

Promise.all(promises).then(() => {
  console.log(`${calls} calls executed`);
});

The code above creates a limiter with a reservoir of 3. When the reservoir is depleted, it sets a timeout to refill it after 3 seconds, as suggested in your comment. Then we call a function 6 times.
The expected behavior is to call the function 3 times, wait 3 seconds, call it 3 times again and exit.

If you run the code above the function will be called only 3 times.
If you remove timeout.unref(); the function will be called 6 times, but only because Node waits for the timeout to expire before exiting.

In order for this to work as expected, I would suggest the following:

  • Allow the depleted event handler to be an async (or Promise-returning) function and have Bottleneck wait for it to resolve
  • Trigger the depleted event in a more optimistic way: only when a call is made and can't be processed due to the reservoir being depleted

Another solution is to have the function returned by limiter.wrap not resolve if the reservoir is depleted and instead wait for it to be refilled. But that would give users less flexibility, as they would be forced to implement the depleted event handler; otherwise Bottleneck would wait forever for the reservoir to refill.

With the first solution, users can choose:

  • Implement the depleted event handler and return a Promise => Bottleneck will wait for this Promise to resolve before processing other calls
  • Don't implement the depleted event handler => Bottleneck will work as it does now and silently ignore the calls made after the reservoir was depleted. Arguably that's not the appropriate behavior and the function returned by limiter.wrap should return a rejected Promise in such a case, but that's probably another discussion.

from bottleneck.

SGrondin avatar SGrondin commented on May 17, 2024

I'm working on a solution to this, I'll provide you a branch to try before I release anything.

from bottleneck.

pvdlg avatar pvdlg commented on May 17, 2024

Awesome. Thanks!

from bottleneck.

SGrondin avatar SGrondin commented on May 17, 2024

Please try this experimental branch:

npm install --save git://github.com/SGrondin/bottleneck.git#depleted-reservoir-events

Changes

  1. Added a queued-reservoir-depleted event. It is triggered when a request gets queued up and the reservoir value happens to be 0.
  2. Returning a promise from any event handler causes Bottleneck to wait on it. If that promise fails, the error triggers an error event.

Here's how I used those 2 changes to get the correct behavior. Feel free to play with the reservoir and calls constants to validate that it works as you expect.

const Bottleneck = require('bottleneck');

const reservoir = 3;
const calls = 6;

const now = Date.now();
const limiter = new Bottleneck({reservoir});
const updateSettingsLimiter = new Bottleneck({ maxConcurrent: 1, highWater: 0 });

const wait = (ms) => new Promise((resolve, reject) => setTimeout(resolve, ms));

const prefixTime = (str) => {
  const secDiff = ((Date.now() - now) / 1000).toFixed();
  return `${secDiff}s ${str}`;
}

const throttled = limiter.wrap(i => {
  return Promise.resolve(prefixTime(i));
});

const errorHandler = (err) => {
  console.log('ERROR EVENT')
  console.log(err.message)
  console.log(err.stack)
};

limiter.on('error', errorHandler);
updateSettingsLimiter.on('error', errorHandler);

limiter.on('queued-reservoir-depleted', () => {
  const queue = limiter.queued();
  console.log(prefixTime(`queued-reservoir-depleted triggered! Queue length: ${queue}`));
  if (queue > 0) {
    return updateSettingsLimiter.schedule(() => {
      return wait(3000)
      .then(() => {
        console.log(prefixTime('Updating reservoir!'));
        return limiter.updateSettings({reservoir: 10});
      });
    });
  }
});

const promises = [];
for (let i = 1; i <= calls; i++) {
  promises.push(throttled(i).then(console.log));
}

Promise.all(promises).then(() => {
  console.log(`${calls} calls executed`);
});

It outputs:

0s 1
0s 2
0s queued-reservoir-depleted triggered! Queue length: 0
0s 3
0s queued-reservoir-depleted triggered! Queue length: 1
0s queued-reservoir-depleted triggered! Queue length: 2
3s Updating reservoir!
3s 4
3s 5
3s 6
6 calls executed

I made a second limiter to prevent making another call to GitHub while one is already in progress.

from bottleneck.

pvdlg avatar pvdlg commented on May 17, 2024

Thanks for the quick answer!
I haven't tested all scenarios yet, but I found an issue. If you set calls to 9 and refill the reservoir with 3 (return limiter.updateSettings({reservoir: 3});), then it should call the function 3 times, wait 3s, call it 3 times, wait 3s, call it 3 times, then exit.

I obtain:

0s 1
0s 2
0s queued-reservoir-depleted triggered! Queue length: 0
0s 3
0s queued-reservoir-depleted triggered! Queue length: 1
0s queued-reservoir-depleted triggered! Queue length: 2
0s queued-reservoir-depleted triggered! Queue length: 3
0s queued-reservoir-depleted triggered! Queue length: 4
0s queued-reservoir-depleted triggered! Queue length: 5
3s Updating reservoir!
3s 4
3s 5
3s 6

The problem seems to be that Bottleneck continues to fire queued-reservoir-depleted events for each call made after the first one, so if multiple refills are necessary to get through the list of calls, the subsequent refills never happen.
Maybe a solution would be to stop processing the queue until the reservoir is refilled? But I don't know what impact that would have on the Bottleneck module overall.

It also seems counter-intuitive for the calls made after the reservoir is depleted to be silently dropped (in the example in this comment I don't see any errors being triggered for calls 7, 8 and 9, which were never executed).

from bottleneck.

SGrondin avatar SGrondin commented on May 17, 2024

I've pushed a commit to the branch. It removes the queued-reservoir-depleted event and instead passes an empty boolean to the depleted event handler. This empty boolean is simply the same condition that triggers the empty event (nothing in the queue and/or in the process of being queued).

const Bottleneck = require('bottleneck');

const reservoir = 3;
const calls = 9;

const now = Date.now();
const limiter = new Bottleneck({reservoir});

const wait = (ms) => new Promise((resolve, reject) => setTimeout(resolve, ms));

const prefixTime = (str) => {
  const secDiff = ((Date.now() - now) / 1000).toFixed();
  return `${secDiff}s ${str}`;
}

const throttled = (i) => {
  return Promise.resolve(prefixTime(i));
};

limiter.on('error', (err) => {
  console.log('ERROR EVENT')
  console.log(err.message)
  console.log(err.stack)
});

limiter.on('depleted', (empty) => {
  const queue = limiter.queued();
  console.log(prefixTime(`depleted triggered! Queue length: ${queue}. Empty: ${empty}`));
  if (!empty) {
    return wait(3000)
    .then(() => {
      console.log(prefixTime('Updating reservoir!'));
      return limiter.updateSettings({reservoir: 3});
    });
  }
});

const promises = [];
for (let i = 1; i <= calls; i++) {
  promises.push(limiter.schedule({id: i}, (j) => throttled(j), i).then(console.log));
}

Promise.all(promises)
.then(() => {
  console.log(`${calls} calls executed`);
});

Please let me know if the above meets your needs.

It also seems counter-intuitive for the calls made after the reservoir is depleted to be silently dropped (in the example in this comment I don't see any errors being triggered for calls 7, 8 and 9, which were never executed).

I'm investigating this.

from bottleneck.

SGrondin avatar SGrondin commented on May 17, 2024

It also seems counter-intuitive for the calls made after the reservoir is depleted to be silently dropped (in the example in this comment I don't see any errors being triggered for calls 7, 8 and 9, which were never executed).

I remember now. It's because Bottleneck has no way of knowing whether those jobs will ever get a chance to run, so it's the user's responsibility to have a timer/event/promise/etc that will call updateSettings. It's that timer/event/promise/etc that prevents the Node process from exiting. I think this is a relatively elegant solution, but I'm open to improvements.

This is only for scenarios where reservoir is in use; obviously, queueing lots of jobs without using reservoir will not allow the Node process to exit until they're all completed, since Bottleneck is waiting on running jobs to complete in order to schedule the next ones. Again, in the case of reservoir === 0, Bottleneck isn't waiting on anything anymore (which allows Node to exit if the user doesn't have a running task of their own, such as one that calls updateSettings/incrementReservoir). Calling updateSettings/incrementReservoir triggers Bottleneck's scheduler, which resumes starting jobs and waits for their completion to schedule more.
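In other words, it's the user's own pending timer that both keeps the process alive and, by calling updateSettings(), wakes the scheduler back up. A minimal sketch of that pattern (3 seconds and a reservoir of 3 are illustrative values):

limiter.on('depleted', (empty) => {
  if (empty) return; // nothing is queued, so let the process exit naturally
  // Deliberately not unref'd: this pending timer is what keeps Node alive
  // until the reservoir is refilled and the queued jobs can run.
  setTimeout(() => {
    limiter.updateSettings({ reservoir: 3 }); // triggers Bottleneck's scheduler again
  }, 3000);
});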

from bottleneck.

SGrondin avatar SGrondin commented on May 17, 2024

I've released version 2.2.0; it contains major upgrades and also passes the empty argument to the depleted event. Please reopen this issue if the problem persists. Thank you for your help in debugging and for your input; I'm pleased with the result.

from bottleneck.
