Comments (4)

mesozoic commented on September 26, 2024

It does! In #272 (part of the 2.0 release) we introduced a default retry strategy for 429 errors (QPS limits exceeded) that will retry up to five times with exponential backoff. This somewhat mirrors the behavior of the official Airtable JS library, though we'll give up after ~3 seconds (and it's not clear to me whether their implementation will ever stop retrying).
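The "~3 seconds" figure can be sketched with some quick arithmetic. This is a hedged illustration: the backoff factor and the exact formula below are assumptions modeled on urllib3-style exponential backoff, not taken from pyairtable's source, but they show how five retries can add up to roughly three seconds of waiting:

```python
# Hypothetical sketch: cumulative wait of a five-retry exponential backoff.
# BACKOFF_FACTOR is an assumed value, not necessarily what pyairtable uses.
BACKOFF_FACTOR = 0.1
MAX_RETRIES = 5

# urllib3-style schedule: no sleep before the first retry, then
# backoff_factor * 2**(n - 1) seconds before retry n.
delays = [0.0] + [BACKOFF_FACTOR * (2 ** (n - 1)) for n in range(2, MAX_RETRIES + 1)]
total = sum(delays)
print(delays, total)  # delays grow 0.2, 0.4, 0.8, 1.6; total is about 3 seconds
```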

Are you running into any issues with the default retry strategy?

from pyairtable.

simsong commented on September 26, 2024

Thank you for the response! I have reviewed the pyairtable code and Airtable's statement on rate limiting.

Airtable states: "Airtable enforces a rate limit of 5 requests per second to ensure optimal user performance across all pricing tiers. If you exceed this rate, you will receive a 429 status code and must wait 30 seconds before subsequent requests will succeed."

This may not be what Airtable actually does, but what it says is that if you issue six requests in one second, you must wait 30 seconds before the next request will succeed.

The request method in the Api class implements fallback, but it does not appear to implement throughput limiting:

    def request(
        self,
        method: str,
        url: str,
        fallback: Optional[Tuple[str, str]] = None,
        options: Optional[Dict[str, Any]] = None,
        params: Optional[Dict[str, Any]] = None,
        json: Optional[Dict[str, Any]] = None,
    ) -> Any:
        """
        Makes a request to the Airtable API, optionally converting a GET to a POST
        if the URL exceeds the API's maximum URL length.

        See https://support.airtable.com/docs/enforcement-of-url-length-limit-for-web-api-requests

        Args:
            method: HTTP method to use.
            url: The URL we're attempting to call.
            fallback: The method and URL to use if we have to convert a GET to a POST.
            options: Airtable-specific query params to use while fetching records.
                See :ref:`Parameters` for valid options.
            params: Additional query params to append to the URL as-is.
            json: The JSON payload for a POST/PUT/PATCH/DELETE request.
        """
        # Convert Airtable-specific options to query params, but give priority to query params
        # that are explicitly passed via `params=`. This is to preserve backwards-compatibility for
        # any library users who might be calling `self._request` directly.
        request_params = {
            **options_to_params(options or {}),
            **(params or {}),
        }
        # Build a requests.PreparedRequest so we can examine how long the URL is.
        prepared = self.session.prepare_request(
            requests.Request(
                method,
                url=url,
                params=request_params,
                json=json,
            )
        )
        # If our URL is too long, move *most* (not all) query params into a POST body.
        if (
            fallback
            and method.upper() == "GET"
            and len(str(prepared.url)) >= self.MAX_URL_LENGTH
        ):
            json, spare_params = options_to_json_and_params(options or {})
            return self.request(
                method=fallback[0],
                url=fallback[1],
                params={**spare_params, **(params or {})},
                json=json,
            )
        response = self.session.send(prepared, timeout=self.timeout)
        return self._process_response(response)

This means that if six requests are issued within a second, the client will retry with progressively longer delays until 30 seconds have elapsed, at which point requests will succeed again.

I suggest that it would be more in line with Airtable's specification if this class implemented a singleton LastAPIRequestTime that tracks globally when the last API request was sent to a given base (which appears to be the unit of throttling). Because Airtable allows 5 requests per second, the Api class should enforce a wait of at least 0.2 clock seconds between requests. The interval should be configurable, because Airtable also states elsewhere that upsert calls are subject to more stringent rate limits.
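A minimal sketch of the suggested throttle, assuming a process-local singleton and a fixed minimum interval (the class name and design here are hypothetical illustrations, not part of pyairtable):

```python
import threading
import time

class MinIntervalThrottle:
    """Hypothetical throttle: enforce a minimum interval between requests
    (0.2 s corresponds to Airtable's documented 5 requests per second)."""

    def __init__(self, min_interval: float = 0.2):
        self.min_interval = min_interval
        self._last_request = 0.0
        self._lock = threading.Lock()

    def wait(self) -> None:
        # Sleep just long enough that at least min_interval seconds
        # elapse between consecutive calls, then record the new time.
        with self._lock:
            now = time.monotonic()
            delay = self._last_request + self.min_interval - now
            if delay > 0:
                time.sleep(delay)
            self._last_request = time.monotonic()
```

A caller would invoke wait() immediately before each API request. A production version would need to key the interval per base and, as noted elsewhere in this thread, could not coordinate across separate processes or servers.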

It may be that Airtable is not throwing clients into a 30-second penalty box when the rate limit is exceeded, but that is what the documentation clearly states it does.

What is your experience?


mesozoic commented on September 26, 2024

time.sleep is both problematic and insufficient. It can cause unnecessary slowdowns in applications that rely on asynchronous event loops or which perform their own time-intensive processing in between requests, and it also doesn't account for multiple processes (perhaps on different servers) accessing the same base at the same time. The QPS limit is per base, not per connection or API key, so there's really no way for the client library to predict whether a particular request will get a 429.

My experience has been that the current retry strategy can gracefully handle Airtable returning a 429 status code under load. We tested this using the code you can see in the linked PR, and found that (1) it was more reliable than the prior implementation, and (2) it led to an overall reduction in runtime.

I don't believe Airtable enforces the 30 second delay they refer to in their documentation. As far as I can tell, Airtable's official JavaScript client library imposes no such cooldown period when it encounters 429s.

I'm going to close this issue because I don't think there's a reproducible problem here. If you're encountering a real world problem with the current retry strategy, please file a bug with sample code that reproduces it, and we'll look more closely.


simsong commented on September 26, 2024

Thank you for the discussion. I think that it would be useful to add a summary of the above to the comments in the code; if you wish, I can write up a PR that does this. When I evaluated pyairtable for a project recently, I specifically rejected it from consideration because it did not implement throttling in line with the Airtable documentation. That's what led me to post the issue.

