Comments (4)

mesozoic commented on September 26, 2024

It does! In #272 (part of the 2.0 release) we introduced a default retry strategy for 429 errors (QPS limits exceeded) that will retry up to five times with exponential backoff. This somewhat mirrors the behavior of the official Airtable JS library, though we'll give up after ~3 seconds (and it's not clear to me whether their implementation will ever stop retrying).
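The "~3 seconds" figure can be sketched with some quick arithmetic. This is a hedged illustration: the backoff factor and the exact formula below are assumptions modeled on urllib3-style exponential backoff, not taken from pyairtable's source, but they show how five retries can add up to roughly three seconds of waiting:

```python
# Hypothetical sketch: cumulative wait of a five-retry exponential backoff.
# BACKOFF_FACTOR is an assumed value, not necessarily what pyairtable uses.
BACKOFF_FACTOR = 0.1
MAX_RETRIES = 5

# urllib3-style schedule: no sleep before the first retry, then
# backoff_factor * 2**(n - 1) seconds before retry n.
delays = [0.0] + [BACKOFF_FACTOR * (2 ** (n - 1)) for n in range(2, MAX_RETRIES + 1)]
total = sum(delays)
print(delays, total)  # delays grow 0.2, 0.4, 0.8, 1.6; total is about 3 seconds
```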

Are you running into any issues with the default retry strategy?

from pyairtable.

simsong commented on September 26, 2024

Thank you for the response! I have reviewed the pyairtable code and Airtable's statement on rate limiting.

Airtable states: "Airtable enforces a rate limit of 5 requests per second to ensure optimal user performance across all pricing tiers. If you exceed this rate, you will receive a 429 status code and must wait 30 seconds before subsequent requests will succeed."

This may not be what Airtable actually does, but what it says is that if you issue six requests in one second, you must wait 30 seconds before the next request will succeed.

The request method in the Api class implements fallback, but it does not appear to implement throughput limiting:

    def request(
        self,
        method: str,
        url: str,
        fallback: Optional[Tuple[str, str]] = None,
        options: Optional[Dict[str, Any]] = None,
        params: Optional[Dict[str, Any]] = None,
        json: Optional[Dict[str, Any]] = None,
    ) -> Any:
        """
        Makes a request to the Airtable API, optionally converting a GET to a POST
        if the URL exceeds the API's maximum URL length.

        See https://support.airtable.com/docs/enforcement-of-url-length-limit-for-web-api-requests

        Args:
            method: HTTP method to use.
            url: The URL we're attempting to call.
            fallback: The method and URL to use if we have to convert a GET to a POST.
            options: Airtable-specific query params to use while fetching records.
                See :ref:`Parameters` for valid options.
            params: Additional query params to append to the URL as-is.
            json: The JSON payload for a POST/PUT/PATCH/DELETE request.
        """
        # Convert Airtable-specific options to query params, but give priority to query params
        # that are explicitly passed via `params=`. This is to preserve backwards-compatibility for
        # any library users who might be calling `self._request` directly.
        request_params = {
            **options_to_params(options or {}),
            **(params or {}),
        }
        # Build a requests.PreparedRequest so we can examine how long the URL is.
        prepared = self.session.prepare_request(
            requests.Request(
                method,
                url=url,
                params=request_params,
                json=json,
            )
        )
        # If our URL is too long, move *most* (not all) query params into a POST body.
        if (
            fallback
            and method.upper() == "GET"
            and len(str(prepared.url)) >= self.MAX_URL_LENGTH
        ):
            json, spare_params = options_to_json_and_params(options or {})
            return self.request(
                method=fallback[0],
                url=fallback[1],
                params={**spare_params, **(params or {})},
                json=json,
            )
        response = self.session.send(prepared, timeout=self.timeout)
        return self._process_response(response)

This means that if six requests are issued within a second, the client will retry with progressively longer delays until 30 seconds have elapsed, at which point requests will succeed again.

I suggest that it would be more in line with Airtable's specification if this class implemented a singleton LastAPIRequestTime that tracks globally when the last API request was sent to a given base (which appears to be the unit of throttling). Because Airtable allows 5 requests per second, the Api class should enforce a wait of at least 0.2 clock seconds between requests. The interval should be configurable, because Airtable also states elsewhere that upsert calls are subject to more stringent rate limits.
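A minimal sketch of the suggested throttle, assuming a process-local singleton and a fixed minimum interval (the class name and design here are hypothetical illustrations, not part of pyairtable):

```python
import threading
import time

class MinIntervalThrottle:
    """Hypothetical throttle: enforce a minimum interval between requests
    (0.2 s corresponds to Airtable's documented 5 requests per second)."""

    def __init__(self, min_interval: float = 0.2):
        self.min_interval = min_interval
        self._last_request = 0.0
        self._lock = threading.Lock()

    def wait(self) -> None:
        # Sleep just long enough that at least min_interval seconds
        # elapse between consecutive calls, then record the new time.
        with self._lock:
            now = time.monotonic()
            delay = self._last_request + self.min_interval - now
            if delay > 0:
                time.sleep(delay)
            self._last_request = time.monotonic()
```

A caller would invoke wait() immediately before each API request. A production version would need to key the interval per base and, as noted elsewhere in this thread, could not coordinate across separate processes or servers.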

It may be that Airtable is not throwing clients into a 30-second penalty box when the rate limit is exceeded, but that is what the documentation clearly states it does.

What is your experience?


mesozoic commented on September 26, 2024

time.sleep is both problematic and insufficient. It can cause unnecessary slowdowns in applications that rely on asynchronous event loops or which perform their own time-intensive processing in between requests, and it also doesn't account for multiple processes (perhaps on different servers) accessing the same base at the same time. The QPS limit is per base, not per connection or API key, so there's really no way for the client library to predict whether a particular request will get a 429.

My experience has been that the current retry strategy can gracefully handle Airtable returning a 429 status code under load. We tested this using the code you can see in the linked PR, and found that (1) it was more reliable than the prior implementation, and (2) it led to an overall reduction in runtime.

I don't believe Airtable enforces the 30 second delay they refer to in their documentation. As far as I can tell, Airtable's official JavaScript client library imposes no such cooldown period when it encounters 429s.

I'm going to close this issue because I don't think there's a reproducible problem here. If you're encountering a real world problem with the current retry strategy, please file a bug with sample code that reproduces it, and we'll look more closely.


simsong commented on September 26, 2024

Thank you for the discussion. I think that it would be useful to add a summary of the above to the comments in the code; if you wish, I can write up a PR that does this. When I evaluated pyairtable for a project recently, I specifically rejected it from consideration because it did not implement throttling in line with the Airtable documentation. That's what led me to post the issue.

