Comments (3)

ross commented on August 19, 2024

because I request images and I don't want to use too much memory.

holding on to the requests won't really change the amount of memory being used; the response data will still be downloaded and loaded into memory in the worker threads. the easiest thing to do would be to hold on to the futures and check the status there. i'm not really clear on what the use-case for requesting a lot of images but not caring about the results would be, but http HEAD requests might be a good idea here if the server supports them.
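
for illustration, a minimal sketch of what that could look like with requests-futures; the urls are placeholders and this assumes the server answers HEAD:

from requests_futures.sessions import FuturesSession

session = FuturesSession(max_workers=8)
urls = ['http://www.foo.bar/{}.jpg'.format(i) for i in range(10)]  # placeholder URLs

# HEAD returns only the status line and headers, so no image bytes are downloaded
futures = [session.head(url) for url in urls]
for future in futures:
    resp = future.result()
    print(resp.status_code, resp.headers.get('Content-Length'))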

Fman77 commented on August 19, 2024

Unfortunately I cannot use HEAD requests, as they are not implemented on the server I'm querying.
Anyway, as you suggested, I tried saving all the futures in a list, then after the for loop that submits the requests, going through the list and calling .result() on each future. I get one of these errors: requests.exceptions.SSLError: [Errno 24] Too many open files, or requests.exceptions.ConnectionError: ('Connection aborted.', ResponseNotReady('Request-sent',)). Do you have any idea where they come from?
By the way, I cannot check the response when there is a timeout, because if the server times out I won't get any response; I have to wrap the call in a try/except block catching the TimeoutError from concurrent.futures.

To sum up, my code looks something like this:

from concurrent.futures import TimeoutError
from requests_futures.sessions import FuturesSession

URLS = [...]  # list of URLs
session = FuturesSession(max_workers=200)

futures = []
for url in URLS:
    futures.append(session.get(url))

for future in futures:
    try:
        future.result(timeout=10)
    except TimeoutError:
        print("Request timed out")

What is strange is that it works with only 100 requests, but when I do more I get the Too many open files error. Do I have to close or clean something up?

ross commented on August 19, 2024

requests.exceptions.SSLError: [Errno 24] Too many open files

my guess would be that each in-flight request holds an open socket (plus any pooled connections), and with 200 workers you're hitting the per-process file descriptor limit. you can probably increase the ulimit for the user running the script enough to prevent it from hitting that limit. the ResponseNotReady stuff may be similar/related.
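
for reference, a minimal sketch of raising the soft descriptor limit from inside the process on unix, the equivalent of ulimit -n in the shell:

import resource

# RLIMIT_NOFILE caps open file descriptors, which includes the sockets
# held by in-flight HTTP requests; raise the soft limit to the hard limit
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))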

By the way, I cannot check the response when there is a timeout, because if the server times out I won't get any response; I have to wrap the call in a try/except block catching the TimeoutError from concurrent.futures.

if the server times out there isn't a response to check; the request simply failed to get one in the allowed time.

What is strange is that it works with only 100 requests, but when I do more I get the Too many open files error. Do I have to close or clean something up?

beyond raising the ulimit to allow more open files, you might try getting rid of the responses once you're done with them rather than keeping them around in the futures list. something like:

while futures:
    future = futures.pop(0)  # pop() would be fine if you don't care about order
    # check the responses the same as you otherwise would
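
if order matters, note that list.pop(0) is O(n); a collections.deque gives O(1) pops from the front. a sketch assuming the same URLS and session as in the snippet above:

from collections import deque
from concurrent.futures import TimeoutError

futures = deque()
for url in URLS:
    futures.append(session.get(url))

while futures:
    future = futures.popleft()  # O(1) and preserves submission order
    try:
        resp = future.result(timeout=10)
    except TimeoutError:
        continue
    # check resp here; nothing else keeps a reference, so it can be freed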

you have a pretty odd use-case here: requesting a large number of images but not caring about the resulting data. you're also sending off "all" of the requests before checking on any of the responses, which results in buffering everything into the program at once.

you aren't really within the designed use cases for requests-futures, and could probably get a lot further by doing the multi-threading yourself:

#!/usr/bin/env python

from queue import Queue
from requests import Session
from threading import Thread
from time import sleep
import logging

# using a logger so that prints aren't interleaved from multiple threads
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger()


class Worker(Thread):

  def __init__(self, queue):
    super().__init__()
    self.queue = queue
    self.session = Session()
    self.start()

  def run(self):
    while True:
      url = self.queue.get()
      # run until we get a None url, our signal to stop
      if url is None:
        break
      logger.debug(url)
      resp = self.session.get(url)
      # do whatever checking you want to do here


num_workers = 20
num_urls = 30

queue = Queue()
logger.info("creating workers")
workers = [Worker(queue) for i in range(num_workers)]

logger.info("enqueuing urls")
for i in range(num_urls):
  queue.put('http://www.foo.bar/{}'.format(i))

logger.info("waiting for queue to empty")
while not queue.empty():
  sleep(1)

logger.info("signaling workers to stop")
# enqueue enough None jobs to stop all the workers
for worker in workers:
  queue.put(None)

logger.info("waiting on workers to finish")
for worker in workers:
  # wait for it to stop
  worker.join()
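
alternatively, the standard library's concurrent.futures gives you the same pattern without hand-rolled workers; a minimal sketch with placeholder urls, processing each response as it completes instead of keeping them all around:

from concurrent.futures import ThreadPoolExecutor, as_completed
from requests import Session

session = Session()
urls = ['http://www.foo.bar/{}'.format(i) for i in range(30)]  # placeholders, as above

with ThreadPoolExecutor(max_workers=20) as pool:
    future_to_url = {pool.submit(session.get, url): url for url in urls}
    for future in as_completed(future_to_url):
        url = future_to_url[future]
        try:
            resp = future.result()
        except Exception as exc:
            print(url, 'failed:', exc)
        else:
            print(url, resp.status_code)  # do whatever checking you want here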
