
Comments (2)

bdarnell commented on July 18, 2024

This is unexpected, but I think I have an idea what's going on here. I have a version of this test case in https://replit.com/@bdarnell/tornado-gh-3299#main.py in which I have removed multiprocessing, requests, and flask, so Tornado is the only library involved. I see that with a Tornado RequestHandler things work as expected, but with a WSGI app that sleeps we see the behavior in which every request finishes at the same time.

I think this is because the WSGIContainer requires at least two trips through the ThreadPoolExecutor: once for executing the body of the function, and once for each iteration through the returned iterable:

tornado/tornado/wsgi.py

Lines 156 to 174 in ea0b320

app_response = await loop.run_in_executor(
    self.executor,
    self.wsgi_application,
    self.environ(request),
    start_response,
)
try:
    app_response_iter = iter(app_response)

    def next_chunk() -> Optional[bytes]:
        try:
            return next(app_response_iter)
        except StopIteration:
            # StopIteration is special and is not allowed to pass through
            # coroutines normally.
            return None

    while True:
        chunk = await loop.run_in_executor(self.executor, next_chunk)

This is because we can't know whether the WSGI app is doing everything up front and producing the response all at once, or if it's trying to do some sort of pseudo-async iterable and doing real work after the return. As a result, we get all 50 incoming requests queuing up in the thread pool to execute their first sleep, and then none of them can execute their second trip through the thread pool until after all of the first calls are finished.
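The convoy effect described above can be reproduced without Tornado at all. The following is a toy model (my own sketch, not Tornado code) in which every "request" needs two separate jobs on one shared pool; it simplifies by submitting both trips up front, but the FIFO queue produces the same result: no second-trip job finishes until every first-trip job has been dequeued.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Toy model of the two-trip pattern (not Tornado code): every request
# needs two separate jobs on one shared pool, as WSGIContainer does.
POOL_SIZE = 2
N_REQUESTS = 6

pool = ThreadPoolExecutor(POOL_SIZE)
start = time.monotonic()

def blocking_work() -> float:
    time.sleep(0.1)  # stand-in for the WSGI app's blocking sleep
    return time.monotonic() - start  # completion time of this job

# Trip 1: the body of the WSGI callable, one job per request.
first = [pool.submit(blocking_work) for _ in range(N_REQUESTS)]
# Trip 2: iterating the response. The pool's queue is FIFO, so all of
# these jobs sit behind every first-trip job that is still queued.
second = [pool.submit(blocking_work) for _ in range(N_REQUESTS)]

print("last first-trip job finished at  %.2fs" % max(f.result() for f in first))
print("first second-trip job finished at %.2fs" % min(f.result() for f in second))
pool.shutdown()
```

With a pool of 2 and 6 requests, the first trips occupy the pool for three rounds of sleeps before any second trip can start, which is the same wave pattern described below at larger scale.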

This is something of an edge case, since it only occurs if you overload a thread pool. And even then, it replaces one kind of poor performance with another (instead of every request being processed after 7 seconds, you'd see a wave of requests being processed at 1s, a second wave at 2s, etc until some requests are taking 7s or more).

I'm not sure there's a clean way to distinguish WSGI responses that do real work in a pseudo-async way (and therefore need the thread pool) from those that do not (and can be iterated on the main thread). We could certainly special-case ordinary list objects, but I'm not sure whether that works for Flask. We could perhaps have two thread pools for the different phases of the request, but sizing them gets tricky (this is where Apple's libdispatch would be useful). Or I suppose we could staple the first iteration of the response to the pre-response function call to cover the common cases.
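The list special-case could be sketched as follows. This is hypothetical code, not Tornado's actual implementation: `collect_chunks` is an invented helper that iterates a fully materialized list on the event-loop thread and only falls back to the executor for iterables that might do lazy work.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List, Optional

# Hypothetical sketch (not Tornado's actual code): iterate a WSGI
# response inline when it is a plain list, and bounce through the
# executor only for iterables that might block on each next() call.
async def collect_chunks(loop, executor, app_response: Iterable[bytes]) -> List[bytes]:
    if isinstance(app_response, list):
        # Fully materialized: iterating cannot block, so skip the pool.
        return [chunk for chunk in app_response if chunk]
    it = iter(app_response)
    chunks: List[bytes] = []

    def next_chunk() -> Optional[bytes]:
        # The default argument avoids letting StopIteration escape
        # through the executor boundary.
        return next(it, None)

    while True:
        chunk = await loop.run_in_executor(executor, next_chunk)
        if chunk is None:
            return chunks
        if chunk:
            chunks.append(chunk)

async def main():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(1) as pool:
        fast = await collect_chunks(loop, pool, [b"hello", b"", b"world"])
        lazy = await collect_chunks(loop, pool, (b for b in [b"a", b"b"]))
    return fast, lazy

fast, lazy = asyncio.run(main())
```

The open question from above still applies: this only helps if frameworks like Flask actually return plain lists rather than list subclasses or lazy wrappers.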

This issue is specific to WSGI so I'm re-titling it accordingly. Note that I would strongly encourage you to either use Tornado's native interfaces (tornado.web.RequestHandler, etc) on Tornado, or to use Flask or other WSGI frameworks on a WSGI-first server like uwsgi or gunicorn. Using flask via WSGIContainer on Tornado is not a great solution. Prior to Tornado 6.2 it was a really poor solution. Now it's better, but it's still not as good as servers that were built for WSGI from the ground up, and I do not intend to ever turn Tornado into a world-class WSGI server.


zweifeng1995 commented on July 18, 2024


Okay, thank you very much for your patience. I'll try to modify my code and retest it.

