Comments (7)
Can you say more about how exactly this happens? It's true that we don't strip the value when parsing content-length, but it's supposed to already be stripped in the last line of `HTTPHeaders.parse_line`. The `\r\n` is not supposed to make it to `parse_line`; those characters are handled in `parse()`. I don't see an issue when Content-Length is the last header: we have a test for this case at line 188 in a48d634.
I do see a couple of potential issues in edge cases, though.
- `Content-Length: 42\r\n \r\n` (with a space between the CRLF pairs) will add a space to the value: `"42 "`
- `Content-Length:\r\n 42\r\n` (with the whole value in a continuation line) adds a leading space: `" 42"`
Both of these cases are errors now, although they were accepted prior to bf90f3a. I think they're both technically legal, although I'd have to go back to the RFCs to be sure.
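To make the two edge cases concrete, here is a minimal sketch of a line-by-line parser with continuation-line support. `SketchHeaders` is a hypothetical, simplified model written for illustration; it is not Tornado's `HTTPHeaders`, and it only assumes the behavior described above (single-line values are stripped, continuation lines are joined with one space).

```python
class SketchHeaders:
    """Hypothetical simplified model of a line-by-line header parser."""

    def __init__(self):
        self.values = {}
        self._last_key = None

    def parse_line(self, line):
        if line[0].isspace():
            # Continuation line: join to the previous value with one space.
            if self._last_key is None:
                raise ValueError("first header line cannot start with whitespace")
            self.values[self._last_key] += " " + line.lstrip()
        else:
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError("no colon in header line")
            self._last_key = name
            self.values[name] = value.strip()

    @classmethod
    def parse(cls, text):
        headers = cls()
        for line in text.split("\n"):
            if line.endswith("\r"):
                line = line[:-1]  # CRLF is handled here, not in parse_line
            if line:
                headers.parse_line(line)
        return headers


# Space between the CRLF pairs: trailing space is appended.
print(repr(SketchHeaders.parse("Content-Length: 42\r\n \r\n").values["Content-Length"]))  # '42 '
# Whole value on a continuation line: leading space is added.
print(repr(SketchHeaders.parse("Content-Length:\r\n 42\r\n").values["Content-Length"]))  # ' 42'
```

In this model the ordinary case (`Content-Length: 42\r\n\r\n`) yields `"42"`, because the empty line never reaches `parse_line`.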
from tornado.
We had some code that manually proxied headers from an upstream request to a response. It passed every line received by an `AsyncHTTPClient.fetch` `header_callback` straight to `parse_line`, and that triggered this.
from tornado.
I just tested sending a request with a `Content-Length` of `0`, and it worked totally fine. Can you provide an example of a request that causes the problem?
from tornado.
The Content-Length needs to be the last header. The blank line that follows it then gets interpreted as a multi-line continuation and adds a space to the value, as stated in the first message.
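A rough sketch of that trigger, assuming raw lines arrive with their trailing `\r\n` (as `header_callback` delivers them) and are handed one by one to a line parser. The `parse_line` and `feed` functions below are hypothetical stand-ins, not Tornado code, and the status line is left out for brevity.

```python
from typing import Dict, List, Tuple


def parse_line(pairs: List[Tuple[str, str]], line: str) -> None:
    """Hypothetical stand-in for a line-by-line header parser."""
    if line[0].isspace():
        # "\r\n" starts with whitespace, so it is treated as a continuation
        # of the previous header (assumes at least one header came first).
        name, value = pairs[-1]
        pairs[-1] = (name, value + " " + line.lstrip())
    else:
        name, _, value = line.partition(":")
        pairs.append((name, value.strip()))


def feed(raw_lines: List[str]) -> Dict[str, str]:
    pairs: List[Tuple[str, str]] = []
    for raw in raw_lines:
        parse_line(pairs, raw)
    return dict(pairs)


# Content-Length is last, so the final "\r\n" lands on it.
last = feed(["Content-Type: text/plain\r\n", "Content-Length: 42\r\n", "\r\n"])
print(repr(last["Content-Length"]))  # '42 '

# Content-Length is not last, so another header absorbs the space.
first = feed(["Content-Length: 42\r\n", "Content-Type: text/plain\r\n", "\r\n"])
print(repr(first["Content-Length"]))  # '42'
```

Whichever header comes last picks up the trailing space, which is why the problem only shows up for Content-Length in that position.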
from tornado.
Got it; now I can reproduce the bug. Agreed that this is a problem.
Also, it turns out that gunicorn and fasthttp also have this exact same bug.
from tornado.
> Got it; now I can reproduce the bug. Agreed that this is a problem.
I'm still not clear on what exactly the problem is. Is there an issue with `HTTPHeaders.parse()` or only with `parse_line()`? Internally, Tornado only uses `parse_line()` inside `parse()` and in `curl_httpclient`'s header callback.
I see that there's a design mismatch in the interfaces of header_callback and parse_line: the former gives you the newlines, while parse_line expects them to be removed (this isn't formally specified but it's implied by the doctest). So you can't actually pass the values from header_callback directly to parse_line, even though this is superficially a reasonable thing to do.
There's also a couple of weird edge cases I noted at the bottom of #3321 (comment)
Does that cover everything or am I missing something?
Solutions to the design mismatch include:
- Working as intended, just needs better docs.
- Deprecate `header_callback` in `AsyncHTTPClient.fetch` and replace it with a separate callback that gives you a pre-parsed `HTTPHeaders` object. We need a callback that gives you headers before the first streaming chunk, but doing it with raw header lines just pushes unnecessary work into the application.
- Make `parse_line` able to handle newlines. This almost works (by accident) because simple headers get stripped, but continuation lines can cause extraneous whitespace.
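As an illustration of the second option, an application can already approximate a pre-parsed callback by buffering the raw lines itself. The sketch below is an assumption-laden example, not a proposed Tornado API: `make_header_collector` and `on_headers` are hypothetical names, the result is a plain `dict` rather than `HTTPHeaders`, and continuation lines are simply rejected.

```python
from typing import Callable, Dict, List


def make_header_collector(
    on_headers: Callable[[Dict[str, str]], None]
) -> Callable[[str], None]:
    """Build a callback for raw header lines that fires once per header block."""
    lines: List[str] = []

    def header_callback(raw_line: str) -> None:
        line = raw_line.rstrip("\r\n")  # drop the trailing newline first
        if line.startswith("HTTP/"):
            lines.clear()  # status line: a new header block begins
            return
        if line:
            lines.append(line)
            return
        # Blank line: the header block is complete, so parse it now.
        headers: Dict[str, str] = {}
        for item in lines:
            if item[0].isspace():
                raise ValueError("continuation lines are not supported here")
            name, _, value = item.partition(":")
            headers[name.strip()] = value.strip()
        on_headers(headers)

    return header_callback


collected: List[Dict[str, str]] = []
callback = make_header_collector(collected.append)
for raw in ["HTTP/1.1 200 OK\r\n", "Content-Length: 42\r\n", "\r\n"]:
    callback(raw)
print(collected)  # [{'Content-Length': '42'}]
```

Because the newline is removed before any parsing, the terminating blank line can no longer be mistaken for a continuation.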
from tornado.
Aha, now I see the problem. Single-line headers have leading and trailing whitespace stripped, while continuation lines make it possible to construct a header with trailing whitespace, potentially confusing users of that header. RFC 9110 is clear that trailing whitespace should be stripped from header values. I'm going to:
- Make continuation lines containing only whitespace an error. The `parse_line` interface doesn't let us handle this properly (we must preserve internal space but strip trailing space, and we can't tell in the line-by-line interface whether we're looking at a middle line or the last one of a header).
- Handle newlines in `parse_line`, specifically so that lines containing only newlines are no-ops. This fixes the way that the last header gets a trailing space if you use `parse_line` directly instead of `parse`.
- Emit a deprecation warning on continuation lines. There should be no reason to support this feature any more and we should get rid of it in the future.
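The three planned changes could look roughly like the sketch below. This is a hypothetical model (`PlannedHeaders` is an invented name) meant only to show the intended behavior; it is not the actual patch, and the real exception type and warning text may differ.

```python
import warnings


class PlannedHeaders:
    """Hypothetical model of the planned parse_line behavior."""

    def __init__(self):
        self.values = {}
        self._last_key = None

    def parse_line(self, line):
        line = line.rstrip("\r\n")  # tolerate a trailing newline
        if not line:
            return  # a line containing only newlines is a no-op
        if line[0].isspace():
            if self._last_key is None:
                raise ValueError("first header line cannot start with whitespace")
            part = line.lstrip()
            if not part:
                raise ValueError("continuation line contains only whitespace")
            warnings.warn(
                "continuation lines are deprecated",
                DeprecationWarning,
                stacklevel=2,
            )
            self.values[self._last_key] += " " + part
        else:
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError("no colon in header line")
            self._last_key = name
            self.values[name] = value.strip()


headers = PlannedHeaders()
headers.parse_line("Content-Length: 42\r\n")
headers.parse_line("\r\n")  # no longer appends a space
print(repr(headers.values["Content-Length"]))  # '42'
```

With this shape, raw lines from `header_callback` can be passed straight through, while a whitespace-only continuation such as `" \r\n"` raises instead of silently producing `"42 "`.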
from tornado.