Coder Social home page Coder Social logo

Comments (11)

yohanboniface avatar yohanboniface commented on August 18, 2024

@youknowone do you have a test case at hand to reproduce the issue?

from httptools.

youknowone avatar youknowone commented on August 18, 2024

A request:

b'''GET /ping/ HTTP/1.1\r\nHost: github.com\r\nConnection: keep-alive\r\nCache-Control: max-age=0\r\nUpgrade-Insecure-Requests: 1\r\nUser-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8\r\nAccept-Encoding: gzip, deflate\r\nAccept-Language: ko-KR,ko;q=0.8,en-US;q=0.6,en;q=0.4\r\n\r\n'''

I called feed_data each byte by byte and the headers are broken.

Related issue in sanic which uses httptools: sanic-org/sanic#755

They are implmenting their own header fragment buffer.

from httptools.

1st1 avatar 1st1 commented on August 18, 2024

I'll take a look as soon as I finish working on the next uvloop release.

from httptools.

1st1 avatar 1st1 commented on August 18, 2024

@yohanboniface feel free to work on this if you have time

from httptools.

yohanboniface avatar yohanboniface commented on August 18, 2024

@youknowone to make sure I understand the issue: it arises when a request is chunked in a middle of a header field?
So for example, to continue on your data, we'd have this as first chunk:

GET /ping/ HTTP/1.1\r\nHost: github.com\r\nConnection: keep-alive\r\nCache-Control: max-age=0\r\nUpgrade-Insecure-Requests: 1\r\nTransfer-Encoding: chunked\r\nUser-Agent: Mozilla/5.0

And this as second chunk:

(Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8\r\nAccept-Encoding: gzip, deflate\r\nAccept-Language: ko-KR,ko;q=0.8,en-US;q=0.6,en;q=0.4\r\n\r\n

And then you'd have an incomplete value for User-Agent and an invalid header name with the rest of the value?

Is that correct or am I missing something?

edit bah, no, indeed a chunked request is only about the body.
So it's not about the request itself being split in the middle of a header field but that the code implementing httptools chunking it manually before calling feed_data?

from httptools.

youknowone avatar youknowone commented on August 18, 2024

@yohanboniface Yes, your description is correct. User-Agent will be Mozzle/5.0 in that case.

As I know, the chunked body is a spec about logical chunk, not about TCP packet fragment.

A question here: does httptools expect to feed the whole HTTP body (at least "a chunk") at same time? Then it can be a user fault - but still weird.

Because httptools is the parser, I think basically the users can't determine which part of http request is going to httptools or not. For the point of view of user, "end of chunk" of http body and any fragmented packet in http header is not recognizable before putting it into the parser.
I think httptools is the correct place to merge the fragmented tcp packets to avoid double-parsing http request both in httptools and the users.

from httptools.

yohanboniface avatar yohanboniface commented on August 18, 2024

@youknowone made a quick unittest to reproduce what I've understood of the issue, but… it passes ;)

See #26

Can you please check the unittest and tell me what I'm missing to properly reproduce the issue? thanks :)

from httptools.

youknowone avatar youknowone commented on August 18, 2024

Thanks, your test is really helpful.
It seems I need to look into both httptools and sanic.
Give me some time. I am new to both part.

from httptools.

youknowone avatar youknowone commented on August 18, 2024

I changed your test a little and it now starts to be broken: #27

from httptools.

youknowone avatar youknowone commented on August 18, 2024

I also added a patch to #27. Thanks @yohanboniface, I would never looked into it without your test.

from httptools.

yohanboniface avatar yohanboniface commented on August 18, 2024

Cool!
I'll let @1st1 do the final review :)

from httptools.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.