Coder Social home page Coder Social logo

Comments (29)

tanner0101 avatar tanner0101 commented on May 31, 2024 6

@normanmaurer pre-NIO, Vapor had implemented lazy headers and that gave us a huge performance gain. Definitely the biggest gain over any previous optimizations we had done (besides non-blocking :P).

Some notes about how we did it:

  • HTTPHeaders struct has a COW HTTPHeadersStorage class.
  • HTTPHeadersStorage is backed by an UnsafeMutablePointer<UInt8>.
  • Reading headers means scanning over the bytes and extracting the desired headers.
  • Mutating headers means first doing COW check (and possibly copying the buffer) then editing the bytes in memory, re-allocating the buffer if needed.
  • The important piece was that the HTTPHeadersStorage buffer was always valid HTTP/1 headers. This allowed us to simply write the header storage buffer directly to the output stream when serializing HTTP messages.
  • We were also able to optimize some common base-cases, like:
Content-Length: 0

These base-case HTTPHeaders structs were statically available and, if never mutated (which is common for simple plaintext benchmarks), the performance was great.

link: HTTPHeaderStorage.swift

I was hoping at some point (before NIO happened) to further expand this HTTPHeaders concept to the entire HTTP message header (what NIO calls "head part"). I think everything there can benefit from being parsed lazily (besides perhaps the version, but still no harm done if it's parsed lazily). For example, Vapor has the concept of an "anything" router segment that matches all URIs. This, combined with lazy URI parsing in an HTTP request, could mean the URI never needs to get copied.

For reference, here is a rough idea of the performance I was seeing:

  • vapor/engine: ~130kRPS (highly optimized HTTP echo)
  • go/fast-http: ~120kRPS (also just HTTP echo)
  • vapor/vapor (pre-nio): ~105kRPS (vapor app returning "hello, world")
  • go/gin: ~90kRPS (gin app returning "hello, world")
  • vapor/vapor (post-nio): ~80kRPS (vapor app returning "hello, world")

from swift-nio.

jckarter avatar jckarter commented on May 31, 2024 3

Cross-module inlining is planned to become publicly available using the @inlinable attribute from https://forums.swift.org/t/se-0193-cross-module-inlining-and-specialization/7310 once that's finalized; the design is unlikely to diverge substantially from the existing @_inlineable (note the second e) internal attribute that's already present in 4.x except in spelling. Longer term, the build system ought to support cross-module optimization as a matter of course.

from swift-nio.

helje5 avatar helje5 commented on May 31, 2024 2

@fadi-botros It would be very helpful if you would provide a lot more info on this, as such this is just a pretty useless claim? You did JSON encoding/decoding? How? Using Codable? Maybe even Foundation.JSONDecoder? You'll get very disappointing performance with those for sure. What Swift version did you use? etc.

Wrt the Swift String performance, I think the claim is, that a Swift 4.1 UTF-8 String will be pretty much the same as the buffer. There is probably no need for a new class for cases which actually warrant a String (e.g. IMO this isn't the case in the HTTP stack, also checkout PR #355 on this).

I did a pretty detailed performance analysis for Redi/S, and NIO performance is already (surprisingly) close to the C implementation. You can find it over here: RedisServer/Performance.

from swift-nio.

weissi avatar weissi commented on May 31, 2024 2

the suggested items have been addressed and everything changes anyway with Swift 5

from swift-nio.

tanner0101 avatar tanner0101 commented on May 31, 2024 1

@twof 6.3MB at boot, 7.1MB during peak load.

@normanmaurer here is a base Vapor 3 (release candidate) project with the ping route.

nio-http.zip

Use this to run it:

swift run -c release Run --env prod

Beware that (at time of writing this) Vapor hasn't had the official 3.0.0 tag placed yet and thus hasn't undergone any in-depth performance or memory analysis. There are likely to be a few performance issues and possibly some memory leaks. (If you find any, please let me know haha). But it should hopefully help you nonetheless. You can check back on https://github.com/vapor/vapor/releases/tag/3.0.0 for the official release.

Also note that DateMiddleware is disabled by default. You can uncomment in configure.swift to enable it.

from swift-nio.

helje5 avatar helje5 commented on May 31, 2024 1

Upfront: I actually don't think this belongs here, maybe there should be a NIO mailing list (personally I wouldn't take part in a forum, but maybe that too). This is more like a general NIO programming question than an actual HTTP performance issue(, I think). E.g. even basic stuff like that NIO doesn't use GCD is missing here ...

Also, may using low-level threading primitives may be better than GCD ??

First of all: NIO does not use GCD. In a way it uses a simplified version of that, which has advantages.

What threading primitives work best for you, depends on the use case a lot. E.g. for Redi/S a plain Readers/Writer lock turned out to be best. But then the wrapped operations operate just in memory and are very fast in the first place.

Summary: it depends a lot.

"Almost" nothing to do w/ NIO. It has a thread safe setup, which is a little more expensive than what for example Node does, but if you avoid TS promises and such, it doesn't show up in my tests as an issue.

@helje5 Says in his performance analysis to his tool Redi/S, that using one thread was faster than using GCD or EventLoop (EventLoop code is also built on GCD I guess).

So you didn't even read what I wrote. :-> I actually found that for Redi/S the sweet spot was two threads.
But yes, Node.js style separate processes w/o locking overhead may actually perform better than a threaded server (w/ the necessary in process locking). But all that again really depends on what you are actually doing (how much shared state etc).

@twof For the code: Here is it:
https://gist.github.com/fadi-botros/b8276bb5e2b83a63045877aeb7415f34

This has quite a few basic NIO programming errors (flush, flush, close?, why is it an ChannelOutboundHandler?), it uses Codable, worse JSONEncoder, a DateFormatter, etc. I'm not surprised this is slow.
But none of that has to do w/ NIO.

That you come forward and claim that we need a:

WHOLE NEW string class

seems a little premature and not backed by actual analysis ;-)

from swift-nio.

normanmaurer avatar normanmaurer commented on May 31, 2024

@Lukasa @weissi @tanner0101 ping...

from swift-nio.

normanmaurer avatar normanmaurer commented on May 31, 2024

As suggested by @Lukasa we may be able to just "lazy parse and create Strings" and just make the HttpHeaders directly operate on ByteBuffer. This way we would only need to create Strings if the user ask for a specific header etc.

from swift-nio.

helje5 avatar helje5 commented on May 31, 2024

https://lists.swift.org/pipermail/swift-server-dev/Week-of-Mon-20171211/000735.html

(As mentioned, I would prefer a lower level HTTPMessage object, which directly peeks into and reuses the receive buffer. But again, all this has been discussed quite actively, and people insisted that String is what we want to have here, and that it’ll be fast enough - so I guess we just stick to that.)

Β―\_(ツ)_/Β―

Or in other words: I absolutely agree. We had this discussion w/ @weissi in the HTTP list and IIRR he said that Strings are cool ;-)

from swift-nio.

weissi avatar weissi commented on May 31, 2024

@helje5 Well, I still think the API should be Strings or at least some String-like type and not [UInt8], ByteBuffer or Unsafe[Raw]BufferPointer directly. So today I think we should make the String creation lazy and hope that that's good enough. Swift 5 (or maybe even 4.2) should improve the performance quite a bit because AFAIK it'll have a native UTF-8 backing and also the short string optimisation (Strings shorter than some number of characters can be stored inside of the pointer value, no allocations necessary).
If that turns out not to be fast enough still, we could always think about our own String-like type that is backed by a ByteBuffer without any further allocations. Possibly a future version of Swift would even allow the actual String with additional representations.

Now I have to admit that I would have expected Swift to deliver a String type that is backed by memory provided externally by this point in time. In other words I previously had hoped that just naively going with String will be good enough as I hoped that the language would develop its String capabilities pretty soon. Unfortunately, that hasn't happened to this day but there's still hope. So far I think leaving the API as is but making the String creation lazy should be good enough for now.

from swift-nio.

helje5 avatar helje5 commented on May 31, 2024

cross-module calls (no optimisations for this in swift yet :( )

@jckarter may provide input on this, but my feeling is that this won't actually change. Or in other words, it may make sense to have one big module (I made that mistake in Noze.io, and used a lot of tiny modules, which is pretty bad for perf, especially if generics are involved).

The other thing I noticed from a performance perspective, is that you create quite few helper objects and make a lot of indirections. To such a degree that I wonder whether this is going to hurt performance (in general, not just HTTP) just by malloc and function call (+ARC - a thread safe op!) overhead. No idea.

from swift-nio.

helje5 avatar helje5 commented on May 31, 2024

@helje5 Well, I still think the API should be Strings or at least some String-like type and not [UInt8], ByteBuffer or Unsafe[Raw]BufferPointer directly.

Yes, I never suggested that ;-)

There is a lot of optimisation potential in here. It may also make sense to provide static strings as the subscript input (sth like header[NIO.Headers.ContentLength]). And for some values (User-Agent etc) come to mind, it may make sense to maintain an intern table.

from swift-nio.

tanner0101 avatar tanner0101 commented on May 31, 2024

Well, I still think the API should be Strings or at least some String-like type

Totally agree.

So today I think we should make the String creation lazy and hope that that's good enough.

Just lazy String creation will be a huge win by itself. I think this makes sense as a first step before needing to delve into optimizing the actual String type.

It may also make sense to provide static strings as the subscript input

An overload for HTTPHeaders.subscript(name:) that accepts StaticString would be great.


Ideally, parsing an HTTP/1 message head (startline and headers), would require zero copies. After the OS reads the data into a ByteBuffer, that buffer would then be used to produce all needed data lazily. The HTTP scanner just needs to iterate over the bytes until it finds \r\n\r\n, then it's done--at this point it has been fully parsed into an HTTP{Request|Response}Head. These would all be computed properties that access the private COW storage.

The HTTP decoder (pipeline handler) will of course need to peek at some things like the Connection, Transfer-Encoding, or Content-Length headers and potentially the HTTP version to do its job. But those should be the only copies required as a part of the pipeline and they should happen only once required.

In terms of HTTP/2, I'm not too well versed with the spec yet, but as I understand it the same method should work just as well with HTTP/2 frames. Probably with even more room for optimization since we don't need to convert text-encoded integers to discover content length or search for crlfs.


That being said, the above change would require a lot of work. It would replace a lot of, if not everything, http-parser is providing. A simpler alternative (and perhaps a better starting place) would be to apply this lazy parsing to the HTTP headers struct only. The pointer info we get from the http-parser callbacks (on_header_field and on_header_value) can be stored to facilitate reads and writes on the underlying storage (see HTTPHeaderStorage.indexes property).

from swift-nio.

helje5 avatar helje5 commented on May 31, 2024

HTTP/2 does something similar to what I suggested above:

it may make sense to maintain an intern table

The disadvantage of preserving the HTTP header as-is, is that the header block can actually become sizeable (1K or more) and may require multiple reallocs. Another way to approach this are as mentioned string tables and maybe structured values.

The far majority of header names and values are very static. Only few headers actually change in real live (Date, Content-Length). Since you have to look at every byte anyways (to find CRLF etc), you can as well match headers during that process.

Eventually instead of storing one big blob, you could store them in a structured way, either as actual Strings (with CoW values) or as indices into a string table (the latter could save you a lot of ARC). So that

Content-Type: text/json
Content-Length: 0
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.2.149.27 Safari/525.13,

becomes:
(-1, 1)
(-2, 2)
(-3, 3)
aka ~6 bytes instead of 175.

This should play especially well with the "performance tests" ;-)

FWIW: This is also what Netty seems to be doing at least for common header names:

public final class HttpHeaderNames {
    public static final AsciiString ACCEPT = AsciiString.cached("accept");
}

But in the end I'm not the poor soul who has to implement that, so I guess I stop doing suggestions 😬

from swift-nio.

normanmaurer avatar normanmaurer commented on May 31, 2024

I will work on this next week.... Or @tanner0101 you want to try port your stuff from vapor ?

from swift-nio.

tanner0101 avatar tanner0101 commented on May 31, 2024

@normanmaurer honestly very tempting offer... but I don't have enough free time right now to do the job right. I am more than happy to help out with code review and testing though!

from swift-nio.

normanmaurer avatar normanmaurer commented on May 31, 2024

@tanner0101 ok will hopefully start on monday

from swift-nio.

helje5 avatar helje5 commented on May 31, 2024

JFYI: https://github.com/Geal/parser_benchmarks

Benchmarks of parsing HTTP requests with nom (a parser combinator):
β€’ The classical way: 355 Mb/s,
β€’ Optimized (with SIMD): 1930 Mb/s.

from swift-nio.

normanmaurer avatar normanmaurer commented on May 31, 2024

@tanner0101 can you share how you ran the benchmark ? Like what command you used on the client side and what response you expected etc

from swift-nio.

tanner0101 avatar tanner0101 commented on May 31, 2024

I always run this same command to keep things consistent:

wrk -t 4 -c 128 -d <usually between 1s - 60s> http://localhost:8080/ping

And it hits a route that returns:

HTTP/1.1 200 OK
Content-Length: 4
Date: ...
Content-Type: text/plain

pong

note: Date header is required to be valid HTTP which is annoying because it's quite slow to generate the date. Vapor's DateMiddleware optimizes this by caching the result and only recalculating when it changes.

In Vapor, that looks like:

router.get("ping") { req -> StaticString in
    return "pong" 
}

Since the "pong" string doesn't change between requests, we can use StaticString which gives a nice boost.

from swift-nio.

normanmaurer avatar normanmaurer commented on May 31, 2024

@tanner0101 thanks. Could you also provide me with an example that I can run using vapor-core only ?

from swift-nio.

twof avatar twof commented on May 31, 2024

@tanner0101 Do you have any stats on memory usage during those benchmarks?

from swift-nio.

fadi-botros avatar fadi-botros commented on May 31, 2024

Very good issue. Actually I tried to benchmark SwiftNIO against Vertx (tried to make a benchmark under https://www.techempower.com/benchmarks/), and the results were extremely disappointing to me.

I think we need a WHOLE NEW string class, also, JSON Encoding / Decoding, we need it in a high-performance way (I know SwiftNIO is low-level, it may not support JSON out-of-the-box, but I see making a lot of "performant converters" that doesn't use Swift String will give a lot of performance gain) for this framework (this converters may have separate module for each, so that one can only use what he needs).

Swift is very performant in a lot of things, but a big question mark on Swift String performance.

We need something like cstring, and a lot of operations on it, in HTTP, one nearly doesn't use UTF-8 not other encoding, it will be used in database and other things (out-of-scope of SwiftNIO), but HTTP itself doesn't need things out of cstrings. One should try making StringBuffer implementation that doesn't use Swift String, for example.

from swift-nio.

jdmcd avatar jdmcd commented on May 31, 2024

@fadi-botros Can you please share the results from the benchmarks you ran?

from swift-nio.

twof avatar twof commented on May 31, 2024

Can you also post the code for your tests? There are some very confusing results coming out of Vapor too. No idea if anyone's looked into those yet.

from swift-nio.

fadi-botros avatar fadi-botros commented on May 31, 2024

Also, may using low-level threading primitives may be better than GCD ??
@helje5 Says in his performance analysis to his tool Redi/S, that using one thread was faster than using GCD or EventLoop (EventLoop code is also built on GCD I guess).

@twof For the code: Here is it:
https://gist.github.com/fadi-botros/b8276bb5e2b83a63045877aeb7415f34

from swift-nio.

jdmcd avatar jdmcd commented on May 31, 2024

@fadi-botros Can you share the actual results of your benchmark? Including what you hardware and software you used to run it (wrk, etc). It's hard to contextualize your concerns without seeing the numbers and I'm curious to know what you used to perform the tests.

from swift-nio.

fadi-botros avatar fadi-botros commented on May 31, 2024

@mcdappdev This is the results
Sorry, it is as it is resultant from TechEmpower framework benchmark scripts.
Hardware: MacBook Pro 13'', Late 2011, with SSD hard disk 256 GB, 8 GB of RAM
Software: Linux (Ubuntu Trusty) VM, using Vagrant, (which is built on VirtualBox),
the host OS is macOS High Sierra.
The "framework" itself is https://github.com/TechEmpower/FrameworkBenchmarks

https://gist.github.com/fadi-botros/fc0aeb0bbd27a986a1f2a4c3056a5e3d

Also note, my try of "plaintext" test has problems and I don't know where

This is my code (anyway, the link was also in a previous reply)
https://gist.github.com/fadi-botros/b8276bb5e2b83a63045877aeb7415f34

from swift-nio.

fadi-botros avatar fadi-botros commented on May 31, 2024

@helje5 Sorry but could you suggest improvements of my code ?
I tried removing date formatter and using C time library { time, gmtime }

let weekdays = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
let months = ["January", "February", "March", "April", "May", "June", "July", "August", "September",
              "October", "November", "December"]
let comma = ", "
let gmt = " GMT"
let colon = ":"
let space = " "

...

var t: time_t = 0
time(&t)
let a = gmtime(&t).pointee
let weekday = weekdays[Int(a.tm_wday)]
let month = months[Int(a.tm_mon)]

I even tried to convert them to utf8CString then appending them, it was a pain, and the performance is worst

Also changed the writeAndFlush, writeAndFlush, close
to write, write, writeAndFlush, close

And also no significant performance improvements

It seems the only problem is with JSONEncoder, but also I think String class may need some C enhancements (not sure of that).

from swift-nio.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.