Comments (29)
@normanmaurer pre-NIO, Vapor had implemented lazy headers and that gave us a huge performance gain. Definitely the biggest gain over any previous optimizations we had done (besides non-blocking :P).
Some notes about how we did it:
HTTPHeaders
struct has a COWHTTPHeadersStorage
class.HTTPHeadersStorage
is backed by anUnsafeMutablePointer<UInt8>
.- Reading headers means scanning over the bytes and extracting the desired headers.
- Mutating headers means first doing COW check (and possibly copying the buffer) then editing the bytes in memory, re-allocating the buffer if needed.
- The important piece was that the
HTTPHeadersStorage
buffer was always valid HTTP/1 headers. This allowed us to simply write the header storage buffer directly to the output stream when serializing HTTP messages. - We were also able to optimize some common base-cases, like:
Content-Length: 0
These base-case HTTPHeaders
structs were statically available and, if never mutated (which is common for simple plaintext benchmarks), the performance was great.
link: HTTPHeaderStorage.swift
I was hoping at some point (before NIO happened) to further expand this HTTPHeaders
concept to the entire HTTP message header (what NIO calls "head part"). I think everything there can benefit from being parsed lazily (besides perhaps the version, but still no harm done if it's parsed lazily). For example, Vapor has the concept of an "anything" router segment that matches all URIs. This, combined with lazy URI parsing in an HTTP request, could mean the URI never needs to get copied.
For reference, here is a rough idea of the performance I was seeing:
- vapor/engine: ~130kRPS (highly optimized HTTP echo)
- go/fast-http: ~120kRPS (also just HTTP echo)
- vapor/vapor (pre-nio): ~105kRPS (vapor app returning "hello, world")
- go/gin: ~90kRPS (gin app returning "hello, world")
- vapor/vapor (post-nio): ~80kRPS (vapor app returning "hello, world")
from swift-nio.
Cross-module inlining is planned to become publicly available using the @inlinable
attribute from https://forums.swift.org/t/se-0193-cross-module-inlining-and-specialization/7310 once that's finalized; the design is unlikely to diverge substantially from the existing @_inlineable
(note the second e
) internal attribute that's already present in 4.x except in spelling. Longer term, the build system ought to support cross-module optimization as a matter of course.
from swift-nio.
@fadi-botros It would be very helpful if you would provide a lot more info on this, as such this is just a pretty useless claim? You did JSON encoding/decoding? How? Using Codable? Maybe even Foundation.JSONDecoder? You'll get very disappointing performance with those for sure. What Swift version did you use? etc.
Wrt the Swift String performance, I think the claim is, that a Swift 4.1 UTF-8 String will be pretty much the same as the buffer. There is probably no need for a new class for cases which actually warrant a String (e.g. IMO this isn't the case in the HTTP stack, also checkout PR #355 on this).
I did a pretty detailed performance analysis for Redi/S, and NIO performance is already (surprisingly) close to the C implementation. You can find it over here: RedisServer/Performance.
from swift-nio.
the suggested items have been addressed and everything changes anyway with Swift 5
from swift-nio.
@twof 6.3MB at boot, 7.1MB during peak load.
@normanmaurer here is a base Vapor 3 (release candidate) project with the ping
route.
Use this to run it:
swift run -c release Run --env prod
Beware that (at time of writing this) Vapor hasn't had the official 3.0.0 tag placed yet and thus hasn't undergone any in-depth performance or memory analysis. There are likely to be a few performance issues and possibly some memory leaks. (If you find any, please let me know haha). But it should hopefully help you nonetheless. You can check back on https://github.com/vapor/vapor/releases/tag/3.0.0 for the official release.
Also note that DateMiddleware
is disabled by default. You can uncomment in configure.swift
to enable it.
from swift-nio.
Upfront: I actually don't think this belongs here, maybe there should be a NIO mailing list (personally I wouldn't take part in a forum, but maybe that too). This is more like a general NIO programming question than an actual HTTP performance issue(, I think). E.g. even basic stuff like that NIO doesn't use GCD is missing here ...
Also, may using low-level threading primitives may be better than GCD ??
First of all: NIO does not use GCD. In a way it uses a simplified version of that, which has advantages.
What threading primitives work best for you, depends on the use case a lot. E.g. for Redi/S a plain Readers/Writer lock turned out to be best. But then the wrapped operations operate just in memory and are very fast in the first place.
Summary: it depends a lot.
"Almost" nothing to do w/ NIO. It has a thread safe setup, which is a little more expensive than what for example Node does, but if you avoid TS promises and such, it doesn't show up in my tests as an issue.
@helje5 Says in his performance analysis to his tool Redi/S, that using one thread was faster than using GCD or EventLoop (EventLoop code is also built on GCD I guess).
So you didn't even read what I wrote. :-> I actually found that for Redi/S the sweet spot was two threads.
But yes, Node.js style separate processes w/o locking overhead may actually perform better than a threaded server (w/ the necessary in process locking). But all that again really depends on what you are actually doing (how much shared state etc).
@twof For the code: Here is it:
https://gist.github.com/fadi-botros/b8276bb5e2b83a63045877aeb7415f34
This has quite a few basic NIO programming errors (flush, flush, close?, why is it an ChannelOutboundHandler?), it uses Codable, worse JSONEncoder, a DateFormatter, etc. I'm not surprised this is slow.
But none of that has to do w/ NIO.
That you come forward and claim that we need a:
WHOLE NEW string class
seems a little premature and not backed by actual analysis ;-)
from swift-nio.
@Lukasa @weissi @tanner0101 ping...
from swift-nio.
As suggested by @Lukasa we may be able to just "lazy parse and create Strings" and just make the HttpHeaders
directly operate on ByteBuffer
. This way we would only need to create Strings if the user ask for a specific header etc.
from swift-nio.
https://lists.swift.org/pipermail/swift-server-dev/Week-of-Mon-20171211/000735.html
(As mentioned, I would prefer a lower level
HTTPMessage
object, which directly peeks into and reuses the receive buffer. But again, all this has been discussed quite actively, and people insisted that String is what we want to have here, and that itβll be fast enough - so I guess we just stick to that.)
Β―\_(γ)_/Β―
Or in other words: I absolutely agree. We had this discussion w/ @weissi in the HTTP list and IIRR he said that Strings are cool ;-)
from swift-nio.
@helje5 Well, I still think the API should be String
s or at least some String
-like type and not [UInt8]
, ByteBuffer
or Unsafe[Raw]BufferPointer
directly. So today I think we should make the String
creation lazy and hope that that's good enough. Swift 5 (or maybe even 4.2) should improve the performance quite a bit because AFAIK it'll have a native UTF-8 backing and also the short string optimisation (String
s shorter than some number of characters can be stored inside of the pointer value, no allocations necessary).
If that turns out not to be fast enough still, we could always think about our own String
-like type that is backed by a ByteBuffer
without any further allocations. Possibly a future version of Swift would even allow the actual String
with additional representations.
Now I have to admit that I would have expected Swift to deliver a String
type that is backed by memory provided externally by this point in time. In other words I previously had hoped that just naively going with String
will be good enough as I hoped that the language would develop its String capabilities pretty soon. Unfortunately, that hasn't happened to this day but there's still hope. So far I think leaving the API as is but making the String
creation lazy should be good enough for now.
from swift-nio.
cross-module calls (no optimisations for this in swift yet :( )
@jckarter may provide input on this, but my feeling is that this won't actually change. Or in other words, it may make sense to have one big module (I made that mistake in Noze.io, and used a lot of tiny modules, which is pretty bad for perf, especially if generics are involved).
The other thing I noticed from a performance perspective, is that you create quite few helper objects and make a lot of indirections. To such a degree that I wonder whether this is going to hurt performance (in general, not just HTTP) just by malloc and function call (+ARC - a thread safe op!) overhead. No idea.
from swift-nio.
@helje5 Well, I still think the API should be Strings or at least some String-like type and not [UInt8], ByteBuffer or Unsafe[Raw]BufferPointer directly.
Yes, I never suggested that ;-)
There is a lot of optimisation potential in here. It may also make sense to provide static strings as the subscript input (sth like header[NIO.Headers.ContentLength]). And for some values (User-Agent etc) come to mind, it may make sense to maintain an intern table.
from swift-nio.
Well, I still think the API should be Strings or at least some String-like type
Totally agree.
So today I think we should make the String creation lazy and hope that that's good enough.
Just lazy String creation will be a huge win by itself. I think this makes sense as a first step before needing to delve into optimizing the actual String type.
It may also make sense to provide static strings as the subscript input
An overload for HTTPHeaders.subscript(name:)
that accepts StaticString
would be great.
Ideally, parsing an HTTP/1 message head (startline and headers), would require zero copies. After the OS reads the data into a ByteBuffer, that buffer would then be used to produce all needed data lazily. The HTTP scanner just needs to iterate over the bytes until it finds \r\n\r\n
, then it's done--at this point it has been fully parsed into an HTTP{Request|Response}Head. These would all be computed properties that access the private COW storage.
The HTTP decoder (pipeline handler) will of course need to peek at some things like the Connection
, Transfer-Encoding
, or Content-Length
headers and potentially the HTTP version to do its job. But those should be the only copies required as a part of the pipeline and they should happen only once required.
In terms of HTTP/2, I'm not too well versed with the spec yet, but as I understand it the same method should work just as well with HTTP/2 frames. Probably with even more room for optimization since we don't need to convert text-encoded integers to discover content length or search for crlf
s.
That being said, the above change would require a lot of work. It would replace a lot of, if not everything, http-parser
is providing. A simpler alternative (and perhaps a better starting place) would be to apply this lazy parsing to the HTTP headers struct only. The pointer info we get from the http-parser
callbacks (on_header_field
and on_header_value
) can be stored to facilitate reads and writes on the underlying storage (see HTTPHeaderStorage.indexes
property).
from swift-nio.
HTTP/2 does something similar to what I suggested above:
it may make sense to maintain an intern table
The disadvantage of preserving the HTTP header as-is, is that the header block can actually become sizeable (1K or more) and may require multiple reallocs. Another way to approach this are as mentioned string tables and maybe structured values.
The far majority of header names and values are very static. Only few headers actually change in real live (Date, Content-Length). Since you have to look at every byte anyways (to find CRLF etc), you can as well match headers during that process.
Eventually instead of storing one big blob, you could store them in a structured way, either as actual Strings (with CoW values) or as indices into a string table (the latter could save you a lot of ARC). So that
Content-Type: text/json
Content-Length: 0
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.2.149.27 Safari/525.13,
becomes:
(-1, 1)
(-2, 2)
(-3, 3)
aka ~6 bytes instead of 175.
This should play especially well with the "performance tests" ;-)
FWIW: This is also what Netty seems to be doing at least for common header names:
public final class HttpHeaderNames {
public static final AsciiString ACCEPT = AsciiString.cached("accept");
}
But in the end I'm not the poor soul who has to implement that, so I guess I stop doing suggestions π¬
from swift-nio.
I will work on this next week.... Or @tanner0101 you want to try port your stuff from vapor ?
from swift-nio.
@normanmaurer honestly very tempting offer... but I don't have enough free time right now to do the job right. I am more than happy to help out with code review and testing though!
from swift-nio.
@tanner0101 ok will hopefully start on monday
from swift-nio.
JFYI: https://github.com/Geal/parser_benchmarks
Benchmarks of parsing HTTP requests with nom (a parser combinator):
β’ The classical way: 355 Mb/s,
β’ Optimized (with SIMD): 1930 Mb/s.
from swift-nio.
@tanner0101 can you share how you ran the benchmark ? Like what command you used on the client side and what response you expected etc
from swift-nio.
I always run this same command to keep things consistent:
wrk -t 4 -c 128 -d <usually between 1s - 60s> http://localhost:8080/ping
And it hits a route that returns:
HTTP/1.1 200 OK
Content-Length: 4
Date: ...
Content-Type: text/plain
pong
note: Date header is required to be valid HTTP which is annoying because it's quite slow to generate the date. Vapor's DateMiddleware optimizes this by caching the result and only recalculating when it changes.
In Vapor, that looks like:
router.get("ping") { req -> StaticString in
return "pong"
}
Since the "pong"
string doesn't change between requests, we can use StaticString
which gives a nice boost.
from swift-nio.
@tanner0101 thanks. Could you also provide me with an example that I can run using vapor-core
only ?
from swift-nio.
@tanner0101 Do you have any stats on memory usage during those benchmarks?
from swift-nio.
Very good issue. Actually I tried to benchmark SwiftNIO against Vertx (tried to make a benchmark under https://www.techempower.com/benchmarks/), and the results were extremely disappointing to me.
I think we need a WHOLE NEW string class, also, JSON Encoding / Decoding, we need it in a high-performance way (I know SwiftNIO is low-level, it may not support JSON out-of-the-box, but I see making a lot of "performant converters" that doesn't use Swift String will give a lot of performance gain) for this framework (this converters may have separate module for each, so that one can only use what he needs).
Swift is very performant in a lot of things, but a big question mark on Swift String performance.
We need something like cstring, and a lot of operations on it, in HTTP, one nearly doesn't use UTF-8 not other encoding, it will be used in database and other things (out-of-scope of SwiftNIO), but HTTP itself doesn't need things out of cstrings. One should try making StringBuffer implementation that doesn't use Swift String, for example.
from swift-nio.
@fadi-botros Can you please share the results from the benchmarks you ran?
from swift-nio.
Can you also post the code for your tests? There are some very confusing results coming out of Vapor too. No idea if anyone's looked into those yet.
from swift-nio.
Also, may using low-level threading primitives may be better than GCD ??
@helje5 Says in his performance analysis to his tool Redi/S, that using one thread was faster than using GCD or EventLoop (EventLoop code is also built on GCD I guess).
@twof For the code: Here is it:
https://gist.github.com/fadi-botros/b8276bb5e2b83a63045877aeb7415f34
from swift-nio.
@fadi-botros Can you share the actual results of your benchmark? Including what you hardware and software you used to run it (wrk, etc). It's hard to contextualize your concerns without seeing the numbers and I'm curious to know what you used to perform the tests.
from swift-nio.
@mcdappdev This is the results
Sorry, it is as it is resultant from TechEmpower framework benchmark scripts.
Hardware: MacBook Pro 13'', Late 2011, with SSD hard disk 256 GB, 8 GB of RAM
Software: Linux (Ubuntu Trusty) VM, using Vagrant, (which is built on VirtualBox),
the host OS is macOS High Sierra.
The "framework" itself is https://github.com/TechEmpower/FrameworkBenchmarks
https://gist.github.com/fadi-botros/fc0aeb0bbd27a986a1f2a4c3056a5e3d
Also note, my try of "plaintext" test has problems and I don't know where
This is my code (anyway, the link was also in a previous reply)
https://gist.github.com/fadi-botros/b8276bb5e2b83a63045877aeb7415f34
from swift-nio.
@helje5 Sorry but could you suggest improvements of my code ?
I tried removing date formatter and using C time library { time, gmtime }
let weekdays = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
let months = ["January", "February", "March", "April", "May", "June", "July", "August", "September",
"October", "November", "December"]
let comma = ", "
let gmt = " GMT"
let colon = ":"
let space = " "
...
var t: time_t = 0
time(&t)
let a = gmtime(&t).pointee
let weekday = weekdays[Int(a.tm_wday)]
let month = months[Int(a.tm_mon)]
I even tried to convert them to utf8CString then appending them, it was a pain, and the performance is worst
Also changed the writeAndFlush
, writeAndFlush
, close
to write
, write
, writeAndFlush
, close
And also no significant performance improvements
It seems the only problem is with JSONEncoder, but also I think String class may need some C enhancements (not sure of that).
from swift-nio.
Related Issues (20)
- Failed WebSocket upgrade does not consume initial request `.end` part HOT 2
- 11-5421`\_rafe transport chasis #β’β^|-{cricket bat/Β°87\.,.,./
- Carton_\mahim cargo service~)`ββ/[(βiβ)>,_/β)β²[]
- visionOS Support HOT 6
- `ClientBootstrap.connect` that returns a NIOAsyncChannel hits precondition if connect fails HOT 1
- ByteToMessageProcessor: re-entrancy bugs
- FBβ,-(CI HOT 1
- β HOT 1
- Proposal: Contributing HTTP Response Status Code Mappings HOT 2
- Performance tests broken (always paste a result from 3rd Jan, actual run fails) HOT 7
- TCPThroughputBenchmark has a crashing race
- NIOFileSystem uses yet another thread pool instead of NIOThreadPool HOT 3
- NIOFileSystem's BufferedWriter wants Sequence<UInt8>s and doesn't accept ByteBuffers
- NIOFileSystem's BufferedWriter produces empty files when they shouldn't be empty HOT 1
- NIOFileSystem: Should offer withTemporaryDirectory
- NIOFileSystem: BufferedReader.read(while:) doesn't say if it stopped reading because the condition was met or because or EOF HOT 1
- NIOFileSystem: BufferedReader can only return `[UInt8]` and not `ByteBuffer` HOT 2
- `NIOFileSystem` module needs to be `_NIOFileSystem`
- Occasional program crashes on ubuntu20.04 HOT 2
- When my app is submitted to the App Store, an error occurs The naming convention for 'NlOBase64 'is incorrect HOT 7
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
π Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. πππ
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google β€οΈ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from swift-nio.