Comments (26)
@RossTate thank you, I agree that simplicity is better than a small performance benefit for now. But some people here have concerns that the performance benefits might actually be significant, because at the end of the day we are trying to compete with JS as a compilation target, and performance is the biggest competitive advantage we can have. We need more measurements :)
Also, this probably deserves a separate discussion, but separate compilation is a very hard issue to solve:
- "Symbolic" struct field linking close to source language -- the thing we discussed the most -- easy to solve.
- Compiler optimizations -- for many languages separate compilation relies on JIT that can optimize across compilation units. Currently Wasm engines expect preoptimized code. Adding a bunch of good speculative optimization to all engines to be competitive with JS or JVM is a very hard task.
- Code size -- separate compilation leaves more unused code that could otherwise be eliminated by DCE. Even in JS, where 1. and 2. are solved issues, the gold standard is still to use a bundler like webpack. I don't see how we can solve this issue with Wasm.
from function-references.
Could that sentinel object be in its own OS page, and that page protected in the same way that linear memory is?
I considered that idea before saying "I don't see that happening" :-)
With actual 64-bit pointers it might be feasible. The main concern there is that we wouldn't want the thing to require special-casing all over the place (GC marking phase, write barrier, ...), so it must "look like a regular heap object" including a shape descriptor and a proper page header and whatnot.
With pointer compression, however, I'm pretty sure it's just outright impossible.
Also, V8 implementation aside, IIUC memory protection tricks can only handle `.set`/`.get`, not `ref.cast` (which according to previous discussions is going to be a very frequent operation).
Is it because of interop with nulls from other subsystem, like JS?
"It's the null
we already have" is certainly one reason; but due to the considerations above I'm also pretty sure we won't introduce a special "Wasm null
" with a different implementation. (We may or may not introduce a special "Wasm null
", but it would be implemented the same way as the existing JS null
.)
Generating a local `ref.as_non_null` cast after the first access can make things worse for a non-optimizing Wasm engine with the signal handler trick, but it might be better for current V8 in some cases.
Agreed, our current prototype would achieve slightly better performance in some cases if you did that, but I would generally recommend that we first build things (in both compilers and engines) as simple as we can, then profile real-world use cases, and then decide where we need more complexity in order to improve things. Chances are that the situations where an X-to-Wasm compiler can easily emit such local optimizations are the same cases where a Wasm engine can fairly easily make the inference; so if/when we decide that this is worth improving, we should coordinate on the "how".
from function-references.
Two benefits I can see.
- There are some engines where an implicit null check (e.g. in a `struct.get`) is not free and costs a branch. Knowing the input is non-null means the engine can omit the branch.
- For those same instructions that have an implicit null check that traps, knowing the input is non-null means that the instruction cannot trap. If we allow catching traps in the future, this means fewer control-flow edges that a compiler has to worry about, generally resulting in better code.
from function-references.
About 1, which engines do you mean? It seems like every time you create a struct, you would zero-initialize the `i32` fields just like we initialize locals and memory - and so then you could also initialize a nullable function reference to a (hidden, internal) "null function" that traps when called. Then whenever you have a reference, nullable or not, you know you can always call it. Does the idea not work, or not work in some engines?
2 is an interesting point I hadn't considered! Yes, if we allow catching traps in the future it seems like it could help.
from function-references.
- For GC struct references, engines can protect a few memory pages after the `nullref` address, and avoid generating null-check branches for memory accesses with reasonably small offsets. Should work for function references too.
from function-references.
For GC struct references, engines can protect a few memory pages after nullref address, and avoid generating null-check branches for memory accesses with reasonably small offsets. Should work for function references too.
This usually only happens for a per-page allocation during `memory.grow` and when using `mmap` under the hood. Also, it cannot be done on 32-bit platforms. I don't think every `nullref` address will be allocated by `mmap` and wrapped by guard pages. It would be a terrible waste of resources.
from function-references.
@skuzmich Thanks, yes, I think you're right. So like with linear memory, null checks can be "free" with the signal handler trick, and non-nullable references can't speed up GC references either.
@MaxGraey You'd only need to guard the region around 0 once, and then it helps you with every single check later.
(You're right it can't be done on 32-bit platforms, and places where the signal handler trick can't be used for other reasons - but in wasm we've already agreed basically that the signal handler trick is something we can depend on in most important cases, which is why wasm didn't add a different form of sandboxing like masking.)
from function-references.
Non-nullable types could be more efficient for `ref.cast` from the GC proposal, and similar instructions that access the struct RTT/header without trapping on a null reference.
In addition to point 2: even without being able to catch traps in Wasm, Wasm state can be observed after a trap in a host like JS. Thus, unless they infer nullability, Wasm compilers would not be able to reorder nullable `struct.get`s without having to speculate and deoptimize. This can prevent various reordering optimizations, like loop-invariant code motion.
from function-references.
I was wrong about `ref.cast` above. Adding trapping versions of the instructions would solve the problem.
from function-references.
in wasm we've already agreed basically that the signal handler trick is something we can depend on in most important cases, which is why wasm didn't add a different form of sandboxing like masking.
I'm not sure I subscribe to that. Even with bounds checking you cannot rely on signal handling everywhere. 32 bit platforms are still real, so are techniques like pointer compression, or execution environments with no access to signals.
The knowledge that something isn't null also allows certain optimisations, e.g., in instructions following an explicit null check it can elide various cases where null is treated specially but without trapping, or when inlining a function performing an explicit null-check into a call site where the parameter is known to have non-null type. In simple cases, the engine could use flow analysis to derive similar information, but that does not work across function boundaries, except when inlining.
from function-references.
Even with bounds checking you cannot rely on signal handling everywhere. 32 bit platforms are still real, so are techniques like pointer compression, or execution environments with no access to signals.
I agree to all those. I think you might be responding to a point I didn't make, though, so apologies if I wasn't clear enough.
My point is that we knew 32-bit platforms etc. exist, and we considered adding a form of linear memory sandboxing like masking that does work on them and that is cheaper than explicit bounds checks. But we decided that the signal handler trick was possible in enough places, and the overhead of explicit bounds checks small enough, that it wasn't worth it. It seems like the same thinking could apply to GC, unless we expect 32-bit platforms to be more important there, or the overhead larger, for some reason?
The knowledge that something isn't null also allows certain optimisations
I agree with this too. It's related to @skuzmich 's point that
unless they infer nullability, Wasm compilers would not be able to reorder nullable `struct.get`s
I think this is definitely worth measuring to see how important it is. As before, something similar happens with linear memory checks, where potentially trapping things like memory accesses and integer divides etc. cannot be reordered. I'm not aware of data showing that that is a significant issue, but it may be, and it may be different in GC.
Note btw that the binaryen optimizer has an `ignoreImplicitTraps` flag which does let it reorder memory operations and integer divides and other things that might trap. We don't have data showing that it is useful for speed, but it can help code size in some cases by a small amount. I expect we will experiment with expanding that flag to GC somehow.
from function-references.
To add another dimension of concern, I'm worried about how much non-null reference types will be generated. Many languages do not reason about nullness. Many languages that do still use null for other reasons. One simple example is using `null` for nullary constructors of algebraic data types (e.g. `nil` or `None`). But a more widespread example is initialization. For example, Kotlin is a mostly null-safe language but punts on nullness in a few spots. One in particular is initialization. When designing the language, we discussed ways to ensure null safety even during initialization, and we found that it would be extremely cumbersome to support common patterns (particularly due to encapsulation and separate compilation), with little performance benefit. So Kotlin's non-null types can actually be null during initialization.

People trying to generate WebAssembly will likely have similar problems. Without advanced features like typestate, the common pattern of allocating a blank instance and then letting each function/module initialize the appropriate component won't type-check. This is even problematic for initializing runtime data structures. For example, a v-table is often initialized by passing the v-table to each function/module defining relevant superclasses, starting with the base class and moving down the hierarchy. Type-checking initialization such that the burden can be split across functions/modules is not easy.
from function-references.
@kripken we have more freedom optimizing non-nullable `struct.get` compared to knowing that `load`s in the program never trap, assuming that is what `ignoreImplicitTraps` assumes. For example, we can unconditionally execute a conditional `struct.get`.
from function-references.
V8 has no plans to use signal handler tricks for null checks on any platform. (Reason: our `null` reference is a sentinel object somewhere on the heap, not a zero-valued pointer like C++'s `nullptr`, and I don't see that changing.) So any non-null-permitting operation (`struct.get`, `ref.cast`, ...) on a possibly-null value will emit an explicit null check (comparison + conditional branch, depending on circumstances possibly also materialization of the `null` sentinel constant to compare with).
I haven't measured what the performance impact of these checks is.
To me, the interesting (and hard-to-answer) question is how much guaranteed non-null-ness optimizing compilers will be able to infer by themselves (it's easy to come up with scenarios where it's trivial, as well as others where an engine can't possibly prove it, and in between there's the possible-but-difficult area), and how long it will take for engines to actually implement such optimizations.
It is certainly much easier when the engine only has to look at the static type of the value in order to decide whether a null check is needed (which is what V8 currently does).
from function-references.
our null reference is a sentinel object somewhere on the heap, not a zero-valued pointer like C++'s nullptr
Could that sentinel object be in its own OS page, and that page protected in the same way that linear memory is?
from function-references.
@RossTate I wouldn't worry about Kotlin. I think most of your concerns regarding initialization nullability are based on the JVM implementation, which implies separate compilation and a bytecode format without sound type nullability. Many language features (like when-expressions, primary constructor properties, etc.) encourage immediate variable initialization: ~95% of Kotlin/JS standard library local variables are immediately initialized. Since separate compilation is not very viable in Wasm for many reasons, we can generate non-nullable Wasm types for most immediately-initialized non-nullable fields in a single whole-program compilation. Even if I'm wrong and we wouldn't be able to generate a lot of non-nullable types on average, I would argue it is still important to give programmers a way to micro-optimize their perf-critical code by choosing the "efficient" initialization scheme. I will share concrete numbers once non-nullable type generation is done.
V8 has no plans to use signal handler tricks for null checks on any platform. (Reason: our null reference is a sentinel object somewhere on the heap, not a zero-valued pointer like C++'s nullptr, and I don't see that changing.)
Is it because of interop with nulls from other subsystem, like JS?
To me, the interesting (and hard-to-answer) question is how much guaranteed non-null-ness optimizing compilers will be able to infer by themselves
A good solution would probably look like an inter-procedural analysis (and maybe higher-level language type information), which would be more viable in AOT optimizers than in Wasm JITs.
There are some optimizations that are not representable in Wasm types, like multiple accesses to the same object where one access dominates the others. Generating a local `ref.as_non_null` cast after the first access can make things worse for a non-optimizing Wasm engine with the signal handler trick, but it might be better for current V8 in some cases.
from function-references.
Thanks for the useful information, @jakobkummerow! It's very interesting because it permits a bunch of flexibility. In particular, you can get from and set to fields (up to some offset) of a null reference, so long as you check that the reference is null before you take any observable actions based on what you got. Plus branch hinting/prediction is pretty effective. I imagine y'all have thoroughly explored the tradeoffs for JS, and given that JS is not a null-safe language or particularly easy to reason about, I would expect that it would have been very well positioned to benefit from trapping versus testing. So maybe that's a sign that testing is in fact not particularly expensive.
@skuzmich To clarify my perspective, I think there is potential value for non-null references in WebAssembly and that we should leave the door open for such an extension, but at the moment we have bigger concerns to address. For example, I am very concerned about your statement that WebAssembly is not currently viable for separate compilation. It will be easier to make the GC MVP support separate compilation if the MVP can—for now—assume all value types are defaultable. Non-null reference types are currently the only value type in any proposal that would break that simplifying assumption, and their advantages don't seem to merit the complexity—for now. So, for me, this issue is a matter of prioritization.
from function-references.
I think it is very likely that other Wasm engines will implement null checks with trap handling because they are so common. Wizard isn't at the stage yet where that optimization matters (no JIT yet), but it already does effectively do this, using an implementation-language `null` to represent a Wasm `null`, though that is packed into an ADT for the interpreter.
from function-references.
The main concern there is that we wouldn't want the thing to require special-casing all over the place (GC marking phase, write barrier, ...), so it must "look like a regular heap object" including a shape descriptor and a proper page header and whatnot.
@jakobkummerow
I'm not familiar with the implementation, but would it be possible to allocate "null" object at the end of the page and protect the next page, so that it would have a proper accessible header, but user-level field offsets would be protected, enabling trap trick?
from function-references.
Thanks for all the comments so far, everyone!
Overall it seems like there may be a performance benefit here. We will just need to measure it to be sure. And it doesn't sound like there are any non-performance benefits.
If we don't get clear numbers on this being a useful speedup, then as @RossTate said, it may make sense to defer non-nullability for later (as long as we don't do anything to prevent it from happening later). I think that may have value, since non-nullability does add some complexity here (which surprised me); in particular, `let` is a new structural construct with some new features. But we can see what the performance numbers are first.
from function-references.
Hi all,
To put some concrete numbers on the potential performance benefits of non-nullable types, I ran some experiments on our V8 prototype. Our prototype already optimizes away null checks if the static type of the object is known to be non-nullable.
I created a loop which repeatedly applies an operation requiring a null check on a local value. I measured the running time of the loop when the value was nullable vs. non-nullable. To isolate the effect of the operation from the loop jump and counter increment, I also ran an empty loop (which just increments a counter).
Note that, since the operation was always run on the same non-null object, the branch prediction should have hit every time. The results might be different otherwise.
Unfortunately, the benchmark is quite flaky, but these are the trends which I observed:
- For `ref_test` and `ref_cast`, the normalized speedup (subtracting the running time of the empty loop) was about 13-19%.
- For `struct_get` and `array_len`, it was about 33%.
- For `struct_set`, about 50%.
- For `array_get` (with a constant index), about 67%.

Of course, speedups were lower when the total time of the loop was used instead, and ranged between 12-50%.
These numbers are of course specific to the situation of a tight loop with a specific payload, and are not likely to be measured in an actual program. However, I think they do show that non-nullability is an effective feature.
from function-references.
This is not to detract from the discussions here and about `let`, but I am still concerned for the languages that were designed with the premise that `null` accesses would fault rather than have to be branched on, so I wanted to prompt some more discussion on that issue.
@skuzmich made a clever suggestion in #40 (comment). I reread the blog on pointer compression, and I think it should be compatible with that. I suspect the GC already sets up some space for permanent objects created during initialization, so my guess is that you could implement this by putting the `null` value at the very end of that space and having a protected page immediately after it (or otherwise at the end of "old" object space). @jakobkummerow, what are your thoughts?
from function-references.
It might be possible to create such a `null`, yes. I'm still somewhat skeptical about it:
- we couldn't use this for the JavaScript `null` (at least not without significant reengineering effort), so we'd need a separate Wasm `null`, with corresponding translation at the boundary
- a "trappable null" would avoid explicit null checks for `(struct|array).(get|set)`, but wouldn't help for `ref.is_null` or `br_on_null` or `ref.as_non_null` (in the latter case, because we'd have to be accurate about trapping right there at that instruction, so we couldn't simply skip/defer the check).
Also, in general, I wouldn't want the specification to rely on engines implementing one particular optimization technique. It's good to know that certain patterns can be optimized through advanced hackery, but it's also good to assume/allow implementation flexibility: what's viable in one engine today might not be viable in another engine, or in a future with different constraints and/or fundamental design choices, or on other platforms (e.g. 32-bit).
from function-references.
Sounds good! I think that the current spec, because it allows `struct.get` and such on `null` references, does not force any one implementation strategy. And I think that generators of languages with frequent null references would have no need to change how they generate wasm code depending on whether they're targeting a faulting or branching implementation of `null` checks (provided streaming compilers for branching engines do basic null-pointer reasoning so that successive field accesses do not incur repeated checks, which I hear is the case). And having non-nullable types for languages with infrequent null references means they are not unnecessarily forced to rely on having a faulting implementation.
So, altogether, my sense is that the spec should not be changed assuming a faulting implementation of `null`, but that at the same time it would be good to explore such an implementation, because languages like Java seem to stand to benefit from it regardless of the spec.
from function-references.
The idea is fine (and not new, I think; ISTR Lisp implementations have used this for nil in the past), but it is not performance-neutral. Accesses will be faster, but tests will be slower. As the bit pattern for the null value is now something specific rather than all-zeroes, the null value has to be loaded (from TLS, say) for anything that compares with null. Frequently the TLS pointer is in a register, so this is often just a single load, and it can be commoned if the value is used frequently, but then it occupies a register. In contrast, compare-with-zero-and-branch is fast and compact on most systems. Additionally, as Jakob says, the JS null will be a different value; this means significant complexity at the JS/wasm boundary. This worries me more, actually.
(Finally, the wasm null must be the same everywhere which means managing a shared resource somehow, but this seems like small potatoes.)
from function-references.
Consensus is to include non-null types, so closing this.
from function-references.