
Comments (99)

askeksa-google avatar askeksa-google commented on May 31, 2024 3

Just to quickly recap, the actual numbers for barista3 are 5.8% uncompressed and 6.4% compressed (zopfli).

from function-references.

kripken avatar kripken commented on May 31, 2024 3

@eqrion

I would just like to better understand why wasm-GC binary size is 2x JS binary size.

Things are still evolving so any specific measurement may end up changing. But I think these are some of the main factors, which include things you mentioned:

  • Inlining. While engines can inline at runtime, I don't think we can assume they'll do as well as JS (JS is higher level so there is more to work with). Also runtime inlining takes time to warm up and may differ between VMs. So we inline more than JS toolchains do, and this helps benchmarks.
  • Library support. Atm we're shipping Java library code for strings, for example - the Strings proposal should remove some of that, but not all. JS has a great standard library that helps lots of things. Sometimes we can call out to JS (e.g. to do Math.log, rather than bundle a log impl) but that has runtime costs.
  • Repeating code patterns like "if null, throw an exception". In JS the code can rely on the VM, but in wasm we'd get a nonrecoverable trap - but J2Wasm needs the ability to catch some of these. Lots of other patterns exist, like x.push in JS is just a few bytes but in wasm it's going to need to handle reallocating the backing array etc.
  • Lots of small stuff like a.x in minified JS is a read from a field in 3 bytes (in wasm that takes more with or without annotations), JS not having type annotations in general (regardless of the current topic, the annotations we definitely will keep are a significant source of entropy), JS having "overloading" ("+" works on numbers or strings), etc. etc.
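Several of the bullets above are about logic that a JS VM provides implicitly but a compiled Wasm module must carry inline. As a toy illustration (in Python, not J2Wasm's actual output), here is roughly what hides behind a one-expression `x.push` once the backing-array handling becomes explicit:

```python
# Toy illustration (not J2Wasm's actual output): the logic a Wasm module
# must carry inline for something JS writes as `a.push(x)`.
class GrowableArray:
    def __init__(self):
        self.backing = [None] * 4  # fixed-size backing "wasm array"
        self.length = 0

    def push(self, value):
        # In JS, the VM hides all of this behind Array.prototype.push.
        # Compiled to Wasm, every step below becomes explicit instructions.
        if self.length == len(self.backing):
            # Reallocate: new backing array at 2x capacity, copy elements over.
            new_backing = [None] * (len(self.backing) * 2)
            for i in range(self.length):
                new_backing[i] = self.backing[i]
            self.backing = new_backing
        self.backing[self.length] = value
        self.length += 1

a = GrowableArray()
for i in range(10):
    a.push(i)
print(a.length, len(a.backing))  # 10 elements, capacity grown 4 -> 8 -> 16
```

In JS the whole operation is a few bytes of source; in Wasm each branch, copy loop, and store above is its own sequence of instructions, which is one reason the binaries come out larger.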

Those are sorted in my current best guess at relative importance from the most to the least, but again, that's imprecise and may change.

I'm not sure about the data/elem sections. J2Wasm doesn't emit them, but it does emit equivalent stuff in arrays and structs. It's hard to measure that since it's interleaved in the code.

@rossberg

Native code also tends to be much larger than JS source code, probably by more than 2x.

That reminds me, I actually measured JS compared to native code over a decade ago 😄 yes, as you said, JS is surprisingly compact - and it's only gotten smaller since then!

Yes, this is somewhat expected. But the bottom line is that this matters. If wasm is (say) 2x larger than JS, then that will prevent some people from being able to use it. The smaller it is, the more people can use it, and the better it will work for those that do. I wouldn't go as far as to say that "if we make wasm 5% smaller it will be 5% more impactful", but there is a positive connection there.


jakobkummerow avatar jakobkummerow commented on May 31, 2024 2

I can offer two data points:

Type annotations also benefit baseline compilers. Otherwise baseline compilers have to track full types (rather than machine representations) to model the stack effects of instructions like call.ref or knowing the register class of the result of struct.get.

While that sounds good in theory, V8 does not care about this property, for two reasons:

  • We wouldn't want the code duplication of having one decoder implementation that tracks types (which is required for validation anyway), and another (for compiling) that doesn't. So we use a single (templatized) implementation; as a consequence there's practically no cost to having the compilers track static types of value stack entries.
  • In order to apply simple ("peephole", non-flow-dependent) optimizations, we need to track types on the stack. For example, for a sequence (struct.new ...) (struct.set ...) (or, now that we have 1a-NNLs, for a sequence (local.get $nnl) (struct.get ...)), in order to skip the null-check in struct.set, we need to track the type of the value stack entry. There's no benefit to tracking nullability-but-not-type of the value.

I would not find it surprising if many/most other production-quality engines found themselves in the same situation.


In the CalcEngine module that's produced by the current versions of J2Wasm and Binaryen, we have:

instruction     count   average bytes   fraction of code section
struct.get     138138       4.25                6.1%
array.get       22708       3.02                0.7%
struct.set       8844       4.55                0.4%
struct.get_s     1714       4.68                0.1%
array.set         987       3.32                0.0%
array.get_u       677       3.00                0.0%
array.get_s        75       4.00                0.0%
struct.get_u       30       5.00                0.0%

The total size of the Code section is 9.7 MB, so saving one byte (a lower bound for what dropping type immediates achieves) on each of these instructions would save 173 KB, or 1.8%. These are uncompressed numbers, and I don't have an easy way to measure the effect on compressed modules; lacking a concrete measurement, I would guess that the relative savings are similar (which is what we saw for previous questions of "should we drop some bytes here or there"). Note that there is some fairly obvious opportunity to save code size in that module, so the relative impact of these type annotations will grow after we've picked other low-hanging fruit; also, if we had to add immediates to call_ref (and maybe others?), that would further increase the delta.
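As a sanity check, the 173 KB figure follows directly from the instruction counts; a quick recomputation (Python, numbers taken from the table above):

```python
# Recomputing the savings estimate from the CalcEngine table above.
counts = {
    "struct.get": 138138, "array.get": 22708, "struct.set": 8844,
    "struct.get_s": 1714, "array.set": 987, "array.get_u": 677,
    "array.get_s": 75, "struct.get_u": 30,
}
code_section_bytes = 9.7e6  # "The total size of the Code section is 9.7 MB"
saved = sum(counts.values())  # one byte saved per instruction
print(saved, saved / code_section_bytes)  # ~173 KB, ~1.8% of the code section
```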

Same table for Dart's "barista3" benchmark:

instruction     count   average bytes   fraction of code section
struct.get      23957       4.35               14.1%
struct.set      11040       4.63                6.9%
array.get         222       3.00                0.1%
array.set         189       3.06                0.1%
array.get_u       140       3.03                0.1%
array.get_s         6       3.00                0.0%

Saving one byte for each of these would add up to 35 KB out of 746 KB or 4.7%.

(To clarify: the "average bytes" are for the entire instruction, so e.g. 4.25 means that struct.get, on average, needs two bytes 0xfb 0x03 for the instruction itself and another 2.25 bytes for type immediate and field index immediate.)
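The fractional averages arise because type and field indices are LEB128-encoded: small indices fit in one byte, larger type indices need more. A minimal sketch of how the per-instruction byte counts above break down (the `uleb128_len` helper is illustrative, not from any cited tool):

```python
def uleb128_len(n: int) -> int:
    """Number of bytes needed to encode n as unsigned LEB128 (7 bits/byte)."""
    size = 1
    while n >= 0x80:
        n >>= 7
        size += 1
    return size

# struct.get is a two-byte prefixed opcode (0xfb 0x03, per the comment above)
# followed by a type index and a field index, each LEB128-encoded.
def struct_get_len(type_index: int, field_index: int) -> int:
    return 2 + uleb128_len(type_index) + uleb128_len(field_index)

print(struct_get_len(5, 0))    # 4 bytes: small indices fit in one byte each
print(struct_get_len(300, 0))  # 5 bytes: type index 300 needs two bytes
```

An average of 4.25 bytes thus means most type indices in that module fit in one byte, with a tail of larger indices pushing the average up.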


eqrion avatar eqrion commented on May 31, 2024 2

I would not find it surprising if many/most other production-quality engines found themselves in the same situation.

It's already the case that different production engines have made different tradeoffs. For example, the SpiderMonkey baseline compiler has its own dedicated decoding logic and its abstract stack does not model full types (instead, kinds, constants, register classes, and spill status).

I don't have an opinion on the larger issue here, but just wanted to clarify this point.

Our baseline compiler performs function validation at the same time as code generation. Function validation requires the value stack, so we always have the full typing of the stack available. For code generation, we keep a parallel stack that models kinds/constants/register classes/spill status as you say.

If we were to (for some reason) pre-validate functions before compilation, then theoretically we could try to eliminate the full type tracking. I could see a reason to do this for an interpreter, but I'm unsure of the benefit for a compiler. Even if you compile functions lazily, my understanding is that it's legal to defer validation of a function until the point it is called for the first time.


jakobkummerow avatar jakobkummerow commented on May 31, 2024 2

I've hacked up a standalone module processor that lets me drop and restore unnecessary bytes at will :)

So, here are actually-measured numbers for the Sheets module:

module size           uncompressed      gzip     brotli
current                   10632532   1995346    1284238
without immediates        10417566   1950247    1250531
saved (absolute)            214966     45099      33707
saved (relative)              2.0%      2.3%       2.6%
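The relative rows follow from the absolute ones; a quick recomputation (Python, values from the table above):

```python
# Recomputing the saved rows of the Sheets table above.
current = {"uncompressed": 10632532, "gzip": 1995346, "brotli": 1284238}
without_immediates = {"uncompressed": 10417566, "gzip": 1950247, "brotli": 1250531}

for key in current:
    saved = current[key] - without_immediates[key]
    print(key, saved, f"{saved / current[key]:.1%}")
# uncompressed: 214966 (2.0%), gzip: 45099 (2.3%), brotli: 33707 (2.6%)
```

Notably, the relative saving is slightly larger after compression, i.e. the dropped immediates compress worse than the average byte of the module.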

As mentioned before, there's some unrelated low-hanging fruit in that module for saving binary size (making use of the stringref proposal, ordering globals by reference frequency, adopting NNLs, dropping array.len immediates), so I expect the relative impact of these annotations to grow.

The nice thing about an experimental tool is that I can run more experiments. E.g. this is what happens when I introduce one-byte shorthands for local.get 0, local.get 1, local.get 2, local.get 3, i32.const 0, i32.const 1, struct.get, struct.set, array.get (saving one byte each), with or without dropping type immediates:

module size                        uncompressed      gzip     brotli
shorthands, dropping immediates         9722691   1923694    1246768
saved (absolute)                         909841     71652      37470
saved (relative)                           8.6%      3.6%       2.9%
shorthands, keeping immediates          9937707   1966772    1281289
saved (absolute)                         694825     28574       2949
saved (relative)                           6.5%      1.4%       0.2%

Apparently compressors (brotli in particular) are indeed pretty good at compressing repeated 2-byte sequences, whereas the high impact of dropping type immediates shows that those immediates carry a lot of entropy, and there's no easy alternative way to get comparable compressed savings.

And this is why it matters:

module size                     uncompressed      gzip     brotli
JavaScript version                   4689150   1294864     906739
JS smaller than current Wasm           55.9%     35.1%      29.4%

We don't want deployments on constrained connections or devices to prefer compiling stuff to JavaScript because it's smaller, do we?

@titzer: To clarify, all new*/test/cast instructions need their immediates, because they otherwise can't possibly know what they're supposed to do. So dropping immediates is only an option for the eight {struct,array}.{get*,set} instructions.


eqrion avatar eqrion commented on May 31, 2024 2

Also, #69 already shows that even your base assumption is false. You may or may not consider that case a real problem, but the fact is that, short of annotations, we need to extend the typing rules for all relevant instructions and introduce special cases, one way or another.

I don't think there is disagreement on whether #69 is a real problem. I think the disagreement is on whether it motivates F-R/GC to change or whether dead code should be changed.

Dead code is vanishingly rare. It effectively only exists in spec tests and fuzz tests. I would advocate for changing dead code whenever we run into an issue caused by a well-motivated instruction/type.

The problem is that issues typically result from an interaction between multiple features. And the place that then needs an annotation may be a pre-existing instruction that we cannot change anymore. Again, see select. There, we got away with a hack to keep the old version meaningful, but it may not always be that easy, depending on the details of the subtype relation and the offending properties.

I'm interested in more details here, as I would have thought how we handled select could be a guide here.

Specifically, I would view the non-type-annotated struct.get/set/etc. instructions as convenient shorthands to reduce binary size for very common instructions (something there is precedent for). But a shorthand may have cases it cannot handle, which would necessitate adding a larger instruction to be emitted in those cases (something there is precedent for with select). I don't think this particular combination of precedents has been used before, but it seems valid to me.


rossberg avatar rossberg commented on May 31, 2024 1

@tlively:

I really don't see how we could end up in a place where we do not know enough about the type on top of the stack to be able to validate a field access. If we didn't, then it seems we wouldn't have enough information to validate local.set, a function call, or any other consuming instruction, either.

It's quite a bit more subtle than that. There are other sorts of problem that can arise, e.g., that the input type does not provide enough information to compute a (principal) output type, when there is a dependency. For example, we may have to synthesise that type, which may involve synthesising or guessing a type definition, which doesn't fit Wasm's setup. Or we may have to compute lubs/glbs (see select).

Also, #69 already shows that even your base assumption is false. You may or may not consider that case a real problem, but the fact is that, short of annotations, we need to extend the typing rules for all relevant instructions and introduce special cases, one way or another.

And beyond just dead code, this example demonstrates a more general problem: subtyping does not necessarily tell you enough about the super type required in a rule to check all the rule's side conditions accurately (or produce an accurate output type). Depending on what forms of subtyping we may want to add in the future, other variants of this problem could crop up.

If a future type system extension needs more annotations, it will be on instructions that push annotated values onto the stack, not on instructions that pop them off.

The problem is that issues typically result from an interaction between multiple features. And the place that then needs an annotation may be a pre-existing instruction that we cannot change anymore. Again, see select. There, we got away with a hack to keep the old version meaningful, but it may not always be that easy, depending on the details of the subtype relation and the offending properties.

I know this all is vague, but type systems are very subtle and fragile, much more so than many folks perhaps realise. It's a safer bet to keep a "buffer of robustness" where we can. That is a long-term concern that, to me, vastly outweighs minor size wins.


tlively avatar tlively commented on May 31, 2024 1

(credit to @jakobkummerow, not me, for the experiments)

I would personally change my mind about this if we could demonstrate a toy type system extension that purposefully breaks annotation-free struct.get without requiring us to overhaul unrelated parts of the existing system. If we can find such an extension on purpose, then I would believe that we might encounter a similar one accidentally in the course of normal development. On the other hand, if we can't even find such an extension on purpose, then I don't think it's worth trying to plan for.


RossTate avatar RossTate commented on May 31, 2024 1

@tlively Found it: https://github.com/WebAssembly/meetings/blob/main/gc/2020/GC-12-01.md


titzer avatar titzer commented on May 31, 2024 1

I wrote before:

Suppose we didn't know that we could save 2-5% module size without dropping annotations. Would MVP with annotations be unshippable due to prohibitive space overhead?

I ask this unironically, and not as a gotcha. Is MVP with annotations shippable, if we accept 5% space overhead?

I pointed out in the meeting that omitting type annotations is effectively using the type reconstruction of validation as a compression scheme. If we were to lean into that, we could imagine a Layer 1 compression scheme specifically designed to leverage type reconstruction, and we could do even more, because we could compress local declarations by omitting their type declarations for ones that are definitely initialized. I believe those optimizations are not available to any general purpose compression scheme out there, since they are tied to Wasm semantics.
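The idea of leaning on type reconstruction as a compression scheme can be sketched in a few lines. This is a toy codec over a hypothetical symbolic instruction format (not the real Wasm binary encoding): the encoder drops the type immediate from a field access, and the decoder recovers it by modeling the type stack the same way validation does:

```python
# Toy sketch of "type reconstruction as compression" (hypothetical symbolic
# instruction format, not the real Wasm binary encoding).

# Module-level type section: struct type index -> list of field types.
TYPES = {0: ["i32", "f64"], 1: ["i32"]}

def encode(instrs):
    """Drop the (reconstructible) type immediate from struct.get."""
    out = []
    for instr in instrs:
        if instr[0] == "struct.get":
            _op, _type_idx, field_idx = instr
            out.append(("struct.get", field_idx))  # type immediate omitted
        else:
            out.append(instr)
    return out

def decode(compressed):
    """Recover dropped immediates by tracking types on a value stack,
    exactly as a validator would."""
    stack, out = [], []
    for instr in compressed:
        if instr[0] == "ref.new":      # pushes a reference of a known type
            stack.append(instr[1])
            out.append(instr)
        elif instr[0] == "struct.get":
            type_idx = stack.pop()     # recovered from the modeled stack
            field_idx = instr[1]
            stack.append(TYPES[type_idx][field_idx])  # push the field type
            out.append(("struct.get", type_idx, field_idx))
    return out

original = [("ref.new", 0), ("struct.get", 0, 1)]
assert decode(encode(original)) == original  # lossless round trip
```

As the comment above notes, a scheme like this exploits Wasm's validation semantics, so no general-purpose compressor can replicate it.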


rossberg avatar rossberg commented on May 31, 2024 1

@tlively, to decide whether something is the only possible type you generally need an algorithm that can actually compute and enumerate all possible types, and then check that it produces a unique one. If you don't have the principal types property (which is the scenario we're considering), then such an algorithm is likely exponential. There may be simpler conditions to decide this in specific scenarios, but not in general.

Re substitutability: that is the core property you want from subtyping. The type system would make no sense without it.


conrad-watt avatar conrad-watt commented on May 31, 2024 1

I think we can sidestep the specifics of the graph colouring construction in that presentation by observing that we can't define a convincing linear scan validation algorithm for struct access instructions consuming an opaque type described only by multiple struct-shaped supertype bounds, if said struct access instructions don't have input type annotations. IIUC that's the "essence" that's important here.
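A toy version of that essence (hypothetical types, not an actual Wasm extension): give an imported type two struct-shaped upper bounds, and an annotation-free field access no longer has a unique principal result type for a linear-scan validator to assign:

```python
# Toy illustration (hypothetical types, not a real Wasm extension): an
# imported type $V with two struct-shaped upper bounds. Without an input
# type annotation, a linear-scan validator cannot assign struct.get a
# unique principal result type.

STRUCTS = {"$A": ["i32"], "$B": ["f64"]}  # two struct supertypes
BOUNDS = {"$V": ["$A", "$B"]}             # $V <: $A and $V <: $B

def field_types_for(value_type, field_idx):
    """All result types an unannotated struct.get could have here."""
    if value_type in STRUCTS:
        return {STRUCTS[value_type][field_idx]}
    # An opaque bounded type: the access could go through any of its bounds.
    return {STRUCTS[b][field_idx] for b in BOUNDS[value_type]}

# With a concrete struct type on the stack, the result type is unique:
assert field_types_for("$A", 0) == {"i32"}
# With only the bounded import, there is no unique answer: the validator
# needs an annotation (or a search) to choose $A's i32 or $B's f64.
assert field_types_for("$V", 0) == {"i32", "f64"}
```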


RossTate avatar RossTate commented on May 31, 2024 1

My understanding is that $V would not be allowable as the annotation for struct.get; the annotation has to be specifically a struct type. That would mean each of the initial struct.get instructions would have to be annotated with the specific struct supertype of $V that makes the program type-check, so each such annotation would in turn specify the color of the respective vertex that demonstrates the graph is 3-colorable.


eqrion avatar eqrion commented on May 31, 2024 1

Thanks Ross for the answer, that helps a lot.

Subsumption is a property of a fixed instruction: if a given fixed instruction type-checks with certain inputs, that same instruction must necessarily type-check with more precise inputs. A type annotation is part of the instruction, so changing the type annotation results in a different instruction; as such, subsumption says nothing about the relationship between these instructions.

So if the abstract syntax of struct.get included a type annotation, but we introduced a fallible type inference algorithm (similar to how @titzer framed the issue above) to generate that annotation from a binary format that doesn't include it, would that strictly violate subsumption? (This is in the world where multiple upper-bounded type imports are a thing.) I'm just trying to understand the options for the future; a fallible type inference algorithm could probably be introduced at the time it would solve an issue.

Personally, I would agree that it's unclear whether we'll have multiple upper-bounded type imports ever and if so, what restrictions we could add on them to accommodate design decisions we make now.

I also go back and forth on whether a 6% binary size reduction is acceptable for an MVP. If current Wasm-GC ports are ~2x the size of equivalent JS, that 6% would need to be part of a larger story of getting to parity. It would help to have that larger discussion to be able to prioritize this 6%.


kripken avatar kripken commented on May 31, 2024 1

About the importance of binary size, I look at it like this: if we could easily reduce the size of all images on the Web by 5% then we'd want to do that. Likewise for video, JS, etc. The scale of the Web is so big that even a 5% win on one format matters a lot! Maybe one user can't easily notice it by themselves, but plenty of data shows that even small changes to download size have real-world impacts on average user experience. And 5% less total bandwidth across the entire Web is a significant amount of work and electricity over billions of downloads.

Because of that there are various projects working to shrink images, video, JS etc., both on the spec side and tools side. As wasm rises in popularity its size will matter more and get closer to those. In fact we already do a lot of work for code size on the tools level, in Binaryen and elsewhere - that has been motivated by existing important real-world use cases that care about binary size today.

So even 5% is worth thinking about. Yes, there may be other ways to shrink size (L1 compression), and maybe we'll get to them, but that's uncertain. And, yes, maybe we are 2x larger than JS now, and maybe 5% won't fix that, but it's still 5% better. Whenever we have a good practical way to reduce size that's worth considering. Of course, we need to consider the other side of the tradeoff as well.


RossTate avatar RossTate commented on May 31, 2024

More examples of instructions that could or could not have redundant type annotations:

  • input type annotation of block
  • input type annotation of if
  • output type annotation of loop
  • input type annotation of try
  • input and output type annotations of let

There will likely be many more such instructions in future proposals. For example, input and output type annotations for many of the instructions for stack primitives are redundant.


rossberg avatar rossberg commented on May 31, 2024

@RossTate, yeah, but as mentioned elsewhere, block typing has sailed long ago, and changing it now would be rather tough. In any case, it's outside the scope of this proposal.


RossTate avatar RossTate commented on May 31, 2024

That discussion indicated that the rationale behind the decision was that eliminating type annotations would complicate the pseudo-code interpreter by requiring end to check if the construct had an associated out-annotation and, if so, to restrict the outgoing types to that annotation and check them for compatibility.

That decision was also made at a time where only one block-like structure did not need an out-annotation, making the complexity seem unnecessary. But let will be another example where such an out-annotation is unnecessary, and there will likely be more examples down the line. let also introduces new complexities to end, such as cleaning up the local-variable stack, meaning that end will become more complicated anyways. So the landscape is changing in a way that suggests the decision should be reevaluated, and this proposal is the first sign of that change.


tebbi avatar tebbi commented on May 31, 2024

These type annotations increase binary size and mean more work for validation, but I don't see any clear benefits.

In a fast single-pass compiler, you probably want to do validation together with code generation anyway, meaning the information in these immediates is available anyway. For a slower, optimizing compiler used for tier-up, this information could be cached or should be cheap enough to re-compute. For an interpreter working directly on the wire bytes, it could in principle be helpful to have this type information right there, at least for some operations. But the Wasm binary format is already hard to interpret without additional side-table information, and what you really want in an interpreter is often platform-specific, like real offsets in the case of Wasm GC object access. So this information needs to be put in a side-table or engine-specific internal bytecode anyway.

So I really can't imagine any situation where these immediates would be useful enough to pay for the increased size in the transport format, especially since that price is paid by all users regardless of whether their engine even profits from it.


taralx avatar taralx commented on May 31, 2024

For those like me who are curious about the history, this pull request appears to be where the blocks got types. It seems like the primary concern at the time was knowing how many stack items are retained by a branch ("arity").


RossTate avatar RossTate commented on May 31, 2024

Thanks, @taralx! Historical context is always super useful, and I expect everyone involved in that conversation appreciates being saved from repeating it 😄 Looking through it, it seems the option I am suggesting was never discussed: every block-like construct (i.e. one that has an end) specifies an arity to know how many values to leave on the stack (I suspect 0 is extremely common), and if the block-like construct provides a branch target (e.g. block, if, and loop), then it specifies a "label" type for the target (e.g. [i32, i32]). Now, given what already exists, it might just be easier to say that block-like constructs providing a branch target use block types, and those not providing a branch target (like let) use an arity. But we should try to reduce how much we rely on the compression granted by using type indices—as WebAssembly's type system grows, there will be fewer and fewer identical block types within a module.


rossberg avatar rossberg commented on May 31, 2024

To clarify, let is consistent with other block constructs and binds a label -- as mentioned elsewhere, treating all block constructs uniformly as branch targets was the group's decision long ago.

That's implicit in the typing rule as stated in the proposal, though it may not be super-obvious. I added an explicit note (and fixed a bug on the way :) ).


RossTate avatar RossTate commented on May 31, 2024

Yes, I understand let currently binds a label solely due to a decision made long ago when all block constructs naturally bound a label. Above I gave reasons to revisit that decision. Could you link to the relevant discussion behind that decision so that we can see if/how its rationale carries over to let and other non-control-flow constructs (in addition to the one linked above, which I've already tried to incorporate into the discussion here)?


rossberg avatar rossberg commented on May 31, 2024

Unfortunately, there are no notes of that, as far as I'm aware; it was early times.

FWIW, I argued against that choice at the time, and explicitly raised the question whether this should then apply to all block constructs that we may add in the future. The resolution was yes, because otherwise the primary motivation (simplicity of decoders and validators, avoiding a separate stack for tracking labels) would have been moot.

That doesn't mean that we can't revisit, but none of the technical arguments to do so are specific to let. So such a discussion would probably deserve a more general topic and any resolution should consistently apply to all relevant block instructions. I would hence prefer to separate this particular discussion from the current proposal. Focus helps progress.


jakobkummerow avatar jakobkummerow commented on May 31, 2024

+1 to not requiring redundant type immediates for new instructions, in the interest of saving module size. Obviously, instructions that create new values must know a type for them, whereas instructions that consume stack values don't need to encode the expected type of those values.

From an engine implementation point of view, redundant type immediates only make us do more work at validation time, because equivalence of actual stack value type and expected type (per immediate) must be checked. Having the immediates provides no (performance or simplicity or safety) benefit. (Direct wire-byte interpreters might be an exception to this "no benefits" statement, but (1) as @tebbi explains above even that is unclear, and (2) it's also unclear whether that's the implementation strategy that Wasm should optimize for -- wouldn't most users of most engines expect to get the speed that comes from compilation to machine code?)

From a language point of view, I can see that such additional "consistency checks" might be helpful for a human-written language, but they seem significantly less useful for a compilation target.

(For background, I'm mostly looking at this from the GC proposal's perspective, but it's a design decision that should ideally be consistent. For example, struct.get could skip its type parameter and only have a field index, the semantics being "retrieve the n-th field from whatever struct is currently on top of the value stack; fail to validate if that struct doesn't have enough fields". In this proposal, ref.as_non_null could easily be polymorphic and simply return a non-null reference for whatever nullable reference type it found on the stack.)
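The annotation-free struct.get semantics sketched in that parenthetical can be written out as a few lines of validator pseudocode (a minimal sketch in Python, assuming a simple type-stack model):

```python
# Sketch of the annotation-free validation rule described above:
# struct.get carries only a field index, and validation resolves the
# field type from whatever struct type is on top of the value stack.

TYPES = {0: ["i32", "f64"]}  # struct type index -> list of field types

def validate_struct_get(type_stack, field_idx):
    struct_idx = type_stack.pop()          # type of the consumed reference
    fields = TYPES[struct_idx]
    if field_idx >= len(fields):
        # "fail to validate if that struct doesn't have enough fields"
        raise ValueError("validation failed: struct has too few fields")
    type_stack.append(fields[field_idx])   # push the n-th field's type

stack = [0]                     # a reference to struct type 0 on the stack
validate_struct_get(stack, 1)
print(stack)                    # the f64 field type is now on the stack
```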


lars-t-hansen avatar lars-t-hansen commented on May 31, 2024

One argument in favor of out-annotations on block-like structures is that, in a situation where the block leaves multiple values on the stack on the way out, a streaming compiler can pre-allocate space for those values to go into (they won't all fit in registers). As it is, our baseline compiler does not make use of that information and instead ends up emitting a fair amount of data-shuffling code in some cases. I believe it would probably be possible to get rid of that shuffling code if we knew something about how much space will be required along the exit edge from the block.

I don't know that this is a particularly strong argument for us, because it's a baseline compiler - some inefficiencies are to be expected - and because I've not done the detailed analysis, nor am I likely to. I guess there could also be streaming compilers that are not quite so simplistic and would benefit more from information up-front.


RossTate avatar RossTate commented on May 31, 2024

Hmm, interesting idea. But I suspect you'll have to have a bunch of shuffling code with that anyways. That is, even if you preallocate the space, you don't know which instruction outputs end up going into which spots until you reach the end of the block, so either you'll have to add shuffling code or you'll have to backpatch the code, which is probably what you would have done anyways 😕

But now that I understand where @rossberg meant to go with this post (sorry for the confusion, though the opening arguments do apply just as easily to block-like instructions), and that the pull requests for the relevant changes are in progress, it's probably best to let him close the issue when those are done and I'll open up another one to discuss the topic of annotations for blocks elsewhere.


binji avatar binji commented on May 31, 2024

it's probably best to let him close the issue when those are done and I'll open up another one to discuss the topic of annotations for blocks elsewhere.

Sgtm, I'd suggest the design repo, since this could be a separate proposal.


RossTate avatar RossTate commented on May 31, 2024

Thanks for the suggestion! I was wondering that exact question 😄


RossTate avatar RossTate commented on May 31, 2024

Done: WebAssembly/design#1352


titzer avatar titzer commented on May 31, 2024

I'll repeat here (with some delay) my argument for keeping at least some type annotations that "seem" redundant:

Just off the top of my head for call_ref, an interpreter could benefit from a function type annotation to know how many arguments to pop, without having to inspect the value and get to the function's (dynamic) type. In general, the reliance on types only known from the operand stack can manifest in the interpreter as instead requiring dynamic type information from a value operand. (select was different because the interpreter can cheat with a universal value representation and doesn't need to inspect the true/false values.) So I would use this as a guide going forward: don't unnecessarily introduce a dependency on having dynamic type information attached to values when run on an interpreter.

Basically, we should not trade static annotations for dynamic metadata and checks thereof. Even if a system doesn't spend any significant execution time in its interpreter tier, it has to pay the space costs of metadata forever.
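The tradeoff described here can be made concrete with a toy stack machine (hypothetical value representation, not any real engine): with a static type annotation, call_ref knows the argument count up front; without it, the callee value must carry dynamic type metadata that the interpreter inspects first:

```python
# Toy interpreter sketch of the call_ref tradeoff described above.

FUNC_TYPES = {0: 2, 1: 0}  # function type index -> parameter count

class FuncRef:
    def __init__(self, type_idx, impl):
        self.type_idx = type_idx  # dynamic type metadata carried by the value
        self.impl = impl

def call_ref_annotated(stack, type_idx):
    # Static annotation: argument count is known without touching the value.
    n = FUNC_TYPES[type_idx]
    func = stack.pop()
    args = [stack.pop() for _ in range(n)]
    return func.impl(*reversed(args))

def call_ref_unannotated(stack):
    # No annotation: must inspect the callee's dynamic type metadata first,
    # which every funcref value then has to carry forever.
    func = stack.pop()
    n = FUNC_TYPES[func.type_idx]
    args = [stack.pop() for _ in range(n)]
    return func.impl(*reversed(args))

add = FuncRef(0, lambda a, b: a + b)
assert call_ref_annotated([3, 4, add], 0) == 7
assert call_ref_unannotated([3, 4, add]) == 7
```

Both reach the same result; the difference is that the unannotated form bakes a dependency on per-value dynamic metadata into the design, which is the space cost the comment above warns about.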


rossberg avatar rossberg commented on May 31, 2024

We still need to resolve this. Please let's focus on new instructions consuming reference operands, of the sort occurring in this and the GC proposal, e.g. call_ref, array.get, etc. Currently, some of these annotate the reference type, some don't. I think there is agreement that we should be consistent, but less so in which way.

AFAICS, the main arguments in this discussion so far have been:

Against annotations:

  • smaller code size
  • lower validation cost

For annotations:

  • less work for interpreters
  • lower risk of future issues with the type system

For previous discussion, see also WebAssembly/gc#241.


titzer avatar titzer commented on May 31, 2024

Type annotations also benefit baseline compilers. Without them, baseline compilers have to track full types (rather than just machine representations) to model the stack effects of instructions like call_ref, or to know the register class of the result of struct.get.

titzer commented on May 31, 2024

I would not find it surprising if many/most other production-quality engines found themselves in the same situation.

It's already the case that different production engines have made different tradeoffs. For example, the SpiderMonkey baseline compiler has its own dedicated decoding logic, and its abstract stack does not model full types (only kinds, constants, register classes, and spill status). That pays off: the SpiderMonkey baseline compiler generates code 4x faster than Liftoff.

While current production engine tiering configurations inform designs, we shouldn't allow them to dictate tradeoffs made in other or future designs. This is what motivated my comments above. An entirely reasonable 3-tier design (interpreter, baseline compiler, optimizing compiler), or a design where validation is separate from baseline compilation for another reason, would be penalized by forcing it to model types.

There are at least two other instances that I can think of where modeling types (i.e. reconstructing the validation algorithm) is a detriment: the dynamic control stack in wamr-classic, and lazily generating stack maps for GC. In general, though we are focused on execution tiers here, any consumer that wants to model the value stack, even just for its height, would benefit from the property that I outlined: the immediates are enough to correctly model the stack height/kind at every instruction.

manoskouk commented on May 31, 2024

It seems that both V8 and SpiderMonkey always reconstruct types along with code generation. This is a strong argument to remove type annotations and save ~5% on binary size. I think forcing interpreters to maintain some information that they generate during verification anyway is a small price to pay for this benefit. Keep in mind that we are at a 2x binary-size deficit to JS, so any improvements are welcome.

titzer commented on May 31, 2024

I'd like to see updated numbers on the binary format overhead since we've moved away from RTTs. In particular, I think casts will still have an immediate, but binaries should be smaller overall without additional RTT instructions.

eqrion commented on May 31, 2024

One question I have, if we don't have a type annotation for some new instructions such as call_ref, is how they are expected to validate in dead code (i.e. with stack polymorphism). Let me know if this was answered somewhere else.

(func
  unreachable
  call_ref
  (; do we have to determine the set of possible typed function references that we can materialize as the callee ref? ;)
)

Is there an implicit requirement for going down this road that we adopt relaxed dead code validation?

askeksa-google commented on May 31, 2024

I'd like to see updated numbers on the binary format overhead since we've moved away from RTTs. In particular, I think casts will still have an immediate, but binaries should be smaller overall without additional RTT instructions.

Here are some numbers for the barista3 benchmark for the current version of dart2wasm (nominal types, non-nullable locals), unoptimized. "Current" is the currently specified encoding, without the type index on array.len (since we are removing that in any case). "Without" is furthermore without the type index on struct.get*, struct.set, array.get* and array.set.

           Module,        Module,   Code section,   Code section,
           uncompressed   zopfli    uncompressed    zopfli
Current    1694335        521514    1030246         373537
Without    1635418        495679    971329          348648
Saving     3.5%           5.0%      5.7%            6.7%

Going in the other direction (adding type index to call_ref) is a somewhat bigger undertaking, as the dart2wasm code generator currently doesn't supply this type. But I can look into it if there's interest.
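For reference, the "Saving" row can be recomputed directly from the raw byte counts above (a quick Python check, nothing more):

```python
# Recompute the "Saving" row from the raw sizes in the table above.
current = {"module": 1694335, "module_zopfli": 521514,
           "code": 1030246, "code_zopfli": 373537}
without = {"module": 1635418, "module_zopfli": 495679,
           "code": 971329, "code_zopfli": 348648}

saving = {k: (current[k] - without[k]) / current[k] for k in current}
for k, v in saving.items():
    print(f"{k}: {v:.1%}")

# Matches the table: 3.5%, 5.0%, 5.7%, 6.7%
assert round(saving["module"] * 100, 1) == 3.5
assert round(saving["module_zopfli"] * 100, 1) == 5.0
assert round(saving["code"] * 100, 1) == 5.7
assert round(saving["code_zopfli"] * 100, 1) == 6.7
```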

eqrion commented on May 31, 2024

One question I have, if we don't have a type annotation for some new instructions such as call_ref, is how they are expected to validate in dead code (i.e. with stack polymorphism). Let me know if this was answered somewhere else.

(func
  unreachable
  call_ref
  (; do we have to determine the set of possible typed function references that we can materialize as the callee ref? ;)
)

Is there an implicit requirement for going down this road that we adopt relaxed dead code validation?

Paging the dead code conversation back in, I remembered that the bottom type should resolve this particular situation. ref.as_non_null is a bit trickier, as it refines a nullable reference type to a non-nullable reference type. I do not know if a bottom heap type is expected to be added to resolve this situation as well.

jakobkummerow commented on May 31, 2024

@eqrion : call_ref currently doesn't have any type annotation, and in unreachable code it just trivially validates. It doesn't modify the stack; since the stack is polymorphic anyway, other subsequent instructions will accept it just as happily.
ref.as_non_null is equally easy: when it encounters a polymorphic/unreachable stack, it leaves it as is.
We implement that using an internal-only "bottom" type; it doesn't need to be spec'ed because it's never visible, just an internal sentinel.

titzer commented on May 31, 2024

Just to be clear, this is what I am proposing:

  • array.new, array.get*, array.set: immediate is heaptype (decltype) index
  • array.len: no immediate
  • struct.new: immediate is heaptype (decltype) index
  • struct.get, struct.set: immediate is module-wide field index [1]
  • casts/tests: to be resolved per #274

[1] I believe having a module-wide index space for struct fields is both specific enough to achieve the property I identified (immediates are enough to know machine kinds and generate code) and also forward-compatible with importing field accessors. I think imported field accessors, though not necessarily a solution to high-level language interop on their own, are a necessary building block for many kinds of linking.

askeksa-google commented on May 31, 2024

A module-wide field index would be a nightmare for on-the-fly encoding producers like dart2wasm.

In terms of compressed size, it's even bigger than having both type index and field index.

eqrion commented on May 31, 2024

@eqrion : call_ref currently doesn't have any type annotation, and in unreachable code it just trivially validates. It doesn't modify the stack; since the stack is polymorphic anyway, other subsequent instructions will accept it just as happily. ref.as_non_null is equally easy: when it encounters a polymorphic/unreachable stack, it leaves it as is. We implement that using an internal-only "bottom" type; it doesn't need to be spec'ed because it's never visible, just an internal sentinel.

I have a follow-up question to this, and filed #69 to not distract this issue.

tlively commented on May 31, 2024

I added this issue to the agenda for the subgroup meeting tomorrow in case we want to talk about it in real time.

titzer commented on May 31, 2024

@askeksa-google I don't quite understand why a module-wide field index is bigger or why it is difficult for an on-the-fly producer. Could you explain a little?

rossberg commented on May 31, 2024

@tlively, FWIW, I probably can't make it tomorrow.

@eqrion:

ref.as_non_null is a bit trickier, as it refines a nullable reference type to a non-nullable reference type. I do not know if a bottom heap type is expected to be added to resolve this situation as well.

Yes, a (spec-internal) bottom heap type is definitely required, see WebAssembly/gc#311 (edit: and also the updated algorithm section for this proposal). Before, None was nicely doing that job, but with the 3-pronged type hierarchy, we'll need an extra bottom type below.

@askeksa-google:

A module-wide field index would be a nightmare for on-the-fly encoding producers like dart2wasm.

I have to agree with that, I'd highly prefer not to make such a change. It would make the type section rather non-compositional, as it makes reordering/removing/inserting/deduping type definitions and thus many program transformations much more difficult. Not to mention the specification having to jump through extra hoops to resolve such indices.

I am pretty sure there are multiple ways in which we can later solve the problem of separate compilation up to field offsets without relying on such a scheme for known ones.

titzer commented on May 31, 2024

@rossberg I don't see how a module-local field ordering is any different (or worse) than the module-local ordering we have for types. Pretty much all the issues you mention apply in exactly the same way to types in a recursion group, as they are numbered with module-local indexes and not group-local indexes.

rossberg commented on May 31, 2024

@titzer, to give an example, currently, you can choose or change the definition of a single type without affecting uses of any other type. With your proposed change, it would affect all uses of all types in the program.

I believe that would e.g. affect my compilers, because there are types whose index I "allocate" during codegen without knowing their definition yet (but already need to form refs to). Under your proposal, I would likely need to introduce an extra pass to resolve and patch up all uses of field indices (of completely unrelated types) in the program at the end of codegen. Sure, that's doable, but rather pointless complexity.

askeksa-google commented on May 31, 2024

@askeksa-google I don't quite understand why a module-wide field index is bigger or why it is difficult for an on-the-fly producer. Could you explain a little?

The increase in compressed size has to do with the kinds of patterns the compressor can recognize. The field index is often very small, which is a pattern in itself. And the type index strongly correlates with the preceding instructions, since for the same type, the value will often come from similar sources. The module-wide field index, on the other hand, will only be recognized by the compressor when the combination of type and field is the same, which happens much more rarely.

As for the module-wide field index being problematic for producers: It would mean that every time a type is added, or a field is added to a type, the index mapping changes, and all already encoded code has to be changed. Throw in a desire to reorder types (which is readily provided by the iso-recursive type system) and it quickly becomes a mess. Alternatively, the internal representation could pretend that the type and field index are separate, and then the combined index could be computed at the very end. This requires adding another layer of abstraction on top of the instruction representation which may otherwise not be needed.
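To illustrate the compression argument, here is a synthetic sketch (made-up opcode bytes and index distributions, not a real wasm encoding): a stream of small, frequently repeated type/field indices forms patterns a general-purpose compressor can exploit, whereas a module-wide index spread over a large space rarely repeats.

```python
# Synthetic comparison: "type index + small field index" immediates vs a
# single module-wide field index, compressed with zlib.
import random
import zlib

def leb128(n: int) -> bytes:
    """Unsigned LEB128 encoding, as used for wasm immediates."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

random.seed(0)
OPCODE = b"\xfb\x02"  # placeholder two-byte opcode, not the real struct.get

# Scheme A: per-instruction type index (one of a few hot types) plus a
# small per-type field index -- both values repeat constantly.
scheme_a = b"".join(
    OPCODE + leb128(random.choice([17, 42, 300])) + leb128(random.randrange(4))
    for _ in range(10_000)
)

# Scheme B: one module-wide field index, unique per (type, field) pair,
# spread over a large index space -- values rarely repeat.
scheme_b = b"".join(
    OPCODE + leb128(random.randrange(20_000)) for _ in range(10_000)
)

a, b = zlib.compress(scheme_a, 9), zlib.compress(scheme_b, 9)
print(len(a), len(b))
assert len(a) < len(b)  # the repetitive scheme compresses far better
```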

RossTate commented on May 31, 2024

Wait wait wait. Two posts above seemed to indicate that wasm GC is double the size (uncompressed) and yet also takes twice as long to finish executing as JS. Did I read that right? If so, what's going on? Is it stuff like too many type annotations, or is there something else going on?

jakobkummerow commented on May 31, 2024

WasmGC is more than twice as big as JS, yes. I didn't say anything about execution speed.

RossTate commented on May 31, 2024

Thanks for confirming I read that correctly. As for speed, I got that from #27 (comment) by @manoskouk.

manoskouk commented on May 31, 2024

Sorry, if it wasn't clear: I meant a 2x deficit in binary size.

RossTate commented on May 31, 2024

Ah, thanks! Sorry for misunderstanding.

jakobkummerow commented on May 31, 2024

I've found some time to implement decoding of immediate-less struct/array get/set instructions and measure performance.
I've used the Sheets module mentioned before.

For validation alone, one machine showed a 0.9% improvement (91.6 -> 90.8 ms), another machine only 0.5% (67.0 -> 66.7 ms).
For combined validation and baseline compilation, any difference is lost in the noise (so likely <0.1%, hard to be sure).

This isn't overly surprising, because both versions still have to perform a bunch of checks at validation, e.g. for struct.get:

  • is the type on the stack unreachable?
  • does the type of the stack value have an index?
  • does the module have a struct type with that index?
  • does that struct type have a field with that index?
  • is that field's type something other than i8/i16?

The version with annotation additionally has to check:

  • does the module also have a struct type with the annotation's index?
  • is the type on the stack a subtype of the type of the annotation?

These two additional checks for less than 5% of all instructions in the module just don't move the needle all that much (though the relative impact will grow as other opportunities for optimizations are addressed; what we are seeing right now is a lower bound).
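The checks above can be sketched as follows (a minimal Python model with hypothetical data types, not V8's actual code):

```python
# Minimal sketch of struct.get validation, with and without an annotation.
from dataclasses import dataclass

@dataclass
class StructType:
    fields: list  # field value types, e.g. "i32", "i8"

@dataclass
class RefType:
    type_index: int  # index into the module's type section

UNREACHABLE = object()  # stand-in for the polymorphic/bottom stack state

def validate_struct_get(stack_top, field_idx, types, annotation=None):
    # 1. Is the type on the stack unreachable? Then validation trivially passes.
    if stack_top is UNREACHABLE:
        return True
    # 2. Does the type of the stack value have an index?
    if not isinstance(stack_top, RefType):
        return False
    # 3. Does the module have a struct type with that index?
    if stack_top.type_index >= len(types) or \
            not isinstance(types[stack_top.type_index], StructType):
        return False
    struct = types[stack_top.type_index]
    # 4. Does that struct type have a field with that index?
    if field_idx >= len(struct.fields):
        return False
    # 5. Is the field's type something other than i8/i16 (those need _s/_u)?
    if struct.fields[field_idx] in ("i8", "i16"):
        return False
    # With an annotation, two extra checks: the annotated index is a struct
    # type, and the stack type is a subtype of it (just equality here).
    if annotation is not None:
        if annotation >= len(types) or \
                not isinstance(types[annotation], StructType):
            return False
        if stack_top.type_index != annotation:  # real check: subtyping
            return False
    return True

types = [StructType(fields=["i32", "i8"])]
assert validate_struct_get(RefType(0), 0, types)
assert not validate_struct_get(RefType(0), 1, types)  # i8 needs struct.get_s/_u
assert validate_struct_get(UNREACHABLE, 0, types)
assert validate_struct_get(RefType(0), 0, types, annotation=0)
```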

The biggest benefit of dropping these annotations is that doing so saves module size. Module size certainly isn't our primary concern, but it's not entirely free either, so we shouldn't waste it. I remain unconvinced that the vague references to undefined hypothetical future features that have been thrown around are sufficiently concrete to justify spending this module size budget now.

tlively commented on May 31, 2024

I remain unconvinced that the vague references to undefined hypothetical future features that have been thrown around are sufficiently concrete to justify spending this module size budget now.

In particular, I really don't see how we could end up in a place where we do not know enough about the type on top of the stack to be able to validate a field access. If we didn't, then it seems we wouldn't have enough information to validate local.set, a function call, or any other consuming instruction, either. If a future type system extension needs more annotations, it will be on instructions that push annotated values onto the stack, not on instructions that pop them off.

titzer commented on May 31, 2024

Thanks for those experiments, Thomas.

I remain unconvinced that the vague references to undefined hypothetical future features that have been thrown around are sufficiently concrete to justify spending this module size budget now.

I see this differently. Suppose we didn't know that we could save 2-5% module size by dropping annotations. Would MVP with annotations be unshippable due to prohibitive space overhead?

RossTate commented on May 31, 2024

A year or two ago I presented a proof of how various extensions in discussion would make type-checking NP-hard should annotations on struct.get/set be removed (in a way that the select fix cannot address). I did this to motivate a nominal-based architecture that would avoid said problems. However, the group voted against that architecture and, as such, should accept the consequences of more annotation burden if it wants to leave the door open for various post-MVP plans.

tlively commented on May 31, 2024

@RossTate can you share a link to those presentations?

RossTate commented on May 31, 2024

I should also note at the same time that those post-MVP plans seem extremely hypothetical at this point. For example, the papers they are based on required exceptions to be explicit in function types. The issues that arose and were not addressed during the discussion of non-nullable types are a tiny fraction of the issues that will need to be addressed to make those post-MVP plans come to fruition. That, of course, is my personal assessment, but it felt wrong to mention one side of the argument without mentioning the other.

tlively commented on May 31, 2024

Thanks, @RossTate! That's the exact kind of example I was hoping for.

(The TL;DR of the presentation is that if you allow multiple supertypes, you can embed the graph coloring problem into the problem of trying to figure out how to type struct.set. The types are not immediately obvious without annotations because the multiple supertypes mean there is no single principal type to choose.)

I believe @rossberg has mentioned wanting to support multiple supertypes in the future, so that example seems like it might even come up as a non-toy consideration eventually.

Specifically, I would view the annotation-free struct.get/set/etc. instructions as convenient short-hands to reduce binary size for a very common instruction (something there is precedent for). But the short-hand may have cases it cannot handle, which would necessitate adding a larger instruction that would need to be emitted in those cases (something there is precedent for with select). I don't think this particular combination of precedents has been used before, but it seems valid to me.

I agree with @eqrion that having non-annotated instructions and adding annotated version when we need them in the future would be a good option. For the spec, the non-annotated versions would have side conditions saying they only validate when there is a principal type. On the tools side, the fixup necessary for this would be extremely simple relative to other fixups we're implementing.

RossTate commented on May 31, 2024

That solution breaks substitutability/subsumption.

rossberg commented on May 31, 2024

Personally, I'm not sure to what generality we would want to support multiple supertypes, but e.g. @titzer pointed out the need for them. Nevertheless, we probably want to remain careful there.

I agree with @eqrion that having non-annotated instructions and adding annotated version when we need them in the future would be a good option. For the spec, the non-annotated versions would have side conditions saying they only validate when there is a principal type.

I don't think we can assume that would work easily. It may break relevant properties, and it's not necessarily obvious how to express simple and cheap conditions for restricting these cases. For example, there is no simple formulation of the condition "is principal".

I think we're better safe than sorry.

tlively commented on May 31, 2024

That solution breaks substitutability/subsumption.

@RossTate, can you explain how?

I don't think we can assume that would work easily. It may break relevant properties, and it's not necessarily obvious how to express simple and cheap conditions for restricting these cases.

@rossberg, can you be more concrete? It seems like it should be simple to have a rule that says "if there are multiple options for what type to choose, require an annotation."

rossberg commented on May 31, 2024

@tlively, it may not be easy to decide whether there are multiple options – that may well involve an expensive algorithm in itself, or worse.

tlively commented on May 31, 2024

@rossberg, can you provide an example of a type system extension for which that would be the case?

RossTate commented on May 31, 2024

Substitutability/subsumption, by definition, says that if an instruction type-checks with a certain (list of) input type, then it also type-checks with all subtypes of that input type. So if it type-checks with an input type that is a struct with some field, then it must also type-check with a type variable whose upper bound is that struct type—even if that type variable also has another upper bound that is a struct with a differently typed field.

tlively commented on May 31, 2024

Thanks, @RossTate. I wonder how important this case of substitutability/subsumption is if you view dropping the annotation as an optimization or form of compression, as @titzer described.

RossTate commented on May 31, 2024

My proof basically implies that "decompression" would be NP-hard.

titzer commented on May 31, 2024

As for multiple supertypes, the type import extension I explored with Jawa used them to model Java interfaces. Java interface types are extremely common in APIs. Without multiple supertypes, you have to resort to using a top type and dynamic checks.

tlively commented on May 31, 2024

Thanks, @rossberg, that makes sense. I was going to write a comment about how simple the rule would be in the case of multiple supertypes, but in the process I realized that it actually wouldn't be simple. You can't just look at the direct supertypes of the type on the stack, you have to search up the supertype chain to find branches, then compare the branches to see if they're incompatible, for some definition of incompatible I haven't worked out yet. Anyway, I'm satisfied that this could get tricky.

RossTate commented on May 31, 2024

Personally, I agree with @jakobkummerow. There are concrete costs to these annotations, and previously the group has discounted the concerns of hypothetical extensions in such situations (and these extensions seem particularly hypothetical).

conrad-watt commented on May 31, 2024

Are we open to the possibility of a future version of the type imports proposal with multiple declared supertype bounds? That seems to be the most immediately-likely way that this problem "becomes real". I appreciate that we've not really thought about that proposal for a while.

(there would possibly be work-arounds if we started with non-indexed instructions and wanted such a feature, but they'd be somewhat whack-a-mole-like in comparison to having indices from the start)

tlively commented on May 31, 2024

I would love to avoid having multiple supertypes in general as long as possible because Binaryen depends so heavily on being able to calculate LUBs, but I would at least spend time discussing and evaluating that extension if it were motivated by an important real-world use case.

Out of curiosity, are there any other type system extensions that folks have top-of-mind that would have similar issues with respect to type annotations?

RossTate commented on May 31, 2024

The ones I can think of would require exceptions to be explicit in function types in order to be used reliably for their respective applications. The related discussion on non-nullable types made that prospect seem pretty dismal, which makes the prospect of those hypothetical features seem dismal as well.

RossTate commented on May 31, 2024

Oh, and as a technical note, the issues being discussed here are orthogonal to LUBs.

tlively commented on May 31, 2024

I was talking to @kripken about the graph coloring example offline, and we realized that we don't know how adding annotations would fix it. It seems like the only annotations you could actually add to that program would be for type $V, but that doesn't give you any new information.

@RossTate, are we missing something here, or does the type system in that example have NP-hardness even in the presence of annotations on struct.get and struct.set?

tlively commented on May 31, 2024

Aha, thanks, @RossTate! Before understanding that, it had seemed that the example had no bearing on the present discussion since it would be equally broken with or without annotations.

eqrion commented on May 31, 2024

My understanding is that $V would not be allowable as the annotation for struct.get; the annotation has to be specifically a struct type. That would mean each of the initial struct.get instructions would have to be annotated with the specific struct supertype of $V that makes the program type-check, so each such annotation would in turn specify the color of the respective vertex that demonstrates the graph is 3-colorable.

@RossTate I may be missing something, but doesn't this break the definition of subsumption given above as well?

Substitutability/subsumption, by definition, says that if an instruction type-checks with a certain (list of) input type, then it also type-checks with all subtypes of that input type. So if it type-checks with an input type that is a struct with some field, then it must also type-check with a type variable whose upper bound is that struct type—even if that type variable also has another upper bound that is a struct with a differently typed field.

struct.get supertype-of-$V would validate, but struct.get $V would not, under the above rules for annotations.

The side-condition 'type annotation must be a concrete struct type' seems identical in practice to 'input type must be a concrete struct type'?

RossTate commented on May 31, 2024

Good questions. Before getting technical, let me demystify and motivate subsumption a bit for everyone.

We like to be able to inline functions. Consider a subtyping A <: B and the obvious function $upcast_A_to_B: [A] -> [B] that simply returns its input, i.e. is the identity function except for a loss in type precision. Given what this function does, you'd expect to be able to inline a call $upcast_A_to_B (local.get $a) to just local.get $a. But, if your type system does not satisfy subsumption, then that increase in type precision could cause a subsequent instruction to fail. So, if you want inlining, you need subsumption.

Now for the technical answers.

My understanding is that $V would not be allowable as the annotation for struct.get; the annotation has to be specifically a struct type. That would mean each of the initial struct.get instructions would have to be annotated with the specific struct supertype of $V that makes the program type-check, so each such annotation would in turn specify the color of the respective vertex that demonstrates the graph is 3-colorable.

@RossTate I may be missing something, but doesn't this break the definition of subsumption given above as well?

Subsumption is a property of a fixed instruction: if a given fixed instruction type-checks with certain inputs, that same instruction must necessarily type-check with more precise inputs. A type annotation is part of the instruction, so changing the type annotation results in a different instruction; as such, subsumption says nothing about the relationship between these instructions.

The side-condition 'type annotation must be a concrete struct type' seems identical in practice to 'input type must be a concrete struct type'?

That impression is probably because at present "practice" for WebAssembly is not including programs using type variables. If wasm ever gets to generics (which, admittedly, I'm pretty skeptical of at this point) or to powerful forms of separate compilation like Jawa (which I'm also pretty skeptical of at this point), then you will often have programs where you know a value belongs to some abstract type which you know is at least a struct with some fields, at which point requiring an annotation to be a struct type will be very different from requiring an input to belong to a struct type and not a type variable.

If it helps to clarify things, there's another way the complexity issue can be rectified using annotations. At present, the annotation for struct.get essentially specifies what to upcast the input to, and the restriction that the annotation be a struct type guarantees that the retrieved value has a principal type to use. Instead, the annotation for struct.get could be just the (upcast) type of the field at hand, and validation would just check that the input has a field of (at least) that type.


As a reminder, I am giving these technical answers not to force a particular decision. Rather, their implication is just to make the group understand that you must make a decision: remove type annotations xor maintain forward compatibility with some features like multiple upper bounds on imported types. My recommendation would be to do the former, but it's your decision.

tlively commented on May 31, 2024

A third option would be to compromise slightly on substitutability under the assumption that any tool that would benefit from substitutability would also be able to add the type annotation as necessary (under any type system that can otherwise work without annotations).

RossTate commented on May 31, 2024

You can't really compromise slightly on a global principle; it holds or it doesn't hold. Rather, what you're proposing is that the extensions in discussion would not be subtypes but would rather have erasable coercions. For example, a type import would not be a subtype of its additional upper bounds; rather, there would be an upcast_to_bound <n> : [(ref $V)] -> [(ref $bound_n_plus_1)] instruction, which is what call $upcast_V_to_BoundNPlus1 would inline to. This has a number of downsides but, given that I find it unlikely wasm will get to this point, that's fine with me.

askeksa-google commented on May 31, 2024

I did some more measurements to estimate the additional overhead of adding a type index to call_ref. Here's the table from earlier with an extra "Full" row that includes these (edit: brotli columns added):

                    Module,        Module,   Module,   Code section,   Code section,   Code section,
                    uncompressed   zopfli    brotli    uncompressed    zopfli          brotli
Full                1695174        521675    399958    1031085         373826          281649
Current             1694335        521514    399999    1030246         373537          281625
Without             1635418        495679    380343    971329          348648          261510
Saving vs Current   3.48%          4.95%     4.91%     5.72%           6.66%           7.14%
Saving vs Full      3.53%          4.98%     4.90%     5.80%           6.74%           7.15%

The additional overhead looks tiny here, but I think these numbers underestimate the typical overhead of adding a type index to call_ref, for two reasons:

  • In dart2wasm, call_ref is only used for closure calls, so there are very few call_ref in the program (less than 0.1% of instructions).
  • The function types for closure calls use only one type per arity, so the entropy of the call_ref type index is small.

It would be very interesting to see similar numbers from a compiler that uses call_ref more extensively and variedly, such as J2Wasm.

kripken commented on May 31, 2024

As a quick estimate for call_ref specifically in J2Wasm: there are 9 K call_refs in a 9 MB binary, and the number of types is around 18 K, so type index LEBs would probably be 2-3 bytes, putting the code size increase at 0.2%-0.3%. There are over an order of magnitude more struct.gets, so those matter a lot more than call_ref for size.

Overall a size improvement of ~4% seems significant here, and I think we should remove the annotations. My mind could be changed by a plausible example that could not be fixed by adding annotations later, but so far everything mentioned seems like it could work. For example, with type imports we could say that in the coloring example above $V - a type import - would not be allowed as an input to struct.get, which is basically parallel to what @RossTate said we need to do anyhow; and adding annotations later could be done in a single new instruction, as @manoskouk mentioned in the meeting, which is not burdensome.

RossTate commented on May 31, 2024

@eqrion I'm not entirely sure I understand the option you're asking me for feedback on, so here's my best guess. Using a fallible inference algorithm (in the future) to recover missing annotations has the same issues as using an incomplete validation algorithm.

That said, I'm not sure we need more options to consider. It's been about a week since someone posed the question of whether there were concrete features people wanted to maintain forwards compatibility with. Since then, while I've articulated why you can't always simply add annotations later on to fix an incompatibility, my own answer to the question was that the features I could think of already have compatibility issues with other features of WebAssembly, and I don't believe anyone else has provided any specific features to consider. So maybe y'all are at a point where you can decide to go with removing annotations, which seemed to be the prevailing sentiment of comments in the past week?

titzer commented on May 31, 2024

@eqrion I also feel like a 5% space savings doesn't really solve the issue if 2x is what is needed. Also, that 5% saving is on compressed size. My comment in the meeting a couple of weeks back about including a general-purpose compression algorithm in our calculations wasn't fully articulated. The compression schemes we have right now show 5% savings without annotations, but if a different compression algorithm is invented that does better, our cost model is outdated. That's why I think it's odd that we include GP compression schemes in a criterion we are trying to optimize for.

To wit, a layer 1 compression algorithm that is aware of Wasm's type system can and should completely outclass GP compression. In particular, I suspect that the type annotations necessary for locals are also a big chunk of binary size. Layer 1 compression could probably eliminate those for definitely-initialized locals.

which seemed to be the prevailing sentiment of comments in the past week?

There is still disagreement about this. I don't think the discussion has turned up anything that has fundamentally changed the calculus, or any opinions, yet.

rossberg commented on May 31, 2024

I don't think the discussion has turned up anything that has fundamentally changed the calculus, or any opinions, yet.

Agreed. If anything, it turned up how slippery the slope would be. ;)

jakobkummerow commented on May 31, 2024

@titzer:

a layer 1 compression algorithm that is aware of Wasm's type system can and should completely outclass GP compression.

I encourage you to develop and/or present such an algorithm. On its own, "we shouldn't do anything now because I'm sure someone else will invent a rounder wheel in the future, which will solve all our problems" is not a convincing argument.

I suspect that the type annotations necessary for locals are also a big chunk of binary size

In the Sheets module measured above, 200 KB are used for defining 61K locals across 28K functions, accounting for 2.1% of the uncompressed code section size.
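For scale, a back-of-the-envelope check using only the figures in this comment (the implied code-section size is derived, not measured):

```python
local_decl_bytes = 200 * 1024  # 200 KB of local declarations
num_locals = 61_000

# Average encoding cost per local declaration
bytes_per_local = local_decl_bytes / num_locals

# 2.1% of the uncompressed code section implies a section of roughly:
implied_code_section = local_decl_bytes / 0.021

print(f"{bytes_per_local:.1f} bytes/local, "
      f"~{implied_code_section / 1e6:.1f} MB code section")
```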

RossTate commented on May 31, 2024

I don't think the discussion has turned up anything that has fundamentally changed the calculus, or any opinions, yet.

I had thought that the notable lack of motivating examples benefiting from annotations might have been a fundamental change to the decision calculus. If it would help move the discussion forward, I can explain why the one example that has been mentioned—Jawa—does not actually need multiple upper bounds on type imports, leaving us with no motivating examples. (The short of it is that, due to the complications caused by multiple inheritance, there's no significant run-time benefit to knowing an object implements an interface, so there's no need to encode subtyping between interfaces when compiling to wasm.)

If anything, it turned up how slippery the slope would be. ;)

Can you explain what you mean by this?


On the larger issue of binary size, my understanding is that j2wasm (and maybe dart2wasm) is using aggressive closed-world AOT optimizations to make performance competitive with j2cl. Is that understanding correct? If it is, such performance optimizations can substantially increase code size, so what happens to binary size (with and without annotations) and performance if instead one optimizes for size rather than performance?

askeksa-google commented on May 31, 2024

Indeed dart2wasm performs closed-world optimizations, but most of these are ones that reduce code size, e.g. tree shaking, parameter elimination, constant propagation, devirtualization.

RossTate commented on May 31, 2024

Ah, I would have thought it did inlining as well largely due to its symbiotic role with devirtualization (devirtualization provides an inlining opportunity, which moves virtual calls to a more specialized context, which enables more devirtualization).

eqrion commented on May 31, 2024

@kripken Thanks for that, I think that's well said.

So even 5% is worth thinking about. Yes, there may be other ways to shrink size (L1 compression), and maybe we'll get to them, but that's uncertain. And, yes, maybe we are 2x larger than JS now, and maybe 5% won't fix that, but it's still 5% better. Whenever we have a good practical way to reduce size that's worth considering. Of course, we need to consider the other side of the tradeoff as well.

I just want to clarify my comments regarding the '5% vs 2x' point. I definitely agree that improvements are worth considering even if they don't solve the larger issue completely.

I would just like to better understand why wasm-GC binary size is 2x JS binary size. This would help me to evaluate the relative importance of a 5% improvement vs any spec concerns. Are the wasm-GC toolchains doing more inlining? Are call sequences or prologues larger? Is there a runtime library required with wasm that JS doesn't need? Is the data/elem section significant?

For example, if there's low hanging fruit elsewhere that doesn't have spec concerns then that seems like the first step before dropping annotations. If the other required fixes are likely to be difficult too, then it may make sense to start with dropping annotations.

But without more context on the '2x' number, it's hard for me to weigh the improvement against theoretical spec concerns.

rossberg commented on May 31, 2024

@eqrion:

I would just like to better understand why wasm-GC binary size is 2x JS binary size.

Because JS is a high-level language and Wasm isn't. Respective size comparisons are mostly apples to oranges. Native code also tends to be much larger than JS source code, probably by more than 2x.

eqrion commented on May 31, 2024

@eqrion:

I would just like to better understand why wasm-GC binary size is 2x JS binary size.

Because JS is a high-level language and Wasm isn't. Respective size comparisons are mostly apples to oranges. Native code also tends to be much larger than JS source code, probably by more than 2x.

I'm aware of the difference between JS and Wasm. All of the items I listed as possible reasons for a size difference would be due to Wasm being a lower-level language. What I'm asking for is quantification to help understand what, if anything, can be done about the difference.

tlively commented on May 31, 2024

One thing that might help move the conversation forward is if we could get an example of a type system extension for which the existence of unannotated struct.get and struct.set (even in the presence of additional annotated versions) would be a problem. In other words, it would be nice to have an example where the decision procedure for determining whether an annotation is required is greater than constant time (amortized).

Lacking such an example, it has been hard to convince ourselves that the risk of encountering such a situation is anything but vanishingly small and that the expected benefit of keeping annotations could outweigh the benefit of reducing code size.

Note that for the example with multiple supertypes on type imports, I believe an efficient and sufficient decision procedure for requiring annotations would be to require them when the type of the stack is a subtype of a type import, which would be amortized constant time after preprocessing the types.
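That decision procedure could be sketched as follows (a sketch with hypothetical names, assuming the declared supertype relation is acyclic): precompute, once per module, the set of types whose supertype chain reaches a type import; afterwards the per-instruction check is a constant-time membership test.

```python
def subtypes_of_imports(supertype_edges, type_imports):
    """Precompute the set of types whose declared supertype chain
    reaches a type import.

    supertype_edges: dict mapping each type id to its declared supertypes
    type_imports: set of type ids that are type imports
    """
    memo = {}

    def reaches_import(t):
        if t not in memo:
            memo[t] = t in type_imports or any(
                reaches_import(s) for s in supertype_edges.get(t, ())
            )
        return memo[t]

    return {t for t in supertype_edges if reaches_import(t)}

# After preprocessing, "does this struct.get need an annotation?" is a
# constant-time set-membership test on the operand's static type:
def needs_annotation(stack_type, flagged):
    return stack_type in flagged
```

For example, with a type import $V and structs $A <: $V and $B <: $A, accesses through $A or $B would require annotations, while an unrelated $C would not.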

dtig commented on May 31, 2024

Adding a note here that this was discussed in the GC subgroup meeting earlier this week (PR for notes here). The high order bit is that the poll of the room showed that the consensus is to retain type annotations on new instructions. Continuing the discussion for next steps here, some questions:

  • Is the consensus in the meeting sufficient to move forward?
  • If so, what should the next steps be for the OP of this issue w.r.t. call_ref, func.bind?
  • Other thoughts from people who didn't attend the meeting?

tlively commented on May 31, 2024

Let's move forward with keeping the existing annotations and adding annotations to call_ref and friends as well. I don't see a need to hold a further vote. @rossberg, can you prepare a spec PR for that? We'll work on updating V8 and Binaryen, too.

rossberg commented on May 31, 2024

@tlively, done: #76.

rossberg commented on May 31, 2024

Closing via #76.
