webassembly / function-references


Proposal for Typed Function References

Home Page: https://webassembly.github.io/function-references/

License: Other

Languages: Makefile 0.20%, Python 4.99%, Batchfile 0.07%, CSS 0.01%, Perl 0.13%, Shell 0.07%, HTML 0.03%, OCaml 3.61%, Standard ML 0.01%, WebAssembly 87.96%, JavaScript 2.24%, TeX 0.01%, Bikeshed 0.69%

Topic: proposal

function-references's Introduction


Function Reference Types Proposal for WebAssembly

This repository is a clone of github.com/WebAssembly/spec/. It is meant for discussion, prototype specification and implementation of a proposal to add support for basic function reference types to WebAssembly.

The repository is now based on the bulk operations proposal and includes all respective changes.

Original README from upstream repository follows...

spec

This repository holds a prototypical reference implementation for WebAssembly, which is currently serving as the official specification. Eventually, we expect to produce a specification either written in human-readable prose or in a formal specification language.

It also holds the WebAssembly testsuite, which tests numerous aspects of conformance to the spec.

View the work-in-progress spec at webassembly.github.io/spec.

At this time, the contents of this repository are under development and known to be "incomplet and inkorrect".

Participation is welcome. Discussions about new features, significant semantic changes, or any specification change likely to generate substantial discussion should take place in the WebAssembly design repository first, so that this spec repository can remain focused. And please follow the guidelines for contributing.

citing

For citing WebAssembly in LaTeX, use this bibtex file.

function-references's People

Contributors

aheejin, andrewscheidecker, backes, binji, bnjbvr, cellule, chicoxyzzy, dschuff, eqrion, flagxor, gahaas, gumb0, honry, ia0, jfbastien, keithw, kg, kripken, littledan, lukewagner, ms2ger, ngzhian, pjuftring, ppopth, rossberg, sbc100, sunfishcode, titzer, tlively, xtuc


function-references's Issues

Use cases for typed function references without GC

The presentation mostly focused on aspects of this proposal besides typed function references. I've been wondering what use cases y'all had in mind for typed function references ahead of garbage collection?

Should we merge this proposal back into GC?

With recent momentum on the GC proposal, it doesn't look like this proposal is likely to be shipped or used before the rest of GC. Since work implementing and using this proposal has so far been tied to broader work on GC, I think it would make sense to merge typed function references back into GC. Doing so would consolidate discussion in a single place and would save us the overhead of defining typed function references in one proposal and depending on them from another proposal.

Remove `let`

In #44, we discussed many alternatives to let. From the temperature of the discussion, it seems there is an unstated consensus that we should remove let as it stands.

I propose we go ahead and remove let for now, in the spirit that we should not have incomplete features lying around that we do not intend to ship. That way we better capture the subset that we do intend to ship.

That leaves us without a solution for non-nullable locals for now. I'm OK with that; it will expose pain points and increase the priority of finding a solution.

Time for a rendered spec :)

Hey it would be swell if we got the HTML version rendered and posted :) Would be happy to help out if needed!

Benefits of non-nullable references?

I understand the benefit of non-nullable references in a source language, but I was curious what we expect the benefits to be in wasm. Is it for speed, or correctness, or something else?

The one minor benefit I can think of is that a wasm=>wasm optimizer could optimize out some ref.is_null checks based on it.

The overview mentions the overall benefits of the proposal, but none of those seem to apply to non-nullable references. I'm probably missing something obvious, though.
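For what it's worth, here is a minimal sketch of the check-elision being discussed, reusing the $i32-i32 type from the Overview example quoted further down; this is my own illustration, not text from the overview:

(module
  (type $i32-i32 (func (param i32) (result i32)))
  (func $use (param $f (ref $i32-i32)) (result i32)
    ;; $f is declared non-nullable, so the engine needs no null check before
    ;; the call below, and a wasm->wasm optimizer may fold any ref.is_null
    ;; on $f to 0 and drop the dependent branch
    (call_ref (i32.const 42) (local.get $f))
  )
)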

Are recursive types allowed?

The overview mentions that subtyping is co-inductive, but it does not explicitly state that type section entries can reference later type section entries. If a reader were to extrapolate from Wasm's existing declarations-before-uses structure, they might arrive at the conclusion that type section entries cannot reference later type section entries and therefore recursive types are not allowed. Can we clarify this one way or another in the explainer?
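For concreteness, the kind of forward reference in question looks like the following; whether this is allowed is exactly what the question asks:

(module
  ;; $a's parameter type refers to a later type section entry, and vice versa
  (type $a (func (param (ref null $b))))
  (type $b (func (param (ref null $a))))
)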

Binary encoding of 'let' (discussion topic)

One thing that I'd like to be sure of is that a streaming compiler can generate code for allocating the locals area in the stack frame at once when it first encounters a function, and not have to scan the function code to find out how many locals it might need. So ideally the information about let-bound locals would be present in the function header. For example, the function could in principle have a fixed set of locals; some locals can then become activated and deactivated as control enters various regions of the code.

JS-API is missing value conversions

This is highly related to WebAssembly/gc#279, but as this is a separate proposal I think it needs an issue here too.

  1. What is the behavior of ToWebAssemblyValue for (ref null? T) when T = func-type? The explainer says for 'value conversions' that a type equality check is performed. The JS-API is missing this case. To be consistent with the current direction of WebAssembly/gc#279, it seems like we should disallow this case and throw an exception.
  2. What is the behavior of ToWebAssemblyValue for non-nullable references when value = JS null? We currently throw a TypeError in SpiderMonkey, and this is similar to how V128 is handled.
  3. What is the behavior of ToWebAssemblyValue for (ref extern) when value = JS null? The same as above? I call this one out just to clarify my mental model that heap-type extern can hold every JS value except for JS null, which becomes a null reference value in WebAssembly and therefore requires the reference to be nullable.

https://webassembly.github.io/function-references/js-api/index.html#towebassemblyvalue

folded syntax of let

I have been using let fairly extensively recently (at least hand-writing programs that use it).

  1. Question: what is the appropriate folded version of a let instruction?
  2. If the order were reversed, it would better match the structure of a function (IMO).

func.bind has a dynamic number of stack arguments?

The way that the func.bind instruction is specified, the literal type operand specifies the result type of the bind:

func.bind $t' : [t1^n (ref $t)] -> [(ref $t')]

That seems to suggest that the implementation has to look at the actual function, decide how many extra arguments to bind by diffing the actual function type (which is on the argument stack) and the target function type.

This could be resolved by either limiting bind to one free variable or by adding an additional operand.

Or am I reading this incorrectly?

can we pull out the let instruction?

The let instruction is independently useful as a better way of naming stack entries.

The interface types proposal will very likely depend on this instruction being available.

I propose that we spin off a separate proposal for the let instruction.

ref.is_null and heaptype vs. reftype

The reference types proposal introduces ref.null and ref.is_null and defines them both to take a "reftype" immediate, which as of that proposal means either extern or func.

This proposal introduces a distinction between "reftype" and "heaptype", with the relation reftype = (ref null? <heaptype>). The newly added reference-related instructions take "heaptype" immediates; ref.null and ref.is_null are not updated. This looks like an oversight to me; I think they should both be refined to take a "heaptype" immediate.

As far as the binary encoding is concerned, -0x10 encodes either func (heaptype) or funcref (reftype) depending on context; -0x11 does the same for "extern(ref)". So this spec change would not create backwards compatibility issues for binary modules.

Beyond consistency, the motivation is that now that the set of "reftypes" includes nullable and non-nullable types, it's a bit silly to use this definition for null-related instructions. For instance, while ref.null <ref null $ht> (with $ht being a heaptype) makes sense, ref.null <ref $ht> would be pointless (and would either have to fail validation, or trap, or return a value of type (ref null $ht), contradicting its immediate).

Similarly, ref.is_null <ref $ht> : [(ref $ht)] -> [i32] (where the result is always 0) would be legal but useless: why would anyone want to do a null-check on a value that's statically known to be non-nullable?

For non-shorthands, a "reftype" is one byte bigger than a "heaptype", so given that the extra byte provides zero value, I suggest switching ref.null and ref.is_null to "heaptypes" to save binary module size.

Interaction between 'let' and name section

Are there any thoughts about how 'let' should interact with the name section? I see two issues:

  • Since local indexes shift inside the scope of 'let', a consumer of the name section (e.g. debugger) has to track those scopes (or at least the number of bound locals).
  • There is no way to assign names to let-bound variables in the name section.

On a more general note, since 'let' seems to be creating some complications, maybe it should be removed and replaced with the rule that non-defaultable locals must be (trivially) provably initialized before use.

Build depends on old OCaml version

While the mainline spec repo has moved on to OCaml 4.10, this repo still depends on an earlier version. This makes it hard to work on multiple proposals at once.

Can we rebase this repo on the current spec repo?

Interpreter issues with let

The let instruction is difficult to implement in an interpreter without significant overhead. In particular, a let instruction introduces a new block scope and binds new local variable indices that are accessible (and writable) in that scope. This requires the use of a second stack pointer if using the common implementation technique where local variables are preallocated before the operand stack on a single array that grows with the call stack. Second, leaving the scope of the let block must unbind or pop these variable indices. As the scope of a let is terminated with a normal end, there is no way for an interpreter to know where the scope ends except by dynamically tracking control scopes. This is not the case with any of the existing control constructs in Wasm, and is specific to let. Dynamically tracking control scopes is not currently necessary for any other reason, and would penalize all functions, even ones without let.
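For reference, here is a rough sketch of the construct under discussion (the exact text-format details, such as the folded form, are themselves still being discussed elsewhere in this issue list):

(module
  (type $t (func))
  (func $f (param $p (ref null $t))
    (ref.as_non_null (local.get $p))
    ;; let pops the operand into a fresh local; inside the block that local has
    ;; index 0 and $p shifts to index 1, and the scope is closed by an ordinary
    ;; `end` in the flat format, so a consumer must track the scope dynamically
    (let (local (ref $t))
      (call_ref (local.get 0))
    )
  )
)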

A simple solution to this problem is to require that functions pre-declare all of the space that they will need for let bindings in their bodies. This allows an interpreter to preallocate space for let bindings, doesn't require a second stack pointer, and also benefits JITs, since they must also use a dynamic data structure (e.g. to track SSA values) that would have to grow and shrink with lets. Since the preamble of a function is just a series of local variable declarations with value types, we will probably need to reserve a value from the binary encoding of the value type space to indicate a pre-declaration of let indexing space.

I think the other alternatives for avoiding this problem (e.g. not having let, but requiring definite initialization of locals with value types that have no default value such as non-null references) are worse.

Should non-nullable locals be initialized by unreachable code?

Binaryen optimization passes that can introduce new invalid uses of non-nullable locals opt themselves into getting an automatic fixup pass applied afterward. To keep binaryen efficient, we try to have as few optimization passes as possible opt into this fixup. However, we found a problem with Binaryen's treatment of non-nullable locals that might require many more passes to opt into the fixup. I'd like to avoid that by making a fix in the spec instead.

The problem is that Binaryen performs a domination analysis that finds initializing sets that make subsequent gets valid, but that domination analysis did not consider unreachable code specially, so sets in unreachable code were considered to make subsequent gets valid. However, Binaryen does not emit unreachable code, so those sets are not actually emitted into the binary. The engine then sees a get without an initializing set and fails validation. You can find an example of this problem at WebAssembly/binaryen#5599.
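In Wasm terms, the shape of the problem is roughly the following (a hedged sketch, not the actual Binaryen test case):

(module
  (type $t (func))
  (func $g (type $t))
  (elem declare func $g)
  (func $f (result (ref $t))
    (local $x (ref $t))           ;; non-nullable local
    (unreachable)                 ;; everything below is unreachable code
    (local.set $x (ref.func $g))  ;; the analysis counted this set as initializing $x,
                                  ;; but Binaryen never emits unreachable code
    (local.get $x)                ;; ...so the engine sees this get with no initializing set
  )
)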

A potential fix is given in WebAssembly/binaryen#5665, which makes the domination analysis ignore unreachable sets that won't appear in the binary, but this fix would require many more passes to have to opt into the local fixup mechanism and would considerably slow down Binaryen's compilation speed.

The problem would also be fixed if we changed the spec to say that instructions that leave the stack in a polymorphic state (i.e. introduce unreachability) also mark all locals as initialized. This would be sound because subsequent gets that are made valid by this change are never executed. This would let Binaryen avoid making any changes at all by making the current bug a feature instead.

What do folks think about this proposed relaxation of the validation rules?

Type annotations on new instructions

So far, Wasm has been consciously designed such that the type of every instruction is self-contained. That is, every operand's type is either fully known (possibly with the help of a type immediate) or fully polymorphic.

With instructions like call_ref or func.bind this design choice starts to become a nuisance, and will be even more so under future extensions like those of the GC proposal. Redundant type annotations can also increase the cost of validation, since more types need to be compared against each other (although it's not clear whether this extra cost is relevant, given the relative infrequency of most affected instructions).

Currently, the proposal omits the type on call_ref and the source type on func.bind. It requires constype immediates on null-related instructions, though, to be consistent with the instructions in the reference types proposal. Should more annotations be added for consistency and ease of type checking, or more removed for convenience/compactness?

Alternatives to `let`?

We have heard from multiple directions (interpreters, toolchains, debuggers) that let is (surprisingly) hard to implement/use. Having looked into implementing let in V8's non-optimizing compiler, I can add my name to that list. It's certainly all doable, but it's quite involved. And it would be sad if engine/interpreter implementors had to spend particular implementation effort (as well as runtime CPU+memory cost) on a feature that toolchains then barely use.

So I was wondering whether the let instruction as currently proposed really is the best solution we can collectively come up with.

IIUC, its primary purpose is to provide a way to have non-defaultable locals (in particular: non-nullable reference type locals) in a function. Please do point out if it has additional important use cases.

In the spirit of brainstorming, here are some alternatives that would solve that problem too:

(1) A "locals_initialized_barrier" instruction with semantics: before the barrier, locals may be uninitialized/null, and reading from them incurs the cost of a check; after the barrier, such checks are dropped as all locals are guaranteed to be initialized. Execution of the barrier checks that all non-defaultable locals have been initialized.

(2) A scope that indicates "within this scope, the given locals are non-null". Entering the scope performs null checks on the specified locals. Accessing these locals within the scope needs no null checks.

(3) Introduce "local initializers" modeled after the existing global initializers (which are solving the same problem). We'd need to figure out how exactly to encode these in the text and binary formats. Execution of a function would begin with evaluating these local initializers; afterwards all locals are guaranteed to be initialized. Similar to globals, the rules would be something like: only constant instructions are allowed as local initializers; they can read other locals, but only those with lower indices. (A rough sketch of what this could look like follows after the list of alternatives.)

(4) Require all locals to be pre-declared (at least by count, maybe also by type). Their initialization then still happens with let as currently proposed. That would prevent the size of the locals list/array/stack from changing dynamically, and would also keep each local's index constant throughout the entire function.

(5) Drop let entirely, at least for now. We can always add it later if we have enough evidence of a concrete need for it. In the meantime, a workaround is to factor out the body of what would have been a let-block as a function, and call that. A JIT might still decide to inline that function. (This would limit a Wasm module's ability to fine-tune for maximum performance; but based on binaryen's feedback it's unclear to what extent they'd do that anyway. This is not my preferred solution, just mentioning it here for completeness.)
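To make option (3) concrete, here is a purely hypothetical sketch; the syntax is invented for illustration and is not part of any proposal:

(module
  (type $t (func))
  (func $f (param $p (ref $t))
    ;; hypothetical: each local declaration carries an initializer expression,
    ;; restricted to constant-like instructions that may read lower-indexed locals
    (local $x (ref $t) (local.get $p))
    (local $n i32 (i32.const 0))
    (drop (local.get $x))
  )
)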

(I realize that there is a lot of conceptual overlap between these ideas -- which is unsurprising given that they're all solving the same problem, just with slightly different approaches.)

I'm sure there are other possibilities, please suggest them!
If the ideas above have critical flaws making them unsuitable, please point them out! And if you have a favorite among them, please say so as well.
As I wrote above, this issue is meant to be a collective brainstorming (and maybe eventually decision-making) effort.
We don't have to change anything; I just wanted to make sure that this design has received sufficient contemplation before we set it in stone. Especially in light of the feedback we've been getting so far.

slowness of test/core/skip-stack-guard-page.wast

This spec test has gotten a lot slower with the function references and gc proposals, taking over 30 seconds on my machine. The rest of the tests are in the millisecond range. I looked into why this is and whether there was something that could be done to make the test faster. E.g. I suspected I could nerf this test to run in less time, but alas, it was not so easy, because it tries to test different nesting depths and ultimately needs a lot of (live) locals to function. I concluded that it was a change in the way locals are implemented (I suspect due to let?) in the reference interpreter.

I was wondering if there's something I hadn't thought of to make this test faster, or if there is an inherent interpreter speed limit here.

Thoughts?

Function bind

The notation for explaining the func.bind instruction is confusing to me.
Is it possible to have some form of explanation?

Also, what is the lifetime of the resulting closure: is it stack-allocated, or can its lifetime escape the enclosing function?

Tweak binary format for table declarations

The new binary format for table types allows 0x40 0x00 as a prefix to a non-nullable reference type to indicate a table that has an explicit element initializer. I think we should tweak this a little.

  1. Avoid 0x40, since that is used for an empty block type already[1]. E.g. use 0x41, which doesn't collide with anything.
  2. For table types, put the constant initializer before the limits, so that a non-nullable reference type with an explicit initializer can be parsed as a single unit. The constexpr could either come before the table type (i.e. right after the sentinel), or be inside the table type.

[1] I was initially confused when the empty block sentinel started showing up here, since Wizard tries to share some type decoding routines for simplicity.

As a bonus, by doing (2) above, we could in theory allow the explicit initializer for local types or block types, or other places where types occur.

Validation rule for call_ref with polymorphic stack

Should the following module validate?

(module
  (type (func (result f32 f32)))
  (func (result f32 f32)
    unreachable
    call_ref
    f64.add

    (; to make func result correct ;)
    f32.const 0
    f32.const 0
  )
)

Here's the rule for call_ref:

call_ref calls a function through a reference

    call_ref : [t1* (ref null $t)] -> [t2*]
        iff $t = [t1*] -> [t2*]
    traps on null

The polymorphic stack after unreachable can materialize an unknown type for the callee funcref, but what would the $t of the typing rule be then? And therefore what gets pushed after call_ref?

In the above example, the only legal $t is (type (func (result f32 f32))), and therefore call_ref would push two f32's, which would cause the module to fail validation. This assumes that a legal substitution in that rule must be a declared type in the module the call_ref is in.

If the $t can be synthesized to be any theoretical function type, then the results of a call_ref with an unknown input essentially yield a new polymorphic stack.

Knowing the maximum depth of locals

Since the number of locals can now change at runtime, it might be valuable for low-latency consumers (e.g. baseline compilers) to know the total number of stack slots they need to reserve for locals. Otherwise, for at least JSC, we will need to do two-phase generation because we don't know the final stack layout until we are done parsing. For what it's worth, I haven't thought about this too hard though, so it could be there's actually a simple solution on the consumer side, which removes the need to know everything up front.

It was also brought up that some consumers may want to know the types of each slot. It's not clear to me how that would work though, since the indices could change type per block. It would be interesting to know if some consumer cares about this, so I thought I would bring it up here too.

br_on_null vs br_on_non_null?

When I started to work on this in Binaryen, there was some confusion. I had some questions on the design that I don't see discussed, so I'm opening this issue.

The spec says:

br_on_null $l : [t* (ref null ht)] -> [t* (ref ht)]
  iff $l : [t*]
branches to $l on null, otherwise returns operand as non-null

That is, if the input is null we branch, sending values to the target as necessary. If it is not null we return it as not null, and leave the other values on the stack.
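For concreteness, here is a small usage sketch of the current br_on_null (my own illustration, reusing the $i32-i32 type from the Overview example); it also shows the t* values being threaded past the reference:

(module
  (type $i32-i32 (func (param i32) (result i32)))
  (func $call-or-zero (param $r (ref null $i32-i32)) (result i32)
    (block $done (result i32)
      (block $null-case (result i32)
        (i32.const 41)            ;; extra stack value, the t* in the typing rule
        (local.get $r)
        (br_on_null $null-case)   ;; null: branch, taking the i32 along
        (br $done (call_ref))     ;; non-null: [i32 (ref $i32-i32)] remains; call through it
      )
      (drop)                      ;; null case: discard the unused argument
      (i32.const 0)
    )
  )
)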

I can see how this makes sense, but I wonder if we've considered the opposite, br_on_non_null? That could be

br_on_non_null $l : [(ref null ht)] -> [(ref null ht)]
  iff $l : [(ref ht)]
branches to $l on non-null, otherwise returns operand

Note how this would match br_on_func, br_on_data, br_on_i31 in the GC proposal: they all get an input, check if it can be converted to something more specific, and if so send it on the branch, and otherwise leave it on the stack.

I think matching the GC proposal makes sense to do for consistency and uniformity - making all br_on_* behave as similarly as possible.

Alternatively, we can achieve consistency by flipping the GC proposal ones, so we'd have br_on_non_func etc. That also makes sense in a way, as it would be like ref.as_func: return the refined value. So ref.as_* traps if it can't be refined, and br_on_non_* would branch in that case. (Though personally I think the smaller change, in the previous paragraph, is nicer.)

Are there code size or other concerns here that motivated the current design?

Progress toward phase 4

I'd like to start a conversation around the state of this proposal and what steps are missing to bring it to Phase 4.

This is inspired by WebAssembly/tail-call#15, which (seems to me) helped to focus the work of multiple actors towards making progress.

Default/init values in JS-API for globals and tables

Currently, there is a test in the SM codebase:

    let g = new WebAssembly.Global({value: "externref"});
    assertEq(g.value, null);

With changes in how the init parameter works in the object constructors (and the grow method, and maybe the set one), we are treating externref and funcref differently. The DefaultValues for these two are different. If I read the JS-API spec correctly, the externref default value is undefined, and the funcref one is null.

What is the right value for g.value with the current spec? Will the function-references proposal change the behavior?

Runtime checks of call_ref

In the overview it says:

efficient indirect function calls without runtime checks

What does that mean in practice? What happens if the count and types of the values on the stack do not match the signature of the function being called?
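For context, here is a sketch of the contrast being referred to; this is my own illustration based on standard call_indirect semantics, not text from the overview:

(module
  (type $i32-i32 (func (param i32) (result i32)))
  (table 1 funcref)
  (func $via-table (param $i i32) (result i32)
    ;; call_indirect must check at runtime that the table entry is non-null
    ;; and has signature $i32-i32; a mismatch traps
    (call_indirect (type $i32-i32) (i32.const 7) (local.get $i))
  )
  (func $via-ref (param $f (ref $i32-i32)) (result i32)
    ;; with a typed, non-nullable reference, validation already guarantees the
    ;; signature (and non-nullness), so no runtime check is needed before the call
    (call_ref (i32.const 7) (local.get $f))
  )
)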

Are inline heap types allowed in the text format?

Is this valid?

(module
  (func $foo (param (ref null (func (param i32)))))
)

Or would it have to be written like this:

(module
  (type $takes_i32 (func (param i32)))
  (func $foo (param (ref null $takes_i32)))
)

The abstract syntax given in the overview defines heaptype as heaptype ::= <typeidx> | func | extern, which would imply that the first option is invalid, but I'm not sure if that's meant to apply to the concrete text format as well.

cc @kripken

Clarification request: initializers for tables with non-nullable type and initial_size == 0

A question that came up is whether initializers are required for tables that have a non-nullable type (which, generally, causes them to require an initializer) but initial_size zero (which makes any initializer useless).

My reading of the current text of https://github.com/WebAssembly/function-references/blob/main/proposals/function-references/Overview.md#tables is that initializers are required regardless of initial_size when the type is non-nullable. Is that interpretation correct? If yes, do we want to keep this rule, or allow the special case of skipping the initializer when the initial_size is 0?

(Personally, I don't feel strongly about this either way; just asking to clarify.)

`func.bind` and iso-recursion

I am having trouble wrapping my head around the intended type rules for func.bind. Currently, func.bind includes a type annotation (call this $ft) that should reference a signature type. The result of the instruction should be ref $ft. The top of the stack should contain a function reference (let's suppose its type is ref null $et). The difference in parameter arity between $et and $ft is the number of arguments that will be bound by this instruction (call this diff).

I am trying to understand how to relate the "residual" type ($et - diff) to $ft.

Suppose diff != 0

Issues come to mind:

  • the residual type is not part of a recursion group, so I am not sure how to canonicalize it
  • it doesn't have an explicit supertype chain--should it have a supertype chain of similar residual types?

Confusion on non-nullable function references

Regular function references cannot be null. Yet, all value types have a default value, and for reference types this value is null. This means that if a value of a regular function reference type is null, it is an error. As such, the default value of a regular function reference could be a function that causes an unconditional trap (similar to unreachable), in the vein of the never type ! in Rust, since a regular function reference should never hold the default value (null). This avoids the need for the let instruction while (I think) preserving semantics.

Am I correct in thinking this? Sorry if this has already been discussed or if I have a misunderstanding.

Fix table-elem desugaring

The desugaring is not inserting the element type into the elem definition.

Also, 1.0 elem syntax with a tableuse but no type has fallen through the cracks.

@CosineP, @dhil, FYI.

Drop null casts?

With the latest generalisations to casts in the GC proposal, the instructions ref.as_non_null, br_on_null, and br_on_non_null from this proposal are actually (more or less) subsumed:

(ref.as_non_null) == (ref.cast <ht>)
(br_on_non_null $l) == (br_on_cast_fail $l null bot<ht>) (drop)
(br_on_null $l) ~~ (br_on_cast $l' null bot<ht>)

where <ht> is the heap type of the respective operand, and bot<ht> the bottom type of the respective hierarchy. In the case of br_on_null, however, the label $l needs to have a different type ([(ref null bot<ht>)] instead of []).

Given this, we could remove these instructions without much loss.

Tables with non-null element type

Currently, table types require a defaultable (nullable) element type, in order to be able to initialise a table.

We should probably lift that restriction and extend table definitions with an explicit initialisation value in the form of a constant expression, which is optional but required for non-defaultable element types.
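As a sketch of what that could look like in the text format (the concrete syntax here is a guess for illustration only):

(module
  (type $f (func))
  (func $dummy (type $f))
  ;; hypothetical: a table with a non-defaultable element type carries an
  ;; explicit initialisation value given as a constant expression
  (table 10 (ref $f) (ref.func $dummy))
)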

Interpreter issues with func.bind

Related to #11 and #27.

Because there is no type annotation for the type of the incoming function argument to func.bind, the runtime representation of a function must include the signature of the function in order for the interpreter to compute how many stack arguments are bound. It must also annotate the new closure with the signature for future func.bind instructions.

It gets hairier with subtyping. Here, the interpreter must compute the new signature from the signature of the incoming function, since it may differ from the signature specified in the immediate (i.e. be a subtype), and then store that signature on the newly bound closure, since that amounts to RTT that might be used when storing the function into a table.

Issues with the JavaScript spec tests

I've run the current version of the JavaScript tests in this repo in V8, and encountered a couple of issues.

(1) The (generated?) function assert_return() has a bug in the "ref.null" case, where it treats actual as a scalar value, but it's actually an array (see other cases in the same switch for comparison). In other words, in the snippet:

case "ref.null":
        if (actual !== null) {
          throw new Error("Wasm null return value expected, got " + actual);
        };

both occurrences of actual should be actual[i].
This makes the following tests fail:

  • ref_null
  • select
  • table

(2) There are a few tests that require global.get $non-imported-constant-global to be a "constant expression" that can be used to initialize other globals, but that's a change we made in the GC proposal, not here. (It's also not a big deal for any implementations that are planning to ship function-references and GC at the same time.) This affects the following tests:

  • data
  • elem
  • global

(3) The test type-equivalence asserts that (type $t (func (result (ref $t)))) is an invalid type definition, i.e. a type cannot refer to itself. However, if I'm not mistaken, we have decided that standalone type definitions are interpreted as being in an implicit single-entry recgroup, which means that they can refer to themselves.

Should ref.as_non_null validate when taking an argument of type (ref null)?

Since (ref null) <: (ref null $t), it is legal for ref.as_non_null to be applied on an argument of type (ref null). However, this would later create an error during validation since, due to the typing rule of ref.as_non_null, the stack would need to contain a value of type (ref $t) for some unknown $t.
I understand that this is a degenerate case, but what are its semantics?

[docs] Typo in Overview.md ($x is referred to as $i).

In Overview.md (Typed Function References for WebAssembly), the first code example has a typo in the $inc function. It takes a (single) param, named $x, but uses local.get on a param named $i.

(type $i32-i32 (func (param i32) (result i32)))

(func $hof (param $f (ref $i32-i32)) (result i32)
  (i32.add (i32.const 10) (call_ref (i32.const 42) (local.get $f)))
)

(func $inc (param $x i32) (result i32)      ;; here it's $x
  (i32.add (local.get $i) (i32.const 1))    ;; here it's $i
)

(func $caller (result i32)
  (call $hof (ref.func $inc))
)

Heap type encoding

Could the heap type encoding be changed to s32? It would make implementations slightly simpler, and it seems unlikely that a module would contain more than 2^31 type definitions.

Storing function references

What are the options for actually storing a function reference in memory?
Local variables, heap memory, etc.?

func.bind and call_ref on generic funcref values

(This is vaguely related to #11 and also came up whilst chatting with Luke about funcref and func.bind.)

It's possible to have safe func.bind and call_ref without typed functions, needing only funcref. The mechanism would be the same as for call_indirect: there's a runtime signature check. That's overhead, but we optimize it pretty well for call_indirect.

func.bind_checked $t $u would check that the funcptr argument (on the stack) has the type $t; $u would be the desired type of the bound function, as for the proposed func.bind.

call_ref_checked $t would similarly check that the funcptr argument (on the stack) has the type $t, in the manner of call_indirect.
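Spelled out in the style of the typing rules used elsewhere in this proposal (my own sketch of what the prose above seems to imply, not text from the proposal):

call_ref_checked $t : [t1* funcref] -> [t2*]
    iff $t = [t1*] -> [t2*]
    traps on null or if the operand's runtime signature is not $t

func.bind_checked $t $u : [t1^n funcref] -> [(ref $u)]
    iff $t = [t1^n t1'*] -> [t2*]
    and $u = [t1'*] -> [t2*]
    traps on null or if the operand's runtime signature is not $t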

I'm curious whether this is worth pursuing, either as an MVP of the MVP of typed functions or simply because it is useful functionality when compiling some source languages. After all, there is no proposed downcast from funcref to a typed function, yet sometimes a funcref slot may be the best way to store something. I'm not sure what the answer is.

Heap types as table types?

Before this proposal, you can specify a table as having externref or funcref. In the summary of this proposal, it includes the text:

Add a new form of typed reference type ref $t and a nullable variant ref (null $t), where $t is a type index; can be used as both a value type or an element type for tables

However, the spec does not appear to have been updated to include this extension. If the summary is still accurate, then this change is a TODO :)
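For concreteness, the kind of declaration the quoted summary seems to promise is something like this (hedged; the missing spec text is exactly the point of this issue):

(module
  (type $t (func (param i32)))
  ;; a table whose element type is a (nullable) typed function reference
  ;; rather than plain funcref
  (table 10 (ref null $t))
)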

Validating local.get in nested blocks with unreachable

Should the following module validate?

(module
  (func
    (local (ref 0))
    unreachable
    block
      local.get 0
    end
  )
)

The declarative rule for unreachable indicates that it is valid for any instruction type, and therefore can set any local in the context, which would allow this module to validate.

The algorithm in the appendix however only checks the top control item for unreachable (ctrls[0].unreachable), which would cause this module to not validate, as there are no other local.set instructions in the module.

Put another way, is 'local.set polymorphism' inherited to nested blocks?
