webassembly / reference-types
Proposal for adding basic reference types (anyref)
Home Page: https://webassembly.github.io/reference-types/
License: Other
This proposal introduces a typed select instruction, which takes reference types and also carries a return type. The return type is written as a vector of valtype. Currently only vectors of length 1 are accepted by the V8 validation. I guess the reason it is a vector is the multi-value proposal, right?
Is there a reason this has a different representation from that of the block, loop, and if types when there are multiple types? blocktype in the multi-value proposal is extended to a type index in the presence of multiple values, and this will apply to other instructions such as loops and ifs. Wouldn't it be better for select to be consistent?
cc @tlively
As mentioned at the in-person meeting, this makes me uncomfortable.
I think we can get better functionality and similar performance without requiring embedders to use any specific implementation strategy.
Rather than mandating reference equality, the core spec could say ref.eq calls an embedding-specific equality function. The JS/web spec would define that function to be strict equals. Implementations would of course then be free to inline the strict-equals function in the same way they do for JS.
This has concrete advantages over the current spec: it allows strings (and all other JS types/DOM objects) to be compared. It also removes the need for eqref altogether, which requires extra type checks/marshalling on the JS/wasm boundary.
The latest spec tests seem to produce invalid JS code, e.g. in select.wast:
reference-types/test/core/select.wast
Line 285 in 2719ec3
reference-types/test/core/select.wast
Line 288 in 2719ec3
ref.func is unknown to JS (ReferenceError: ref is not defined) and therefore produces an error. Instead, the result should be compared to the $dummy function reference, which needs to be made accessible.
Similarly in br_table.wast:
reference-types/test/core/br_table.wast
Lines 1499 to 1510 in 2719ec3
Also, the join-funcref test should maybe be using anyref as its type instead of funcref:
reference-types/test/core/select.wast
Lines 45 to 51 in 2719ec3
This makes writing transformation passes in the toolchain somewhat tricky. I'm not necessarily arguing for allowing it at this point, but I don't precisely remember what the reason for disallowing it was.
Blocked on bulk ops proposal to update table.
Since tables and globals (as far as I can tell) are not shared in the current threading proposal, are reference-types (and eventual gc-types) intended to always stay in a single wasm instance?
test/core/ref_func.wast seems to think the table comes first, before the sig, e.g.:
reference-types/test/core/ref_func.wast
Line 33 in 1e05d9b
(func (export "call-g") (param $x i32) (result i32)
(table.set $t (i32.const 0) (ref.func $g))
(call_indirect $t (param i32) (result i32) (local.get $x) (i32.const 0))
)
Whereas the spec test says it comes after:
call_indirect (type $t) $x : [t1* i32] -> [t2*]
iff $t = [t1*] -> [t2*]
and $x : table t'
and t' <: funcref
wabt also seems to have implemented the latter: https://github.com/WebAssembly/wabt/blob/a147d92575d386ef45c75a3e492c8ca4d33a3bbc/test/parse/expr/reference-types-call-indirect.txt#L13
I hope the tests, rather than the spec, can be updated, since adding an extra optional parameter seems much simpler to implement in the parser.
Before we removed anyref, the ref.is_null instruction had a canonical type:
ref.is_null : [anyref] -> [i32]
One piece of the fallout from removing anyref was that this no longer worked. In order to avoid a dependency on the outcome of the wider discussion opened in WebAssembly/function-references#27, I added a type annotation on the instruction, so that it became
ref.is_null <reftype> : [<reftype>] -> [i32]
(with the understanding that the <reftype> would later be refined to a <heaptype> as per the typed (function) references proposal).
However, given that the discussion on WebAssembly/function-references#27 seems to show a common sentiment to avoid redundant type annotations -- especially considering the many more affected instructions added in something like the GC proposal -- it would be unfortunate if ref.is_null became an outlier. And having adapted all the tests, I can say that it is quite annoying in practice, too (ref.null is tedious enough already).
So I propose removing the annotation and changing the instruction to
ref.is_null : [<reftype>] -> [i32]
such that a linear validator simply has to check that there is some <reftype> on the stack.
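A minimal sketch of what that unannotated check could look like in a linear validator (the type names and the model are assumptions for illustration, not spec text):

```python
# Hypothetical model of one validator step for an unannotated ref.is_null:
# pop a single operand and accept it if it is *some* reference type.
REF_TYPES = {"funcref", "externref"}  # assumed set of reference types

def validate_ref_is_null(stack):
    if not stack:
        raise TypeError("ref.is_null: value stack is empty")
    t = stack.pop()
    # No annotation to match against; any reference type (or the bottom
    # type arising from unreachable code) is acceptable.
    if t not in REF_TYPES and t != "bot":
        raise TypeError(f"ref.is_null: expected a reference type, got {t}")
    stack.append("i32")
    return stack

assert validate_ref_is_null(["i32", "funcref"]) == ["i32", "i32"]
```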
Thoughts?
Since table.grow will need to take a default argument to use for initialization, table.grow(0) is not a plausible mechanism for obtaining a table's length, so we should include a mechanism for that. Since it's memory.size (née current_memory), let it be table.size.
Table stuff in the JS API needs to be reworded to account for general references.
Also, table constructor and grow method need optional init argument.
(This idea came up after yesterday's discussion about the GC extension. I have tried to describe it here in a self-contained manner, but let me know if there are any terms I forgot to define or motivations I forgot to provide.)
Having funcref be a subtype of anyref forces the two to have the same register-level representation. Yet there are good reasons why an engine might want to represent a function reference differently from an arbitrary reference. For example, function references might always be an assembly-code pointer paired with a module-instance pointer, effectively representing the assembly code compiled from a wasm module closed over the global state of the specific instance the function reference was created from. If so, it might make sense for an engine to use a fat pointer for a function reference. But if funcref is a subtype of anyref, and if it overall makes sense for arbitrary references to be implemented with normal-width pointers, then that forces function references to be implemented with normal-width pointers as well, causing an otherwise-avoidable memory indirection in every indirect function call.
Regardless of the reason, by making funcref not a subtype of anyref, we give engines the flexibility to represent these two types differently (including the option to represent them the same). Instead of subtyping, we could have a convert instruction that could take a function reference and convert it into an anyref representation, or more generally could convert between "convertible" types. The main benefit of subtyping over conversion in a low-level type system is its behavior with respect to variance, such as co/contravariance of function types, but I see no such application for funcref and anyref. And in the worst case, we could always make funcref a subtype of anyref later if a compelling need arises.
Intuitively I understood anyref as a new value type. Therefore I expected anyref to be allowed in signatures, locals, and globals. The current proposal does not mention globals though. I wonder if we should support anyref globals, just for symmetry to the other value types.
I am missing a design rationale for this proposal in the same vein as the design rationale documents for the Wasm specification that can be found here: https://webassembly.org/docs/rationale/
One of the questions I had is why this document proposes to add the questionable null reference and all the infrastructure (is_null) around it. Coming from a C++ background, I find this quite confusing, because semantically references should be assumed to always point to something, i.e. they are non-nullable pointers; the same concept exists in Rust. Pointers in these languages, on the other hand, are nullable. There is also this interesting talk in which the original designer of the null reference refers to it as his billion-dollar mistake: https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare/
This makes people wonder why we would want to repeat it in Wasm.
Note that I am not completely against adding nullable references to the Wasm spec; I just really need a rationale for why we cannot, or do not want to, go without them.
There will be many more questions like this for upcoming Wasm proposals. Maybe we should require a dedicated rationale doc or section for all of them to quickly explain certain design decisions.
Since this change to bulk memory, bounds checks are now performed before fill operations:
WebAssembly/bulk-memory-operations#123
However table_fill.wast still includes tests that depend on partial execution of the fill.
As part of the effort to drive both this proposal and the bulk memory proposal toward shipping status, let's nail down the opcode encodings. (The bulk memory proposal depends on what we choose for ref.null and ref.func, since those are used to express passive element segments.)
The spec interpreter in this repo has some TODO comments around the opcode encodings, and some opcodes are missing from the interpreter at present, and all proposed opcodes are precious single-byte ones.
From the interpreter in this repo we have:
0x25 == table.get
0x26 == table.set
0xd0 == ref.null
0xd1 == ref.is_null
0xd2 == ref.func
From the bulk memory proposal we have these proposed codes:
0xfc 0x08 == memory.init
0xfc 0x09 == data.drop
0xfc 0x0a == memory.copy
0xfc 0x0b == memory.fill
0xfc 0x0c == table.init
0xfc 0x0d == elem.drop
0xfc 0x0e == table.copy
In addition we need opcodes in this proposal for table.grow, table.size, and (possibly) table.fill.
We are in somewhat short supply of single-byte opcodes, so I propose that we (a) change the encoding of ref.func, since it is not likely to be a very common opcode, and (b) allocate prefixed opcodes also for the three table operations mentioned above, yielding the following table for the present proposal:
0x25 == table.get
0x26 == table.set
0xd0 == ref.null
0xd1 == ref.is_null
0xfc 0x0f == table.grow
0xfc 0x10 == table.size
0xfc 0x11 == table.fill
0xfc 0x20 == ref.func
with the idea that 0xfc 0x20 can be the start of the group for multi-byte gc/reftypes operations, and 0xd0 remains the start of the group for single-byte gc operations.
@rossberg @binji @lukewagner @titzer, opinions?
Nothing in here is meant to change the behavior of the proposal. It is just a suggestion on how to formalize the existing behavior.
The extern value cache doesn't seem to serve a purpose.
One purpose I can imagine is as a formalization device: it associates each JS object with a natural number that represents it. But that purpose seems addressable by letting externaddr in the core spec be an abstract set of values specified by the embedder, in which case ref.extern could take a JS object as its argument. (That's a matter of taste, though, so I'm happy to leave it as is.)
But the extern value cache also seems to be ensuring that every JS object has at most one natural number associated with it as well as determinizing what that number is. Since this number has no way of leaking into either wasm or JS (or at least I would think it shouldn't), this seems unnecessary.
Surprise: using a sentinel value (-1) for memory.grow is coming back to bite us. table.grow should preferably behave analogously, and the proposal currently says that table.grow also returns -1 (i.e., 2^32-1) in case of error. However, that does not really make sense, because we allow table sizes to span the entire u32 value range.
I can see these possible options:
1. Leave as is. Pro: symmetry with memory.grow; cons: quite a wart and sets a bad precedent.
2. Keep -1 but disallow 2^32-1 as a table size. Pro: symmetry with memory.grow; cons: weird discontinuity, arguably backwards as a "fix", technically a breaking change.
3. Use zero as a sentinel value. Pro: less arbitrary, and in practice you never want to grow by 0; cons: still a bit hacky.
4. Always return the previous size. Pro: no sentinel; cons: checking for error (by comparing with table.size afterwards) requires more work and is potentially racy in a future with shared tables.
5. Return a success status instead of the previous size. Pro: simple and simplest to use; cons: getting the old size would require a separate call to table.size, which again is potentially racy.
6. Use multiple results: success status + old size. Pro: serves all purposes; cons: more complex, creates a dependency on the multi-value proposal.
7. Return an i64. Pro: solves the problem, but only for the "wasm32" equivalent of tables; cons: does not avoid a sentinel, and coherence would suggest making table sizes i64 everywhere.
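To make the wart concrete, here is a toy model (all names assumed, not spec text) showing that the -1 sentinel collides with a legal table size once sizes may span the full u32 range:

```python
MAX_U32 = 2**32 - 1

def table_grow(current_size, delta, limit=MAX_U32):
    """Toy model of table.grow with the -1 sentinel (option 1 above)."""
    new_size = current_size + delta
    if new_size > limit:
        return MAX_U32       # failure sentinel: -1 interpreted as u32
    return current_size      # success: previous size

# Ambiguity: a table may legitimately already have size 2^32 - 1, so a
# successful grow-by-0 from that size is indistinguishable from failure.
assert table_grow(MAX_U32, 0) == MAX_U32   # success, returns old size
assert table_grow(0, MAX_U32 + 1) == MAX_U32  # failure, same value
```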
@binji, @lukewagner, @lars-t-hansen, @titzer, I am probably leaning towards option 5 (Boolean result), but would like to hear your opinions. Is there a strong use case for getting the old size atomically? In general, I would be inclined to avoid sentinel values.
While the current proposal allows them to be null, wouldn't it be better to have a set of refs that can be null and a set of refs that cannot, so that call_indirect on a func ref doesn't require a null check, etc.?
This proposal introduces multiple tables, a new thing. This means that some existing operations must be expanded (call_indirect needs to carry an optional table index, although it has space for this) and that some proposed operations must take the multi-table case into account (table.copy, table.init). We should be sure to harmonize with the bulk memory proposal now.
The bulk table operations have a (misnamed) memory varu32 operand that must currently be zero and can be repurposed as a flags field /or/ as a table index field, see here, depending on how we like it. For table.copy there can be two table indices, so a flags field is probably more or less inevitable for that instruction.
I propose that we uniformly use a varu32 flags field for all the table operations: table.get, table.set, table.grow, table.fill, table.size, table.copy, and table.init (table.drop is arguably misnamed, but that's a discussion for elsewhere), and that this field is either zero, indicating default operands (table zero for every instruction except table.copy; src=0 and dest=0 for table.copy), or has a bit indicating that a table index follows the flag word, or, for table.copy, that two indices follow, for dest and source.
Normally, having a flags field will add zero overhead when using table zero, and one byte of overhead when using other tables.
For prototype code I'm writing for Firefox I've gone with the flag value 0x04 to signify the presence of table indices, to fit in with the other flags that we currently use for memories and tables (0x00=Default, 0x01=HasMaximum, 0x02=IsShared).
@rossberg, opinions? We don't have to use a flags field everywhere but we probably have to use it for the bulk table operations, which strongly suggests using one for table.grow and table.fill and table.size; table.get and table.set and call_indirect might reasonably be considered a separate class of instructions. The cost of using a flags field there seems slight. We can choose not to do it, but then any future extensions that could have used a flag will require new instructions or out-of-band encodings in the table index.
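A sketch of how such a flags-based decoding could look for table.copy (the 0x04 flag value is borrowed from the Firefox prototype mentioned above; the reader class and everything else are assumptions for illustration):

```python
HAS_TABLE_INDEX = 0x04  # assumed flag bit, per the prototype above

class Reader:
    """Minimal byte reader with an unsigned LEB128 decoder."""
    def __init__(self, data):
        self.data, self.pos = data, 0
    def read_varu32(self):
        result = shift = 0
        while True:
            byte = self.data[self.pos]; self.pos += 1
            result |= (byte & 0x7F) << shift
            if not byte & 0x80:
                return result
            shift += 7

def decode_table_copy(reader):
    """Decode a hypothetical table.copy immediate with a varu32 flags field."""
    flags = reader.read_varu32()
    if flags & HAS_TABLE_INDEX:
        dest = reader.read_varu32()
        src = reader.read_varu32()
    else:
        dest = src = 0   # flags == 0: default operands, table zero for both
    return ("table.copy", dest, src)

assert decode_table_copy(Reader(bytes([0x00]))) == ("table.copy", 0, 0)
assert decode_table_copy(Reader(bytes([0x04, 0x02, 0x05]))) == ("table.copy", 2, 5)
```

Note how the zero-flags case costs exactly one byte, matching the "zero overhead for table zero" claim above.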
The envisioned table.grow instruction takes a required fill value as a second argument. We should similarly change WebAssembly.Table.prototype.grow to accept a fill value, indeed for non-nullable element types we must have a fill value. Here are some notes about that.
Backward compatibility concerns when the table is table-of-anyfunc:
For table-of-anyref it is probably most correct if the default value for the optional second argument is null, not undefined, since null is a value that is in wasm, unlike undefined. (This matters only because undefined is representable as an anyref value and is thus a candidate at least in principle.) FWIW, the table.grow instruction cannot talk about undefined values without getting them from the host, but it can synthesize null. A JS caller to WebAssembly.Table.prototype.grow() can pass an undefined value explicitly as the second argument to initialize new slots with that value.
Long-term it's inevitable that the behavior of W.T.p.grow depends on the table type anyway; for non-nullable element types there can be no default, for example. We could choose now to require the second argument also for anyref tables, we would just have to leave anyfunc tables as a special case.
With subtyping, and lacking sufficient type annotations, some operations may require the validation algorithm to compute the least upper bound (lub) or greatest lower bound (glb) of two types to accurately infer an output or input type, respectively. In current Wasm, this comes up with two instructions:
select: the result type is the lub of the two operand types.
br_table: the operand type is the glb of the label types.
(Similar issues would come up with ops like dup or pick, which might also require a glb of the use edges when the input type is not known a priori, i.e., in unreachable code.)
While lub and glb are easy to compute with the tiny subtyping lattice introduced by this proposal itself, it will not stay that way with future reference extensions (e.g., typed functions or GC types), which will make it both more complex and more costly. In accordance with the design goals for Wasm validation, we should make sure to avoid the need for computing lubs/glbs altogether.
Possible Solution:
For select, the only option seems to be introducing a new type-annotated version and restricting the pre-existing unannotated version to numeric types for backwards compatibility (fortunately, only trivial subtyping is available on numeric types).
Without going into technical details, the glb cases (br_table, dup, pick) can most easily be avoided by adding a bottom type to the type system (a least type in the subtype lattice). Effectively, this is already present in the MVP validation algorithm, to type unreachable stack slots; promoting it to a proper type is a natural generalisation in the presence of subtyping. (Note that this type need not be expressible in programs; it's sufficient if it exists in the typing rules.)
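For illustration, here is a toy lattice (the type names and edges are assumed for the example, not the proposal's exact hierarchy) showing how an explicit bottom type gives every pair of types a glb:

```python
# Toy subtype lattice with an explicit bottom type:
#   bot <: funcref <: anyref   and   bot <: nullref <: anyref
# SUPERS[t] is the set of supertypes of t, including t itself.
SUPERS = {
    "bot":     {"bot", "funcref", "nullref", "anyref"},
    "funcref": {"funcref", "anyref"},
    "nullref": {"nullref", "anyref"},
    "anyref":  {"anyref"},
}

def lub(t, u):
    # Least upper bound: the common supertype that sits below all others,
    # i.e. the one with the most supertypes of its own.
    common = SUPERS[t] & SUPERS[u]
    return max(common, key=lambda s: len(SUPERS[s]))

def glb(t, u):
    # Greatest lower bound: the highest type that is below both,
    # i.e. the common subtype with the fewest supertypes.
    lowers = [v for v in SUPERS if t in SUPERS[v] and u in SUPERS[v]]
    return min(lowers, key=lambda s: len(SUPERS[s]))

assert lub("funcref", "nullref") == "anyref"   # what select would infer
assert glb("funcref", "nullref") == "bot"      # what br_table would need
```

Even in this four-point lattice the computation is a set intersection plus a scan; with recursive GC types the analogous operation becomes genuinely costly, which is the motivation above for avoiding lubs/glbs in validation altogether.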
Currently, local.tee validation requires C.locals[x] = t and br_if validation requires C.labels[l] = [t?]. This causes the following two examples to fail validation in the spec interpreter when intuitively they seem valid:
(module
(func (param $p funcref)
(local $x anyref)
(local $y funcref)
local.get $p
local.tee $x
local.set $y
)
)
(module
(func (param $p funcref)
(block $b (result anyref)
(block $c (result funcref)
local.get $p
(br_if $b (i32.const 1))
br $c
)
)
drop
)
)
If instead the = were replaced with <: in the above-mentioned validation rules, then I think these examples would validate.
Trying to think about whether this actually matters: you could imagine that it would be useful to have the property that:
(local.set $local1 (local.get $x))
(local.set $local2 (local.get $x))
was always equivalent to:
(local.tee $local1 (local.get $x))
local.set $local2
so that a size-optimizer could simply recognize and replace this pattern.
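The suggested change can be sketched as the difference between an exact-equality check and a subtype check (a minimal assumed model, with only funcref <: anyref):

```python
# Minimal model: funcref <: anyref, plus reflexivity.
def is_subtype(t, u):
    return t == u or (t, u) == ("funcref", "anyref")

def validate_local_tee(stack_top, local_type, use_subtyping):
    # use_subtyping=False models the current rule C.locals[x] = t;
    # use_subtyping=True models replacing = with <: as suggested above.
    if use_subtyping:
        return is_subtype(stack_top, local_type)
    return stack_top == local_type

# Storing a funcref into an anyref local, as in the first example above:
assert not validate_local_tee("funcref", "anyref", use_subtyping=False)
assert validate_local_tee("funcref", "anyref", use_subtyping=True)
```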
With the recent change to introduce the bottom type, the validation rules for br_table changed. However, br_if did not get changed accordingly.
Consider the following two functions. The first uses br_table, with one additional block to simulate the fall-through of br_if; the second uses br_if directly.
(func (result f32)
(block (result f32)
(block (result i32)
unreachable
br_table 0 1 1
)
f32.convert_i32_u
)
)
(func (result f32)
(block (result f32)
unreachable
br_if 0
f32.convert_i32_u
)
)
The function with br_table validates, because the result type of the br_table instruction is BOT. The function with br_if, however, does not validate: for br_if, the type of the fall-through case is not considered when its result type is calculated.
Should we adjust the validation of br_if to match the validation of br_table?
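One way to state the adjustment being asked about (an assumed model, not spec text): let BOT, as produced by unreachable code, satisfy any expected operand or fall-through type for br_if, just as it already does for br_table.

```python
BOT = "bot"

def br_if_operand_ok(stack_top, expected):
    """Sketch of the adjusted rule: the br_if operand (and hence the
    fall-through result) type-checks if it matches the expected type
    or is BOT, i.e. comes from unreachable code."""
    return stack_top == expected or stack_top == BOT

# Mirrors the two functions above: after `unreachable`, the stack is
# polymorphic (BOT), so the br_if in the second function would validate...
assert br_if_operand_ok(BOT, "i32")
# ...while a genuine type mismatch in reachable code still fails.
assert not br_if_operand_ok("i64", "i32")
```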
There are merge conflicts between this repo and the spec master branch:
CONFLICT (content): Merge conflict in test/core/select.wast
CONFLICT (content): Merge conflict in test/core/linking.wast
CONFLICT (content): Merge conflict in test/core/imports.wast
CONFLICT (content): Merge conflict in test/core/globals.wast
CONFLICT (content): Merge conflict in test/core/exports.wast
CONFLICT (content): Merge conflict in test/core/br_table.wast
CONFLICT (content): Merge conflict in test/core/binary.wast
CONFLICT (content): Merge conflict in interpreter/valid/valid.ml
CONFLICT (content): Merge conflict in interpreter/text/parser.mly
CONFLICT (content): Merge conflict in interpreter/text/lexer.mll
CONFLICT (content): Merge conflict in interpreter/text/arrange.ml
CONFLICT (content): Merge conflict in interpreter/syntax/types.ml
CONFLICT (content): Merge conflict in interpreter/syntax/operators.ml
CONFLICT (content): Merge conflict in interpreter/syntax/ast.ml
CONFLICT (content): Merge conflict in interpreter/script/js.ml
CONFLICT (content): Merge conflict in interpreter/runtime/memory.mli
CONFLICT (content): Merge conflict in interpreter/runtime/memory.ml
CONFLICT (content): Merge conflict in interpreter/exec/eval_numeric.ml
CONFLICT (content): Merge conflict in interpreter/exec/eval.ml
CONFLICT (content): Merge conflict in interpreter/binary/encode.ml
CONFLICT (content): Merge conflict in interpreter/binary/decode.ml
CONFLICT (content): Merge conflict in interpreter/README.md
This prevents the testsuite mirror's update script from updating the reference-types tests.
As in any interlanguage mapping, no mapping is perfect but some are more useful than others.
NOTE: It may already be too late to make this change, even if we agree it would have been a good idea.
Wasm null is used as the uninitialized value for values of nullable types. For nullable types, null will be the unique platform-supported value to indicate that the expected value is absent. This corresponds most closely to JS undefined rather than JS null. In the ES specification itself, undefined is consistently treated as the indicator of absence:
var x; // x === undefined
let y; // y === undefined
function foo(x) { console.log(x); }
foo(); // logs undefined: unbound parameters are bound to undefined
foo(); // also evaluates to undefined: a function with no return statement returns undefined
({}).z; // absent properties read as undefined
function bar(x = 3) { return x; }
bar(); // 3. unbound parameters use default value
bar(undefined); // 3. parameters bound to undefined act as if unbound
bar(null); // null. null does not emulate unbound
(new Map()).get("x"); // undefined indicates absence
I understand this is very late in this proposal's lifespan, so likely nothing can be changed at this point, but I'm wondering about whether move semantics were ever considered for references.
For example, in a situation where you're opening and closing a reference to a file (like in this example in the type imports proposal), you are required to have a "zombie" state where the file is closed internally, but there may still be references to the File alive. If some sort of move semantics were allowed, where you had an owning ref, and a non-owning ref, this kind of thing could be avoided.
I haven't really thought through whether this is possible without a really strong type-system (and lifetimes or something similar), so really I'm just wondering if there is prior discussion about something along these lines.
As far as I can tell, there is no link to a formatted copy of the modified JS API. Would it be possible to add this to the README?
The anyref/anyfunc types allow a function to be exported from a WASM module dynamically - as part of the return value of another exported function. Similarly for imports.
This effectively blows apart any discipline for what is exported from a WASM module. It also complicates implementation because potentially every WASM function may become accessible from JS.
The table.set operator in conjunction with indirect calling of functions from any table allows monkey-patching of functions.
With multi-table we need to worry about how many tables are enough and how the implementations align.
There are merge conflicts between this repo and the spec master branch:
Auto-merging interpreter/text/parser.mly
CONFLICT (content): Merge conflict in interpreter/text/parser.mly
Auto-merging interpreter/text/lexer.mll
CONFLICT (content): Merge conflict in interpreter/text/lexer.mll
Auto-merging interpreter/text/arrange.ml
CONFLICT (content): Merge conflict in interpreter/text/arrange.ml
Auto-merging interpreter/script/script.ml
CONFLICT (content): Merge conflict in interpreter/script/script.ml
Auto-merging interpreter/script/run.ml
CONFLICT (content): Merge conflict in interpreter/script/run.ml
Auto-merging interpreter/script/js.ml
CONFLICT (content): Merge conflict in interpreter/script/js.ml
Auto-merging interpreter/README.md
CONFLICT (content): Merge conflict in interpreter/README.md
Auto-merging document/js-api/index.bs
CONFLICT (content): Merge conflict in document/js-api/index.bs
This prevents the testsuite mirror's update script from updating the reference-types tests.
The bulk proposal adds new table instructions, while this proposal adds multiple tables. The intersection of the two proposals has been implemented and spec'ed here after rebasing this on bulk, but we also need to extend the existing tests for those bulk instructions. In particular, testing table.copy between different tables is a somewhat interesting case.
The respective tests are generated by scripts. @lars-t-hansen, would you or somebody else familiar with those scripts be willing to extend them?
In #8, ToWebAssemblyValue gained an additional error argument; the current spec only passes it at a single call site. It seems like it had a caller that passed LinkError at the time. Presumably all callers should pass a value here; otherwise the argument should be marked as optional.
Alternatively, the algorithm should always throw a TypeError; that's what it already throws when ToInt32 and friends fail.
The currently stated type of the ref.func instruction is [] -> [funcref]. I assume that with the typed functions proposal, this return type will be refined to be the actual type of the function referenced in the immediate. Is it worth mentioning this in the overview? I can't see a case where this proposal isn't future-proof, but it might be good to state it here to double-check our understanding.
I've noticed a new select variant when implementing the proposal, as well as in the Binary Encoding section of the modified spec, but it doesn't seem to be documented in the Overview, making it unclear whether it's part of the proposal or comes from elsewhere.
(Now I know it's part of the proposal, but still worth clarifying IMO.)
Blocked on bulk ops proposal to finish spec text.
Rather than continue down the road of having tables (editorial comment: arguably tables are the ugliest part of wasm) and having multiple tables, together with table.set etc., a more scalable and powerful approach would be to use managed memory.
Specifically, the array type in the MM proposal looks a lot like tables, with the advantage of being part of a holistic approach to memory.
Since we do not have MM at the moment, I would suggest refactoring multiple tables so that they would become part of the ultimate MM proposal. (May not be possible, of course.)
The proposal currently allows getting and setting elements from anyfunc tables, but provides no way of producing such values separately, which is a weird functionality gap.
We could already add the instruction ref.func $f for creating a reference to a given function $f, which is currently mentioned in the function references extension only. One caveat is that in the baseline type system we cannot yet provide it with its ideal type; we can only support
ref.func $f : [] -> [anyfunc]
However, thanks to subtyping, it would be a backwards-compatible change to refine that typing rule to return the more specific type (ref <functype-of-$f>) in later versions. So it seems safe to add it now with the weaker typing.
There are merge conflicts between this repo and the spec master branch:
CONFLICT (content): Merge conflict in document/core/appendix/properties.rst
This prevents the testsuite mirror's update script from updating the reference-types tests.
After going through the open issues, there appear to be two substantive core language issues (#55, #60), both of which can probably be postponed if we want to or if we can't find agreement quickly. There is one corner case (#40) that should probably be resolved in favor of current behavior, because sentiment in the discussion is tilting in that direction. The JS API also has a small number of mop-up issues that may or may not already be done (#9, #20, #22, #51); investigating. The remaining open issues are discussion items or feature requests that have been postponed.
The process document requires these for phase 4 (with my comments about status):
We may need to ship bulk memory at the same time, as the two proposals touch in various ways, but if anything that proposal is even closer to a shipping state.
The instruction seems quite similar to the i32.eqz instruction, so I think ref.eq_null would be more consistent than ref.is_null.
While working on the function references and GC proposals I noticed that some naming conventions become a bit arbitrary again. For example, for function references we currently have:
ref.func
func.bind
call_ref
I'd like to make at least the first two more consistent by renaming ref.func to func.ref, the idea being that all instructions producing/consuming a particular class of reference share its prefix.
That might also suggest renaming call_ref to func.call, though that may be confusing with plain calls. Not sure we can do much about calls; that ship has probably sailed. Maybe call and call_indirect should have been func.call and table.call, but then what about call_ref?
Would it be possible to post an updated version of https://webassembly.github.io/reference-types/core? (It says it was last updated May 12, 2018.) Thanks!
I'm implementing a language runtime that mixes linear-memory-allocated code with anyrefs that need to persist in a "heap"; in the case of Wasm that means stored in a table, (table 1 anyref). In case readers aren't familiar: linear memory cannot directly store anyrefs, but you can store one in a table at a particular index, then store that index in linear memory.
The tricky part is when you want to reuse old, now unused, indices so you're not growing the table size indefinitely. When that table index is "freed" by removing the anyref from that particular index, the anyref itself can potentially be GC'd (if nothing else has a reference to it) but you need some way of knowing that index can be reused now.
I couldn't find any discussions around this (apologies if I missed them!), so I'm curious what folks think the ideal approach is for handling it. The most obvious might be some sort of mark-compact-like scheme, but it's not clear how one would update the table indices stored in linear memory. It seems like you'd almost have to have another indirection: two linear-memory dictionaries, [table_index]: pointer and [pointer]: table_index, so you can change which table index your fake "pointer" points to.
Another solution might be keeping a linear memory vector of freed indices that have been reclaimed, using up those first before appending to the end of the table or resizing.
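The free-list variant from the last paragraph can be sketched as follows (a Python stand-in for the anyref table and the linear-memory vector of freed indices; all names are assumptions):

```python
class RefSlab:
    """Model of storing host references in an anyref table, recycling
    freed slots from a free list before growing the table."""
    def __init__(self):
        self.table = []   # stands in for (table $t anyref)
        self.free = []    # stands in for a linear-memory vector of freed indices

    def store(self, ref):
        """Put a host reference into the table, returning its index."""
        if self.free:
            i = self.free.pop()
            self.table[i] = ref       # models table.set on a recycled slot
        else:
            i = len(self.table)
            self.table.append(ref)    # models table.grow + table.set
        return i

    def release(self, i):
        """Drop the reference so the host GC can reclaim it; recycle the slot."""
        self.table[i] = None          # models table.set $t ... (ref.null)
        self.free.append(i)

slab = RefSlab()
a = slab.store("objA")
b = slab.store("objB")
slab.release(a)
c = slab.store("objC")
assert c == a                # the freed index is reused...
assert len(slab.table) == 2  # ...so the table did not grow
```

The trade-off versus mark-compact is that indices handed out to linear memory stay stable forever, at the cost of the table never shrinking below its high-water mark.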
Rather than freeing developers from managing tables, this proposal forces them to do so. Furthermore, given that tables represent the boundary between JS and WASM (say) neither side will have a clear way of indicating which parts of the table are actually in use -- forcing some convention such as null for empty tables.
Imagine the scenario where an application in WASM needs to create an entity that must be referenced from the table dynamically (e.g., an object). From the WASM side, there will need to be a map of available slots in the table which can be used to store the reference. But the same table also 'belongs' to the host -- who also has to find a free location within the table for a value intended to be passed to the WASM module.
Examples include string values that are computed by a wasm module, objects that are created by a wasm module to be passed back in JS callbacks, ...
What is the relation between the externref and anyref types? In the overview both are used. I thought anyref was renamed to externref, but now both terms appear in the overview. I am confused.
Following issue #31, the spec was recently changed to disallow function references in global definitions or function bodies that are not explicitly listed in the element section.
Because the element section occurs in the binary right before the code section, and therefore after all the other header sections in Wasm modules, this means that the validation of function references needs to be deferred until the element section is parsed.
Is this an intended complication, or would it help to allow arbitrary function references in the header sections and extend the allowed references in the code section to any function referenced in any of the header sections?
At the moment, the JS API spec requires that if the value argument to the WebAssembly.Global constructor is undefined then the default value for the Global's type is stored. For a Global of type anyref this probably means that the value stored is null. But this is counter to the intent of anyref, which is that it can represent (through boxing) any host type faithfully.
This is not a compatibility issue since anyref is a new thing and the problem is not observable with older types, it's just a slight complication of the algorithm for Global's constructor. I'm not sure if this means the WebIDL spec for that constructor needs to change, ie, if the current behavior is a result of some WebIDL standard behavior. If it does need to change then compatibility may be slightly at risk, though no doubt a fix can be found.
The online specification indicates that the table segment is encoded first and the element segment second, but this doesn't agree with other implementations; for example, memory.init doesn't match the specification either (?). Is this a bug in the specification, or do engines need to update which one they're reading first?