Proposal to enable importing modules at the source phase

Home Page: https://tc39.es/proposal-source-phase-imports/

License: MIT License


proposal-source-phase-imports's Introduction

Source Phase Imports

Status

Champion(s): Luca Casonato, Guy Bedford

Author(s): Luca Casonato, Guy Bedford, Nicolo Ribaudo

Stage: 3

Stage 3 reviewers: Daniel Ehrenberg, Kris Kowal

Note: Latest spec text is now maintained in the ECMA-262 upstream PR.

Motivation

For both JavaScript and WebAssembly, there is a need to be able to more closely customize the loading, linking, and execution of modules beyond the standard host execution model.

For JavaScript, creating userland loaders would require a module source type in order to share the host parsing, execution, security, and caching semantics.

For WebAssembly, imports and exports for WebAssembly modules often require custom inspection and wrapping in order to be set up correctly, which typically requires manual fetch and instantiation work that is not provided for in the current host ESM integration proposal.

Supporting syntactical module source imports as a new import phase creates a primitive that can extend the static, security and tooling benefits of modules from the ESM integration to these dynamic instantiation use cases.

Proposal

This proposal allows ES modules to import a reified representation of the compiled source of a module when the host provides such a representation:

import source x from "<specifier>";

The source module source loading phase name is added to the beginning of the ImportStatement.

Only the above form is supported - named exports and unbound declarations are not supported.

Dynamic form

Just as with static and dynamic imports, there is a need for static and dynamic access to sources, to be able to support both those sources that are required to be instantiated from source text during initialization of an application, and those that are optionally or lazily created at runtime.

The dynamic form uses an import.<phase> import call:

const x = await import.source("<specifier>");

By making the phase part of the explicit syntax, it is possible to statically distinguish between a full dynamic import and one that is only for a source (where dependencies don't need to be processed).

Optional import attributes may still be specified with the second argument in a with key, just like for dynamic import, and without conflict due to the design of phased evaluation.
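Because the phase is part of the syntax, support for the dynamic form can be probed by parsing alone. The following is a minimal sketch, assuming a host where new Function is permitted (for example, no restrictive CSP). The helper name supportsDynamicSourceImport and the specifier are made up for illustration, and nothing is loaded because the parsed function is never called.

```javascript
// Hypothetical helper: detects whether the engine can parse `import.source(...)`.
function supportsDynamicSourceImport() {
  try {
    // Parsing only. The created function is never invoked, so no module is fetched.
    new Function('return import.source("./never-loaded.wasm")');
    return true;
  } catch (error) {
    // Engines without the syntax reject the body with a SyntaxError.
    if (error instanceof SyntaxError) return false;
    // Anything else (such as a CSP-related error) is not a syntax answer.
    throw error;
  }
}

const hasSourcePhase = supportsDynamicSourceImport();
console.log("import.source syntax supported:", hasSourcePhase);
```

A static import source declaration cannot be probed this way from within the same module, since an unsupported declaration would prevent the whole module from parsing.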

Loading Phase

Module source imports can be seen to be one type of evaluation phase.

If the asset references proposal advances in future this could be seen as another type of phase representing an earlier phase of the loading process.

import asset x from "<specifier>";
await import.asset("<specifier>");

Only the source import source phase is specified by this proposal.

Defining Module Source

The object provided by the module source phase must be an object with AbstractModuleSource.prototype in its prototype chain, defined by this specification to be a minimal shared base prototype for a compiled modular resource.

In addition it defines the @@toStringTag getter returning the constructor name string corresponding to the name of the specific module source subclass, with a strong internal slot check.
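The shape described above can be modelled in plain JavaScript. This is only a sketch under stated assumptions: a WeakMap stands in for the internal slot, the classes are ordinary user-land classes rather than the real intrinsics, and the subclass is constructible here purely so the example can run.

```javascript
// Stand-in for the internal slot that real engines would use (assumption).
const moduleSourceBrands = new WeakMap();

class AbstractModuleSource {
  constructor() {
    // The base is abstract: only subclasses may be constructed in this model.
    if (new.target === AbstractModuleSource) {
      throw new TypeError("AbstractModuleSource is abstract");
    }
    moduleSourceBrands.set(this, new.target.name);
  }

  // Getter with a brand check: unbranded receivers get undefined.
  get [Symbol.toStringTag]() {
    return moduleSourceBrands.get(this);
  }
}

// Models a concrete module source such as the JS ModuleSource.
class ModuleSource extends AbstractModuleSource {}

const genuine = new ModuleSource();
// An object that merely inherits the prototype never received the brand.
const forged = Object.create(ModuleSource.prototype);

const genuineTag = Object.prototype.toString.call(genuine);
const forgedTag = Object.prototype.toString.call(forged);
console.log(genuineTag, forgedTag);
```

The forged object falls back to the default tag because the getter finds no brand for it, which is the effect the internal slot check is meant to have.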

JS Module Source

For JavaScript modules, the module source phase is then specified to return a ModuleSource object, representing an ECMAScript Module Source, where ModuleSource.prototype.[[Proto]] is %AbstractModuleSource%.prototype.

Future proposals may then add support for bindings lookup methods, the ModuleSource constructor and instantiation support.

New properties may be added to the base %AbstractModuleSource%.prototype, or shared with ECMAScript module sources via ModuleSource.prototype additions.

Wasm Module Source

For WebAssembly modules, the existing WebAssembly.Module.prototype object is to be updated to have a [[Proto]] of %AbstractModuleSource%.prototype in the WebAssembly JS integration API.

This allows workflows, as explained in the motivation, like the following:

import source FooModule from "./foo.wasm";
FooModule instanceof WebAssembly.Module; // true

// For example, to run a WASI execution with an API like Node.js WASI:
import { WASI } from 'wasi';
const wasi = new WASI({ args, env, preopens });

const fooInstance = await WebAssembly.instantiate(FooModule, {
  wasi_snapshot_preview1: wasi.wasiImport
});

wasi.start(fooInstance);

The static analysis benefits of not needing a custom fetch and WebAssembly.compileStreaming apply not only to code analysis and security but also to bundlers.
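For comparison, the manual path that a source phase import would replace can be sketched with the existing WebAssembly JS API. This example assumes nothing beyond a host that provides WebAssembly; it compiles the smallest valid binary from inline bytes instead of fetching a file, and uses the synchronous constructors only to keep the sketch short.

```javascript
// Smallest valid Wasm binary: the "\0asm" magic number followed by version 1.
const emptyModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic
  0x01, 0x00, 0x00, 0x00, // version
]);

// Manual compilation: the step that `import source` would make declarative.
const manualModule = new WebAssembly.Module(emptyModuleBytes);

// The result is the same kind of object the source phase import provides.
const isWasmModule = manualModule instanceof WebAssembly.Module;

// Instantiation is unchanged either way; this module needs no imports.
const manualInstance = new WebAssembly.Instance(manualModule, {});
const exportNames = Object.keys(manualInstance.exports);

console.log(isWasmModule, exportNames.length);
```

In the manual version the bytes, or the URL they were fetched from, are only known at runtime, which is what makes the module invisible to bundlers and static security policies.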

In turn this enables Wasm components to be able to import WebAssembly.Module objects themselves in future.

Other Module Types

Any other host-defined module types may define their own host module sources. If a given module does not define a source representation for its source, importing it with a "source" phase target fails with a ReferenceError at link time.

Host-defined module sources must include %AbstractModuleSource%.prototype in their prototype chain and support the [[ModuleSourceRecord]] internal slot containing the @@toStringTag brand check and underlying source host data.

Security Benefits

The native ES module loader is able to implement security policies, including support for Content Security Policies in browsers. This property does not just impact platforms using CSP, but also other platforms with systems to restrict permissions, such as Deno. These policies are based on protecting which URLs are supported for the compilation and execution of scripts or modules.

Extending the static security benefits of the host module system to custom loaders is a security benefit of this proposal. For Wasm, it would enable source-specific CSP policies for dynamic Wasm instantiation.

Cache Key Semantics

Because [[ModuleSourceObject]] is keyed on the base module record, it will always be unique to the module being imported from.
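The keying can be illustrated with a toy module map. This is a sketch, not the specification's algorithm: the map, the key format, and the function names are all hypothetical, and treating attributes as part of the key is an assumption made for the example.

```javascript
// Toy module map: one record per (specifier, attributes) request.
const toyModuleMap = new Map();

function requestKey(specifier, attributes = {}) {
  // Attributes belong to the request key; the phase deliberately does not.
  const sorted = Object.entries(attributes).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return JSON.stringify([specifier, sorted]);
}

function loadRecord(specifier, attributes) {
  const key = requestKey(specifier, attributes);
  if (!toyModuleMap.has(key)) {
    // The source object lives on the record, so it is unique per record.
    toyModuleMap.set(key, { specifier, sourceObject: { kind: "module source", specifier } });
  }
  return toyModuleMap.get(key);
}

function importAtPhase(phase, specifier, attributes) {
  const record = loadRecord(specifier, attributes);
  // The phase only selects what is handed back from the same record.
  return phase === "source" ? record.sourceObject : record;
}

const firstSource = importAtPhase("source", "./foo.wasm");
const secondSource = importAtPhase("source", "./foo.wasm");
const fullRecord = importAtPhase("evaluation", "./foo.wasm");
const otherSource = importAtPhase("source", "./foo.wasm", { type: "example" });

console.log(firstSource === secondSource, fullRecord.sourceObject === firstSource);
```

Repeated source imports of the same request share one object, and a full import of that request resolves to the same underlying record.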

Q&A

Q: How does this relate to import attributes?

A: Import attributes are properties of the module request, while source imports represent phases of that specific request / key in the module map, without affecting the idempotency of the module load. Both can be used together for a resource to indicate alternative phasing for the given module resource and attributes.

Q: How does this relate to module expressions and compartments?

A: The module object that is provided has been carefully specified here to be compatible with the linking model of module expressions and compartments.

Q: Why not just use const module = await WebAssembly.compileStreaming(fetch(new URL("./module.wasm", import.meta.url)));?

A: There are multiple benefits. First, if the module is statically referenced in the module graph, it is easier to statically analyze (by bundlers, for example). Second, when using CSP, script-src: unsafe-eval would not be needed. See the Security Benefits section for more details.

proposal-source-phase-imports's Issues

Consider cost of `import source from` ambiguity

@waldemarhorwat pointed out and is concerned about the ambiguity of import source from ....

Here are all possible productions:

// regular import, default binding name is `source`
import source from "";
// source phase import, binding name is `from`
import source from from "";

And when taking into account import declarations with local imports:

// with import declarations, regular import, default binding name = source, import declaration = from
import source from from;

Regarding other ambiguity with import declarations local imports, phase imports, and ASI, also see #41.

Due to the disambiguation complexity (even though I think it is bounded to a three token lookahead), I suggest we ban import source from from in the context of source phase imports. This means that import source from ... is unambiguously a regular import with a binding name of source. With import declarations this could be followed by either an identifier (for local imports) or a specifier string literal.

This reduces the parsing complexity back to a 1-token lookahead.
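The effect of the suggested ban can be sketched as a classifier that looks at exactly one token after import source. This is a toy model operating on a pre-split token array, not a real parser; the function name and the result shape are made up for illustration.

```javascript
// tokens: flat array of token strings, e.g. ["import", "source", "x", "from", '"m"'].
function classifyImport(tokens) {
  if (tokens[0] !== "import") {
    throw new SyntaxError("not an import declaration");
  }
  if (tokens[1] !== "source") {
    return { kind: "regular" };
  }
  // The single token of lookahead that decides the production.
  const next = tokens[2] ?? "";
  const isIdentifier = /^[A-Za-z_$][\w$]*$/.test(next);
  if (isIdentifier && next !== "from") {
    return { kind: "source-phase", binding: next };
  }
  // With `from` banned as a source phase binding, `import source from ...`
  // (or `import source, ...`) is always a regular import named `source`.
  return { kind: "regular", defaultBinding: "source" };
}

const phaseImport = classifyImport(["import", "source", "x", "from", '"m"']);
const regularImport = classifyImport(["import", "source", "from", '"m"']);
const bannedForm = classifyImport(["import", "source", "from", "from", '"m"']);

console.log(phaseImport.kind, regularImport.kind, bannedForm.kind);
```

Under the ban, the third call is never read as a source phase import bound to from, so the parser does not need to look further ahead to decide.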

Thanks @littledan for the suggestion.

@waldemarhorwat, what do you think?

Missing sources should throw a *SyntaxError* when imported

Right now this code throws a ReferenceError, because there are no JS source objects:

import source s from "./mod.js";

I believe this code should throw a SyntaxError, even if it might appear counter-intuitive. ReferenceError is only used for code that runs, and not before evaluation. As a very strong precedent ("very strong" because it's basically the same thing), import { x } from "./a" is a SyntaxError if ./a does not export x.

Older versions of the language used to have ReferenceErrors for some errors that happen before evaluation, but in ES2020 we updated all of them to be SyntaxErrors (tc39/ecma262#1527).

Use extensible object syntax?

Would you be open to using an extensible key/value format, to future-proof reflection in case it's needed for other use cases, even though none are implemented now?

import JoyStyle from "./joy.css" as { type: "css", media: "(width > 640px)" }

Parameterized Evaluators

Much like import assertions, this is a generic mechanism. I could imagine non-WASM evaluators and parameterization of them:

import './model.json' assert { 
  type: 'json'
} as {
  evaluator: 'json-schema', schema: 'https://schema.org/...'
};

Unlike assertions, I do think having evaluator (bikeshed) be a shorthand is fine since it defines what is being parameterized and I cannot think of a reason to drop it ever even if it is always implicitly defined.

Lazy versus eager reflection

One concern still remaining with the current refactoring model is the assumption that [[ModuleSourceObject]] is pre-populated by the module system onto the module record.

My concern here is that this object is a JS object not an internal specification object so would require the JS host to create both a JS object and internal host object representation for every single module loaded.

We could possibly:

  1. Explicitly note that [[ModuleSourceObject]] should not have any observable initialization since it is an ambient object, and therefore that internal optimizations could lazily create it
  2. Have [[ModuleSourceObject]] be populated by a HostEnsureReflection hook or similar.
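The first option amounts to ordinary memoization. Below is a minimal sketch, assuming a toy record class; ToyModuleRecord, getSourceObject, and the counter are hypothetical names, and a private field stands in for the [[ModuleSourceObject]] slot.

```javascript
// Counts how many JS-visible source objects were actually allocated.
let sourceObjectsCreated = 0;

class ToyModuleRecord {
  // Stand-in for [[ModuleSourceObject]]; empty until first requested.
  #sourceObject = undefined;

  constructor(url) {
    this.url = url;
  }

  getSourceObject() {
    if (this.#sourceObject === undefined) {
      // Creation has no observable side effects, so deferring it is safe.
      sourceObjectsCreated += 1;
      this.#sourceObject = Object.freeze({ url: this.url });
    }
    return this.#sourceObject;
  }
}

// Three modules are "loaded", but only one is ever imported at the source phase.
const loadedRecords = ["./a.js", "./b.js", "./c.js"].map((url) => new ToyModuleRecord(url));
const firstAccess = loadedRecords[0].getSourceObject();
const secondAccess = loadedRecords[0].getSourceObject();

console.log(sourceObjectsCreated, firstAccess === secondAccess);
```

Only the record that is actually imported at the source phase pays for an object, and repeated requests still see the same identity.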

This also brings up some questions about realms as well - as a JS object, this object will be specific to the realm, but I suppose that is true of the abstract module record itself. I'm still not clear on how this fits into the cross-realm transfer story though.

Consider a method-like `import` metaproperty for source phase imports

I've seen arguments for why the syntax should be part of with {}, and also for why the syntax should be a modifier following import, but has there been any consideration for using a method-like import MetaProperty?

For example:

const asset = import.resolve("./path"); // instead of `import asset`
const source = await import.compile("./path"); // instead of `import source`
const instance = await import.instantiate("./path"); // instead of `import instance`

Such calls are just as statically analyzable as the top-level import syntax and don't require shoehorning the result of each operation into the various import binding syntaxes, as you do with the phase modifier syntax. It also avoids encoding what is essentially a method name into a one-off phase property of the import call, especially if import(path, { phase: "asset" }) might be non-promise returning.

Related proposal: deferred module evaluation

This proposal seems related to work I've been doing on deferred module evaluation: https://github.com/tc39/proposal-defer-import-eval

The flow is largely the same: either return an eagerly evaluated instance, or defer evaluation until it is needed. There are some discussions of "re-instantiated modules" as well. In your case, you want to load the module and instantiate it, or load the module and then use the module object as necessary, without linking or instantiating. One key difference is that you would have an unlinked and uninitialized but compiled WebAssembly module, whereas the defer import eval proposal does linking.

Perhaps this is something that should be worked on in a more general way? My thinking is that a complete solution would involve exposing the module loading interface and allow customization of the loader itself. For example, an incomplete sketch (don't think the class syntax will actually work, just illustrating)

// user defined loaders
class DeferLoader extends import.Loader {
 ...
}

class CommonJSLoader extends import.Loader {
 ...
}
//....
import "x" from "y.js" using DeferLoader;
import "commonjsModule" from "z.js" using CommonJSLoader;

import "z" from "<specifier>" using import.WasmModuleLoader;

etc. 

Just thinking what a generic solution might be.

Use special module specifier format

I might be missing the exact requirements and use cases, but from a quick glance at the readme I thought: Why not just use the fragment or query part of the specifier URL?

import x from "<specifier>#<reflection-type>"

or

import x from "<specifier>?type=<reflection-type>"

instead of

import x from "<specifier>" as "<reflection-type>"

It won't need special syntax, and afaiu it would have the same cache key semantics as outlined in the readme.

Reflect with transitive dependencies

I have implemented a PoC in a Webpack plugin to support import(spec, { reflect: "module"}). I found the current reflection is too weak to support real-world use cases.

Use case

Import the untrusted npm module as a whole to provide virtualization.

Let's say we have an untrusted npm module mod like this:

// mod/index.js
import { assert } from './utils.js'
import { count } from './count.js'
import { readFile } from 'node:fs/promises'

In the current proposal, if we want to fully virtualize the mod module, we need to manually collect all the transitive dependencies of mod, for example: mod/utils.js and mod/count.js.

This is impractical and performs poorly (you need to reflect every module and manually link them together via the importHook).

Currently: the importHook will be called for ./utils.js, ./count.js, and node:fs/promises

Suggestion

The module reflection, by default, fetches and links all its transitive dependencies, except the module that is intrinsic (has no source and is implemented by the host, like node:fs).

With the same module as above, this suggestion would only call importHook for the Node built-in modules. mod and all its transitive dependencies are evaluated in the given Evaluator.

Peer dependencies

If this mechanism does not support opt-out, it will not be usable. Some modules (like "react") do not allow multiple instances and will not work correctly otherwise. We can provide an opt-out mechanism like this.

await import("react-window", { reflect: "module", reflectExclude: ["react"] })

In this way, this module reflection will not capture react and we need to manually provide it in the importHook.

Today's behavior

await import("mod", { reflect: "module", reflectExclude: "*" })

Use a "*" string instead of an array to get today's behavior (nothing is linked).
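The rules suggested above can be summarized as one decision function. This is a sketch of the issue's suggestion only: needsImportHook is a hypothetical helper, reflectExclude is the option proposed in this issue rather than an existing API, and treating the node: prefix as the marker for intrinsic modules is an assumption for illustration.

```javascript
// Returns true when the user-supplied importHook would have to provide the module.
function needsImportHook(specifier, reflectExclude = []) {
  // Intrinsic modules have no source, so the engine cannot link them itself.
  const isIntrinsic = specifier.startsWith("node:");
  if (isIntrinsic) return true;
  // "*" restores today's behavior: nothing is linked automatically.
  if (reflectExclude === "*") return true;
  // Otherwise only explicitly excluded specifiers go through the hook.
  return Array.isArray(reflectExclude) && reflectExclude.includes(specifier);
}

const hookDecisions = {
  utilsDefault: needsImportHook("./utils.js"),
  builtinDefault: needsImportHook("node:fs/promises"),
  reactExcluded: needsImportHook("react", ["react"]),
  windowNotExcluded: needsImportHook("react-window", ["react"]),
  countWithStar: needsImportHook("./count.js", "*"),
};

console.log(hookDecisions);
```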

Pros

  • Better ergonomics, easier to adopt
  • Performance gain (many modules can be linked by the engine. less user code, better performance)

Cons

  • We need to introduce the resolveHook back again in the compartment proposal.
  • Reflected modules have different identities if the exclude options are different.

Block evaluator attribute

This came up in #7 - would there be any benefit in defining an as "block" evaluator attribute in future?

For example, to load a source text or Wasm module as a module block to pass to another thread one could write:

import linkedWasmModule from './module.wasm' as 'block';
const worker = new Worker(new URL('./executor.js', import.meta.url));
worker.postMessage(linkedWasmModule);

The benefit of this would be the ability to pass any linked module (not just blocks) across boundaries, and possibly some preloading benefits.

In theory one can easily do the above without such an attribute via:

const linkedWasmModule = module { export * from './module.wasm' };
const worker = new Worker(new URL('./executor.js', import.meta.url));
worker.postMessage(linkedWasmModule);

The workflow friction in the above is that export * excludes the default export, so one needs to reflect the default export specifically, or not, depending on whether the module has one; the attribute would pave this pattern more clearly.

Support both module reflection and module source reflection

I can imagine use cases for both:

reflect module instance

import instance fs from 'fs'
new Module(someSource, {
    importHook(spec) {
        if (spec === 'fs') return fs
    }
})

In this example, I linked the native fs module to another module. This is not achievable if this proposal only supports reflecting the source because fs module does not have a source.

reflect module source

import source mod from 'a-module-that-cannot-be-resolved-under-the-current-host'
new Module(mod, { ... })

In this example, I need to reflect the source of a module whose dependencies cannot be resolved by the host. This is not achievable if this proposal only supports reflecting the module instance (and encourages us to use module.source to get the source) because the import will fail up front and there is no chance to get the source.

conclusion

Considering the two use cases above, I believe we need to support both kinds of reflection.

Clearer analysis of cache key situation

This proposal will need to more clearly tackle the cache key questions head on.

Specifically some of the details around:

  • HTML already does double-keying for import assertions in that an assert and non assert for the same module can race I believe (cc @bmeck if I've got this wrong)
  • How this affects the idempotency requirement of host resolve imported module, and how it would be adjusted.
  • If the evaluator attribute would be permitted to alter resolution itself (eg changing the resolved URL).

Reflect string versus boolean

There has been some discussion about whether we should use a string reflection type in dynamic import or rather have an import-reflection-specific boolean.

Currently we support:

const mod = await import('./mod.js', { reflect: 'module' });

But the alternative would be to support:

const mod = await import('./mod.js', { module: true })

and remove any reference to the reflect key or guidance for how other specifications should interact with it. I'm personally open to whatever works best here.

"Source" is a little confusing

It's only after reading the spec that I understood the value accessed when using the source keyword isn't the module's source file content, but rather the module object. Is the name source definitive?

WebAssembly compile-time imports

Hot off the presses, WebAssembly now has a proposal for compile-time imports: WebAssembly/design#1479.

This is a very early stage proposal on the WebAssembly side, but it'd be good for the champions to be aware of. Namely, should imports become passable at compile time (i.e. the action that produces a WebAssembly.Module), how might this proposal be extended to support that? In other words, there should be an "out" here so if compile-time imports become a thing, that ideally does not preclude usage of this proposal.

At the same time, because the WebAssembly proposal is so early stage, we definitely should not design the JS proposal speculatively.

Thoughts?

Limit to representation of module?

It seems very dangerous to have a generic syntax that allows any engine or build tool to radically change the semantics of importing a specifier based on a separate piece of information. Given that the motivating use case, as I understand it, is for WASM to provide two representations of the same module, could we impose this as a restriction in the spec?

Consider wider evaluator attribute possibilities

Currently the proposal is entirely justified around the Web Assembly use case.

Typically the MIME type for fetch schemes, or the file extension for file schemes is considered the authoritative interpretation of the module format.

The justification for "as" here is entirely based around wanting to interpret the same MIME in multiple ways.

  • Is this something specific to Web Assembly, or does it extend to other formats?
  • Would patterns like as "typescript" want to be encouraged?
  • For those constructing custom host loaders, would custom formats be encouraged? Or treating "as" as an override of the MIME be encouraged?

Fleshing out some of these use cases and considering what sort of ecosystem this spec is guiding in these other types of module format directions seems an important aspect of the proposal.

Name bikeshedding

The current name of the proposal is not very descriptive of what it does: evaluator attributes do not change how a module is evaluated. They instead change how a module is reflected (represented). Would reflector attributes be a better name? Or representation attributes?

Ref #6

Disallow newlines between phase and identifier

There is a cross cutting concern with module declarations:

module source {}
import source
x
from
"url"

Here it is ambiguous whether this is:

module source {}
import source;
x;
from;
"url";

or

import source x from "url";

We should disambiguate by banning newlines after <phase> in the static syntax.

Thanks @hax for reporting!

Is %AbstractModuleSource% synchronously reachable from script?

On first read it seemed like it wouldn’t be possible to synchronously obtain a reference to %AbstractModuleSource% during script evaluation. If so, it would be the first intrinsic object for which that’s true. This can create problems for virtualization strategies that depend on “first-run-code” status to capture intrinsics and prepare an environment for other code, e.g. “initialization” scripts passed to ShadowRealm.prototype.evaluate prior to running other scripts. Similarly it would seem to prevent “hardening” strategies that freeze intrinsics up-front.

...but maybe it is reachable? %AbstractModuleSource% is said to “[act] as the abstract superclass of the ModuleSource” — and while “ModuleSource” isn’t linked and doesn’t seem to be defined in the current spec text, the readme for this repo makes it sound like that’s supposed to be another new intrinsic. Is %ModuleSource% being specified in another related proposal or is it part of this one and just not in the spec yet? Will it be exposed by a regular global property?
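One possible synchronous path follows from the readme's statement that WebAssembly.Module.prototype gets a [[Proto]] of %AbstractModuleSource%.prototype. The probe below is a sketch under that assumption and only works in hosts that expose WebAssembly; findAbstractModuleSourcePrototype is a hypothetical helper name, and the result depends on whether the host has implemented the proposal.

```javascript
// Walks one step up from WebAssembly.Module.prototype, without any import.
function findAbstractModuleSourcePrototype() {
  if (typeof WebAssembly !== "object" || WebAssembly === null) {
    return undefined; // no Wasm support, so this path is unavailable
  }
  const parent = Object.getPrototypeOf(WebAssembly.Module.prototype);
  // Without the proposal, the parent is plain Object.prototype.
  return parent === Object.prototype ? undefined : parent;
}

const abstractSourcePrototype = findAbstractModuleSourcePrototype();
const reachableSynchronously = abstractSourcePrototype !== undefined;

console.log("reachable from script:", reachableSynchronously);
```

If this returns an object, first-run code could capture or freeze it before any other script executes; hosts without WebAssembly would still need another route.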

Bikeshedding the `source` keyword

During the 97th TC39 meeting a discussion about bikeshedding the source keyword came up.

@syg has brought up that some folks were confused by the source keyword, expecting it to return unparsed, uncompiled source code.

This keyword represents the second phase of the five-stage loading process.

It represents an early exit from the module loader right after the module has been fetched, parsed (and compiled).

Some alternative keywords that have been suggested:

  • instantiable
  • handle
  • parsed
  • compiled

You can vote here: #54

Please keep discussion in this issue.

This issue will be collecting feedback on alternative names until 21 Jul 2023 at 00:00 UTC. If no consensus on an alternative name is reached by then, the proposal goes to Stage 3 with the name source.
