proposal-json-modules's Introduction

JSON modules

Champions: Sven Sauleau (@xtuc), Daniel Ehrenberg (@littledan), Myles Borins (@MylesBorins), and Dan Clark (@dandclark)

Status: Stage 3.

Please leave any feedback you have in the issues!

Synopsis

The JSON modules proposal builds on the import attributes proposal to add the ability to import a JSON module in a common way across JavaScript environments.

Developers will then be able to import a JSON module as follows:

import json from "./foo.json" with { type: "json" };
import("foo.json", { with: { type: "json" } });

Note: this proposal was originally part of the import attributes proposal, but it was resolved during the July 2020 meeting to split it out into a separate proposal.

Motivation

Standards-track JSON ES modules were proposed to allow JavaScript modules to easily import JSON data files, similarly to how they are supported in many nonstandard JavaScript module systems. This idea quickly got broad support from web developers and browsers, and was merged into HTML, with an implementation for V8/Chromium created by Microsoft.

However, in an issue, Ryosuke Niwa (Apple) and Anne van Kesteren (Mozilla) proposed that security would be improved if some syntactic marker were required when importing JSON modules and similar module types which cannot execute code, to prevent a scenario where the responding server unexpectedly provides a different MIME type, causing code to be unexpectedly executed. The solution was to somehow indicate that a module was JSON, or in general, not to be executed, somewhere in addition to the MIME type. Import attributes provide a mechanism for doing so, allowing us to reintroduce JSON modules.

Rather than granting hosts free rein to implement JSON modules independently, specifying them in TC39 guarantees that they behave consistently across all ECMA-262-compliant hosts.

Proposed semantics and interoperability

If a module import has an attribute with key type and value json, the host is required to either fail the import or treat it as a JSON module. Specifically, this means that the content of the module is parsed as JSON, and the resulting JSON object is the default export of the module (which has no named exports).
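
For example (a sketch; the file name and contents here are hypothetical):

// config.json: { "retries": 3, "endpoints": ["a", "b"] }

import config from "./config.json" with { type: "json" };

console.log(config.retries);      // 3
console.log(config.endpoints[0]); // "a"

// There are no named exports, so a named import such as
// `import { retries } from "./config.json" with { type: "json" };`
// would fail to link.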

Each JavaScript host is expected to provide a secondary way of checking whether a module is a JSON module. For example, on the Web, the MIME type would be checked to be a JSON MIME type. In "local" desktop/server/embedded environments, the file extension may be checked (possibly after symlinks are followed). The type: "json" is indicated at the import site, rather than only through that other mechanism, in order to prevent the privilege-escalation issue noted in the opening section.
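
To illustrate the failure mode (a sketch; the URL and server behavior are hypothetical): if a server responds to ./data.json with a JavaScript MIME type, a Web host must fail the import rather than evaluate the response body.

try {
  // Fails on the Web if the response's MIME type is not a JSON MIME type,
  // even though the specifier ends in ".json".
  const { default: data } = await import("./data.json", { with: { type: "json" } });
} catch (e) {
  console.error("import failed:", e);
}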

All of the import statements in the module graph that address the same JSON module will evaluate to the same mutable object as discussed in #54.
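
For example (module names hypothetical; assumes a.js is evaluated before b.js):

// a.js
import data from "./shared.json" with { type: "json" };
data.count = (data.count ?? 0) + 1;

// b.js
import data from "./shared.json" with { type: "json" };
console.log(data.count); // 1: both modules see the same mutable object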

Nevertheless, the interpretation of module loads with no attributes remains host/implementation-defined, so it is valid to implement JSON modules without requiring with { type: "json" }. It's just that with { type: "json" } must be supported everywhere. For example, it will be up to Node.js, not TC39, to decide whether import attributes are required or optional for JSON modules.

Further attributes and module types beyond json modules could be added in future TC39 proposals as well as by hosts. HTML and CSS modules are also under consideration, and these may use similar explicit type syntax when imported.

FAQ

How would this proposal work with caching?

The determination of whether the type attribute is part of the module cache key is left up to hosts (as it is for all import attributes).
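
A concrete illustration (hypothetical file; the outcomes are host-defined):

// One specifier, two attribute lists:
import a from "./data.json" with { type: "json" };
import b from "./data.json"; // no attribute

// Whether these two statements hit one module-cache entry or two, and
// whether the second import succeeds at all, is left to the host.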

Why don't JSON modules support named exports?

"Named exports" for each top-level property of the parsed JSON value were considered in earlier HTML efforts towards JSON modules. On one hand, named exports are implemented by certain JavaScript tooling, and some developers find them to be more ergonomic/friendly to certain forms of tree shaking. However, they are not selected in this proposal because:

  • They are not fully general: not all JSON documents are objects (see the sketch after this list).
  • It makes sense to think of a JSON document as conceptually "a single thing" rather than several things that happen to be side-by-side in a file.
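
To make the first point concrete (a sketch; the file name and contents are hypothetical):

// numbers.json: [1, 2, 3]
import numbers from "./numbers.json" with { type: "json" };
console.log(numbers.length); // 3
// An array document has no properties that could sensibly become named exports.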

Specification
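
The rendered specification is available at https://tc39.es/proposal-json-modules/.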


proposal-json-modules's Issues

Editorial notes

The spec text mostly looks good, but I have a couple comments:

  • Synthetic module records hold "an abstract operation" in their EvaluationSteps field. In practice it's not an actual AO, just a list of steps which takes an argument. There's now a type in the spec, Abstract Closure, which is intended to represent exactly that thing; it would be good to use it here.
  • The call to ParseJSONModule says "[...] must either invoke ParseJSONModule and return the resulting Module Record, or throw an exception". It is probably worth being more explicit that ParseJSONModule can throw (that is, return an abrupt completion), in which case the algorithm is required to throw.

Rationale for optional exception

The proposal includes the following text:

  • If assertions has an entry entry such that entry.[[Key]] is "type", let type be entry.[[Value]]. The following requirements apply:
    • If type is "json", then this algorithm must either invoke ParseJSONModule and return the resulting Completion Record, or throw an exception.

I don't see why the proposal allows implementers to optionally throw an exception. I've summarized my understanding below to help others find the hole in my reasoning (and potentially identify an opportunity to improve the proposal's documentation).

I'm aware of the security concerns that were initially raised about an early version of this proposal, but it's not clear if/how the optional exception is related. In that issue (and the subsequent conversations at TC39 plenary throughout 2020), it seemed sufficient to give authors a means to communicate the desire, "this resource should never be evaluated as JavaScript." That allows hosts to fail the import based on additional context (e.g. web browsers receiving HTTP responses with inappropriate Content-Type headers).

The part where I get lost is why the proposal text explicitly allows an exception. Even without import assertions, web browsers already reject modules at their own discretion (e.g. due to Content Security Policy violations). I haven't been able to find any justification for that kind of rejection in ECMA-262; the closest I could find was the exception for Module Records that "[do] not exist or cannot be created." My ignorance doesn't prove anything, of course (as demonstrated by the existence of square dancing), but it makes me wonder whether it's absent because it's not necessary. If it is present, I'd be curious to learn why it doesn't apply to this case.

Even if an error is specifically necessary for modules imported using this new syntax, it seems as though the expected exception (and the security guarantee) would be naturally enforced by the semantics of the ParseJSONModule abstract operation. In order to present any danger in this context, the source text would necessarily fail to parse as JSON, resulting in a SyntaxError.
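
For reference, invalid JSON already surfaces as a SyntaxError through the ordinary parsing path (standard JSON.parse behavior):

try {
  JSON.parse("export default 42"); // JavaScript source text, not valid JSON
} catch (e) {
  console.log(e instanceof SyntaxError); // true
}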

Thanks for the help!

Should JSON modules export Records/Tuples?

As currently proposed, JSON modules have a default export and no other exports, where the default export is the object resulting from parsing the module's content as JSON, like what you'd get from JSON.parse(). In the July TC39 meeting (notes here) @rricard and @erights expressed interest in exporting Records/Tuples instead of an object.

Pro: For folks in favor of #1, the immutability is a positive. IMO exporting a Record/Tuple is a nicer way of achieving immutability than doing Object.freeze under the hood, although as I stated in the other issue I'm not convinced that immutability is a requirement.
Con: People have long-standing historical expectations that JSON is an Object (I think @bmeck raised a point like this during the meeting). Any change where JSON modules produce something fundamentally different from JSON.parse() could be a source of confusion and even a drag on adoption. My expectation had been that developers would be able to straightforwardly swap out parts of their code that did things like fetch() content that gets fed to JSON.parse(), and replace them with a simpler import of a JSON module. The more that JSON.parse() differs from JSON modules, the harder this gets. Then again, I see that JSON.parseImmutable() is also proposed, so perhaps one could argue that we're looking to expand the expectations about what JSON should be in JS, and that long-term this isn't a concern.

Thoughts? I guess this whole issue is contingent on the discussion at #1 about whether JSON modules should be immutable in the first place, so maybe that needs to be decided first.
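
A sketch of the difference under discussion (data.json is hypothetical; the Record/Tuple semantics come from that separate proposal and may change):

// As proposed: the default export is a plain mutable object, like JSON.parse()
import data from "./data.json" with { type: "json" };
data.extra = 1; // allowed

// Under the Record/Tuple alternative, the default export would be deeply
// immutable (conceptually #{ ... }), and the assignment above would throw
// a TypeError in strict mode.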

Disallowing unused values

As far as I know, the following construction, though syntactically valid, would never be used intentionally since ParseJSONModule has no side effects:

import 'data.json' assert { type: 'json' };

In that case, disallowing it could surface programming errors. I'm not sure if the restriction would justify the complexity of another early error, though. Is it better left to linters?

Should JSON modules be frozen?

During the July TC39 meeting, @FUDCo and @erights (and maybe others who I'm missing?) stated that the JSON object exported by a JSON module should be frozen (via Object.freeze) so that each importer has a guarantee that the JSON object they are importing has not been modified from another import site. Meeting notes here. The concern was that not doing so could lead to bugs from cross-module interference, and importers could never be sure that they were getting a JSON object that hadn't been modified.

I'm not fully convinced that this is needed. JS modules are mutable by default, and unless there's a clear history of bugs caused by this I'm not sure that we should change this behavior; it seems like a potential source of confusion if default mutability differs across module types.

If there are cases where it's necessary to freeze an import or guarantee that a given import site gets a fresh copy, that seems like a use case for an evaluator attribute that could be used to achieve this with JSON modules as well as other potential module types. This seems more consistent and flexible than choosing different per-module-type defaults.
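
For what it's worth, an importer that wants immutability can already opt in locally (a sketch; config.json is hypothetical, structuredClone requires a modern host, and Object.freeze is shallow):

import shared from "./config.json" with { type: "json" };

// Freeze a private copy; deeply nested data would need recursive freezing.
const config = Object.freeze(structuredClone(shared));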

I also found this point from @ljharb compelling:

JHD: JSON modules in node for CJS always had this property, bringing in a mutable object, just like the default for every module. I think the vast majority of people do not bother to freeze them, and they haven’t run into these issues. That’s just the way JS works, things are mutable by default...

If this has historically not been an issue with node/CJS then that suggests to me that it would not be a problem for ES JSON modules.

Thoughts?

As a (potentially irrelevant) historical note, when JSON modules were previously added to the HTML spec, they were not frozen.

Should JSON modules be frozen?

@Jack-Works recently and @robpalme previously suggested that JSON modules be frozen. We previously concluded that they would not be frozen.

However, in the context of JSON modules potentially standardized by TC39, maybe we could reconsider, and freeze JSON modules this time. I would be OK either way, personally.

The PR for JSON modules includes them as un-frozen. This wouldn't necessarily prohibit a host/implementation from going around and freezing it (depending on the metaphysics of spec reading), but it'd be nice to really have a solid, cross-environment decision one way or another.

I want to argue that this is something that we can iterate on during Stage 2 and need to decide on before Stage 3.

Reviver function upon import?

Could it be possible to use a reviver function upon importing the JSON module?

Maybe another option, in line with the assertions proposal; ex:

const json = await import("./data.json", { assert: { type: "json" }, reviver() { ... } });

The author may want the reviver to be called before the object is placed into the import graph, so that all imports yield the revived object, for efficiency and performance.

But at that point, it could very well be debated that the author ought to be using a normal ECMAScript/Wasm export:

export default JSON.parse(
    await (await fetch(
        "./data.json", {
            mode: "cors"
        }
    )).text(), // fetch() resolves to a Response, so read its text before parsing
    () => { ... }
);
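
For reference, this is how a reviver behaves with plain JSON.parse (standard API; the data is made up):

const parsed = JSON.parse('{"when":"2020-07-21T00:00:00Z"}', (key, value) =>
  key === "when" ? new Date(value) : value
);
console.log(parsed.when instanceof Date); // true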

But, paired with @devsnek's suggestion about using a reviver function in the record/tuple proposal, this could be the key to resolving all of the immutability/mutability debates (#1, #2, #3, etc), possibly pleasing the average developer's expectations, while still providing the extra (small) step to make sure that interested developers don't shoot themselves in the foot by using an immutable object instead.

But I do believe that not freezing them or making them immutable creates a security vulnerability.

For example, a web page author has a large, server-generated JSON module that is needed only upon the user interacting with a specific part of the page.

JSON module contents:

[ "foo", "bar", "quz" ]

They don't want to import it unless it's necessary, due to both server-side and client-side performance and resource costs, so they choose to lazily import it using a dynamic import call, only once the user has interacted with the page.

Example ECMAScript:

async function onClick (clickEvent) {
    // import() resolves to a module namespace object; the JSON value is its default export
    const { default: serverData } = await import( "./serverData.json", { assert: { type: "json" } } );

    const firstCharacter = serverData[0][0];
}

But, before it was ever imported by the author's code, another module had imported it and modified it:

import serverData from "./serverData.json" assert { type: "json" };

// Proxy takes exactly two arguments: a target and a handler
serverData[0] = new Proxy(
    {}, {
        get: do_something_obscurely_malicious_with_attempted_key_access
    }
);

It could probably be exploited by malicious actors.

But, even if they had accessed it, parsing it into a Tuple first would've thrown a TypeError, as proxies cannot be inside of a tuple/record, safely preventing whatever the malicious actor planned on doing.

Stage 3 Review

This doesn't appear to have changed much since my last review so it is fairly simple:

https://github.com/tc39/proposal-json-modules/blob/master/spec.html#L81

if { type: 'json' } should be assert { type: 'json' }

per updated import assertions spec.

The above text implies that it is recommended but not required that hosts do not use moduleRequest.[[Assertions]] as part of the module cache key. In either case, an exception thrown from an import with a given assertion list does not rule out success of another import with the same specifier but a different assertion list.

We probably don't want to imply a recommendation (since the web and compatible envs themselves won't do this exactly); the text above isn't necessarily a recommendation but an observation. Maybe a reframing:

The above text clarifies that hosts may or may not use moduleRequest.[[Assertions]] as part of the module cache key. In either case, an exception thrown from an import with a given assertion list does not rule out success of another import with the same specifier but a different assertion list.

Rest seems fine. 👍🏼

Should named JSON module import be a parsing stage error?

Currently Babel parser throws syntax error for the following cases:

import { foo } from "./foo" assert { type: "json" };

which conforms to this test262 test.

But the spec only states that a JSON module should provide only a single default export, so a SyntaxError should be thrown no later than the module resolution stage. However, unlike arbitrary modules, a parser can recognize JSON module imports thanks to the import assertion syntax. So should a JavaScript parser throw such errors?

Either way an end user will not observe any differences, but it helps us define the scope.

Context: babel/babel#14816

Request for feedback on draft proposal

I've had a brainstorm for a TC39 proposal that partially builds off the work done on the JSON Modules specification, and I was hoping I could get some feedback from the JSON Modules champions on its viability, especially with regards to web security and browser adoption. I've written it up as a draft on the TC39 discourse:

https://es.discourse.group/t/proposal-parser-augmentation-mechanism/2008

It's a specification for an extensible syntax for defining, declaring, and requesting alternate parsing modes for any given module, much like the JSON Modules specification does. There are two crucial distinctions in behavior between the Parser Augmentation proposal and JSON Modules, however (aside from the obvious one that Parser Augmentation is extensible to other syntaxes besides JSON):

  1. Parser augmentation is based around the concept of input transformation, rather than synthetic modules. Rather than performing a JSON.parse() at import time and storing a closure that creates a synthetic export, an augmentation-based JSON Module would instead translate the source text into the AST equivalent of export default JSON.parse("<the full source text>"), in which all the tokens except for the source text string constant are marked as "synthetic".
  2. Parser augmentation uses ECMAScript to describe the transformations, rather than ecmaspeak abstract algorithms. However, the ECMAScript transformation descriptions are treated much the same way as abstract algorithms: they are not, under normal circumstances, observable by the runtime, and only the outputs matter. Hosts are required to make the algorithm description of any supported transformation type available by returning a string of ECMAScript code from the appropriate capability query, but they are explicitly not required to allow registration of new parse types; and, unless support for a given transformation is mandated by the spec (like the "json" transformation, if this proposal were based on parser augmentation) a host is allowed to refuse to execute a transformation based on any combination of referrer, URL, or import attributes.

Please note that this isn't a suggestion that the JSON Modules proposal be "rebased" against the currently-nonexistent Parser Augmentation proposal, given its current acceptance status. If anything, the JSON modules portion of the spec could be retconned in some hypothetical future without change of functionality; the equivalent "transformation description" might look something like the following:

async function json(parser) {
  parser.setTokenizerMode(Parser.TOKENIZER_STRING_CONSTANT);
  const stringConstantToken = await parser.parseInput();
  return Parser.syntheticAST`export default JSON.parse(${stringConstantToken})`;
}

This code would likely never get executed by any browser, just as the JSON.parse method will likely not actually get called in the current JSON Modules spec. However, it would provide one piece of functionality that the JSON Modules proposal currently does not: a canonical ECMAScript equivalent for any module loaded as a JSON module. Bundlers and transpilers could use this information to transform a JSON module or any other ECMAScript-equivalent module type defined in the future into a format understood by any host, even ones that don't support Parser Augmentation at all.

Implicit assert type

After reading the proposal, and as a web developer, I was slightly uneasy with an aspect of the JSON module syntax:

import json from "./foo.json" assert { type: "json" };

Because of the .json extension, it feels like the assert is, in this specific case, somewhat redundant.

I went through the whole security thread about the necessity of explicitly stating the nature of what is imported, and I fully agree with the risk and the mitigation suggested there. That said, I tend to think that we could introduce some basic heuristics to ease developers' work (and adoption).

I would suggest that if the URL in the import ends with a .json extension, we should assume assert { type: "json" }, because that is the most basic web developer intent. On the other hand, if a developer decides to serve some non-JSON content with a .json extension, then we should just fail at parsing the content as JSON. But if the developer thinks that this is something legitimate, then it would make sense to ask them to opt in to interpretation with, for example, assert { type: "ecmascript" }. This is counterintuitive but could be necessary in some weird cases.

To summarize, I'm suggesting something along these lines:

// All the following imports should be parsed as JSON and nothing else.
import json from "./foo" assert { type: "json" };
import json from "./foo.js" assert { type: "json" };
import json from "./foo.whatever" assert { type: "json" };
import json from "./foo.json" assert { type: "json" };
import json from "./foo.json";

// All the following imports should be parsed and interpreted as ECMAScript
import ems from "./foo";
import ems from "./foo.js";
import ems from "./foo.whatever"; // where ".whatever" isn't ".json"
import ems from "./foo.json" assert { type: "ecmascript" }; // Active optin for what is supposed to be JSON

Regarding dynamic import, I would not apply the same suggestion, for a simple reason: because the first argument of import() can be hidden behind a variable name, the developer's intent can't be as clear as they think, and in that case requiring an explicit type is safer.
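
For example (getUserSelectedPath is a hypothetical helper, invented for illustration):

const specifier = getUserSelectedPath(); // could point at anything
const mod = await import(specifier);     // the ".json" intent is invisible here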

Simplify the type definition to 'as type'

I don't know if this proposal is still active and current, but I'd like to suggest a simpler alternative. Instead of the proposed syntax:

import json   from "./foo.json" assert { type: "json" };

I prefer this:

import data   from './data.json' as json
import poem   from './poem.txt'  as json

In fact, this is my real preference, but that is not what this proposal is about:

import data   from ./data.json as json
import poem   from ./poem.txt  as json

The type definition doesn't have to look like an object or a programming rule. Fixed keywords in the import statement can indicate how the parser should interpret the import.

I added the import of poem.txt to clarify that a file extension should not influence the type assumed by the parser.

Other future types could get their own keyword, like commonjs for CommonJS modules or text for plain text.

The benefits of supporting JSON modules named exports would outweigh the downsides

Summary

Respectfully, I believe not supporting named exports with JSON modules was the wrong decision. Importing JSON modules with named exports is indeed more ergonomic and lends itself to tree shaking.

This was a useful feature in Webpack, but is in the process of being removed due to this proposal. See webpack/webpack#9246

Response to reasons given

From: https://github.com/tc39/proposal-json-modules#why-dont-json-modules-support-named-exports

They are not fully general: not all JSON documents are objects, and not all object property keys are JavaScript identifiers that can be bound as named imports.

Sure, but that doesn't mean it wouldn't be useful. I would prefer to have the feature available even if it has limitations.

It makes sense to think of a JSON document as conceptually "a single thing" (whatwg/html#4315 (comment)) rather than several things that happen to be side-by-side in a file.

I think it can also make sense to think of a JSON document in terms of different importable fields.

How would it work with a `<script>` HTML tag?

What would be the syntax of importing a *.json module in a browser environment using a <script> tag?

# project files
/project/
  assets/
    data.json
  index.html
// data.json
{
  "value1": 42,
  "value2": {
    "foo": "baz",
    "bar": "qux"
  }
}
<!-- index.html (excerpt) -->

<!-- What would be the attributes? -->
<script src="assets/data.json" type="module" asserttype="json"></script>
<script src="assets/data.json" type="module/json"></script>
<script src="assets/data.json" type="application/json"></script>

<!-- How would I access the values? -->
<script>
  console.log(document.imports["assets/data.json"].value2.foo); // logs "baz"
</script>
