wintercg / fetch
This project forked from whatwg/fetch
WinterCG Fetch Standard
Home Page: https://fetch.spec.wintercg.org/
License: Other
Use the compliance modes from #19 throughout the spec:
E.g. with fetch('/foo'), what should the whole URL be?
As fetch() is a fairly well-known API, and a bunch of folks are used to consulting the WHATWG Fetch spec, would it make sense to have a section in the WinterCG spec that summarizes the differences between the two specs?
I think that would help a couple of decision makers feel more comfortable following the WinterCG spec where appropriate.
(This is probably generally true for WinterCG specs that are forks of existing web specs.)
The WHATWG Fetch spec says this regarding fetching of URLs using the file: protocol:
For now, unfortunate as it is, file URLs are left as an exercise for the reader.
When in doubt, return a network error.
It would be nice to have a standard way of handling those.
The fetch() spec forbids reading and storing cookies because of https://developer.mozilla.org/en-US/docs/Glossary/Forbidden_header_name.
How should cookies be handled in server environments?
Most implementations allow them in some way, but the behavior differs between implementations:
I think it would be good to create a shared standard for this behavior.
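To illustrate the divergence, here is a minimal sketch (the URL and cookie value are placeholders): per WHATWG Fetch, Cookie is a forbidden header name and is silently dropped when set on a Request in browsers, while server runtimes such as Node.js and Deno generally let it through.

```javascript
const req = new Request("https://example.com/", {
  headers: { Cookie: "session=abc123" },
});

// Logs "session=abc123" on runtimes that allow the header,
// and null where the forbidden-header rule is enforced.
console.log(req.headers.get("cookie"));
```

A shared standard would pin down which of these two behaviors server runtimes should exhibit.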
We should discuss things like CORS behavior - there are questions/suggestions about this in nodejs/undici#1315 (comment)
I think https://deno.land/manual/runtime/web_platform_apis#spec-deviations is a good baseline but I would request the following deviations for what we standardize:
Of course, this would need to be bike-shedded and written more formally. Please suggest any more deviations we'd want here.
Note this list omits the handling of file: URLs. Node.js does not wish to implement file: URL support at the moment because of security concerns. People (@mcollina, for example) have raised good concerns that it would be too easy to take a file: URL from a user and pass it to fetch. I think it's probably fine for servers/edge to deviate on this?
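As a sketch of what that deviation could look like in userland (this is a hypothetical wrapper, not spec text; the base URL is a placeholder), a runtime could simply refuse file: URLs up front:

```javascript
// Reject file: URLs before fetching, matching the Node.js security position.
function safeFetch(input, init) {
  const url = new URL(input, "https://example.com/");
  if (url.protocol === "file:") {
    return Promise.reject(new TypeError("fetching file: URLs is not supported"));
  }
  return fetch(url, init);
}
```

This matches the WHATWG spec's own advice: "When in doubt, return a network error."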
The fetch API is becoming the standard HTTP client for server usage. When proxying back a fetched resource to the user-agent, here's what I'd intuitively do:
async function myEndpoint(request: Request): Promise<Response> {
return await fetch("https://www.gatsbyjs.com/Gatsby-Logo.svg");
}
The URL in this example responds with a gzip-compressed SVG. As specced in 16.1.1.1 of 15.6 HTTP - network fetch, the body stream contains an uncompressed version of the SVG, making compression transparent to the user. This works well for consuming the SVG.
However, the spec does not require headers to mirror this decompression. Although the Response has an uncompressed body, the Content-Length header shows the compressed length and the Content-Encoding header pretends the body is still compressed. In nodejs/undici#2514 (the issue I originally opened), I've included a repro of what this means if the Response is proxied to the user-agent: the body ends up being longer than the headers describe. This is not allowed in the HTTP spec, and leads clients like cURL to warn.
The current workaround is to manually alter headers on responses from fetch:
async function myEndpoint(request: Request): Promise<Response> {
  let resp = await fetch("https://www.gatsbyjs.com/Gatsby-Logo.svg");
  if (resp.headers.get("content-encoding")) {
    const headers = new Headers(resp.headers);
    headers.delete("content-encoding");
    headers.delete("content-length");
    resp = new Response(resp.body, {
      status: resp.status,
      statusText: resp.statusText,
      headers,
    });
  }
  return resp;
}
This is cumbersome and should live in library code. It can't live in frameworks, because it's impossible to differentiate a response produced by fetch (content-encoding header, but uncompressed body) from a compressed Response created in user code (same content-encoding header, but compressed body).
Instead of this workaround, fetch should require the content-length and content-encoding headers to be deleted when the response body is decompressed.
Some implementors already do this, including Deno and workerd. Netlify might implement the same for Netlify Functions 2.0.
I've checked undici (Node.js), node-fetch, Chrome and Safari - all of them expose the behaviour explained above, where the headers don't match the body.
An alternative solution for this would be whatwg#1524 - that way, frameworks can use the decompressed body field to tell compressed responses from uncompressed ones.
Summary:
What's the problem? Compressed responses having a content-encoding header but a decompressed body leads to an incoherent Response.
What's the use case? Mostly reverse proxying.
Is it relevant to upstream? Yes, potentially because of service workers.
What's my suggestion for a fix? Upon decompression, the content-length and content-encoding headers should be deleted.
How do engines deal with this? Most have the bug, but Deno and workerd implemented the fix I propose.
The fetch spec includes APIs for interacting with form submissions. For example, there are the Request and Response constructors accepting URLSearchParams and FormData objects as the request/response body, which is generally useful and is expected to be part of the common minimum API.
However, the fetch spec also defines the formData() method of the Body interface mixin, which is included in Request and Response. This method parses the HTTP body as a form submission enctype (either application/x-www-form-urlencoded or multipart/form-data) and returns a FormData object. Since form submission bodies generally only make sense as requests, and it's rarely useful to parse a request body from an HTTP client, it wouldn't make much sense to include this method as part of the common minimum API – but it is certainly useful for fetch-based HTTP server APIs, as Deno and CFW have.
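The server-side use can be sketched like this (the handler shape is hypothetical; the Request below simulates an application/x-www-form-urlencoded form submission):

```javascript
// A fetch-based server handler that parses a form submission body.
async function handler(request) {
  const form = await request.formData();
  return new Response(`hello, ${form.get("name")}`);
}

// A URLSearchParams body sets the urlencoded content-type automatically,
// so formData() above can parse it back.
const req = new Request("https://example.com/submit", {
  method: "POST",
  body: new URLSearchParams({ name: "winter" }),
});
// handler(req) resolves to a Response whose body is "hello, winter".
```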
For multipart/form-data parsing, however, this method leaves things almost completely unspecified. While there is a formal definition of this format (in RFC7578, which relies on the multipart definitions in RFC2046), it is in the form of an ABNF grammar rather than a parsing algorithm, and so different implementations differ in how they parse some inputs.
What's more, browsers have not always escaped field names and filenames in multipart/form-data payloads in the same way. For example, until last year Firefox escaped double quotes by prepending a backslash, and newlines by turning them into spaces; while Chromium and WebKit used percent-encoding. And while this percent-encoding behavior was added to the HTML spec (whatwg/html#6282), and Firefox's behavior fixed in turn, no implementation of the parsing that I'm aware of (including Chromium and WebKit!) decodes the percent-encoding escapes:
const original = new FormData();
original.set('a"b', "");
original.set('c"d', new File([], 'e"f'));
log(original); // a"b c"d e"f

const parsed = await new Response(original).formData();
log(parsed); // a%22b c%22d e%22f
// (In CFW it's a%22b c%22d undefined, because it seems like files are not
// distinguished from non-file values when parsing.)

function log(formdata) {
  // FormData is pair-iterable.
  const entries = [...formdata];
  const firstEntryName = entries[0][0];
  const secondEntryName = entries[1][0];
  const secondEntryFilename = entries[1][1].name;
  console.log(firstEntryName, secondEntryName, secondEntryFilename);
}
For browsers, specifying multipart/form-data parsing is not a big priority, since there are not many use cases for it, and the formData() method has been broken for 8 years or so. But for WinterCG runtimes with a fetch-based HTTP server API, being able to parse form submissions with the existing fetch API is crucial, and being able to accurately parse the form submissions that all browser engines are currently submitting is a large part of that. So this seems like a very interesting issue to tackle as part of the WinterCG project.
This is sort of similar to #9, but I think it has some interesting semantics that might be worth looking into separately.
The current spec is very clear that Response.redirect MUST be resolved against the current URL. This is rather unfortunate by itself, since server runtimes don't really have a concept of a "current URL". Outside of that, redirects (especially when paired with status code 307) are incredibly common for stuff like OAuth flows.
Currently, doing:
return Response.redirect("/home", 307);
just errors in Node and Deno (without the --location flag). The correct way to do this currently is
return new Response(null, {
status: 307,
headers: {
"Location": "/home"
}
});
which I think is suboptimal. Perhaps this issue should be raised upstream, because this could actually work in a web-standard Response.redirect. Ideally, Response.redirect would just allow a relative location for status codes that support it. I'm hoping to champion this change in WinterCG and eventually land it upstream later.
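Until then, the manual construction above can be wrapped in a tiny userland helper (the helper name is hypothetical, not part of any spec):

```javascript
// Build a redirect Response with a relative Location, which the plain
// Response constructor allows even though Response.redirect does not.
function redirect(location, status = 307) {
  return new Response(null, { status, headers: { Location: location } });
}

const res = redirect("/home");
// res.status === 307 and res.headers.get("location") === "/home"
```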
I would like to propose that the CG pursue standardization of Mutual TLS authentication in the fetch API.
This is a feature that is not likely to be implemented by browser runtimes, but in my opinion it is missing in non-browser runtimes where fetch is the only interoperable HTTP client.
The use-case I have in mind is implementation of OAuth mTLS Client Authentication and Client Certificate-Bound Access Tokens.
Node.js - the https module has the option to provide cert, key, crl, passphrase, pfx, and ca options.
Deno - using deno --unstable there's Deno.createHttpClient, the result of which can be passed as a client property to fetch's init argument. This method accepts certChain, privateKey, and caCerts options.
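For reference, the Node.js shape can be sketched with https.Agent, which accepts the TLS options listed above. The PEM contents below are hypothetical placeholders; Node only validates them when a connection is actually made, not at agent construction:

```javascript
import https from "node:https";

// Placeholder PEM material - real deployments would load actual key/cert files.
const clientKey = "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----";
const clientCert = "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----";

// An agent carrying the client identity; https.request({ agent, ... })
// would then present this certificate during the TLS handshake.
const agent = new https.Agent({ key: clientKey, cert: clientCert });
```

A standardized fetch equivalent would need an agreed place for these options, e.g. something like Deno's client init property.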
Currently, both the Request and Response classes contain a bunch of properties that likely have no effect server-side.
request.destination
request.referrer
request.referrerPolicy
request.mode
* request.cache - "...indicating how the request will interact with the browser's cache when fetching"
request.integrity - "A cryptographic hash of the resource to be fetched by request."
request.isReloadNavigation
request.isHistoryNavigation

(options that are passed to the Request constructor)
RequestInit.referrer: "A string whose value is a same-origin URL"
RequestInit.referrerPolicy: "A referrer policy to set request's referrerPolicy."
RequestInit.mode: "A string to indicate whether the request will use CORS, or will be restricted to same-origin URLs. Sets request's mode. If input is a string, it defaults to "cors"."
RequestInit.credentials: see request.credentials
* RequestInit.cache: see request.cache
RequestInit.integrity: see request.integrity
RequestInit.window: "Can only be null. Used to disassociate request from any Window."

* omit and include may be useful for developers, however same-origin is not.
** cors should be omitted from this type.
As you can see, each environment differs in supported properties, which can cause cross-platform confusion. It also makes everything more confusing that these platforms typically leave unsupported properties in their typings, but do not document which ones are ignored (unless you search for it on Google).
Unsupported properties could return a default value (e.g. false for request.isHistoryNavigation). Default flags would also be needed for RequestInit, as the spec heavily defines fetch's behavior from certain flags being set.

<-- Original fetch spec -->
# some title
1. If request's mode is "cors" then:
...
2. Perform scheme fetch.
3. If request's `referrer` is not this's current settings origin url then:
1. Abort this request.
2. Return a network error.
// and so on
<-- Server environment spec -->
# some title
1. Ignore step 1
3. Ignore step 3
// and so on
As promised a few calls ago, I have been working on drafting the initial specification for WinterCG Fetch. I've had many discussions with multiple folks and I have arrived at two options for us. I'd like us to decide on one of them as the organization structure for our specification. Once agreed, I will continue #11 and get our baseline specification published.
The first option is to create a fork of whatwg/fetch here in WinterCG. We will utilize aspects of the Bikeshed language (which is what whatwg/fetch is written in) to omit sections and include notes/extensions for aspects that we want to modify.
We will be responsible for rebasing our modifications every time Fetch lands a change to the specification. This could be partially automated: we could create a bot that watches the whatwg/fetch repo, and any time new commit(s) are merged to main, it would open a branch and attempt the necessary git operations. Of course, if there are merge conflicts they would need to be settled by a contributor here in WinterCG.
This will ensure our specification is always up to date with the latest whatwg version.
This option has a long-term maintenance cost where members of WinterCG would be responsible for managing the rebasing over time. As stated, it could be automated, but it wouldn't be a perfect solution, as whenever conflicts arise someone would have to spend time fixing them.
The second option is to start with essentially an empty specification that states something along the lines of: "Unless otherwise specified in this document, WinterCG Fetch is compatible with the latest edition of the WHATWG Fetch specification". Then, over time as we agree on modifications to whatwg/fetch, we will create new sections within our document that state the necessary changes. For example, let's pretend we agree to get rid of the entire concept of "Forbidden Headers". Our specification may include a section such as:
Please note this is purely for demonstration purposes. The WinterCG has made no decisions regarding modifications to the WHATWG Fetch API, and the content in the following example is purely hypothetical. Do not use this issue thread to discuss the nuance of the example.
### Headers
#### Modification of Forbidden Headers List
Section [2.2.2 Headers #forbidden-header-name](https://fetch.spec.whatwg.org/#forbidden-header-name) of the whatwg/fetch specification states a list of header names that are considered "forbidden". During runtime execution of the Fetch API, usage of a forbidden header results in an early return such as in the [Concept Headers append](https://fetch.spec.whatwg.org/#concept-headers-append) section.
WinterCG Fetch deviates from this section by stating that there are **no** forbidden headers. A WinterCG Fetch API will not return early if it encounters one of these headers.
This option has less maintenance burden, as it could essentially stagnate while remaining "up to date" thanks to the catch-all statement that WinterCG Fetch is WHATWG Fetch unless otherwise noted. WHATWG Fetch could land changes, and unless we need to deviate from those changes, we don't have to modify our specification.
Unfortunately, this also means that if we are not on top of changes to WHATWG Fetch, we could incorrectly be supporting something they add that we want to deviate from. Arguably, implementations don't generally move as quickly as standards, so even if there is a bit of a lag between us coming to a decision on a hypothetical change to WHATWG Fetch, many implementers would already be a part of the conversation and it wouldn't have much impact.
With these two options, please react to this post with the one you prefer to give us a sense of what folks are preferring. We will also be discussing this at upcoming WinterCG calls. When we come to a majority decision, I will create the initial proposal draft. In the meantime, we can begin making API decisions for WinterCG Fetch - capture the results in issues, and when we eventually get our proposal created, I can add those decisions to the initial draft. Also, please feel free to use this issue to discuss details of either option.
Thank you!
Option 1 - react with: 😄
Option 2 - react with: 🚀
The concept of RRS maps directly to the Mojo IPC dispatcher concept of ResourceIds that get resolved back to the real resources.
It is mainly a concept of exposing dynamic module loading to userland via a filePicker or other methods.
It allows native HTML elements to interact with dynamically registered native components, e.g. creating a fullscreen snapshot, if there were a component called gpu::0, via a direct memcopy of the GPU buffer:
<img src="gpu::0"></img>
The compositor (shell, browser) is able to expose IPC channels to the serviceWorker / Node.js / Deno, which is able to translate the gpu::0 string to a real resourceId inside the IPC system and this way get a handle back for accessing it.
The serviceWorker also solves write locks and the like for multiple context instances.
This allows implementing fetch as-is with all its context restrictions, as an unrestricted fetch would simply be done via a net::https://domain.com/target.file call, like is already done internally on the C++ side.
Browser implementations need to request permissions for dynamic module loads, while host runtimes do not if they don't want to; it depends on the runtime shell.
Think of it as resolvable URLs that do not get resolved by the network stack; they are internal ResolveableReferences defined as string specifiers.
This defines a Winterop component system using RRS as specifier identifiers.
Resolving a module via reference inside V8:
// The result can be used after that via MessageChannel ports sent to the context.
// The result can be used to supply fetch responses.
serviceWorker.load(fileHandle) // browser-secure way without permissions, as permissions come from the fileHandle request. Only local files.
serviceWorker.load() // browser-secure way without permissions, as permissions come from the fileHandle request.
serviceWorker.load("net::https")
serviceWorker.load("fs::/home/path/shared.so")
Within Workers we have been having a discussion about how to communicate to users via Errors that the conditions leading to an error are temporary and that the user should retry their operation. The how and when to retry is not important here.
For example, a fetch() promise can fail for many reasons. The network path could temporarily be down, the URL could be blocked, the header could be malformed, etc. We want to be able to clearly indicate that the user can/should retry their operation without requiring the user to resort to parsing the error message.
We have several possible paths forward, all of which have the same fundamental problem. We'd like to get consensus on which approach folks would find the most agreeable.
Option 1: Set a distinguishing name:
const err = new Error('an error occurred');
Object.defineProperty(err, 'name', { value: 'RetriableError' });

Option 2: Add a property to the Error:
const err = new Error('an error occurred');
err.retriable = true;

Option 3: Use cause:
const err = new Error('an error occurred', { cause: { retriable: true } });

Option 4: Use AggregateError:
// The first object is always an error but the additional things communicate
// the additional structured information we want.
const err = new AggregateError([
  new Error('an error occurred'),
  { retriable: true }
]);
Other ideas?
My current thinking here is to prefer Option 3, using the cause property.
Specifically, pulling out to a logical level: the purpose of the cause is to communicate the reason for this error. That reason might be that another Error was thrown, or it might be that some other condition occurred. For instance, the network was down, or there was an internal error, etc. So let's differentiate between Error and Condition.
If I have a transient condition and want to communicate that the user should retry their operation, then I could logically do something like:
const condition = {
// The condition is temporary....
transient: true,
// The operation is retriable...
retriable: true,
};
const err = new Error('oops that failed', { cause: condition });
The challenge with this, of course, is interoperability. If Workers chooses to use cause in this way but other fetch() implementations choose to use cause in other ways, then we can run into interop issues. To be clear, ALL of the options suffer from this exact problem.
The proposal I would like to make is to define a new ErrorCondition interface specifically for use with cause. Essentially (treat this as a discussion example to express intent... the actual proposal can be refined):
dictionary ErrorConditionInit {
  boolean transient = false;
  boolean retriable = false;
  DOMString name = "";
};
interface ErrorCondition {
  constructor(optional DOMString message = "", optional ErrorConditionInit init = {});
  readonly attribute boolean transient;
  readonly attribute boolean retriable;
  readonly attribute DOMString name;
  readonly attribute DOMString message;
};
Note that this interface intentionally mimics DOMException with the inclusion of name and message accessors.
Example use (assuming the proposal to add cause to DOMException goes through):
const err = new DOMException('The operation failed', {
name: 'NETWORK_ERR',
cause: new ErrorCondition('The network path is down', {
transient: true,
retriable: true,
})
});
console.log(err.cause.transient); // true
console.log(err.cause.retriable); // true
To be clear, I don't really have strong opinions on exactly how we solve this use case. My only requirement is that we have a mechanism for reliably communicating transient/retriable conditions that is interoperable across runtimes.
Platforms like Cloudflare Workers or Shopify Oxygen may need to add additional information to requests/responses. Cloudflare Workers has a proprietary .cf property which includes fields like geolocation data; Shopify's Oxygen runtime has similar needs but uses custom HTTP headers to pass the data.
Since the standard doesn't offer any way to pass extra meta information along with Request/Response instances, it would be great to explore these and similar scenarios to see how the extra metadata may be added. There likely were no use cases for this in the browser world, but on the server side the situation is different.
Using extra custom HTTP headers might be the most obvious approach, but it has clear downsides: headers have tight size limits, aren't very suitable for holding complex data, and can't contain anything that isn't directly serialisable into a string.
A custom property like .meta or .metadata (which would hold a JavaScript object, leaving its fields up to the vendor implementation) would be more flexible, but needs to be standardised.
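The shape of such a property can be sketched with a Request subclass. This is purely hypothetical (no such API is specified); it is only meant to show that metadata can ride along without disturbing the standard Request surface, similar in spirit to Workers' request.cf:

```javascript
// A Request that carries vendor metadata in a .meta object.
class MetaRequest extends Request {
  constructor(input, init = {}, meta = {}) {
    super(input, init);
    this.meta = meta;
  }
}

const req = new MetaRequest("https://example.com/", {}, { country: "NL" });
// req.meta.country === "NL", and req still behaves as a standard Request.
```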
As outlined at WHATWG/fetch, there is often a need in server environments to access cookie headers.
For example, working with Set-Cookie headers:
const h = new Headers();
h.append("Set-Cookie", "a=1; Expires=Wed, 21 Oct 2015 07:28:00 GMT");
h.append("Set-Cookie", "b=1; Expires=Wed, 21 Oct 2015 07:28:00 GMT");
h.get("Set-Cookie");
// a=1; Expires=Wed, 21 Oct 2015 07:28:00 GMT, b=1; Expires=Wed, 21 Oct 2015 07:28:00 GMT
Headers.prototype.getAll would solve this problem, but it was removed from the browser specification. Should we standardize on adding it back? Or should we define a new method altogether? Ideally we should align with node-fetch and undici as well.
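To make the desired behavior concrete, here is a hypothetical sketch of what a restored getAll could return, via a wrapper that keeps the individual Set-Cookie values. (Naively splitting h.get("Set-Cookie") on ", " is not an option, because Expires dates themselves contain ", " - which is the whole problem.)

```javascript
// A thin Headers wrapper that preserves individual Set-Cookie values.
class HeadersWithGetAll extends Headers {
  #setCookies = [];
  append(name, value) {
    if (name.toLowerCase() === "set-cookie") this.#setCookies.push(value);
    super.append(name, value);
  }
  getAll(name) {
    if (name.toLowerCase() !== "set-cookie") {
      const v = this.get(name);
      return v === null ? [] : [v];
    }
    return [...this.#setCookies];
  }
}

const h = new HeadersWithGetAll();
h.append("Set-Cookie", "a=1; Expires=Wed, 21 Oct 2015 07:28:00 GMT");
h.append("Set-Cookie", "b=1; Expires=Wed, 21 Oct 2015 07:28:00 GMT");
// h.getAll("Set-Cookie") yields the two values separately.
```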
Hello,
I would like to report a potential vulnerability in the fetch spec.
Is there a private email list or an internal issue tracker that I can submit
the details to?
Thanks in advance and happy new year!