wicg / capability-delegation

An API that lets developers transfer the ability to use restricted APIs to any target window in the frame tree.

License: Other

Capability Delegation

Transferring the ability to use restricted APIs to another window.

(Draft specification)

Author

Mustaq Ahmed (github.com/mustaqahmed)

Participate

Introduction

"Capability delegation" means allowing a frame to relinquish its ability to call a restricted API and transfer that ability to another (sub)frame it trusts. The focus here is a dynamic delegation mechanism that exposes the capability to the target frame in a time-constrained manner (unlike the <iframe allow=...> attribute, which is not time-constrained).

The API proposed here is based on postMessage(), where the sender frame uses a new WindowPostMessageOptions member to specify the capability it wants to delegate.

Motivating use-cases

Here are some practical scenarios that are enabled by the Capability Delegation API.

Secure PaymentRequest processing in a subframe

Many merchant websites perform payment processing through a Payment Service Provider (PSP) site (e.g. Stripe) to comply with security and regulatory complexities around card payments. When the end-user clicks on the "Pay" button on the merchant website, the merchant website sends a message to a cross-origin iframe from the PSP website to initiate payment processing, and then the iframe uses the Payment Request API to complete the task.

But sites are only allowed to call the Payment Request API after transient user activation (a recent click or other interaction) to prevent malicious attempts like unattended or repeated payment requests. Since the user probably clicked on the main site, and not the PSP iframe, this would prevent the PSP from using the Payment Request API at all. Browsers today support such payment processing by ignoring the user activation requirement altogether (see crbug.com/1114218)!

The Capability Delegation API provides a way to support this use-case while letting the browser enforce the user activation requirement, as follows:

// Top-frame (merchant website) code
checkout_button.onclick = () => {
    targetWindow.postMessage("process_payment", {targetOrigin: "https://example.com",
                                                 delegate: "payment"
                                                });
};

// Sub-frame (PSP website) code
// The handler must be async because it awaits show().
window.onmessage = async () => {
    const payment_request = new PaymentRequest(...);
    const payment_response = await payment_request.show();
    ...
};

Allowing fullscreen from opener Window click

This is a work-in-progress in Chrome.

Consider a presentation/slide website where the main "control panel" window has spawned a few presentation windows, and the user wants to selectively make one presentation window fullscreen by clicking on the appropriate button on the main window (a feature request from a developer). Clicking on the "control panel" button does not make the user activation available to the presentation window, so this does not work today.

The Web does not support this use-case today, but the Capability Delegation API provides a solution:

// Main window ("control panel") code
let win1 = open("presentation1.html");
let win2 = open("presentation2.html");

button1.onclick = () => win1.postMessage("msg", {targetOrigin: "https://example.com",
                                                 delegate: "fullscreen"});
button2.onclick = () => win2.postMessage("msg", {targetOrigin: "https://example.com",
                                                 delegate: "fullscreen"});

// Opened window ("presentation window") code
window.onmessage = () => document.body.requestFullscreen();

Allowing display capture from cross-origin iframe click

Consider a web app in which you want to add video-conferencing capabilities. You turn to a third party solution that can be embedded in a cross-origin iframe. There's a lot of logic behind the scenes, but UX-wise, maybe you work out a scheme where it's mostly the video which is user-facing in the video-conferencing iframe, and the user-facing controls - mute, leave, share-screen - are all part of the web app, and receive its specific UX styling. When those buttons are pressed, some messages are exchanged between the web app and the embedded video-conferencing solution.

To let the third-party iframe prompt the user to share a tab, a window, or a screen, the top frame would delegate the mediaDevices.getDisplayMedia() capability to the iframe as follows:

// In the top frame, user clicks the "Share My Screen" button.
// A concrete targetOrigin is needed to reach the cross-origin iframe.
button.onclick = () =>
  frames[0].postMessage("msg", { targetOrigin: "https://example.com",
                                 delegate: "display-capture" });

// In the cross-origin video-conferencing iframe, prompt the user
// to share a tab, a window, or a screen.
window.onmessage = () => navigator.mediaDevices.getDisplayMedia();

Other similar scenarios

  • A web service that does not care about user location except for a "branch locator" functionality provided by a third-party map-provider app can delegate its own location access capability to the map iframe in a temporary manner right after the "branch locator" button is clicked.

  • An authentication provider may wish to show a popup to complete the authentication flow before returning a token to the host site.

  • A website may want a third-party chat app in an iframe to be able to vibrate the phone on message receipt, even when the user is not active in the iframe.

Non-goals

  • This explainer is not about delegation of user activation (i.e., allowing the iframe to choose from all of the things the top frame could do after a user click or other interaction). See Considered Alternatives below for more details.

  • This explainer does not determine which APIs could possibly support capability delegation. If any API needs the support, the designers of the API would decide details of delegated behavior. The PaymentRequest API case presented here (in collaboration with the owners of that API) serves as a guide for similar changes in other API specifications.

Using capability delegation

Developers would use Capability Delegation by just initiating the delegation appropriately, as shown in the example code snippets above. In short, when a browsing context wants to delegate a capability to another browsing context, it sends a postMessage() to the second browsing context with an extra WindowPostMessageOptions member called delegate specifying the capability.
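
As a rough illustration of the sending side, the call could be wrapped in a small helper. This is only a sketch: delegateCapability is a hypothetical name that does not appear in the proposal, and the refusal of a missing or wildcard targetOrigin is an assumption that reflects the concern raised in the issues about delegating to "*".

```javascript
// Hypothetical helper (not part of the proposal): wraps the postMessage()
// call used in the example snippets above.
function delegateCapability(targetWindow, message, capability, targetOrigin) {
  // Assumption: insist on a concrete origin so the capability cannot reach
  // a frame that has navigated to an unexpected site.
  if (!targetOrigin || targetOrigin === "*") {
    throw new TypeError("delegation needs a concrete targetOrigin");
  }
  // `delegate` is the WindowPostMessageOptions member proposed here.
  targetWindow.postMessage(message, { targetOrigin, delegate: capability });
}

// Browser usage, mirroring the payment example (commented out so the
// sketch also loads outside a browser):
// checkout_button.onclick = () =>
//   delegateCapability(targetWindow, "process_payment", "payment",
//                      "https://example.com");
```

The helper must be called synchronously from the click handler, as in the examples above, so that the sender still has its user activation when postMessage() runs.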

After a successful delegation, the "user API" (the restricted API being delegated) just works when called at the right moment. The general idea is calling the restricted API in a MessageEvent handler or soon afterwards. In the examples above, the restricted APIs are payment_request.show(), element.requestFullscreen(), and mediaDevices.getDisplayMedia() respectively.
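
On the receiving side, a minimal sketch of "calling the restricted API in a MessageEvent handler" might look like the following. The helper name makeDelegationHandler, the origin check, and the expected message value are illustrative assumptions, not part of the proposal.

```javascript
// Hypothetical helper: builds a "message" handler that only acts on
// messages from one trusted origin carrying the expected payload.
function makeDelegationHandler(trustedOrigin, expectedMessage, useRestrictedApi) {
  return function onMessage(event) {
    if (event.origin !== trustedOrigin) return false;  // ignore other senders
    if (event.data !== expectedMessage) return false;  // ignore unrelated messages
    // Call the restricted API right away, inside the handler, while the
    // delegated capability is still fresh.
    useRestrictedApi(event);
    return true;
  };
}

// Browser usage for the fullscreen example (commented out so the sketch
// also loads outside a browser):
// window.addEventListener("message",
//   makeDelegationHandler("https://example.com", "msg",
//     () => document.body.requestFullscreen()));
```

Deferring the restricted call behind long-running work would risk the delegated capability expiring before it is used.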

Demo

  • Payment Request API: To see how this API works with Payment Request, run Chrome with the command-line flag: --enable-blink-features=PaymentRequestRequiresUserActivation, then open this demo.

  • Fullscreen API: Work in progress.

  • Screen Capture API: Work in progress.

Related links

Considered alternatives

Delegating user activation instead of a specific capability

It may appear that we can delegate user activation to solve the same use-cases and thus avoid specifying a feature in the postMessage() call. We attempted this direction in the past from a few different perspectives, and decided not to pursue this. In particular, user activation controls many Web APIs, so delegating user activation for any of the mentioned use-cases is impossible without causing problems with unrelated APIs. See the TAG discussion with one past attempt.

Using a delegation-specific method instead of postMessage()

Instead of piggy-backing the delegation request as a WindowPostMessageOptions entry, we considered adding a new delegation-specific interface on the Window object. While the latter may look cleaner from a developer’s perspective, to support cross-origin communication this solution would require adding the new method on the WindowProxy wrapper, which HTML's editor strongly disliked.

Stakeholder feedback/opposition

We will track the overall status through this Chrome Status entry.

Acknowledgements

Many thanks for valuable feedback and advice from:

capability-delegation's People

Contributors

beaufortfrancois, mikewest, mustaqahmed, stephenmcgruer, yoavweiss

capability-delegation's Issues

Requiring the postMessage origin not to be a wildcard?

If websites don't use COOP: same-origin, they could, without knowing it, have an opener/openee relationship with a malicious window. This relationship can persist for a long time, across several navigations in both windows.

With postMessage, we recommend that developers not use targetOrigin="*".
With capability delegation, it might be wise to require a non-wildcard targetOrigin rather than merely recommend one.

Why it is safe to take timestamp when delegation has been received

https://wicg.github.io/capability-delegation/spec.html#monkey-patch-to-html-tracking-delegation

"If delegate is not null, AND the user agent supports delegating delegate, then set DELEGATED_CAPABILITY_TIMESTAMPS[delegate] to current high resolution time."

But if the main thread of the target has been busy, that current time might be way in the future compared to when the message was sent. Why is that OK, or am I missing something?
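
To make the concern concrete, here is a toy model of the timestamp map. It is only a sketch: the class name, the 5000 ms lifetime, and the example times are all hypothetical, and the real expiry window is up to the user agent.

```javascript
// Hypothetical lifetime; the actual expiry window is user-agent defined.
const ASSUMED_LIFETIME_MS = 5000;

// Toy model of DELEGATED_CAPABILITY_TIMESTAMPS for a single window.
class DelegationClock {
  constructor() {
    this.timestamps = new Map();
  }
  // Store the time at which the delegation is considered to have started.
  record(capability, nowMs) {
    this.timestamps.set(capability, nowMs);
  }
  // A capability is usable while its stamp is younger than the lifetime.
  isActive(capability, nowMs) {
    const stamp = this.timestamps.get(capability);
    return stamp !== undefined && nowMs - stamp < ASSUMED_LIFETIME_MS;
  }
}

// The user clicks at t=0, but the busy receiver only processes the
// message at t=4000.
const sentAt = 0;
const receivedAt = 4000;

const stampedOnReceipt = new DelegationClock();
stampedOnReceipt.record("payment", receivedAt);

const stampedOnSend = new DelegationClock();
stampedOnSend.record("payment", sentAt);

// At t=6000 the two choices disagree.
console.log(stampedOnReceipt.isActive("payment", 6000)); // true (2000 ms after receipt)
console.log(stampedOnSend.isActive("payment", 6000));    // false (6000 ms after the click)
```

With the stamp taken on receipt, the capability stays usable well beyond the assumed lifetime counted from the original click, which is the gap this issue asks about.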

How many layers of delegation

It's somewhat common for sites to have multiple layers of nested documents. How would that work here? In the current setup it seems the top-layer would have to be aware of each of them so it can message directly to the innermost document that might be responsible for fullscreen or some such, but is that ideal? (It doesn't seem ideal.)

Clarifying the algorithm for feature detection

Ideally, the spec should provide an example (and clarify the window post message algorithm's monkey-patch) to elucidate how a site can detect this feature.

Ideally, the site could call postMessage on its own window with a capability to delegate, without user activation, and check for a NotAllowedError (or similar) to detect the user agent's ability to delegate that capability.

The algorithm may also want to consider clarifying the behavior when the destination doesn't have a supporting feature policy.
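
A sketch of what such a detection probe might look like, under stated assumptions: probeDelegation and its return values are hypothetical, NotAllowedError is the error suggested above, and NotSupportedError for an unknown capability is a further assumption rather than confirmed behavior.

```javascript
// Hypothetical probe: call postMessage() with a `delegate` option while no
// user activation is expected, and classify the outcome by exception name.
function probeDelegation(win, capability, origin) {
  try {
    win.postMessage("probe", { targetOrigin: origin, delegate: capability });
  } catch (err) {
    // Assumption: a supporting user agent rejects the call because the
    // sender has no user activation.
    if (err.name === "NotAllowedError") return "supported";
    // Assumption: an unknown capability is reported separately.
    if (err.name === "NotSupportedError") return "unsupported-capability";
    return "unknown";
  }
  // No exception: the option was ignored (no support), or the call happened
  // to run with user activation. The probe cannot tell these apart.
  return "inconclusive";
}

// Browser usage (commented out so the sketch also loads outside a browser):
// const result = probeDelegation(window, "payment", window.location.origin);
```

The "inconclusive" branch is why clearer spec text would help: without it, a silent success is ambiguous.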

Is "token" the best term to use here?

We got early feedback that the proposed term token is confusing: it seems to suggest a "transferable object". In our case the "token" would be "non-transferable by design". Not sure what could be a better alternative here. Any suggestions are welcome.

Why not use sandbox?

Sorry if this is a silly question, but I'm wondering... why not just use something like sandbox=? Say:

<iframe sandbox="allow-transient-activation" allow="payments">

Then the regular transient activation expiry time still applies to the remote origin, and there is no need to do any capability delegation (it's handled by the allow= permissions policy).

Open question: do cross-origin iframes have their own transient-activation timer, or is it global across processes? (I think I know the answer... but).

How does this work with permission-gated capabilities and permission prompts?

What happens if the frame delegating the capability does not have the necessary permission to actually use it? In the case of a subframe, the usage seems close enough to permission/feature delegation, but when it comes to popup windows this delegation seems confusing.

Would it be reasonable to enforce that the top-level frame can only delegate capabilities that it already has the permission to make use of?

Architecture thoughts

Looking at https://wicg.github.io/capability-delegation/spec.html#monkey-patch-to-payment-req it seems that the current setup is quite involved for participating specifications.

It seems to me the contract could be simpler: a participating specification provides an identifier and a global, and a shared algorithm then returns whether it can proceed (previously known as "has transient activation").

(Perhaps a bit more is needed to address the variety of use cases, I haven't looked at this in detail, but in general we should strive for making adoption easy and put the bulk of the logic in the base specification.)
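
A minimal sketch of that simpler contract, purely illustrative: the state shape, the function names, and the 5000 ms window are hypothetical, and nothing here is taken from the spec text.

```javascript
// Hypothetical activation window; the real value is user-agent defined.
const ACTIVATION_WINDOW_MS = 5000;

// Toy stand-in for the per-global state the base specification would keep.
function createGlobalState() {
  return { lastActivationMs: -Infinity, delegated: new Map() };
}

// The single shared check: a participating specification passes its
// identifier and the global, and gets back a boolean.
function canProceed(globalState, capability, nowMs) {
  const fresh = (stamp) =>
    stamp !== undefined && nowMs - stamp < ACTIVATION_WINDOW_MS;
  return fresh(globalState.lastActivationMs) ||
         fresh(globalState.delegated.get(capability));
}

// How a participating API might use it (hypothetical).
function showPaymentSheet(globalState, nowMs) {
  if (!canProceed(globalState, "payment", nowMs)) {
    throw new Error("NotAllowedError: no activation or delegated capability");
  }
  return "payment sheet shown";
}
```

With this shape, each participating specification needs only the one canProceed call, and the freshness rules live in the base specification.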

When does a delegated Payment Request capability expire?

The proposed monkey-patch to Payment Request spec needs to clarify when a delegated payment request capability expires.

The current text seems to suggest the same expiry time as transient user activation through the link from "expired" to HTML spec, but technically it doesn't really work because we are talking about a different timestamp field here.

@stephenmcgruer Before I change it, does "the same expiry time as transient user activation" make sense to the Payment Request team? Or do you want a different expiry?

Clarifying the behavior for consuming the user activation and delegated capability

For the APIs that consume the user activation and delegated capability, Fullscreen and Payment, the behavior differs. If the global has both a valid transient activation and a delegated capability, the Payment API consumes only the transient activation, which means it can be called again because the delegated capability is still valid; the Fullscreen API, however, seems to consume both. Is this intentional?
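
The difference can be illustrated with a toy model. This is a sketch of the behavior as described in this issue, not of any spec text: the two boolean flags and the function names are hypothetical, and the assumption that a Payment-like call consumes the delegated capability when no activation is left is mine.

```javascript
// Flags stand in for "valid transient activation" and
// "valid delegated capability" on one global.
function makeGlobal() {
  return { activation: true, delegated: true };
}

// Payment-like: consumes the transient activation first and leaves the
// delegated capability untouched when activation was available.
function callPaymentLike(g) {
  if (g.activation) { g.activation = false; return true; }
  if (g.delegated) { g.delegated = false; return true; }
  return false;
}

// Fullscreen-like: consumes both in a single call.
function callFullscreenLike(g) {
  if (!g.activation && !g.delegated) return false;
  g.activation = false;
  g.delegated = false;
  return true;
}

const p = makeGlobal();
console.log(callPaymentLike(p), callPaymentLike(p), callPaymentLike(p));
// prints: true true false

const f = makeGlobal();
console.log(callFullscreenLike(f), callFullscreenLike(f));
// prints: true false
```

In this model one click plus one delegation yields two Payment-like calls but only one Fullscreen-like call, which is the asymmetry being questioned.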

Require a non-* targetOrigin

This is to prevent giving access to a navigated child frame. You should know the origin of a trusted partner.

Consider extending MessageEvent

It may be useful to pass additional information to the receiver via the MessageEvent, so a developer can know that delegation failed, or was denied or something similar.

window.addEventListener("message", e => {
  // `e.delegate` is the hypothetical MessageEvent member suggested here.
  if (e.delegate === false) {
    // Do something useful, rather than hope the API I would have called
    // has a rejection handler (or w/e).
  }
});

Serialize a string?

Let delegate be the serialization of options["delegate"].

There's nothing to serialize here, right?

Delegating a capability not acquired yet.

By providing the capability delegation as a string, developers may delegate features they don't have access to yet.

For instance, what would happen if:

postMessage("msg", {delegate: "geolocalisation"});

is sent before asking the users to allow geolocalisation on the document?

One alternative would have been to pass some "Capability/Token" object that can be constructed after getting access to a feature. This way, you could only delegate capabilities you already have access to.

If strings are used, this brings up interesting questions:

  1. Do delegations apply retroactively to the user's permission prompt?
  2. If the permission prompt happens after the delegation, where do you show the permission prompt? On both windows?
  3. Do you have race conditions? How does the specification deal with them?

+CC @mikewest

Demo postMessage options cleanup

The demo currently has both delegate: "paymentrequest" and createToken: "paymentrequest" to make it testable with older Chrome. We need to remove the latter one when appropriate.

Also, the capability should be mentioned as "payment" as per our proposed spec draft.

Examples lack targetOrigin

It's not clear to me how the examples work at the moment. In the same-origin case this isn't needed as the relevant windows would already have transient activation. In the cross-origin case you need to supply targetOrigin.

Relation to permissions policy

The specification lists Permissions Policy as a normative dependency, but never references it. At a minimum, that's an editorial problem.

More substantively, I don't understand how the relationship between the two is supposed to be conceptualized. Is it only limited to those cases where there are transient activations involved? What if the top-level site delegates a capability that is not otherwise transient; does it become transient as a consequence of the delegation?

Interaction with Permissions / Feature Policy

How does this interact with Permissions Policy? For example, in the example and demo shown:

checkout_button.onclick = () => {
    targetWindow.postMessage("process_payment", {delegate: "payment"});
};

The top-level frame knows the origin of targetWindow so it seems reasonable that it might have set a Permissions Policy to enable or restrict the feature for that origin. As in, it feels like a site might want to prevent the ability for a malicious script to insert an entry like this to delegate a capability.

Likewise, though this might just be my own confusion / wish-list, it would be nice if the capabilities were consistent in naming with Permissions Policy too… but I don't know if that's a valid thing to want.
