Coder Social home page Coder Social logo

Comments (4)

youennf avatar youennf commented on August 26, 2024

This will impose a support burden on applications, which could need to maintain maintain a camera blacklist.

My understanding is that UAs are responsible for ensuring the results are good enough.
This seems no different than other existing APIs such as echo cancellation or HW encoders.

Such a list would be difficult to develop without the ability to identify the camera hardware, which in turn could be considered a fingerprinting risk.

This is gated by camera permission.
Video frames are already exposed to the web application.
I am unclear which additional fingerprinting information this API would actually provide.

These issues do not arise for applications utilizing an existing face detection model written for an ML platform, since those models will yield the same results, albeit with better or worse performance depending on the (GPU) hardware.

This is true of many existing features, I do not see what is special here.
Echo cancellation for instance can be done by the OS, the UA or the web application.
All 3 options are yielding different results and different performances, which gives a healthy playground for developers.

from mediacapture-extensions.

aboba avatar aboba commented on August 26, 2024

The issue is that ML models are under a lot of scrutiny so that an API that will yield different results depending on the hardware is a problem. Today models require a lot of validation before they can be deployed; they are increasingly being treated like drugs, with a multi-stage process involving review panels and large-scale testing. All existing W3C APIs for ML model acceleration enable the same model to be deployed everywhere, albeit running slower or faster depending on the hardware. This API proposal does not provide the level of uniformity of existing approaches, nor does it compare favorably on coverage.

from mediacapture-extensions.

ttoivone avatar ttoivone commented on August 26, 2024

If web developers need to have exactly uniform results everywhere, then they are free to deploy a ML model of their choice using Web ML APIs. However, we believe that there are numerous applications where identical results are not required as long as they have reasonable quality (and user agents can filter out implementations with unreasonably low quality).

It also works in the opposite way: when system face detection models improve, existing applications will get the benefit if using the proposed API unlike if they would deploy their own model.

As youennf mentioned, there are many existing W3C APIs where the quality varies depending on platform yet they have been found important enough to be supported: for example the Shape Detection API and the WebCodecs VideoEncoder and AudioEncoder interfaces. Here's a table showing how widely the quality of H.264 video encoders can vary yet any could be used for implementing the WebCodecs VideoEncoder API.

Also important aspect is the performance. In many cases the FD using the proposed API is free or near-free in computation (camera algorithms often run internally FD whether user wants the results or not) and many users might opt to using the API even if the results might vary to some degree.

And last, even if an app would decide to deploy its own ML model, it could still make use of the metadata definitions from this proposal.

from mediacapture-extensions.

dontcallmedom-bot avatar dontcallmedom-bot commented on August 26, 2024

This issue was mentioned in WEBRTCWG-2023-02-21 (Page 45)

from mediacapture-extensions.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.