Since the supported Face Detection models may vary by camera, the API proposed in PR 8

Face Detection: Variance of Results about mediacapture-extensions HOT 4 OPEN

w3c commented on August 26, 2024

Face Detection: Variance of Results

from mediacapture-extensions.

Comments (4)

youennf commented on August 26, 2024

This will impose a support burden on applications, which could need to maintain a camera blacklist.

My understanding is that UAs are responsible for ensuring the results are good enough.
This seems no different than other existing APIs such as echo cancellation or HW encoders.

Such a list would be difficult to develop without the ability to identify the camera hardware, which in turn could be considered a fingerprinting risk.

This is gated by camera permission.
Video frames are already exposed to the web application.
It is unclear to me which additional fingerprinting information this API would actually provide.

These issues do not arise for applications utilizing an existing face detection model written for an ML platform, since those models will yield the same results, albeit with better or worse performance depending on the (GPU) hardware.

This is true of many existing features, I do not see what is special here.
Echo cancellation for instance can be done by the OS, the UA or the web application.
All three options yield different results and different performance, which gives developers a healthy playground.

aboba commented on August 26, 2024

The issue is that ML models are under a lot of scrutiny, so an API that yields different results depending on the hardware is a problem. Today, models require a lot of validation before they can be deployed; they are increasingly treated like drugs, with a multi-stage process involving review panels and large-scale testing. All existing W3C APIs for ML model acceleration enable the same model to be deployed everywhere, albeit running slower or faster depending on the hardware. This API proposal does not provide the uniformity of existing approaches, nor does it compare favorably on coverage.

ttoivone commented on August 26, 2024

If web developers need exactly uniform results everywhere, they are free to deploy an ML model of their choice using Web ML APIs. However, we believe that there are numerous applications where identical results are not required as long as they have reasonable quality (and user agents can filter out implementations with unreasonably low quality).
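One way to make "reasonable quality" measurable is to compare the boxes from two detectors by intersection over union (IoU). The following is a minimal sketch, not part of the proposal: the `Box` type, the `agrees` helper, and the 0.5 threshold are all assumptions chosen for illustration.

```typescript
// Hypothetical box type: x and y are the top-left corner, in pixels.
interface Box { x: number; y: number; width: number; height: number; }

// Intersection over union of two axis-aligned boxes, in the range [0, 1].
function iou(a: Box, b: Box): number {
  const left = Math.max(a.x, b.x);
  const top = Math.max(a.y, b.y);
  const right = Math.min(a.x + a.width, b.x + b.width);
  const bottom = Math.min(a.y + a.height, b.y + b.height);
  const intersection = Math.max(0, right - left) * Math.max(0, bottom - top);
  const union = a.width * a.height + b.width * b.height - intersection;
  return union > 0 ? intersection / union : 0;
}

// True when both detectors report the same number of faces and every
// reference face overlaps some candidate face by at least minIou.
// The default threshold of 0.5 is an assumption, not a recommended value.
function agrees(reference: Box[], candidate: Box[], minIou = 0.5): boolean {
  return reference.length === candidate.length &&
    reference.every(r => candidate.some(c => iou(r, c) >= minIou));
}
```

An application that tolerates some variance could pick a loose threshold, while one that needs uniform results would deploy its own model instead.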

It also works the other way: when system face detection models improve, existing applications that use the proposed API get the benefit, unlike applications that deploy their own model.

As youennf mentioned, there are many existing W3C APIs where the quality varies by platform, yet they have been found important enough to support: for example, the Shape Detection API and the WebCodecs VideoEncoder and AudioEncoder interfaces. Here's a table showing how widely the quality of H.264 video encoders can vary, yet any of them could be used to implement the WebCodecs VideoEncoder API.

Performance is another important aspect. In many cases, face detection (FD) through the proposed API is free or nearly free computationally (camera algorithms often run FD internally whether or not the user wants the results), and many users might opt to use the API even if the results vary to some degree.

Finally, even if an app decides to deploy its own ML model, it could still make use of the metadata definitions from this proposal.
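The last two points can be sketched together: an app prefers results the camera stack already computed and falls back to its own model, with both paths producing the same result shape. This is a rough sketch only. `DetectedFace`, `Frame`, `platformFaces`, and `facesFor` are hypothetical names invented here; they are not the proposal's actual metadata definitions.

```typescript
// Hypothetical face shape shared by both paths; not the proposal's definition.
interface DetectedFace { x: number; y: number; width: number; height: number; }

// Hypothetical frame: platformFaces is set only when the camera stack
// supplied face detection results for this frame.
interface Frame { platformFaces?: DetectedFace[]; }

type Detector = (frame: Frame) => DetectedFace[];

interface FaceResult {
  faces: DetectedFace[];
  source: "platform" | "own-model";
}

// Prefer platform results when present, since they cost the app nothing
// extra; otherwise run the app's own model.
function facesFor(frame: Frame, ownModel: Detector): FaceResult {
  // An empty array still counts as a platform result: detection ran and
  // found no faces, so the app's model is not invoked.
  if (frame.platformFaces !== undefined) {
    return { faces: frame.platformFaces, source: "platform" };
  }
  return { faces: ownModel(frame), source: "own-model" };
}
```

Because both branches return the same `FaceResult`, the rest of the app does not need to know which detector produced the faces.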

dontcallmedom-bot commented on August 26, 2024

This issue was mentioned in WEBRTCWG-2023-02-21 (Page 45)
