Comments (4)
This will impose a support burden on applications, which might need to maintain a camera blacklist.
My understanding is that UAs are responsible for ensuring the results are good enough.
This seems no different than other existing APIs such as echo cancellation or HW encoders.
Such a list would be difficult to develop without the ability to identify the camera hardware, which in turn could be considered a fingerprinting risk.
This is gated by camera permission.
Video frames are already exposed to the web application.
I am unclear which additional fingerprinting information this API would actually provide.
These issues do not arise for applications utilizing an existing face detection model written for an ML platform, since those models will yield the same results, albeit with better or worse performance depending on the (GPU) hardware.
This is true of many existing features, I do not see what is special here.
Echo cancellation for instance can be done by the OS, the UA or the web application.
All three options yield different results and different performance, which gives developers a healthy playground.
from mediacapture-extensions.
The issue is that ML models are under a lot of scrutiny, so an API that yields different results depending on the hardware is a problem. Today models require a lot of validation before they can be deployed; they are increasingly being treated like drugs, with a multi-stage process involving review panels and large-scale testing. All existing W3C APIs for ML model acceleration enable the same model to be deployed everywhere, albeit running slower or faster depending on the hardware. This API proposal does not provide the level of uniformity of existing approaches, nor does it compare favorably on coverage.
If web developers need exactly uniform results everywhere, they are free to deploy an ML model of their choice using Web ML APIs. However, we believe that there are numerous applications where identical results are not required as long as the results have reasonable quality (and user agents can filter out implementations with unreasonably low quality).
It also works the other way: when system face detection models improve, existing applications that use the proposed API get the benefit automatically, which would not happen if they had deployed their own model.
As youennf mentioned, there are many existing W3C APIs where the quality varies depending on platform yet they have been found important enough to be supported: for example the Shape Detection API and the WebCodecs VideoEncoder and AudioEncoder interfaces. Here's a table showing how widely the quality of H.264 video encoders can vary yet any could be used for implementing the WebCodecs VideoEncoder API.
Performance is another important aspect. In many cases face detection (FD) through the proposed API is free or nearly free computationally, because camera algorithms often run FD internally whether or not the user wants the results, and many users might opt to use the API even if the results vary to some degree.
Finally, even if an app decides to deploy its own ML model, it could still make use of the metadata definitions from this proposal.
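The trade-off described in this comment can be sketched as a small selection routine: use platform face detection when approximate results are acceptable, and use an app-supplied model when uniform results are required. This is only an illustrative sketch. The names `FaceBox`, `FaceDetector`, `DetectorOptions` and `chooseDetector` are hypothetical and do not come from the proposal or from any existing Web API; how each detector is actually obtained is left out.

```typescript
// Hypothetical types for illustration only; not taken from the proposal or any spec.
interface FaceBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// A detector takes a video frame (opaque here) and returns face bounding boxes.
type FaceDetector = (frame: unknown) => FaceBox[];

interface DetectorOptions {
  // True when the app must get identical results on every device.
  requireUniformResults: boolean;
  // Detector backed by platform/camera results, if the user agent exposes one.
  platformDetector?: FaceDetector;
  // Detector backed by a model the app ships itself.
  appModelDetector?: FaceDetector;
}

type DetectorSource = "platform" | "app-model" | "none";

interface DetectorChoice {
  source: DetectorSource;
  detect: FaceDetector;
}

// Fallback used when no suitable detector is available: reports no faces.
const noDetector: DetectorChoice = { source: "none", detect: () => [] };

function chooseDetector(opts: DetectorOptions): DetectorChoice {
  const { requireUniformResults, platformDetector, appModelDetector } = opts;

  if (requireUniformResults) {
    // Platform results may differ per device, so never fall back to them here.
    return appModelDetector
      ? { source: "app-model", detect: appModelDetector }
      : noDetector;
  }

  // Approximate results are acceptable: prefer the (often near-free) platform path.
  if (platformDetector) {
    return { source: "platform", detect: platformDetector };
  }
  if (appModelDetector) {
    return { source: "app-model", detect: appModelDetector };
  }
  return noDetector;
}
```

An application that does not need uniform results would pass both detectors and get the platform one when it is available; an application with strict validation requirements would set `requireUniformResults` and always run its own model.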
This issue was mentioned in WEBRTCWG-2023-02-21 (Page 45)