Comments (6)
That seems to me to be a slightly defeatist attitude - i.e. unless all devices can do this well we shouldn't allow the ones that can to expose the hardware capability.
We've already seen instances of background blur being done on a server because the GPU isn't always capable of running the required WebML models in real time.
I suspect that users who find that common TensorFlow models don't reliably detect their faces would welcome the ability to buy a camera that does - I can imagine communities adopting certain webcams precisely because they work well for (say) grey-beards.
from mediacapture-extensions.
> it is likely to only be supported on new camera models
AFAIK, iOS devices have had cameras supporting this for quite some time. Coverage on this particular platform should be pretty good today.
> Those applications that include their own face detection models will also probably choose to forgo the proposed APIs, choosing instead to leverage GPU-based acceleration approaches supported by Web ML platforms such as TensorFlow.js.
I do not think one approach excludes the other.
I could, for instance, see camera driver face detection output being refined by WebML models as a perf optimization.
In general, I would tend to think that if native applications have a use for some native APIs, web applications will probably have a use case for similar web APIs. This principle seems to apply well here.
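The refinement idea above can be sketched as follows: take a coarse face box reported by the platform, pad it, and hand only that region to a heavier model. This is a minimal illustration under stated assumptions, not the proposal's API; `Box`, `regionOfInterest`, `pixelsSaved`, and the default 25% margin are all hypothetical names and values chosen for this example.

```typescript
// Hypothetical shape for a face bounding box in pixel coordinates.
// None of these names come from the mediacapture-extensions proposal.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Pad a coarse platform-reported face box by `margin` (a fraction of the
// box size) and clamp it to the frame. The result is the region a WebML
// model would refine instead of scanning the whole frame.
function regionOfInterest(
  face: Box,
  frameWidth: number,
  frameHeight: number,
  margin: number = 0.25
): Box {
  const padX = face.width * margin;
  const padY = face.height * margin;
  const left = Math.max(0, face.x - padX);
  const top = Math.max(0, face.y - padY);
  const right = Math.min(frameWidth, face.x + face.width + padX);
  const bottom = Math.min(frameHeight, face.y + face.height + padY);
  return {
    x: left,
    y: top,
    width: Math.max(0, right - left),
    height: Math.max(0, bottom - top),
  };
}

// Fraction of the frame's pixels the model no longer has to process.
function pixelsSaved(roi: Box, frameWidth: number, frameHeight: number): number {
  return 1 - (roi.width * roi.height) / (frameWidth * frameHeight);
}
```

For a 200x200 face box in a 1280x720 frame, the padded region is 300x300, so the model would touch roughly a tenth of the pixels. Whether that translates into a real speed-up depends on the model and on the cost of cropping.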
A face detection API has been available to Android phone vendors since API level 14 (Android 4.0, 2011), and in 2015, 54% or 1.3 billion of the devices shipped were based on Android (Wikipedia). I don't have hard figures on what percentage of these devices actually implement the face detection API, but I would assume that currently virtually all do, as it is usually implemented as part of the camera control algorithms (auto exposure, focus, white balance) to improve image quality. For example, my Motorola Droid 4 from 2012 and Huawei Honor 7 from 2015 both support it. If someone here knows of an Android phone that does not support it, please let us know.
On ChromeOS, the Android Camera2 API, which supports face detection, has been available on all Pixelbooks from Google (from 2017 onwards). We actually originally implemented the Chromium face detection support on Pixelbook Go (shipped 2019).
On Windows 10 and above, clients with driver support have had a face detection API available. At least in the latest Windows versions, if the driver doesn't provide face detection, Windows uses Windows.Media.FaceAnalysis to implement it, so missing driver or camera support shouldn't be a showstopper. We agree that the percentage of Windows clients on the market that can take advantage of this right now is low, but soon enough people should have updated Windows to a more recent version with face detection support.
As youennf mentioned, iOS devices have also supported this for some time, although we don't have first-hand experience of that platform.
@steely-glint The goal of existing W3C ML APIs is to allow the same model to run on any browser or hardware, albeit slower if acceleration is not available. In order to ensure the widest range of applicability, existing API proposals rely on GPU acceleration, which is also what ML frameworks use for acceleration and is the approach identified in the WebRTC-NV Use Cases document. Depending on camera hardware will limit applicability compared with the GPU acceleration approach.
One way to address the applicability problem would be to find a way to support failover. Other APIs such as Media Capabilities make it possible for applications to understand performance characteristics under various conditions.
> Depending on camera hardware will limit applicability compared with the GPU acceleration approach.
We do not depend on camera hardware in the proposal. In fact, no MIPI-based camera has face detection built in. On systems with a MIPI camera, the image is processed by a chip that is typically on the SoC. For instance, Intel has had its Image Processing Unit (IPU) with face detection as part of selected SoCs since around 2012. Basically all mobile phones have a MIPI camera and thus FD on the SoC. Several newer laptops also have a MIPI camera, although this is much rarer, but as mentioned, even with a USB camera without FD support, recent Windows versions provide failover support.
> One way to address the applicability problem would be to find a way to support failover. Other APIs such as Media Capabilities make it possible for applications to understand performance characteristics under various conditions.
It is true that our API proposal does not give access to face detection performance characteristics. That is something which might make sense to add. However, I suspect that platform APIs themselves provide little information on performance. One simple approach would be to not expose the FD API in cases where the platform implementation is known to be relatively slow (compared to, e.g., W3C ML APIs) or of low quality.
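The failover discussed here could look roughly like the sketch below from an application's point of view: use the platform detector when it is exposed and not known to be slow or low quality, otherwise fall back to an application-supplied WebML model. Everything in it is an assumption made for illustration; `PlatformSupport`, `knownSlowOrLowQuality`, `DetectorChoice`, and `chooseDetector` are hypothetical and do not appear in the proposal.

```typescript
// Which detector the application ends up using.
type DetectorChoice = "platform" | "webml-fallback" | "none";

// Hypothetical summary of what the browser reports about platform face
// detection. The proposal does not define these fields.
interface PlatformSupport {
  // True if the browser exposes platform face detection for this camera.
  available: boolean;
  // True if the implementation is known to be slow or of low quality,
  // in which case a browser might choose not to expose it at all.
  knownSlowOrLowQuality: boolean;
}

// Prefer the platform detector when it is exposed and trustworthy,
// otherwise fall back to the application's own model if one is loaded.
function chooseDetector(
  platform: PlatformSupport,
  fallbackModelLoaded: boolean
): DetectorChoice {
  if (platform.available && !platform.knownSlowOrLowQuality) {
    return "platform";
  }
  return fallbackModelLoaded ? "webml-fallback" : "none";
}
```

Keeping the decision in one pure function makes it easy to swap in richer signals later, for example measured per-frame latency, if the API were to expose performance characteristics.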
This issue was mentioned in WEBRTCWG-2023-02-21 (Page 43)