Comments (6)

steely-glint commented on August 26, 2024

That seems to me to be a slightly defeatist attitude - i.e. unless all devices can do this well we shouldn't allow the ones that can to expose the hardware capability.

We've already seen instances of background blur being done on a server because the GPU isn't always capable of running the required WebML in real time.

I suspect that users who find that common TensorFlow models don't reliably detect their faces would welcome the ability to buy a camera that does - I can imagine communities adopting certain webcams precisely because they work well for (say) grey-beards.

from mediacapture-extensions.

youennf commented on August 26, 2024

> it is likely to only be supported on new camera models

AFAIK, iOS devices have had cameras supporting this for quite some time. Coverage on this particular platform should be pretty good today.

> Those applications that include their own face detection models will also probably choose to forgo the proposed APIs, choosing instead to leverage GPU-based acceleration approaches supported by Web ML platforms such as TensorFlow.js.

I do not think one approach excludes the other.
I could, for instance, see camera driver face detection input being refined by WebML models as a perf optimization.
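As a rough illustration of that refinement idea, the sketch below assumes the platform hands back coarse face boxes and that an ML model is then run only on a slightly enlarged region around each one. The `Box` type and both function names are hypothetical; they are not part of the proposal or of any ML library.

```typescript
// Hypothetical shape for a detected face; not taken from the proposal.
interface Box { x: number; y: number; width: number; height: number; }

// Grow a coarse face box by `margin` (a fraction of its size) on every side,
// clamped to the frame, so a model only has to look at this region.
function regionOfInterest(
  face: Box, margin: number, frameWidth: number, frameHeight: number,
): Box {
  const dx = face.width * margin;
  const dy = face.height * margin;
  const left = Math.max(0, face.x - dx);
  const top = Math.max(0, face.y - dy);
  const right = Math.min(frameWidth, face.x + face.width + dx);
  const bottom = Math.min(frameHeight, face.y + face.height + dy);
  return { x: left, y: top, width: right - left, height: bottom - top };
}

// Fraction of the frame's pixels the model would still need to process.
// Overlapping regions are counted twice, so this is an upper bound.
function workFraction(regions: Box[], frameWidth: number, frameHeight: number): number {
  const area = regions.reduce((sum, r) => sum + r.width * r.height, 0);
  return area / (frameWidth * frameHeight);
}
```

For a single 200x200 face in a 1280x720 frame with a 25% margin, the model would process a 300x300 crop, which is a bit under 10% of the frame.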

In general, I would tend to think that if native applications have a use for some native APIs, web applications will probably have a use case for similar web APIs. This principle seems to apply well here.

ttoivone commented on August 26, 2024

A face detection API has been available to Android phone vendors since API level 14 (Android 4.0, 2011), and in 2015, 54% or 1.3 billion devices shipped were based on Android (Wikipedia). I don't have hard facts on how large a percentage of these devices actually implement the face detection API, but I would assume virtually all currently do, as it is usually implemented as part of the camera control algorithms (auto exposure, focus, white balance) to improve image quality. For example, my Motorola Droid 4 from 2012 and Huawei Honor 7 from 2015 both support it. If someone here knows of an Android phone that does not support it, please let us know.

On ChromeOS, the Android Camera2 API, which supports face detection, has been available on all Pixelbooks from Google (from 2017 onwards). We actually originally implemented the Chromium face detection support on Pixelbook Go (shipped 2019).

On Windows 10 and above, clients with driver support have supported a face detection API. At least in the latest Windows versions, if the driver doesn't provide face detection, Windows uses Windows.Media.FaceAnalysis to implement it, so missing driver/camera support shouldn't be a showstopper. We agree that the percentage of Windows clients on the market that can take advantage of this right now is low, but soon enough people should have updated Windows to a more recent version with face detection support.

As youennf mentioned, iOS devices have also supported this for some time, although we don't have first-hand experience of that platform.

aboba commented on August 26, 2024

@steely-glint The goal of existing W3C ML APIs is to allow the same model to run on any browser or hardware, albeit slower if acceleration is not available. In order to ensure the widest range of applicability, existing API proposals rely on GPU acceleration, which is also what ML frameworks use for acceleration and is the approach identified in the WebRTC-NV Use Cases document. Depending on camera hardware will limit applicability compared with the GPU acceleration approach.

One way to address the applicability problem would be to find a way to support failover. Other APIs such as Media Capabilities make it possible for applications to understand performance characteristics under various conditions.
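A minimal sketch of what application-side failover could look like, assuming each detection path is wrapped behind a common interface and reports `null` when it cannot run on the current device. The `Detector` interface, `FaceBox` type, and `detectWithFailover` function are hypothetical names made up for illustration; a real implementation would most likely be asynchronous.

```typescript
// Hypothetical shape for a detected face.
interface FaceBox { x: number; y: number; width: number; height: number; }

// Hypothetical wrapper around one detection path (platform-backed, ML-based, ...).
// `detect` returns null when this path is unavailable on the current device.
interface Detector<F> {
  name: string;
  detect(frame: F): FaceBox[] | null;
}

// Try detectors in order of preference and return the first usable result.
// An empty array is a valid answer ("no faces"), so only null triggers failover.
function detectWithFailover<F>(
  detectors: Detector<F>[], frame: F,
): { source: string; faces: FaceBox[] } | null {
  for (const detector of detectors) {
    const faces = detector.detect(frame);
    if (faces !== null) {
      return { source: detector.name, faces };
    }
  }
  return null; // No detection path is available at all.
}
```

Listing the platform path first and an ML path second would give the behaviour discussed here: hardware-assisted detection where it exists, with the model as the fallback.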

ttoivone commented on August 26, 2024

@aboba

> Depending on camera hardware will limit applicability compared with the GPU acceleration approach.

We do not depend on camera hardware in the proposal. In fact, no MIPI-based camera has face detection built in. On systems with a MIPI camera, the image is processed by a chip, typically on the SoC. For instance, Intel has had its Image Processing Unit (IPU) with face detection as part of selected SoCs since around 2012. Basically all mobile phones have a MIPI camera and thus FD on the SoC. Several newer laptops also have a MIPI camera, although this is much rarer; but as mentioned, even with a USB camera without FD support, recent Windows versions provide failover support.

> One way to address the applicability problem would be to find a way to support failover. Other APIs such as Media Capabilities make it possible for applications to understand performance characteristics under various conditions.

It is true that our API proposal does not give access to face detection performance characteristics. That is something which might make sense to add. However, I suspect that platform APIs themselves provide little information on performance. One simple approach would be to not expose the FD API in cases where the platform implementation is known to be relatively slow (compared to e.g. W3C ML APIs), or of low quality.
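The gating rule in the last sentence could be sketched roughly as follows. Everything here is an assumption for illustration: the `DetectorProfile` fields, the reference latency, and the `slackFactor` threshold are made up, and platform APIs may not report any of this information.

```typescript
// Hypothetical description of what is known about a platform FD implementation.
interface DetectorProfile {
  msPerFrame: number;       // assumed to be measured or known in advance
  knownLowQuality: boolean; // e.g. flagged on a per-device list
}

// Expose the platform detector only if it is not known to be of low quality and
// is no slower than `slackFactor` times a reference path (e.g. an ML-based one).
function shouldExposePlatformDetector(
  platform: DetectorProfile,
  referenceMsPerFrame: number,
  slackFactor: number = 1.0,
): boolean {
  if (platform.knownLowQuality) {
    return false;
  }
  return platform.msPerFrame <= referenceMsPerFrame * slackFactor;
}
```

With a reference of 20 ms per frame, a 5 ms platform detector would be exposed, a 50 ms one would not, and a 25 ms one would be exposed only with a slack factor of at least 1.25.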

dontcallmedom-bot commented on August 26, 2024

This issue was mentioned in WEBRTCWG-2023-02-21 (Page 43)
