
Comments (8)

mtschirs commented on August 16, 2024

Hi boynet,

the correct approach to smile detection is:

  1. Detect the face
  2. Define a region of interest (ROI) within the face where you expect the mouth to be
  3. Detect the smile within this region of interest (ROI)

This approach leads to very good runtime and detection performance.

Example (didn't test the code):

// Define face detector:
var width = ~~(60 * video.videoWidth / video.videoHeight);
var faceDetector = new objectdetect.detector(width, 60, 1.1, objectdetect.frontalface);

// Define smile detector:
var smileDetector = new objectdetect.detector(36 * 1.5, 24 * 1.5, 1.05, objectdetect.smile);

// Detect face:
var faceRects = faceDetector.detect(video, 1);
if (faceRects[0]) {
  var faceRect = faceRects[0];

  // Rescale face coordinates from detector to video coordinate space:
  faceRect[0] *= video.videoWidth / faceDetector.canvas.width;
  faceRect[1] *= video.videoHeight / faceDetector.canvas.height;
  faceRect[2] *= video.videoWidth / faceDetector.canvas.width;
  faceRect[3] *= video.videoHeight / faceDetector.canvas.height;

  // Detect smile:
  var smileROI = [~~(faceRect[0] + faceRect[2] * 0.15), ~~(faceRect[1] + faceRect[3] * 0.55), ~~(faceRect[2] * 0.7), ~~(faceRect[2] * 0.7 * smileDetector.canvas.height / smileDetector.canvas.width)];
  var smileRects = smileDetector.detect(video, 1, 1, smileROI);
  ...
}

from js-objectdetect.

boynet commented on August 16, 2024

Thanks for giving me something to start with. I got it to work with your code, but with the OpenCV smile detector; the detector you provide gives me a lot of false positives. Where can I find some information to start understanding what is happening? I looked at the code, but I can't understand the logic behind the detector width and height. For example, why does the smile detector use different values from the face detector? Based on what? Thanks.

mtschirs commented on August 16, 2024

The smile detector included in the js-objectdetect repository is the same as the OpenCV-smile detector, just converted into the js-objectdetect format. Therefore I don't understand how you can have different results for both...

Each detector has a user-defined width and height. If you detect objects in an image or video using this detector, the image or video is first resized to the detector's width and height. You have to make sure that the aspect ratio of the detector and that of the source image or video are identical. The coordinates of detected objects are relative to the detector's width and height. A region of interest, however, is always given in coordinates relative to the source image or video's width and height. Therefore a conversion is necessary.
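
The conversion could be sketched roughly as follows. `detectorToSource` is a hypothetical helper, not a function of js-objectdetect, and the 80x60 detector and 640x480 source sizes are made-up example values with the same 4:3 aspect ratio:

```javascript
// Hypothetical helper (not part of js-objectdetect): maps a rectangle
// [x, y, width, height] from detector coordinates to source coordinates.
function detectorToSource(rect, detectorSize, sourceSize) {
  var scaleX = sourceSize.width / detectorSize.width;
  var scaleY = sourceSize.height / detectorSize.height;
  return [rect[0] * scaleX, rect[1] * scaleY, rect[2] * scaleX, rect[3] * scaleY];
}

// Example values (assumed): an 80x60 detector applied to a 640x480 video,
// so every coordinate is scaled by a factor of 8.
var faceInVideo = detectorToSource(
  [10, 5, 20, 20],
  { width: 80, height: 60 },
  { width: 640, height: 480 }
);
console.log(faceInVideo); // [ 80, 40, 160, 160 ]
```

The result is in source coordinates and can therefore be used as the basis for a region of interest.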

I understand that this is a bit confusing and I might simplify this in a future release of js-objectdetect.

Concerning the different values for the face and smile detectors: The face detector has a size of 60x60 pixels and a scale factor of 1.1. These values have been chosen to give good results with good runtime performance. If you increase the size of the face detector, you will be able to detect smaller faces in the source image or video; however, the number of false positives will increase and the runtime performance will decrease.
Also, if you decrease the scale factor, detection performance will improve at the cost of reduced runtime performance.

The aspect ratio of the smile detector (54x36) is identical to that of the region of interest for smile detection.
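
As a rough illustration of how such a region of interest can be derived so that it keeps the detector's aspect ratio, here is a small sketch. `mouthRoi` is a hypothetical helper; the 0.15, 0.55 and 0.7 factors are taken from the example code earlier in this thread, and `Math.round` is used here instead of the `~~` truncation from that example:

```javascript
// Hypothetical helper: derives a mouth region of interest from a face
// rectangle [x, y, width, height] given in source (video) coordinates.
// The ROI height is chosen so that the ROI has the same aspect ratio
// as the smile detector (detectorWidth x detectorHeight).
function mouthRoi(faceRect, detectorWidth, detectorHeight) {
  var roiWidth = faceRect[2] * 0.7;
  return [
    Math.round(faceRect[0] + faceRect[2] * 0.15), // x: inset from the left edge of the face
    Math.round(faceRect[1] + faceRect[3] * 0.55), // y: lower part of the face
    Math.round(roiWidth),
    Math.round(roiWidth * detectorHeight / detectorWidth)
  ];
}

// Example values (assumed): a 200x200 face at (100, 50), 54x36 smile detector.
var roi = mouthRoi([100, 50, 200, 200], 54, 36);
console.log(roi); // [ 130, 160, 140, 93 ]
```

Here 140 / 93 is approximately 1.5, which matches 54 / 36.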

boynet commented on August 16, 2024

Thanks, I'm starting to see some good results.

  1. What is the advantage of decreasing/increasing the scale factor vs. the width/height?
    As you said, to capture smaller things I need to increase the dimensions, and to get more accurate results I need to decrease the scale factor? I saw that a scale factor of 1 isn't working, so it needs to be >1.

  2. About the detect stepSize: does it have the same effect as the scale factor? Increase for better performance and decrease for better results?

  3. I saw the canny parameter. I guess it shouldn't be used? I tried it and saw no difference.

mtschirs commented on August 16, 2024

Each classifier has been trained on and detects images of certain width and height. The face classifier for example was trained on 24x24 pixel face images. However, in real life, you want to detect faces of all kinds of sizes and not only 24x24 pixels. Therefore, the detector applies the classifier to resized versions of the camera image. It is explained in more detail here: http://www.pyimagesearch.com/2015/03/23/sliding-windows-for-object-detection-with-python-and-opencv/

Example: If your camera gives you images of 600x600 pixels, then you would run the face classifier on all possible positions in the 600x600 pixel image to detect the very small (far away) 24x24 pixel faces. Then you resize the camera image to, let's say, 500x500 pixels and run the classifier again on the resized image. This time, the classifier still detects only 24x24 pixel faces; however, these faces correspond to bigger (closer) faces in the original camera image. These resizing and classifying steps are repeated until all "scales" or sizes from 24x24 pixels to the original camera image dimensions have been covered.
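
The repeated resize-and-classify loop can be illustrated by listing which object sizes each pass covers. This is only a sketch of the general idea, not the way js-objectdetect enumerates scales internally; `detectableSizes` is a hypothetical function and a scale factor of 2 is used to keep the numbers simple:

```javascript
// Hypothetical illustration: lists the object sizes (in pixels of the
// original image) that are covered when a classifier window of
// `windowSize` is applied at successive scales.
function detectableSizes(windowSize, imageSize, scaleFactor) {
  // With a scale factor of 1 (or less) the size would never grow,
  // so the loop below would not terminate.
  if (scaleFactor <= 1) {
    throw new Error("scaleFactor must be greater than 1");
  }
  var sizes = [];
  for (var size = windowSize; size <= imageSize; size *= scaleFactor) {
    sizes.push(Math.round(size));
  }
  return sizes;
}

console.log(detectableSizes(24, 600, 2)); // [ 24, 48, 96, 192, 384 ]
```

A smaller scale factor such as 1.1 produces many more intermediate sizes, which is why it tends to improve detection at the cost of runtime.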

  1. The width and height of the detector determine the dimensions of the smallest detectable object. E.g. if you set width and height to 24x24 and use the (24x24) face classifier to detect faces, it would only detect a face if it fills the whole 24x24 image.
    The scale factor determines how the camera image will be resized in the resize step. E.g. if you set the scale factor to 2, detection will be performed on 24x24, 48x48, 96x96, ... resized versions of the original camera image.

  2. The step size determines how often the classifier is applied to one image or scale, i.e. the number of horizontal and vertical pixels between each application. It is best to always set it to 1.

  3. Canny pruning can be used to exclude parts of the camera image with less structure (edges) from the computationally costly detection process. However, it is tricky to use and leads to only minor performance improvements while at the same time reducing accuracy, so it is better not to use it.
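
To get a feeling for the effect of the step size, one can count how many window positions are evaluated at a single scale. This is a simplified, hypothetical calculation (`countWindowPositions` is not part of js-objectdetect) that assumes a square window no larger than the image:

```javascript
// Hypothetical illustration: number of positions at which a square
// classifier window is applied to one image at one scale.
function countWindowPositions(imageWidth, imageHeight, windowSize, stepSize) {
  var columns = Math.floor((imageWidth - windowSize) / stepSize) + 1;
  var rows = Math.floor((imageHeight - windowSize) / stepSize) + 1;
  return columns * rows;
}

console.log(countWindowPositions(60, 60, 24, 1)); // 1369
console.log(countWindowPositions(60, 60, 24, 2)); // 361
```

Doubling the step size cuts the work to roughly a quarter, but the classifier may then skip the position where the object is best aligned with the window.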

boynet commented on August 16, 2024

Thanks, it's a lot clearer now :) I can improve the demos a little bit. Would you accept a PR?

I think there is a small change that could bring a huge benefit: having detector.detect return results ordered by confidence. For example, in example_gesture_input, if I order the coordinates by confidence and add a threshold of confidence > 2, I get almost no false positives.
Without it, my neck is detected as a fist with a low confidence of 1-3, whereas a real fist always gets 2-100.

  1. A lot more complicated, but again making it a lot more accurate, would be to develop some kind of dumb guessing.
    In some frames my neck gets a higher confidence than my fist, so the detection jumps between my fist and my neck. I could do something like this: if my last fist positions were 10,10 => 10,11 => 10,12 and the next step detects two fists, one at 10,13 and the other at 150,150, it should prefer the one closest to the latest detection, as there is little chance that my fist moved that far. The movement would then be smoother, without jumping.
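
Both ideas, the confidence threshold and preferring the detection closest to the previous one, might be sketched like this. `pickDetection` is a hypothetical helper, and the sketch assumes that each detection is an array of the form [x, y, width, height, confidence]:

```javascript
// Hypothetical helper: drops low-confidence detections, then prefers the
// one whose center is closest to the previously accepted detection.
// Each detection is assumed to be [x, y, width, height, confidence].
function pickDetection(rects, lastRect, minConfidence) {
  var candidates = rects.filter(function (rect) {
    return rect[4] > minConfidence;
  });
  if (candidates.length === 0) return null;

  // Without a previous detection, fall back to the highest confidence.
  if (!lastRect) {
    return candidates.reduce(function (best, rect) {
      return rect[4] > best[4] ? rect : best;
    });
  }

  function squaredDistance(a, b) {
    var dx = (a[0] + a[2] / 2) - (b[0] + b[2] / 2);
    var dy = (a[1] + a[3] / 2) - (b[1] + b[3] / 2);
    return dx * dx + dy * dy;
  }
  return candidates.reduce(function (best, rect) {
    return squaredDistance(rect, lastRect) < squaredDistance(best, lastRect) ? rect : best;
  });
}

// Example values (assumed): the far-away detection has the higher
// confidence, but the one next to the last position is preferred.
var last = [10, 12, 40, 40, 5];
var current = [[150, 150, 40, 40, 9], [10, 13, 40, 40, 4], [80, 80, 40, 40, 1]];
console.log(pickDetection(current, last, 2)); // [ 10, 13, 40, 40, 4 ]
```

A real implementation would probably also limit how far a detection may jump between frames before the previous position is discarded.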

mtschirs commented on August 16, 2024

If you want to improve the demos, go ahead. However, I would like to keep the demos as simple as possible, as a kind of first 'practical example' for beginners who look at the js-objectdetect library. Perhaps I should rename them from 'demos' to 'examples', so that we can have another folder with more complex and interesting 'real' demonstrations, too...

  1. detector.detect already returns the results ordered by confidence (detector.detect returns an array of [x, y, width, height, confidence]).

  2. You are absolutely right! This is called "object tracking" and is a separate process from "object detection" (the tracking algorithm decides when and how often to call the object detection function). For simple applications such as the demos, tracking of a single object (e.g. a fist) can be done by running the object detection on every frame. However, there are several better ways to track objects, among them the so-called "Kalman filter" or, especially when it comes to face tracking, the "Camshift" algorithm. Your suggestion of tracking spatially close objects is probably sufficient for a lot of applications.

I didn't implement a tracking algorithm in js-objectdetect since, as the name says, this library focuses on performant object detection. However, since there is also a smoother included, I don't see why it couldn't be added as a supplementary javascript unit in the future.
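
To give an idea of what a Kalman-style tracker does, here is a minimal one-dimensional sketch with a constant-position model. It is not the smoother included in js-objectdetect and not a full Kalman filter with velocity; `createKalman1D` and its noise parameters are assumptions made for illustration:

```javascript
// Minimal 1D Kalman filter sketch (constant-position model).
// processNoise: how much the true value is expected to change per frame.
// measurementNoise: how noisy a single detection is assumed to be.
function createKalman1D(processNoise, measurementNoise) {
  var estimate = null;
  var variance = 1;
  return function update(measurement) {
    if (estimate === null) {
      estimate = measurement; // initialize with the first measurement
      return estimate;
    }
    variance += processNoise;                              // predict step
    var gain = variance / (variance + measurementNoise);   // Kalman gain
    estimate += gain * (measurement - estimate);           // correct step
    variance *= (1 - gain);
    return estimate;
  };
}

// Example: smooth the x coordinate of a detection over several frames.
var smoothX = createKalman1D(0.01, 4);
var first = smoothX(10);  // 10, the first measurement is taken as is
var second = smoothX(20); // moves only part of the way towards 20
console.log(first, second);
```

One filter per coordinate (x and y) would be needed to smooth a position; a sudden jump in the measurement then only shifts the estimate by a fraction of the jump.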

boynet commented on August 16, 2024

About confidence ordering: in the "example_gesture_scroll" example you order them in line 316. Maybe it's left over from some old code.

I found some implementations of Camshift and the Kalman filter: https://github.com/auduno/headtrackr/blob/master/src/camshift.js https://github.com/itamarwe/kalman I will check them out if I need more performance.
I just made a mouth + face detection sample. On PC I got 50-60 fps, while on mobile it's somewhere around 20-30 fps, which is great.
Thank you very much for this library.
