
vladmandic / human


Human: AI-powered 3D Face Detection & Rotation Tracking, Face Description & Recognition, Body Pose Tracking, 3D Hand & Finger Tracking, Iris Analysis, Age & Gender & Emotion Prediction, Gaze Tracking, Gesture Recognition

Home Page: https://vladmandic.github.io/human/demo/index.html

License: MIT License

Languages: JavaScript 2.65%, TypeScript 27.46%, HTML 68.35%, Shell 0.02%, CSS 1.52%
Topics: face-detection, gender-prediction, iris-tracking, body-tracking, hand-tracking, tensorflowjs, age-estimation, face-position, tfjs, emotion-detection

human's Introduction

(Profile readme: project cards grouped as Human Project, SD.Next Project, Other Projects, Development Tools, Minor Projects & Tests, TensorFlow Port & Tools, and ML Model Ports.)

human's People

Contributors

augchan42, butzyung, dalvine, faerongaming, jimnys-8, meeki007, reaper622, vladmandic


human's Issues

createImageBitmap and bitmap.close() cannot be used in Safari

Issue Description

createImageBitmap and bitmap.close() cannot be used in Safari.

human/src/human.js

Lines 463 to 485 in 77408fc

  async warmup(userConfig) {
    const b64toBlob = (base64, type = 'application/octet-stream') => fetch(`data:${type};base64,${base64}`).then((res) => res.blob());
    if (userConfig) this.config = mergeDeep(this.config, userConfig);
    const video = this.config.videoOptimized;
    this.config.videoOptimized = false;
    let blob;
    switch (this.config.warmup) {
      case 'face': blob = await b64toBlob(sample.face); break;
      case 'full': blob = await b64toBlob(sample.body); break;
      default: blob = null;
    }
    if (!blob) return null;
    const bitmap = await createImageBitmap(blob);
    const t0 = now();
    const warmup = await this.detect(bitmap, config);
    const t1 = now();
    bitmap.close();
    log('Warmup', this.config.warmup, (t1 - t0), warmup);
    this.config.videoOptimized = video;
    return warmup;
  }
}

createImageBitmap and ImageBitmap are not available in Safari.

Please revert to the previous implementation:

human/src/human.js

Lines 425 to 465 in 977f92f

  async warmup(userConfig) {
    if (userConfig) this.config = mergeDeep(this.config, userConfig);
    return new Promise((resolve) => {
      const video = this.config.videoOptimized;
      this.config.videoOptimized = false;
      let src;
      let size;
      switch (this.config.warmup) {
        case 'face':
          size = 256;
          src = sample.face;
          break;
        case 'full':
          size = 1200;
          src = sample.body;
          break;
        default:
          size = 0;
          src = null;
      }
      const img = new Image(size, size);
      img.onload = () => {
        const canvas = (typeof OffscreenCanvas !== 'undefined') ? new OffscreenCanvas(size, size) : document.createElement('canvas');
        canvas.width = size;
        canvas.height = size;
        const ctx = canvas.getContext('2d');
        ctx.drawImage(img, 0, 0);
        const data = ctx.getImageData(0, 0, size, size);
        const t0 = now();
        this.detect(data, config).then((warmup) => {
          const t1 = now();
          log('Warmup', this.config.warmup, (t1 - t0), warmup);
          this.config.videoOptimized = video;
          resolve(warmup);
        });
      };
      if (src) img.src = src;
      else resolve(null);
    });
  }
}
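
One way to reconcile the two approaches, sketched here purely as an illustration (the helper name and structure are assumptions, not part of Human): feature-detect createImageBitmap and fall back to the Image/canvas path in browsers such as Safari that lack it.

// Hypothetical sketch: use createImageBitmap when available, otherwise decode via an <img> and a 2D canvas.
async function blobToDetectInput(blob, size) {
  if (typeof createImageBitmap === 'function') {
    return createImageBitmap(blob); // fast path (Chromium, Firefox)
  }
  // fallback for browsers without createImageBitmap, e.g. Safari at the time of this issue
  const img = new Image(size, size);
  const loaded = new Promise((resolve) => { img.onload = resolve; });
  img.src = URL.createObjectURL(blob);
  await loaded;
  URL.revokeObjectURL(img.src);
  const canvas = (typeof OffscreenCanvas !== 'undefined') ? new OffscreenCanvas(size, size) : document.createElement('canvas');
  canvas.width = size;
  canvas.height = size;
  const ctx = canvas.getContext('2d');
  ctx.drawImage(img, 0, 0, size, size);
  return ctx.getImageData(0, 0, size, size);
}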

Better Optimization for the No-landmarks option

I was reading through the code and saw your implementation of hand detection, with and without landmarks, and thought that you could optimize the no-landmarks case by using only the getBoundingBoxes function instead of estimateHandBounds and estimateHands.
Thoughts?

WASM Not Initialized in human

Issue Description
The WASM backend is used directly without being initialized first, which is a possible problem in the library:
WASM was not initialized, run tf.ready() or tf.setBackend("wasm") before doing this
Steps to Reproduce
Use the demo
Expected Behavior
No errors
Environment

  • Browser or NodeJS and version (e.g. NodeJS 14.15 or Chrome 86) Chrome
  • OS and Hardware platform (e.g. Windows 10, Ubuntu Linux on x64, Android 10) win10-x64
  • Packager (if any) (e.g, webpack, rollup, parcel, esbuild, etc.) -

Additional

  • For installation or startup issues include your package.json
  • For usage issues, it is recommended to post your code as gist
    N/A
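
For reference, a minimal sketch of explicitly initializing the TFJS WASM backend before any detection call, assuming the standard @tensorflow/tfjs and @tensorflow/tfjs-backend-wasm packages (the wasm binaries path is an assumption):

import * as tf from '@tensorflow/tfjs';
import { setWasmPaths } from '@tensorflow/tfjs-backend-wasm';

async function initWasmBackend() {
  setWasmPaths('/assets/');      // directory hosting the tfjs .wasm binaries (path is an assumption)
  await tf.setBackend('wasm');   // register and select the WASM backend
  await tf.ready();              // resolves once the backend is fully initialized
}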

Hand Pose very slow

Issue Description
Handpose performance is very slow, even with just one hand.

Steps to Reproduce
Open the live demo
Enable hand pose only
Set max objects to 1

Expected Behavior
Not slow or choppy

Environment
Chrome 87

See attached video for demonstration
human-handpose.zip

enable support for nodejs v12

I'm not using any Node v14 specific features, so it should work fine with Node v12; I just skipped it as I didn't see the need.
Most likely it will work; I just need to update the project and run tests. I'll update here when done.

Originally posted by @vladmandic in #52 (comment)

model tuning and evaluation

Hi,
I was wondering if there is any reason not to support "tf.loadLayersModel", which would open up the possibility of using more of the pretrained models that are available.

Would it make sense to add a condition and a Layers-vs-GraphModel definition to the config file?
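
For illustration only, a sketch of what such a config-driven switch could look like; the format field is hypothetical and not part of Human's config, while tf.loadGraphModel and tf.loadLayersModel are the standard TFJS loaders:

import * as tf from '@tensorflow/tfjs';

// Hypothetical: modelConfig.format selects the loader; Human currently uses graph models only.
async function loadModel(modelConfig) {
  return modelConfig.format === 'layers'
    ? tf.loadLayersModel(modelConfig.modelPath)  // Keras-style layers model
    : tf.loadGraphModel(modelConfig.modelPath);  // frozen graph model (current behavior)
}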

modularize loading of iris and mesh models

Issue Description
iris.bin is still loaded even when iris is set to false.

Steps to Reproduce
Set config:

iris: {
  enabled: false,
  ...
}

Expected Behavior
iris.bin file not loaded

Environment
Chrome 86, Node 14, macOS 10

Dynamically create a script element to load Human

document.createElement('script')
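
The issue body is terse; as a hedged illustration only, dynamically injecting the IIFE bundle could look like this (the bundle path is an assumption, and whether the constructor is exposed as Human or Human.default depends on the build):

// Hypothetical: inject the IIFE build at runtime and instantiate Human once it has loaded.
function loadHuman(src = '/dist/human.js') {
  return new Promise((resolve, reject) => {
    const script = document.createElement('script');
    script.src = src;
    script.async = true;
    script.onload = () => {
      // depending on the bundle, the constructor may be window.Human or window.Human.default
      const HumanCtor = window.Human.default || window.Human;
      resolve(new HumanCtor());
    };
    script.onerror = reject;
    document.head.appendChild(script);
  });
}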

Can I know if a person is looking at the camera ?

I know that right now I can tell whether the user's eyes are facing the camera by checking that both are at the same distance from the camera. The problem is that if someone is at the far left, for example, that person has to turn their head at an angle relative to the camera's point of view, and then they are no longer facing the camera.

I wonder if there is a way to detect these cases. We could triangulate where the person is looking from the 3D coordinates, or try to guess whether the person is looking by adding the X and Z parameters to the equation; for example, if you are at the far left, you should be facing slightly to the right.

Let me know if you have any ideas.

Edit: I'd like to add that I'm not looking for exact gaze or where the iris points, just whether the head is turned toward the camera.
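
To illustrate the geometric idea above, a rough sketch of estimating head yaw from two roughly symmetric 3D points of the face mesh; the landmark choice is a placeholder assumption, not Human's actual output format:

// Hypothetical sketch: left/right are [x, y, z] mesh points on opposite sides of the face
// (which indices to use is an assumption, and the mesh z-axis convention may differ).
function headYaw(left, right) {
  const dx = right[0] - left[0];
  const dz = right[2] - left[2];
  return Math.atan2(dz, dx); // ~0 when the face plane is parallel to the camera plane, grows as the head turns
}
// to account for a person standing at the far left/right of the frame, the yaw could then be
// compared against an expected yaw derived from the face's horizontal offset, as described above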

'ls' is not recognized as an internal or external command,

Issue Description
'ls' is not recognized as an internal or external command,

Steps to Reproduce
npm run build

Expected Behavior
The build should complete; 'ls' is not a valid command on Windows.

Environment
Windows 10
Node 15.0.1
npm 7.0.3

Additional
"build": "rimraf dist/* && npm run build-iife && npm run build-esm-bundle && npm run build-esm-nobundle && npm run build-node && npm run build-node-nobundle && npm run build-demo && ls -l dist/",

wasm performance issues

Issue Description
I am getting very low FPS with WASM in my hand-tracking app, using just one model and no landmarks. I get about 3 FPS with WASM SIMD threaded, which is very low.

Steps to Reproduce
Simply check my demo

Expected Behavior
At least 10 FPS?

The source code is here

When both the mesh and iris models are enabled, the page stalls for 4-5 seconds.

Issue Description
When I enable the mesh and iris models on my laptop (i5, 2.3 GHz), there is 4-5 seconds of lag the first time I open the web page. During this time, animations and other things cannot be rendered.
PS: the web page freezes after human.detect(videoElement, config) is executed; the video does not freeze if the page is merely opened.

Steps to Reproduce
1. Enable the mesh and iris models in the config file.
2. Preload the JSON and bin files.
3. Open a new page and start the camera to enable face detection.

Expected Behavior

Environment

  • NodeJS 14.15 and Chrome 86
  • Windows 10

Additional

  • "@vladmandic/human": "^0.9.12",
 import { IHumanOptions } from './face-camera.interface';

export const getConfig = (resourcesUrl: string = '', live = true, debug = false): IHumanOptions => {
  return {
    backend: 'webgl', // select tfjs backend to use
    wasmPath: `${resourcesUrl}human/assets/`, // path for wasm binaries
    // only used for backend: wasm
    console: debug, // enable debugging output to console
    async: true, // execute enabled models in parallel
    // this disables per-model performance data but
    // slightly increases performance
    // cannot be used if profiling is enabled
    profile: false, // enable tfjs profiling
    // this has significant performance impact
    // only enable for debugging purposes
    // currently only implemented for age,gender,emotion models
    deallocate: false, // aggressively deallocate gpu memory after each usage
    // only valid for webgl backend and only during first call
    // cannot be changed unless library is reloaded
    // this has significant performance impact
    // only enable on low-memory devices
    scoped: false, // enable scoped runs
    // some models *may* have memory leaks,
    // this wraps everything in a local scope at a cost of performance
    // typically not needed
    videoOptimized: true, // perform additional optimizations when input is video,
    // must be disabled for images
    // basically this skips object box boundary detection for every n frames
    // while maintaining in-box detection since objects cannot move that fast

    filter: {
      enabled: false, // enable image pre-processing filters
      width: 0, // resize input width
      height: 0, // resize input height
      // if both width and height are set to 0, there is no resizing
      // if just one is set, second one is scaled automatically
      // if both are set, values are used as-is
      return: true, // return processed canvas imagedata in result
      brightness: 0, // range: -1 (darken) to 1 (lighten)
      contrast: 0, // range: -1 (reduce contrast) to 1 (increase contrast)
      sharpness: 0, // range: 0 (no sharpening) to 1 (maximum sharpening)
      blur: 0, // range: 0 (no blur) to N (blur radius in pixels)
      saturation: 0, // range: -1 (reduce saturation) to 1 (increase saturation)
      hue: 0, // range: 0 (no change) to 360 (hue rotation in degrees)
      negative: false, // image negative
      sepia: false, // image sepia colors
      vintage: false, // image vintage colors
      kodachrome: false, // image kodachrome colors
      technicolor: false, // image technicolor colors
      polaroid: false, // image polaroid camera effect
      pixelate: 0 // range: 0 (no pixelate) to N (number of pixels to pixelate)
    },

    gesture: {
      enabled: true // enable simple gesture recognition
    },

    face: {
      enabled: true, // controls if the specified module is enabled
      // face.enabled is required for all face models:
      // detector, mesh, iris, age, gender, emotion
      // (note: module is not loaded until it is required)
      detector: {
        modelPath: `${resourcesUrl}human/models/blazeface-back.json`, // can be 'front' or 'back'.
        // 'front' is optimized for large faces
        // such as front-facing camera and
        // 'back' is optimized for distant faces.
        inputSize: 256, // fixed value: 128 for front and 256 for 'back'
        rotation: true, // use best-guess rotated face image or just box with rotation as-is
        maxFaces: 5, // maximum number of faces detected in the input
        // should be set to the minimum number for performance
        skipFrames: 15, // how many frames to go without re-running the face bounding box detector
        // only used for video inputs
        // e.g., if model is running at 25 FPS, we can re-use existing bounding
        // box for updated face analysis as the head probably hasn't moved much
        // in a short time (15 * 1/25 = 0.6 sec)
        minConfidence: 0.5, // threshold for discarding a prediction
        iouThreshold: 0.2, // threshold for deciding whether boxes overlap too much in
        // non-maximum suppression (0.1 means drop if overlap 10%)
        scoreThreshold: 0.5 // threshold for deciding when to remove boxes based on score
        // in non-maximum suppression,
        // this is applied on detection objects only and before minConfidence
      },

      mesh: {
        enabled: live,
        modelPath: `${resourcesUrl}human/models/facemesh.json`,
        inputSize: 192 // fixed value
      },

      iris: {
        enabled: live,
        modelPath: `${resourcesUrl}human/models/iris.json`,
        inputSize: 64 // fixed value
      },

      age: {
        enabled: false,
        modelPath: `${resourcesUrl}human/models/age-ssrnet-imdb.json`, // can be 'age-ssrnet-imdb' or 'age-ssrnet-wik'
        // which determines training set for model
        inputSize: 64, // fixed value
        skipFrames: 15 // how many frames to go without re-running the detector
        // only used for video inputs
      },

      gender: {
        enabled: false,
        minConfidence: 0.1, // threshold for discarding a prediction
        modelPath: `${resourcesUrl}human/models/gender-ssrnet-imdb.json`, // can be 'gender', 'gender-ssrnet-imdb' or 'gender-ssrnet-wik'
        inputSize: 64, // fixed value
        skipFrames: 15 // how many frames to go without re-running the detector
        // only used for video inputs
      },

      emotion: {
        enabled: false,
        inputSize: 64, // fixed value
        minConfidence: 0.2, // threshold for discarding a prediction
        skipFrames: 15, // how many frames to go without re-running the detector
        modelPath: `${resourcesUrl}human/models/emotion-large.json` // can be 'mini' or 'large'
      },

      embedding: {
        enabled: false,
        inputSize: 112, // fixed value
        modelPath: `${resourcesUrl}human/models/mobilefacenet.json`
      }
    },

    body: {
      enabled: false,
      modelPath: `${resourcesUrl}human/models/posenet.json`,
      inputSize: 257, // fixed value
      maxDetections: 10, // maximum number of people detected in the input
      // should be set to the minimum number for performance
      scoreThreshold: 0.8, // threshold for deciding when to remove boxes based on score
      // in non-maximum suppression
      nmsRadius: 20 // radius for deciding points are too close in non-maximum suppression
    },

    hand: {
      enabled: false,
      inputSize: 256, // fixed value
      skipFrames: 15, // how many frames to go without re-running the hand bounding box detector
      // only used for video inputs
      // e.g., if model is running at 25 FPS, we can re-use existing bounding
      // box for updated hand skeleton analysis as the hand probably
      // hasn't moved much in a short time (15 * 1/25 = 0.6 sec)
      minConfidence: 0.5, // threshold for discarding a prediction
      iouThreshold: 0.1, // threshold for deciding whether boxes overlap too much
      // in non-maximum suppression
      scoreThreshold: 0.8, // threshold for deciding when to remove boxes based on
      // score in non-maximum suppression
      maxHands: 1, // maximum number of hands detected in the input
      // should be set to the minimum number for performance
      landmarks: true, // detect hand landmarks or just hand boundary box
      detector: {
        modelPath: `${resourcesUrl}human/models/handdetect.json`
      },
      skeleton: {
        modelPath: `${resourcesUrl}human/models/handskeleton.json`
      }
    }
  };
};

Thanks a lot.
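
One way to keep that stall off the critical path, as a hedged sketch only (not an official fix): load and warm the enabled models once, behind a loading indicator, before starting the live video loop, so backend initialization and shader compilation do not happen during the first real frame. This mirrors the warmup pattern used in the demo code quoted in a later issue.

// Hypothetical pre-warm step, run once before the render/animation loop starts.
async function prewarm(human, config) {
  await human.load(config);            // fetch and instantiate all enabled models up front
  const dummy = new ImageData(64, 64); // tiny blank frame; the size is arbitrary
  await human.detect(dummy, config);   // forces backend initialization and shader compilation now
}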

Accuracy difference in tfjs hand pose official demo and this app

Windows 10 Professional
Intel i9 9900k
Graphics Intel UHD 630

It seems that accuracy is not as good, and there is also a performance lag between the official hand pose demo and this application. This is just my observation from testing the official demo and your Human app. The distance from the camera remained constant between the two applications. Am I doing something wrong?

Video demonstration of difference.

hand-pose.zip

Thanks.

Cannot use mesh model without enabling iris model

I get an exception when I call warmup("full"). It works if I enable the iris model.

Unhandled Rejection (TypeError): Cannot read property '0' of undefined
  at Human.detectFace (src/human.js:310)

  307 |
  308 | // calculate iris distance
  309 | // iris: array[ center, left, top, right, bottom]
> 310 | const irisSize = (face.annotations.leftEyeIris && face.annotations.rightEyeIris)
  311 |   /* average human iris size is 11.7mm */
  312 |   ? 11.7 * Math.max(Math.abs(face.annotations.leftEyeIris[3][0] - face.annotations.leftEyeIris[1][0]), Math.abs(face.annotations.rightEyeIris[4][1] - face.annotations.rightEyeIris[2][1]))
  313 |   : 0;

  at async (src/human.js:392)

  389 |   resolve({ error: 'could not convert input to tensor' });
  390 |   return;
  391 | }
> 392 | this.perf.image = Math.trunc(now() - timeStamp);
  393 | this.analyze('Get Image:');
  394 |
  395 | // run face detection followed by all models that rely on face bounding box: face mesh, age, gender, emotion

Camera access not supported

Issue Description
Camera access is not supported in the browser; tried in Chrome and Edge.
(screenshot attached)

Steps to Reproduce
Open demo
Expected Behavior
Works as required
Environment

  • Browser or NodeJS and version (e.g. NodeJS 14.15 or Chrome 86) Chrome 86 and Edge 87
  • OS and Hardware platform (e.g. Windows 10, Ubuntu Linux on x64, Android 10) win10-x64
  • Packager (if any) (e.g, webpack, rollup, parcel, esbuild, etc.)

Additional

  • For installation or startup issues include your package.json
  • For usage issues, it is recommended to post your code as gist

Add live demo site

I have added a live demo site at https://human-human.netlify.app
I have also added npm scripts to publish to https://www.netlify.com/ under my account. If you want to add this, you should create an account with yourself as maintainer. It is free hosting and has GitHub integration. If you are interested, I can submit a pull request with the changes. I tried hosting the live demo site on GitHub Pages, but GitHub Pages has a problem with relative links when trying to load the .json files. I could not find a fix for this, so I chose netlify.com to host the demo.

web worker performance issue

Human is compatible with the new web workers, but...

Web workers are finicky:

  • you cannot pass an HTMLImage or HTMLVideo to a web worker, so you need to pass a canvas instead
  • a canvas can execute transferControlToOffscreen() and then become an OffscreenCanvas, which can be passed to a worker, but you cannot transfer a canvas that already has a rendering context (basically, once getContext() has been executed on it)

This means that if we pass the main canvas that results will be rendered on, then all operations on it must happen inside the web worker and we cannot touch it in the main thread at all. Doable, but how do we paint a video frame on it before we pass it?

So instead we create a new OffscreenCanvas, draw the video frame on it and pass its imageData, returning results from the worker; but then there is the overhead of creating it and passing large messages between the main thread and the worker, and it ends up being slower than executing in the main thread (see the sketch below).

Human already executes everything in an async/await manner and avoids synchronous operations as much as possible so it doesn't block the main thread, so I am not sure what the benefit of web workers is (unless the main thread is generally a very busy one)?
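
To make the hand-off described above concrete, a minimal sketch (the message shape mirrors the demo code quoted later in these issues; the worker side is shown as comments and its internals are assumptions):

// Main thread: draw the current video frame, then transfer its pixel buffer to the worker (zero-copy).
function sendFrame(video, worker, width, height) {
  const offscreen = new OffscreenCanvas(width, height);
  const ctx = offscreen.getContext('2d');
  ctx.drawImage(video, 0, 0, width, height);
  const image = ctx.getImageData(0, 0, width, height);
  worker.postMessage({ image: image.data.buffer, width, height }, [image.data.buffer]);
}

// Worker thread (hypothetical): rebuild the ImageData, run detection, post the result back.
// onmessage = async (msg) => {
//   const image = new ImageData(new Uint8ClampedArray(msg.data.image), msg.data.width, msg.data.height);
//   postMessage({ result: await human.detect(image) });
// };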

Error loading WASM backend

Hello, let's continue here.

I get this error message when loading the demo on localhost.

(screenshot attached)

The only thing I did was clone the repo, update demo/index.html to uncomment the 3 lines for wasm/webgpu, and then run npm run dev. Then I select WASM, and when I start the camera it breaks hard.

(screenshot attached)

Have a nice day.

multiple enhancements to body model

I tried to use the MediaPipe API in my project, but unfortunately it doesn't seem to support web workers (a must in my case, since there are some intensive 3D animations and there is little room in the main UI thread for other CPU-intensive tasks). So in the end I tried your @vladmandic human library instead, but I encountered some issues.

  1. I loaded human.js inside a worker (via importScripts), but I have to instantiate the human object as new Human.default() instead of new Human() (I can import human.esm.js as a module, but I want to avoid that in a web worker); see the sketch after this quote.
  2. .warmup() doesn't seem to work, as a worker doesn't have Image.
  3. The accuracy of your PoseNet model is low, obviously lower than the default one from TFJS. Is it possible to change some parameters so that it is on par with the TFJS one, or even have the option to load ResNet instead of MobileNet?

Originally posted by @ButzYung in #47 (comment)
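
A minimal sketch of point 1 above, loading the IIFE bundle inside a worker as the reporter describes (the bundle path and the message shape are assumptions):

// worker.js (hypothetical): load the non-module IIFE build via importScripts.
importScripts('/dist/human.js');    // path is an assumption
const human = new Human.default();  // the IIFE build exposes the class behind .default, as noted above
onmessage = async (msg) => {
  // msg.data is assumed to carry raw pixel data, as in the web worker issue earlier in this list
  const image = new ImageData(new Uint8ClampedArray(msg.data.image), msg.data.width, msg.data.height);
  postMessage({ result: await human.detect(image) });
};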

Demo example is lagging and hand detection for single hand is not working well

Issue Description
The demo example is lagging and hand detection for a single hand is not working well.

Steps to Reproduce
Run the live demo https://vladmandic.github.io/human/demo/index.html
Turn off all modules except HAND POSE
Change MAX OBJECTS to 1

Expected Behavior
The hand detector only detects a hand intermittently and at a distance; it should not have any problem.

Environment
Windows 10
Microsoft Edge 86.0.622.63

Additional
May be related to issue #16

config.js file changes are not reflected in the gui

Tested the demo folder index.html with only hand pose enabled:

  1. Open config.js
  2. Change hand > maxHands to 1
  3. Launch index.html; the GUI still has Max Objects set to 10

I also tested another variable by setting the backend to wasm, but that change was also not reflected in the GUI.

Regarding the coordinates returned by iris detection

Iris detection returns the coordinates of the eye and iris, as well as the area around them, including the eyebrow. However, the coordinates of those areas outside the eye are often inaccurate, resulting in distorted geometry around the eyes in some situations. The coordinates of those areas can be safely discarded in favor of the original ones provided by Facemesh without affecting the functioning of iris detection. This can easily be done by commenting out/removing some lines of MESH_TO_IRIS_INDICES_MAP in the file coords.js. Personally, I would say at least the coordinates for the eyebrow should be removed (since the iris version overwrites the coordinates from Facemesh and distorts the eyebrow detection).

const MESH_TO_IRIS_INDICES_MAP = [ // A mapping from facemesh model keypoints to iris model keypoints.
  { key: 'EyeUpper0', indices: [9, 10, 11, 12, 13, 14, 15] },
  { key: 'EyeUpper1', indices: [25, 26, 27, 28, 29, 30, 31] },
  { key: 'EyeUpper2', indices: [41, 42, 43, 44, 45, 46, 47] },
  { key: 'EyeLower0', indices: [0, 1, 2, 3, 4, 5, 6, 7, 8] },
  { key: 'EyeLower1', indices: [16, 17, 18, 19, 20, 21, 22, 23, 24] },
  { key: 'EyeLower2', indices: [32, 33, 34, 35, 36, 37, 38, 39, 40] },
  { key: 'EyeLower3', indices: [54, 55, 56, 57, 58, 59, 60, 61, 62] },
//  { key: 'EyebrowUpper', indices: [63, 64, 65, 66, 67, 68, 69, 70] },
//  { key: 'EyebrowLower', indices: [48, 49, 50, 51, 52, 53] },
];

WebWorker not working

Issue Description
The worker is not working; it shows "creating worker thread" and then does not progress.

Steps to Reproduce
Just set useWorker to true

Expected Behavior
Works as expected
Environment

  • Browser or NodeJS and version (e.g. NodeJS 14.15 or Chrome 86) Chrome
  • OS and Hardware platform (e.g. Windows 10, Ubuntu Linux on x64, Android 10) Win7-x64
  • Packager (if any) (e.g, webpack, rollup, parcel, esbuild, etc.)

Additional

  • For installation or startup issues include your package.json
  • For usage issues, it is recommended to post your code as gist

Integrate BlazePalm into human

Issue Description
(screenshot attached)
This is what I'm getting.

Steps to Reproduce
Switch on Hand and keep only the detection (not landmarks), then see the console

Expected Behavior
Works as expected
Environment

  • Browser or NodeJS and version (e.g. NodeJS 14.15 or Chrome 86) Chrome
  • OS and Hardware platform (e.g. Windows 10, Ubuntu Linux on x64, Android 10) Win10
  • Packager (if any) (e.g, webpack, rollup, parcel, esbuild, etc.) N/A

Additional

  • For installation or startup issues include your package.json
  • For usage issues, it is recommended to post your code as gist

Editing the demo for just a single model

I've attempted doing this for hand detection, but with no success.
My browser.js code is below:

import Human from '../dist/human.esm.js';
import draw from './draw.js';
import Menu from './menu.js';

const human = new Human();

// ui options
const ui = {
  baseColor: 'rgba(173, 216, 230, 0.3)', // 'lightblue' with light alpha channel
  baseBackground: 'rgba(50, 50, 50, 1)', // 'grey'
  baseLabel: 'rgba(173, 216, 230, 0.9)', // 'lightblue' with dark alpha channel
  baseFontProto: 'small-caps {size} "Segoe UI"',
  baseLineWidth: 12,
  baseLineHeightProto: 2,
  columns: 2,
  busy: false,
  facing: true,
  useWorker: false,
  worker: 'demo/worker.js',	
  samples: ['../assets/sample6.jpg', '../assets/sample1.jpg', '../assets/sample4.jpg', '../assets/sample5.jpg', '../assets/sample3.jpg', '../assets/sample2.jpg'],
  drawBoxes: true,
  drawPoints: false,
  drawPolygons: true,
  fillPolygons: true,
  useDepth: true,
  console: true,
  maxFrames: 10,
  modelsPreload: true,
  modelsWarmup: true,
  menuWidth: 0,
  menuHeight: 0,
  camera: {},
  fps: [],
};

// global variables
let menu;
let menuFX;
let worker;
let timeStamp;

// helper function: translates json to human readable string
function str(...msg) {
  if (!Array.isArray(msg)) return msg;
  let line = '';
  for (const entry of msg) {
    if (typeof entry === 'object') line += JSON.stringify(entry).replace(/{|}|"|\[|\]/g, '').replace(/,/g, ', ');
    else line += entry;
  }
  return line;
}

// helper function: wrapper around console output
const log = (...msg) => {
  // eslint-disable-next-line no-console
  if (ui.console) console.log(...msg);
};

const status = (msg) => {
  // eslint-disable-next-line no-console
  document.getElementById('status').innerText = msg;
};

// draws processed results and starts processing of a next frame
function drawResults(input, result, canvas) {
  // update fps data
  const elapsed = performance.now() - timeStamp;
  ui.fps.push(1000 / elapsed);
  if (ui.fps.length > ui.maxFrames) ui.fps.shift();

  // enable for continuous performance monitoring
  // console.log(result.performance);

  // immediate loop before we even draw results, but limit frame rate to 30
  if (input.srcObject) {
    // eslint-disable-next-line no-use-before-define
    if (elapsed > 33) requestAnimationFrame(() => runHumanDetect(input, canvas));
    // eslint-disable-next-line no-use-before-define
    else setTimeout(() => runHumanDetect(input, canvas), 33 - elapsed);
  }
  // draw fps chart
  menu.updateChart('FPS', ui.fps);
  // draw image from video
  const ctx = canvas.getContext('2d');
  ctx.fillStyle = ui.baseBackground;
  ctx.fillRect(0, 0, canvas.width, canvas.height);
  if (result.canvas) {
    if (result.canvas.width !== canvas.width) canvas.width = result.canvas.width;
    if (result.canvas.height !== canvas.height) canvas.height = result.canvas.height;
    ctx.drawImage(result.canvas, 0, 0, result.canvas.width, result.canvas.height, 0, 0, result.canvas.width, result.canvas.height);
  } else {
    ctx.drawImage(input, 0, 0, input.width, input.height, 0, 0, canvas.width, canvas.height);
  }
  // draw all results
  draw.face(result.face, canvas, ui, human.facemesh.triangulation);
  draw.body(result.body, canvas, ui);
  draw.hand(result.hand, canvas, ui);
  draw.gesture(result.gesture, canvas, ui);
  // update log
  const engine = human.tf.engine();
  const gpu = engine.backendInstance ? `gpu: ${(engine.backendInstance.numBytesInGPU ? engine.backendInstance.numBytesInGPU : 0).toLocaleString()} bytes` : '';
  const memory = `system: ${engine.state.numBytes.toLocaleString()} bytes ${gpu} | tensors: ${engine.state.numTensors.toLocaleString()}`;
  const processing = result.canvas ? `processing: ${result.canvas.width} x ${result.canvas.height}` : '';
  const avg = Math.trunc(10 * ui.fps.reduce((a, b) => a + b) / ui.fps.length) / 10;
  document.getElementById('log').innerText = `
    video: ${ui.camera.name} | facing: ${ui.camera.facing} | resolution: ${ui.camera.width} x ${ui.camera.height} ${processing}
    backend: ${human.tf.getBackend()} | ${memory}
    performance: ${str(result.performance)} FPS:${avg}
  `;
}

// setup webcam
async function setupCamera() {
  if (ui.busy) return null;
  ui.busy = true;
  const video = document.getElementById('video');
  const canvas = document.getElementById('canvas');
  const output = document.getElementById('log');
  const live = video.srcObject ? ((video.srcObject.getVideoTracks()[0].readyState === 'live') && (video.readyState > 2) && (!video.paused)) : false;
  let msg = '';
  status('setting up camera');
  // setup webcam. note that navigator.mediaDevices requires that page is accessed via https
  if (!navigator.mediaDevices) {
    msg = 'camera access not supported';
    output.innerText += `\n${msg}`;
    log(msg);
    status(msg);
    return null;
  }
  let stream;
  const constraints = {
    audio: false,
    video: {
      facingMode: (ui.facing ? 'user' : 'environment'),
      resizeMode: 'none',
      width: { ideal: window.innerWidth },
      height: { ideal: window.innerHeight },
    },
  };
  try {
    // if (window.innerWidth > window.innerHeight) constraints.video.width = { ideal: window.innerWidth };
    // else constraints.video.height = { ideal: window.innerHeight };
    stream = await navigator.mediaDevices.getUserMedia(constraints);
  } catch (err) {
    if (err.name === 'PermissionDeniedError') msg = 'camera permission denied';
    else if (err.name === 'SourceUnavailableError') msg = 'camera not available';
    else msg = 'camera error';
    output.innerText += `\n${msg}`;
    status(msg);
    log(err);
  }
  if (stream) video.srcObject = stream;
  else return null;
  const track = stream.getVideoTracks()[0];
  const settings = track.getSettings();
  log('camera constraints:', constraints, 'window:', { width: window.innerWidth, height: window.innerHeight }, 'settings:', settings, 'track:', track);
  ui.camera = { name: track.label, width: settings.width, height: settings.height, facing: settings.facingMode === 'user' ? 'front' : 'back' };
  return new Promise((resolve) => {
    video.onloadeddata = async () => {
      video.width = video.videoWidth;
      video.height = video.videoHeight;
      canvas.width = video.width;
      canvas.height = video.height;
      canvas.style.width = canvas.width > canvas.height ? '100vw' : '';
      canvas.style.height = canvas.width > canvas.height ? '' : '100vh';
      //ui.menuWidth.input.setAttribute('value', video.width);
      //ui.menuHeight.input.setAttribute('value', video.height);
      // silly font resizing for paint-on-canvas since viewport can be zoomed
      const size = 14 + (6 * canvas.width / window.innerWidth);
      //ui.baseFont = ui.baseFontProto.replace(/{size}/, `${size}px`);
      if (live) video.play();
      ui.busy = false;
      // do once more because onresize events can be delayed or skipped
      // if (video.width > window.innerWidth) await setupCamera();
      status('');
      resolve(video);
    };
  });
}

// wrapper for worker.postmessage that creates worker if one does not exist
function webWorker(input, image, canvas) {
  if (!worker) {
    // create new webworker and add event handler only once
    log('creating worker thread');
    worker = new Worker(ui.worker, { type: 'module' });
    worker.warned = false;
    // after receiving message from webworker, parse&draw results and send new frame for processing
    worker.addEventListener('message', (msg) => {
      if (!worker.warned) {
        log('warning: cannot transfer canvas from worker thread');
        log('warning: image will not show filter effects');
        worker.warned = true;
      }
      drawResults(input, msg.data.result, canvas);
    });
  }
  // pass image data as arraybuffer to worker by reference to avoid copy
  worker.postMessage({ image: image.data.buffer, width: canvas.width, height: canvas.height }, [image.data.buffer]);
}

// main processing function when input is webcam, can use direct invocation or web worker
function runHumanDetect(input, canvas) {
  timeStamp = performance.now();
  // if live video
  const live = input.srcObject && (input.srcObject.getVideoTracks()[0].readyState === 'live') && (input.readyState > 2) && (!input.paused);
  if (!live && input.srcObject) {
    // if we want to continue and camera not ready, retry in 0.5sec, else just give up
    if ((input.srcObject.getVideoTracks()[0].readyState === 'live') && (input.readyState <= 2)) setTimeout(() => runHumanDetect(input, canvas), 500);
    else log(`camera not ready: track state: ${input.srcObject?.getVideoTracks()[0].readyState} stream state: ${input.readyState}`);
    return;
  }
  status('');
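  // note: the web worker code path in the branch below is disabled because the condition is hard-coded to false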
  if (false) {
    // get image data from video as we cannot send html objects to webworker
    const offscreen = new OffscreenCanvas(canvas.width, canvas.height);
    const ctx = offscreen.getContext('2d');
    ctx.drawImage(input, 0, 0, input.width, input.height, 0, 0, canvas.width, canvas.height);
    const data = ctx.getImageData(0, 0, canvas.width, canvas.height);
    // perform detection in worker
    webWorker(input, data, canvas);
  } else {
	 console.log("Running Detect");
    human.detect(input).then((result) => {
      if (result.error) log(result.error);
      else drawResults(input, result, canvas);
      if (human.config.profile) log('profile data:', human.profile());
    });
  }
}

// main processing function when input is image, can use direct invocation or web worker
async function processImage(input) {
  timeStamp = performance.now();
  return new Promise((resolve) => {
    const image = new Image();
    image.onload = async () => {
      log('Processing image:', image.src);
      const canvas = document.getElementById('canvas');
      image.width = image.naturalWidth;
      image.height = image.naturalHeight;
      canvas.width = human.config.filter.width && human.config.filter.width > 0 ? human.config.filter.width : image.naturalWidth;
      canvas.height = human.config.filter.height && human.config.filter.height > 0 ? human.config.filter.height : image.naturalHeight;
      const result = await human.detect(image);
      drawResults(image, result, canvas);
      const thumb = document.createElement('canvas');
      thumb.className = 'thumbnail';
      thumb.width = window.innerWidth / (ui.columns + 0.1);
      thumb.height = canvas.height / (window.innerWidth / thumb.width);
      const ctx = thumb.getContext('2d');
      ctx.drawImage(canvas, 0, 0, canvas.width, canvas.height, 0, 0, thumb.width, thumb.height);
      document.getElementById('samples-container').appendChild(thumb);
      image.src = '';
      resolve(true);
    };
    image.src = input;
  });
}

// just initialize everything and call main function
async function detectVideo() {
  human.config.videoOptimized = true;
  document.getElementById('samples-container').style.display = 'none';
  document.getElementById('canvas').style.display = 'block';
  const video = document.getElementById('video');
  const canvas = document.getElementById('canvas');
  ui.baseLineHeight = ui.baseLineHeightProto;
  if ((video.srcObject !== null) && !video.paused) {
    document.getElementById('play').style.display = 'block';
    status('paused');
    video.pause();
  } else {
    await setupCamera();
    document.getElementById('play').style.display = 'none';
    status('');
    video.play();
  }
  runHumanDetect(video, canvas);
}

// just initialize everything and call main function
async function detectSampleImages() {
  document.getElementById('play').style.display = 'none';
  human.config.videoOptimized = false;
  const size = 12 + Math.trunc(12 * ui.columns * window.innerWidth / document.body.clientWidth);
  ui.baseFont = ui.baseFontProto.replace(/{size}/, `${size}px`);
  ui.baseLineHeight = ui.baseLineHeightProto * ui.columns;
  document.getElementById('canvas').style.display = 'none';
  document.getElementById('samples-container').style.display = 'block';
  log('Running detection of sample images');
  status('processing images');
  document.getElementById('samples-container').innerHTML = '';
  for (const sample of ui.samples) await processImage(sample);
  status('');
}

function setupMenu() {
  human.config.hand.enabled = true;
  human.config.hand.landmarks = false;
  human.config.hand.maxHands = 1;
  human.config.minimumConfidence = 0.9;
  human.config.hand.detector.modelPath = "https://tfhub.dev/mediapipe/tfjs-model/handdetector/1/default/1";
  human.config.face.enabled = false;
  human.config.body.enabled = false;
  human.config.gesture.enabled = false;
  console.log("SETUP SUCCESSFUL YAY");
  console.log(human);
}

async function main() {
  log('Human: demo starting ...');
  setupMenu();
  document.getElementById('log').innerText = `Human: version ${human.version} TensorFlow/JS: version ${human.tf.version_core}`;
  // this is not required, just pre-warms the library
  await human.load();
  console.log("human.load");
  if (ui.modelsWarmup) {
    status('initializing');
    const warmup = new ImageData(50, 50);
    await human.detect(warmup);
  }
  status('human: ready');
  document.getElementById('loader').style.display = 'none';
  document.getElementById('play').style.display = 'block';
}

window.onload = main;
window.onresize = setupCamera;

Firefox webgl and performance

I am getting this error on Firefox with my project:

Error when getting WebGL context: Error: Cannot create a canvas in this context 2 bundle.worker.js:40131:13
Initialization of backend webgl failed bundle.worker.js:53628:15
MathBackendWebGL@http://localhost:3000/static/js/bundle.worker.js:15334:13
../audio-ml-plugin/node_modules/@tensorflow/tfjs-backend-webgl/dist/base.js/<@http://localhost:3000/static/js/bundle.worker.js:16354:96

In the Human demo, Firefox runs at half the FPS of Chrome.

Any ideas?

Thanks
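
As a hedged idea only (not an official workaround from the library): inside the worker, check whether a WebGL context can actually be created on an OffscreenCanvas and fall back to the WASM backend otherwise.

import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-backend-wasm'; // registers the wasm backend

// Hypothetical backend selection inside a worker: prefer WebGL, fall back to WASM when no context is available.
async function pickBackend() {
  let webgl = false;
  try {
    webgl = typeof OffscreenCanvas !== 'undefined' && !!new OffscreenCanvas(1, 1).getContext('webgl');
  } catch { /* some browsers throw instead of returning null */ }
  await tf.setBackend(webgl ? 'webgl' : 'wasm');
  await tf.ready();
  return tf.getBackend();
}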

Facemesh, Handpose, and normalized mesh

Issue Description

The Human version of Facemesh returns only the normalized mesh data, unlike the TFJS version which returns both .scaledMesh (normalized, equivalent to the Human version of mesh) and .mesh, the raw 3D coordinates, which is what I need for 3D calculations. Please consider returning both versions of the mesh in a future version of Human.

A similar issue occurs for Handpose. Both the Human and TFJS versions of Handpose return only the normalized mesh data. In the case of TFJS, I use a trick and modify Handpose to return a normalized z value as well (i.e. the 3D aspect ratio is more accurate when all of x, y and z are normalized instead of just x and y), something like the following (not sure if this is the right way to get a normalized z, though).

https://github.com/vladmandic/human/blob/main/src/hand/handpipeline.js

    return coordsRotated.map((coord) => [
      coord[0] + originalBoxCenter[0],
      coord[1] + originalBoxCenter[1],
      coord[2] * scaleFactor[0], // normalized it by multiplying with the same x scaleFactor
    ]);

I wonder if something can be done for the Human version of Handpose.

How to prevent loading of models at startup

I have edited the demo and commented out the lines below:

/*if (ui.modelsPreload) {
    status("loading");
    await human.load(human.config);
  }*/
  if (ui.modelsWarmup) {
    status("initializing");
    const warmup = new ImageData(50, 50);
    await human.detect(warmup,human.config);
  }
  status("human: ready");
  document.getElementById("loader").style.display = "none";
  document.getElementById("play").style.display = "block"

It still shows the screen below:
(screenshot attached)

WASM not working

Issue Description
WASM doesn't work in the demo

Steps to Reproduce
Simply go to the demo site and set the backend to wasm
(screenshot attached)

Expected Behavior
The backend is set properly

Environment

  • Browser or NodeJS and version (e.g. NodeJS 14.15 or Chrome 86) Chrome, Edge
  • OS and Hardware platform (e.g. Windows 10, Ubuntu Linux on x64, Android 10) Win10
  • Packager (if any) (e.g, webpack, rollup, parcel, esbuild, etc.)

Additional

  • For installation or startup issues include your package.json
  • For usage issues, it is recommended to post your code as gist
