what should I set ldim to? (pico, OPEN)

nenadmarkus commented on July 19, 2024
what should I set ldim to?

from pico.

Comments (4)

andykais commented on July 19, 2024

full code for reference:

import pico from './pico.js'
import { decode } from 'https://deno.land/x/[email protected]/mod.ts'

const facefinder_bytes = await Deno.readFile('./examples/facefinder')
const facefinder_classify_region = pico.unpack_cascade(facefinder_bytes)

/**
 * a function to transform an RGBA image to grayscale
 */
function rgba_to_grayscale(rgba, nrows, ncols) {
  var gray = new Uint8Array(nrows * ncols)
  for (var r = 0; r < nrows; ++r)
    for (var c = 0; c < ncols; ++c)
      // gray = 0.2*red + 0.7*green + 0.1*blue
      gray[r * ncols + c] =
        (2 * rgba[r * 4 * ncols + 4 * c + 0] +
          7 * rgba[r * 4 * ncols + 4 * c + 1] +
          1 * rgba[r * 4 * ncols + 4 * c + 2]) /
        10
  return gray
}

async function find_face(image_filepath: string, stride?: number) {
  const raw = await Deno.readFile(image_filepath)
  const image_data = decode(raw)
  // convert to grayscale once and reuse the buffer
  // NOTE: this assumes image_data is directly indexable as RGBA bytes; if the decoder
  // returns the bytes under a property, that property should be passed instead
  const grey_image_data = rgba_to_grayscale(image_data, image_data.height, image_data.width)
  const image = {
    pixels: grey_image_data,
    nrows: image_data.height,
    ncols: image_data.width,
    // ldim is the row stride of pixels; for this tightly packed buffer it equals image_data.width
    ldim: stride ?? image_data.width
  }
  const params = {
    shiftfactor: 0.1, // move the detection window by 10% of its size
    minsize: 20, // minimum size of a face (not suitable for real-time detection, set it to 100 in that case)
    maxsize: 1000, // maximum size of a face
    scalefactor: 1.1 // for multiscale processing: resize the detection window by 10% when moving to the higher scale
  }

  // run the cascade over the image
  // detections is an array that contains (r, c, s, q) quadruplets
  // (representing row, column, scale and detection score)
  let detections = pico.run_cascade(image, facefinder_classify_region, params)
  // cluster the obtained detections
  detections = pico.cluster_detections(detections, 0.2) // set IoU threshold to 0.2
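  // keep only detections whose score exceeds the threshold
  const qthresh = 5.0 // this constant is empirical: other cascades might require a different one
  detections = detections.filter(([, , , q]) => q > qthresh)

  // this just draws the rectangles to help visualize. It is not relevant to the face tracking
  // each detection is (r, c, s, q): turn the center (c, r) and size s into the two corners that -draw expects
  const rectangles = detections
    .map(([r, c, s]) => `rectangle ${c - s / 2},${r - s / 2} ${c + s / 2},${r + s / 2}`)
    .join(' ')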
  const proc = Deno.run({
    cmd: [
      'convert',
      image_filepath,
      '-fill',
      'none',
      '-stroke',
      'red',
      '-draw',
      rectangles,
      'preview.jpg'
    ]
  })
  const result = await proc.status()
  if (!result.success) Deno.exit(1)
  console.log(image_filepath, `(${image_data.width}x${image_data.height})`, 'found', detections.length, 'faces with ldim', image.ldim)
  return detections
}

// 419 feels extremely arbitrary
await find_face('./samples/6627147.jpeg', 400)
await find_face('./samples/6627147.jpeg', 419)
await find_face('./samples/6627147.jpeg', 420)
await find_face('./samples/6627147.jpeg', 480)

This code is written for Deno. The imported pico object is the same one provided by https://github.com/nenadmarkus/picojs


nenadmarkus commented on July 19, 2024

The pixel intensity values of the image are accessed as pixels[r*ldim + c], so ldim is the number of array elements between the starts of two consecutive rows. In your case ldim should be set to image_data.width.
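To make the indexing rule concrete, here is a minimal sketch of how `pixels[r*ldim + c]` behaves for a tightly packed buffer, for a padded one, and for a mismatched `ldim`. The `pixelAt` helper and the tiny sample images are hypothetical and not part of pico.

```typescript
// Hypothetical helper, not part of pico: reads one pixel using the
// indexing rule quoted above, pixels[r * ldim + c].
function pixelAt(pixels: Uint8Array, ldim: number, r: number, c: number): number {
  return pixels[r * ldim + c];
}

// A 2-row, 3-column grayscale image stored tightly: ldim equals ncols.
const ncols = 3;
const tight = new Uint8Array([
  10, 20, 30,
  40, 50, 60,
]);

// The same image with one padding byte per row: ldim is 4, ncols is still 3.
const padded = new Uint8Array([
  10, 20, 30, 0,
  40, 50, 60, 0,
]);

const fromTight = pixelAt(tight, ncols, 1, 0); // row 1, column 0 of the tight buffer
const fromPadded = pixelAt(padded, 4, 1, 0);   // same pixel, read with the padded stride
const skewed = pixelAt(tight, 4, 1, 0);        // wrong stride: lands on a neighbouring pixel

console.log(fromTight, fromPadded, skewed);
```

With a stride that does not match the buffer layout, every row after the first is read from a shifted position, so the detector sees a sheared image. That would explain why values such as 400, 419 or 480 give erratic results for a 420-pixel-wide buffer.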

In some cases, pico just fails to detect the face. This is the case you have.

Understand that pico should mainly be used with a video stream where detections get "averaged" over multiple frames and this leads to significantly better detection capabilities.
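One way to picture the "averaging" over frames is a small memory that pools the raw detections from the last few frames before they are clustered. The sketch below is an independent illustration with hypothetical names (`makeDetectionMemory`, `remember`) and made-up detection values; picojs appears to ship a comparable helper, `instantiate_detection_memory`, but that is worth confirming against its README before relying on it.

```typescript
// (r, c, s, q): row, column, scale and detection score, as in the code comments above.
type Detection = [number, number, number, number];

// Hypothetical helper: remembers the detections of the last `size` frames
// and returns them pooled into one array.
function makeDetectionMemory(size: number): (dets: Detection[]) => Detection[] {
  const frames: Detection[][] = Array.from({ length: size }, () => []);
  let next = 0;
  return (dets: Detection[]): Detection[] => {
    frames[next] = dets; // overwrite the oldest slot
    next = (next + 1) % size;
    // concat flattens one level, so each (r, c, s, q) tuple stays intact
    return ([] as Detection[]).concat(...frames);
  };
}

const remember = makeDetectionMemory(3);
const pooledAfterFrame1 = remember([[100, 120, 80, 3.0]]);
const pooledAfterFrame2 = remember([]); // the detector missed the face in this frame
const pooledAfterFrame3 = remember([[102, 121, 82, 3.5]]);

console.log(pooledAfterFrame1.length, pooledAfterFrame2.length, pooledAfterFrame3.length);
```

The pooled array would then be passed to `pico.cluster_detections` in place of the single-frame detections, so a face that is missed in one frame can still be carried by its neighbours.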


andykais commented on July 19, 2024

Ah, thanks for the explanation. Is there a chance that fiddling with any of the other defaults would help?

In the final product I will in fact be using this with frames extracted from a video, to zoom and pan to the face, so hopefully this will be less of an issue.


nenadmarkus commented on July 19, 2024

You can try reducing shiftfactor and/or scalefactor. This should improve the detection rate at the cost of speed, i.e., there's a trade-off.
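The trade-off can be estimated by counting how many windows a scan would visit. The sketch below assumes a conventional sliding-window loop (the window size grows geometrically by `scalefactor`, and the window moves in steps of `shiftfactor` times its size, at least one pixel). `countWindows` and the 480x640 image size are hypothetical, and pico's actual loop may differ in detail.

```typescript
interface ScanParams {
  shiftfactor: number;
  minsize: number;
  maxsize: number;
  scalefactor: number;
}

// Hypothetical estimate of the number of windows a multiscale scan evaluates.
function countWindows(nrows: number, ncols: number, p: ScanParams): number {
  if (p.scalefactor <= 1) throw new Error("scalefactor must be greater than 1");
  let total = 0;
  for (let size = p.minsize; size <= p.maxsize; size *= p.scalefactor) {
    const step = Math.max(Math.floor(p.shiftfactor * size), 1);
    const half = Math.floor(size / 2) + 1; // keep the whole window inside the image
    const rows = Math.floor((nrows - 2 * half) / step) + 1;
    const cols = Math.floor((ncols - 2 * half) / step) + 1;
    if (rows > 0 && cols > 0) total += rows * cols;
  }
  return total;
}

// Defaults from the thread versus a finer scan, on an assumed 480x640 image.
const baseline = countWindows(480, 640, { shiftfactor: 0.1, minsize: 20, maxsize: 1000, scalefactor: 1.1 });
const finer = countWindows(480, 640, { shiftfactor: 0.05, minsize: 20, maxsize: 1000, scalefactor: 1.05 });

console.log({ baseline, finer, ratio: finer / baseline });
```

Halving `shiftfactor` roughly quadruples the positions tested at each scale, and a smaller `scalefactor` adds more scales, so the run time grows accordingly while the chance of landing a window on the face improves.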
