Comments (7)

exitudio avatar exitudio commented on September 4, 2024

I assume you are trying to compare with the appearance-based methods, right? The model-based (skeleton-based) methods use much smaller models because they take skeleton data as input instead of raw images. However, the heavy part is pose estimation.

  • HRNet w32 has 41.230M parameters (from here)
  • GaitMixer has 166,304 parameters. We already print the parameter count here.
    We didn't measure FLOPs.

from gaitmixer.

puyiwen avatar puyiwen commented on September 4, 2024

Thank you for your answer. I'm comparing appearance-based and skeleton-based models to see which is smaller. I also found a smaller HRNet variant for pose estimation, called Lite-HRNet. Why did you use HRNet w32 instead of Lite-HRNet? Is it because Lite-HRNet's pose estimation is not as good as HRNet w32's?

exitudio avatar exitudio commented on September 4, 2024

We didn't explore pose estimators much. We experimented with SimCC, but it gave no clear improvement, so we just followed GaitGraph, which uses HRNet.

puyiwen avatar puyiwen commented on September 4, 2024

Thank you for your quick reply! Sorry to bother you again. I want to calculate the GFLOPs of the model first, so what are the input dimensions of the model?

exitudio avatar exitudio commented on September 4, 2024

The input has 4 dimensions: [batch size, # of frames, # of joints, dim]

batch size = 64
# of frames = 60
# of joints = 17
dim = 2 (x, y)

In GaitGraph, dim = 3 because they add the confidence score from the pose estimator. We experimented with that, but it gave no improvement, so we dropped it.
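
For anyone setting up a FLOPs measurement, here is a minimal sketch of the input layout described above, using only the Python standard library. The shape values come from this thread; the helper `drop_confidence` and the all-zero dummy sequence are hypothetical and only illustrate going from a GaitGraph-style (x, y, confidence) layout to the (x, y) layout. It is not code from the GaitMixer repository.

```python
from math import prod

# Shape reported in this thread: [batch size, # of frames, # of joints, dim]
BATCH_SIZE, NUM_FRAMES, NUM_JOINTS, DIM = 64, 60, 17, 2
input_shape = (BATCH_SIZE, NUM_FRAMES, NUM_JOINTS, DIM)


def drop_confidence(sequence):
    """Keep only (x, y) from per-joint (x, y, confidence) triples.

    Hypothetical helper: `sequence` is a list of frames, and each frame
    is a list of joints given as (x, y, confidence).
    """
    return [[joint[:2] for joint in frame] for frame in sequence]


# Dummy single sequence in a GaitGraph-style layout (dim = 3).
sequence_xyc = [
    [(0.0, 0.0, 1.0) for _ in range(NUM_JOINTS)]
    for _ in range(NUM_FRAMES)
]
sequence_xy = drop_confidence(sequence_xyc)

# Number of scalar values the model sees per sample and per batch.
values_per_sample = prod(input_shape[1:])  # frames * joints * dim
values_per_batch = prod(input_shape)       # batch * frames * joints * dim

print("input shape:", input_shape)
print("values per sample:", values_per_sample)
print("values per batch:", values_per_batch)
```

FLOPs are commonly reported for a single sample, so a dummy input of shape [1, 60, 17, 2] is probably the more comparable choice than the training batch size of 64.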

puyiwen avatar puyiwen commented on September 4, 2024

Thank you very much! I calculated the GFLOPs and parameter count of the model and found that GaitMixer is very small. Are other skeleton-based methods as small as GaitMixer? Also, I've always been confused about why skeleton-based methods perform worse than appearance-based methods. It stands to reason that skeleton data is less noisy than appearance data. Can you explain? Thank you very much again.

exitudio avatar exitudio commented on September 4, 2024

Yeah, the other skeleton-based methods should be similar. GaitGraph needs even less computation because it uses graph convolution in the spatial dimension, while GaitMixer uses self-attention. But to be fair, we should include pose estimation when calculating the end-to-end cost.
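
As a rough illustration of that end-to-end point, the sketch below adds up the two parameter counts quoted earlier in this thread (41.230M for HRNet w32 and 166,304 for GaitMixer). It covers parameters only, since FLOPs were not measured, and the function name `end_to_end_summary` is made up for this example.

```python
# Parameter counts quoted earlier in this thread.
HRNET_W32_PARAMS = 41_230_000  # pose estimator (41.230M)
GAITMIXER_PARAMS = 166_304     # gait recognition model


def end_to_end_summary(pose_params, gait_params):
    """Summarize how a pose estimator and a gait model share the total size."""
    total = pose_params + gait_params
    return {
        "total_params": total,
        "gait_share": gait_params / total,
        "pose_to_gait_ratio": pose_params / gait_params,
    }


summary = end_to_end_summary(HRNET_W32_PARAMS, GAITMIXER_PARAMS)
print(f"total parameters: {summary['total_params']:,}")
print(f"gait model share of total: {summary['gait_share']:.2%}")
print(f"pose estimator is about {summary['pose_to_gait_ratio']:.0f}x larger")
```

With these numbers the gait model is well under 1% of the combined parameter count, which is why the choice of pose estimator dominates any end-to-end size comparison.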

appearance-based vs skeleton-based
Theoretically, appearance-based methods are not pure gait recognition because they see other visual cues such as hair, face, and clothes. As you can see, the accuracy of appearance-based methods drops a lot under the different-clothing condition (CL). I think skeleton-based methods are more practical and robust.
