
SmoothNet: A Plug-and-Play Network for Refining Human Poses in Videos (ECCV 2022)

This repo is the official implementation of "SmoothNet: A Plug-and-Play Network for Refining Human Poses in Videos". [Paper] [Project]

Update

  • SmoothNet is now supported in MMPose (release v0.25.0) and in MMHuman3D as a smoothing strategy!

  • A cleaned-up version of the code has been released!

  • To make SmoothNet a better near-online smoothing strategy, we have reduced the default window size from 64 to 32 frames!

  • We also provide pretrained models with window sizes of 8, 16, 32, and 64 frames here.

It currently includes code, data, logs, and models for the following tasks:

  • 2D human pose estimation
  • 3D human pose estimation
  • Body recovery via an SMPL model

Description

When analyzing human motion videos, the jitter in the outputs of existing pose estimators is highly unbalanced, with estimation errors that vary across frames. Most frames in a video are relatively easy to estimate and suffer only slight jitter. In contrast, for rarely seen or occluded actions, the estimated positions of multiple joints deviate largely from the ground truth over consecutive sequences of frames, producing significant jitter.

To tackle this problem, we propose attaching a dedicated temporal-only refinement network, named SmoothNet, to existing pose estimators for jitter mitigation. Unlike existing learning-based solutions that employ spatio-temporal models to co-optimize per-frame precision and temporal smoothness across all joints, SmoothNet models the natural smoothness of body movements by learning the long-range temporal relations of every joint, without considering the noisy correlations among joints. With a simple yet effective motion-aware fully-connected network, SmoothNet significantly improves the temporal smoothness of existing pose estimators and, as a side effect, enhances the estimation accuracy on challenging frames. Moreover, as a temporal-only model, SmoothNet has a unique advantage: strong transferability across various types of estimators and datasets. Comprehensive experiments on five datasets with eleven popular backbone networks, covering 2D and 3D pose estimation and body recovery, demonstrate the efficacy of the proposed solution. Code and datasets are provided in this repository.
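
For intuition, the following is a minimal, illustrative PyTorch sketch of such a temporal-only design. It is not the official architecture: the layer sizes, block count, and activation are assumptions, and the released model additionally includes motion-aware (velocity/acceleration) terms.

```python
import torch
import torch.nn as nn

class ResidualFCBlock(nn.Module):
    """One residual fully-connected block acting purely along time."""
    def __init__(self, hidden, dropout=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden, hidden),
            nn.LeakyReLU(0.1),
            nn.Dropout(dropout),
        )

    def forward(self, x):
        return x + self.net(x)

class TemporalOnlySmoother(nn.Module):
    """Sketch: every channel (one coordinate of one joint) is filtered by
    the same fully-connected network along the time axis, so no spatial
    correlations among joints are modeled."""
    def __init__(self, window_size=32, hidden=512, num_blocks=3):
        super().__init__()
        self.encode = nn.Linear(window_size, hidden)
        self.blocks = nn.Sequential(*[ResidualFCBlock(hidden) for _ in range(num_blocks)])
        self.decode = nn.Linear(hidden, window_size)

    def forward(self, x):
        # x: (batch, T, C) noisy poses, C = num_joints * dims_per_joint
        x = x.permute(0, 2, 1)      # (batch, C, T): Linear layers mix time, not joints
        y = self.decode(self.blocks(self.encode(x)))
        return y.permute(0, 2, 1)   # (batch, T, C) smoothed poses

smoother = TemporalOnlySmoother(window_size=32)
smoothed = smoother(torch.randn(4, 32, 17 * 3))  # e.g. 17 joints x 3D coords
```

Because the same temporal filter is shared across channels, the channel count C is free to change at test time, which is what makes the design estimator- and representation-agnostic.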

Results

SmoothNet is a plug-and-play post-processing network that smooths the outputs of any existing pose estimator. To generalize well across datasets, backbones, and modalities with lower MPJPE and PA-MPJPE, we provide THREE pretrained models (trained on AIST-VIBE-3D, 3DPW-SPIN-3D, and H36M-FCN-3D) to cover all of these scenarios.

Please refer to our supplementary materials for the detailed cross-model validation. Note that all models achieve similarly low Accel values compared with the backbone estimators; the differences lie in MPJPE and PA-MPJPE.

Because SmoothNet is a temporal-only network with no spatial modeling, it is trained on 3D position representations only, yet it can be tested on 2D, 3D, and 6D representations.

3D Keypoint Results

| Dataset | Estimator | MPJPE (Input/Output) ↓ | Accel (Input/Output) ↓ | Pretrained model |
| --- | --- | --- | --- | --- |
| AIST++ | SPIN | 107.17/95.21 | 33.19/4.17 | checkpoint / config |
| AIST++ | TCMR* | 106.72/105.51 | 6.4/4.24 | checkpoint / config |
| AIST++ | VIBE* | 106.90/97.47 | 31.64/4.15 | checkpoint / config |
| Human3.6M | FCN | 54.55/52.72 | 19.17/1.03 | checkpoint / config |
| Human3.6M | RLE | 48.87/48.27 | 7.75/0.90 | checkpoint / config |
| Human3.6M | TCMR* | 73.57/73.89 | 3.77/2.79 | checkpoint / config |
| Human3.6M | VIBE* | 78.10/77.23 | 15.81/2.86 | checkpoint / config |
| Human3.6M | VideoPose (T=27)* | 50.13/50.04 | 3.53/0.88 | checkpoint / config |
| Human3.6M | VideoPose (T=81)* | 48.97/48.89 | 3.06/0.87 | checkpoint / config |
| Human3.6M | VideoPose (T=243)* | 48.11/48.05 | 2.82/0.87 | checkpoint / config |
| MPI-INF-3DHP | SPIN | 100.74/92.89 | 28.54/6.54 | checkpoint / config |
| MPI-INF-3DHP | TCMR* | 92.83/88.93 | 7.92/6.49 | checkpoint / config |
| MPI-INF-3DHP | VIBE* | 92.39/87.57 | 22.37/6.5 | checkpoint / config |
| MuPoTS | TposeNet* | 103.33/100.78 | 12.7/7.23 | checkpoint / config |
| MuPoTS | TposeNet+RefineNet* | 93.97/91.78 | 9.53/7.21 | checkpoint / config |
| 3DPW | EFT | 90.32/88.40 | 32.71/6.07 | checkpoint / config |
| 3DPW | EFT | 90.32/86.39 | 32.71/6.30 | checkpoint / config (additional training) |
| 3DPW | PARE | 78.91/78.11 | 25.64/5.91 | checkpoint / config |
| 3DPW | SPIN | 96.85/95.84 | 34.55/6.17 | checkpoint / config |
| 3DPW | TCMR* | 86.46/86.48 | 6.76/5.95 | checkpoint / config |
| 3DPW | VIBE* | 82.97/81.49 | 23.16/5.98 | checkpoint / config |

2D Keypoint Results

| Dataset | Estimator | MPJPE (Input/Output) ↓ | Accel (Input/Output) ↓ | Pretrained model |
| --- | --- | --- | --- | --- |
| Human3.6M | CPN | 6.67/6.45 | 2.91/0.14 | checkpoint / config |
| Human3.6M | Hourglass | 9.42/9.25 | 1.54/0.15 | checkpoint / config |
| Human3.6M | HRNet | 4.59/4.54 | 1.01/0.13 | checkpoint / config |
| Human3.6M | RLE | 5.14/5.11 | 0.9/0.13 | checkpoint / config |

SMPL Results

| Dataset | Estimator | MPJPE (Input/Output) ↓ | Accel (Input/Output) ↓ | Pretrained model |
| --- | --- | --- | --- | --- |
| AIST++ | SPIN | 107.72/103.00 | 33.21/5.72 | checkpoint / config |
| AIST++ | TCMR* | 106.95/106.39 | 6.47/4.68 | checkpoint / config |
| AIST++ | VIBE* | 107.41/102.06 | 31.65/5.95 | checkpoint / config |
| 3DPW | EFT | 91.60/89.57 | 33.38/7.89 | checkpoint / config |
| 3DPW | PARE | 79.93/78.68 | 26.45/6.31 | checkpoint / config |
| 3DPW | SPIN | 99.28/97.81 | 34.95/7.40 | checkpoint / config |
| 3DPW | TCMR* | 88.46/88.37 | 7.12/6.52 | checkpoint / config |
| 3DPW | VIBE* | 84.27/83.14 | 23.59/7.24 | checkpoint / config |

  • * indicates that the pose estimator itself uses temporal information.
  • For best performance, combine a SOTA single-frame estimator (e.g., PARE) with SmoothNet.
  • Since TCMR uses a sliding-window average that already over-smooths the poses, it is hard for SmoothNet to further decrease its MPJPE and PA-MPJPE.
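
To make the plug-and-play usage concrete, here is a hedged sketch. `run_estimator` stands in for any per-frame backbone (e.g., PARE) and `smoother` for a trained SmoothNet-style module; both are hypothetical callables, not APIs from this repository. For brevity it processes non-overlapping windows, whereas the actual pipeline slides overlapping windows and merges them.

```python
import numpy as np
import torch

def smooth_video_poses(frames, run_estimator, smoother, window=32):
    """Plug-and-play post-processing sketch: estimate poses per frame,
    then refine the stacked sequence window by window."""
    poses = np.stack([run_estimator(f) for f in frames]).astype(np.float32)  # (T, C)
    out = poses.copy()
    for start in range(0, len(poses) - window + 1, window):  # non-overlapping for brevity
        clip = torch.from_numpy(poses[start:start + window])[None]          # (1, W, C)
        with torch.no_grad():
            out[start:start + window] = smoother(clip)[0].numpy()
    return out
```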

Getting Started

Environment Requirement

SmoothNet has been implemented and tested on PyTorch 1.10.1 with Python >= 3.6. It supports both GPU and CPU inference.

Clone the repo:

git clone https://github.com/cure-lab/SmoothNet.git

We recommend you prepare the environment using conda:

# conda
source scripts/install_conda.sh
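
Alternatively, installing the dependencies with pip from the repository's requirements file should also work (an assumption based on the requirements.txt shipped in the repo, not an officially documented path):

# pip (assumed alternative)
pip install -r requirements.txt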

Prepare Data

All the data used in our experiments can be downloaded from:

  • Google Drive

  • Baidu Netdisk

The structure of the repository should look like this:

|-- configs
    |-- aist_vibe_3D.yaml
    |-- ...
|-- data
    |-- checkpoints         # pretrained checkpoints
    |-- poses               # cleaned detected poses and groundtruth poses
    |-- smpl                # SMPL parameters
|-- lib
    |-- core
        |-- ...
    |-- dataset
        |-- ...
    |-- models
        |-- ...
    |-- utils
        |-- ...
|-- results                 # folders including log files, checkpoints, running config and tensorboard logs
|-- scripts
    |-- install_conda.sh
|-- eval_smoothnet.py       # SmoothNet evaluation
|-- train_smoothnet.py      # SmoothNet training
|-- README.md
|-- LICENSE
|-- requirements.txt

If you want to add your own dataset, please follow these steps (note that this is also how the provided data is organized):

  1. Organize your data according to the body representation. The file structure is as follows:

    |-- [your dataset]_[estimator]_[2D/3D/smpl]
        |-- detected
            |-- [your dataset]_[estimator]_[2D/3D/smpl]_test.npz
            |-- [your dataset]_[estimator]_[2D/3D/smpl]_train.npz
        |-- groundtruth
            |-- [your dataset]_gt_[2D/3D/smpl]_test.npz
            |-- [your dataset]_gt_[2D/3D/smpl]_train.npz
    

    It is fine if you only have training or testing data. The content of each .npz file consists of "imgname" and the human poses, whose format depends on the body representation you use (a minimal sketch of writing such a file appears after this list):

    • 3D keypoints:

      • imgname: strings containing the sequence and image name in the format [sequence_name]/[image_name] (the empty string "" if sequence_name or image_name is unavailable).
      • keypoints_3d: 3D joint positions. Each sequence has shape sequence_length * (num_keypoints * 3), in the same order as imgname.
    • 2D keypoints:

      • imgname: same format as above.
      • keypoints_2d: 2D joint positions. Each sequence has shape sequence_length * (num_keypoints * 2), in the same order as imgname.
    • SMPL:

      • imgname: same format as above.
      • pose: SMPL pose parameters. Each sequence has shape sequence_length * 72, in the same order as imgname.
      • shape: SMPL shape parameters. Each sequence has shape sequence_length * 10, in the same order as imgname.
  2. If you use 3D keypoints as the body representation, add the root path of all keypoints as cfg.DATASET.ROOT_[your dataset]_[estimator]_3D in evaluate_config.py, train_config.py, or visualize_config.py, depending on your purpose (test, train, or visualize).

  3. Construct your own dataset class following the existing dataset files. You might need to modify the implementation details depending on your data characteristics.
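
For illustration, writing a detected-pose file for a hypothetical dataset "mydata" with estimator "myest" and 17 joints (all names and sizes are made up) could look like this:

```python
import numpy as np

# Two frames of a single sequence; imgname follows [sequence_name]/[image_name].
imgname = np.array(["seq01/000001.jpg", "seq01/000002.jpg"])
# Each sequence stacks to (sequence_length, num_keypoints * 3).
keypoints_3d = np.random.randn(2, 17 * 3).astype(np.float32)

# Detected poses; the groundtruth/ file uses the same keys, with "gt" in its name.
np.savez("mydata_myest_3D/detected/mydata_myest_3D_train.npz",
         imgname=imgname, keypoints_3d=keypoints_3d)
```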

Training

Run the commands below to start training:

python train_smoothnet.py --cfg [config file] --dataset_name [dataset name] --estimator [backbone estimator you use] --body_representation [smpl/3D/2D] --slide_window_size [slide window size]

For example, you can train on the 3D representation of Human3.6M, using the backbone estimator FCN with a slide window size of 8, by:

python train_smoothnet.py --cfg configs/h36m_fcn_3D.yaml --dataset_name h36m --estimator fcn --body_representation 3D --slide_window_size 8

You can easily train on multiple datasets by using "," to separate multiple datasets / estimators / body representations. For example, you can train on AIST++-VIBE-3D and 3DPW-SPIN-3D with a slide window size of 8 by:

python train_smoothnet.py --cfg configs/h36m_fcn_3D.yaml --dataset_name aist,pw3d --estimator vibe,spin --body_representation 3D,3D  --slide_window_size 8

Note that the training and testing datasets should be downloaded and prepared before training.

Evaluation

Run the commands below to start evaluation:

python eval_smoothnet.py --cfg [config file] --checkpoint [pretrained checkpoint] --dataset_name [dataset name] --estimator [backbone estimator you use] --body_representation [smpl/3D/2D] --slide_window_size [slide window size] --tradition [savgol/oneeuro/gaus1d]

For example, you can evaluate MPI-INF-3DHP-TCMR-3D and MPI-INF-3DHP-VIBE-3D using a SmoothNet trained on 3DPW-SPIN-3D with a slide window size of 8, and compare the results against the traditional one-euro filter, by:

python eval_smoothnet.py --cfg configs/pw3d_spin_3D.yaml --checkpoint data/checkpoints/pw3d_spin_3D/checkpoint_8.pth.tar --dataset_name mpiinf3dhp,mpiinf3dhp --estimator tcmr,vibe --body_representation 3D,3D --slide_window_size 8 --tradition oneeuro

Note that the pretrained checkpoints and testing datasets should be downloaded and prepared before evaluation.

The data and checkpoints used in our experiments can be downloaded from:

  • Google Drive

  • Baidu Netdisk

Visualization

Here we only provide demo visualization based on offline-processed detected poses of specific datasets (e.g., AIST++, Human3.6M, and 3DPW). To visualize an arbitrary video, please refer to the inference/demo of MMHuman3D.

Run the commands below to start visualization:

python visualize_smoothnet.py --cfg [config file] --checkpoint [pretrained checkpoint] --dataset_name [dataset name] --estimator [backbone estimator you use] --body_representation [smpl/3D/2D] --slide_window_size [slide window size] --visualize_video_id [visualize sequence id] --output_video_path [visualization output video path]

For example, you can visualize the second sequence of 3DPW-SPIN-3D using a SmoothNet trained on 3DPW-SPIN-3D with a slide window size of 32, writing the output video to ./visualize, by:

python visualize_smoothnet.py --cfg configs/pw3d_spin_3D.yaml --checkpoint data/checkpoints/pw3d_spin_3D/checkpoint_32.pth.tar --dataset_name pw3d --estimator spin --body_representation 3D --slide_window_size 32 --visualize_video_id 2 --output_video_path ./visualize

Citing SmoothNet

If you find this repository useful for your work, please consider citing it as follows:

@inproceedings{zeng2022smoothnet,
      title={SmoothNet: A Plug-and-Play Network for Refining Human Poses in Videos},
      author={Zeng, Ailing and Yang, Lei and Ju, Xuan and Li, Jiefeng and Wang, Jianyi and Xu, Qiang},
      booktitle={European Conference on Computer Vision},
      year={2022},
      organization={Springer}
}

Please remember to cite all the datasets and backbone estimators if you use them in your experiments.

License

This code is available for non-commercial scientific research purposes as defined in the LICENSE file. By downloading and using this code you agree to the terms in the LICENSE. Third-party datasets and software are subject to their respective licenses.


smoothnet's Issues

About the training process

Loading dataset (0)......
#############################################################
You are loading the [training set] of dataset [h36m]
You are using pose esimator [fcn]
The type of the data is [3D]
The frame number is [1559752]
The sequence number is [600]
#############################################################
#############################################################
You are loading the [testing set] of dataset [h36m]
You are using pose esimator [fcn]
The type of the data is [3D]
The frame number is [543344]
The sequence number is [236]
#############################################################

The training process gets stuck at this data-loading stage. Could you advise what the problem might be?

function of slide_window_to_sequence ???

I wonder what the functionality of slide_window_to_sequence() is; it seems to average the values of each axis over the whole sliding window. Why not use a single frame, rather than the averaged frames, for the prediction?
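
For reference, one plausible implementation of such window-to-sequence merging (a sketch, not necessarily the repository's exact code) averages every prediction that each frame receives from the overlapping windows:

```python
import torch

def windows_to_sequence(windows, stride=1):
    """Merge overlapping window outputs (N, W, C) into one sequence
    (T, C), T = (N - 1) * stride + W, by averaging all predictions
    that cover each frame."""
    n, w, c = windows.shape
    seq_len = (n - 1) * stride + w
    total = torch.zeros(seq_len, c)
    count = torch.zeros(seq_len, 1)
    for i in range(n):
        start = i * stride
        total[start:start + w] += windows[i]
        count[start:start + w] += 1
    return total / count
```

Averaging acts as a test-time ensemble over the multiple window predictions covering each frame, which plausibly gives smoother results than keeping a single frame per window.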

How to train SMPL-based model?

There are some problems when trying to reproduce the pw3d-spin-smpl experiment.

  • First, a key name in ./data/poses/pw3d_spin_smpl/groundtruth is misspelled: "pose" is written as "psoe".
  • In addition, I tried to run ./train_smoothnet.py with the default settings from pw3d_spin_3D.yaml to train the SMPL-based model using the one-euro filter. Although the training loss and acceleration error decreased significantly during training, other metrics such as MPJPE, MPVPE, and PA-MPJPE became larger. I wonder if there is an error in my hyper-parameter settings.
  • Besides, does this repository plan to add the SLERP filter mentioned in the paper?

restore

How can I restore the SmoothNet estimates in the same format as the detected input, i.e., recover the imgname and keypoints rather than rendering videos?

filter_pose ?

When I run it with SMPL: [screenshot]

cmd: python eval_smoothnet.py --cfg configs/h36m_fcn_3D.yaml --checkpoint data/checkpoints/h36m_fcn_3D/checkpoint_32.pth.tar --dataset_name aist --estimator tcmr --body_representation smpl --slide_window_size 32

Q: [WARN] Cannot find rule for <class 'lib.models.smoothnet.SmoothNet'>. Treat it as zero Macs and zero Params.
Does this mean it has no effect on the smoothing? What happened?

Skeleton visualization does not fit the original image well after 2D filtering?

Hello Ailing Zeng. For 2D pose filtering, I normalize the pose sequence to [-1, 1] according to the image size, run SmoothNet, and then denormalize. The overall stick figure looks smooth and much less jittery, but when visualized on the original video the skeleton no longer fits the person well. Could you briefly analyze the possible causes? Thanks!

Motion-aware smoothnet

Hi,

I'm trying to implement Motion-aware SmoothNet, and I'm a little confused about $T^e$. Should all branches output the same number of frames $T^e$, or should they follow the number of input frames $T, T-1, T-2$?

Besides, in the inference pipeline, which frames should be updated within a window? For example, in an 8-frame sequence, should all frames be updated, or just the middle ones?

Thanks.
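
For context, the motion-aware variant described in the paper feeds velocity and acceleration alongside positions. A minimal sketch using first- and second-order finite differences follows (the exact formulation in the paper may differ):

```python
import torch

def finite_difference_motion(x):
    """x: (batch, T, C) positions. First/second-order finite differences:
    velocity has T-1 steps, acceleration has T-2 steps."""
    v = x[:, 1:] - x[:, :-1]   # v_t = x_{t+1} - x_t
    a = v[:, 1:] - v[:, :-1]   # a_t = v_{t+1} - v_t
    return v, a
```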

Replicate the Motion-aware SmoothNet

Hello, I am trying to replicate Motion-aware SmoothNet, but I have a question: how should angular velocity be defined in the axis-angle representation?

how to train custom yoga pose model?

1) I have no idea how I should generate the data. In the README you mention two things, the detected files and the ground truth, both in .npz format. How should I generate these? What steps do I need to follow?
2) Can I use this repo for custom yoga pose detection?

Why is the SMPL rotation 3 values, not 3*3?

Hi, I saw there is an smpl 6d option; why does it only have 24*3 and 24*6?

Why is the rotation matrix not 24*3*3? What does this "6d" stand for?
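
For background, the 6D rotation representation (Zhou et al., CVPR 2019) keeps only the first two columns of each 3x3 rotation matrix, since the third column is recoverable as their cross product; 24 SMPL joints thus give 24*3 values in axis-angle and 24*6 in 6D form. A sketch of the conversion (assuming SciPy; this is not code from the repository):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def axis_angle_to_6d(pose_aa):
    """pose_aa: (24, 3) SMPL pose in axis-angle. Returns (24, 6): the first
    two columns of each rotation matrix, concatenated (Zhou et al., 2019)."""
    mats = Rotation.from_rotvec(pose_aa).as_matrix()            # (24, 3, 3)
    return np.concatenate([mats[:, :, 0], mats[:, :, 1]], -1)   # (24, 6)
```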

3D pose estimator results are different from paper

I tested the Human3.6M dataset with the FCN estimator following your testing guideline.

python train_smoothnet.py --cfg configs/h36m_fcn_3D.yaml --dataset_name h36m --estimator fcn --body_representation 3D --slide_window_size 8

The predicted estimator results match the paper, including Accel, MPJPE, and PA-MPJPE. However, the SmoothNet results are different.

In the paper, Accel, MPJPE, and PA-MPJPE are 1.03, 52.72, and 40.92, respectively. In contrast, your checkpoint gives 1.94, 53.29, and 41.38.

Is there any setting I need to change? Please advise.

run with different keypoints number

I have trained a custom pose detector with 12 joints.
Does SmoothNet work without retraining the weights, or should I just pad the missing joints with coordinates like (0, 0, 0)?

Motion-aware SmoothNet

Hi! Nice work. I read lib/models/smoothnet.py, but I didn't find the code for Motion-aware SmoothNet. Has that code been released yet?

checkpoints is not a pretrained model!!!!

Hello, I downloaded your pretrained model, but when I tried to test and train there was a problem: data/checkpoints/pw3d_spin_3D/checkpoints_8.pth.tar is not a pretrained model! Can you tell me how to solve this?

Wrong keypoint position

I'm sorry to disturb you again. I used YOLO-Pose to predict keypoints and fed the predictions into SmoothNet, using the HRNet-based 2D keypoint model. In my opinion the jitter has been reduced, but there is a large position offset, as follows:
[screenshot]

About adding more than two datasets for training.

When I wanted to add my own H36M 2D dataset, I found "training_iter = min([len(self.train_dataloader[i]) for i in range(len(self.train_dataloader))])" at line 100 of lib/core/trainer.py.

I wonder: when adding different datasets, why use min() rather than sum()?

With min(), training only iterates as far as the smallest dataset.

How to handle the translation of the root joint

Thanks to the authors; SmoothNet's refinement of poses works really well.

What I would like to know is whether SmoothNet can also be used to smooth the translation of the root joint. Following the examples, it does not seem to process the position/translation.

Looking forward to a reply, thanks!

RLE 2D/3D and other datasets

Hello, the data you provided does not include the training sets for RLE 2D, RLE 3D, and VPose 3D. Could you upload the relevant datasets? Thank you very much.

What is the experimental setup for human mesh recovery?

This table is from your paper: [screenshot]

In the paper, you mention that "SmoothNet is trained with the pose outputs from SPIN [22]. We test its performance across multiple backbone networks.", but you don't describe the experimental configuration. Which dataset do the training poses come from? And what sliding-window size is used here?

Also, did you evaluate the metrics on the test sets of the three datasets? When I run your eval code, the VIBE results differ from this table: [screenshot]

(data from pw3d_vibe_smpl_test.npz and pw3d_gt_smpl_test.npz; the VIBE results are not affected by the SmoothNet pretrained models)

Train on h36m

Some errors occurred when training on the H36M dataset.
evaluate on dataset: h36m, estimator: hrnet, body representation: 2D
Traceback (most recent call last):
File "SmoothNet/train_smoothnet.py", line 118, in
main(cfg)
File "SmoothNet/train_smoothnet.py", line 105, in main
Trainer(train_dataloader=train_loader,
File "SmoothNet/lib/core/trainer.py", line 68, in run
performance = self.evaluate()
File "SmoothNet/lib/core/trainer.py", line 240, in evaluate
performance.append(self.evaluate_2d(dataset_index,present_dataset))
File "SmoothNet/lib/core/trainer.py", line 199, in evaluate_2d
eval_dict = evaluate_smoothnet_2D(self.model, self.test_dataloader[dataset_index],
TypeError: evaluate_smoothnet_2D() missing 1 required positional argument: 'dataset'

Wrong skeleton in Visualization of PW3D_SPIN_3D_JOINTS.

I found that the skeleton of PW3D_SPIN_3D was misplaced when I tried to visualize the pw3d_spin_3D pretrained model. I wonder if there may be errors in the PW3D_SPIN_3D_JOINTS or PW3D_SPIN_3D_EDGES settings?
As follows:
[screenshot]

Jitter performance not improved

Hi, I tested the jitter performance using several pretrained models after normalizing the data.
However, the jitter performance is similar or even a little worse. Do you have any suggestions?

Ground-truth data do not exist!

[screenshot]
Hi, I have an issue when I run the PARE demo with SmoothNet.
My command: ! python visualize_smoothnet.py --cfg configs/pw3d_spin_3D.yaml --checkpoint data/checkpoints/pw3d_spin_3D/checkpoints_8.pth.tar --dataset_name pw3d --estimator vibe --body_representation 3D --slide_window_size 32 --visualize_video_id 2 --output_video_path ./visualize

The output cell gives me the error "ImportError: Ground-truth data do not exist!"

Can you help solve this problem? Thank you

Run on arbitrary video

Hi, I want to run SmoothNet on an arbitrary video, but it seems that ground truth for that video is required even for the inference phase, right?

How to handle the different shapes between VIBE predictions and GT?

Thanks for your good work! It performs well on many datasets. But when I tried to train my motion-aware SmoothNet, I ran into the problem that VIBE outputs a different number of frames than the GT. I noticed that your AIST++ data has predictions and GT with the same dimensions and frame counts, so how did you handle the shape mismatch between VIBE predictions and GT? Did you delete the redundant frames, or use another method? Thanks again.

Another question about MuPoTS-3d data

Sorry, but I have another question.
I tried to map the ground-truth and predicted points back to the original video of the MuPoTS-3D dataset, and I found that the positions in the file
./data/poses/mupots_tposenetrefinenet_3D/groundtruth/mupots_gt_3D_test.npz are in a normalized format (Human3.6M format?), described in the paper as: "We compare them on the universal coordinates, where each person is rescaled according to the hip and has a normalized height".
Is there any way to recover the absolute pose positions or visualize them?

Question (Plug and play)

Hi, thanks for sharing your work. I have a standard HRNet-W48 model trained on my own dataset for pose estimation. Can I use SmoothNet with that model to refine its 2D keypoint or heatmap outputs?
Is this a temporal keypoint-tracking model?

SmoothNet SMPL model input

Dear author,

Thank you for your amazing work. Something about your method confuses me.
In this image, you say the model is trained on 3D position representations only; so if I use a SmoothNet with the PARE model (SMPL result), should I input 3D keypoint positions or the 6D rotation representation to the model?
Thank you so much!
[screenshot]

ROMP support?

Hi, can any of the pretrained models be applied to ROMP?

convert onnx

Great work on the 3D task!
I would like to know how I can convert the model to ONNX.

Please update config file

None of these config files can be used to load the corresponding pretrained models.

There are a lot of mismatches...

EVALUATE.ROOT_RELATIVE

I would like to ask what the parameter EVALUATE.ROOT_RELATIVE is used for. The default setting is True, and the predicted values of the network and the GT become 0 after being processed with this parameter.

How to use SmoothNet in MMPose to smooth 2D poses

Hello,
If I want to use SmoothNet in MMPose to smooth the keypoints produced by a 2D pose estimator, how should I do it? MMPose's official API seems to support only filter-based smoothing methods.

Smoothnet input during training and inference.

Hi, Ailing and Xuan,
Thanks for your great work! I have some questions:

  1. The project GitHub page says, "Due to the temporal-only network without spatial modelings, SmoothNet is trained on 3D position representations only, and can be tested on 2D, 3D, and 6D representations, respectively." Does this mean the 3D, 2D, and SMPL models are trained with their corresponding 3D keypoints? E.g., the PARE SMPL model is trained with PARE 3D predictions and ground truth, and during inference we only need to input the PARE SMPL predictions.

  2. The prediction distribution of PARE is inconsistent with SPIN's. Is it unreasonable to initialize PARE's SmoothNet from SPIN's checkpoint? See row 5 of the "SMPL Results" table on the GitHub page.

[screenshot]

  3. The code in lib/models/smoothnet.py looks like the base version of SmoothNet (position refinement only, no velocity or acceleration). Where is the motion-aware SmoothNet?

I may have misunderstood your work due to limited time. I look forward to your reply.

usage

Can I directly use SmoothNet to smooth the output of AlphaPose without any additional training? Thanks!

Is SmoothNet really plug-and-play across different models and datasets?

Hello, I understand that SmoothNet is plug-and-play. I wanted to apply SmoothNet directly to the 3D pose estimation results of our current model on the NTU dataset, but loading the h36m_fcn_3D config and model gives very poor results. I would like to ask: is SmoothNet tied to a specific network and dataset? If I want to use SmoothNet for post-processing of 3D pose estimation, do I need to retrain it on my own dataset and network?

About direct usage

Seems very useful!
If I want to refine a 3D keypoint sequence, which checkpoint is good in general?
And since SmoothNet is trained on 3D positions, why can it also be used on rotation sequences (like SMPL)?
