
shimingyi / motionet


A deep neural network that directly reconstructs the motion of a 3D human skeleton from monocular video [ToG 2020]

Home Page: https://rubbly.cn/publications/motioNet/

License: BSD 2-Clause "Simplified" License

Python 100.00%
3d-pose-estimation bvh character-animation deep-neural-network

motionet's People

Contributors

kfiraberman, shimingyi


motionet's Issues

Normalization in preprocessing

Hi @Shimingyi
Sorry to bother you again. I have a question about the preprocessing part in section 3.3.
image

Intuitively, both the bone lengths and the joint rotations are estimated from the positional relations between two kinematically neighbouring joints (please correct me if I am wrong).
But after the 2nd normalization step in the preprocessing, the positional relation between two joints is completely lost, since each joint is normalized w.r.t. its own mean and std.

Therefore, I am wondering about the motivation for the 2nd step. Did you try training the network without the 2nd normalization?
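
For concreteness, this is what I understand the 2nd step to be (a NumPy sketch of my reading, with assumed shapes, not the repo's actual code):

    import numpy as np

    poses_2d = np.random.randn(1000, 17, 2)           # (frames, joints, xy) - dummy data
    mean = poses_2d.mean(axis=0, keepdims=True)       # per-joint, per-coordinate mean
    std = poses_2d.std(axis=0, keepdims=True) + 1e-8  # per-joint, per-coordinate std
    poses_2d_norm = (poses_2d - mean) / std           # each joint normalized independently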

Best.

How do you map human joint information onto an animated character?

Hello, in your video I saw human joint information being mapped onto animals in order to drive them. The torso and limb proportions of a human and an animated character are different, so how is this achieved? Or are the rotation angles obtained from the human simply applied directly to the animated character?
image

How to output a BVH file with zxy rotation order?

Hi, when I use the save method in BVH.py and change order to zxy, the result displays incorrectly in Blender.
I then tried using scipy to convert the quaternions to zxy-order Euler angles, but the result was also wrong.
How can I obtain Euler angles in zxy order?
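
For reference, this is roughly what I tried with scipy (a sketch, and I may have the conventions wrong; note that scipy expects scalar-last x,y,z,w quaternions and distinguishes intrinsic 'ZXY' from extrinsic 'zxy'):

    import numpy as np
    from scipy.spatial.transform import Rotation as R

    quats_wxyz = np.array([[1.0, 0.0, 0.0, 0.0]])   # (N, 4) in w,x,y,z order (my assumption about the bvh code)
    quats_xyzw = quats_wxyz[:, [1, 2, 3, 0]]        # reorder to scipy's scalar-last convention
    eulers_zxy = R.from_quat(quats_xyzw).as_euler('ZXY', degrees=True)  # intrinsic z-x-y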

Thanks

bone2skeleton conversion

Hello, thanks for sharing this nice work!

I have a question regarding the bone to skeleton conversion function in ./models/model.py
It seems the joints are shifted from the pelvis joint by offsets of bone lengths, but why are there no horizontal offsets for RKnee, RAnkle, LKnee, and LAnkle? I was expecting something like:

skel_in[:, 2, 0] = -unnorm_bones[:, 0, 0]
skel_in[:, 5, 0] = unnorm_bones[:, 0, 0]
skel_in[:, 3, 0] = -unnorm_bones[:, 0, 0]
skel_in[:, 6, 0] = unnorm_bones[:, 0, 0]

Thanks in advance for your consideration!

About train.py !!

wild videos

python train.py -n wild -d 1 --kernel_size 5,3,1 --stride 3,1,1 --dilation 1,1,1 --channel 1024 --confidence 1 --translation 1 --contact 1 --loss_term 1101

First, I noticed that config.trainer.data is always "gt". This means only ./data/data_h36m.npz is used for training.
Second, can you provide ./data/data_2d_h36m_cpn_ft_h36m_dbb.npz and ./data/data_2d_h36m_detectron_ft_h36m.npz?

Get self.poses_2d_mean, self.poses_2d_std without downloading the h36m dataset

Can you please provide the normalization values for self.poses_2d_mean and self.poses_2d_std so that I can run a test on wild videos?

(motionet) C:\Users\v84153490\PycharmProjects\MotioNet-master>python evaluate.py -r ./checkpoints/wild_gt_tcc.pth -i sample_output
Building the network
Traceback (most recent call last):
File "evaluate.py", line 145, in
main(config, args, output_folder)
File "evaluate.py", line 73, in main
parameters = [torch.from_numpy(np.array(item)).float().to(device) for item in h36m_loader(config, is_training=True).dataset.get_parameters()]
File "C:\Users\v84153490\PycharmProjects\MotioNet-master\data\data_loaders.py", line 11, in init
self.dataset = h36m_dataset.h36m_dataset(config, is_train=is_training)
File "C:\Users\v84153490\PycharmProjects\MotioNet-master\data\h36m_dataset.py", line 22, in init
positions_set = np.load('./data/data_h36m.npz', allow_pickle=True)['positions_3d'].item()
File "C:\Anaconda\envs\motionet\lib\site-packages\numpy\lib\npyio.py", line 416, in load
fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: './data/data_h36m.npz'

About foot-contact loss

Looking at trainer.py, it seems that loss_fc is measured based on joint speed in the local frame (with the root at the origin). Shouldn't it be measured based on joint speed in a fixed frame instead?

For example, one can shift their center of mass from the right foot to the left foot, creating foot-joint movement relative to the root joint, without actually lifting their feet.
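
This is what I have in mind (a sketch, not the repo's code; the ankle joint indices are assumed):

    import torch

    def foot_speed_global(local_positions, root_translation, foot_indices=(3, 6)):
        # local_positions: (frames, joints, 3) joint positions with the root at the origin
        # root_translation: (frames, 3) predicted global root trajectory
        global_positions = local_positions + root_translation[:, None, :]
        feet = global_positions[:, list(foot_indices), :]   # pick the ankle joints
        return (feet[1:] - feet[:-1]).norm(dim=-1)           # per-frame foot speed in the fixed frame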

AttributeError: 'NpzFile' object has no attribute 'zip'

Hello! I've cloned the repo, downloaded the pretrained model and placed it in a new 'checkpoints' folder, and placed the training data in the existing 'data' folder, but when running the following command python evaluate.py -r ./checkpoints/wild_gt_tcc.pth -i demo I get this error:

Building the network
Traceback (most recent call last):
File "evaluate.py", line 137, in
main(config, args, output_folder)
File "evaluate.py", line 73, in main
parameters = [torch.from_numpy(np.array(item)).float().to(device) for item in h36m_loader(config, is_training=True).dataset.get_parameters()]
File "/content/MotioNet/data/data_loaders.py", line 11, in init
self.dataset = h36m_dataset.h36m_dataset(config, is_train=is_training)
File "/content/MotioNet/data/h36m_dataset.py", line 22, in init
positions_set = np.load('./data/data_h36m.npz', allow_pickle=True)['positions_3d'].item()
File "/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py", line 432, in load
pickle_kwargs=pickle_kwargs)
File "/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py", line 186, in init
_zip = zipfile_factory(fid)
File "/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py", line 112, in zipfile_factory
return zipfile.ZipFile(file, *args, **kwargs)
File "/usr/lib/python3.6/zipfile.py", line 1131, in init
self._RealGetContents()
File "/usr/lib/python3.6/zipfile.py", line 1198, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
Exception ignored in: <bound method NpzFile.del of <numpy.lib.npyio.NpzFile object at 0x7f5b14ae70f0>>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py", line 223, in del
self.close()
File "/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py", line 214, in close
if self.zip is not None:
AttributeError: 'NpzFile' object has no attribute 'zip'
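
A quick sanity check on the downloaded file (a sketch): a valid .npz is a zip archive, so if this prints False the file is probably a corrupted or partial download (for example an HTML error page) rather than the real archive.

    import zipfile
    print(zipfile.is_zipfile('./data/data_h36m.npz'))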

here is my project structure:

image

Blender visualization issue.

I tried to visualize the bvh files using Blender. The resulting animation is upside down. Any solution? The initial T-pose is correct, but from the second frame on the orientation is upside down.
image

The skeleton animation is upside down in Blender

I ran the sample code you provided and loaded the resulting bvh into Blender. I found that except for the first frame, all the other poses are upside down, and the feet do not stay on the ground as in the demo video. Is any additional setup needed? Please advise. Thanks.

Several questions on the Neural FK.

Dear authors,

Thank you for sharing the code, really nice work!

In the past few days, I've been reading your paper and studying your code carefully, and have several questions on the bone2skeleton function and the neural FK layer.

  1. Why does the bone2skel function (in model.py) reconstruct an unusual skeleton? According to your paper, the output of the S network is the bone lengths of a predefined skeleton (in my opinion, the skeleton only defines the topology), and the real skeleton is reconstructed by bone2skel() with the learned bone lengths and the topology. I visualized the reconstructed skeleton (left); as shown in the figure, it is clearly unusual and the topology (without the end-effectors) is incorrect. In my opinion, it should be the one on the right. (The skeleton was saved when running evaluation on H36m data using your pre-trained model h36m_gt.pth.)

image

  2. Since the skeleton topology is wrong, why does the neural FK layer still reconstruct the correct 3D points? Does the neural FK layer compute 3D joints differently from the traditional FK algorithm?

Any responses will be highly appreciated!

Fair comparison of bone length estimation as shown in Fig. 9

Hi @Shimingyi ,

I have a question about the comparison of bone length estimation shown in Fig. 9.
I suppose you use the GT scale to rescale the estimated bone lengths. However, if I understand correctly, methods such as Pavllo et al. [CVPR'19] do not employ any GT information to calculate the bone lengths.
Therefore, I am wondering whether this comparison is fair. Additionally, what is the unit of the y-axis in Fig. 9?

Thanks a lot in advance.
Best.

Problem running quick start demo

Hi, after executing the following command python3 evaluate.py -r ./checkpoints/wild_gt_tcc.pth -i demo I get this error message.


Building the network
Traceback (most recent call last):
  File "evaluate.py", line 137, in <module>
    main(config, args, output_folder)
  File "evaluate.py", line 73, in main
    parameters = [torch.from_numpy(np.array(item)).float().to(device) for item in h36m_loader(config, is_training=True).dataset.get_parameters()]
  File "/home/mingos/MotioNet/data/data_loaders.py", line 11, in __init__
    self.dataset = h36m_dataset.h36m_dataset(config, is_train=is_training)
  File "/home/mingos/MotioNet/data/h36m_dataset.py", line 14, in __init__
    self.cameras = h36m_utils.load_cameras('./data/cameras.h5')
  File "/home/mingos/MotioNet/utils/h36m_utils.py", line 97, in load_cameras
    R, T, f, c, k, p, name = load_camera_params(hf, 'subject%d/camera%d/{0}' % (s, c_idx + 1))
  File "/home/mingos/MotioNet/utils/h36m_utils.py", line 78, in load_camera_params
    name = "".join([chr(item) for item in name])
  File "/home/mingos/MotioNet/utils/h36m_utils.py", line 78, in <listcomp>
    name = "".join([chr(item) for item in name])
TypeError: only integer scalar arrays can be converted to a scalar index

I followed all the steps listed in the repository and got no errors while installing the requirements. I appreciate any help.
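
For reference, a workaround I guessed at (assuming name comes back from the .h5 file as a small array of integer codes rather than a flat sequence of ints, so it needs flattening before chr()); this would be a drop-in replacement for the failing line in utils/h36m_utils.py:

    import numpy as np
    name = "".join(chr(int(item)) for item in np.asarray(name).flatten())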

When will the code be released?

The paper is excellent and solves many problems rarely addressed in previous work, such as keeping the feet on the ground, absolute translation, and skeleton length adaptation. I would really like to experiment with it. When will the code be released?

Question about Training Time

What is your best estimate of the amount of time it takes to train the network? And what kind of GPU did you use?

Missing heatmap_utils in utils?

Thank you for making this amazing project available.

I encountered the following error when I tried to run the evaluate.py script, both on h36m and on the wild video demo.

`File "evaluate.py", line 7, in
import model.model as models
File "/content/MotioNet/model/model.py", line 9, in
from model import model_zoo
File "/content/MotioNet/model/model_zoo.py", line 7, in
from utils import heatmap_utils
ImportError: cannot import name 'heatmap_utils' i

I checked the utils folder and it seems it doesn't include a heatmap_utils file.

About the output of OpenPose

Hi guys,

Thanks for this great work. I tried your code and found it works well on the H3.6M test set. However, I'm wondering what the output format of OpenPose should be, because I'm using the Python API of OpenPose, and I know there is a 'datum.PoseKeypoints' output. Could you please provide a guideline or a script describing what the OpenPose outputs should look like?
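
For context, this is roughly how I read the JSON files produced by openpose.bin --write_json (a sketch; I'm not sure this matches the joint order and format MotioNet expects):

    import json
    import numpy as np

    def read_openpose_json(path):
        with open(path) as f:
            data = json.load(f)
        if not data['people']:
            return None                                # no person detected in this frame
        kp = np.array(data['people'][0]['pose_keypoints_2d'], dtype=np.float32)
        return kp.reshape(-1, 3)                       # (num_keypoints, 3): x, y, confidence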

Thank you very much.

Best regards,
Nisekoi.

Why the linear discriminator D works

Dear authors,

Thanks a lot for the amazing work and for sharing the code. According to Appendix A in the paper, "discriminator D is a linear component (similarly to Kanazawa et al. [2018]), with an output value between 0 and 1, containing two convolution layers and one fully connected layer". However, according to the last response in an issue of the code for Kanazawa et al. [2018], it does have an activation function.

I'm wondering why a linear discriminator can classify whether a rotation speed is natural or not, as from my point of view this classification is not trivial.

Best,
Wenbo

question about training pose in world space.

" The reason is that, if we predict the pose in world space: (different 3d pose ⊕ different camera view) = same 2d projection, then if given a same 2d pose, these will be an ambiguity for predicted 3d pose. " what's that means ?
In my thought, different camera view ,different 2d keypoints location in the image, refer to the same 3d pose in world space, that is pretty fine, is there any problem i miss?

some questions about the code!!!

hi, @Shimingyi
Thanks a lot for the amazing work and for sharing the code.
I read the paper and the code, and I have some questions:
1. The paper says a random integer value is used as the clip length per iteration, but I find that set_sequences is only called once at the beginning, so the clips never change across epochs.
2. The pretrained model wild_gt_tcc is 73 MB, but the model trained with python train.py -n wild -d 1 --kernel_size 5,3,1 --stride 3,1,1 --dilation 1,1,1 --channel 1024 --confidence 1 --translation 1 --contact 1 --loss_term 1101 is 145 MB.
3. Training with that same command crashes in self.branch_S because the stride is 3. According to the model configuration at the end of the paper, Es should use stride 1, not 3, while Eq should use 3, not 1, right?
4. The paper says 'Since global information is discarded from the normalized local representation, we append to it the global 2D velocity (per-frame).' What is the global 2D velocity? The code feeds the normalized 2D input (with no global info) through self.branch_Q to train root-z; why not use output_Q directly?
5. In the camera augmentation, the code uses 'augment_depth', which means only the global translation is adjusted, not the orientation, right?
6. About the reference T-pose: what does the loss loss_D = torch.mean(torch.norm((D_real - 1) ** 2)) + torch.mean(torch.sum((D_fake) ** 2, dim=-1)) mean? And why is the CMU dataset used?

Looking forward to your reply, thanks!

About network

    for stage_index in range(0, stage_number):
        for conv_index in range(len(kernel_size_set)):
            layers.append(
                nn.Sequential(
                    nn.Conv1d(channel, channel, kernel_size_set[conv_index], stride_set[conv_index], dilation=1, bias=True),
                    nn.BatchNorm1d(channel, momentum=0.1)
                )
            )
    self.stage_layers = nn.ModuleList(layers)

With stage_number = 2 and len(kernel_size_set) = 3, this creates 2 * 3 = 6 layers.

At the end of your paper, the block (Conv + BatchNorm + LReLU + Dropout) or (Conv + BatchNorm + LReLU + Dropout + Adap AP) only has three layers, not six.
Why?

How to get better results at test time

Hi, I recently tried MotioNet and have a few questions:
1. About IMAGE_WIDTH at test time: should I just use the image width, or the longest side?
2. What attracted me most in the MotioNet paper is the relatively accurate regression of the global root position; in fact most current 3D pose methods do quite poorly here. However, I see that at test time translation[:, :2] = translation[:, :2]*3; translation[:, 2] = translation[:, 2]*1.5. How were these scale factors obtained? Empirically? Any advice for choosing them when I test my own videos?
3. The global translation in the demo video below looks very good; was any extra processing done? Can this be achieved with the code and model in this repository?
image
4. I tested the video below myself and the results are not good: many poses are wrong, and the global translation basically stays in place (in the video the person clearly walks from left to right). I used the default settings of evaluate.py (only changing IMAGE_WIDTH to the video width). What else do I need to change to get good results, or does this video simply not match the training distribution?
image

wild_gt_tcc.pth cannot generate foot contact signal?

I'd like to thank you for the code you released. I want to use the foot contact signal which should be generated by the MotioNet code, but I found that in the wild_gt_tcc.pth file the contact config is false, so I checked the code and the weight shapes, and branch Q's output feature size is 49; I think it should be 51 if the model is able to generate the foot contact signal. Here are some screenshots I checked.
[screenshot]
This is the pth file config.
[screenshot]
This is the weight shape of branch Q.
[screenshot]
This is the code in model/model.py.

If I'm right, this pth cannot generate the foot contact signal; if I'm wrong, please tell me how to use the pth file to generate it.
And if this pth file cannot generate the foot contact signal, could you offer one that can? I would appreciate your sharing.

Concerning LICENSE

Thanks for the great work!
Just wondering, have you considered adding a license to this repository?

About multi-GPU operation

I think I found two code errors!

In base_trainer.py:

    if gpu_id > n_gpu:
        msg = "Warning: The number of GPU's configured to use is {}, but only {} are available on this machine.".format(n_gpu_use, n_gpu)

Problem: n_gpu_use is not defined.

    list_ids = list([gpu_id])
    if len(device_ids) > 1:
        self.model = torch.nn.DataParallel(model, device_ids=device_ids)

Problem: len(device_ids) is always 1, so multi-GPU cannot be used.
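
A possible fix I have in mind (only a sketch of what I guess the intended logic is, assuming gpu_id is meant to be the number of GPUs to use; the real intent may differ):

    import torch

    def _prepare_device(self, gpu_id):
        n_gpu = torch.cuda.device_count()
        n_gpu_use = gpu_id
        if n_gpu_use > n_gpu:
            print("Warning: {} GPUs requested, but only {} are available on this machine.".format(n_gpu_use, n_gpu))
            n_gpu_use = n_gpu
        device = torch.device('cuda:0' if n_gpu_use > 0 else 'cpu')
        device_ids = list(range(n_gpu_use))          # e.g. [0, 1] for two GPUs
        return device, device_ids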

How to obtain smooth animation?

The results on wild videos are very different from those presented in the paper. OpenPose-based keypoint extraction was fed as input to the model along with the confidence values, as explained; however, the results are horrific.

About valid !!

[screenshot]

First, a is defined but not used.
Second, alphas is a list; why is only alphas[0] used?

I look forward to your answer! Thank you!

BVH and motion

Great work!

I have some questions: why does the predicted neck of the human body keep leaning back? In addition, there are many NaNs in the saved bvh, and the feet in the output bvh motion do not contact the ground as well as in the demo video. Is it because there is no IK?

Looking forward to your answer.

code/tools for post processing

If it is convenient, could you release the code/tools for the IK post-processing demonstrated in your video?
Thanks in advance!

prepare CMU issue

When I run the command python ./data/prepare_cmu.py to prepare the CMU bvh dataset, it prints this error:
Traceback (most recent call last):
File "./data/prepare_cmu.py", line 11, in
bvh_files = util.make_dataset(['/mnt/dataset/cmubvh'], phase='bvh', data_split=1, sort_index=0)
File "./utils/util.py", line 21, in make_dataset
assert os.path.isdir(dataroot), '%s is not a valid directory' % dataroot
AssertionError: /mnt/dataset/cmubvh is not a valid directory

In the data folder of the repository I created mnt/dataset/cmubvh, and in cmubvh I stored all the files mentioned in the README.
Best, Antreas

kernel size

Dear authors,

Hello! thank you for providing the awesome codes.

I tested your demo code on the videos provided by another work (SfV: Reinforcement Learning of Physical Skills from Videos [Peng et al. 2018]). Each video contains human motion such as jumps, cartwheels, dancing, etc.

I ran your code on them, but it failed to reconstruct the motion and instead produced the RuntimeError below.

Building the network
x: tensor([[[-5.2118e-03, -1.9778e-03, 3.7148e-02, ..., 4.0354e-01,
7.9354e-01, 1.3558e+00],
[ 5.9371e-01, 2.1293e+00, 3.0198e+00, ..., 1.3015e+00,
-3.0760e-03, 5.9612e-02],
[ 2.6908e+00, 2.8417e+00, 3.1971e+00, ..., 2.9587e+00,
-8.6406e-03, -2.6097e-02],
...,
[ 2.9316e+00, 4.8310e+00, 5.9596e+00, ..., 3.5114e+00,
6.2439e-01, 2.0394e+00],
[-1.5403e-02, -6.0191e-02, -8.2277e-02, ..., -4.6780e-02,
-2.5275e-02, -4.6015e-02],
[-8.5922e-03, -2.5728e-03, 5.3182e-02, ..., -2.7516e-03,
-9.3090e-03, -4.1515e-03]]], device='cuda:0') <class 'torch.Tensor'> torch.Size([1, 1024, 10])
x: tensor([[[-0.0239, -0.0236, -0.0197, ..., -0.0534, -0.0357, -0.0212],
[-0.0038, -0.0016, -0.0057, ..., -0.0095, -0.0091, -0.0155],
[-0.0023, -0.0016, -0.0032, ..., 0.6250, 0.5801, -0.0042],
...,
[-0.0215, -0.0185, -0.0183, ..., -0.0356, -0.0176, -0.0103],
[-0.0991, -0.1097, -0.0967, ..., -0.1064, -0.0717, -0.0246],
[-0.0532, -0.0627, -0.0558, ..., -0.0655, -0.0536, -0.0239]]],
device='cuda:0') <class 'torch.Tensor'> torch.Size([1, 1024, 8])
x: tensor([[[ 0.7402, 1.0614, 0.9979, ..., -0.0066, 0.5572, 1.4654],
[-0.0823, -0.1001, -0.0753, ..., -0.0456, -0.0340, -0.0150],
[-0.0556, -0.0696, -0.0560, ..., -0.0411, -0.0323, -0.0139],
...,
[ 1.7153, 2.3598, 1.4705, ..., -0.0079, 0.0161, 0.3899],
[-0.0161, -0.0133, -0.0115, ..., -0.0120, -0.0171, -0.0142],
[-0.0651, -0.0784, -0.0599, ..., -0.0418, -0.0331, -0.0132]]],
device='cuda:0') <class 'torch.Tensor'> torch.Size([1, 1024, 8])
x: tensor([[[-0.0262, -0.0155],
[ 1.2542, 0.8397],
[ 0.4866, 0.3434],
...,
[ 0.3525, -0.0054],
[-0.0251, -0.0131],
[-0.0137, -0.0037]]], device='cuda:0') <class 'torch.Tensor'> torch.Size([1, 1024, 2])
Traceback (most recent call last):
File "evaluate.py", line 144, in
main(config, args, output_folder)
File "evaluate.py", line 121, in main
export(args.input)
File "evaluate.py", line 101, in export
pre_bones, pre_rotations, pre_rotations_full, pre_pose_3d, pre_c, pre_proj = model.forward_fk(poses_2d_root, parameters)
File "/ssd_data/MotioNet/model/model.py", line 57, in forward_fk
fake_bones = self.forward_S(_input)
File "/ssd_data/MotioNet/model/model.py", line 40, in forward_S
return self.branch_S(_input)
File "/home/trif/anaconda3/envs/motionet-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/ssd_data/MotioNet/model/model_zoo.py", line 302, in forward
x = self.drop(self.relu(layer(x)))
File "/home/trif/anaconda3/envs/motionet-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/trif/anaconda3/envs/motionet-env/lib/python3.7/site-packages/torch/nn/modules/container.py", line 119, in forward
input = module(input)
File "/home/trif/anaconda3/envs/motionet-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/trif/anaconda3/envs/motionet-env/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 263, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/home/trif/anaconda3/envs/motionet-env/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 260, in _conv_forward
self.padding, self.dilation, self.groups)
RuntimeError: Calculated padded input size per channel: (2). Kernel size: (3). Kernel size can't be greater than actual input size

Do you know what is causing this error?
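
(My guess is that the clip is simply too short for the strided temporal convolutions in branch_S. Padding very short clips along the time axis before evaluation, as sketched below, might avoid the crash, though I'm not sure it's the right fix; the minimum length is an assumption.)

    import numpy as np

    MIN_FRAMES = 30  # hypothetical minimum; the real requirement depends on the kernel/stride config

    def pad_short_clip(poses_2d, min_frames=MIN_FRAMES):
        # poses_2d: (frames, features) array of normalized 2D poses
        n = poses_2d.shape[0]
        if n >= min_frames:
            return poses_2d
        # repeat the last frame so the temporal convolutions have enough input
        pad = np.repeat(poses_2d[-1:], min_frames - n, axis=0)
        return np.concatenate([poses_2d, pad], axis=0)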

Thanks!

How to use custom 2D pose keypoints?

@Shimingyi @kfiraberman I need some proper guidance on how to use my own 2D keypoints. I have tried feeding them in, but the generated BVH file does not show any movement, while your examples work fine.

Note: I converted the points into h36m format before giving them to the model.

Screenshot (37)

wild_gt_tcc.pth error

python evaluate.py -r ./checkpoints/wild_gt_tcc.pth -i demo

RuntimeError: ./checkpoints/wild_gt_tcc.pth is a zip archive (did you mean to use torch.jit.load()?)
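
(For what it's worth, I suspect this happens when the checkpoint was saved with PyTorch >= 1.6, which uses a zip-based format, and is loaded with an older torch. Upgrading PyTorch, or re-saving the checkpoint on a newer install in the legacy format as sketched below, might work around it; this is only a guess.)

    import torch

    # run this with PyTorch >= 1.6, then load the *_legacy.pth file with the older install
    state = torch.load('./checkpoints/wild_gt_tcc.pth', map_location='cpu')
    torch.save(state, './checkpoints/wild_gt_tcc_legacy.pth', _use_new_zipfile_serialization=False)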

About "achieving temporal coherence naturally" in the paper

Great project!

I read your paper; in the sixth part, CONCLUSION, LIMITATIONS, AND FUTURE WORK:

This is the original text:
Finally, since our system is trained in the space of motions, the inherent smoothness of human motions is learned from the data, achieving temporal coherence naturally.

My question:
What does "achieving temporal coherence naturally" mean? Does it mean real-time, i.e. that the input and output are processed in real time?

thanks!

How to smooth 2d position?

"Currently we haven't applied any smoothing to the 2D input or the output; you can do it yourself to get better results like those we show in the video, but here we want to show the original output."

Could you give me a concrete algorithm?
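
(For anyone else looking: this is not the authors' method, just one common choice I tried, a Savitzky-Golay filter from scipy applied per joint coordinate along the time axis.)

    import numpy as np
    from scipy.signal import savgol_filter

    def smooth_keypoints(poses_2d, window=11, polyorder=3):
        # poses_2d: (frames, joints, 2) array of 2D keypoints
        n = poses_2d.shape[0]
        window = min(window, n if n % 2 == 1 else n - 1)   # window must be odd and <= frames
        if window <= polyorder:
            return poses_2d                                # clip too short to smooth
        return savgol_filter(poses_2d, window_length=window, polyorder=polyorder, axis=0)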

Read the output of openpose on wild videos

Hi there. Thanks for the code. I am trying to test it on videos that I have recorded, but I run into the following problem reading the output of openpose.
I use:

./build/examples/openpose/openpose.bin --video /home/laila/Nexar/openpose/examples/media/nombre_final.mp4 --write_json /home/laila/Nexar/openpose/out_openpose --display 0 --render_pose 0
And:
python evaluate.py -r ./checkpoints/wild_gt_tcc.pth -i /home/laila/Nexar/openpose/out_openpose/ -o /home/laila/Nexar/MotioNet/output/testvid

But my output is:
(env) laila@laila:~/Nexar/MotioNet$ python evaluate.py -r ./checkpoints/wild_gt_tcc.pth -i /home/laila/Nexar/openpose/out_openpose/ -o /home/laila/Nexar/MotioNet/output/testvid
Building the network
Traceback (most recent call last):
File "/home/laila/Nexar/MotioNet/evaluate.py", line 148, in
main(config, args, output_folder)
File "/home/laila/Nexar/MotioNet/evaluate.py", line 124, in main
export(args.input)
File "/home/laila/Nexar/MotioNet/evaluate.py", line 78, in export
files = util.make_dataset([pose_folder], phase='json', data_split=1, sort=True, sort_index=1)
File "/home/laila/Nexar/MotioNet/utils/util.py", line 28, in make_dataset
images.sort(key=lambda x: int(x.split('/')[-1].split('.')[0].split('_')[sort_index]))
File "/home/laila/Nexar/MotioNet/utils/util.py", line 28, in
images.sort(key=lambda x: int(x.split('/')[-1].split('.')[0].split('_')[sort_index]))
ValueError: invalid literal for int() with base 10: 'final'
Thanks in advance for the help! =)
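
(A workaround I tried, not necessarily the intended fix: the JSON names produced by OpenPose contain the video name, so when the video name itself has underscores, the fixed split('_')[sort_index] field is not numeric. Sorting by the last run of digits in the file name, as sketched below, might handle it; the list name here is just illustrative.)

    import re

    def frame_index(path):
        # e.g. 'nombre_final_000000000012_keypoints.json' -> 12
        digits = re.findall(r'\d+', path.split('/')[-1])
        return int(digits[-1]) if digits else 0

    json_files.sort(key=frame_index)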

Problem testing the repository

Hi @Shimingyi, I'm facing problems testing this wonderful repository.

1: Using h36m_gt_t.pth
When I run evaluate.py, BVH files are saved but there is nothing in them: no skeleton, only joint info in the generated BVH file. You can check the attached BVH files in the zip.

BVH_files.zip

I also got RuntimeWarning: invalid value encountered in true_divide here and here.

After debugging, I found many nan values in poses_2d_root, pred_bones, pre_rotations, pre_rotations_full, pre_pose_3d, pre_proj, rotations, and translations.

So is this problem (empty BVH) due to the RuntimeWarning or due to the nan values?

2: Using wild_gt_tcc.pth
With this I got a runtime error; please check the attached txt file for the details, because there are too many error logs.

error.txt

Results on real-world videos

I tested some real-world videos from YouTube and noticed that the results are not as good as your paper shows. I compared the MotioNet results with VideoPose after converting the VideoPose results to a BVH file.

As you compared your approach with different approaches (here), it clearly seems that MotioNet performs well compared to the others, but when I tested with different videos I was shocked by the results. Check the attached gif files for sample results that I got.

part_1

part_2

part_3

part_4

Ignore the viewing angles of the results; I just set them to get a better view. You can judge from the current view of every gif file.

MotioNet estimates the geometric skeleton and rotational parameters, which are the key things needed to drive any 3D character. VideoPose doesn't use any of these parameters, but its results are still more stable and better than MotioNet's.

So did you compare after applying some filters?

If you want to test these videos yourself, I can send them to you.

Note: both approaches are using pre-trained weights without any filter or smoothing.

performance for in the wild

Hey!

Great work! Are there any rough performance benchmarks for in-the-wild videos you could share? I realise it depends on resolution and duration, but some rough breakdowns would help a lot.

How to open .bvh format files?

Thank you very much for your work.
I want to know how to use a .bvh file to generate an animation, as demonstrated in your video.
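
(One way I found, a sketch using Blender's Python API, assuming a default Blender install with the BVH importer enabled; you can also use File > Import > Motion Capture (.bvh) in the UI.)

    import bpy
    bpy.ops.import_anim.bvh(filepath='/path/to/output.bvh')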
