
limeng95 / multiposenet.pytorch


pytorch implementation of MultiPoseNet (ECCV 2018, Muhammed Kocabas et al.)

Python 94.75% Shell 0.29% Cuda 1.77% C 3.10% C++ 0.09%
human-pose-estimation multiposenet pose-estimation pytorch pytorch-implementation

multiposenet.pytorch's People

Contributors

limeng95


multiposenet.pytorch's Issues

Inference time and running on CPU

Hello,
You have made a great implementation.
I have two questions, please:
  1. What frame rate did you achieve when running on a video?
  2. Can this model be tested on a CPU, with no GPU?
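
For reference, the usual PyTorch way to load a GPU-trained checkpoint on a CPU-only machine is `map_location` (a minimal sketch, not the repo's own script; note this repo also ships a compiled CUDA NMS extension under lib/nms, so CPU-only inference may additionally require replacing that part):

```python
import torch

# Minimal sketch, assuming the checkpoint is a plain state_dict; the path is a placeholder.
state_dict = torch.load('checkpoint.pth', map_location=torch.device('cpu'))
# model.load_state_dict(state_dict)  # 'model' would be the network built on the CPU
```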

About the model used to test the entire network

Thank you very much for your answer. If I want to use multipose_coco_eval.py to evaluate the performance of the entire network, which model should be loaded? Is it the last model trained by multipose_prn_train.py?

detection subnet training

Hi, I trained the detection subnet for 80 epochs, but the val_loss only dropped to 0.48, which is quite far from the baseline model. Were the current settings the same ones used when the baseline was trained?

With your preset max_epoch=50, the loss curve was still decreasing when training finished, and the learning rate (1e-5) had not changed yet.

my model (80 epochs):

Validation loss: mean: 0.48589630182399307, std: 0.056519281222650285

baseline model:
Validation loss: mean: 0.3989471616440041, std: 0.06292749785476406

A question

Hi, is it possible to estimate the pose of a single specified person in a multi-person scene?

Original image and heatmap alignment

Hi @LiMeng95,
Thank you for sharing your code.
I checked the shape of heatmap_max and it is (60, 60). My question is how to align it with the original image (the joint coordinates in the original image and in the heatmap must match). Thank you for your time.
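
One simple way to map heatmap peaks back to image pixels is to rescale by the ratio of the two resolutions (a minimal sketch, assuming the 60x60 heatmap is a uniform downscale of an unpadded, resized input; the repo's actual preprocessing may differ):

```python
def heatmap_to_image_coords(peak_xy, heatmap_size, image_size):
    """Map an (x, y) peak on the heatmap back to original-image pixel coordinates."""
    hx, hy = peak_xy
    scale_x = image_size[0] / heatmap_size[0]
    scale_y = image_size[1] / heatmap_size[1]
    return hx * scale_x, hy * scale_y

# e.g. a peak at (30, 20) on a 60x60 heatmap, original image 480x360:
print(heatmap_to_image_coords((30, 20), (60, 60), (480, 360)))  # -> (240.0, 120.0)
```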

some questions about keypoints training?

When training the keypoint subnet, I was wondering whether the segmentation branch is not trained at all, judging from the commented-out code below:

```python
def build_keypoint_loss(saved_for_loss, heat_temp, heat_weight):
    names = build_names()
    saved_for_log = OrderedDict()
    criterion = nn.MSELoss(size_average=True).cuda()
    total_loss = 0
    div1 = 1.
    # div2 = 100.

    for j in range(5):
        pred1 = saved_for_loss[j][:, :18, :, :] * heat_weight
        gt1 = heat_weight * heat_temp

        # pred2 = saved_for_loss[j][:, 18:, :, :]
        # gt2 = mask_all

        # Compute losses
        loss1 = criterion(pred1, gt1) / div1  # heatmap_loss
        # loss2 = criterion(pred2, gt2) / div2  # mask_loss
        total_loss += loss1
        # total_loss += loss2

        # Get value from Tensor and save for log
        saved_for_log[names[j * 2]] = loss1.item()  # only the heatmap loss is kept
        # saved_for_log[names[j * 2 + 1]] = loss2.item()

    saved_for_log['max_ht'] = torch.max(
        saved_for_loss[-1].data[:, :18, :, :]).item()
    saved_for_log['min_ht'] = torch.min(
        saved_for_loss[-1].data[:, :18, :, :]).item()
    # saved_for_log['max_mask'] = torch.max(
    #     saved_for_loss[-1].data[:, 18:, :, :]).item()
    # saved_for_log['min_mask'] = torch.min(
    #     saved_for_loss[-1].data[:, 18:, :, :]).item()

    return total_loss, saved_for_log
```

Another question: I found that when training on two GPUs, the speed is lower than with a single GPU; the logs are in the attached screenshots.

Thanks for your reply.

multipose_environment.yaml file

First off: thanks for the excellent documentation of this repo. By far the clearest PyTorch implementation we've found yet!

The .yaml file for setting up a conda environment seems to be out of date. There are two main issues:

  1. The Tsinghua conda channels no longer exist.
  2. Given the above, installing pytorch, torchvision and cuda (the first three packages) no longer works.

If helpful, I could push a solution?

Thanks,
Sam


Question about anchor generation speed in detection subnet training

Hi, thanks for the project you provide. While training the detection subnet I found that its training speed is about 1/6 of the keypoint subnet's (keypoints 30 fps, detection 5 fps). Profiling showed that generating anchors is very time-consuming: classification and regression take only 2 ms, while the anchors take 200 ms.
Tracing it down, I found that the anchors are generated on the CPU, and the final call
torch.from_numpy(all_anchors.astype(np.float32)).cuda()
that moves them to GPU memory takes 200 ms. I am not very familiar with PyTorch; can the anchors be created and computed directly on the GPU?
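
In principle the grid of anchor centres can be built with torch ops directly on the GPU, so the large numpy-to-CUDA copy is avoided (a minimal sketch with placeholder strides and base anchors, not the repo's actual anchor code):

```python
import torch

def make_anchors_gpu(feat_h, feat_w, stride, base_anchors, device='cuda'):
    """base_anchors: (A, 4) tensor of (x1, y1, x2, y2) offsets around the origin."""
    shift_x = (torch.arange(feat_w, dtype=torch.float32, device=device) + 0.5) * stride
    shift_y = (torch.arange(feat_h, dtype=torch.float32, device=device) + 0.5) * stride
    sx = shift_x.view(1, -1).expand(feat_h, feat_w)   # (H, W) grid of x centres
    sy = shift_y.view(-1, 1).expand(feat_h, feat_w)   # (H, W) grid of y centres
    shifts = torch.stack((sx, sy, sx, sy), dim=-1)    # (H, W, 4)
    anchors = shifts.reshape(-1, 1, 4) + base_anchors.to(device).reshape(1, -1, 4)
    return anchors.reshape(-1, 4)                     # (H*W*A, 4), already on the GPU
```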

COCO.json?

Can you upload COCO.json? I cannot find it under the coco directory.

Thanks!

why weight_decay = 0?

Correct me if my understanding is wrong: weight_decay is the coefficient of the regularization term, and if it is 0, no regularization is applied.

Why is it set to 0? Is it because there is enough data that overfitting is not a concern?
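
For context, weight_decay in PyTorch optimizers is the L2-penalty coefficient, so 0 simply means no weight regularization is applied (a tiny illustration, not the repo's training code):

```python
import torch

params = [torch.nn.Parameter(torch.randn(3, 3))]
opt_plain = torch.optim.Adam(params, lr=1e-4, weight_decay=0)     # no L2 penalty
opt_l2 = torch.optim.Adam(params, lr=1e-4, weight_decay=1e-5)     # adds weight_decay * w to each gradient
```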

Questions about prediction results

@LiMeng95 Hi, after running inference on some images with your pretrained model, I noticed two issues:
Figure 1: pic1-2canvas
Two keypoints end up outside the box. Following the idea of the paper, only suitable keypoints inside the box should be matched, so what causes this?

Figure 2: 000025245-2canvas
The rightmost box clearly does not contain the correct person, and one keypoint also falls outside the box (even though the keypoint itself is correct).

Improvements detail

Hi,

Could you please tell us what the main change was that improved your score from 0.39 to 0.59?

I'm trying to replicate this paper in Keras and I'm stuck at 0.37 mAP.

Thanks.

Questions about how to train the entire network

Hello, I see that your training scripts each train one subnet. If I want to train the entire network, should subnet_name be set to some other value so that the else branch below is taken?

    if subnet_name == 'keypoint_subnet':
        return self.keypoint_forward(img_batch)
    elif subnet_name == 'detection_subnet':
        return self.detection_forward(img_batch)
    elif subnet_name == 'prn_subnet':
        return self.prn_forward(img_batch)
    else:  # entire_net
        features = self.fpn(img_batch)
        p2, p3, p4, p5 = features[0]  # fpn features for keypoint subnet
        features = features[1]        # fpn features for detection subnet

Another problem: in the else branch I only see the keypoint subnet and the detection subnet, not the prn_subnet. Is there a problem here?

ImportError: lib/nms/_ext/nms/_nms.so: undefined symbol: __cudaPopCallConfiguration

    Traceback (most recent call last):
      File "./training/multipose_prn_train.py", line 12, in <module>
        from network.posenet import poseNet
      File "/home/louis/Documents/DetPose/MultiPoseNet/network/posenet.py", line 16, in <module>
        from lib.nms.pth_nms import pth_nms
      File "/home/louis/Documents/DetPose/MultiPoseNet/lib/nms/pth_nms.py", line 2, in <module>
        from ._ext import nms
      File "/home/louis/Documents/DetPose/MultiPoseNet/lib/nms/_ext/nms/__init__.py", line 3, in <module>
        from ._nms import lib as _lib, ffi as _ffi
    ImportError: /home/louis/Documents/DetPose/MultiPoseNet/lib/nms/_ext/nms/_nms.so: undefined symbol: __cudaPopCallConfiguration

conda envs:
Ubuntu: 18.04
python: 3.6.8
pytorch: 0.4.1
cuda90: 1.0

Details of Baseline model

Hello, I was wondering what the difference is between your released baseline model and the models you used in your paper.

I found there is a noticeable gap: 59.0 AP with the baseline model versus 69.6 AP with the model in your paper. I also found that the 69.6 AP model runs at 23 FPS as mentioned in the paper, while the baseline model runs at 3 FPS with single-scale testing on my GTX 1080 Ti.

Thanks for your great work and open source code!

Missing license

Hi, could you please add license information to this repo? Thanks!

PRN when using tester

Hi,

I see in /evaluate/multipose_coco_eval.py that you're not using the PRN part of the network, and that there's a prn_process function that is heuristic rather than learned.

Is there a way to actually evaluate the output of the PRN part of the network on the MS COCO dataset?

Questions about key point network training

Hello, thank you for this excellent code. I would like to ask: when I was training the keypoint network, the GPU did not seem to be used for training. Is this because of the point mentioned before, that using cudnn will make it less accurate? Another point is that I did not disable cudnn. The current training process is as follows:
2019-01-14 21:42:43 [INFO]: res101_keypoint_subnet/
Training: epoch 1[2250/20254], lr: [0.0001]
Heatmap_loss_k2: 0.0008420568
Heatmap_loss_k3: 0.0008466825
Heatmap_loss_k4: 0.0009062610
Heatmap_loss_k5: 0.0010405664
Heatmap_loss: 0.0008101475
Max_ht: 0.9672015524
Min_ht: -0.0529488141
(0.01/0.39s, fps: 15.2, rest: 1:58:28)
2019-01-14 21:43:02 [INFO]: res101_keypoint_subnet/
Training: epoch 1[2300/20254], lr: [0.0001]
Heatmap_loss_k2: 0.0008683999
Heatmap_loss_k3: 0.0008737236
Heatmap_loss_k4: 0.0009329009
Heatmap_loss_k5: 0.0010672004
Heatmap_loss: 0.0008370089
Max_ht: 0.9708980930
Min_ht: -0.0504962008
(0.01/0.39s, fps: 15.3, rest: 1:57:10)
2019-01-14 21:43:22 [INFO]: res101_keypoint_subnet/
Training: epoch 1[2350/20254], lr: [0.0001]
Heatmap_loss_k2: 0.0008761945
Heatmap_loss_k3: 0.0008811883
Heatmap_loss_k4: 0.0009402647
Heatmap_loss_k5: 0.0010725809
Heatmap_loss: 0.0008429233
Max_ht: 0.9726167500
Min_ht: -0.0537234642
(0.01/0.40s, fps: 15.2, rest: 1:58:04)
2019-01-14 21:43:41 [INFO]: res101_keypoint_subnet/
Training: epoch 1[2400/20254], lr: [0.0001]
Heatmap_loss_k2: 0.0008321729
Heatmap_loss_k3: 0.0008381093
Heatmap_loss_k4: 0.0008991256
Heatmap_loss_k5: 0.0010329885
Heatmap_loss: 0.0007991617
Max_ht: 0.9892217243
Min_ht: -0.0523971093
(0.01/0.39s, fps: 15.4, rest: 1:56:09)
2019-01-14 21:44:01 [INFO]: res101_keypoint_subnet/
Training: epoch 1[2450/20254], lr: [0.0001]
Heatmap_loss_k2: 0.0008406795
Heatmap_loss_k3: 0.0008457731
Heatmap_loss_k4: 0.0009056911
Heatmap_loss_k5: 0.0010437739
Heatmap_loss: 0.0008086096
Max_ht: 1.0289121532
Min_ht: -0.0526983795
(0.01/0.39s, fps: 15.2, rest: 1:56:56)
2019-01-14 21:44:21 [INFO]: res101_keypoint_subnet/
Training: epoch 1[2500/20254], lr: [0.0001]
Heatmap_loss_k2: 0.0008856484
Heatmap_loss_k3: 0.0008906652
Heatmap_loss_k4: 0.0009485921
Heatmap_loss_k5: 0.0010761062
Heatmap_loss: 0.0008587059
Max_ht: 0.9907071078
Min_ht: -0.0525300398
(0.01/0.40s, fps: 15.1, rest: 1:57:55)
2019-01-14 21:44:41 [INFO]: res101_keypoint_subnet/
Training: epoch 1[2550/20254], lr: [0.0001]
Heatmap_loss_k2: 0.0008457120
Heatmap_loss_k3: 0.0008519809
Heatmap_loss_k4: 0.0009205155
Heatmap_loss_k5: 0.0010695740
Heatmap_loss: 0.0008087154
Max_ht: 0.9958183694
Min_ht: -0.0589106724
(0.01/0.40s, fps:15.0, rest: 1:58:23)
2019-01-14 21:45:01 [INFO]: res101_keypoint_subnet/
Training: epoch 1[2600/20254], lr: [0.0001]
Heatmap_loss_k2: 0.0008354363
Heatmap_loss_k3: 0.0008410966
Heatmap_loss_k4: 0.0009031857
Heatmap_loss_k5: 0.0010408414
Heatmap_loss: 0.0008022347
Max_ht: 0.9953839695
Min_ht: -0.0547209233
(0.01/0.40s, fps: 14.9, rest: 1:58:36)
2019-01-14 21:45:21 [INFO]: res101_keypoint_subnet/
Training: epoch 1[2650/20254], lr: [0.0001]
Heatmap_loss_k2: 0.0007921754
Heatmap_loss_k3: 0.0007980470
Heatmap_loss_k4: 0.0008610832
Heatmap_loss_k5: 0.0009990778
Heatmap_loss: 0.0007611108
Max_ht: 0.9962352312
Min_ht: -0.0516994573
(0.01/0.40s, fps:15.1, rest: 1:56:18)
2019-01-14 21:45:41 [INFO]: res101_keypoint_subnet/
Training: epoch 1[2700/20254], lr: [0.0001]
Heatmap_loss_k2: 0.0007981830
Heatmap_loss_k3: 0.0008039676
Heatmap_loss_k4: 0.0008650466
Heatmap_loss_k5: 0.0009992117
Heatmap_loss: 0.0007669853
Max_ht: 0.9683502948
Min_ht: -0.0440845984
(0.01/0.40s, fps:15.1, rest: 1:56:08)
2019-01-14 21:46:01 [INFO]: res101_keypoint_subnet/
Training: epoch 1[2750/20254], lr: [0.0001]
Heatmap_loss_k2: 0.0008513692
Heatmap_loss_k3: 0.0008569237
Heatmap_loss_k4: 0.0009197287
Heatmap_loss_k5: 0.0010576613
Heatmap_loss: 0.0008216371
Max_ht: 0.9908136356
Min_ht: -0.0525577789
(0.01/0.40s, fps:15.0, rest: 1:57:02)
2019-01-14 21:46:21 [INFO]: res101_keypoint_subnet/
Training: epoch 1[2800/20254], lr: [0.0001]
Heatmap_loss_k2: 0.0008516031
Heatmap_loss_k3: 0.0008580373
Heatmap_loss_k4: 0.0009210396
Heatmap_loss_k5: 0.0010581265
Heatmap_loss: 0.0008185565
Max_ht: 0.9887517738
Min_ht: -0.0534696823
(0.01/0.40s, fps:15.0, rest: 1:55:58)
2019-01-14 21:46:41 [INFO]: res101_keypoint_subnet/
Training: epoch 1[2850/20254], lr: [0.0001]
Heatmap_loss_k2: 0.0008685764
Heatmap_loss_k3: 0.0008739439
Heatmap_loss_k4: 0.0009375646
Heatmap_loss_k5: 0.0010763726
Heatmap_loss: 0.0008395686
Max_ht: 0.9699891734
Min_ht: -0.0491570164
(0.01/0.40s, fps:15.0, rest: 1:55:48)

Observing this process, it seems to be converging. Thank you.

Question about intermediate supervision

Why is the number of heatmap channels 19 when computing the intermediate supervision in the code? To match the GT, shouldn't it be 18?
self.convfin_k2 = nn.Conv2d(256, 19, kernel_size=1, stride=1, padding=0)
self.convfin_k3 = nn.Conv2d(256, 19, kernel_size=1, stride=1, padding=0)
self.convfin_k4 = nn.Conv2d(256, 19, kernel_size=1, stride=1, padding=0)
self.convfin_k5 = nn.Conv2d(256, 19, kernel_size=1, stride=1, padding=0)
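
One possible explanation (an inference from the build_keypoint_loss code quoted earlier on this page, not a confirmed design note) is that only the first 18 channels are supervised as joint heatmaps, while channel 18 corresponds to the commented-out mask target (gt2 = mask_all):

```python
import torch

def split_head_output(output: torch.Tensor):
    """Split the 19-channel head output the way build_keypoint_loss does."""
    heatmaps = output[:, :18, :, :]  # joint heatmaps, compared with heat_temp
    extra = output[:, 18:, :, :]     # unused while the mask loss stays commented out
    return heatmaps, extra
```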

Questions about training and evaluation

Hi,

I am trying to replicate this work in Keras, but I haven't been able to achieve the same results. I started my implementation by following the paper and based the COCO evaluation on your code. This achieves 0.390 mAP.

Could you describe your training strategy and results at each stage?

Thanks!

No such file or directory: '/data/COCO/COCO.json'

Where can I find this file? COCO.json

    ${COCO_ROOT}
    --annotations
      --instances_train2017.json
      --instances_val2017.json
      --person_keypoints_train2017.json
      --person_keypoints_val2017.json
    --images
      --train2014
      --val2014
      --train2017
      --val2017
    --mask2014
    --COCO.json

Question about the baseline model

I downloaded the baseline parameter ckpt file, and it seems that in your code the keypoint detection network did not use it during training, while the person detection network and the PRN network both did. I am confused about this. What is the training process used to obtain the baseline model parameters?
One more question: would it be appropriate to use the baseline res101 ckpt as the detection network's checkpoint, instead of using the checkpoint obtained after keypoint training?

different input sizes of the two subnets during training

Hi, I noticed that during training the input to keypoint_subnet is 480x480, while detection_subnet uses 608x608. What was the consideration behind training the two networks at different resolutions?

Also, my keypoint_subnet reached a val_loss of 0.00312 after 48 epochs (the previous best was 0.00313 at epoch 35). Is that result acceptable?

I want to know the best result of this work

First, I appreciate your implementation of the PRN!
I would like to know your best mAP: how much is it?
And what is the mAP of the ablation experiment (detection ground truth + keypoint prediction)?

Maybe some bugs in network?

  1. Several layers don't use ReLU in the forward function. here
    Do these layers not need ReLU in the FPN structure? In the MultiPoseNet paper they don't say ReLU is used everywhere, but for the smooth step they say a 3*3 conv and ReLU are used; it seems this was forgotten.
  2. In the paper, two 3*3 convs are used for the transform from M2 to K2, while in your model there is only one (see the sketch after this list).
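
The sketch below spells out what item 2 refers to, as described in the paper; it is not the repo's actual layers, and the channel widths are assumptions:

```python
import torch.nn as nn

# "Smooth" step after merging FPN maps: a 3x3 conv followed by ReLU, as the paper describes.
smooth = nn.Sequential(
    nn.Conv2d(256, 256, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
)

# Paper-style M2 -> K2 transform with two 3x3 convs (the issue reports the repo uses only one).
m2_to_k2 = nn.Sequential(
    nn.Conv2d(256, 256, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(256, 128, kernel_size=3, padding=1),  # output width is a placeholder
)
```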

I may have misunderstood some parts; sorry to bother you, and I look forward to your reply.
