
salihkaragoz / pose-residual-network-pytorch

289 stars · 48 forks · 40 KB

Code for the Pose Residual Network introduced in the paper 'MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network' (https://arxiv.org/abs/1807.04067).

License: Other

Python 99.40% Shell 0.60%
deep-neural-networks human-behavior-understanding human-pose-estimation pose-estimation python pytorch

pose-residual-network-pytorch's People

Contributors

eakbas, icewinechen, salihkaragoz


pose-residual-network-pytorch's Issues

Softmax across all keypoints?

class Flatten(nn.Module):
    def forward(self, input):
        return input.view(input.size(0), -1)

class PRN(nn.Module):
    def __init__(self, node_count, coeff):
        ...
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        res = self.flatten(x)
        ...
        out = self.add(out, res)  # [N, H*W*C]
        # softmax over the whole flattened vector, i.e. jointly across all
        # spatial locations and all 17 keypoint channels
        out = self.softmax(out)
        out = out.view(out.size()[0], self.height, self.width, 17)

        return out
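Since the question is whether the softmax should really span all keypoints: as written, self.softmax(out) normalizes the whole flattened [N, H*W*17] vector at once. A per-keypoint alternative would normalize only over the H*W locations of each channel. The sketch below is illustrative only; the function name and shapes are assumptions, not code from this repo.

import torch

def per_keypoint_softmax(out, height, width, num_keypoints=17):
    # `out` is the flattened activation of shape [N, height*width*num_keypoints],
    # as produced just before self.softmax in the snippet above.
    n = out.size(0)
    # Group the flat vector as [N, H*W, K] so each keypoint keeps its own column,
    # then normalize over the H*W spatial locations only.
    out = out.reshape(n, height * width, num_keypoints)
    out = torch.softmax(out, dim=1)
    return out.reshape(n, height, width, num_keypoints)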

pretrain model is different from the defined model [Solved]

Hi, I downloaded the code and the pre-trained model, then just ran the test:
RuntimeError: Error(s) in loading state_dict for PRN:
Missing key(s) in state_dict: "bneck.weight", "bneck.bias".
Unexpected key(s) in state_dict: "dens3.weight", "dens3.bias".

It looks like the model definition is different from the pre-trained model.
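One possible workaround, assuming only the layer name changed between the released checkpoint and the current code (the "dens3"/"bneck" names come from the error message above; everything else here is illustrative): remap the checkpoint keys before loading, and compare layer shapes if it still fails.

import torch

def load_prn_checkpoint(model, path):
    # The checkpoint stores the bottleneck layer under "dens3.*", while the
    # current PRN definition names it "bneck.*". Rename the keys and load.
    state_dict = torch.load(path, map_location='cpu')
    remapped = {k.replace('dens3', 'bneck'): v for k, v in state_dict.items()}
    model.load_state_dict(remapped)
    return model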

help

Hello, I have been studying your paper recently and trying to run train.py on Linux, but it crashes with a core dump. What should I do?

it's just a joke

Please give up trying. I checked the model; it is just some linear layers with a size of 1024*34272.

More details about the Fig.2 in your paper.

@salihkaragoz Thanks for your excellent work and the shared repo. I'm very interested in the Fig. 2 results of your paper and would like to reproduce them. Would you mind sharing the code for obtaining the sample poses via clustering of the structures learned by the PRN?
Thanks.
Looking forward to your reply.
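The repository does not include this part, but purely as an illustration of the general idea (not the authors' code), one could collect PRN outputs over the validation set, cluster them with k-means, and visualise the cluster centres as pose maps. Every name and shape below is an assumption.

import numpy as np
from sklearn.cluster import KMeans

def cluster_pose_maps(prn_outputs, n_clusters=20):
    # prn_outputs: array of shape [num_samples, H, W, 17], collected by running
    # the trained PRN over validation crops (collection code not shown here).
    n, h, w, c = prn_outputs.shape
    flat = prn_outputs.reshape(n, -1)

    # Cluster the flattened pose maps; the number of clusters is arbitrary.
    kmeans = KMeans(n_clusters=n_clusters, random_state=0).fit(flat)
    centres = kmeans.cluster_centers_.reshape(n_clusters, h, w, c)

    # Turn each cluster centre into a "sample pose" by taking the most likely
    # (y, x) location of every keypoint channel.
    peak_idx = centres.reshape(n_clusters, h * w, c).argmax(axis=1)       # [K, 17]
    peak_yx = np.stack(np.unravel_index(peak_idx, (h, w)), axis=-1)       # [K, 17, 2]
    return centres, peak_yx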

This repo is a scam

Don't waste your time dealing with the code, because I did it for you.
I've made an input-label collage. As you can see, the network learns to blur the (slightly blurred and filtered) labels passed to it as input.
[input-label collage image]

Corrupted tar archive

Hello,
Although the name of the issue is self-explanatory, I'll add a few details here:

  • The archive is 806 MB, which seems a bit large, especially since the .pth file of a trained RetinaNet is less than 200 MB
  • The file can't be opened, which is a shame; I'd love to replicate your results :)

I hope you will be able to help, have a nice day!

Keypoint Estimation Subnet

Does this code contain the implementation of the Keypoint Estimation Subnet? And how can a loss be added at each level of the K features in the Keypoint Estimation Subnet? Thanks!
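This repository only contains the PRN itself, so the Keypoint Estimation Subnet is not included. As a hedged sketch of one common way to supervise every level of the K pyramid features (not the authors' implementation; all names and sizes below are assumptions): attach a small prediction head to each level and sum the per-level heatmap losses.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiLevelKeypointLoss(nn.Module):
    """Sum an L2 heatmap loss over predictions made at every pyramid level."""

    def __init__(self, num_keypoints=17, in_channels=256, num_levels=4):
        super().__init__()
        # One lightweight prediction head per pyramid level (e.g. K2..K5).
        self.heads = nn.ModuleList(
            nn.Conv2d(in_channels, num_keypoints, kernel_size=1)
            for _ in range(num_levels)
        )

    def forward(self, pyramid_feats, gt_heatmaps):
        # pyramid_feats: list of [N, C, Hi, Wi]; gt_heatmaps: [N, K, H, W].
        total = 0.0
        for head, feat in zip(self.heads, pyramid_feats):
            pred = head(feat)
            # Resize the ground truth to this level's resolution, add its loss.
            gt = F.interpolate(gt_heatmaps, size=pred.shape[-2:],
                               mode='bilinear', align_corners=False)
            total = total + F.mse_loss(pred, gt)
        return total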

Zero division by bbox[3]

Hi,
I got a zero-division error during training; I'm wondering when bbox[3] can have a zero value.
Please help me solve this issue, thanks a lot!


index created!
14%|██████████████████▊ | 9072/64115 [11:40<52:52, 17.35it/s]Traceback (most recent call last):
File "train.py", line 81, in
main(option)
File "train.py", line 68, in main
Evaluation(model, opt)
File "~/pose_residual_network.pytorch/src/eval.py", line 221, in Evaluation
y_scale = float(h) / math.ceil(b[3])
ZeroDivisionError: float division by zero
14%|██████████████████▌ | 9072/64115 [11:40<1:10:52, 12.95it/s]
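The division can only fail when math.ceil(b[3]) is 0, i.e. when the annotation's box height b[3] is 0 (or negative), so the box is degenerate. A minimal workaround, sketched as a helper (the function name is illustrative; the names h and b come from the traceback above): skip such boxes instead of dividing by zero.

import math

def bbox_scales(map_width, map_height, b):
    # b is a COCO box [x, y, w, h]. Return (x_scale, y_scale), or None when the
    # box is degenerate, so the caller can skip that annotation/detection.
    bw, bh = math.ceil(b[2]), math.ceil(b[3])
    if bw <= 0 or bh <= 0:
        return None
    return float(map_width) / bw, float(map_height) / bh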

Training Result not well

Dear salihkaragoz,
Thanks for your code. The results I get from training with your code do not seem to match the ones reported in your repo. Are some options missing?

Thanks for your reply!

Total Step: 17372 | Total Epoch: 16
17372it [05:57, 48.62it/s] | Epoch: 15 Total: 1:59:03 | ETA: 18:15:13 | loss:0.00240877992474
------------Evaulation Started------------
loading annotations into memory...
Done (t=0.17s)
creating index...
index created!
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2693/2693 [01:22<00:00, 32.71it/s]
Loading and preparing results...
DONE (t=0.11s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type keypoints
DONE (t=4.03s).
Accumulating evaluation results...
DONE (t=0.05s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.852
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.967
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.884
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.835
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = 0.889
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.886

Find some weird code in eval.py

Hi, thanks for your work. While looking at the code of this repository, I found something weird in eval.py. When getting the predicted bbox_keypoints, you use the ground-truth keypoints to assign the bbox_keypoints. The relevant code in eval.py is around lines 200 and 205.

The peaks are the ground-truth keypoint coordinates, is that right? It seems the true coordinates are used to assign the predicted bbox_keypoints. I actually think lines 209~220 in eval.py are the right way to get the real predicted bbox_keypoints.

Maybe you can give me some advice about this, thanks.

Downloadable weights

Hi. I'm very interested in this implementation. For now I'm gonna try training myself.

But do you think you'll publish some downloadable weights that reach the scores reported in the paper? I'd be very interested.

Thanks.

Hey @VladislavZavadskyy,

Could you clearly describe your problem instead of making a groundless judgment? This repo isn't the full pipeline of the things we introduced in our recent paper, just a demo of the main contribution to help people understand the idea. If you can state your issue in a concise way, maybe we can guide you to the correct resources.

Thanks,

Originally posted by @mkocabas in #14 (comment)

Yes, we would like you to describe exactly the training methodology, your preprocessing pipeline, and your hyperparameter tuning strategy. Or you could just release the pretrained model that you have been promising for a long time.

Confusion about input and label

The dataloader for the COCO dataset shows how the data are used in network training. But when I check the dataloader function, the weights and the output are actually the same (dataloader.py, lines 43-64 and lines 75-95); the input is the weights variable from the gendata function. Why does the input come from the known keypoint position information rather than the actual image data? That is clearly different from what you describe in the paper. It means you use the known labels to predict the known labels. Does that make sense? If I have misunderstood the code, please let me know.

# Excerpt from dataloader.py: kpx/kpy/kpv, x, y, x_scale, y_scale, bbox,
# output and weights are set up earlier in the function.

# Build the label ("output") from the primary person's ground-truth keypoints.
for j in range(17):
    if kpv[j] > 0:
        x0 = int((kpx[j] - x) * x_scale)
        y0 = int((kpy[j] - y) * y_scale)

        if x0 >= self.bbox_width and y0 >= self.bbox_height:
            output[self.bbox_height - 1, self.bbox_width - 1, j] = 1
        elif x0 >= self.bbox_width:
            output[y0, self.bbox_width - 1, j] = 1
        elif y0 >= self.bbox_height:
            try:
                output[self.bbox_height - 1, x0, j] = 1
            except:
                output[self.bbox_height - 1, 0, j] = 1
        elif x0 < 0 and y0 < 0:
            output[0, 0, j] = 1
        elif x0 < 0:
            output[y0, 0, j] = 1
        elif y0 < 0:
            output[0, x0, j] = 1
        else:
            output[y0, x0, j] = 1

# Build the network input ("weights") from the ground-truth keypoints of every
# person annotated in the same image whose keypoints fall near the bbox.
img_id = ann_data['image_id']
img_data = coco.loadImgs(img_id)[0]
ann_data = coco.loadAnns(coco.getAnnIds(img_data['id']))

for ann in ann_data:
    kpx = ann['keypoints'][0::3]
    kpy = ann['keypoints'][1::3]
    kpv = ann['keypoints'][2::3]

    for j in range(17):
        if kpv[j] > 0:
            if (kpx[j] > bbox[0] - bbox[2] * self.threshold and kpx[j] < bbox[0] + bbox[2] * (1 + self.threshold)):
                if (kpy[j] > bbox[1] - bbox[3] * self.threshold and kpy[j] < bbox[1] + bbox[3] * (1 + self.threshold)):
                    x0 = int((kpx[j] - x) * x_scale)
                    y0 = int((kpy[j] - y) * y_scale)

                    if x0 >= self.bbox_width and y0 >= self.bbox_height:
                        weights[self.bbox_height - 1, self.bbox_width - 1, j] = 1
                    elif x0 >= self.bbox_width:
                        weights[y0, self.bbox_width - 1, j] = 1
                    elif y0 >= self.bbox_height:
                        weights[self.bbox_height - 1, x0, j] = 1
                    elif x0 < 0 and y0 < 0:
                        weights[0, 0, j] = 1
                    elif x0 < 0:
                        weights[y0, 0, j] = 1
                    elif y0 < 0:
                        weights[0, x0, j] = 1
                    else:
                        weights[y0, x0, j] = 1

# Both the input ("weights") and the label ("output") are blurred binary keypoint
# maps built from ground-truth annotations, not image pixels.
for t in range(17):
    weights[:, :, t] = gaussian(weights[:, :, t])
output = gaussian(output, sigma=2, mode='constant', multichannel=True)
# weights = gaussian_multi_input_mp(weights)
# output = gaussian_multi_output(output)
return weights, output

Reproducing results reported by paper

Given that the full network & training flow has not been released by the authors, did anyone actually succeed in fully reproducing the results reported in the paper (both the accuracy and the speed of 23 FPS)? Either DL framework is OK. Thank you.

release all the code

Thanks for sharing your code!
I am very interested in it and want to train your network from scratch. Can you release all the code (backbone, keypoint subnet, person detection subnet, and pose residual network)? Thank you very much!

Licensing

Hello,

What is the license of this model?

Thank you

Doubt about our keypoints + our bbox?

Hi @salihkaragoz, thank you for your novel work. I have some doubts about running a real test.

PRN is trained on ground-truth keypoints and boxes. When we test with our own predicted keypoints and boxes, do we need to train it again, or can we just use the same model from this repository? If we do need to train it with our keypoints + our bboxes, how can we prepare the input and target?
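Not the authors' answer, but as a rough sketch of how the PRN input could be built at test time from a detector box and keypoint-subnet peaks, mirroring the ground-truth pipeline in dataloader.py (the function name, the 56x36 map size, and the peak format are all assumptions):

import numpy as np
from skimage.filters import gaussian

def prn_input_from_predictions(box, keypoints, map_w=36, map_h=56, num_kp=17):
    # box: [x, y, w, h] from the person detector.
    # keypoints: iterable of (x, y, keypoint_id) peaks from the keypoint subnet.
    x, y, w, h = box
    x_scale = map_w / max(w, 1e-6)
    y_scale = map_h / max(h, 1e-6)

    inp = np.zeros((map_h, map_w, num_kp), dtype=np.float32)
    for kx, ky, j in keypoints:
        x0 = int((kx - x) * x_scale)
        y0 = int((ky - y) * y_scale)
        if 0 <= x0 < map_w and 0 <= y0 < map_h:
            inp[y0, x0, j] = 1.0

    # Blur each channel, since the training input is blurred in dataloader.py.
    for j in range(num_kp):
        inp[:, :, j] = gaussian(inp[:, :, j])
    return inp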
