alexppppp / keypoint_rcnn_training_pytorch

Stars: 75 | Watchers: 2 | Forks: 29 | Size: 29 MB

How to Train a Custom Keypoint Detection Model with PyTorch (Article on Medium)

Home Page: https://medium.com/@alexppppp/how-to-train-a-custom-keypoint-detection-model-with-pytorch-d9af90e111da

License: MIT License

Languages: Python 76.67%, Jupyter Notebook 23.33%
Topics: keypoints, pytorch, deep-learning, keypoint-detection, object-detection, neural-network, rcnn-model, python, computer-vision, computer-vision-datasets

keypoint_rcnn_training_pytorch's People

Contributors

alexppppp


keypoint_rcnn_training_pytorch's Issues

training error

ValueError: operands could not be broadcast together with shapes (2,) (17,)
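The (2,) vs (17,) shapes indicate that the COCO evaluator is still using its default 17-element kpt_oks_sigmas (one per COCO human keypoint) while this dataset defines only 2 keypoints. A minimal sketch of the fix, following the coco_eval.py edit described in a later issue on this page, assuming coco_gt and iou_type are in scope as in this repo's coco_eval.py:

import numpy as np
from pycocotools.cocoeval import COCOeval

# One OKS sigma per keypoint in the custom dataset (2 here), not COCO's 17
coco_eval = COCOeval(coco_gt, iouType=iou_type)
coco_eval.params.kpt_oks_sigmas = np.array([.5, .5]) / 10.0
self.coco_eval[iou_type] = coco_eval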

Training on a dataset that also contains images without annotations gives an error.

I'm trying to train the Keypoint R-CNN model on a dataset in which some images have annotations and some do not. For the unannotated images the annotation file looks like {'bboxes': [], 'keypoints': []}, which raises the error "too many indices for tensor of dimension 1" while calculating target["area"]. How can I make this code also work for images without annotations?
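One common way to handle this (a sketch, not from the repo; the tensor shapes follow torchvision's detection API, and num_keypoints is an assumed variable) is to return zero-row tensors with the shapes the model expects when an image has no objects:

import torch

if len(bboxes) == 0:
    # Empty image: zero-row tensors keep target["area"] and the
    # loss heads working even with no objects present
    target["boxes"] = torch.zeros((0, 4), dtype=torch.float32)
    target["labels"] = torch.zeros((0,), dtype=torch.int64)
    target["keypoints"] = torch.zeros((0, num_keypoints, 3), dtype=torch.float32)
    target["area"] = torch.zeros((0,), dtype=torch.float32)
    target["iscrowd"] = torch.zeros((0,), dtype=torch.int64)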

How to train on my own dataset?

Congratulations on this work; it's a nice project. I followed your tutorial for training, and it works when I have two keypoints. But with 10 keypoints it didn't work when I hadn't marked all of the keypoints in every image. How can I handle this?
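The standard COCO convention (not specific to this repo) is to keep all 10 keypoints per object and mark the unlabeled ones with visibility 0; torchvision's keypoint loss ignores keypoints whose visibility is 0. A minimal sketch with made-up coordinates:

import torch

# Each keypoint is [x, y, visibility]; v=0 means "not labeled" and is
# excluded from the keypoint loss, v=1 or 2 means labeled
keypoints = torch.tensor([[[120.0, 80.0, 1.0],   # labeled keypoint
                           [0.0, 0.0, 0.0]]])    # missing keypoint: x=y=0, v=0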

Training on custom dataset

Hi, first of all thank you for sharing your great work. I am trying to use a modified version of your code to train on a custom dataset with a different number of keypoints, training for 5 epochs as in your example. I modified the code to use 6 keypoints: ClassDataset and the evaluation code (kpt_oks_sigmas for 6 keypoints).
I obtain a model with good detection of the object (bounding box), but the keypoints are not well located. For example:

[Two images: model prediction ("left-foot3") vs. target annotation]

These images show that the detection is being trained but the keypoints are not. Moreover, at the evaluation phase I always get values equal to zero:

IoU metric: keypoints
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.75 | area= all | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets= 20 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets= 20 ] = -1.000

I have reviewed the dataset and it seems well annotated, so I don't know where the error is. Any ideas? Thank you in advance.
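A generic sanity check (a sketch assuming the dataset returns images as CHW float tensors and targets in torchvision's format, as in this repo's ClassDataset) is to draw the ground-truth keypoints straight from the Dataset; swapped x/y coordinates or zeroed visibility flags are common causes of keypoint AP stuck at zero:

import cv2

img, target = dataset[0]
img = (img.permute(1, 2, 0).numpy() * 255).astype('uint8').copy()
for obj_kpts in target["keypoints"]:
    for x, y, v in obj_kpts.tolist():
        if v > 0:  # only draw labeled keypoints
            cv2.circle(img, (int(x), int(y)), 3, (0, 255, 0), -1)
cv2.imwrite('gt_check.png', img)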

.pth to .pt TorchScript conversion support for your model in mobile applications

import torch
import torchvision
from torchvision.models.detection.anchor_utils import AnchorGenerator

# Load the pre-trained model from the .pth file

images, targets = next(iterator)
images = list(image.to(device) for image in images)

with torch.no_grad():
    model.to(device)
    model.eval()
    output = model(images)

print("Predictions: \n", output)

# Trace the model using TorchScript
traced_model = torch.jit.trace(model, output)
# traced_model = torch.jit.trace(model, example_input)

# Save the traced model to a .pt file
traced_model.save('traced_model.pt')

AttributeError: 'str' object has no attribute 'shape'
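The trace call above passes the model's output as the example input, which is one reason tracing fails here. More fundamentally, torchvision's detection models (including KeypointRCNN) take list inputs and return dicts, so they support scripting rather than tracing. A hedged sketch (behavior depends on the torchvision version):

import torch

model.eval()
scripted_model = torch.jit.script(model)  # detection models are scriptable, not traceable
scripted_model.save('scripted_model.pt')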

Results do not correspond to current coco set

Hi,
First, thanks for your really nice tutorial!
I tried to reproduce it and hit an error at the call to the evaluate function at the end of the training loop:

assert set(annsImgIds) == (set(annsImgIds) & set(self.getImgIds()))
'Results do not correspond to current coco set'

After some exploration, I managed to fix it by replacing line 100 in the engine.py script.
I replaced
res = {target["image_id"]: output for target, output in zip(targets, outputs)}
with the line
res = {(target["image_id"]).tolist()[0]: output for target, output in zip(targets, outputs)}

The new line casts the image_id (formerly a tensor, due to the ClassDataset definition) into an int, to match the ids of the coco_evaluator.
Not sure if it was the best way to fix this issue, but it might be enough, and it could help anyone else facing it!
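For what it's worth, the reference engine.py shown in a traceback further down this page performs the same cast with .item(), which is equivalent for a one-element tensor:

res = {target["image_id"].item(): output for target, output in zip(targets, outputs)}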

Licence Addition Request

Could you add an open license, like an MIT or Apache license, to enable use in open-source projects and elsewhere?

Training on multiple classes

Hi, first of all, thank you for sharing your excellent work. I tried to modify the code to train multi-class detection.
In this line, can I change it to use my own multiple classes? Thank you.
bboxes_labels_original = ['Glue tube' for _ in bboxes_original]
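In principle, yes; a minimal sketch of one way to do it (the second class name here is a made-up example). The label list becomes per-box class names, names map to integer ids with 0 reserved for background, and num_classes passed to keypointrcnn_resnet50_fpn must equal the number of object classes plus one:

# Hypothetical two-class example; label 0 is reserved for background
CLASS_TO_IDX = {'Glue tube': 1, 'Tape': 2}

bboxes_labels_original = ['Glue tube' for _ in bboxes_original]  # per-box class names
labels = [CLASS_TO_IDX[name] for name in bboxes_labels_original]
# ... and pass num_classes=3 to the model (background + 2 object classes)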

labeling tool

Thank you for your work.

Which labeling and annotation tool do you use?

Inference speed of the model is slow

Hello,

I appreciate the excellent work; I have been using the model for research, but I am struggling with high latency when running inference. I am trying to reach production-level speed during testing (a good target would be around 1 s for prediction on one image). It would be helpful to understand how to reach high speed when running prediction.

Thank you
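A few generic PyTorch inference practices (not specific to this repo) usually help before reaching for heavier tools like quantization or ONNX export:

import torch

model.eval()              # disable training-time behavior such as dropout
model.to('cuda')          # run on the GPU if one is available
with torch.no_grad():     # skip autograd bookkeeping during inference
    predictions = model([image.to('cuda')])  # image: CHW float tensor

Keeping the input resolution as small as accuracy allows also matters, since detection runtime grows quickly with image size.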

How do I get `evaluate()` to work?

First off, thanks for the tutorial! You're a real lifesaver. I'm trying to get evaluate to work at the moment. I have num_keypoints=4. The model trains smoothly when I just comment out evaluate during training, but I need to evaluate the performance now so I'm trying to get that to work. I did what you instructed in your tutorial:

Update. It's possible to avoid editing the pycocotools/cocoeval.py file in the pycocotools library to change kpt_oks_sigmas, and instead edit the coco_eval.py file, as Diogo Santiago suggested:

# self.coco_eval[iou_type] = COCOeval(coco_gt, iouType=iou_type)
coco_eval = COCOeval(coco_gt, iouType=iou_type)
coco_eval.params.kpt_oks_sigmas = np.array([.5, .5]) / 10.0
self.coco_eval[iou_type] = coco_eval

Since I have num_keypoints=4, I instead wrote np.array([.5, .5, .5, .5]) / 10.0 for kpt_oks_sigmas.

But I'm still getting the same error: ValueError: operands could not be broadcast together with shapes (4,) (17,)

Hoping you can help me with this! 🀞
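If the edited sigmas are not taking effect, one thing worth checking (a guess, not a confirmed diagnosis) is whether the coco_eval.py being imported is actually the edited copy; a stale duplicate elsewhere on the Python path would keep the default 17 sigmas alive:

import coco_eval
print(coco_eval.__file__)  # should point at the edited file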

I met an error when training with this code.

I used the glue dataset and my own custom dataset, but hit the same issue:
File "D:/program/keypoint_rcnn_training_pytorch-main/trainer.py", line 164, in
train_one_epoch(model, optimizer, data_loader_train, device, epoch, print_freq=1000)
File "D:\program\keypoint_rcnn_training_pytorch-main\engine.py", line 31, in train_one_epoch
loss_dict = model(images, targets)
File "E:\python38\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "E:\python38\lib\site-packages\torchvision\models\detection\generalized_rcnn.py", line 99, in forward
detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets)
File "E:\python38\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "E:\python38\lib\site-packages\torchvision\models\detection\roi_heads.py", line 740, in forward
assert t["labels"].dtype == torch.int64, "target labels must of int64 type"
KeyError: 'labels'
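This KeyError means at least one target dict passed to the model is missing its "labels" key. torchvision's detection API expects every target to contain at minimum boxes, labels, and keypoints; a sketch of the expected structure, where bboxes and keypoints are assumed to come from the annotation file:

import torch

target = {
    "boxes": torch.as_tensor(bboxes, dtype=torch.float32),            # (N, 4)
    "labels": torch.as_tensor([1] * len(bboxes), dtype=torch.int64),  # (N,), single object class
    "keypoints": torch.as_tensor(keypoints, dtype=torch.float32),     # (N, K, 3)
}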

I met this error when training with this code.

Hi, @alexppppp .

I met this error when training with this code.

I used my own dataset, which has 6 keypoints and 1 box per image.


Epoch: [0] [ 0/56] eta: 0:41:06 lr: 0.000019 loss: 9.5321 (9.5321) loss_classifier: 0.7595 (0.7595) loss_box_reg: 0.0000 (0.0000) loss_keypoint: 8.0790 (8.0790) loss_objectness: 0.6910 (0.6910) loss_rpn_box_reg: 0.0026 (0.0026) time: 44.0517 data: 0.0325
Epoch: [0] [55/56] eta: 0:00:44 lr: 0.001000 loss: 7.7045 (8.4640) loss_classifier: 0.0419 (0.2365) loss_box_reg: 0.0010 (0.0107) loss_keypoint: 7.2650 (7.6186) loss_objectness: 0.4400 (0.5934) loss_rpn_box_reg: 0.0027 (0.0048) time: 42.6397 data: 0.0408
Epoch: [0] Total time: 0:41:48 (44.7877 s / it)
creating index...
index created!

[W ParallelNative.cpp:214] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)


ValueError Traceback (most recent call last)
Input In [17], in
21 train_one_epoch(model, optimizer, data_loader_train, device, epoch, print_freq=1000)
22 lr_scheduler.step()
---> 23 evaluate(model, data_loader_test, device)
25 # Save model weights after training
26 torch.save(model.state_dict(), './weights/nz_krcnn_res50.pth')

File /usr/local/lib/python3.8/site-packages/torch/autograd/grad_mode.py:28, in _DecoratorContextManager.call..decorate_context(*args, **kwargs)
25 @functools.wraps(func)
26 def decorate_context(*args, **kwargs):
27 with self.class():
---> 28 return func(*args, **kwargs)

File ~/Music/dataZ/meca_nz_krcnn/engine.py:102, in evaluate(model, data_loader, device)
100 res = {target["image_id"].item(): output for target, output in zip(targets, outputs)}
101 evaluator_time = time.time()
--> 102 coco_evaluator.update(res)
103 evaluator_time = time.time() - evaluator_time
104 metric_logger.update(model_time=model_time, evaluator_time=evaluator_time)

File ~/Music/dataZ/meca_nz_krcnn/coco_eval.py:43, in CocoEvaluator.update(self, predictions)
41 coco_eval.cocoDt = coco_dt
42 coco_eval.params.imgIds = list(img_ids)
---> 43 img_ids, eval_imgs = evaluate(coco_eval)
45 self.eval_imgs[iou_type].append(eval_imgs)

File ~/Music/dataZ/meca_nz_krcnn/coco_eval.py:194, in evaluate(imgs)
192 def evaluate(imgs):
193 with redirect_stdout(io.StringIO()):
--> 194 imgs.evaluate()
195 return imgs.params.imgIds, np.asarray(imgs.evalImgs).reshape(-1, len(imgs.params.areaRng), len(imgs.params.imgIds))

File /usr/local/lib/python3.8/site-packages/pycocotools/cocoeval.py:148, in COCOeval.evaluate(self)
146 elif p.iouType == 'keypoints':
147 computeIoU = self.computeOks
--> 148 self.ious = {(imgId, catId): computeIoU(imgId, catId)
149 for imgId in p.imgIds
150 for catId in catIds}
152 evaluateImg = self.evaluateImg
153 maxDet = p.maxDets[-1]

File /usr/local/lib/python3.8/site-packages/pycocotools/cocoeval.py:148, in (.0)
146 elif p.iouType == 'keypoints':
147 computeIoU = self.computeOks
--> 148 self.ious = {(imgId, catId): computeIoU(imgId, catId)
149 for imgId in p.imgIds
150 for catId in catIds}
152 evaluateImg = self.evaluateImg
153 maxDet = p.maxDets[-1]

File /usr/local/lib/python3.8/site-packages/pycocotools/cocoeval.py:229, in COCOeval.computeOks(self, imgId, catId)
227 dx = np.max((z, x0-xd),axis=0)+np.max((z, xd-x1),axis=0)
228 dy = np.max((z, y0-yd),axis=0)+np.max((z, yd-y1),axis=0)
--> 229 e = (dx**2 + dy**2) / vars / (gt['area']+np.spacing(1)) / 2
230 if k1 > 0:
231 e=e[vg > 0]

ValueError: operands could not be broadcast together with shapes (6,) (2,)


What am I doing wrong?

Thanks in advance.
Best,
@bemoregt.
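The shapes (6,) (2,) are the reverse of the earlier 17-keypoint mismatch: the ground truth now has 6 keypoints per object while kpt_oks_sigmas still has only 2 entries, presumably left over from the 2-keypoint glue example. A sketch of the matching fix in coco_eval.py:

import numpy as np

# One OKS sigma per keypoint: 6 entries for a 6-keypoint dataset
coco_eval.params.kpt_oks_sigmas = np.array([.5] * 6) / 10.0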

How to load the custom-trained model for inference on a new image set

Hi, I trained a model using custom data, but now I am unable to load that custom model (.pth) for prediction.

Could you please provide an inference example, basically the method to load the model back?

The get_model function was:

def get_model(num_keypoints, weights_path=None):
    anchor_generator = AnchorGenerator(sizes=(32, 64, 128, 256, 512),
                                       aspect_ratios=(0.25, 0.5, 0.75, 1.0, 2.0, 3.0, 4.0))
    model = torchvision.models.detection.keypointrcnn_resnet50_fpn(pretrained=False,
                                                                   pretrained_backbone=True,
                                                                   num_keypoints=num_keypoints,  # 4 in my case
                                                                   num_classes=3,  # Background is the first class, object is the second class
                                                                   rpn_anchor_generator=anchor_generator)

    if weights_path:
        state_dict = torch.load(weights_path)
        model.load_state_dict(state_dict)

    return model

Model Saving

torch.save(model.state_dict(), 'custom_keypointsrcnn_weights.pth')

Now the Inference Part

import torch
import torchvision
from torchvision.models.detection.rpn import AnchorGenerator
anchor_generator = AnchorGenerator(sizes=(32, 64, 128, 256, 512), aspect_ratios=(0.25, 0.5, 0.75, 1.0, 2.0, 3.0, 4.0))
m = torchvision.models.detection.keypointrcnn_resnet50_fpn(pretrained=False,pretrained_backbone=False,num_keypoints=4,num_classes = 3, rpn_anchor_generator=anchor_generator)

m.load_state_dict(torch.load("custom_keypointsrcnn_weights.pth"))
Traceback (most recent call last):
File "", line 1, in
File "C:\Python39\lib\site-packages\torch\nn\modules\module.py", line 1406, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for KeypointRCNN:
Missing key(s) in state_dict: "backbone.fpn.inner_blocks.0.weight", "backbone.fpn.inner_blocks.0.bias", "backbone.fpn.inner_blocks.1.weight", "backbone.fpn.inner_blocks.1.bias", "backbone.fpn.inner_blocks.2.weight", "backbone.fpn.inner_blocks.2.bias", "backbone.fpn.inner_blocks.3.weight", "backbone.fpn.inner_blocks.3.bias", "backbone.fpn.layer_blocks.0.weight", "backbone.fpn.layer_blocks.0.bias", "backbone.fpn.layer_blocks.1.weight", "backbone.fpn.layer_blocks.1.bias", "backbone.fpn.layer_blocks.2.weight", "backbone.fpn.layer_blocks.2.bias", "backbone.fpn.layer_blocks.3.weight", "backbone.fpn.layer_blocks.3.bias", "rpn.head.conv.weight", "rpn.head.conv.bias".
Unexpected key(s) in state_dict: "backbone.fpn.inner_blocks.0.0.weight", "backbone.fpn.inner_blocks.0.0.bias", "backbone.fpn.inner_blocks.1.0.weight", "backbone.fpn.inner_blocks.1.0.bias", "backbone.fpn.inner_blocks.2.0.weight", "backbone.fpn.inner_blocks.2.0.bias", "backbone.fpn.inner_blocks.3.0.weight", "backbone.fpn.inner_blocks.3.0.bias", "backbone.fpn.layer_blocks.0.0.weight", "backbone.fpn.layer_blocks.0.0.bias", "backbone.fpn.layer_blocks.1.0.weight", "backbone.fpn.layer_blocks.1.0.bias", "backbone.fpn.layer_blocks.2.0.weight", "backbone.fpn.layer_blocks.2.0.bias", "backbone.fpn.layer_blocks.3.0.weight", "backbone.fpn.layer_blocks.3.0.bias", "rpn.head.conv.0.0.weight", "rpn.head.conv.0.0.bias".

I even tried with the get_model function and got the same error:

Traceback (most recent call last):
File "D:\Projects\custom_KP\inference_kp.py", line 47, in
model = get_model(4,'custom_keypointsrcnn_weights.pth')
File "D:\Projects\custom_KP\inference_kp.py", line 35, in get_model
model.load_state_dict(state_dict)
File "C:\Python39\lib\site-packages\torch\nn\modules\module.py", line 1406, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for KeypointRCNN:
Missing key(s) in state_dict: "backbone.fpn.inner_blocks.0.weight", "backbone.fpn.inner_blocks.0.bias", "backbone.fpn.inner_blocks.1.weight", "backbone.fpn.inner_blocks.1.bias", "backbone.fpn.inner_blocks.2.weight", "backbone.fpn.inner_blocks.2.bias", "backbone.fpn.inner_blocks.3.weight", "backbone.fpn.inner_blocks.3.bias", "backbone.fpn.layer_blocks.0.weight", "backbone.fpn.layer_blocks.0.bias", "backbone.fpn.layer_blocks.1.weight", "backbone.fpn.layer_blocks.1.bias", "backbone.fpn.layer_blocks.2.weight", "backbone.fpn.layer_blocks.2.bias", "backbone.fpn.layer_blocks.3.weight", "backbone.fpn.layer_blocks.3.bias", "rpn.head.conv.weight", "rpn.head.conv.bias".
Unexpected key(s) in state_dict: "backbone.fpn.inner_blocks.0.0.weight", "backbone.fpn.inner_blocks.0.0.bias", "backbone.fpn.inner_blocks.1.0.weight", "backbone.fpn.inner_blocks.1.0.bias", "backbone.fpn.inner_blocks.2.0.weight", "backbone.fpn.inner_blocks.2.0.bias", "backbone.fpn.inner_blocks.3.0.weight", "backbone.fpn.inner_blocks.3.0.bias", "backbone.fpn.layer_blocks.0.0.weight", "backbone.fpn.layer_blocks.0.0.bias", "backbone.fpn.layer_blocks.1.0.weight", "backbone.fpn.layer_blocks.1.0.bias", "backbone.fpn.layer_blocks.2.0.weight", "backbone.fpn.layer_blocks.2.0.bias", "backbone.fpn.layer_blocks.3.0.weight", "backbone.fpn.layer_blocks.3.0.bias", "rpn.head.conv.0.0.weight", "rpn.head.conv.0.0.bias".

I tried many combinations of pretrained=False/True and pretrained_backbone=False/True, but to no avail.
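The missing and unexpected keys differ only by an extra ".0" (e.g. backbone.fpn.inner_blocks.0.weight vs backbone.fpn.inner_blocks.0.0.weight), which points to a torchvision version mismatch: newer torchvision releases (around 0.12) wrapped the FPN and RPN-head convolutions in container modules, inserting that extra index into the state_dict keys. The clean fix is to load the weights under the same torchvision version used for training. A hedged workaround, tailored only to the keys shown in this traceback (m is the model built above), is to remap the keys before loading:

import torch

state_dict = torch.load("custom_keypointsrcnn_weights.pth")
remapped = {}
for k, v in state_dict.items():
    new_k = k
    if ".fpn.inner_blocks." in k or ".fpn.layer_blocks." in k:
        # "...blocks.N.0.weight" -> "...blocks.N.weight"
        head, tail = k.rsplit(".0.", 1)
        new_k = head + "." + tail
    elif k.startswith("rpn.head.conv.0.0."):
        new_k = k.replace("rpn.head.conv.0.0.", "rpn.head.conv.")
    remapped[new_k] = v
m.load_state_dict(remapped)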
