daniil-osokin / lightweight-human-pose-estimation.pytorch

Fast and accurate human pose estimation in PyTorch. Contains implementation of "Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose" paper.

License: Apache License 2.0

Python 100.00%
human-pose-estimation deep-learning real-time openpose openvino coco-keypoints-detection mscoco-keypoint lightweight pytorch lightweight-openpose

lightweight-human-pose-estimation.pytorch's Introduction

Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose

This repository contains training code for the paper Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose. This work heavily optimizes the OpenPose approach to reach real-time inference on CPU with a negligible accuracy drop. It detects a skeleton (which consists of keypoints and connections between them) to identify human poses for every person inside the image. The pose may contain up to 18 keypoints: ears, eyes, nose, neck, shoulders, elbows, wrists, hips, knees, and ankles. On the COCO 2017 Keypoint Detection validation set this code achieves 40% AP for single-scale inference (no flip or any other post-processing). The result can be reproduced using this repository. This repo significantly overlaps with https://github.com/opencv/openvino_training_extensions, however it contains just the code necessary for human pose estimation.

🔥 Check out our new work on accurate (and still fast) single-person pose estimation, which ranked 10th on CVPR'19 Look-Into-Person challenge.

🔥🔥 Check out our lightweight 3D pose estimation, which is based on the Single-Shot Multi-Person 3D Pose Estimation From Monocular RGB paper and this work.

Table of Contents

  • Requirements
  • Prerequisites
  • Training
  • Validation
  • Pre-trained model
  • Conversion to OpenVINO format
  • C++ Demo
  • Python Demo
  • Citation

Requirements

  • Ubuntu 16.04
  • Python 3.6
  • PyTorch 0.4.1 (should also work with 1.0, but not tested)

Prerequisites

  1. Download COCO 2017 dataset: http://cocodataset.org/#download (train, val, annotations) and unpack it to <COCO_HOME> folder.
  2. Install requirements: pip install -r requirements.txt

Training

Training consists of 3 steps (the given AP values are for the full validation dataset):

  • Training from MobileNet weights. Expected AP after this step is ~38%.
  • Training from the weights obtained at the previous step. Expected AP after this step is ~39%.
  • Training from the weights obtained at the previous step, with the number of refinement stages in the network increased to 3. Expected AP after this step is ~40% (for the network with 1 refinement stage; the two extra stages are discarded).
  1. Download pre-trained MobileNet v1 weights mobilenet_sgd_68.848.pth.tar from: https://github.com/marvis/pytorch-mobilenet (sgd option). If this doesn't work, download from GoogleDrive.

  2. Convert train annotations to the internal format. Run python scripts/prepare_train_labels.py --labels <COCO_HOME>/annotations/person_keypoints_train2017.json. It will produce prepared_train_annotation.pkl with annotations converted to the internal format.

    [OPTIONAL] For fast validation it is recommended to make a subset of the validation dataset. Run python scripts/make_val_subset.py --labels <COCO_HOME>/annotations/person_keypoints_val2017.json. It will produce val_subset.json with annotations for just 250 random images (out of 5000).

  3. To train from MobileNet weights, run python train.py --train-images-folder <COCO_HOME>/train2017/ --prepared-train-labels prepared_train_annotation.pkl --val-labels val_subset.json --val-images-folder <COCO_HOME>/val2017/ --checkpoint-path <path_to>/mobilenet_sgd_68.848.pth.tar --from-mobilenet

  4. Next, to train from the checkpoint from the previous step, run python train.py --train-images-folder <COCO_HOME>/train2017/ --prepared-train-labels prepared_train_annotation.pkl --val-labels val_subset.json --val-images-folder <COCO_HOME>/val2017/ --checkpoint-path <path_to>/checkpoint_iter_420000.pth --weights-only

  5. Finally, to train from the checkpoint from the previous step with the number of refinement stages in the network set to 3, run python train.py --train-images-folder <COCO_HOME>/train2017/ --prepared-train-labels prepared_train_annotation.pkl --val-labels val_subset.json --val-images-folder <COCO_HOME>/val2017/ --checkpoint-path <path_to>/checkpoint_iter_280000.pth --weights-only --num-refinement-stages 3. We took the checkpoint after 370000 iterations as the final one.

We did not perform best-checkpoint selection at any step, so a similar result may be achieved after fewer iterations.

Known issue

We observe this error when the maximum number of open files (ulimit -n) equals 1024:

  File "train.py", line 164, in <module>
    args.log_after, args.val_labels, args.val_images_folder, args.val_output_name, args.checkpoint_after, args.val_after)
  File "train.py", line 77, in train
    for _, batch_data in enumerate(train_loader):
  File "/<path>/python3.6/site-packages/torch/utils/data/dataloader.py", line 330, in __next__
    idx, batch = self._get_batch()
  File "/<path>/python3.6/site-packages/torch/utils/data/dataloader.py", line 309, in _get_batch
    return self.data_queue.get()
  File "/<path>/python3.6/multiprocessing/queues.py", line 337, in get
    return _ForkingPickler.loads(res)
  File "/<path>/python3.6/site-packages/torch/multiprocessing/reductions.py", line 151, in rebuild_storage_fd
    fd = df.detach()
  File "/<path>/python3.6/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
  File "/<path>/python3.6/multiprocessing/reduction.py", line 182, in recv_handle
    return recvfds(s, 1)[0]
  File "/<path>/python3.6/multiprocessing/reduction.py", line 161, in recvfds
    len(ancdata))
RuntimeError: received 0 items of ancdata

To get rid of it, increase the limit to a bigger number, e.g. 65536; run in the terminal: ulimit -n 65536
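If you prefer to keep the workaround inside the training script instead of changing the shell limit, here is a minimal sketch (our suggestion, not part of the repository) that raises the soft limit via Python's standard resource module; it could be placed near the top of train.py:

import resource

# Raise the soft limit for open file descriptors.
# Without privileges the soft limit can only be raised up to the hard limit.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (min(65536, hard), hard))

# Another commonly used workaround for this DataLoader error is
# torch.multiprocessing.set_sharing_strategy('file_system').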

Validation

  1. Run python val.py --labels <COCO_HOME>/annotations/person_keypoints_val2017.json --images-folder <COCO_HOME>/val2017 --checkpoint-path <CHECKPOINT>

Pre-trained model

The model expects a normalized image (mean=[128, 128, 128], scale=[1/256, 1/256, 1/256]) in planar BGR format. The model pre-trained on COCO is available at: https://download.01.org/opencv/openvino_training_extensions/models/human_pose_estimation/checkpoint_iter_370000.pth, it has 40% AP on the COCO validation set (38.6% AP on the val subset).
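A minimal preprocessing sketch for illustration (our code, not the repository's; the helper name preprocess is hypothetical), showing how an OpenCV BGR image can be normalized and converted to the planar layout described above; the actual demo additionally resizes the image to the network input height and pads the width, see demo.py:

import cv2
import numpy as np

def preprocess(bgr_image):
    # Subtract the mean and apply the scale described above.
    img = (bgr_image.astype(np.float32) - 128.0) / 256.0
    # HWC (interleaved) -> CHW (planar) BGR, plus a batch dimension.
    return np.transpose(img, (2, 0, 1))[np.newaxis]

image = cv2.imread('person.jpg')   # OpenCV loads images in BGR order
tensor = preprocess(image)         # shape (1, 3, H, W), dtype float32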

Conversion to OpenVINO format

  1. Convert the PyTorch model to ONNX format: run in terminal python scripts/convert_to_onnx.py --checkpoint-path <CHECKPOINT>. It produces human-pose-estimation.onnx (a sketch of the export call is shown after this list).
  2. Convert the ONNX model to OpenVINO format with Model Optimizer: run in terminal python <OpenVINO_INSTALL_DIR>/deployment_tools/model_optimizer/mo.py --input_model human-pose-estimation.onnx --input data --mean_values data[128.0,128.0,128.0] --scale_values data[256] --output stage_1_output_0_pafs,stage_1_output_1_heatmaps. This produces the model human-pose-estimation.xml and the weights human-pose-estimation.bin in single-precision floating-point format (FP32).
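For reference, a minimal sketch of what the ONNX export in step 1 boils down to; this is our illustration based on the standard torch.onnx API, not a copy of scripts/convert_to_onnx.py, and the input size and output names are assumptions (the stage_1 names are inferred from the Model Optimizer command above):

import torch
from models.with_mobilenet import PoseEstimationWithMobileNet
from modules.load_state import load_state

net = PoseEstimationWithMobileNet()
checkpoint = torch.load('checkpoint_iter_370000.pth', map_location='cpu')
load_state(net, checkpoint)
net.eval()

dummy_input = torch.randn(1, 3, 256, 456)  # illustrative input resolution
torch.onnx.export(net, dummy_input, 'human-pose-estimation.onnx',
                  input_names=['data'],
                  # Assumed: one [heatmaps, pafs] pair per stage; the last two
                  # names are the outputs referenced by the Model Optimizer command.
                  output_names=['stage_0_output_1_heatmaps', 'stage_0_output_0_pafs',
                                'stage_1_output_1_heatmaps', 'stage_1_output_0_pafs'])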

C++ Demo

The C++ demo can be found in the Intel® OpenVINO™ toolkit; the corresponding model is human-pose-estimation-0001. Please follow the official instructions to run it.

Python Demo

We provide the Python demo just for a quick preview of the results. Please consider the C++ demo for the best performance. To run the Python demo from a webcam:

  • python demo.py --checkpoint-path <path_to>/checkpoint_iter_370000.pth --video 0

Citation:

If this helps your research, please cite the paper:

@inproceedings{osokin2018lightweight_openpose,
    author={Osokin, Daniil},
    title={Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose},
    booktitle = {arXiv preprint arXiv:1811.12004},
    year = {2018}
}

lightweight-human-pose-estimation.pytorch's People

Contributors

daniil-osokin


lightweight-human-pose-estimation.pytorch's Issues

Fastest version

Dear Daniil,

What is the fastest version? I ran the C++ version but could only get up to 5 FPS on an Intel i5 CPU.
Maybe I am doing something wrong. Do I have to do something special to achieve >20 FPS?

best regards

pose coordinates for each person

@Daniil-Osokin I was looking for pose coordinates of all people present in a frame.
I called the function convert_to_coco_format(pose_entries,all_keypoints) inside the run_demo function. If there are two people in a frame will that give me pose coordinates of each individual?
I checked the output but it is showing coordinates of one person only. Maybe I am wrong. I would appreciate your advice. Thank you.

less than 1 fps speed on cpu without using openvino toolkit

I am getting less than 1 fps on CPU (Core™ i7-6850K) using the pretrained model without the OpenVINO toolkit, but the paper mentions a speed of 26 fps. So why is there such a big performance drop?

I am using the checkpoint_iter_370000.pth pretrained model.

What is AP?

Does it correspond to Average Precision (AP) @[ IoU=0.50 | area= all | maxDets= 20 ]?

Train on custom dataset

Hi, I tried to train the net with a custom dataset created with coco-annotator (https://github.com/jsbroks/coco-annotator) but the function CocoTrainDataset inside coco.py doesn't return anything (len(dataset) is 0 after dataset=CocoTrainDataset(...)).
Any help?
Anyway, is it actually possible to train this model on custom datasets (maybe with fewer keypoints)?
Thanks

use the code on another dataset

I have used Openpose on my images. I want to try this code too and compare the results. What are the steps to do it and is it possible?

about mobilenet_sgd_68.848.pth.tar

Hi! Thanks for your work.
When I train from MobileNet weights, the output is shown below:
[training log screenshot]
There seem to be no pre-trained parameters.

Identify detected People

Thanks for your contribution.
I want to know how I can identify all detected people, for example by labeling them, and then add a rectangular box around each one. For now I am stuck at identifying people; I just know that len(pose_entries) is the number of detected people.

Is it possible to realize this idea? Or could you please give me a suggestion?

Best wishes.

C++ code

Hello Daniil Osokin, Hello everyone :)

I have a little question about the C++ code: I would like to know where I can find it. In fact, I can easily find the Python code and use functions such as convert_to_coco_format. But if I need the same function in C++, where could I find it?

Thank you very much for your feedback, have a nice day :)

mobilenet v2

Hello,
Could you please share the model based on MobileNet v2?

Points information

Thank You, @Daniil-Osokin, for your awesome work! I would like to ask, are there any explanations of the human keypoints? In which order do they appear in the detection output?
Thank You!

caffe train.prototxt

Hi, I have seen the deploy.prototxt in the TF lightweight OpenPose; could you provide the train.prototxt of lightweight OpenPose in Caffe? Thanks.

About the heatmap and paf value

Hi, thanks for your nice work, but I still have some questions about the code.
1) keypoint_map[map_y, map_x] += math.exp(-exponent)
As mentioned in the original paper, it takes the maximum of the confidence maps, but in the code it takes the sum of the confidence maps. So I wonder why you take the sum, maybe it's a trick? I modified this line of code like this: keypoint_map[map_y, map_x] = max(keypoint_map[map_y, map_x], math.exp(-exponent)), and the experiment is still running (see the sketch after this list for the difference between the two aggregations).
2) paf_map[0, y, x] = x_ba
paf_map[1, y, x] = y_ba
The PAF values should take the average, but the code implementation doesn't take the average.

about pre-trained model

Hi, thanks for your nice work. I have downloaded the pre-trained model from the link, but I cannot open it successfully; it looks as if nothing is contained in this .tar file.

ELU instead of ReLU in conv_dw_no_bn

Hi, nice work on this repo!

I'm wondering why you use ELU instead of ReLU in conv_dw_no_bn, while the conv_dw counterpart uses the regular ReLU:

def conv_dw_no_bn(in_channels, out_channels, kernel_size=3, padding=1, stride=1, dilation=1):
    return nn.Sequential(
        nn.Conv2d(in_channels, in_channels, kernel_size, stride, padding, dilation=dilation, groups=in_channels, bias=False),
        nn.ELU(inplace=True),
        nn.Conv2d(in_channels, out_channels, 1, 1, 0, bias=False),
        nn.ELU(inplace=True),
    )

Is there a particular reason to use ELU? I didn't see any mention of the activation function in your paper or the original OpenPose paper.

Thank you!

Humanpose Demo - Shapes issue in Python

Nice work on this repo! I am able to convert the pretrained model to ONNX and then to IR format with no issue (using your README). The model works when running the demo (C++ and shell script ./human-pose), but I am trying to also run it with the Python plugin. Using your code to reshape the input image and then calling the Python plugin, I get the following:
ValueError: could not broadcast input array from shape (1,3,256,384) into shape (1,3,256,456)

Have you run the model from Python on OpenVINO? Just curious (hoping to save myself some time).

Joints coordinates in real-time

Hello Daniil-Osokin, hello everyone! Thank you very much for your work, it's very good. I would like to get the real-time coordinates of each joint and save them in a CSV file. Where in the code could I get them?

I already checked in keypoints.py, but I don't know how I could clearly identify each joint (left wrist, right wrist, left elbow, right elbow, etc.) and get its coordinates in real time.

Thank you very much for your feedback, have a nice day :)

BODY_PARTS_KPT_IDS

BODY_PARTS_KPT_IDS = [[1, 8], [8, 9], [9, 10], [1, 11], [11, 12], [12, 13], [1, 2], [2, 3], [3, 4], [2, 16],
[1, 5], [5, 6], [6, 7], [5, 17], [1, 0], [0, 14], [0, 15], [14, 16], [15, 17]]
Is BODY_PARTS_KPT_IDS in coco.py the same as the skeleton in person_keypoints_train2017.json? Why is n_keypoints 18?

libtorch c++ demo

Hi, I have managed to use the OpenVINO C++ demo; it runs at about 15 fps on CPU, which is not fast enough.

Is there any plan to support a libtorch C++ demo? If not, I want to implement one, but there are some problems for me:

  • In terms of preprocessing, you scale the image according to its height and pad the missing pixels with zeros. I think it's kind of complicated; if I resize the image directly to the target size, say (256, 456), will it work?
  • I managed to trace the model and load it with libtorch, but there are 4 outputs, so I took the last 2 output tensors. I saw the keypoint extraction and grouping method in Python; it's very hard to implement in C++. Are there any snippets that could help directly extract all keypoints per instance from the output tensors?

Hope you reply soon.

Only one person while training

Hi,

When I checked the code, I could see that one image had only one person annotated in the COCO dataset. There was no mask on other portions which may contain other people. If this understanding is correct, how does it work on multiple persons?

I am referring to this line. I could see keypoints available for only one person in the list.

I am asking this because when I run training on the COCO dataset, I get zero detections. Even when I initialize my model with the weights of checkpoint_iter_370000.pth (which works well), after just 5000 iterations (batch size 16) the detections in the live demo become zero.

thanks and regards
skbhat

Unclear code in prepare_train_labels

I am trying to understand the code and I have some queries.

This line

if annotation['num_keypoints'] != 0 and not annotation['iscrowd']:
specifies that if there's a crowd, annotation for this image won't be added to annotations_per_image_mapping.

The line here

indicates that the image will only be added to crowd_segmentations_per_image_mapping if it has a crowd in it.

Hence, I don't think this condition

if image_id in annotations_per_image_mapping:
will ever be true.

Now, even if it were true, we would not need to do this:

annotations_per_image_mapping[image_id][1] = crowd_segmentations

as the dictionary already contains the whole annotation, including the segmentation part, which was added here:

annotations_per_image_mapping[annotation['image_id']][0].append(annotation)

So, I think that we can directly access the segmentation part by using annotation['image_id']][0]['segmentation'].

I am currently looking at COCO 2017 person keypoints json file. Thank you..

Illustration of the transform of original COCO keypoint labels

Hi, this work is very good. However, I am puzzled about this:

for i in range(len(annotation['keypoints']) // 3):
    keypoint = [annotation['keypoints'][i * 3], annotation['keypoints'][i * 3 + 1], 2]
    if annotation['keypoints'][i * 3 + 2] == 1:
        keypoint[2] = 0
    elif annotation['keypoints'][i * 3 + 2] == 2:
        keypoint[2] = 1
    keypoints.append(keypoint)
prepared_annotation['keypoints'] = keypoints

What does this transform do compared to the original COCO label format?

Why set person mask = 0 ?

When preprocessing the COCO data,

mask = np.ones(shape=(label['img_height'], label['img_width']), dtype=np.float32)
mask = get_mask(label['segmentations'], mask)

def get_mask(segmentations, mask):
    for segmentation in segmentations:  # RLE, multi person
        rle = pycocotools.mask.frPyObjects(segmentation, mask.shape[0], mask.shape[1])
        mask[pycocotools.mask.decode(rle) > 0.5] = 0
    return mask

mask[pycocotools.mask.decode(rle) > 0.5] = 0 will make person masks equal to 0, while bg pixels are still 1.

I mean, why reverse 0 and 1?

And in the loss function, will this mask just select the background?

def l2_loss(input, target, mask, batch_size):
    loss = (input - target) * mask
    loss = (loss * loss) / 2 / batch_size
    return loss.sum()
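A quick numeric check (our illustration with hypothetical tensors, repeating the l2_loss above for self-containment) showing that pixels where mask == 0 contribute nothing to the loss:

import torch

def l2_loss(input, target, mask, batch_size):
    loss = (input - target) * mask
    loss = (loss * loss) / 2 / batch_size
    return loss.sum()

pred   = torch.tensor([[1.0, 5.0], [2.0, 7.0]])
target = torch.zeros(2, 2)
mask   = torch.tensor([[1.0, 0.0], [1.0, 0.0]])  # second column masked out

# Only the unmasked pixels (1.0 and 2.0) contribute: (1 + 4) / 2 = 2.5
print(l2_loss(pred, target, mask, batch_size=1))  # tensor(2.5000)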

Issue with adding an extra joint

Dear Sir,

Currently I am doing a research project for my degree. I have a problem with adding an extra point at the middle hip (pelvis). I have changed the part connections by following the parameters from [https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/src/openpose/pose/poseParameters.cpp]
I am using 19 points.
The issue I am now facing is with val.py: when the function group_keypoints(all_keypoints_by_type, pafs, pose_entry_size=19, min_paf_score=0.05, demo=False) runs, my kpts_b always gets the error 'IndexError: list index out of range'.

I hope you can tell me how to add the point, similar to how you add the 'neck' using the COCO dataset.

Thank you so much.

the version of requirements

Hi, I saw that the requirements are:

torch==0.4.1
torchvision==0.2.1
pycocotools==2.0.0
opencv-python==3.4.0.14
numpy==1.14.0

I think they are a little bit old. When I tried to run pip install -r requirements.txt, I got: ERROR: Could not find a version that satisfies the requirement opencv-python==3.4.0.14
I have already installed torch 1.2.0, torchvision 0.3.0, pycocotools 2.0.0, and opencv-python 4.1.0.25. Do those versions work?
Thanks!

Keypoint IDs

@Daniil-Osokin I got the pose coordinates using convert_to_coco_format. I am not sure about the pose coordinate mapping. Is it like the following?

"keypoints": {
        0: "nose",
        1: "left_eye",
        2: "right_eye",
        3: "left_ear",
        4: "right_ear",
        5: "left_shoulder",
        6: "right_shoulder",
        7: "left_elbow",
        8: "right_elbow",
        9: "left_wrist",
        10: "right_wrist",
        11: "left_hip",
        12: "right_hip",
        13: "left_knee",
        14: "right_knee",
        15: "left_ankle",
        16: "right_ankle"
    },

about inference time in a crowded scene

@Daniil-Osokin Thanks for your work.
When I check the inference time in a crowded scene with 30 or more people, the speed is very slow:

net forward ->0.06s
extract_keypoints->0.10s
group_keypoints->2.4s

Is it possible to optimize group_keypoints and extract_keypoints on the GPU?
Thanks~

Error when converting to onnx

Hello, this work is excellent! Now I am trying to run the demo; I first just want to test the inference time. So I downloaded checkpoint_iter_370000.pth and tried to convert it to ONNX, and I got this error:

Traceback (most recent call last):
  File "convert_to_onnx.py", line 30, in <module>
    convert_to_onnx(net, args.output_name)
  File "convert_to_onnx.py", line 16, in convert_to_onnx
    torch.onnx.export(net, input, output_name, verbose=True, input_names=input_names, output_names=output_names)
  File "/home/lin/anaconda3/envs/panet/lib/python3.6/site-packages/torch/onnx/__init__.py", line 25, in export
    return utils.export(*args, **kwargs)
  File "/home/lin/anaconda3/envs/panet/lib/python3.6/site-packages/torch/onnx/utils.py", line 84, in export
    _export(model, args, f, export_params, verbose, training, input_names, output_names)
  File "/home/lin/anaconda3/envs/panet/lib/python3.6/site-packages/torch/onnx/utils.py", line 140, in _export
    trace.set_graph(_optimize_graph(trace.graph(), aten))
  File "/home/lin/anaconda3/envs/panet/lib/python3.6/site-packages/torch/onnx/utils.py", line 95, in _optimize_graph
    graph = torch._C._jit_pass_onnx(graph, aten)
  File "/home/lin/anaconda3/envs/panet/lib/python3.6/site-packages/torch/onnx/__init__.py", line 40, in _run_symbolic_function
    return utils._run_symbolic_function(*args, **kwargs)
  File "/home/lin/anaconda3/envs/panet/lib/python3.6/site-packages/torch/onnx/utils.py", line 368, in _run_symbolic_function
    return fn(g, *inputs, **attrs)
TypeError: elu() got an unexpected keyword argument 'scale' (occurred when translating elu)

Any idea how to solve this?
By the way, I am using pytorch 0.4.0 and python 3.6.6.
Thank you!

About net structure

@Daniil-Osokin
Q1: Does cpm in the network mean Convolutional Pose Machines?
Q2: Where did the cpm idea come from? It seems to be very important for the result.
Q3: In #27, ELU was discussed. Is it a good idea to use ELU in InitialStage and RefinementStage? Because in InitialStage, BN is not used, and in RefinementStage some layers have no BN after ReLU.

using the onnx model with opencv

Hi ;)
I'm trying to use the lightweight-human-pose model on Colab:

!wget 'https://drive.google.com/uc?export=dowload&id=1T2Kq01WXzPMrQdnEOUEiVBhwouW8Pka5' -O pose.onnx

net = cv2.dnn.readNet("pose.onnx")
# load a 256x256 image and send it through the net
net.dumpToFile("pose_dot.txt")
!dot pose_dot.txt -Tpng -opose_dot.png

And the last layer (screenshot of the dumped graph attached):

That looks weird. 38 heatmaps? Shouldn't it have 19 heatmaps and 2x19 PAF maps (= 57)?
Or, if it's meant to be used for single persons, 19 heatmaps only?

I looked at the model:

def __init__(self, num_refinement_stages=1, num_channels=128, num_heatmaps=19, num_pafs=38):

and it should have returned [heatmaps, pafs] here:

but it looks to me like we only get the PAFs returned from the forward pass in OpenCV.

The newly added OpenCV keypoints model tries to loop over all of them, and the results look quite bad ;(

something wrong about visualize dataset

When I visualize the dataset using your code, I found that some keypoints in the picture are misleading, e.g.

[keypoint visualization screenshot]

The image info is:
{'img_paths': '000000181542.jpg', 'img_width': 640, 'img_height': 600, 'objpos': [590.4, 256.405], 'image_id': 181542}

The reason is, the man's left shoulder is marked as "not in image", so its state number is 2, and the right shoulder is visible, so its state number is 0, according to the code:

converted_keypoints.insert(1, [(keypoints[5][0] + keypoints[6][0]) / 2,
                               (keypoints[5][1] + keypoints[6][1]) / 2, 0])  # Add neck as a mean of shoulders
if keypoints[5][2] == 2 and keypoints[6][2] == 2:
    converted_keypoints[1][2] = 2
elif keypoints[5][2] == 3 and keypoints[6][2] == 3:
    converted_keypoints[1][2] = 3
elif keypoints[5][2] == 1 and keypoints[6][2] == 1:
    converted_keypoints[1][2] = 1
if (converted_keypoints[1][0] < 0
        or converted_keypoints[1][0] >= w
        or converted_keypoints[1][1] < 0
        or converted_keypoints[1][1] >= h):
    converted_keypoints[1][2] = 2

converted_keypoints[1][1] is > 0 and converted_keypoints[1][1] is within [0, h], so finally it will insert a neck with state number 0. I changed the code as below and it works as expected:

converted_keypoints.insert(1, [(keypoints[5][0] + keypoints[6][0]) / 2,
                               (keypoints[5][1] + keypoints[6][1]) / 2, 1])  # Add neck as a mean of shoulders
if keypoints[5][2] == 2 or keypoints[6][2] == 2:
    converted_keypoints[1][2] = 2
elif keypoints[5][2] == 0 and keypoints[6][2] == 0:
    converted_keypoints[1][2] = 0

using IP camera for real-time demo

@Daniil-Osokin thanks for sharing your excellent work. I am trying to use an IP camera.
I used

python demo.py --checkpoint-path scripts/checkpoint_iter_370000.pth --video rtsp://root:[email protected]/axis-media/media.amp

I got the following output:
OSError: Video rtsp://root:[email protected]/axis-media/media.amp cannot be opened

It is not able to read the image. Am I doing something wrong here? I am a newbie to CV.
I would really appreciate your advice on this. Thank you

Matching Parts to Pose

When matching the body parts to form the human pose, the pose_entry in modules/keypoints.py does not seem to be merged. Is it possible for the parts to be assembled such that two pose_entries actually belong to the same person, so that when adding in a part it becomes necessary to merge the two pose_entries together?

Image resolution vs. accuracy

When you do the training on COCO dataset, where do you define the image size and how does that map to the convolution size?

This is in reference to the implementation of MobileNet V2.
When I change the number of channels to 24 from 32, it throws a runtime error: expected 32 channels, got 24 instead.

Can you please help me understand the input size and the output after convolutional layers?

demo

How to save the running results?

training step

Hi all,
When I follow the training steps and run the training code at step 3, there are no errors; however, the code has stayed on this screen for more than 10 hours with no other print output. Has anyone ever met this situation?
Thanks!
[training console screenshot]

Skeletons ID

Hello Daniil Osokin, hello everyone. I have a question about the skeletons tracked by the camera.

Actually I'm making games in Unity, and I need to recognize each player in front of the camera, so I need to assign a skeleton (with an ID) to each player for the whole duration of a game (the game lasts between 2 and 5 minutes). But if the skeletons are refreshed at each frame, I won't be able to do that because each skeleton will have a new ID for every new frame. So my questions are:

Is it possible to get each skeleton's ID? Or are the skeletons refreshed at each frame?

Thank you very much for your feedback, I wish you a nice day :)

Very Slow Inference on CPU

Hi, I found the model is very slow on CPU using the pretrained weights "checkpoint_iter_370000.pth". I have attached the code below. I have tested different scenarios and summarized the results:
GPU w pretrained weight: 0.007 sec
GPU w/o pretrained weight: 0.007 sec
CPU w pretrained weight: 2.829 sec
CPU w/o pretrained weight: 0.376 sec

Could you kindly explain why the inference time using CPU and pretrained weight is so slow ?

import argparse
import cv2
import numpy as np
import torch
from models.with_mobilenet import PoseEstimationWithMobileNet
from modules.keypoints import extract_keypoints, group_keypoints
from modules.load_state import load_state
from modules.pose import Pose, propagate_ids
import time
#from val import normalize, pad_width
device = torch.device('cpu')
model_Mobilenet = PoseEstimationWithMobileNet().to(device)
checkpoint = torch.load('checkpoint_iter_370000.pth', map_location=lambda storage, loc: storage)
load_state(model_Mobilenet, checkpoint)
model_Mobilenet.eval()
input = torch.Tensor(2, 3, 368, 368).to(device)

since = time.time()
stages_output= model_Mobilenet(input)
PAF_Mobilenet, Heatmap_Mobilenet = stages_output[-1], stages_output[-2]
print('Mobilenet Inference time is {:2.3f} seconds'.format(time.time() - since))

Is the default value of img_scale the right one?

Hello,
Thanks a lot for your great repo. I have a question about the variable img_scale used on the following line:

pad_value=(0, 0, 0), img_mean=(128, 128, 128), img_scale=1/256):

According to its use on the line:

img = (img - img_mean) * img_scale
I would say that the default value should be 1/255 and not 1/256, as it is used to rescale the color values from [0, 255] to [0, 1].

Is it just a confusion regarding the input size of the image or is there something I didn't understand correctly?

Looking forward to your answer,
