
Code repo for realtime multi-person pose estimation in CVPR'17 (Oral)

License: Other

human-pose-estimation realtime caffe human-behavior-understanding deep-learning computer-vision matlab python cpp11 cvpr-2017

realtime_multi-person_pose_estimation's Introduction

Realtime Multi-Person Pose Estimation

By Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh.

Introduction

Code repo for the winner of the 2016 MSCOCO Keypoints Challenge, the 2016 ECCV Best Demo Award, and the CVPR 2017 oral paper.

Watch our video results on YouTube or on our website.

We present a bottom-up approach for realtime multi-person pose estimation, without using any person detector. For more details, refer to our CVPR'17 paper, our oral presentation video recording at CVPR 2017, or our presentation slides at the ILSVRC and COCO workshop 2016.

This project is licensed under the terms of the license.

Other Implementations

Thank you all for your reimplementation efforts! If you have a new implementation and want to share it with others, feel free to make a pull request or email me!

Contents

  1. Testing
  2. Training
  3. Citation

Testing

C++ (realtime version, for demo purpose)

  • Please use OpenPose, which now runs on CPU/GPU and on Windows/Ubuntu.
  • Three input options: images, video, webcam

Matlab (slower, for COCO evaluation)

  • Compatible with general Caffe. Compile matcaffe.
  • Run cd testing; get_model.sh to retrieve our latest MSCOCO model from our web server.
  • Change the caffepath in config.m and run demo.m for an example usage.

Python

  • cd testing/python
  • ipython notebook
  • Open demo.ipynb and execute the code
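For orientation, a minimal pycaffe sketch of the kind of forward pass the notebook performs (the model file names are assumptions; the /256 - 0.5 normalization and the output blob names appear elsewhere on this page):

import cv2
import numpy as np
import caffe

caffe.set_mode_gpu()  # or caffe.set_mode_cpu() for a CPU-only build
# model files fetched by get_model.sh; exact names are assumptions
net = caffe.Net('pose_deploy.prototxt', 'pose_iter_440000.caffemodel', caffe.TEST)

img = cv2.imread('sample_image.jpg')                        # BGR, H x W x 3
inp = np.transpose(np.float32(img), (2, 0, 1)) / 256 - 0.5  # normalize as in the demo
net.blobs['data'].reshape(1, 3, img.shape[0], img.shape[1])
net.blobs['data'].data[0] = inp
out = net.forward()
pafs = out['Mconv7_stage6_L1']      # 1 x 38 x H/8 x W/8 part affinity fields
heatmaps = out['Mconv7_stage6_L2']  # 1 x 19 x H/8 x W/8 part confidence maps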

Training

Network Architecture


Training Steps

  • Run cd training; bash getData.sh to obtain the COCO images in dataset/COCO/images/, keypoints annotations in dataset/COCO/annotations/ and COCO official toolbox in dataset/COCO/coco/.
  • Run getANNO.m in matlab to convert the annotation format from json to mat in dataset/COCO/mat/.
  • Run genCOCOMask.m in matlab to obtain the mask images for unlabeled persons. You can use 'parfor' in matlab to speed up the code.
  • Run genJSON('COCO') to generate a json file in the dataset/COCO/json/ folder. The json files contain the raw information needed for training.
  • Run python genLMDB.py to generate your LMDB. (You can also download our LMDB for the COCO dataset (189GB file) by: bash get_lmdb.sh)
  • Download our modified caffe: caffe_train. Compile pycaffe. It will be merged with caffe_rtpose (for testing) soon.
  • Run python setLayers.py --exp 1 to generate the prototxt and shell file for training.
  • Download the VGG-19 model; we use it to initialize the first 10 layers for training.
  • Run bash train_pose.sh 0,1 (generated by setLayers.py) to start the training with two GPUs.
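To make the LMDB step concrete, here is a hedged sketch of the kind of write loop genLMDB.py performs; the key scheme and serialization below are simplified assumptions (the real script packs each image together with its annotation into a Caffe Datum):

import cv2
import lmdb

# map_size only reserves address space; 1e12 matches the value used in the scripts
env = lmdb.open('dataset/COCO/lmdb', map_size=int(1e12))
image_paths = ['dataset/COCO/images/train2014/example.jpg']  # hypothetical list

with env.begin(write=True) as txn:
    for idx, path in enumerate(image_paths):
        img = cv2.imread(path)
        txn.put(('%07d' % idx).encode(), img.tobytes())  # simplified; not the real Datum format
env.close()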

Citation

Please cite the paper in your publications if it helps your research:

@inproceedings{cao2017realtime,
  author = {Zhe Cao and Tomas Simon and Shih-En Wei and Yaser Sheikh},
  booktitle = {CVPR},
  title = {Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields},
  year = {2017}
}

@inproceedings{wei2016cpm,
  author = {Shih-En Wei and Varun Ramakrishna and Takeo Kanade and Yaser Sheikh},
  booktitle = {CVPR},
  title = {Convolutional pose machines},
  year = {2016}
}

realtime_multi-person_pose_estimation's People

Contributors

aseaday, gineshidalgo99, hzzone, justinshenk, lygztq, mikeofzen, niteshbharadwaj, wagamamaz, yangzeyu95, zhec


realtime_multi-person_pose_estimation's Issues

Training does not converge

Hi,

Can you share how long training took for you, and on what GPU? And what should the final loss be?

After 2 days of training on a K80, it has only reached 17,100 iterations and the loss is still in the range of 500-1000. Is this right?

Thanks

Cannot create mdb file : MDB_MAP_FULL Environment mapsize limit reached

@ZheC
Hi,
First of all, thanks for your great contribution. I'm now trying to train the network, but I got stuck at the generation of the mdb files; I'm pretty new to lmdb. If I run the code as it is, with the option map_size set to 1e12, I get the error

lmdb.MemoryError: [....]: Cannot allocate memory

So I lowered the map size to 1e10 to be able to initialize the process. Nevertheless, when the memory gets to its limit, it raises the following

lmdb.MapFullError: mdb_put: MDB_MAP_FULL: Environment mapsize limit reached

I tried to set the flag writemap=True on lmdb.open(), but without success, i.e. the same MDB_MAP_FULL issue.
My best guess is that I have to flush the current transactions to disk and build the mdb file incrementally. I wonder if you know of any strategy to work around this issue.
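One possible workaround, sketched here as an assumption rather than repo code: commit in small transactions and grow the map with set_mapsize() whenever lmdb raises MapFullError, since map_size only reserves address space rather than allocating it.

import lmdb

env = lmdb.open('train_lmdb', map_size=2 ** 30)  # start at 1 GB instead of 1e12

def put_with_resize(env, key, value):
    while True:
        try:
            with env.begin(write=True) as txn:
                txn.put(key, value)
            return
        except lmdb.MapFullError:
            # double the reserved map size and retry the failed write
            env.set_mapsize(env.info()['map_size'] * 2)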

Thanks in advance

Keypoints- visibility and labelling

I checked the code, and I could only see that both invisible, labelled (occluded) and visible, labelled keypoints (isVisible = 0, 1) are handled in the same way. Is this true? Is there any case where they are treated differently?

What about the unlabelled (isVisible = 2) keypoints? I see only the following code corresponding to them. Here the cases isVisible = 1, 2 are treated in the same way, but why?

What does isVisible = 3 stand for? Is there such a case?

 if(i < 17 && meta.joint_self.isVisible[i] >= 1) {
      circle(img_vis, meta.joint_self.joints[i], 3, CV_RGB(0,0,255), -1);
    }

and also 

 if (meta.joint_self.isVisible[i] != 3){
              transformed_label[i*channelOffset + g_y*grid_x + g_x] = weight;

How to change the batch size

When I try to execute demo.ipynb, it runs out of memory despite using a GPU [error == cudaSuccess (2 vs 0) out of memory].
So I think I need to reduce the batch size. How can I do that?
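One hedged option for the demo (the file names are assumptions; take the real ones from the config): reshape the input blob to a single image before each forward pass, which bounds the memory the data blob needs.

import caffe

net = caffe.Net('pose_deploy.prototxt', 'pose_iter_440000.caffemodel', caffe.TEST)
n, c, h, w = net.blobs['data'].shape
net.blobs['data'].reshape(1, c, h, w)  # force batch size 1
net.reshape()                          # propagate the new shape through the net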

About 'cpm_transform_param'

Thanks for sharing, but when I run 'train_pose.sh' it gives me the following error:

[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 11:23: Message type "caffe.LayerParameter" has no field named "cpm_transform_param".
F0206 11:29:26.923185 7789 upgrade_proto.cpp:79] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: pose_train_test.prototxt

Could you help me?
Thank you very much

CPU modes

Hi, which implementations have CPU support?

C++
Matlab
or Python

Cheers:)

Unable to find joints plot code

Hi ZheC
Thanks for making your code available on Github; your model performs great!
I have a problem: I cannot find the joint-drawing code anywhere. I want to change how the result is displayed on the image, such as the color and line width. Where can I find this code?
Thanks

Error running train_pose.sh

Hi, I'm getting much closer to being able to train a network with this. I just have one error when I run train_pose.sh:

I0328 19:23:52.507542 17876 cpm_data_transformer.cpp:242] CPMDataTransformer constructor done.
I0328 19:23:52.508219 17884 db_lmdb.cpp:35] Opened lmdb /home/mrmagic/COCO_kpt/lmdb_trainVal
*** Aborted at 1490743432 (unix time) try "date -d @1490743432" if you are using GNU date ***
PC: @     0x7fc73809bdbf (unknown)
*** SIGBUS (@0x7ede92c5600a) received by PID 17876 (TID 0x7fc712888700) from PID 18446744071876993034; stack trace: ***
    @     0x7fc73f6134b0 (unknown)
    @     0x7fc73809bdbf (unknown)
    @     0x7fc73809c0c8 (unknown)
    @     0x7fc73809c3ac (unknown)
    @     0x7fc73809b173 mdb_cursor_get
    @     0x7fc740e932d5 caffe::db::LMDB::NewCursor()
    @     0x7fc740e6900f caffe::DataReader::Body::InternalThreadEntry()
    @     0x7fc740e58935 caffe::InternalThread::entry()
    @     0x7fc736d2f5d5 (unknown)
    @     0x7fc73135d6ba start_thread
    @     0x7fc73f6e482d clone

I've never seen an error like this before, so if someone could point me in the right direction as to how to debug it that would be amazing!

I'm not sure if this is relevant or not, but when I run make runtest in caffe_train after running make all; make test I get this:

F0328 20:40:02.593730 32048 pooling_layer.cu:212] Check failed: error == cudaSuccess (10 vs. 0)  invalid device ordinal
*** Check failure stack trace: ***
    @     0x7fa3f7f2226d  google::LogMessage::Fail()
    @     0x7fa3f7f24083  google::LogMessage::SendToLog()
    @     0x7fa3f7f21dfb  google::LogMessage::Flush()
    @     0x7fa3f7f24a6e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fa3f56977c1  caffe::PoolingLayer<>::Forward_gpu()
    @           0x4798a6  caffe::Layer<>::Forward()
    @           0x482d7d  caffe::MaxPoolingDropoutTest_TestForward_Test<>::TestBody()
    @           0x91a693  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0x913caa  testing::Test::Run()
    @           0x913df8  testing::TestInfo::Run()
    @           0x913ed5  testing::TestCase::Run()
    @           0x9151af  testing::internal::UnitTestImpl::RunAllTests()
    @           0x9154d3  testing::UnitTest::Run()
    @           0x46dd6d  main
    @     0x7fa3f47f6830  __libc_start_main
    @           0x4757d9  _start
Makefile:523: recipe for target 'runtest' failed
make: *** [runtest] Aborted (core dumped)

This doesn't happen when I run the same command with regular caffe, which makes me think this is a caffe_train issue, not a caffe issue.

Any help at all would be greatly appreciated,
Thanks!

No pose output

Hi,
thank you for making your code available.
I successfully tested your code last month on a GTX 970.
I am now using a GTX 1080 and I am getting no pose as output, using the model you provide.
Have you faced a similar issue? Could it be an incompatibility with the Pascal architecture?
Thanks

LimbSeq

I think "limbseq" should be:
limbSeq = [2 3; 2 6; 3 4; 4 5; 6 7; 7 8; 2 9; 9 10; 10 11; 2 12; 12 13; 13 14; 2 1; 1 15; 15 17; 1 16; 16 18; 3 18; 6 17]; % last two rows are different or not necessary

Label generation from keypoints -process flow and shape

@ZheC
label_vec and label_heat should have the same shape to calculate the loss. In this link (https://github.com/CMU-Perceptual-Computing-Lab/caffe_train/blob/master/src/caffe/cpm_data_transformer.cpp), putVecMaps and putGaussianMap create the PAFs and confidence maps respectively; what are their shapes? I think these are stored into the transformed label; what is its shape? How are they connected?

Is the following process flow correct?
keypoints from the dataset --> Gaussian maps, PAFs --> transformed_label --> sliced into label_vec, label_heat

Is there any way to understand this process better? I have the paper, but the creation of labels from mere keypoints confuses me, particularly the PAF generation! Thanks in advance.
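For readers with the same question, here is a hedged NumPy sketch of PAF generation as the paper describes it (a simplification, not the C++ putVecMaps itself): every pixel within a threshold distance of the limb segment stores the limb's unit direction vector. Under this reading, transformed_label is the channel-wise stack of these PAF maps plus the Gaussian heatmaps, later sliced apart into label_vec and label_heat.

import numpy as np

def put_vec_map(paf_x, paf_y, p_a, p_b, thre=1.0):
    # write the unit vector a->b into all pixels near the segment (p_a, p_b)
    a = np.asarray(p_a, dtype=np.float32)
    b = np.asarray(p_b, dtype=np.float32)
    v = b - a
    norm = np.linalg.norm(v)
    if norm == 0:
        return
    v = v / norm
    h, w = paf_x.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dx, dy = xs - a[0], ys - a[1]
    along = dx * v[0] + dy * v[1]          # projection onto the limb axis
    perp = np.abs(dx * v[1] - dy * v[0])   # distance from the limb axis
    mask = (along >= 0) & (along <= norm) & (perp <= thre)
    paf_x[mask] = v[0]
    paf_y[mask] = v[1]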

Has anyone successfully installed caffe_train and trained this model?

My specs are Ubuntu 16.04, Titan X (Pascal), OpenCV 2.4.13, and CUDA 8, but I am not able to install caffe_train or use setLayers.py.

caffe_train gives an error when running runtest, and running python setLayers.py gives
'''
Traceback (most recent call last):
File "setLayers.py", line 17, in
import caffe
File "/home/yxchng/caffe_train/python/caffe/init.py", line 1, in
from .pycaffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, RMSPropSolver, AdaDeltaSolver, AdamSolver
File "/home/yxchng/caffe_train/python/caffe/pycaffe.py", line 13, in
from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver,
ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory
'''

Test loss does not match the train loss?

I have trained the model you provided for 340K iterations using batchsize=10 with 2 GTX 1080 GPUs; the train set is train2014 and the test set is val2014. The train loss is around 300 while the test loss is always around 530. Is this normal, or has the model gone wrong? I expect the test loss to be larger than the train loss, but the difference should not be this large, and the test loss should also go down as the train loss declines.

resized_map and joints

@ZheC I find that the resized_map (coco) shape is 1 57 544 960, where 57 = 19 + 38. Does 19 mean the confidence scores of the body parts + background? What does 38 mean (an orientation vector [x, y] for each pixel)?

The joints shape is 1 18 65 3. I guess 18 is the number of body parts (without background). 65 = 1 + 64, where 1 is used to store the number of peaks and 64 is the max number of peaks. 3 = 2 + 1, where 1 is the score of the peak. What confuses me is the 2: does it mean an orientation vector [x, y] or a position [x, y]?

Benchmark of the training project

@ZheC
I found that in your training project the solver does not define a test net, so how can we check the performance of the network on the validation set? Perhaps some evaluation methods should be provided for testing. I would appreciate it if you could open-source the evaluation code of your work on the COCO and MPI datasets, as reported in the paper. Thanks a lot!!

non-POD when compiling on OSX

Hello, I am trying to compile for OSX and got this error:

examples/rtpose/rtpose.cpp:1087:22: error: variable length array of non-POD element type 'Frame'
Frame frame_batch[BATCH_SIZE];

Maybe I need a different compiler or flag?

Undefined symbol in libcaffe.so.1.0.0-rc3

Hi, this is some impressive software you guys have. I'm trying to follow the instructions for training in the README, but I'm stuck on this one issue and I cannot find a solution. When I run genLMDB.py I get this error:

Traceback (most recent call last):
  File "training/setLayers.py", line 15, in <module>
    import caffe
  File "/home/mrmagic/caffe_rtpose/python/caffe/__init__.py", line 1, in <module>
    from .pycaffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, RMSPropSolver, AdaDeltaSolver, AdamSolver
  File "/home/mrmagic/caffe_rtpose/python/caffe/pycaffe.py", line 13, in <module>
    from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \
ImportError: /home/mrmagic/caffe_rtpose/python/caffe/../../build/lib/libcaffe.so.1.0.0-rc3: undefined symbol: _ZN2cv7imwriteERKNS_6StringERKNS_11_InputArrayERKSt6vectorIiSaIiEE

Does anyone know what this means or how to fix this? I have no idea what to do. I made sure to run make pycaffe in caffe_train and set my PYTHONPATH to /path/to/python/caffe. I've also tried this same process with caffe_rtpose instead of caffe_train.

Any help would be greatly appreciated.

Pose estimate for single person with less than full body in picture

Hi Dr Zhe,

I have tried your pose estimator and it works brilliantly for single and multiple people. However, if there is only one person in the picture and the picture does not contain his full body, I get an error which says:

File "demo.py", line 254, in
partAs = connection_all[k][:,0]
TypeError: list indices must be integers, not tuple

Any idea how to resolve this, or does the pose estimator only work for full-body pictures?

Thanks a lot!

Cannot understand the math in the render_pose_coco_parts kernel function.

@ZheC Could you explain the following code? Does render_pose_coco_parts only draw color on frame.data_for_mat? Thanks!

    if(value_a > threshold && value_b > threshold){
      //  -------------------
      // |
      // |            .(x_a, y_a)
      // |           /
      // |          .(x_p, y_p)
      // |         /
      // |        .(x_b, y_b) 
      // |       
      // |

      float x_p = (x_a + x_b) / 2;
      float y_p = (y_a + y_b) / 2;
      float angle = atan2f(y_b - y_a, x_b - x_a);
      float sine = sinf(angle);
      float cosine = cosf(angle);
      float a_sqrt = (x_a - x_p) * (x_a - x_p) + (y_a - y_p) * (y_a - y_p);

      float A = cosine * (x - x_p) + sine * (y - y_p); // what is A?
      float B = sine * (x - x_p) - cosine * (y - y_p); // what is B?

      float judge = A * A / a_sqrt + B * B / b_sqrt;
      }
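A hedged reading of this math: (A, B) are the coordinates of the pixel (x, y) in a frame rotated to align with the limb, so judge = A*A/a_sqrt + B*B/b_sqrt <= 1 tests whether the pixel lies inside an ellipse around the limb, which would indeed just color pixels on frame.data_for_mat. A small Python restatement of that interpretation:

import math

def inside_limb_ellipse(x, y, x_a, y_a, x_b, y_b, b_sqrt=100.0):
    # b_sqrt (the squared half-width of the limb) is an assumed constant here
    x_p, y_p = (x_a + x_b) / 2.0, (y_a + y_b) / 2.0
    angle = math.atan2(y_b - y_a, x_b - x_a)
    sine, cosine = math.sin(angle), math.cos(angle)
    a_sqrt = (x_a - x_p) ** 2 + (y_a - y_p) ** 2   # squared semi-major axis
    A = cosine * (x - x_p) + sine * (y - y_p)      # coordinate along the limb axis
    B = sine * (x - x_p) - cosine * (y - y_p)      # coordinate across the limb axis
    return A * A / a_sqrt + B * B / b_sqrt <= 1.0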

Some questions about your projects

Dear Dr. Zhe Cao,
I'm a postgraduate student at Xiamen University. Recently, I have been focusing on research on multi-person pose estimation, and I have some questions about your projects.
There are two major questions:

  1. I ran your model from (https://github.com/CMU-Perceptual-Computing-Lab/caffe_rtpose) to test on the COCO dataset, but I cannot match your result in COCO_eval_2014.

Note that we set one person's score to the average of the non-zero scores for that person.
The results on 1000 test images are shown in the attached screenshot.


  2. As for training (https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation), I only found the code for generating the COCO data, but I need to generate both the json and lmdb data for MPII. Do you have this code? I would appreciate it if you could share it with me or tell me how to do this.

Thanks a lot!
Han

Adding a new channel to transformed_data

Hi,

I would like to add a new data channel to transformed_data, but I cannot find where the memory is allocated. I also cannot find the part of the code where the centre map is added to transformed_data. Could you please tell me where this is written, so that I can add a layer, change the code for the centre map, and use it for my new channel data?

Thank you very much in advance!

About PAFs

@ZheC Thanks for your great work. While reading the paper, I got confused by the 'Part Affinity Fields for Part Association' section. I want to check the corresponding code; could you tell me where I can find it? Thanks a lot.

Failed to open leveldb IO error: /LOCK: Permission denied

I am trying to compile caffe_train, but every time make runtest gives me this error:
Check failed: status.ok() Failed to open leveldb
IO error: /LOCK: Permission denied
What could be the reason?

1 test from LayerFactoryTest/3, where TypeParam = caffe::GPUDevice<double>
[ RUN      ] LayerFactoryTest/3.TestCreateLayer
F0306 16:28:33.630322 38575 db_leveldb.cpp:16] Check failed: status.ok()  Failed to open leveldb
IO error: /LOCK: Permission denied
*** Check failure stack trace: ***
    @     0x2b003e64fdaa  (unknown)
    @     0x2b003e64fce4  (unknown)
    @     0x2b003e64f6e6  (unknown)
    @     0x2b003e652687  (unknown)
    @     0x2b0040639eca  caffe::db::LevelDB::Open()
    @     0x2b004065ac64  caffe::DataReader::Body::InternalThreadEntry()
    @     0x2b00406abf10  caffe::InternalThread::entry()
    @     0x2b003ff557a9  thread_proxy
    @     0x2b004141b184  start_thread
    @     0x2b004172b37d  (unknown)
    @              (nil)  (unknown)
make: *** [runtest] Aborted (core dumped)

configobj for python

This may be a stupid question.

I'm trying to run demo.ipynb and encountered the following complaint.

----> 1 from configobj import ConfigObj
2 import numpy as np
3
4
5 def config_reader():

ImportError: No module named configobj

Where can I find configobj.py?

Extract the person's bounding box from the code as well

I am using your Matlab demo. I skimmed through the code, and it appeared to me that the code in its given form does not output the bounding box of a person. I understand one could compose the box out of keypoints (see the sketch below), but I was wondering whether it is possible to get the bounding box of a person through the original code as well?
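A hedged NumPy sketch of composing a box from one person's keypoints (assuming points is an N x 3 array of x, y, score; the names and margins are illustrative only):

import numpy as np

def bbox_from_keypoints(points, min_score=0.05, margin=10):
    pts = points[points[:, 2] > min_score, :2]  # keep confidently detected joints
    if pts.size == 0:
        return None
    x0, y0 = pts.min(axis=0) - margin
    x1, y1 = pts.max(axis=0) + margin
    return [float(x0), float(y0), float(x1), float(y1)]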

Regards

Crash of python example

On 64bit Ubuntu 14.04 with CPU-only caffe (changed use_gpu in config file to 0) I'm getting this:

F0421 13:25:27.432996 22411 syncedmem.hpp:33] Check failed: *ptr host allocation of size 53086208 failed
*** Check failure stack trace: ***

Any idea what might be wrong? Many thanks in advance.

A confusing point in data_transformer.cpp

I have read your paper, but I have a question about the training source code (caffe_train).
In generateLabelMap(), between lines 1840 and 1856, it seems the 'count' used in putVecMaps() is always a zero Mat.

No support for image sizes with rows > cols

I tried to evaluate rtpose with:
./build/examples/rtpose/rtpose.bin --net_resolution 288x512
but the code at lines 625-628 in connectLimbs() failed.
The result is NaN at positions where the row index > the column index.

I've added the following code to keep the program running; part detection, PAFs, etc. seemed OK, except for the limbs/pose image.
if (std::isnan(d_x) || std::isnan(d_y)) continue;

number of training data in lmdb released

@ZheC
Hi Zhe, thanks for the code for generating the LMDB data. However, based on the genLMDB code I count about 85,792 people in the COCO training dataset, while there are about 120k samples in the released LMDB. Did I make a mistake in the counting?

Thanks in advance!

evaluation issue

Hi, ZheC
We met problems when running the following code:

for part = 1:12
    index = subset(ridxPred, part);
    if (index > 0)
        part_cnt = part_cnt + 1;
        point(part_cnt).x = candidates(index,1);
        point(part_cnt).y = candidates(index,2);
        point(part_cnt).score = candidates(index,3);
        point(part_cnt).id = orderMPI(part);
        score_sum = score_sum + candidates(index,3);
    end
end

  1. Undefined function 'orderMPI' for input arguments of type 'double'.
  2. Should the first line of the code be "for part = 1:12" or "for part = 1:17" when evaluating on MS COCO?

Thanks!

Best Regards!

Venus

When using setLayers.py: AttributeError: 'LayerParameter' object has no attribute 'cpm_transform_param'

Hello, I want to train a model following your training steps. When I execute "python setLayers.py --exp 1", it generates errors like this:

Traceback (most recent call last):
  File "setLayers.py", line 488, in <module>
    writePrototxts(dataFolder, sub_dir, batch_size, layername, kernel, stride, outCH, transform_param, base_lr, d_caffemodel, label_name, 0, lr_mult_distro, 6)
  File "setLayers.py", line 308, in writePrototxts
    str_to_write = setLayers_twoBranches(source, batch_size, layername, kernel, stride, outCH, label_name, transform_param_in, deploy=False, batchnorm=batchnorm, lr_mult_distro=lr_mult_distro)
  File "setLayers.py", line 292, in setLayers_twoBranches
    return str(n.to_proto())
  File "/home/kunkun/DL/pose_estimate/caffe_train/python/caffe/net_spec.py", line 183, in to_proto
    top._to_proto(layers, names, autonames)
  File "/home/kunkun/DL/pose_estimate/caffe_train/python/caffe/net_spec.py", line 97, in _to_proto
    return self.fn._to_proto(layers, names, autonames)
  File "/home/kunkun/DL/pose_estimate/caffe_train/python/caffe/net_spec.py", line 152, in _to_proto
    assign_proto(layer, k, v)
  File "/home/kunkun/DL/pose_estimate/caffe_train/python/caffe/net_spec.py", line 64, in assign_proto
    is_repeated_field = hasattr(getattr(proto, name), 'extend')
AttributeError: 'LayerParameter' object has no attribute 'cpm_transform_param'

Looking forward to your help, Thanks!

Using on Windows OS

Hi
after installing CUDA & downloading the cuDNN binaries, I used the Cygwin64 Terminal on Windows and entered
$ ./install_caffe_and_cpm.sh
and got the following output:


------------------------- INSTALLING CAFFE AND CPM -------------------------
NOTE: This script assumes that CUDA and cuDNN are already installed on your machine
------------------------- Checking Ubuntu Version -------------------------
./install_caffe_and_cpm.sh: line 23: lsb_release: command not found
Ubuntu
Ubuntu release older than version 14. This installation script might fail.
------------------------- Ubuntu Version Checked -------------------------

------------------------- Checking Number of Processors ------------------------
4 cores
------------------------- Number of Processors Checked -------------------------

------------------------- Installing some Caffe Dependencies -------------------
./install_caffe_and_cpm.sh: line 50: sudo: command not found
./install_caffe_and_cpm.sh: line 51: sudo: command not found
./install_caffe_and_cpm.sh: line 53: sudo: command not found
./install_caffe_and_cpm.sh: line 54: sudo: command not found
./install_caffe_and_cpm.sh: line 57: sudo: command not found

------------------------- -------------------------
Errors detected. Exiting script. The software might have not been successfully installed
------------------------- -------------------------

Any workaround or code changes to try on Windows?
Thanks & Regards

Retraining: Freeze heat map weights

I want to freeze the weights for predicting the heat maps and retrain using a different dataset. Is it enough to set param { lr_mult: 0 } for the 'L2' convolution layers of all 6 stages?
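For reference, a minimal pycaffe net_spec sketch (an illustration, not the repo's setLayers.py) of what freezing a convolution with lr_mult=0 looks like:

import caffe
from caffe import layers as L

n = caffe.NetSpec()
n.data = L.Input(shape=dict(dim=[1, 128, 46, 46]))
# lr_mult=0 (and decay_mult=0) on both weight and bias keeps this layer fixed during training
n.conv = L.Convolution(n.data, num_output=19, kernel_size=1,
                       param=[dict(lr_mult=0, decay_mult=0),
                              dict(lr_mult=0, decay_mult=0)])
print(n.to_proto())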

Input image data normalization problem

@ZheC
Thanks for your great work. I have read your paper and your demo code, and I am confused by the normalization applied to the input image data during test and train. In test mode, in the frame-generation thread, the pixel values are normalized by p(i,j) = (p(i,j) - 0.5)/256, while in train mode the image data is normalized by p(i,j) = (p(i,j) - 128)/256. I want to know whether this matters for network performance and why it differs between the two modes. Additionally, what input range does VGG-19 need, (-0.5, +0.5) or (0, 1), and which is better? Thanks in advance!
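One small check worth adding (assuming the test transform is actually p/256 - 0.5, as in the Python demo snippet further down this page): that form is algebraically identical to the training transform (p - 128)/256, so the two modes would then agree.

import numpy as np

img = np.random.randint(0, 256, (4, 4)).astype(np.float32)
print(np.allclose(img / 256 - 0.5, (img - 128) / 256))  # True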

coco_val , imread -Matlab evaluation

@ZheC I am sorry, I went through the different issues on this, but could not figure it out!

Does the array of image indices mean an array of image names? And what does the following do?
oriImg = imread('....', coco_val(i))
Are we supposed to edit something here?
Could you briefly explain how to use your evaluation code? It is not called from the demo, so how should I run it? How did you generate coco_val? Would you be kind enough to explain briefly?

How long does training typically take?

Hi, I'm trying to train on the COCO dataset using this algorithm. I'm using one GeForce GTX 1050 GPU. I also changed the batch_size on line 454 of setLayers.py to 2, and the batch_size on line 443 to 4, because I was getting an out-of-memory error with CUDA.

It's been training for 5 days now and is on iteration 410,980 with lr=1.47e-6. Is there any way I can get an idea of how much longer training will take?
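Back-of-the-envelope, a proportional estimate works; a small sketch (the 600K target is an assumption, substitute the max_iter from your solver prototxt):

done_iters = 410980
elapsed_days = 5.0
target_iters = 600000  # assumption: use the real max_iter from your solver file
remaining_days = elapsed_days / done_iters * (target_iters - done_iters)
print('%.1f more days at the current rate' % remaining_days)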

Thank You

number of joint points

@ZheC the testing demo generates the outputs Mconv7_stage6_L1 (1x38x46x46) and Mconv7_stage6_L2 (1x19x46x46), which should represent the confidence maps for 19 kinds of parts and the corresponding part affinity maps. But the COCO challenge annotations have 17 keypoints, and it is also quite confusing to see np = 15 in the testing demo. Could you please explain the definition of the number of joint points in all these places? Looking forward to your answer. Thanks.

no display windows

[platform] Ubuntu 14.04 + CUDA 7.5 + cuDNN 4.0 + OpenCV 2.4.9
After make all, when I run the command "./build/examples/rtpose/rtpose.bin", it reports an error about threads:
Finish spawning 1 threads. now waiting.
init done
opengl support available,
QMetaMethod::invoke: Unable to invoke methods with return values in queued connections

However, it then just waits forever, with no results.

When I run the command "./build/examples/rtpose/rtpose.bin --video video/4.mp4 --logtostderr", I get:
I0223 18:30:11.356386 4326 rtpose.cpp:1687] Display resolution: 1280x720
I0223 18:30:11.356426 4326 rtpose.cpp:1693] Net resolution: 656x368
I0223 18:30:11.356444 4326 rtpose.cpp:1468] Finish spawning 1 threads.
I0223 18:30:11.356467 4329 rtpose.cpp:175] Setting GPU 0
init done
opengl support available
I0223 18:30:11.903545 4329 rtpose.cpp:180] GPU 0: copying to person net
I0223 18:30:13.752050 4329 rtpose.cpp:202] start_scale = 1
I0223 18:30:13.752099 4329 rtpose.cpp:231] Dry running...
I0223 18:30:13.972661 4329 rtpose.cpp:233] Success.
I0223 18:30:13.973018 4329 rtpose.cpp:1081] GPU 0 is ready
QMetaMethod::invoke: Unable to invoke methods with return values in queued connections
OpenCV Error: Assertion failed (src.cols > 0 && src.rows > 0) in warpAffine, file /home/dl/opencv-2.4.9/modules/imgproc/src/imgwarp.cpp, line 3455
terminate called after throwing an instance of 'cv::Exception'
what(): /home/dl/opencv-2.4.9/modules/imgproc/src/imgwarp.cpp:3455: error: (-215) src.cols > 0 && src.rows > 0 in function warpAffine

Aborted (core dumped).
Could anyone help me solve this problem? Thank you!

Caffe error when running notebook


ImportError Traceback (most recent call last)
in ()
4 import PIL.Image
5 import math
----> 6 import caffe
7 import time
8 from config_reader import config_reader

ImportError: No module named caffe

Python demo and pycaffe

I successfully managed to get the demo running with python3, but I am experiencing this weird behaviour. On the one hand, when I don't do the scale search and just compute the average of the outputs, the results are not satisfying. On the other hand, it takes a lot of VRAM: when I pass a single 540x720 (WxH) image it takes around 1.5GB of VRAM, but when I try a batch of images (a batch size of only 5) it takes 4.5GB of VRAM.

net.blobs['data'].reshape(*(batch.shape[0], batch.shape[3], batch.shape[1], batch.shape[2]))
net.blobs['data'].data[...] = np.transpose(np.float32(batch[:,:,:,:]), (0,3,1,2))/256 - 0.5
net.forward()

The above is the snippet I'm working with, where the batch's shape is (size_of_batch, height_of_images, width_of_images, channels_of_images). Also, the evaluation for a single image takes about 0.9 sec, while the batch takes about 3 seconds.

I can't really figure out the source of my problem. I would appreciate any advice.

EDIT

The memory issue is still there, but it seems that the NVIDIA caffe fork has performance problems. With the original BVLC caffe I managed to reach 200ms for the picture mentioned above.

Running training on newer caffe

Hi there, I love this work, it's really great, so thank you.

I want to get training working on the newer caffe as I want to integrate some ideas together and they require the newer caffe.

I have taken your caffe_train and carried the training code over into your caffe_demo (the newer caffe version), but it throws an error in the Translate_nv function, on this line:

transformed_data[2*offset + i*img_aug.cols + j] = (rgb[2] - 128)/256.0;

That is the 3rd such line in the for loop. I assume transformed_data isn't large enough and 2*offset + i*img_aug.cols + j evaluates to something like 480012, which is out of range. But when I trace it back, the size of transformed_data looks like 884427, so it should fit.
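For what it's worth, the index arithmetic can be checked directly (a sketch with hypothetical dimensions, assuming offset = rows * cols for one channel): the largest index that line writes is 3*rows*cols - 1, so transformed_data needs at least 3*rows*cols floats.

rows, cols = 368, 435  # hypothetical augmented-image size
offset = rows * cols   # assumption: offset is one channel's pixel count
max_index = 2 * offset + (rows - 1) * cols + (cols - 1)
assert max_index == 3 * rows * cols - 1
print(max_index + 1, 'floats needed for transformed_data')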

Would you have any ideas that would help me here?

Thanks again.

Error: Matlab demo Undefined function 'insertShape'

I get the following error while running demo.m:

Undefined function 'insertShape' for input arguments of type 'uint8'.

Error in connect56LineVec (line 288)
            image = insertShape(image, 'FilledCircle', [X Y 5], 'Color', joint_color(i,:));

Error in demo (line 21)
	[candidates, subset] = connect56LineVec(oriImg, final_score, param, vis);

Evaluation in Matlab

Hi ZheC,
I have some questions about evaluation code in Matlab.
What is the type of coco_val in testing/eval.m line 12, and how can we get coco_val?
In addition, if the number of CPM stages is set to 6, does twoLevel need to change to 7?

Thanks!
