VNect

A TensorFlow implementation of VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera.

To obtain the Caffe model required by this repository, please contact the author of the paper.

Environments

Python 3.5
- opencv-python 3.4.4.19
- tensorflow-gpu 1.12.0
- pycaffe
- matplotlib 3.0.0 or 3.0.2 (this module occasionally shuts down for unknown reasons)
- ……

Setup

Fedora 29

Install python dependencies:

pip3 install -r requirements.txt --user

Install Caffe dependencies:

sudo dnf install protobuf-devel leveldb-devel snappy-devel opencv-devel boost-devel hdf5-devel glog-devel gflags-devel lmdb-devel atlas-devel python-lxml boost-python3-devel

Setup Caffe

git clone https://github.com/BVLC/caffe.git
cd caffe

Configure Makefile.config (enable Python 3 and fix the paths)

Build Caffe

sudo make all
sudo make runtest
sudo make pycaffe
sudo make distribute
sudo cp -a .build_release/lib/* /usr/lib64/
sudo cp -a distribute/python/caffe/ /usr/lib/python3.7/site-packages/
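
After the install and build steps above, it can help to confirm that the Python modules are importable before running any script. The following is a minimal sketch, not part of this repository: the helper name check_modules is hypothetical, and the import names (cv2, tensorflow, caffe, matplotlib) are assumed from the packages listed under Environments.

```python
import importlib

# Import names assumed from the package list under "Environments":
# opencv-python -> cv2, tensorflow-gpu -> tensorflow, pycaffe -> caffe.
REQUIRED_MODULES = ["cv2", "tensorflow", "caffe", "matplotlib"]


def check_modules(names):
    """Return {module name: version string, or None if the import fails}."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Module is missing or one of its shared libraries cannot be loaded.
            report[name] = None
        else:
            # Not every module exposes __version__, so fall back to "unknown".
            report[name] = getattr(module, "__version__", "unknown")
    return report
```

Calling check_modules(REQUIRED_MODULES) and printing the result shows which modules failed to import; a None entry for caffe usually means the files copied in the last two build commands are not on the interpreter's search path.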

Usage

Preparation

Place the Caffe model in models/caffe_model.
Run init_weights.py to generate the TensorFlow model.

Application

benchmark.py is a class implementation containing all the elements needed to run the model.
run_estimator.py is a script for running on a video stream.
run_estimator_ps.py is a multiprocessing version of the script. If the 3D plotting function shuts down in run_estimator.py, try this one.
run_estimator_robot.py adds ROS and serial communication for robot control on top of the functions in run_estimator.py.
[NOTE] To run the video-stream scripts mentioned above:

i ) click the left mouse button to confirm a simple static bounding box generated by the HOG method;

ii) press any key to exit while the network is running.

run_pic.py is a script for running on a single picture: the outputs are 4×21 heatmaps and 2D results.
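
To illustrate how 2D and 3D joint positions can be read from such outputs, here is a minimal sketch. It assumes that the 4×21 maps are, as described in the VNect paper, one 2D heatmap plus X, Y and Z location maps for each of the 21 joints. The function name extract_joints and the (height, width, 21) array layout are assumptions for illustration and are not taken from run_pic.py.

```python
import numpy as np

NUM_JOINTS = 21


def extract_joints(heatmaps, x_maps, y_maps, z_maps):
    """Read 2D peaks and 3D coordinates from per-joint maps.

    All four inputs have shape (height, width, NUM_JOINTS).
    Returns (joints_2d, joints_3d) with shapes (NUM_JOINTS, 2) and
    (NUM_JOINTS, 3); joints_2d holds (row, col) heatmap coordinates.
    """
    height, width, num_joints = heatmaps.shape

    # Flatten the spatial dimensions and locate each joint's heatmap maximum.
    flat_peaks = heatmaps.reshape(height * width, num_joints).argmax(axis=0)
    rows, cols = np.unravel_index(flat_peaks, (height, width))

    # Read the location maps at each joint's own peak position.
    joint_ids = np.arange(num_joints)
    joints_2d = np.stack([rows, cols], axis=1)
    joints_3d = np.stack(
        [
            x_maps[rows, cols, joint_ids],
            y_maps[rows, cols, joint_ids],
            z_maps[rows, cols, joint_ids],
        ],
        axis=1,
    )
    return joints_2d, joints_3d
```

The 2D coordinates are in heatmap resolution, so they still need to be scaled back to the input image size before drawing.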

Notes

I don't know why the 3D plotting function shuts down in the script in some cases. It may be caused by differences between programming environments. If you are able to fix this, a pull request would be gratefully appreciated.
The input image in this implementation is in BGR color format (cv2.imread()) and the pixel values are normalized to the range [-0.4, 0.6).
The joint-parent map (detailed information in joint_index.xlsx):

The joint positions (index numbers as above):

Every input image is assumed to contain all 21 joints, so the model will still output a (wrong) position for a joint that is not actually visible in the input.
In some cases the estimation results are not as good as the results shown in the paper author's promotional video.
UPDATE: the running speed is now faster thanks to some coordinate extraction optimization!
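
The input normalization mentioned in the notes above can be sketched as follows. This is an illustration under assumptions: the function name normalize_bgr is hypothetical, and the divisor 256 is chosen only because it reproduces the half-open range [-0.4, 0.6) for 8-bit pixels; the scripts in this repository may use a different constant.

```python
import numpy as np


def normalize_bgr(image_bgr, divisor=256.0, offset=0.4):
    """Map uint8 pixel values 0..255 to float32 values in [-0.4, 0.6).

    The channel order is left untouched, so an image loaded with
    cv2.imread() stays in BGR. The divisor is an assumption (see above).
    """
    return image_bgr.astype(np.float32) / divisor - offset
```

With these defaults a pixel value of 0 maps to -0.4 and 255 maps to roughly 0.596, so the upper bound 0.6 is never reached.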

TODO

Optimize the structure of the code.
Implement a better bounding box strategy.
Implement the training script.

About Training Data

Refer to the MPI-INF-3DHP dataset.

Reference Repositories

original MATLAB implementation provided by the model author
timctho/VNect-tensorflow
EJShim/vnect_estimator
