tucan9389 / tf2-mobile-2d-single-pose-estimation

💃 Pose estimation for iOS and Android using TensorFlow 2.0

License: Apache License 2.0

Python 99.50% Shell 0.50%
tensorflow2 pose-estimation mobile ios tensorflow-lite

tf2-mobile-2d-single-pose-estimation's Introduction

💃 Mobile 2D Single Person (Or Your Own Object) Pose Estimation for TensorFlow 2.0

This repository was forked from edvardHua/PoseEstimationForMobile when the original repository was closed.
The edvardHua/PoseEstimationForMobile repository has since reopened, and I'll maintain this fork separately. 👍

This repository currently implements the Hourglass model using TensorFlow 2.0 with the Keras API.

Goals

  • 📚 Easy to train
  • 🏃 Easy to use the model on mobile devices

Getting Started

Install Anaconda (~10 min)

Create Virtual Environment (~2 min)

Create new environment.

conda create -n {env_name} python={python_version} anaconda
# in my case
# conda create -n mpe-env-tf2-alpha0 python=3.7 anaconda

Start the environment.

source activate {env_name}
# in my case
# source activate mpe-env-tf2-alpha0

Install the requirements (~1 min)

cd {tf2-mobile-pose-estimation_path}
pip install -r requirements.txt
pip install git+https://github.com/philferriere/cocoapi.git@2929bd2ef6b451054755dfd7ceb09278f935f7ad#subdirectory=PythonAPI

Download original COCO dataset

A helper script downloads and unpacks the required COCO datasets. Replace COCO_DATASET_PATH with the dataset path used by the current version of the repository; you can find the expected path in train.py.

Warning: your system should have approximately 40 GB of free space for the datasets.

python downloader.py --download-path=COCO_DATASET_PATH
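
Since the download needs roughly 40 GB, it can help to check free space before starting. The following is a minimal sketch using only the Python standard library; the helper name has_enough_space and the "." placeholder path are assumptions for illustration and are not part of downloader.py.

```python
import shutil

REQUIRED_GB = 40  # approximate figure quoted in the warning above


def has_enough_space(path: str, required_gb: float = REQUIRED_GB) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` GB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024 ** 3


if __name__ == "__main__":
    # "." is a placeholder; point it at the directory you pass as --download-path.
    print("enough space:", has_enough_space("."))
```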

Run The Project

In order to use the project you have to:

  1. Prepare the dataset (the ai_challenger dataset) and unzip it.
  2. Train the model using:
python train.py \
--dataset_config config/dataset/coco_single_person_only-gpu.cfg \
--experiment_config config/training/coco_single_experiment01-cpm-sg4-gpu.cfg

Compatible Datasets

| Dataset Name | Download | Size | Number of images (train/valid) | Number of Keypoints | Note |
| --- | --- | --- | --- | --- | --- |
| ai challenge | google drive | 2GB | 22k/1.5k | 14 | default dataset of this repo |
| coco single person only | google drive | 4GB | 25k/1k | 17 | filtered from the COCO 2017 keypoint dataset to keep only images that show a single person |

  • ai challenge's keypoint names: ['top_head', 'neck', 'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow', 'left_wrist', 'right_wrist', 'left_hip', 'right_hip', 'left_knee', 'right_knee', 'left_ankle', 'right_ankle']
  • coco's keypoint names: ['nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear', 'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow', 'left_wrist', 'right_wrist', 'left_hip', 'right_hip', 'left_knee', 'right_knee', 'left_ankle', 'right_ankle']
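
The two keypoint lists overlap only partly, which matters when comparing models trained on different datasets. The sketch below copies the two lists from above and computes which names they share and at which indices; the helper name shared_keypoint_indices is an assumption for illustration, not a function from this repository.

```python
AI_CHALLENGER_KEYPOINTS = [
    'top_head', 'neck', 'left_shoulder', 'right_shoulder', 'left_elbow',
    'right_elbow', 'left_wrist', 'right_wrist', 'left_hip', 'right_hip',
    'left_knee', 'right_knee', 'left_ankle', 'right_ankle',
]

COCO_KEYPOINTS = [
    'nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear',
    'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow',
    'left_wrist', 'right_wrist', 'left_hip', 'right_hip',
    'left_knee', 'right_knee', 'left_ankle', 'right_ankle',
]


def shared_keypoint_indices(src, dst):
    """Map each name found in both lists to (index in src, index in dst)."""
    dst_index = {name: i for i, name in enumerate(dst)}
    return {
        name: (i, dst_index[name])
        for i, name in enumerate(src)
        if name in dst_index
    }


if __name__ == "__main__":
    shared = shared_keypoint_indices(COCO_KEYPOINTS, AI_CHALLENGER_KEYPOINTS)
    for name, (coco_idx, ai_idx) in shared.items():
        print(f"{name}: coco[{coco_idx}] <-> ai_challenger[{ai_idx}]")
```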

Results

AI Challenge Dataset

| Model Name | Backbone | Stage Or Depth | PCKh@0.5 | Size | Total Epoch | Total Training Time | Note |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MobileNetV2 based CPM | cpm-b0 | Stage 1 | .. | .. | .. | .. | Default CPM |
| MobileNetV2 based CPM | cpm-b0 | Stage 2 | .. | .. | .. | .. | |
| MobileNetV2 based CPM | cpm-b0 | Stage 3 | .. | .. | .. | .. | |
| MobileNetV2 based CPM | cpm-b0 | Stage 4 | .. | .. | .. | .. | |
| MobileNetV2 based CPM | cpm-b0 | Stage 5 | .. | .. | .. | .. | |
| MobileNetV2 based Hourglass | hg-b0 | Depth 4 | .. | .. | .. | .. | Default Hourglass |

COCO Single Person Only Dataset

| Model Name | Backbone | Stage Or Depth | OKS | Size | Total Epoch | Total Training Time | Note |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MobileNetV2 based CPM | cpm-b0 | Stage 1 | .. | .. | .. | .. | Default CPM |
| MobileNetV2 based CPM | cpm-b0 | Stage 2 | .. | .. | .. | .. | |
| MobileNetV2 based CPM | cpm-b0 | Stage 3 | .. | .. | .. | .. | |
| MobileNetV2 based CPM | cpm-b0 | Stage 4 | .. | .. | .. | .. | |
| MobileNetV2 based CPM | cpm-b0 | Stage 5 | .. | .. | .. | .. | |
| MobileNetV2 based Hourglass | hg-b0 | Depth 4 | .. | .. | .. | .. | Default Hourglass |

Converting To Mobile Model

TensorFlow Lite

Training the model creates a .tflite model at each evaluation step.

Core ML

Check the convert_to_coreml.py script. The converted .mlmodel supports iOS 14+.

Details

This section will be moved to a separate .md file.

Folder Structure

tf2-mobile-pose-estimation
├── config
|   ├── model_config.py
|   └── train_config.py
├── data_loader
|   ├── data_loader.py
|   ├── dataset_augment.py
|   ├── dataset_prepare.py
|   └── pose_image_processor.py
├── models
|   ├── common.py
|   ├── mobilenet.py
|   ├── mobilenetv2.py
|   ├── mobilenetv3.py
|   ├── resnet.py
|   ├── resneta.py
|   ├── resnetd.py
|   ├── senet.py
|   ├── simplepose_coco.py
|   └── simpleposemobile_coco.py
├── train.py            - the main training script
├── common.py
├── requirements.txt
└── outputs             - this folder is generated automatically when training starts
    ├── 20200312-sp-ai_challenger
    |   ├── saved_model
    |   └── image_results
    └── 20200312-sp-ai_challenger
        └── ...

My SSD
└── datasets            - this folder contains the datasets of the project.
    └── ai_challenger
        ├── train.json
        ├── valid.json
        ├── train
        └── valid
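
Before training, it can be useful to confirm that a dataset folder matches the layout shown above. The following is a small sketch under that assumption; the function name missing_dataset_entries and the example path are hypothetical and not part of this repository.

```python
from pathlib import Path

# Entries taken from the ai_challenger layout shown above.
EXPECTED_FILES = ("train.json", "valid.json")
EXPECTED_DIRS = ("train", "valid")


def missing_dataset_entries(dataset_dir):
    """Return the names of expected files and folders that are absent."""
    root = Path(dataset_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    # "datasets/ai_challenger" is a placeholder path for illustration.
    print("missing:", missing_dataset_entries("datasets/ai_challenger"))
```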

TODO

  • Save model to saved_model
  • Convert the model (saved_model) to a TFLite model (.tflite)
  • Convert the model (saved_model) to a Core ML model (.mlmodel)
  • Run the model on iOS
  • Release 1.0 models
  • Support distributed GPUs training
  • Make DEMO gif running on mobile device
  • Run the model on Android

Reference

[1] Paper of Convolutional Pose Machines
[2] Paper of Stacked Hourglass
[3] Paper of MobileNet V2
[4] Repository PoseEstimation-CoreML
[5] Repository of tf-pose-estimation
[6] Developer guide of TensorFlow Lite
[7] Mace documentation

Related Projects

Other Pose Estimation Projects

Contributing

This section will be moved to a separate .md file.

Contributions of any kind are welcome, including improvements to the project.

License

Apache License 2.0

tf2-mobile-2d-single-pose-estimation's Issues

Training trend seems weird on COCO Datasets

This is my configuration file, rewritten in YAML format for my own convenience.

dataset:
  dataset_root_path: /datasets/
  dataset_directory_name: coco_dataset
  train_images: train2017
  train_annotation: annotations_trainval2017/person_keypoints_train2017.json
  valid_images: val2017
  valid_annotation: annotations_trainval2017/person_keypoints_val2017.json

model:
  model_name: cpm
  model_subname: 
  batch_size: 3
  input_width: 192
  input_height: 192
  output_width: 24
  output_height: 24

preprocessing:
  is_scale: True
  is_rotate: True
  is_flipping: True
  is_resize_shortest_edge: True
  is_crop: True
  rotate_min_degree: -15.0
  rotate_max_degree: 15.0
  heatmap_std: 5.0

training:
  batch_size: 32
  learning_rate: 0.001
  epsilon: +1.e-8
  number_of_epoch: 200
  period_echo: 100
  period_save_model: 5000
  period_tensorboard: 10
  period_valid_image: 5000
  valid_pckh: True
  pckh_distance_ratio: 0.5
  multiprocessing_num: 4

I trained the CPM model for days with this configuration, and TensorBoard showed no improvement in either the loss or the PCKh value over the course of training.

[image]

The heatmap results likewise showed no improvement.
[image: result2-090000]

When I switched the dataset from coco to ai_challenger, the model seemed to train fine.
[image]

I suspect the problem is in the data preparation steps, but I could not find the specific code to fix. Do you have any idea what might cause this?

RAM memory increases

I use Colab to retrain with custom data. I noticed that after the model is saved (at 5000 steps), RAM usage increases. It could also be caused by the val_step function or something else. Has anyone else come across this?

training was stopped when reaching num_train_samples (I guess)

Caused by op 'GPU_0/sub_4', defined at:
  File "src/train.py", line 256, in <module>
    tf.app.run()
  File "/home/mot/.conda/envs/pefm-env/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "src/train.py", line 158, in main
    reuse_variable)
  File "src/train.py", line 48, in get_loss_and_output
    loss_l2 = tf.nn.l2_loss(tf.concat(pred_heat, axis=0) - input_heat, name='loss_heatmap_stage%d' % idx)
  File "/home/mot/.conda/envs/pefm-env/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 979, in binary_op_wrapper
    return func(x, y, name=name)
  File "/home/mot/.conda/envs/pefm-env/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 8009, in sub
    "Sub", x=x, y=y, name=name)
  File "/home/mot/.conda/envs/pefm-env/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/mot/.conda/envs/pefm-env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/home/mot/.conda/envs/pefm-env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Incompatible shapes: [16,48,48,14] vs. [12,48,48,14]
         [[Node: GPU_0/sub_4 = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](GPU_0/upsample_for_loss_0, IteratorGetNext_1/_4657)]]
         [[Node: GPU_0/hourglass_out_3_1/_4663 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2975_GPU_0/hourglass_out_3_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
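
The two shapes in the error, [16,48,48,14] and [12,48,48,14], differ only in the batch dimension. One plausible reading, which is an assumption and not confirmed in this thread, is that the last batch of an epoch held 12 samples while the graph expected 16. The sketch below shows the arithmetic; the sample count of 60 is made up so that the remainder is 12.

```python
def batch_sizes(num_samples, batch_size, drop_remainder=False):
    """Return the size of every batch produced from `num_samples` items."""
    full_batches, rest = divmod(num_samples, batch_size)
    sizes = [batch_size] * full_batches
    if rest and not drop_remainder:
        sizes.append(rest)  # the final, smaller batch
    return sizes


if __name__ == "__main__":
    # 60 is a hypothetical sample count: 3 full batches of 16 plus 12 left over.
    print(batch_sizes(60, 16))
    print(batch_sizes(60, 16, drop_remainder=True))
```

If that reading is right, dropping the incomplete batch (for example with the drop_remainder argument of tf.data's Dataset.batch) or making the loss independent of a fixed batch size would be two possible directions to try.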

KeyError: 'extra' in train.py

Output:
"""
tensorflow version : 2.2.0
keras version : 2.3.0-tf
config/dataset/coco2017-gpu.cfg
config/training/experiment01.cfg
Traceback (most recent call last):
  File "train.py", line 78, in <module>
    for key in parser["extra"]:
  File "/usr/local/lib/python3.6/configparser.py", line 959, in __getitem__
    raise KeyError(key)
KeyError: 'extra'
"""

[Building the model in google colab(https://colab.research.google.com/)]
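
The traceback shows parser["extra"] raising KeyError, which is what configparser does when the loaded .cfg files have no [extra] section. A minimal sketch of a more tolerant lookup follows; the function name read_extra_options and the sample section contents are hypothetical, and this is not the fix applied in the repository.

```python
import configparser


def read_extra_options(cfg_text):
    """Return the [extra] section as a dict, or {} when the section is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    if not parser.has_section("extra"):
        return {}
    return dict(parser["extra"])


if __name__ == "__main__":
    # Made-up config text without an [extra] section.
    sample = "[training]\nlearning_rate = 0.001\n"
    print(read_extra_options(sample))
```

Alternatively, adding an empty [extra] section to the .cfg file avoids the error without touching the code.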

Convert Eager to Keras fit in training

Related script

import tensorflow as tf

# loss_object, optimizer, and train_loss are assumed to be defined elsewhere in the training script.
@tf.function
def train_step(model, images, labels):
    with tf.GradientTape() as tape:
        # The model returns a list of predictions; one loss is computed per entry.
        model_output = model(images)
        predictions_layers = model_output
        losses = [loss_object(labels, predictions) for predictions in predictions_layers]
        total_loss = tf.math.add_n(losses)
    # Peak value of the final prediction.
    max_val = tf.math.reduce_max(predictions_layers[-1])
    gradients = tape.gradient(total_loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    train_loss(total_loss)
    return total_loss, losses[-1], max_val

Sequential training

Usage

python train.py --dataset_config=config/dataset/coco2017-gpu.cfg --experiment_config=config/training/experiment01.cfg
python train.py --dataset_config=config/dataset/coco2017-gpu.cfg --experiment_config=config/training/experiment02.cfg
python train.py --dataset_config=config/dataset/coco2017-gpu.cfg --experiment_config=config/training/experiment03.cfg
python train.py --dataset_config=config/dataset/coco2017-gpu.cfg --experiment_config=config/training/experiment04.cfg
python train.py --dataset_config=config/dataset/coco2017-gpu.cfg --experiment_config=config/training/experiment05.cfg
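
The same sequence could also be driven from a small Python script, which gives a natural place to hook in notifications or reports later. This is only a sketch: build_commands and run_sequentially are hypothetical helpers, and the experiment file names simply follow the pattern in the commands above.

```python
import subprocess
import sys

DATASET_CONFIG = "config/dataset/coco2017-gpu.cfg"


def build_commands(experiment_ids):
    """Build one train.py command per experiment number."""
    return [
        [
            sys.executable,
            "train.py",
            f"--dataset_config={DATASET_CONFIG}",
            f"--experiment_config=config/training/experiment{exp_id:02d}.cfg",
        ]
        for exp_id in experiment_ids
    ]


def run_sequentially(commands):
    """Run each command in order and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands only; call run_sequentially(...) to actually train.
    for command in build_commands(range(1, 6)):
        print(" ".join(command))
```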

Features

  • Send a Slack or email notification when each experiment finishes
  • Write report automatically

Training doesn't work

First of all, thank you for sharing the code 👍🏻 I'm trying to get into ML for my project. I've installed all dependencies successfully, downloaded the dataset, and started the training script with python train.py.

It seems to work fine. I'm getting output like this:

Train for 100 steps
Epoch 1/500
2019-11-20 05:27:23.706954: I tensorflow/core/profiler/lib/profiler_session.cc:184] Profiler session started.
 99/100 [============================>.] - ETA: 2s - loss: 0.2444   
Epoch 00001: saving model to /Users/aeyoa/Documents/Sites/tf2-mpe/outputs/models/11200527_hg_lr0.0001.hdf5
100/100 [==============================] - 251s 3s/step - loss: 0.2422

After 500 epochs I'm getting loss of 0.0038

Epoch 500/500
 99/100 [============================>.] - ETA: 1s - loss: 0.0038  
Epoch 00500: saving model to /Users/aeyoa/Documents/Sites/tf2-mpe/outputs/models/11200527_hg_lr0.0001.hdf5
100/100 [==============================] - 208s 2s/step - loss: 0.0038

But when I'm checking the progress in Tensorboard the training seems to be broken: heatmaps are changing randomly. Here are a few examples:

After 245 epochs
[Screenshot 2019-11-20 at 19 30 52]

After 246 epochs: suddenly white
[Screenshot 2019-11-20 at 19 37 03]

After 306 epochs: still no results
[Screenshot 2019-11-20 at 23 01 59]

After fitting, I checked the model with model.predict and still didn't get any reasonable heatmaps.

Have you tried the latest code for training? Does it work? Do you have any idea why this isn't working?

Thank you

from_concrete_function()

'TFLiteConverterV2' has no attribute 'from_concrete_function'?

The reason is that TensorFlow 2.0 renamed the method from from_concrete_function(func) to from_concrete_functions(funcs), which takes a list of concrete functions. So we only need to change the code to converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])

Evaluate PCKh chart with saved models

  • write on tensorboard
  • evaluate pckh for each part

Example PCKh Table

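| Name | Head | Neck | Shoulder | Elbow | Wrist | Hip | Knee | Ankle | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| mv2 hourglass | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| mv2 cpm | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| mv2 simplepose | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| resnet18 simplepose | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |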
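
For reference, PCKh counts a predicted keypoint as correct when it lies within a fraction of the head segment length from the ground truth (0.5 for PCKh@0.5). The sketch below illustrates that definition in plain Python; it is not the repository's evaluate.py, the function name and sample coordinates are made up, and it assumes every keypoint is labeled.

```python
import math


def pckh(predicted, ground_truth, head_size, ratio=0.5):
    """Fraction of keypoints predicted within ratio * head_size of the ground truth."""
    threshold = ratio * head_size
    hits = sum(
        1
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return hits / len(ground_truth)


if __name__ == "__main__":
    # Made-up (x, y) coordinates for three keypoints and a head size of 8 pixels.
    predicted = [(10, 10), (20, 22), (40, 40)]
    ground_truth = [(10, 11), (20, 20), (30, 30)]
    print("PCKh@0.5:", pckh(predicted, ground_truth, head_size=8))
```

A per-part score such as the Shoulder or Knee column would apply the same test to just the keypoints of that part and average over all evaluated images.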

Model Files Needed

Training takes too long, in my case about 200 hours for one model. Could you please add trained models to the project so that we can compare performance?

Replace config python code to .cfg file

  • replace model_config.py → config/experiment01.cfg
  • replace config/train_config.py → config/experiment01.cfg
  • create config/dataset/ai_challenge-gpu.cfg
  • create config/dataset/coco2017-gpu.cfg

getting error while running train.py in windows

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = main(fd)
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\spawn.py", line 114, in main
    prepare(preparation_data)
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\runpy.py", line 96, in run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\runpy.py", line 85, in run_code
    exec(code, run_globals)
  File "C:\Users\Harsh\Desktop\tf2-mobile\train.py", line 135, in <module>
    from evaluate import calculate_total_pckh
  File "C:\Users\Harsh\Desktop\tf2-mobile\evaluate.py", line 105, in <module>
    manager = multiprocessing.Manager()
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\context.py", line 56, in Manager
    m.start()
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\managers.py", line 563, in start
    self._process.start()
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

Please help me to solve this.
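
The traceback shows multiprocessing.Manager() being created at import time in evaluate.py (line 105). On Windows, child processes are started with spawn and re-import the main module, so module-level process creation triggers this RuntimeError. A minimal sketch of the idiom the error message asks for follows; summarize and the sample values are made up, and this is not the repository's actual fix.

```python
import multiprocessing


def summarize(scores):
    """Pure helper: average of the collected scores (0.0 when empty)."""
    return sum(scores) / len(scores) if scores else 0.0


def main():
    # Creating the Manager inside a function, not at import time, keeps
    # spawned child processes from re-running it when they re-import the module.
    with multiprocessing.Manager() as manager:
        shared_scores = manager.list()
        shared_scores.extend([0.5, 1.0])  # made-up values for illustration
        print("mean score:", summarize(list(shared_scores)))


if __name__ == "__main__":
    multiprocessing.freeze_support()  # only needed for frozen Windows executables
    main()
```

Applied to this project, that would mean moving the Manager creation in evaluate.py into a function and guarding the entry point of train.py the same way.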

Update README for 1.0.0

  • Add results of experiments
  • Change training script command
  • Add new supporting models
  • Add future works

Future Works

  • Convert to S4TF
  • Rearrange the pre-processing code
  • Converting script for Core ML
  • Evaluate PCKh with Core ML model(.mlmodel)

Getting confidence values from the body points

I'm using the Pose Estimation model for Android, and I was wondering how I can get the confidence value for a body point.

I'm also using the iOS/CoreML model and it has a confidence value.
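
A common convention for heatmap-based pose models, assumed here and not confirmed in this thread, is to take the peak value of each keypoint's heatmap as its confidence and the peak location as its position. The sketch below shows that on a tiny made-up heatmap; the function name is hypothetical and the same loop can be ported to Kotlin or Java on Android.

```python
def keypoint_from_heatmap(heatmap):
    """Return (row, col, confidence) of the peak in a 2D heatmap given as a list of rows."""
    best_row, best_col, best_val = 0, 0, heatmap[0][0]
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if value > best_val:
                best_row, best_col, best_val = r, c, value
    return best_row, best_col, best_val


if __name__ == "__main__":
    # Made-up 3x3 heatmap for a single keypoint.
    heatmap = [
        [0.0, 0.1, 0.0],
        [0.2, 0.9, 0.3],
        [0.0, 0.1, 0.0],
    ]
    print(keypoint_from_heatmap(heatmap))
```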
