Discussions from tucan9389's tf2-mobile-2d-single-pose-estimation

Purpose

For multiple model experiments with config files

First of all, thank you for sharing the code 👍🏻 I'm trying to get into ML for my project. I've installed all dependencies successfully, downloaded the dataset, and started the training script with python train.py.

It seems to work fine. I'm getting output like this:

Train for 100 steps
Epoch 1/500
2019-11-20 05:27:23.706954: I tensorflow/core/profiler/lib/profiler_session.cc:184] Profiler session started.
 99/100 [============================>.] - ETA: 2s - loss: 0.2444   
Epoch 00001: saving model to /Users/aeyoa/Documents/Sites/tf2-mpe/outputs/models/11200527_hg_lr0.0001.hdf5
100/100 [==============================] - 251s 3s/step - loss: 0.2422

After 500 epochs I'm getting a loss of 0.0038

Epoch 500/500
 99/100 [============================>.] - ETA: 1s - loss: 0.0038  
Epoch 00500: saving model to /Users/aeyoa/Documents/Sites/tf2-mpe/outputs/models/11200527_hg_lr0.0001.hdf5
100/100 [==============================] - 208s 2s/step - loss: 0.0038

But when I'm checking the progress in Tensorboard the training seems to be broken: heatmaps are changing randomly. Here are a few examples:

After 245 epochs

After 246 epochs — suddenly white

After 306 epochs — still no results

After fitting, I checked the model with model.predict and still didn't get any reasonable heatmap.

Have you tried the latest code for training? Does it work? Do you have any idea why this isn't working?

Thank you

🤛 tf2-mobile-pose-estimation-1.0 release

If you want to contribute, send a PR to the pose-estimation-for-mobile-1.0 branch.

restore saved checkpoint

show predicted images containing a target image, true heatmap and predicted heatmap on tensorboard each epoch

This feature is already implemented in my private repo. I'll apply it here as soon as possible.

from_concrete_function()

'TFLiteConverterV2' has no attribute 'from_concrete_function'?

The reason is that TensorFlow 2.0 renamed the method from 'from_concrete_function(func)' to 'from_concrete_functions(funcs)', which takes a list of concrete functions. So we only need to change the code to 'converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])'

Support tensorboard

Reference

https://www.tensorflow.org/tensorboard/get_started#using_tensorboard_with_other_methods

Convert Eager to Keras fit in training

Lines 112 to 125 in 82a0135

    
    @tf.function
    def train_step(model, images, labels):
        with tf.GradientTape() as tape:
            model_output = model(images)
            predictions_layers = model_output
            losses = [loss_object(labels, predictions) for predictions in predictions_layers]
            total_loss = tf.math.add_n(losses)

        max_val = tf.math.reduce_max(predictions_layers[-1])

        gradients = tape.gradient(total_loss, model.trainable_variables)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))
        train_loss(total_loss)
        return total_loss, losses[-1], max_val

Set the layer names when release

Update exp001~exper005

Is there a mature iOS app that can be used directly?

use aleju/imgaug instead of tensorpack

https://github.com/aleju/imgaug

Training trend seems weird on COCO Datasets

This is my configuration file, rewritten in YAML format for my own convenience.

dataset:
  dataset_root_path: /datasets/
  dataset_directory_name: coco_dataset
  train_images: train2017
  train_annotation: annotations_trainval2017/person_keypoints_train2017.json
  valid_images: val2017
  valid_annotation: annotations_trainval2017/person_keypoints_val2017.json

model:
  model_name: cpm
  model_subname: 
  batch_size: 3
  input_width: 192
  input_height: 192
  output_width: 24
  output_height: 24

preprocessing:
  is_scale: True
  is_rotate: True
  is_flipping: True
  is_resize_shortest_edge: True
  is_crop: True
  rotate_min_degree: -15.0
  rotate_max_degree: 15.0
  heatmap_std: 5.0

training:
  batch_size: 32
  learning_rate: 0.001
  epsilon: +1.e-8
  number_of_epoch: 200
  period_echo: 100
  period_save_model: 5000
  period_tensorboard: 10
  period_valid_image: 5000
  valid_pckh: True
  pckh_distance_ratio: 0.5
  multiprocessing_num: 4

I trained the CPM model for days with this configuration, but in TensorBoard neither the loss nor the PCKh value improved over the course of training.

The predicted heatmaps likewise showed no improvement.

When I switched the dataset from COCO to ai_challenger, the model seemed to train fine.

I suspect the problem is in the data preparation steps, but I could not find the specific code to fix. Do you have any idea what could cause this?

Evaluate with MPII

MobileNetV2 base CPM

Structure

Test Points on Future

[improving pck] Use MobileNetV3?
[lightweight] Remove b1 and b3 layer on MobileNetV2
~~[lightweight] Remove upsample on CPM~~ → slower

Reference

port to tf.keras from slim

RAM memory increases

I use Colab to retrain with custom data. I noticed that RAM usage increases after the model is saved (5000 steps). It could also be caused by the val_step function or something else. Has anyone come across a case like this?

Regarding changing augmentation to improve accuracy

The current CPM model provided in the https://github.com/edvardHua/PoseEstimationForMobile repository doesn't properly detect people lying on the ground or doing a headstand.

Would changing the augmentation degrees from:

rotate_min_degree: -15.0
rotate_max_degree: 15.0

to:

rotate_min_degree: -90.0   #(or -180)
rotate_max_degree: 90.0    #(or 180)

make any difference?
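
Whether a wider range helps depends on the data, but the mechanics are easy to check: each sample gets a random angle drawn from the configured range, and the keypoints have to be rotated by the same angle as the image. The following is a minimal sketch, assuming rotation about the image center and (x, y) pixel coordinates. The function names are hypothetical and not taken from the repository.

```python
import math
import random


def sample_angle(min_degree, max_degree, rng=random):
    """Draw a rotation angle uniformly from [min_degree, max_degree]."""
    return rng.uniform(min_degree, max_degree)


def rotate_keypoints(keypoints, angle_deg, width, height):
    """Rotate (x, y) keypoints about the image center by angle_deg.

    The sign convention here is an assumption; it must match whatever
    routine rotates the image itself.
    """
    cx, cy = width / 2.0, height / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for x, y in keypoints:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t))
    return rotated


angle = sample_angle(-90.0, 90.0)
points = rotate_keypoints([(192.0, 96.0)], 90.0, width=192, height=192)
```

With a ±90 or ±180 degree range the network sees sideways and upside-down poses during training, which is the case the question is about. Whether accuracy on upright poses changes would have to be measured.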

Support custom dataset annotated yourself

The output (heatmap shape) should be tunable

Evaluate PCKh chart with saved models

write on tensorboard
evaluate pckh for each part

Example PCKh Table

Name	Head	Neck	Shoulder	Elbow	Wrist	Hip	Knee	Ankle	Total
mv2 hourglass	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`
mv2 cpm	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`
mv2 simplepose	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`
resnet18 simplepose	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`	`0.00`
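
For reference, PCKh counts a predicted keypoint as correct when its distance to the ground truth is within a fraction of the head segment length. The sketch below computes that per part. It is an illustration under stated assumptions, not the repository's evaluation code: the function name and input layout are hypothetical, and invisible keypoints are not handled.

```python
import math


def pckh_per_part(predictions, ground_truths, head_sizes, ratio=0.5):
    """Return, for each part, the fraction of samples predicted correctly.

    predictions, ground_truths: one list per sample of (x, y) per part.
    head_sizes: head segment length for each sample.
    ratio: distance threshold as a fraction of the head size.
    """
    num_parts = len(ground_truths[0])
    correct = [0] * num_parts
    for pred, gt, head in zip(predictions, ground_truths, head_sizes):
        threshold = ratio * head
        for part, ((px, py), (gx, gy)) in enumerate(zip(pred, gt)):
            if math.hypot(px - gx, py - gy) <= threshold:
                correct[part] += 1
    return [count / len(ground_truths) for count in correct]


preds = [[(0, 0), (10, 10)], [(0, 0), (10, 10)]]
truths = [[(0, 3), (10, 10)], [(0, 6), (10, 10)]]
scores = pckh_per_part(preds, truths, head_sizes=[10, 10], ratio=0.5)
```

Averaging the per-part values would give the Total column.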

Use class concept for OOP

use class concept for OOP

MobileNetV2 based Hourglass

working branch: new-model/mv2_hourglass

Structure

Test Points on Future

[lightweight] Change stage (4 → 3, 2, 1)
[improving pck] Change stage (4 → 5, 6, 8, 16)
[improving pck] More hourglass (1 → 2, 3, 4, 8)

Reference

Experiments

Table

Use EfficientNet as backbone

Replace config Python code with .cfg file

replace model_config.py → config/experiment01.cfg
replace config/train_config.py → config/experiment01.cfg
create config/dataset/ai_challenge-gpu.cfg
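
A .cfg file of this kind can be read with the standard-library configparser module. The sketch below is only an illustration: the section names and keys are hypothetical and may not match the layout the repository ends up using.

```python
import configparser

# Hypothetical experiment file contents, inlined so the example is self-contained.
EXAMPLE_CFG = """
[model]
model_name = cpm
input_width = 192

[training]
learning_rate = 0.001
batch_size = 32
"""


def load_experiment(text):
    """Parse an experiment config and convert values to their Python types."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "model_name": parser["model"]["model_name"],
        "input_width": parser.getint("model", "input_width"),
        "learning_rate": parser.getfloat("training", "learning_rate"),
        "batch_size": parser.getint("training", "batch_size"),
    }


config = load_experiment(EXAMPLE_CFG)
```

configparser stores every value as a string, so the typed getters (getint, getfloat, getboolean) are what replace the native Python types of the old config modules.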
create config/dataset/coco2017-gpu.cfg

Not showing output even though the model achieved 93% validation accuracy

Hi, thanks for the code. I trained this model to 93% accuracy, but the model still did not give any output.

Convert to Core ML

Getting confidence values from the body points

I'm using the Pose Estimation model for Android, and I was wondering how I can get the confidence value for a body point.

I'm also using the iOS/CoreML model and it has a confidence value.
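
One common convention, assumed here rather than confirmed by the Android code, is to take the peak value of each keypoint's heatmap as its confidence and the peak location as its position. The sketch uses plain nested lists so it runs without extra dependencies. The function name is hypothetical.

```python
def keypoints_with_confidence(heatmaps):
    """Return one (x, y, confidence) tuple per keypoint.

    heatmaps: nested list shaped [height][width][num_keypoints],
    i.e. a single model output with the batch dimension removed.
    """
    height = len(heatmaps)
    width = len(heatmaps[0])
    num_keypoints = len(heatmaps[0][0])
    results = []
    for k in range(num_keypoints):
        best = (0, 0, float("-inf"))
        for y in range(height):
            for x in range(width):
                value = heatmaps[y][x][k]
                if value > best[2]:
                    best = (x, y, value)
        results.append(best)
    return results


example = [
    [[0.1, 0.0], [0.9, 0.2]],
    [[0.3, 0.7], [0.0, 0.1]],
]
peaks = keypoints_with_confidence(example)
```

The coordinates are in heatmap space and still need to be scaled to the input image size.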

How can I add UpSampling2D at the end of mv2_cpm?

In https://github.com/edvardHua/PoseEstimationForMobile, the CPM model has a ResizeBilinear op at the end of the network, so its output shape is 1 x 96 x 96 x 17. How can I add UpSampling2D at the end of mv2_cpm?

Training stopped when reaching num_train_samples (I guess)

Caused by op 'GPU_0/sub_4', defined at:
  File "src/train.py", line 256, in <module>
    tf.app.run()
  File "/home/mot/.conda/envs/pefm-env/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "src/train.py", line 158, in main
    reuse_variable)
  File "src/train.py", line 48, in get_loss_and_output
    loss_l2 = tf.nn.l2_loss(tf.concat(pred_heat, axis=0) - input_heat, name='loss_heatmap_stage%d' % idx)
  File "/home/mot/.conda/envs/pefm-env/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 979, in binary_op_wrapper
    return func(x, y, name=name)
  File "/home/mot/.conda/envs/pefm-env/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 8009, in sub
    "Sub", x=x, y=y, name=name)
  File "/home/mot/.conda/envs/pefm-env/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/mot/.conda/envs/pefm-env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3392, in create_op
    op_def=op_def)
  File "/home/mot/.conda/envs/pefm-env/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1718, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Incompatible shapes: [16,48,48,14] vs. [12,48,48,14]
         [[Node: GPU_0/sub_4 = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](GPU_0/upsample_for_loss_0, IteratorGetNext_1/_4657)]]
         [[Node: GPU_0/hourglass_out_3_1/_4663 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2975_GPU_0/hourglass_out_3_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
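
The shapes [16,48,48,14] vs. [12,48,48,14] are consistent with a final batch that is smaller than the batch size the graph was built for, which happens when the number of samples is not a multiple of the batch size. This reading is an assumption based on the trace alone. In TensorFlow versions where tf.data.Dataset.batch accepts drop_remainder, passing drop_remainder=True discards the short batch. The plain-Python sketch below only shows the arithmetic. The sample count is hypothetical.

```python
def batch_sizes(num_samples, batch_size, drop_remainder=False):
    """Return the size of every batch produced from num_samples items."""
    full_batches, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full_batches
    if remainder and not drop_remainder:
        sizes.append(remainder)  # the short final batch
    return sizes


with_short_batch = batch_sizes(44, 16)
without_short_batch = batch_sizes(44, 16, drop_remainder=True)
```

With 44 samples and a batch size of 16, the last batch holds 12 items, matching the mismatch in the error message.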

Filter the COCO keypoint dataset to single-person images only and convert it

Goal 🎯

Filter the COCO keypoint dataset so that only images containing a single person remain, and convert the format
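
The filtering step can be sketched on the COCO annotation JSON structure (images, annotations, categories). This is a minimal illustration, not the repository's script: the function name is hypothetical, a person only counts if it has at least one labeled keypoint, and the format conversion is left out because the target format is not specified here.

```python
from collections import Counter

PERSON_CATEGORY_ID = 1  # "person" in the COCO category list


def filter_single_person(coco):
    """Keep only images with exactly one person that has labeled keypoints."""
    person_anns = [
        ann for ann in coco["annotations"]
        if ann["category_id"] == PERSON_CATEGORY_ID
        and ann.get("num_keypoints", 0) > 0
    ]
    counts = Counter(ann["image_id"] for ann in person_anns)
    keep_ids = {image_id for image_id, n in counts.items() if n == 1}
    return {
        "images": [img for img in coco["images"] if img["id"] in keep_ids],
        "annotations": [a for a in person_anns if a["image_id"] in keep_ids],
        "categories": coco.get("categories", []),
    }


sample = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "annotations": [
        {"image_id": 1, "category_id": 1, "num_keypoints": 5},
        {"image_id": 2, "category_id": 1, "num_keypoints": 4},
        {"image_id": 2, "category_id": 1, "num_keypoints": 7},
        {"image_id": 3, "category_id": 1, "num_keypoints": 0},
    ],
}
filtered = filter_single_person(sample)
```

In practice the input dictionary would come from json.load on a file such as person_keypoints_train2017.json.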

KeyError: 'extra' in train.py

Output:
"""
tensorflow version : 2.2.0
keras version : 2.3.0-tf
config/dataset/coco2017-gpu.cfg
config/training/experiment01.cfg
Traceback (most recent call last):
  File "train.py", line 78, in <module>
    for key in parser["extra"]:
  File "/usr/local/lib/python3.6/configparser.py", line 959, in __getitem__
    raise KeyError(key)
KeyError: 'extra'
"""

Building the model in Google Colab (https://colab.research.google.com/)
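
The KeyError comes from configparser: parser["extra"] raises it when the loaded configuration has no [extra] section. Note that ConfigParser.read silently skips files it cannot open, so a wrong path produces the same error. If the section is meant to be optional (an assumption), the lookup can be guarded as in this sketch. The function name is hypothetical.

```python
import configparser


def read_extra_options(cfg_text):
    """Return the [extra] section as a dict, or an empty dict if absent."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    if not parser.has_section("extra"):
        return {}
    return dict(parser["extra"])


missing = read_extra_options("[training]\nbatch_size = 32\n")
present = read_extra_options("[extra]\nnote = test\n")
```

Checking the return value of parser.read, which lists the files that were actually parsed, is a quick way to tell a missing file from a missing section.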

Restructure folder hierarchy

Update README for 1.0.0

Add results of experiments
Change training script command
Add new supporting models
Add future works

Future Works

Convert to S4TF
Rearrange the pre-processing code
Converting script for Core ML
Evaluate PCKh with Core ML model(.mlmodel)

How can I get the epoch accuracy?

Sequential training

Usage

python train.py --dataset_config=config/dataset/coco2017-gpu.cfg --experiment_config=config/training/experiment01.cfg
python train.py --dataset_config=config/dataset/coco2017-gpu.cfg --experiment_config=config/training/experiment02.cfg
python train.py --dataset_config=config/dataset/coco2017-gpu.cfg --experiment_config=config/training/experiment03.cfg
python train.py --dataset_config=config/dataset/coco2017-gpu.cfg --experiment_config=config/training/experiment04.cfg
python train.py --dataset_config=config/dataset/coco2017-gpu.cfg --experiment_config=config/training/experiment05.cfg
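
The same sequence can be driven from a small Python launcher, which also gives a natural place to hook in the notifications and reports listed under Features. This is a sketch only: the helper names are hypothetical, and just the two command-line flags shown above are assumed to exist.

```python
import subprocess
import sys


def build_commands(dataset_config, experiment_configs):
    """Build one train.py command line per experiment config."""
    return [
        [sys.executable, "train.py",
         "--dataset_config=" + dataset_config,
         "--experiment_config=" + experiment]
        for experiment in experiment_configs
    ]


def run_sequentially(commands):
    """Run the commands one after another and return their exit codes."""
    return [subprocess.run(command).returncode for command in commands]


commands = build_commands(
    "config/dataset/coco2017-gpu.cfg",
    ["config/training/experiment%02d.cfg" % i for i in range(1, 6)],
)
# run_sequentially(commands) would start the five training runs in order.
```

Each run starts only after the previous one has exited, and a non-zero exit code in the returned list marks a failed experiment.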

Features

Slack or mail on end of training for each experiment
Write report automatically
- model name
- PCKh@0.5
- saved_model size
- inference time

need to separate android repository

All in one dataset

ai_challenger
MPII
coco2017

Add guide for downloading datasets and configure the path

Datasets

ai challenger dataset
coco 2017 dataset

Related to #8

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Harsh\Desktop\tf2-mobile\train.py", line 135, in <module>
    from evaluate import calculate_total_pckh
  File "C:\Users\Harsh\Desktop\tf2-mobile\evaluate.py", line 105, in <module>
    manager = multiprocessing.Manager()
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\context.py", line 56, in Manager
    m.start()
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\managers.py", line 563, in start
    self._process.start()
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\Harsh\anaconda3\envs\posewin\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.
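
The trace shows multiprocessing.Manager() being created at import time in evaluate.py. On Windows, child processes are started with spawn, which re-imports the main module, so the import tries to start another process during bootstrapping. A minimal sketch of the usual remedy follows: create the Manager inside a function and call it from under the main guard. The function names and the doubling workload are hypothetical stand-ins, not the repository's code.

```python
import multiprocessing


def double_all(values, results):
    """Stand-in for the real evaluation work; appends to a list-like object."""
    for value in values:
        results.append(value * 2)


def evaluate(values):
    # The Manager is created at call time, not when the module is imported.
    with multiprocessing.Manager() as manager:
        shared = manager.list()
        double_all(values, shared)
        return list(shared)


if __name__ == "__main__":
    multiprocessing.freeze_support()  # only needed for frozen executables
    print(evaluate([1, 2, 3]))
```

With this layout, importing the module from a spawned child no longer starts a process, because nothing outside the guard touches multiprocessing.Manager().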

