The custom-object-detection from bourdakos1

Object detection with own model fails with ...

Hi Nick,
After successfully working through your tutorial "as is", I tried to create my own object detection model. I replaced your pictures with mine, used my own annotation XML files (boxes only), trained the model (the run was very short, only one global step ...), created the graph file, and then tested it with the following result. Something seems to go wrong with the "aspect" parameter set to "normal" ... but I do not know what that means ;-)

jre@ibm-jre-mbp  ==_-+- python object_detection/object_detection_runner.py
dyld: warning, LC_RPATH $ORIGIN/../../_solib_darwin_x86_64/_U_S_Stensorflow_Spython_C_Upywrap_Utensorflow_Uinternal.so___Utensorflow in /Users/jre/Library/Python/2.7/lib/python/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so being ignored in restricted program because it is a relative path
Loading model...
detecting...
2017-12-01 11:55:11.616214: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Traceback (most recent call last):
  File "object_detection/object_detection_runner.py", line 90, in <module>
    detect_objects(image_path)
  File "object_detection/object_detection_runner.py", line 63, in detect_objects
    plt.imshow(image_np, aspect = 'normal')
  File "/Users/jre/Library/Python/2.7/lib/python/site-packages/matplotlib/pyplot.py", line 3080, in imshow
    **kwargs)
  File "/Users/jre/Library/Python/2.7/lib/python/site-packages/matplotlib/__init__.py", line 1710, in inner
    return func(ax, *args, **kwargs)
  File "/Users/jre/Library/Python/2.7/lib/python/site-packages/matplotlib/axes/_axes.py", line 5189, in imshow
    self.set_aspect(aspect)
  File "/Users/jre/Library/Python/2.7/lib/python/site-packages/matplotlib/axes/_base.py", line 1273, in set_aspect
    self._aspect = float(aspect)  # raise ValueError if necessary
ValueError: could not convert string to float: normal
[ /Users/jre/Downloads/Watson/JRE-Object-Detection-master ]
jre@ibm-jre-mbp  ==_-+-

What is going wrong? Find the installed versions of numpy and matplotlib below:

jre@ibm-jre-mbp  ==_-+- pip list --format=legacy
altgraph (0.10.2)
appnope (0.1.0)
asn1crypto (0.23.0)
backports-abc (0.5)
backports.functools-lru-cache (1.4)
backports.shutil-get-terminal-size (1.0.0)
backports.weakref (1.0.post1)
bdist-mpkg (0.5.0)
bleach (2.1.1)
bonjour-py (0.3)
certifi (2017.11.5)
cffi (1.11.2)
chardet (3.0.4)
configparser (3.5.0)
cryptography (2.1.3)
cycler (0.10.0)
decorator (4.1.2)
entrypoints (0.2.3)
enum34 (1.1.6)
funcsigs (1.0.2)
functools32 (3.2.3.post2)
futures (3.1.1)
html5lib (1.0b10)
idna (2.6)
ipaddress (1.0.18)
ipykernel (4.6.1)
ipython (5.5.0)
ipython-genutils (0.2.0)
ipywidgets (7.0.5)
Jinja2 (2.10)
jsonschema (2.6.0)
jupyter (1.0.0)
jupyter-client (5.1.0)
jupyter-console (5.2.0)
jupyter-core (4.4.0)
lxml (4.1.1)
macholib (1.5.1)
Markdown (2.6.9)
MarkupSafe (1.0)
matplotlib (2.1.0)
mistune (0.8.1)
mock (2.0.0)
modulegraph (0.10.4)
nbconvert (5.3.1)
nbformat (4.4.0)
notebook (5.2.1)
numpy (1.13.3)
olefile (0.44)
pandocfilters (1.4.2)
pathlib2 (2.3.0)
pbr (3.1.1)
pexpect (4.3.0)
pickleshare (0.7.4)
Pillow (4.3.0)
pip (9.0.1)
prompt-toolkit (1.0.15)
protobuf (3.5.0.post1)
ptyprocess (0.5.2)
py2app (0.7.3)
pycparser (2.18)
Pygments (2.2.0)
pyobjc-core (2.5.1)
pyobjc-framework-Accounts (2.5.1)
pyobjc-framework-AddressBook (2.5.1)
pyobjc-framework-AppleScriptKit (2.5.1)
pyobjc-framework-AppleScriptObjC (2.5.1)
pyobjc-framework-Automator (2.5.1)
pyobjc-framework-CFNetwork (2.5.1)
pyobjc-framework-Cocoa (2.5.1)
pyobjc-framework-Collaboration (2.5.1)
pyobjc-framework-CoreData (2.5.1)
pyobjc-framework-CoreLocation (2.5.1)
pyobjc-framework-CoreText (2.5.1)
pyobjc-framework-DictionaryServices (2.5.1)
pyobjc-framework-EventKit (2.5.1)
pyobjc-framework-ExceptionHandling (2.5.1)
pyobjc-framework-FSEvents (2.5.1)
pyobjc-framework-InputMethodKit (2.5.1)
pyobjc-framework-InstallerPlugins (2.5.1)
pyobjc-framework-InstantMessage (2.5.1)
pyobjc-framework-LatentSemanticMapping (2.5.1)
pyobjc-framework-LaunchServices (2.5.1)
pyobjc-framework-Message (2.5.1)
pyobjc-framework-OpenDirectory (2.5.1)
pyobjc-framework-PreferencePanes (2.5.1)
pyobjc-framework-PubSub (2.5.1)
pyobjc-framework-QTKit (2.5.1)
pyobjc-framework-Quartz (2.5.1)
pyobjc-framework-ScreenSaver (2.5.1)
pyobjc-framework-ScriptingBridge (2.5.1)
pyobjc-framework-SearchKit (2.5.1)
pyobjc-framework-ServiceManagement (2.5.1)
pyobjc-framework-Social (2.5.1)
pyobjc-framework-SyncServices (2.5.1)
pyobjc-framework-SystemConfiguration (2.5.1)
pyobjc-framework-WebKit (2.5.1)
pyOpenSSL (17.4.0)
pyparsing (2.2.0)
pysolr (3.6.0)
python-dateutil (2.6.1)
pytz (2017.3)
pyzmq (16.0.3)
qtconsole (4.3.1)
requests (2.18.4)
scandir (1.6)
scipy (0.13.0b1)
setuptools (38.2.1)
simplegeneric (0.8.1)
singledispatch (3.4.0.3)
six (1.11.0)
subprocess32 (3.2.7)
tensorflow (1.4.0)
tensorflow-tensorboard (0.4.0rc3)
terminado (0.8)
testpath (0.3.1)
tornado (4.5.2)
traitlets (4.3.2)
urllib3 (1.22)
vboxapi (1.0)
watson-developer-cloud (1.0.0)
wcwidth (0.1.7)
webencodings (0.5.1)
Werkzeug (0.12.2)
wheel (0.30.0)
widgetsnbextension (3.0.8)
xattr (0.6.4)
zope.interface (4.1.1)
[ /Users/jre/Downloads/Watson/JRE-Object-Detection-master ]
jre@ibm-jre-mbp  ==_-+-
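
A possible explanation, offered as a sketch and not as the author's fix: the traceback shows matplotlib 2.1.0 calling float(aspect) and failing, so this version does not accept the string 'normal'. Older matplotlib releases treated 'normal' as an alias for 'auto', so passing aspect='auto' (or 'equal', or a number) on line 63 of object_detection_runner.py should avoid the error. The helper name compatible_aspect below is hypothetical and only illustrates the mapping.

```python
def compatible_aspect(aspect):
    """Map the legacy 'normal' alias to 'auto'; pass other values through."""
    if aspect == 'normal':
        return 'auto'
    return aspect


# Hypothetical usage inside detect_objects():
#   plt.imshow(image_np, aspect=compatible_aspect('normal'))
print(compatible_aspect('normal'))
```

Editing the call directly to plt.imshow(image_np, aspect='auto') has the same effect and needs no helper.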

Memory leak issue

tensorflow-gpu 1.3.0
tensorflow-tensorboard 0.1.8
Keras 2.0.6
Keras-Applications 1.0.6

I downloaded the models from the model zoo link and successfully ran two of them (ssd_mobilenet_v1_coco, ssd_inception_v2_coco) without any problems. But when I try the faster_rcnn_inception_resnet_v2_atrous_coco and rfcn_resnet101_coco models, they start working properly but consume almost all memory (62.8G/62.8G), so I cannot run them. Have you ever had this issue?

I have 2394*3062 PNG image files for training, so I resize the images to 600 * 767 (config file).
I am also using 'os.environ['CUDA_VISIBLE_DEVICES'] = '1'' and it works well.
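
The resize mentioned above can be checked with a little arithmetic. The sketch below assumes that keep_aspect_ratio_resizer scales the shorter side to min_dimension unless that would push the longer side past max_dimension, in which case the longer side is scaled to max_dimension instead. The function name resized_shape and the rounding are illustrative and not taken from the Object Detection API source.

```python
def resized_shape(height, width, min_dimension, max_dimension):
    """Return (new_height, new_width) after an aspect-preserving resize."""
    # First try: scale so the shorter side equals min_dimension.
    scale = min_dimension / float(min(height, width))
    new_h, new_w = int(round(height * scale)), int(round(width * scale))
    if max(new_h, new_w) > max_dimension:
        # Too large: scale so the longer side equals max_dimension instead.
        scale = max_dimension / float(max(height, width))
        new_h, new_w = int(round(height * scale)), int(round(width * scale))
    return new_h, new_w


print(resized_shape(2394, 3062, 600, 767))
```

Under these assumptions the 2394*3062 images come out at 600*767, which matches the values in the config.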

Total memory: 10.91GiB
Free memory: 10.75GiB
2018-10-23 10:55:29.290302: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 
2018-10-23 10:55:29.290311: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0:   Y 
2018-10-23 10:55:29.290323: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:06:00.0)
2018-10-23 10:55:31.938505: I tensorflow/core/common_runtime/simple_placer.cc:697] Ignoring device specification /device:GPU:0 for node 'prefetch_queue_Dequeue' because the input edge from 'prefetch_queue' is a reference connection and already has a device field set to /device:CPU:0
INFO:tensorflow:Restoring parameters from model.ckpt
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path train/model.ckpt
INFO:tensorflow:Starting Queues.
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Recording summary at step 0.
INFO:tensorflow:global step 1: loss = 1.5886 (24.888 sec/step)
INFO:tensorflow:global step 2: loss = 1.4227 (0.812 sec/step)
INFO:tensorflow:global step 3: loss = 1.2016 (0.875 sec/step)
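
One thing worth estimating is how much host RAM the original 2394*3062 images take once decoded. This is only a rough sketch: it assumes the input pipeline buffers decoded images at full resolution as float32 before the model resizes them, which the log does not confirm, and it reads the reported 62.8G as GiB. Both function names are made up for illustration.

```python
def decoded_image_bytes(height, width, channels=3, bytes_per_value=4):
    """Size in bytes of one decoded image (float32 by default)."""
    return height * width * channels * bytes_per_value


def images_that_fit(ram_bytes, image_bytes):
    """How many decoded images fit into the given amount of RAM."""
    return ram_bytes // image_bytes


image_bytes = decoded_image_bytes(2394, 3062)
ram_bytes = int(62.8 * 1024 ** 3)  # assumption: 62.8G means GiB
print(image_bytes / float(1024 ** 2))           # MiB per decoded image
print(images_that_fit(ram_bytes, image_bytes))  # images needed to fill RAM
```

Each decoded image is roughly 84 MiB, so a few hundred buffered images would be enough to fill the machine. If that is what is happening, shrinking the PNGs before building train.record, or lowering the input queue capacities in the pipeline config (if your version of the protos exposes them), are things to try.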

model {
  faster_rcnn {
    num_classes: 1
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 767
      }
    }
    feature_extractor {
      type: 'faster_rcnn_inception_resnet_v2'
      first_stage_features_stride: 8
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 8
        width_stride: 8
      }
    }
    first_stage_atrous_rate: 2
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 17
    maxpool_kernel_size: 1
    maxpool_stride: 1
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}

train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0003
          schedule {
            step: 0
            learning_rate: .0003
          }
          schedule {
            step: 900000
            learning_rate: .00003
          }
          schedule {
            step: 1200000
            learning_rate: .000003
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "train.record"
  }
  label_map_path: "annotations/label_map.pbtxt"
}

eval_config: {
  num_examples: 170
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "val.record"
  }
  label_map_path: "annotations/label_map.pbtxt"
  shuffle: false
  num_readers: 1
}

create_tf_record not up to date

There is no longer a create_tf_record.py in tools. There is a pet create-tf-record script, but that one requires trimaps and may not even be geared towards SSD.

Regarding Configuration

I want to train the model on 500 images for 200000 epochs.

What hardware configuration (RAM, NVIDIA graphics card) is required to finish it in 2 days?

Can anyone please suggest one?
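
The question can be turned into a throughput target. The sketch below assumes the 200000 refers to training steps (num_steps in the pipeline config) and not to full passes over the 500 images; the function name is hypothetical.

```python
def required_seconds_per_step(num_steps, days):
    """Seconds available per training step to finish within the deadline."""
    return days * 24 * 60 * 60 / float(num_steps)


print(required_seconds_per_step(200000, 2))
```

Two days is 172800 seconds, so each step has to take about 0.86 seconds or less. For comparison, the training log in the memory issue above reports 0.812 and 0.875 sec/step for a Faster R-CNN model at batch size 1 on a GeForce GTX 1080 Ti, so hardware in that class would be close to the limit for that model.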

Add wiki

How do I train and test?

scores drop to zero

Trying to reproduce your demo, I ran your training example with your images and "faster_rcnn_resnet101.config". Although the loss drops gradually, the prediction scores drop quickly to zero after a few hundred steps. Changing the learning rate didn't solve it.

Do you have any suggestion?

Error when I run the last command python object_detection/object_detection_runner.py

Traceback (most recent call last):
  File "object_detection/object_detection_runner.py", line 95, in <module>
    detect_objects(image_path)
  File "object_detection/object_detection_runner.py", line 64, in detect_objects
    plt.imshow(image_np, aspect = 'normal')
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/pyplot.py", line 3080, in imshow
    **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/__init__.py", line 1710, in inner
    return func(ax, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/axes/_axes.py", line 5189, in imshow
    self.set_aspect(aspect)
  File "/usr/local/lib/python2.7/dist-packages/matplotlib/axes/_base.py", line 1273, in set_aspect
    self._aspect = float(aspect) # raise ValueError if necessary
ValueError: could not convert string to float: normal

Failed to find any matching files for model.ckpt

.local/lib/python2.7/site-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.22) or chardet (2.3.0) doesn't match a supported version!
  RequestsDependencyWarning)
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Summary name Learning Rate is illegal; using Learning_Rate instead.
Traceback (most recent call last):
  File "object_detection/train.py", line 198, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "object_detection/train.py", line 194, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/media/admin1/8C4CB1B64CB19C00/Tej/ML/Custom-Object-Detection-master/object_detection/trainer.py", line 218, in train
    var_map, train_config.fine_tune_checkpoint))
  File "/media/admin1/8C4CB1B64CB19C00/Tej/ML/Custom-Object-Detection-master/object_detection/utils/variables_helper.py", line 122, in get_variables_available_in_checkpoint
    ckpt_reader = tf.train.NewCheckpointReader(checkpoint_path)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 110, in NewCheckpointReader
    return CheckpointReader(compat.as_bytes(filepattern), status)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for model.ckpt

Where am I going wrong?
Could anyone help me?
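
A likely cause, stated with some caution: fine_tune_checkpoint: "model.ckpt" names a checkpoint prefix and not a single file, and a relative path like this is looked up from the directory where train.py is started. A TensorFlow 1.x checkpoint from the model zoo normally consists of model.ckpt.index, model.ckpt.meta and model.ckpt.data-00000-of-00001, so those files need to sit next to the config (or the path in the config needs to point at them). The sketch below checks for them; both function names are hypothetical.

```python
import glob
import os


def checkpoint_files(prefix):
    """List files on disk that belong to the given checkpoint prefix."""
    return sorted(glob.glob(prefix + ".*"))


def checkpoint_looks_present(prefix):
    """True if both the .index file and at least one .data shard exist."""
    names = [os.path.basename(p) for p in checkpoint_files(prefix)]
    base = os.path.basename(prefix)
    has_index = base + ".index" in names
    has_data = any(n.startswith(base + ".data-") for n in names)
    return has_index and has_data


# Run from the directory where train.py is started.
print(checkpoint_looks_present("model.ckpt"))
```

If this prints False, copying the three model.ckpt.* files from the downloaded model archive into the project root is the first thing to try.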

Warnings when running create_tf_record.py

Hi Nick, I am not sure if I am doing something wrong. Running "python object_detection/create_tf_record.py" gives a warning, but train.record and val.record are created:

dyld: warning, LC_RPATH $ORIGIN/../../_solib_darwin_x86_64/_U_S_Stensorflow_Spython_C_Upywrap_Utensorflow_Uinternal.so___Utensorflow in /Users/jre/Library/Python/2.7/lib/python/site-packages/tensorflow/python/pywrap_tensorflow_internal.so being ignored in restricted program because it is a relative path
/Users/jre/Downloads/Watson/Custom-Object-Detection-master/object_detection/utils/dataset_util.py:75: FutureWarning: The behavior of this method will change in future versions. Use specific 'len(elem)' or 'elem is not None' test instead.
if not xml:
[ /Users/jre/Downloads/Watson/Custom-Object-Detection-master ]
jre@ibm-jre-mbp ==-+-

Is this ok?
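
Since both record files are written, the FutureWarning is most likely harmless: it only says that truth-testing an XML element ("if not xml:" at dataset_util.py line 75) may behave differently in future library versions. The sketch below shows the explicit test the warning asks for. It is a hypothetical re-implementation using the standard library's xml.etree.ElementTree as a stand-in for whatever parser the repository uses, not the repository's own code.

```python
import xml.etree.ElementTree as ET


def xml_to_dict(element):
    """Recursively convert an XML element into nested dicts."""
    # Explicit child-count test instead of the ambiguous `if not element:`.
    if len(element) == 0:
        return {element.tag: element.text}
    result = {}
    for child in element:
        child_result = xml_to_dict(child)
        if child.tag == "object":
            # An annotation may contain several <object> boxes.
            result.setdefault("object", []).append(child_result[child.tag])
        else:
            result[child.tag] = child_result[child.tag]
    return {element.tag: result}


sample = ET.fromstring(
    "<annotation><filename>a.jpg</filename>"
    "<object><name>ship</name></object></annotation>"
)
print(xml_to_dict(sample))
```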

My own images fail

Hi Nick,

I'm new to TensorFlow, so this may be a dumb question, but every output image that the program produces is resized to a height of 558, and I think that may be why the program doesn't recognise features in my test images.

Maybe you can help?

How to train coco preset models

The object I am targeting is a ship.
When I train with your project, the accuracy drops instead of improving.
This may be because a similar category already exists in the pretrained model and affects the weights.
Is there any way to keep training the original COCO model instead of creating a new one?

Consumes whole RAM and freezes system

Hi, you have given a very good explanation, but I have an issue with training.
I followed your instructions, but when I run train.py with the dataset you provide, my system freezes at step 0 and I have to restart it.
When I checked RAM usage with the htop command, I saw that it consumes all of the RAM.
I am using Linux 14.04 with 8 GB of RAM and 4 CPU cores.
Can this code run on a CPU, or is it only compatible with a GPU?

can't finetune ssd_inception with this repository, what's the reason?

python object_detection/train.py --logtostderr --train_dir=train/ --pipeline_config_path=ssd_inception_v2_coco.config
WARNING:tensorflow:From /home/prashant/Custom-Object-Detection/object_detection/trainer.py:176: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
Traceback (most recent call last):
  File "object_detection/train.py", line 198, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 124, in run
    _sys.exit(main(argv))
  File "object_detection/train.py", line 194, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/home/prashant/Custom-Object-Detection/object_detection/trainer.py", line 192, in train
    clones = model_deploy.create_clones(deploy_config, model_fn, [input_queue])
  File "/home/prashant/Custom-Object-Detection/slim/deployment/model_deploy.py", line 193, in create_clones
    outputs = model_fn(*args, **kwargs)
  File "/home/prashant/Custom-Object-Detection/object_detection/trainer.py", line 133, in _create_losses
    losses_dict = detection_model.loss(prediction_dict)
  File "/home/prashant/Custom-Object-Detection/object_detection/meta_architectures/ssd_meta_arch.py", line 431, in loss
    location_losses, cls_losses, prediction_dict, match_list)
  File "/home/prashant/Custom-Object-Detection/object_detection/meta_architectures/ssd_meta_arch.py", line 565, in _apply_hard_mining
    match_list=match_list)
  File "/home/prashant/Custom-Object-Detection/object_detection/core/losses.py", line 445, in __call__
    location_losses = tf.unstack(location_losses)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1054, in unstack
    (axis, -value_shape.ndims, value_shape.ndims))

When using ssd_mobilenet_v1, object_detection_runner.py raises a ValueError

This has happened with at least 3 images so far, and it was not a problem when these images were in the training set.

The error looks like this:

Traceback (most recent call last):
  File "object_detection/object_detection_runner.py", line 90, in <module>
    detect_objects(image_path)
  File "object_detection/object_detection_runner.py", line 43, in detect_objects
    image_np = load_image_into_numpy_array(image)
  File "object_detection/object_detection_runner.py", line 39, in load_image_into_numpy_array
    (im_height, im_width, 3)).astype(np.uint8)
ValueError: cannot reshape array of size 587214 into shape (606,969,3)

It seems odd to me that these 3 images cause an error during prediction when they were completely fine in the training set. Do you have any fix or suggestions for this?

Thank you,
Selin
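
The numbers in the error message hint at the cause. The array has 587214 values, and the reshape expects 606 * 969 * 3. Dividing the array size by the pixel count gives the number of channels the image really has. The helper name infer_channels is hypothetical.

```python
def infer_channels(array_size, height, width):
    """Number of values per pixel implied by a flat array size."""
    pixels = height * width
    if array_size % pixels != 0:
        raise ValueError("array size is not a multiple of height * width")
    return array_size // pixels


print(infer_channels(587214, 606, 969))
```

The result is 1, because 606 * 969 is exactly 587214. That suggests the failing images are single-channel (for example grayscale) files, while load_image_into_numpy_array assumes 3 channels. Converting the image with Pillow's image.convert('RGB') before the reshape is one thing to try.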

WARNING:root:Could not find annotations/xmls/

I was following your article on Medium and everything went well until I had to run python object_detection/create_tf_record.py. I have updated my trainval.txt and label_map.pbtxt. My XML files are in the folder where they are supposed to be, and the corresponding images are in the images folder. However, whenever I run create_tf_record.py, it reports that it cannot find the annotations. I am not sure what I am doing wrong. Please let me know if this can be fixed.
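
One way to narrow this down is to check, outside the script, which annotation files the names in trainval.txt would map to. The sketch below assumes that the first whitespace-separated token on each line of trainval.txt is the example name and that the script looks for <name>.xml inside annotations/xmls/. The function name and this layout are assumptions based on the warning text, not confirmed from the script.

```python
import os


def missing_annotations(trainval_path, xml_dir):
    """Return expected XML paths that do not exist on disk."""
    missing = []
    with open(trainval_path) as f:
        for line in f:
            tokens = line.split()
            if not tokens:
                continue  # skip blank lines
            xml_path = os.path.join(xml_dir, tokens[0] + ".xml")
            if not os.path.exists(xml_path):
                missing.append(xml_path)
    return missing


# Hypothetical usage from the project root:
#   print(missing_annotations("annotations/trainval.txt", "annotations/xmls"))
```

Typical culprits this would expose are file extensions left in trainval.txt, names that differ in case from the XML files, or running the script from a directory other than the project root.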

object_detection/protos/*.proto: No such file or directory

I get this error while compiling the protobuf libraries.

Incompatible shapes: [1,63,4] vs. [1,64,4] while training

INFO:tensorflow:global step 2: loss = 1.7981 (25.561 sec/step)
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>, Incompatible shapes: [1,63,4] vs. [1,64,4]
[[Node: gradients/Loss/BoxClassifierLoss/Loss/sub_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](gradients/Loss/BoxClassifierLoss/Loss/sub_grad/Shape, gradients/Loss/BoxClassifierLoss/Loss/sub_1_grad/Shape)]]

How to detect ships using video as input

Thank you for sharing. I want to know how to detect ships in a video.

training error

2018-03-04 13:19:20.290777: W tensorflow/core/framework/op_kernel.cc:1192] Resource exhausted: OOM when allocating tensor with shape[1,1024,51,38]
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.ResourceExhaustedError'>, OOM when allocating tensor with shape[1,1024,51,38]
[[Node: FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/FusedBatchNorm = FusedBatchNorm[T=DT_FLOAT, data_format="NHWC", epsilon=1.001e-05, is_training=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/Conv2D, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/gamma/read/_683, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/beta/read/_685, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/moving_mean/read/_687, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/moving_variance/read/_689)]]
[[Node: Reshape_24/_1255 = _HostRecvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4073_Reshape_24", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Caused by op u'FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/FusedBatchNorm', defined at:
File "object_detection/train.py", line 198, in
tf.app.run()
File "/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "object_detection/train.py", line 194, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/Documents/Custom-Object-Detection-master/object_detection/trainer.py", line 192, in train
clones = model_deploy.create_clones(deploy_config, model_fn, [input_queue])
File "/Documents/Custom-Object-Detection-master/slim/deployment/model_deploy.py", line 193, in create_clones
outputs = model_fn(*args, **kwargs)
File "/Documents/Custom-Object-Detection-master/object_detection/trainer.py", line 131, in _create_losses
prediction_dict = detection_model.predict(images)
File "/Documents/Custom-Object-Detection-master/object_detection/meta_architectures/faster_rcnn_meta_arch.py", line 513, in predict
image_shape) = self._extract_rpn_feature_maps(preprocessed_inputs)
File "/Documents/Custom-Object-Detection-master/object_detection/meta_architectures/faster_rcnn_meta_arch.py", line 652, in _extract_rpn_feature_maps
preprocessed_inputs, scope=self.first_stage_feature_extractor_scope)
File "/Documents/Custom-Object-Detection-master/object_detection/meta_architectures/faster_rcnn_meta_arch.py", line 131, in extract_proposal_features
return self._extract_proposal_features(preprocessed_inputs, scope)
File "/Documents/Custom-Object-Detection-master/object_detection/models/faster_rcnn_resnet_v1_feature_extractor.py", line 126, in _extract_proposal_features
scope=var_scope)
File "/Documents/Custom-Object-Detection-master/slim/nets/resnet_v1.py", line 298, in resnet_v1_101
reuse=reuse, scope=scope)
File "/Documents/Custom-Object-Detection-master/slim/nets/resnet_v1.py", line 216, in resnet_v1
net = resnet_utils.stack_blocks_dense(net, blocks, output_stride)
File "/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/Documents/Custom-Object-Detection-master/slim/nets/resnet_utils.py", line 185, in stack_blocks_dense
net = block.unit_fn(net, rate=rate, **dict(unit, stride=1))
File "/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/Documents/Custom-Object-Detection-master/slim/nets/resnet_v1.py", line 118, in bottleneck
activation_fn=None, scope='conv3')
File "/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1042, in convolution
outputs = normalizer_fn(outputs, **normalizer_params)
File "/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 643, in batch_norm
outputs = layer.apply(inputs, training=is_training)
File "/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 671, in apply
return self.call(inputs, *args, **kwargs)
File "/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 575, in call
outputs = self.call(inputs, *args, **kwargs)
File "/lib/python2.7/site-packages/tensorflow/python/layers/normalization.py", line 395, in call
return self._fused_batch_norm(inputs, training=training)
File "/lib/python2.7/site-packages/tensorflow/python/layers/normalization.py", line 302, in _fused_batch_norm
training, _fused_batch_norm_training, _fused_batch_norm_inference)
File "/lib/python2.7/site-packages/tensorflow/python/layers/utils.py", line 208, in smart_cond
return fn2()
File "/lib/python2.7/site-packages/tensorflow/python/layers/normalization.py", line 299, in _fused_batch_norm_inference
data_format=self._data_format)
File "/lib/python2.7/site-packages/tensorflow/python/ops/nn_impl.py", line 831, in fused_batch_norm
name=name)
File "/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 2034, in _fused_batch_norm
is_training=is_training, name=name)
File "/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
op_def=op_def)
File "~/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1470, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1,1024,51,38]
[[Node: FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/FusedBatchNorm = FusedBatchNorm[T=DT_FLOAT, data_format="NHWC", epsilon=1.001e-05, is_training=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/Conv2D, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/gamma/read/_683, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/beta/read/_685, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/moving_mean/read/_687, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/moving_variance/read/_689)]]
[[Node: Reshape_24/_1255 = _HostRecvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4073_Reshape_24", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Traceback (most recent call last):
File "object_detection/train.py", line 198, in
tf.app.run()
File "/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "object_detection/train.py", line 194, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/Documents/Custom-Object-Detection-master/object_detection/trainer.py", line 296, in train
saver=saver)
File "/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 775, in train
sv.stop(threads, close_summary_writer=True)
File "/lib/python2.7/contextlib.py", line 35, in exit
self.gen.throw(type, value, traceback)
File "/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 964, in managed_session
self.stop(close_summary_writer=close_summary_writer)
File "/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 792, in stop
stop_grace_period_secs=self._stop_grace_secs)
File "/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
six.reraise(*self._exc_info_to_raise)
File "/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 296, in stop_on_exception
yield
File "/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 494, in run
self.run_loop()
File "/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 994, in run_loop
self._sv.global_step])
File "/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 889, in run
run_metadata_ptr)
File "/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1120, in _run
feed_dict_tensor, options, run_metadata)
File "/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1317, in _do_run
options, run_metadata)
File "/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1336, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[1,1024,51,38]
[[Node: FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/FusedBatchNorm = FusedBatchNorm[T=DT_FLOAT, data_format="NHWC", epsilon=1.001e-05, is_training=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/Conv2D, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/gamma/read/_683, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/beta/read/_685, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/moving_mean/read/_687, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/moving_variance/read/_689)]]
[[Node: Reshape_24/_1255 = _HostRecvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4073_Reshape_24", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Caused by op u'FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/FusedBatchNorm', defined at:
File "object_detection/train.py", line 198, in
tf.app.run()
File "/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "object_detection/train.py", line 194, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/Documents/Custom-Object-Detection-master/object_detection/trainer.py", line 192, in train
clones = model_deploy.create_clones(deploy_config, model_fn, [input_queue])
File "/Documents/Custom-Object-Detection-master/slim/deployment/model_deploy.py", line 193, in create_clones
outputs = model_fn(*args, **kwargs)
File "/Documents/Custom-Object-Detection-master/object_detection/trainer.py", line 131, in _create_losses
prediction_dict = detection_model.predict(images)
File "/Documents/Custom-Object-Detection-master/object_detection/meta_architectures/faster_rcnn_meta_arch.py", line 513, in predict
image_shape) = self._extract_rpn_feature_maps(preprocessed_inputs)
File "/Documents/Custom-Object-Detection-master/object_detection/meta_architectures/faster_rcnn_meta_arch.py", line 652, in _extract_rpn_feature_maps
preprocessed_inputs, scope=self.first_stage_feature_extractor_scope)
File "/Documents/Custom-Object-Detection-master/object_detection/meta_architectures/faster_rcnn_meta_arch.py", line 131, in extract_proposal_features
return self._extract_proposal_features(preprocessed_inputs, scope)
File "/Documents/Custom-Object-Detection-master/object_detection/models/faster_rcnn_resnet_v1_feature_extractor.py", line 126, in _extract_proposal_features
scope=var_scope)
File "/Documents/Custom-Object-Detection-master/slim/nets/resnet_v1.py", line 298, in resnet_v1_101
reuse=reuse, scope=scope)
File "/Documents/Custom-Object-Detection-master/slim/nets/resnet_v1.py", line 216, in resnet_v1
net = resnet_utils.stack_blocks_dense(net, blocks, output_stride)
File "/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/Documents/Custom-Object-Detection-master/slim/nets/resnet_utils.py", line 185, in stack_blocks_dense
net = block.unit_fn(net, rate=rate, **dict(unit, stride=1))
File "/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/Documents/Custom-Object-Detection-master/slim/nets/resnet_v1.py", line 118, in bottleneck
activation_fn=None, scope='conv3')
File "/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1042, in convolution
outputs = normalizer_fn(outputs, **normalizer_params)
File "/lib/python2.7/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 643, in batch_norm
outputs = layer.apply(inputs, training=is_training)
File "/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 671, in apply
return self.call(inputs, *args, **kwargs)
File "/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 575, in call
outputs = self.call(inputs, *args, **kwargs)
File "/lib/python2.7/site-packages/tensorflow/python/layers/normalization.py", line 395, in call
return self._fused_batch_norm(inputs, training=training)
File "/lib/python2.7/site-packages/tensorflow/python/layers/normalization.py", line 302, in _fused_batch_norm
training, _fused_batch_norm_training, _fused_batch_norm_inference)
File "/lib/python2.7/site-packages/tensorflow/python/layers/utils.py", line 208, in smart_cond
return fn2()
File "/lib/python2.7/site-packages/tensorflow/python/layers/normalization.py", line 299, in _fused_batch_norm_inference
data_format=self._data_format)
File "/lib/python2.7/site-packages/tensorflow/python/ops/nn_impl.py", line 831, in fused_batch_norm
name=name)
File "/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 2034, in _fused_batch_norm
is_training=is_training, name=name)
File "/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
op_def=op_def)
File "~/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1,1024,51,38]
[[Node: FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/FusedBatchNorm = FusedBatchNorm[T=DT_FLOAT, data_format="NHWC", epsilon=1.001e-05, is_training=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/Conv2D, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/gamma/read/_683, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/beta/read/_685, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/moving_mean/read/_687, FirstStageFeatureExtractor/resnet_v1_101/block3/unit_12/bottleneck_v1/conv3/BatchNorm/moving_variance/read/_689)]]
[[Node: Reshape_24/_1255 = _HostRecvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4073_Reshape_24", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
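
For scale, the size of the tensor named in the OOM message can be worked out from its shape. This is a minimal sketch, assuming 4 bytes per element (the node reports T=DT_FLOAT); `tensor_bytes` is a hypothetical helper and not part of the repo.

```python
from functools import reduce
from operator import mul


def tensor_bytes(shape, bytes_per_element=4):
    """Return the memory needed for one dense tensor of the given shape.

    bytes_per_element=4 is an assumption matching a 32-bit float.
    """
    return reduce(mul, shape, 1) * bytes_per_element


shape = (1, 1024, 51, 38)  # shape copied from the OOM message above
size = tensor_bytes(shape)
print("%d bytes (%.2f MiB)" % (size, size / (1024.0 * 1024.0)))
```

A single allocation of this size is small, which suggests the GPU memory was already nearly used up by the rest of the graph, rather than this one tensor being too large on its own.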

Training error

I get the following error when I run "python -m wml.start_training_run":

from object_detection.protos import string_int_label_map_pb2
training-UGIxBkrmR: ImportError: cannot import name 'string_int_label_map_pb2'

Any help? @bourdakos1

Could you provide a complete example, please?

I want to try your project, but I feel intimidated by it. I would love a complete example that I can run through to make sure I have everything set up correctly and to see how it works before I start to play.

P.S. It looks like there is a spelling mistake on line 61, in the first sentence of your readme.md. It also looks like two sentences have been merged together. It's very minor.

Problem with train.record

I am getting this error:
"INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.NotFoundError'>,train.record; No such file or directory"
Please help.

ValueError: No variables to save

Using the default config and model, I get this while training:
Traceback (most recent call last):
File "object_detection/train.py", line 198, in <module>
tf.app.run()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 124, in run
_sys.exit(main(argv))
File "object_detection/train.py", line 194, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/home/santhosh/Documentsmns/object_detection/trainer.py", line 219, in train
init_saver = tf.train.Saver(available_var_map)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1239, in __init__
self.build()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1248, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1272, in _build
raise ValueError("No variables to save")

Why is it asking for trimaps?

I thought the point of having an images folder and an annotations folder was that they contain all the information the network needs to train. But looking at step 6 of
https://towardsdatascience.com/custom-object-detection-using-tensorflow-from-scratch-e61da2e10087

create_tf_record needs a trimaps folder as well. Also, is create_tf_record really not compatible with JPGs? It appears to be hardcoded to accept only PNGs.

Failed to find any matching files for model.ckpt

I'm on Arch, so there could be any number of crazy problems. Followed all install instructions; while running train.py I receive:
Unsuccessful TensorSliceReader constructor: Failed to find any matching files for model.ckpt
There is a tensorflow issue currently open regarding naming conventions, but even using the full path for all of the files I still get the same error. Any ideas?
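
One quick check is to list what actually exists on disk for the checkpoint prefix. A TensorFlow checkpoint saved in the newer format is normally a set of files sharing that prefix (for example `model.ckpt.index`, `model.ckpt.meta` and `model.ckpt.data-00000-of-00001`), not a single file called `model.ckpt`. This is a minimal sketch using only the standard library; `checkpoint_files` and `describe` are hypothetical helpers, and the prefix `model.ckpt` is taken from the error message.

```python
import glob
import os


def checkpoint_files(prefix):
    """List files on disk whose names start with '<prefix>.'."""
    return sorted(glob.glob(prefix + ".*"))


def describe(prefix):
    """Return a one-line summary of what was found for the prefix."""
    files = checkpoint_files(prefix)
    if not files:
        return "no files match %s.* (cwd: %s)" % (prefix, os.getcwd())
    return "found: " + ", ".join(os.path.basename(f) for f in files)


# A relative prefix is resolved against the current working directory,
# so run this from the same directory train.py is started in.
print(describe("model.ckpt"))
```

If nothing is found, the pretrained model files were probably not extracted into the directory that train.py runs from, or the path set as fine_tune_checkpoint in the config points somewhere else.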

How do I enable multi-GPU training?

train.py fails ... don't know what's wrong. I suspect image path in XML annotations file ...

When running train.py with the parameters given in the tutorial, I get a lot of the relative-path warnings documented in my other issue, and train.py then fails completely with the following error messages (I have omitted the earlier warnings):

...
dyld: warning, LC_RPATH $ORIGIN/../../../../../_solib_darwin_x86_64/_U_S_Stensorflow_Scontrib_Stensor_Uforest_Cpython_Sops_S_Ustats_Uops.so___Utensorflow in /Users/jreich/Library/Python/2.7/lib/python/site-packages/tensorflow/contrib/tensor_forest/python/ops/_stats_ops.so being ignored in restricted program because it is a relative path
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
WARNING:tensorflow:From /Users/jreich/Downloads/Watson/Custom-Object-Detection-master/object_detection/trainer.py:176: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.get_or_create_global_step
INFO:tensorflow:Summary name Learning Rate is illegal; using Learning_Rate instead.
Traceback (most recent call last):
  File "object_detection/train.py", line 198, in <module>
    tf.app.run()
  File "/Users/jre/Library/Python/2.7/lib/python/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "object_detection/train.py", line 194, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/Users/jre/Downloads/Watson/Custom-Object-Detection-master/object_detection/trainer.py", line 218, in train
    var_map, train_config.fine_tune_checkpoint))
  File "/Users/jre/Downloads/Watson/Custom-Object-Detection-master/object_detection/utils/variables_helper.py", line 122, in get_variables_available_in_checkpoint
    ckpt_reader = tf.train.NewCheckpointReader(checkpoint_path)
  File "/Users/jre/Library/Python/2.7/lib/python/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 150, in NewCheckpointReader
    return CheckpointReader(compat.as_bytes(filepattern), status)
  File "/Users/jre/Library/Python/2.7/lib/python/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for model.ckpt
[ /Users/jre/Downloads/Watson/Custom-Object-Detection-master ]
jre@ibm-jre-mbp  ==_-+-

I read the error message as saying that train.py cannot find the files for the model. I checked the annotation files in "annotations/xmls" and found that all of them give a different folder name:

<folder>less_selected</folder>

But the "videoplayback####" JPGs are stored in the "images" folder ... Maybe this also causes trouble with "create_tf_record.py". But probably I am completely on the wrong track ;-)

jre@ibm-jre-mbp  ==_-+- cat videoplayback0051.xml 
<annotation>
    <folder>less_selected</folder>
    <filename>videoplayback0051.jpg</filename>
    <size>
        <width>1000</width>
        <height>563</height>
    </size>
    <segmented>0</segmented>
    <object>
        <name>Tie Fighter</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>157</xmin>
            <ymin>165</ymin>
            <xmax>166</xmax>
            <ymax>176</ymax>
        </bndbox>
    </object>
    <object>
        <name>Tie Fighter</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>136</xmin>
            <ymin>151</ymin>
            <xmax>145</xmax>
            <ymax>160</ymax>
        </bndbox>
    </object>
</annotation>[ /Users/jre/Downloads/Watson/Custom-Object-Detection-master/annotations/xmls ]
jre@ibm-jre-mbp  ==_-+-

I am using Python 2.7.10, which is the default version on my macOS Sierra, and pip 9.0.1. I hope it is not a big deal to solve :-)
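
To see which `<folder>`, `<filename>` and box values an annotation file actually contains, something like the following can help. It is a minimal sketch using only the standard library; `summarize` is a hypothetical helper and `SAMPLE` is a trimmed copy of the XML above. Whether create_tf_record.py reads the `<folder>` field at all is not confirmed here.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of videoplayback0051.xml from this issue.
SAMPLE = """<annotation>
    <folder>less_selected</folder>
    <filename>videoplayback0051.jpg</filename>
    <object>
        <name>Tie Fighter</name>
        <bndbox>
            <xmin>157</xmin>
            <ymin>165</ymin>
            <xmax>166</xmax>
            <ymax>176</ymax>
        </bndbox>
    </object>
</annotation>"""


def summarize(xml_text):
    """Return (folder, filename, [(label, (xmin, ymin, xmax, ymax)), ...])."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        coords = tuple(
            int(bndbox.findtext(key)) for key in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"), coords))
    return root.findtext("folder"), root.findtext("filename"), boxes


folder, filename, boxes = summarize(SAMPLE)
print(folder, filename, boxes)
```

Running the same function over every file in annotations/xmls shows at a glance whether the file names match the JPGs in the images folder.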

What tool did you use for annotations?

Seems like a disaster to label each frame by hand.

How can I get the original mp4?

How can I get the original mp4? And could you provide a script to produce a new labeled video from the original? Thank you!

InvalidArgumentError: Assign requires shapes of both tensors to match.

I only created the tfrecord files using my own dataset and changed num_classes in faster_rcnn_resnet101.config accordingly.

Then, when I ran the code, it raised the following error:

Caused by op u'save_1/Assign_815', defined at:
File "object_detection/train.py", line 198, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "object_detection/train.py", line 194, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/Custom-Object-Detection/object_detection/trainer.py", line 281, in train
keep_checkpoint_every_n_hours=keep_checkpoint_every_n_hours)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1218, in __init__
self.build()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1227, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1263, in _build
build_save=build_save, build_restore=build_restore)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 751, in _build_internal
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 439, in _AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 160, in restore
self.op.get_shape().is_fully_defined())
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/state_ops.py", line 276, in assign
validate_shape=validate_shape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 57, in assign
use_locking=use_locking, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [584] rhs shape= [8]
[[Node: save_1/Assign_815 = Assign[T=DT_FLOAT, _class=["loc:@SecondStageBoxPredictor/BoxEncodingPredictor/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](SecondStageBoxPredictor/BoxEncodingPredictor/biases/Momentum, save_1/RestoreV2_815)]]

It seems that restoring the model failed. My own dataset has 146 classes, so it seems that 584 = 146 * 4 does not match the original 2 classes * 4 = 8.
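
The arithmetic behind the two shapes can be written out as a small check. This is a minimal sketch; `box_predictor_bias_size` is a hypothetical helper that only reproduces the "classes * 4" relation described above, and is not a function from the object detection code.

```python
def box_predictor_bias_size(num_classes, coords_per_box=4):
    """Size of the box-encoding bias vector: one 4-value box per class."""
    return num_classes * coords_per_box


dataset_classes = 146     # num_classes set in the config for the new dataset
checkpoint_classes = 2    # implied by the rhs shape [8] in the error

lhs = box_predictor_bias_size(dataset_classes)      # shape the new graph expects
rhs = box_predictor_bias_size(checkpoint_classes)   # shape stored in the checkpoint
print("graph:", lhs, "checkpoint:", rhs, "match:", lhs == rhs)
```

So the variable being restored comes from a checkpoint that was written for a different number of classes than the graph being built.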

Can this be used with mobilenetv2/ssdlite?

Can the existing config for mobilenetv1 be easily adapted for use with the new mobilenetv2/ssdlite?

how to make the model perform better?

@bourdakos1
Hi, I've trained with my own dataset and the result is just okay, so I wonder what I can change to fine-tune the model, such as the optimizer, the learning rate, or other parameters, and how to change them?
Thanks!

tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: images\armas1721 : The system cannot find the file specified. ; No such file or directory

Hi Nick,

I was able to run the exact repo with your help. Thank you very much for your help.
So now I need to apply the exact same steps to my own images and XML files. I put my images in the images folder. They are all ".jpg", though their sizes range from 5-8 KB to 1400 KB. I put the XMLs in the annotations/xmls folder. I updated trainval.txt, the test_images folder, and label_map.pbtxt. Everything is as it is supposed to be.

However, I keep getting this error when I run the create_tf_record.py ("python object_detection/create_tf_record.py").

''' tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: images\armas1721 : The system cannot find the file specified.
; No such file or directory'''

The file exists, and the folder exists. I am not sure why this error is popping up. I googled it, but no solutions come up on Stack Overflow. I see people posting the same or similar errors, but I have not seen a solution yet.

Does anyone know how to overcome this problem?

Thank you,
Selin
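
The path in the error, `images\armas1721`, has no file extension, which may mean the name being looked up differs from the file on disk (for example `armas1721.jpg`). This is a minimal sketch for checking that; `find_image` is a hypothetical helper, and the list of extensions is an assumption.

```python
import os

# Extensions to try when the requested name has none (assumed list).
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def find_image(image_dir, name):
    """Return the path of an existing file for `name`, or None.

    If `name` has no extension, each entry of IMAGE_EXTENSIONS is also tried.
    """
    candidates = [name]
    if not os.path.splitext(name)[1]:
        candidates += [name + ext for ext in IMAGE_EXTENSIONS]
    for candidate in candidates:
        path = os.path.join(image_dir, candidate)
        if os.path.isfile(path):
            return path
    return None


# Name copied from the error message; prints None if nothing matches.
print(find_image("images", "armas1721"))
```

If this finds `armas1721.jpg` but not `armas1721`, the name stored in the annotation (or list file) is missing its extension.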

Inference time increases with custom model training.

Compared to the base model that I use for transfer learning, the model I extracted has a longer inference time after custom object training. Is this the normal behaviour?

Issue with last step trying to run object_detection_runner.py

I was able to generate my model and inference graph, but when it got down to going through the last step and running python3 ./object_detection/object_detection_runner.py I got this error:

root@ml-2:/home/science/Custom-Object-Detection# python3 ./object_detection/object_detection_runner.py 
Loading model...
detecting...
2019-01-18 08:47:33.209893: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
detecting object in path: test_images/image-1.jpg
Traceback (most recent call last):
  File "./object_detection/object_detection_runner.py", line 93, in <module>
    detect_objects(image_path)
  File "./object_detection/object_detection_runner.py", line 59, in detect_objects
    fig = plt.figure()
  File "/usr/local/lib/python3.5/dist-packages/matplotlib/pyplot.py", line 535, in figure
    **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/matplotlib/backends/backend_tkagg.py", line 81, in new_figure_manager
    return new_figure_manager_given_figure(num, figure)
  File "/usr/local/lib/python3.5/dist-packages/matplotlib/backends/backend_tkagg.py", line 89, in new_figure_manager_given_figure
    window = Tk.Tk()
  File "/usr/lib/python3.5/tkinter/__init__.py", line 1871, in __init__
    self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: no display name and no $DISPLAY environment variable

I've been following the steps laid out in this tutorial: https://medium.freecodecamp.org/tracking-the-millenium-falcon-with-tensorflow-c8c86419225e so the test images I've been training on and am trying to call the runner on are the same three test images that are in this project.

Any ideas on why this error is occurring? I tried googling it but didn't find much that seemed applicable.
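
The traceback shows matplotlib's TkAgg backend trying to open a window on a machine with no `$DISPLAY` set. A common workaround is to select the non-interactive `Agg` backend and save the figure to a file instead of showing it. This is a minimal sketch, assuming a Linux/X11 machine where a missing `DISPLAY` means no screen is available; `pick_backend` is a hypothetical helper.

```python
import os


def pick_backend(environ=None):
    """Return a matplotlib backend name: 'Agg' when no display is available."""
    environ = os.environ if environ is None else environ
    return "TkAgg" if environ.get("DISPLAY") else "Agg"


backend = pick_backend()
print("backend:", backend)

# Usage with matplotlib (the backend must be chosen before importing pyplot):
#   import matplotlib
#   matplotlib.use(backend)
#   import matplotlib.pyplot as plt
#   ...
#   plt.savefig("result.png")  # with 'Agg' there is no window to show
```

The other option is to give the session a display, for example by connecting with X forwarding, in which case the script can stay as it is.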

Cannot run create_tfrecord

Hi,
I'm not sure what I am missing here. I just followed your tutorial at 'https://medium.freecodecamp.org/tracking-the-millenium-falcon-with-tensorflow-c8c86419225e'

When I try to run 'create_tf_record.py' from PyCharm, I get the error below,

but at the same time I can run 'create_tf_record.py' from the terminal.

Can you please point out what I am missing here? Thanks in advance.

custom-object-detection's Introduction

Custom Object Detection with TensorFlow — We've moved!

- check out our comfy new home github.com/cloud-annotations/training
