
SKU-110K

Eran Goldman* , Roei Herzig* , Aviv Eisenschtat*, Jacob Goldberger, Tal Hassner

Dataset and codebase for the CVPR 2019 paper "Precise Detection in Densely Packed Scenes" [Paper link]

A typical image in our SKU-110K dataset, showing densely packed objects. (a) Detection results for the state-of-the-art RetinaNet [2], showing incorrect and overlapping detections, especially for the dark objects at the bottom, which are harder to separate. (b) Our results, showing far fewer misdetections and better-fitting bounding boxes. (c) Zoomed-in views for RetinaNet [2] and (d) our method.

Our novel contributions are:

  1. A Soft-IoU layer, added to an object detector to estimate the Jaccard index (standard IoU; see the reference computation below) between the detected box and the (unknown) ground-truth box.
  2. An EM-Merger unit, which converts detections and Soft-IoU scores into a Mixture of Gaussians (MoG) and resolves overlapping detections in packed scenes.
  3. A new dataset and benchmark, SKU-110K (Store Keeping Unit, 110k categories), for item detection in store shelf images from around the world.
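For reference, the Jaccard index that the Soft-IoU layer learns to estimate is the standard intersection-over-union between two boxes. A minimal sketch (illustrative, not the repository's code):

    def iou(a, b):
        """Jaccard index of two axis-aligned boxes given as [x1, y1, x2, y2]."""
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        union = ((a[2] - a[0]) * (a[3] - a[1]) +
                 (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union > 0 else 0.0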

Introduction

In our SKU-110K paper [1] we focus on detection in densely packed scenes, where images contain many objects, often looking similar or even identical, positioned in close proximity. These scenes are typically man-made, with examples including retail shelf displays, traffic, and urban landscape images. Despite the abundance of such environments, they are under-represented in existing object detection benchmarks; it is therefore unsurprising that state-of-the-art object detectors are challenged by such images.

Method

We propose learning the Jaccard index with a soft Intersection over Union (Soft-IoU) network layer. This measure provides valuable information on the quality of detection boxes. Those detections can be represented as a Mixture of Gaussians (MoG), reflecting their locations and their Soft-IoU scores. An Expectation-Maximization (EM) based method is then used to cluster these Gaussians into groups, resolving detection overlap conflicts.
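As a rough illustration of the detections-to-MoG step, the sketch below renders detections as a Gaussian heat map, one component per box, weighted by its Soft-IoU score. This is a minimal sketch under assumed box and score formats, not the repository's EM-Merger code:

    import numpy as np

    def detections_to_heatmap(boxes, soft_iou, height, width):
        """Render detections (N x [x1, y1, x2, y2]) as a sum of 2D Gaussians
        weighted by their Soft-IoU scores (N values in [0, 1])."""
        yy, xx = np.mgrid[0:height, 0:width]
        heat = np.zeros((height, width), dtype=np.float32)
        for (x1, y1, x2, y2), w in zip(boxes, soft_iou):
            cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
            # one Gaussian per detection; covariance set from the box extent
            vx, vy = ((x2 - x1) / 4.0) ** 2, ((y2 - y1) / 4.0) ** 2
            heat += w * np.exp(-((xx - cx) ** 2 / (2 * vx) + (yy - cy) ** 2 / (2 * vy)))
        return heat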

System diagram: (a) Input image. (b) A base network, with bounding box (BB) and objectness (Obj.) heads, along with our novel Soft-IoU layer. (c) Our EM-Merger converts Soft-IoU scores to a Gaussian heat map representing (d) objects captured by multiple, overlapping bounding boxes. (e) It then analyzes these box clusters, producing a single detection per object.

Dataset

We compare key properties of related benchmarks. #Img.: number of images. #Obj./img.: average items per image. #Cls.: number of object classes (more implies a harder detection problem due to greater appearance variation). #Cls./img.: average classes per image. Dense: are objects typically densely packed together, raising potential overlapping-detection problems? Idnt: do images contain multiple identical objects or hard-to-separate object sub-regions? BB: are bounding box labels available for measuring detection accuracy?

The dataset is provided for the exclusive use by the recipient and solely for academic and non-commercial purposes.

The dataset can be downloaded from here or here.

A pretrained model is provided here. Note that its performance is slightly better than originally reported in the paper because of improved optimization.

CVPR 2020 Challenge

The detection challenge was held at the CVPR 2020 Retail-Vision workshop. Please visit our workshop page for more information. The data and evaluation code are available on the challenge page.

Qualitative Results

Qualitative detection results on SKU-110K.

Notes

Please note that the main part of the code has been released, though we are still testing it to fix possible glitches. Thank you.

This implementation is built on top of https://github.com/fizyr/keras-retinanet. The SKU-110K dataset is provided in CSV format compatible with the repository's CSV parser.
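For example, rows in the annotation CSV follow the 'img_file,x1,y1,x2,y2,class_name' convention expected by the parser (the file name and coordinates below are illustrative):

    train_0001.jpg,208,537,422,814,object
    train_0001.jpg,431,540,640,811,object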

Dependencies include: keras, keras-resnet, six, scipy, Pillow, pandas, tensorflow-gpu, and tqdm. This repository requires Keras 2.2.4 or higher, and was tested using Python 3.6.5, Python 2.7.6, and OpenCV 3.1.

The output files will be saved under "$HOME"/Documents/SKU110K and have the same structure as in https://github.com/fizyr/keras-retinanet: the weight (h5) files are saved in the "snapshot" folder, and the TensorBoard log files are saved in the "logs" folder.

Note that we have made several upgrades to the baseline detector since the beginning of this research, so the latest version can achieve even higher results than the ones originally reported.

The EM-merger provided here is the stable version (not time-optimized). Some of the changes required for optimization are mentioned in the TO-DO comments.

Contributions to this project are welcome.

Usage

Move the unzipped SKU110K folder to "$HOME"/Documents.

Set $PYTHONPATH to the repository root, e.g. from this repository: export PYTHONPATH=$(pwd)

Train:

(1) Train the base model: python -u object_detector_retinanet/keras_retinanet/bin/train.py csv

(2) Train the IoU layer:

python -u object_detector_retinanet/keras_retinanet/bin/train_iou.py --weights WEIGHT_FILE csv, where WEIGHT_FILE is the full path to the h5 file from step (1).

e.g.: python -u object_detector_retinanet/keras_retinanet/bin/train_iou.py --gpu 0 --weights "/home/ubuntu/Documents/SKU110K/snapshot/Thu_May__2_17:07:11_2019/resnet50_csv_10.h5" csv | tee train_iou_sku110k.log

(3) Predict:

python -u object_detector_retinanet/keras_retinanet/bin/predict.py csv WEIGHT_FILE [--hard_score_rate=RATE], where WEIGHT_FILE is the full path to the h5 file from step (2), and RATE (0 <= RATE <= 1) computes the confidence as a weighted average between soft and hard scores.

e.g.: nohup env PYTHONPATH="/home/ubuntu/dev/SKU110K" python -u object_detector_retinanet/keras_retinanet/bin/predict.py --gpu 3 csv "/home/ubuntu/Documents/SKU110K/snapshot/Thu_May__2_17:10:30_2019/iou_resnet50_csv_07.h5" --hard_score_rate=0.5 | tee predict_sku110k.log
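The --hard_score_rate weighting corresponds to this line from the repository's prediction code (also quoted in the issues below):

    soft_scores = hard_score_rate * hard_scores + (1 - hard_score_rate) * soft_scores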

The results are saved in CSV format in the "results" folder and drawn in the "res_images_iou" folder.

References

[1] Eran Goldman*, Roei Herzig*, Aviv Eisenschtat*, Jacob Goldberger, Tal Hassner. Precise Detection in Densely Packed Scenes. CVPR, 2019.

[2] Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár. Focal Loss for Dense Object Detection. 2018.

Citation

@inproceedings{goldman2019dense,
 author    = {Eran Goldman and Roei Herzig and Aviv Eisenschtat and Jacob Goldberger and Tal Hassner},
 title     = {Precise Detection in Densely Packed Scenes},
 booktitle = {Proc. Conf. Comput. Vision Pattern Recognition (CVPR)},
 year      = {2019}
}

sku110k_cvpr19's Issues

pretrained network weights

Hi!
Thank you for this nice paper and code.
Are you planning to share the trained network weights?

Thanks,

Why is weighted sum used here?

Nice job. I'm interested in your implementation and found that in this line you use a weighted sum of the objectness score and the IoU score. It puzzles me, as it seems a little weird to add a score denoting whether a bounding box contains an object to a score denoting how much the bounding box overlaps the ground truth. Do I misunderstand it?

    soft_scores = hard_score_rate * hard_scores + (1 - hard_score_rate) * soft_scores

Why is the result of the base detector worse than RetinaNet?

Thanks for sharing your work. As I understand it, the base detector is the same model as RetinaNet and should give similar results. But from Table 3 in your paper, the result of Base&NMS is worse than RetinaNet. Is there anything I misunderstood?

Only a single label present in the dataset

In the provided dataset (SKU-110K), all the detections are marked with a single label ("object"). However, in the paper and the README, the number of labels mentioned is 110,712.
Are there any plans to provide all the labels, or will only one label be provided?

A question about EM-Merger Unit

I have read your excellent work; you have done a good job!
I have three questions about the EM-Merger:

  • In your paper, page 5, the third equation in Eq. (10): it seems the covariance is initialized from the h and w of a box, so why does the update add the distance between μi and μj?
  • In the "Gaussians as detections" part, you describe:

    To extract the final detections, for each of the K' Gaussians, we consider the ellipse
    at two standard deviations around its center, visualized in Fig. 3 in green. We then
    search the original set of N detections (Sec. 3.1) for those whose center, μ = (x, y),
    falls inside this ellipse.

It seems the area in the green ellipse is still large, and there may be several boxes' centers lying in the ellipse; how do you decide which one to choose?

  • You use clustering to initialize the parameters of G. Do you cluster the boxes directly? And what is the difference between EM-Merger and plain clustering; why can't clustering reach the goal?

optimum training configuration

I think my RetinaNet training has started overfitting. I am using the default/baseline configuration (with no arguments) and would appreciate some insights on how to optimally tune the parameters.


requirement of OpenCV version needs to be clarified

Hi @eg4000,

I think you may need to clarify your OpenCV version, since your Pipenv doesn't pin it.
I hit a bug at EmMerger.py:114, with an assertion from cv2.boundingRect ("points.checkVector(2) >= 0 && (points.depth() == CV_32F || points.depth() == CV_32S)"). My OpenCV version is 4.1, so I made the following modification at line 64:

    contours, _ = cv2.findContours(numpy.ndarray.copy(heat_map), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

i.e. I just changed the output variable from 'contours' to 'contours, _'. The code runs then, but no results show up. I trained the base model for 41 epochs and the loss came down to 0.78. Then I trained the IoU layer, and after 2 epochs the loss came down to 0.426 and would not go down any further.
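A version-agnostic wrapper is a common workaround for this cv2.findContours difference (OpenCV 3.x returns three values, 4.x returns two); a minimal sketch, not part of this repository:

    import cv2

    def find_contours_compat(image, mode, method):
        out = cv2.findContours(image, mode, method)
        # OpenCV 3.x: (image, contours, hierarchy); OpenCV 4.x: (contours, hierarchy)
        return out[0] if len(out) == 2 else out[1]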

Hope to hear your suggestion, thank you.

License of the code and the dataset

Hello,

Thanks for making (soon) the code and the dataset available!

In the README, it is stated that

The dataset is provided for the exclusive use by the recipient and solely for academic and non-commercial purposes.

I guess the same will apply to the code when it's available. In this case, if an industrial lab wants to compare their method with yours on your dataset, they cannot. Would it be possible to replace "academic" with "research"? For a CVPR publication we would expect it to be more accessible.

Thank you in advance for your response!

Unable to parse arguments

I received the following error message when parsing arguments during training; sorry if it's a beginner's mistake.

train.py: error: unrecognized arguments: --steps=20 --epochs=2

Error while running predict.py

I am using a resnet50 pretrained on the COCO dataset (80 classes), and used it to train the Soft-IoU layer on COCO person data.

Command to run predict.py

python3 -u object_detector_retinanet/keras_retinanet/bin/predict.py csv --annotations=/home/deepak/Projects/dense_object_detection/SKU110K_CVPR19/coco_csv/coco_test_annotation.csv --classes=./class_label.csv /home/deepak/Projects/dense_object_detection/SKU110K_CVPR19/iou_resnet50_csv_05.h5 --base_dir=/home/deepak/Dataset/UNZIPPED/COCO/val2017/ --backbone=resnet50 --image-min-side=608 --image-max-side=608 2>&1 | tee predict.log

Error

Traceback (most recent call last):
  File "object_detector_retinanet/keras_retinanet/bin/predict.py", line 155, in <module>
    main()
  File "object_detector_retinanet/keras_retinanet/bin/predict.py", line 150, in main
    hard_score_rate=hard_score_rate
  File "/home/deepak/Projects/dense_object_detection/SKU110K_CVPR19/object_detector_retinanet/keras_retinanet/utils/predict_iou.py", line 57, in predict
    soft_scores = np.squeeze(soft_scores, axis=-1)
  File "<__array_function__ internals>", line 6, in squeeze
  File "/usr/local/lib/python3.5/dist-packages/numpy/core/fromnumeric.py", line 1438, in squeeze
    return squeeze(axis=axis)
ValueError: cannot select an axis to squeeze out which has a size not equal to one

The shape of the soft_scores numpy array is (1, 99999, 2).
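For context, np.squeeze(..., axis=-1) only succeeds when the last dimension has size one, which is why a (1, 99999, 2) array fails:

    import numpy as np

    a = np.zeros((1, 99999, 1))
    print(np.squeeze(a, axis=-1).shape)  # (1, 99999)
    b = np.zeros((1, 99999, 2))
    np.squeeze(b, axis=-1)  # raises ValueError: cannot select an axis to squeeze out ...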

trained model converted for inference/testing

I converted both models (the results from train.py and train_iou.py) using convert_model.py, and used those models to perform inference/testing with this notebook from keras-retinanet: https://github.com/delftrobotics/keras-retinanet/blob/master/examples/ResNet50RetinaNet.ipynb

I modified the imports accordingly to object_detector_retinanet.keras_retinanet and ran the inference/testing.

Inference with the model converted from train.py went successfully, but with the model converted from train_iou.py I received the following error:

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-5-c86318e1b643> in <module>
12 # process image
13 start = time.time()
---> 14 boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))
15 print("processing time: ", time.time() - start)
16

ValueError: too many values to unpack (expected 3)
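A later issue on this page notes that the IoU model returns four tensors rather than RetinaNet's three, so the unpacking needs an extra variable; the output order below is an assumption:

    # assumed order; the extra tensor holds the Soft-IoU scores
    boxes, scores, labels, soft_scores = model.predict_on_batch(np.expand_dims(image, axis=0))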

Comparison vs other object detection models

When reading your paper I saw some references to Faster R-CNN, YOLO, and RetinaNet, but I found no references to SSD. Could you elaborate on where your methods differ, and why one might choose your developed model over SSD and vice versa?

Thanks in advance and good luck with further research!

resnet101 and Soft-IoU layer

Hi @eg4000, I have two questions:
(1) Can I change the backbone to resnet101 to train the base model?
(2) Is it possible to add the Soft-IoU layer to other one-stage algorithms, such as yolov3?

predicting failed

Hi, I used your code to train my own model on custom data. After training the base model and the IoU layer, I ran predict.py but got an error like:

Traceback (most recent call last):
  File "object_detector_retinanet/keras_retinanet/bin/predict.py", line 155, in <module>
    main()
  File "object_detector_retinanet/keras_retinanet/bin/predict.py", line 150, in main
    hard_score_rate=hard_score_rate
  File "/home/wushengjian/Documents/SKU110K/object_detector_retinanet/keras_retinanet/utils/predict_iou.py", line 58, in predict
    soft_scores = np.squeeze(soft_scores, axis=-1)
  File "/home/wushengjian/anaconda3/envs/pose/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 1292, in squeeze
    return squeeze(axis=axis)
ValueError: cannot select an axis to squeeze out which has size not equal to one

The shape of soft_scores is (1, 999999, 5)

Why is the GPU utilization so low and the training speed so slow?

I'm very interested in your work. But when I run "python -u object_detector_retinanet/keras_retinanet/bin/train.py --gpu 7 csv" and watch -n 3 nvidia-smi, the GPU utilization is very low (0~16%) and the CPU utilization is very high; the training speed is also very slow (just 3 epochs in 12 hours on an RTX 2080):
Epoch 3/150
1961/10000 [====>.........................]

hard scores and soft scores

Thanks for the wonderful work!
A question about the prediction code: what is the difference between hard scores and soft scores? Where is the output format defined?

Hello, duplicate_merger.filter_duplicate_candidates problem

Hello, the two parameters of the duplicate_merger.filter_duplicate_candidates function, result_df and pixel_data, both have values, but the return is an empty DataFrame. I traced into it but don't know how to solve the problem. What should I do?

How can I deal with snapshot term?

Hi, your code is very nice for testing the model.

BTW, how can I save a snapshot (weights) at a specific training step interval?

I checked the snapshot path, but couldn't find an interval step for saving snapshots.

Thank you.

Unable to train using GPU

I am using an AWS EC2 g3 instance for training; it looks like the training is not utilizing the GPU despite my stating it in the arguments:

python -u /home/ubuntu/SKU110K_CVPR19/object_detector_retinanet/keras_retinanet/bin/train.py --gpu 0 --steps 20 --epochs 10 csv


Not able to train with coco dataset

python3 -u object_detector_retinanet/keras_retinanet/bin/train.py --gpu=1 --tensorboard-dir=./tensorboard/ --batch-size=4 --coco_path=./coco_dataset/ coco

I am using the above command to train with the COCO dataset.

train.py coco: error: the following arguments are required: coco_path

Can you please tell me what I am doing wrong when giving the path?

Hello, what is the meaning of the --annotations parameter?

Hello, I have a question: I really don't understand how the --annotations parameter is used. I filled in my prediction images, but it asks me to fill in x1, y1, x2, y2; since these are the images I want predictions for, I don't know what to fill in here.
I have a guess: if the prediction succeeds, will the bounding box coordinate values overwrite the contents of --annotations?

how to deal with soft_scores when class_nums > 1.

When I train on my own dataset (40 classes) and use predict.py, I meet:
ValueError: cannot select an axis to squeeze out which has size not equal to one
I found that soft_scores is a [1 x 999999 x 40] ndarray, and the model output is defined as:

    outputs = keras.layers.Reshape((-1, num_classes), name='pyramid_classification_reshape')(outputs)
    outputs = keras.layers.Activation('sigmoid', name='pyramid_classification_sigmoid')(outputs)

The output size is correct, so np.squeeze is not suitable for class_nums > 1.
Should I get the soft score via the output label?

Code

Will the code be released, and if so, when?

How to train with GPU?

Here is my training command:
train.py --gpu 0,1,2,3 --multi-gpu 4 csv

But an error occurred:

Traceback (most recent call last):
  File "/root/.pycharm_helpers/pydev/pydevd.py", line 1664, in <module>
    main()
  File "/root/.pycharm_helpers/pydev/pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/root/.pycharm_helpers/pydev/pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/root/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/tmp/object_detector_retinanet/keras_retinanet/bin/train.py", line 450, in <module>
    main()
  File "/tmp/object_detector_retinanet/keras_retinanet/bin/train.py", line 415, in main
    freeze_backbone=args.freeze_backbone
  File "/tmp/object_detector_retinanet/keras_retinanet/bin/train.py", line 108, in create_models
    training_model = multi_gpu_model(model, gpus=multi_gpu)
  File "/root/miniconda3/lib/python3.6/site-packages/keras/utils/multi_gpu_utils.py", line 181, in multi_gpu_model
    available_devices))
ValueError: To call multi_gpu_model with gpus=4, we expect the following devices to be available: ['/cpu:0', '/gpu:0', '/gpu:1', '/gpu:2', '/gpu:3']. However this machine only has: ['/cpu:0', '/xla_gpu:0', '/xla_gpu:1', '/xla_gpu:2', '/xla_gpu:3', '/xla_cpu:0']. Try reducing gpus.
^C
Process finished with exit code 1

about csv_generator

Hi there,
When I run train.py I am getting this error. Could you please help?

Thanks.

    raise_from(ValueError('invalid CSV annotations file: {}: {}'.format(csv_data_file, e)), None)
  File "<string>", line 3, in raise_from
ValueError: invalid CSV annotations file: /home/furkan/Documents/SKU110K/annotations/annotations_train.csv: line 1: format should be 'img_file,x1,y1,x2,y2,class_name' or 'img_file,,,,,'

Resume training from a snapshot

Let's say I train the RetinaNet model using 1000 images, and then later continue training it with a different 200 images (not including the initial 1000) containing new products. Would that be possible? I observe in my experiment that the loss just continues from the previous training on the 1000 images, which had already converged; therefore the loss, instead of decreasing, is increasing.

Why doesn't the loss decrease when I train the IoU layer?

When I train the IoU layer, the loss stays at around 0.4:

1438/10000 [===>..........................] - ETA: 3:57:13 - loss: 0.4222
1439/10000 [===>..........................] - ETA: 3:57:10 - loss: 0.4222
...
1468/10000 [===>..........................] - ETA: 3:56:20 - loss: 0.4224
1469/10000 [===>..........................] - ETA: 3:56:18 - loss: 0.4224

I have already trained 23 epochs.

model deployment to cloud

Can anyone with greater experience guide me on how to deploy the model to the cloud, in particular on AWS, using Lambda, SageMaker, EC2, or an Elastic Inference instance?

code for calculating mAP on the test set

Hi,

It'd be great if you could share the code for calculating mAP on your test set. I've modified cocoeval.py from pycocotools to evaluate my results, but it would be great if you could provide yours, to rule out any possible mistake on my side.

error: (-215) points.checkVector(2) >= 0 && (points.depth() == CV_32F || points.depth() == CV_32S) in function boundingRect

When I run:

python -u object_detector_retinanet/keras_retinanet/bin/predict.py --gpu 3 csv "/home/wzz/Documents/SKU110K/snapshot/Mon_Jul_15_13:51:47_2019/iou_resnet50_csv_07.h5" --hard_score_rate=0.5 | tee predict_sku110k_log

and also when I run:

nohup env PYTHONPATH="/home/wzz/dev/SKU110K" python -u object_detector_retinanet/keras_retinanet/bin/predict.py --gpu 3 csv "/home/wzz/Documents/SKU110K/snapshot/Mon_Jul_15_13:51:47_2019/iou_resnet50_csv_07.h5" --hard_score_rate=0.5 | tee predict_sku110k_log

I get the error named in the title.

OSError: image file is truncated (23 bytes not processed)

I am trying to train the model on the same dataset, but for some reason my training stops at iteration 768/10000 and throws this error:
OSError: image file is truncated (23 bytes not processed)
I am not sure why this is happening, as I am trying to replicate the results using the same model and the same data the authors used. The entire error is:
768/10000 [=>............................] - ETA: 46:01:31 - loss: 2.7264 - regression_loss: 2.2707 - classification_loss: 0.4557
Traceback (most recent call last):
File "object_detector_retinanet/keras_retinanet/bin/train.py", line 440, in <module>
main()
File "object_detector_retinanet/keras_retinanet/bin/train.py", line 435, in main
validation_steps=validation_generator.size()
File "/opt/conda/lib/python3.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/keras/engine/training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "/opt/conda/lib/python3.7/site-packages/keras/engine/training_generator.py", line 181, in fit_generator
generator_output = next(output_generator)
File "/opt/conda/lib/python3.7/site-packages/keras/utils/data_utils.py", line 709, in get
six.reraise(*sys.exc_info())
File "/opt/conda/lib/python3.7/site-packages/six.py", line 693, in reraise
raise value
File "/opt/conda/lib/python3.7/site-packages/keras/utils/data_utils.py", line 685, in get
inputs = self.queue.get(block=True).get()
File "/opt/conda/lib/python3.7/multiprocessing/pool.py", line 657, in get
raise self._value
File "/opt/conda/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/opt/conda/lib/python3.7/site-packages/keras/utils/data_utils.py", line 626, in next_sample
return six.next(_SHARED_SEQUENCES[uid])
File "/notebooks/SKU/object_detector_retinanet/keras_retinanet/preprocessing/generator.py", line 287, in next
return self.next()
File "/notebooks/SKU/object_detector_retinanet/keras_retinanet/preprocessing/generator.py", line 298, in next
return self.compute_input_output(group)
File "/notebooks/SKU/object_detector_retinanet/keras_retinanet/preprocessing/generator.py", line 269, in compute_input_output
image_group = self.load_image_group(group)
File "/notebooks/SKU/object_detector_retinanet/keras_retinanet/preprocessing/generator.py", line 147, in load_image_group
return [self.load_image(image_index) for image_index in group]
File "/notebooks/SKU/object_detector_retinanet/keras_retinanet/preprocessing/generator.py", line 147, in
return [self.load_image(image_index) for image_index in group]
File "/notebooks/SKU/object_detector_retinanet/keras_retinanet/preprocessing/csv_generator.py", line 270, in load_image
return read_image_bgr(self.image_path(image_index))
File "/notebooks/SKU/object_detector_retinanet/keras_retinanet/utils/image.py", line 32, in read_image_bgr
image = np.asarray(Image.open(path).convert('RGB'))
File "/opt/conda/lib/python3.7/site-packages/PIL/Image.py", line 912, in convert
self.load()
File "/opt/conda/lib/python3.7/site-packages/PIL/ImageFile.py", line 239, in load
len(b))
OSError: image file is truncated (23 bytes not processed)

Any help is appreciated.

Understanding your loss function.

Hi,

In the paper you write that the loss is defined as follows:

L = L_classification + L_regression + L_sIoU

From this repository (the readme and your code), I understand that you follow this procedure:

  1. Train RetinaNet. This model is compiled using the two losses defined by RetinaNet (L_classification and L_regression).
  2. Add the IoU submodel. Freeze every other layer that is not in this submodel and compile the entire new model with L_sIoU. Then train again to obtain your final model.

To me it seems that all layers except the IoU submodel do not benefit from the added loss, and that the layers in the new submodel are updated without considering the RetinaNet losses. So it seems that the 'sub-losses' do not contribute evenly to the final loss L.

Can you explain this to me?

ImportError: No module named 'object_detector_retinanet'

When I run train_iou.py with the following command line, the error above occurs.
Terminal command:
"python3 -u object_detector_retinanet/keras_retinanet/bin/train_iou.py --gpu 0 --weights "/home/ipsg/Documents/SKU110K/snapshot/Wed_Jul__3_11_37_12_2019/resnet50_csv_03.h5" csv | tee train_iou_sku110k.log"

I just started deep learning and don't know how to correct this. Dear Eran, please give me some advice.
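This error usually means the repository root is not on Python's module search path; the export from the Usage section above addresses it when run from the repository root:

    export PYTHONPATH=$(pwd)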

output.csv has nothing after prediction

Hello authors,

I tried running the prediction on 1 image, with all annotations for that 1 image. I am able to get the res_image_iou (marked image), but I don't see anything inside the CSV generated under the results folder.

Also, the weights shared at #9 give an output with 3 elements, whereas at #21 the author says there are 4 elements returned by iou_retinanet. I am following the notebook file https://github.com/yafeunteun/SKU110K_code/blob/master/examples/retinanet.ipynb (Load Retinanet + IoU model) with the weights given by the author. I wanted to check if I am doing something wrong.

no attribute 'SIGALRM'

I get this error, "AttributeError: module 'signal' has no attribute 'SIGALRM'", after calling predict.py. I used pre-trained weights that were shared in an earlier issue. I realized the problem is with signal.SIGALRM, which is used in CollapsingMoG.py; Windows doesn't implement that signal. Is there a way to overcome this, and did anyone actually run prediction on Windows?
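One possible workaround (an untested sketch, not from the repository) is to guard the alarm-based timeout so it is skipped on platforms without SIGALRM:

    import signal

    def set_timeout(seconds, handler):
        # signal.SIGALRM exists only on POSIX; on Windows, skip the timeout
        if hasattr(signal, "SIGALRM"):
            signal.signal(signal.SIGALRM, handler)
            signal.alarm(seconds)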

Pretrain Model

Hi @eg4000,

Thanks for your work; the result looks impressive. Just want to know, did you train the model from scratch? Will you share your model for display and evaluation?

Thx.

cv2.error: in function cv::pointPolygonTest

When I run predict.py, I meet this error in EmMerger.py. Could you give me some help?
Thanks.

cv2.error: OpenCV(4.1.0) C:\projects\opencv-python\opencv\modules\imgproc\src\geometry.cpp:103: error: (-215:Assertion failed) total >= 0 && (depth == CV_32S || depth == CV_32F) in function 'cv::pointPolygonTest'

Traceback (most recent call last):
  File "predict.py", line 155, in <module>
    main()
  File "predict.py", line 150, in main
    hard_score_rate=hard_score_rate
  File "D:\chenle\SKU110K_CVPR19.git\trunk\object_detector_retinanet\keras_retinanet\utils\predict_iou.py", line 81, in predict
    filtered_data = EmMerger.merge_detections(image_name, results)
  File "D:\chenle\SKU110K_CVPR19.git\trunk\object_detector_retinanet\keras_retinanet\utils\EmMerger.py", line 387, in merge_detections
    filtered_data = duplicate_merger.filter_duplicate_candidates(result_df, pixel_data)
  File "D:\chenle\SKU110K_CVPR19.git\trunk\object_detector_retinanet\keras_retinanet\utils\EmMerger.py", line 69, in filter_duplicate_candidates
    candidates = self.find_new_candidates(contours, heat_map, data, original_detection_centers, image)
  File "D:\chenle\SKU110K_CVPR19.git\trunk\object_detector_retinanet\keras_retinanet\utils\EmMerger.py", line 149, in find_new_candidates
    cov, mu, num, roi = self.remove_redundant(contour_bbox, cov, k, mu, image, sub_heat_map)
  File "D:\chenle\SKU110K_CVPR19.git\trunk\object_detector_retinanet\keras_retinanet\utils\EmMerger.py", line 214, in remove_redundant
    ct_i_to_pt_j = -cv2.pointPolygonTest(cnt_i, (mu[j][0], mu[j][1]), measureDist=True)
cv2.error: OpenCV(4.1.0) C:\projects\opencv-python\opencv\modules\imgproc\src\geometry.cpp:103: error: (-215:Assertion failed) total >= 0 && (depth == CV_32S || depth == CV_32F) in function 'cv::pointPolygonTest'
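A commonly reported workaround for this OpenCV 4.x assertion (an assumption, not a confirmed fix from the authors) is to ensure the contour array has a float32 or int32 dtype and the point is a plain float tuple before the failing call in EmMerger.py:

    # hypothetical patch around the failing call; cnt_i and mu come from the repository code
    cnt_i = numpy.asarray(cnt_i, dtype=numpy.float32)
    ct_i_to_pt_j = -cv2.pointPolygonTest(cnt_i, (float(mu[j][0]), float(mu[j][1])), measureDist=True)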

OSError: image file is truncated (corrupted images?)

I received the following error when training using the SKU-110K images. I assume there is a corrupted image, but I couldn't identify which one.

Traceback (most recent call last):
  File "/home/ubuntu/SKU110K_CVPR19/object_detector_retinanet/keras_retinanet/bin/train.py", line 448, in <module>
    main()
  File "/home/ubuntu/SKU110K_CVPR19/object_detector_retinanet/keras_retinanet/bin/train.py", line 443, in main
    validation_steps=validation_generator.size()
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.7/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.7/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.7/site-packages/keras/engine/training_generator.py", line 181, in fit_generator
    generator_output = next(output_generator)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.7/site-packages/keras/utils/data_utils.py", line 709, in get
    six.reraise(*sys.exc_info())
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.7/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.7/site-packages/keras/utils/data_utils.py", line 685, in get
    inputs = self.queue.get(block=True).get()
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.7/site-packages/keras/utils/data_utils.py", line 626, in next_sample
    return six.next(_SHARED_SEQUENCES[uid])
  File "/home/ubuntu/SKU110K_CVPR19/object_detector_retinanet/keras_retinanet/preprocessing/generator.py", line 287, in __next__
    return self.next()
  File "/home/ubuntu/SKU110K_CVPR19/object_detector_retinanet/keras_retinanet/preprocessing/generator.py", line 298, in next
    return self.compute_input_output(group)
  File "/home/ubuntu/SKU110K_CVPR19/object_detector_retinanet/keras_retinanet/preprocessing/generator.py", line 269, in compute_input_output
    image_group = self.load_image_group(group)
  File "/home/ubuntu/SKU110K_CVPR19/object_detector_retinanet/keras_retinanet/preprocessing/generator.py", line 147, in load_image_group
    return [self.load_image(image_index) for image_index in group]
  File "/home/ubuntu/SKU110K_CVPR19/object_detector_retinanet/keras_retinanet/preprocessing/generator.py", line 147, in <listcomp>
    return [self.load_image(image_index) for image_index in group]
  File "/home/ubuntu/SKU110K_CVPR19/object_detector_retinanet/keras_retinanet/preprocessing/csv_generator.py", line 270, in load_image
    return read_image_bgr(self.image_path(image_index))
  File "/home/ubuntu/SKU110K_CVPR19/object_detector_retinanet/keras_retinanet/utils/image.py", line 32, in read_image_bgr
    image = np.asarray(Image.open(path).convert('RGB'))
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.7/site-packages/PIL/Image.py", line 912, in convert
    self.load()
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.7/site-packages/PIL/ImageFile.py", line 239, in load
    len(b))
OSError: image file is truncated (23 bytes not processed)

How can I use multiple GPUs to train?

When I use train_iou.py, there are many warnings, like this:

...SKU110K_CVPR19/object_detector_retinanet/keras_retinanet/preprocessing/generatorIou.py:140: UserWarning: Image with id 5425 (shape (3264, 2448, 3)) contains the following invalid boxes: [array([2365., 2627., 2449., 2763., 0.])].

How to avoid SIGALRM (failed prediction)?

What is really causing SIGALRM during inference? Am I out of computing resources (CPU, memory, GPU), hence the timeout? I usually increase the threshold from 0.4 or 0.5 to avoid this problem.
