penny4860 / yolo-digit-detector

A digit detector for natural scenes, built with ResNet50 and YOLO-v2. It is trained on the SVHN dataset and implemented with TensorFlow and Keras.

License: MIT License

Python 18.84% Jupyter Notebook 81.16%

yolov2 keras svhn-dataset digit-detector

yolo-digit-detector's Introduction

SVHN yolo-v2 digit detector

I have implemented a digit detector that applies YOLO-v2 to the SVHN dataset.

Usage for python code

0. Requirements

python 3.6
tensorflow 1.14.0
keras 2.1.1
opencv 3.3.0
Etc.

I recommend creating an Anaconda environment dedicated to this project so that it stays independent of your other projects. You can create one with the following steps. This process has been verified on Windows 10 and Ubuntu 16.04.

$ conda create -n yolo python=3.6
$ activate yolo # on Linux: "source activate yolo"
(yolo) $ pip install -r requirements.txt
(yolo) $ pip install -e .

1. Digit Detection using pretrained weight file

In this project, the pretrained weight file is stored in weights.h5.

Example code for predicting digit regions in a natural image is in detection_example.ipynb.
Evaluation on 1000 training-set images gives:
- fscore / precision / recall: 0.799, 0.791, 0.807
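The three numbers above are related. As a minimal sketch, assuming the reported fscore is the usual F1 score (the harmonic mean of precision and recall), it can be recomputed from the other two values. The function name `f_score` is only illustrative and is not taken from this repository.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (assumed here to be the reported fscore)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for the 1000-image training-set evaluation.
print(round(f_score(0.791, 0.807), 3))  # 0.799
```

The result matches the reported fscore, which is consistent with the F1 assumption.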

2. Training from scratch

This project provides a way to train a digit detector from scratch. By following the commands below, you can build a digit detector with just two images.

First, train all layers with the following command.
- project/root> python train.py -c configs/from_scratch.json
Next, fine-tune only the last layer with the following command.
- project/root> python train.py -c configs/from_scratch2.json
Finally, evaluate the trained digit detector.
- project/root> python evaluate.py -c configs/from_scratch.json -w svhn/weights.h5
- The evaluation results are printed in the following form.
  - {'fscore': 1.0, 'precision': 1.0, 'recall': 1.0}
- The prediction result images are saved in the project/detected directory.

Now you can add more images to train a digit detector with good generalization performance.

3. SVHN dataset in Pascal VOC annotation format

In this project, I use the Pascal VOC format for the annotations used to train the object detector. Annotation files in this format can be downloaded from svhn-voc-annotation-format.
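To make the annotation format concrete, here is a minimal sketch that reads one Pascal VOC file with the Python standard library. The XML content, the box coordinates, and the helper name `parse_voc` are hypothetical and only illustrate the general layout of a VOC annotation. They are not copied from this project's dataset or code.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for an image containing the digits "1" and "9".
SAMPLE_XML = """
<annotation>
  <filename>1.png</filename>
  <size><width>741</width><height>350</height><depth>3</depth></size>
  <object>
    <name>1</name>
    <bndbox><xmin>246</xmin><ymin>77</ymin><xmax>327</xmax><ymax>296</ymax></bndbox>
  </object>
  <object>
    <name>9</name>
    <bndbox><xmin>323</xmin><ymin>81</ymin><xmax>419</xmax><ymax>300</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc(xml_text):
    """Return (filename, [(label, xmin, ymin, xmax, ymax), ...]) for one annotation."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = [int(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return root.findtext("filename"), boxes


filename, boxes = parse_voc(SAMPLE_XML)
print(filename, boxes)
```

Each `object` element carries one label and one bounding box, so an image with several digits has several `object` entries.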

Other Results

1. Raccoon dataset : https://github.com/experiencor/raccoon_dataset

The pretrained weight file is stored at raccoon.
Training set evaluation (160 images):
- fscore / precision / recall: 0.937, 0.963, 0.913
Test set evaluation (40 images):
- fscore / precision / recall: 0.631, 0.75, 0.545

Copyright

See LICENSE for details.
This project started from basic-yolo-keras. I refactored its source code structure and added CI tests. I also applied the SVHN dataset to implement the digit detector. Thanks to Huynh Ngoc Anh for providing a good project as open source.

yolo-digit-detector's Issues

training on svhn

I tried using train_driver.py to train on the full SVHN training data with the following steps:

- Generated annotation .xml files for each SVHN image.
- Used from_scratch.json as the config, with the same anchors.
- Ran it with 2 warmup epochs and then 5 epochs. It doesn't give any boxes on the test image 1.png.
- If I use a very low threshold such as 0.0001, some boxes appear, but in the wrong places.
- I used 10 labels, 1 to 10, instead of just "digits" (updated in from_scratch.json as well).

Can you please tell me how you trained the network that is in the repo?
Do you think it needs to run for far more epochs, or should I use only one label, "digits"?

Digit Detection fails after training from scratch (or fine tuning)

Hi,

Digit detection using the pretrained weight file works fine with the weights.h5 file you provided. When I try to train from scratch or fine-tune (even with just the 2 images you included, as your README describes), training completes just fine. However, when I evaluate using the evaluate.py script or the detection_example notebook, I get the following error when loading the weights:

Traceback (most recent call last):
  File "evaluate.py", line 56, in <module>
    yolo.load_weights(args.weights)
  File "/home/nchecka/Code/ssocr/Yolo-digit-detector/yolo/frontend.py", line 71, in load_weights
    self._yolo_network.load_weights(weight_path, by_name=by_name)
  File "/home/nchecka/Code/ssocr/Yolo-digit-detector/yolo/backend/network.py", line 58, in load_weights
    self._model.load_weights(weight_path, by_name=by_name)
  File "/home/nchecka/.virtualenvs/cvml/lib/python3.5/site-packages/keras/engine/topology.py", line 2622, in load_weights
    load_weights_from_hdf5_group(f, self.layers)
  File "/home/nchecka/.virtualenvs/cvml/lib/python3.5/site-packages/keras/engine/topology.py", line 3142, in load_weights_from_hdf5_group
    K.batch_set_value(weight_value_tuples)
  File "/home/nchecka/.virtualenvs/cvml/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2247, in batch_set_value
    get_session().run(assign_ops, feed_dict=feed_dict)
  File "/home/nchecka/.virtualenvs/cvml/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/home/nchecka/.virtualenvs/cvml/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 975, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (30, 2048, 1, 1) for Tensor 'Placeholder_318:0', which has shape '(1, 1, 2048, 75)'

Am I missing something? I'm using keras 2.1.1 and tensorflow 1.2.1
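One possible reading of the two sizes in the error, 30 and 75, is sketched below. It assumes the standard YOLO-v2 output layout, in which the last convolution has one group of 4 box coordinates, 1 objectness score, and one score per class for every anchor box. Under that assumption, 5 anchor boxes with 10 digit labels give 75 channels, and 5 anchor boxes with a single label give 30, so the saved weights and the model being loaded may have been built with different label counts. This is an interpretation, not something confirmed in the thread, and it does not explain the different axis order of the two shapes. The function name is hypothetical. The anchor values are the ones from the config quoted in another issue on this page.

```python
def yolo_v2_output_channels(anchors, labels):
    """Channels of the final YOLO-v2 convolution: n_boxes * (4 coords + 1 objectness + n_classes)."""
    n_boxes = len(anchors) // 2  # anchors are stored as a flat list of (width, height) pairs
    return n_boxes * (4 + 1 + len(labels))


anchors = [0.57273, 0.677385, 1.87446, 2.06253, 3.33843,
           5.47434, 7.88282, 3.52778, 9.77052, 9.16828]

print(yolo_v2_output_channels(anchors, [str(d) for d in range(10)]))  # 75
print(yolo_v2_output_channels(anchors, ["digit"]))                    # 30
```

If this reading applies, using the same label list (and the same detection-only setting, if any) for both training and evaluation would be the first thing to check.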

Did you train your model from scratch without loading any pretrained weights?

Did you train your model from scratch, without loading YOLO weights pretrained on the COCO dataset or some other pretrained version of YOLO?

Convert svhn annotation to VOC format

requirements.txt added

windows / linux environment

How to train with non-square images without resizing

I'm looking for a way to train the model on rectangle images. Any advice appreciated!

remove feature weights

pre-train model

Why does the pre-trained model give bad results? The digits are not detected.

Detecting Handwritten Digits

I was wondering if we can use this model for handwritten digit recognition in documents

Pre-trained model doesn't detect anything on the SVHN sample images

I am trying to make this notebook work for me (I already solved the 'str' decode and cv2 nonetype problem), and when it comes to predicting numbers on the two sample images, it returns no boxes even with 0.1 threshold. What could be the reason behind it?

Yes, Tensorflow is 1.14.0 and Keras is 2.1.1

(Also this one is quite hilarious)

train model by other labels

Hello,
I want to train the model with 38 different labels. What should I change to make the model work?

add svhn dataset description

I'm not able to download weights.h5

I was wondering if there is another link to weights.h5 for the yolo model. The link doesn't work. Thank you for your help.

Does your notebook directly use weights.h5 from Google Drive?

Hi Penny,

Nice work on digit recognition.
I tried your notebook with the weights downloaded from Google Drive,
but the prediction shows 50+ detected boxes.
Even with the threshold set to 0.99, 30+ boxes are still detected.

Any hints on reproducing your notebook result?

Too many boxes

I trained the model, but it does not detect the numbers. There are too many boxes.
Please help.

The Layer has never been called and thus has no defined output shape.

{
    "model" : {
        "architecture": "ResNet50",
        "input_size": 416,
        "anchors": [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
        "labels": ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"],
        "coord_scale" : 1.0,
        "class_scale" : 1.0,
        "object_scale" : 5.0,
        "no_object_scale" : 1.0
    },
    "pretrained" : {
        "full": "svhn/weights.h5"
    },
    "train" : {
        "actual_epoch": 20,
        "train_image_folder": "tests/dataset/svhn/imgs/",
        "train_annot_folder": "tests/dataset/svhn/anns/",
        "train_times": 5,
        "valid_image_folder": "tests/dataset/svhn/imgs/",
        "valid_annot_folder": "tests/dataset/svhn/anns/",
        "valid_times": 1,
        "batch_size": 2,
        "learning_rate": 1e-4,
        "saved_folder": "svhn",
        "jitter": false,
        "first_trainable_layer": "input_1",
        "is_only_detect" : true
    }
}
Python: 3.7
TF: 2.3.0rc1
Keras: 2.4.3
OpenCV: 3.4.2.16

Don't have any idea what this means or why it's happening. If anyone can help it would be greatly appreciated.

refactor Yolo loss module.

'str' object has no attribute 'decode'

Dear Sir,

I hit this issue when I try to run with the pretrained weights or with my own trained weights. The error I finally get is:

AttributeError: 'str' object has no attribute 'decode'

Your advice would be much appreciated.
Thanks

Loading pre-trained weights in svhn/weights.h5
Traceback (most recent call last):
  File "train.py", line 62, in <module>
    yolo.load_weights(config['pretrained']['full'], by_name=True)
  File "/home/hasantha/Documents/Yolo-digit-detector-master/yolo/frontend.py", line 71, in load_weights
    self._yolo_network.load_weights(weight_path, by_name=by_name)
  File "/home/hasantha/Documents/Yolo-digit-detector-master/yolo/backend/network.py", line 58, in load_weights
    self._model.load_weights(weight_path, by_name=by_name)
  File "/home/hasantha/anaconda3/envs/yolo/lib/python3.6/site-packages/keras/engine/topology.py", line 2620, in load_weights
    load_weights_from_hdf5_group_by_name(f, self.layers)
  File "/home/hasantha/anaconda3/envs/yolo/lib/python3.6/site-packages/keras/engine/topology.py", line 3161, in load_weights_from_hdf5_group_by_name
    original_keras_version = f.attrs['keras_version'].decode('utf8')
AttributeError: 'str' object has no attribute 'decode'
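The failing line calls `.decode('utf8')` on an HDF5 attribute. This pattern is commonly seen when an older Keras release is combined with h5py 3.x, which returns string attributes as `str` rather than `bytes`. A frequently reported workaround is to install an h5py version below 3.0 in the environment. Whether that is the cause here is an assumption, since the thread does not state the installed h5py version. The sketch below only illustrates the type mismatch. The helper `as_text` is hypothetical and is not part of Keras or this repository.

```python
def as_text(value, encoding="utf8"):
    """Return the value as str, decoding only when it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Older h5py versions hand back bytes, newer ones hand back str for the same attribute.
print(as_text(b"2.1.1"))  # 2.1.1
print(as_text("2.1.1"))   # 2.1.1

# Calling .decode() directly on a str reproduces the error from the traceback.
try:
    "2.1.1".decode("utf8")
except AttributeError as err:
    print(err)  # 'str' object has no attribute 'decode'
```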

SVHN labeling dataset

@penny4860 hi! Do you have SVHN dataset with labels for YOLO?

Yolo weights file

Hi, do you have a trained weights file that I can use directly with the original darknet? Training from scratch could be time-consuming.

Detecting more than 2 digits

Hi, I was wondering if this code can detect more than 2 digits per picture. I am training it with pictures with 4 digits, and it has only been detecting 2.

resnet backend

test yolo loss

Training loss is nan

I have a problem with training: after some time, the training loss becomes NaN.

remove true boxes input tensor

Hey, the pretrained weights are overfitted.

Can you please confirm? I am getting a large number of bounding boxes for the example images you provided.
I am not fine-tuning, just using your weights.

RuntimeError: The layer has never been called and thus has no defined output shape

While running detection_example.ipynb and the three steps listed under "Training from scratch", I get a runtime error saying:

RuntimeError: The layer has never been called and thus has no defined output shape

Can you please check it and help me resolve it?

incorrect detection outputs by pre-trained model

The detection outputs are incorrect when using the pre-trained model (https://drive.google.com/drive/folders/1Lg3eAPC39G9GwVTCH3XzF73Eok-N-dER) to run detection_example.ipynb on the master branch.

Raw outputs are attached below.

Extraneous detection boxes on sample input using pre-trained weight file

See attached images. It looks like the "actual" digits are being picked up correctly, but the results are getting clouded by a large number of '1.00' boxes. The two images reported:

18-boxes are detected
26-boxes are detected

That was with the threshold pushed all the way up to 0.99999. With the threshold at the original 0.3, the number of boxes is 64 and 87.
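For readers unfamiliar with how a detector's raw boxes are usually reduced, the following is a generic sketch of score thresholding followed by greedy non-maximum suppression. It is not this repository's post-processing code, and all names, boxes, and default thresholds are illustrative assumptions. It also would not remove confident false positives that do not overlap a better box, which is what the '1.00' boxes described above may be.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_boxes(boxes, scores, score_threshold=0.3, iou_threshold=0.5):
    """Return indices of boxes kept after thresholding and greedy NMS."""
    # Keep only boxes at or above the score threshold, best score first.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Drop a box if it overlaps too much with a higher-scoring box already kept.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 50, 90), (12, 12, 52, 92), (100, 10, 140, 90)]
scores = [0.99, 0.95, 0.20]
print(filter_boxes(boxes, scores))                       # [0]
print(filter_boxes(boxes, scores, score_threshold=0.1))  # [0, 2]
```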

Training model from scratch on custom dataset performing poorly

input size

What are input_size and anchors in from_scratch.json and from_scratch2.json?
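As a rough illustration of these two settings, the sketch below assumes the usual YOLO-v2 convention: the image is resized to input_size x input_size pixels, the network downsamples by a factor of 32, and the anchors are a flat list of (width, height) pairs measured in grid cells. The stride of 32 and the unit of the anchors are assumptions about this project, not statements from its README. The function name is hypothetical, and the values come from the config quoted in another issue on this page.

```python
def describe_anchors(input_size, anchors, stride=32):
    """Return (grid size, anchor pairs in grid cells, anchor pairs in pixels)."""
    grid = input_size // stride
    pairs = list(zip(anchors[0::2], anchors[1::2]))  # (width, height) per anchor box
    pixels = [(round(w * stride, 1), round(h * stride, 1)) for w, h in pairs]
    return grid, pairs, pixels


anchors = [0.57273, 0.677385, 1.87446, 2.06253, 3.33843,
           5.47434, 7.88282, 3.52778, 9.77052, 9.16828]

grid, pairs, pixels = describe_anchors(416, anchors)
print(grid)        # 13
print(len(pairs))  # 5
print(pixels[0])   # (18.3, 21.7)
```

Under these assumptions, an input_size of 416 gives a 13 x 13 prediction grid, and each of the 5 anchors is a prior box shape that the network refines at every grid cell.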