shinjayne / shintb

Textboxes : Image Text Detection Model : python package (tensorflow)

Home Page: https://shinjayne.github.io/deeplearning/2017/07/21/text-boxes-paper-review-1.html

Python 100.00%
cnn deep-learning detection ssd tensorflow text-detection textboxes

shintb's People

Contributors

shinjayne

shintb's Issues

Trying to access flag --mode before flags were parsed.

Hi! I am learning TensorFlow and TextBoxes. When I run the command "python3 main.py --mode train", it outputs:

Traceback (most recent call last):
  File "main.py", line 24, in <module>
    if FLAGS.mode == "train":
  File "/usr/local/lib/python3.5/dist-packages/absl/flags/_flagvalues.py", line 488, in __getattr__
    raise _exceptions.UnparsedFlagAccessError(error_message)
absl.flags._exceptions.UnparsedFlagAccessError: Trying to access flag --mode before flags were parsed.

How can I run main.py? Thanks!
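
The error says that `FLAGS.mode` was read before the command line had been parsed. With absl flags, parsing happens inside `absl.app.run(main)` or through an explicit `FLAGS(sys.argv)` call, so the flag must be read after one of those. A minimal sketch of the same parse-first, read-second pattern follows. It swaps in the standard library's `argparse` for absl flags, so it is not the repository's code. The `--jobname` and `--iter` options mirror names seen in tracebacks on this page, while the `test` mode, the default values, and the returned strings are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options before any of them are read."""
    parser = argparse.ArgumentParser(description="hypothetical shinTB-style entry point")
    # The set of modes is an assumption based on the commands quoted in the issues.
    parser.add_argument("--mode", choices=["train", "test", "image"], default="train")
    parser.add_argument("--jobname", default="testjob")  # assumed default
    parser.add_argument("--iter", type=int, default=10000)  # assumed default
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)  # parsing happens here, before args.mode is touched
    if args.mode == "train":
        return "training %s for %d iterations" % (args.jobname, args.iter)
    return "running mode %s" % args.mode


print(main(["--mode", "train", "--iter", "5"]))
```

Reading the options only inside `main`, after parsing, is what avoids the unparsed-flag situation regardless of which flag library is used.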

Should "h" be computed as "h = float(rectangle.get('height')) / imageHeight"?

The current code divides by a fixed 300.0:

h = float(rectangle.get('height')) / 300.0

Suggested change, normalizing by each image's own resolution:

		imageWidth = float(image.find('Resolution').get('x'))
		imageHeight = float(image.find('Resolution').get('y'))
		taggedRectangles = image.find('taggedRectangles')
		for rectangle in taggedRectangles.findall('taggedRectangle'):
			h = float(rectangle.get('height')) / imageHeight
			w = float(rectangle.get('width'))  / imageWidth
			x = float(rectangle.get('x'))      / imageWidth
			y = float(rectangle.get('y'))      / imageHeight
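
The normalization proposed above can be tried in isolation with the standard library's XML parser. The sketch below is self-contained. The element and attribute names (`Resolution`, `taggedRectangles`, `taggedRectangle`) come from the snippet above, while the `tagset` root element, the sample values, and the function name `normalized_boxes` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical SVT-style annotation; only the names used in the snippet above are from the source.
SAMPLE_XML = """
<tagset>
  <image>
    <imageName>img/example.jpg</imageName>
    <Resolution x="1280" y="880"/>
    <taggedRectangles>
      <taggedRectangle height="88" width="320" x="640" y="440"/>
    </taggedRectangles>
  </image>
</tagset>
"""


def normalized_boxes(xml_text):
    """Return (x, y, w, h) tuples scaled to [0, 1] by each image's own resolution."""
    boxes = []
    root = ET.fromstring(xml_text)
    for image in root.findall("image"):
        resolution = image.find("Resolution")
        image_width = float(resolution.get("x"))
        image_height = float(resolution.get("y"))
        for rect in image.find("taggedRectangles").findall("taggedRectangle"):
            boxes.append((
                float(rect.get("x")) / image_width,
                float(rect.get("y")) / image_height,
                float(rect.get("width")) / image_width,
                float(rect.get("height")) / image_height,
            ))
    return boxes


print(normalized_boxes(SAMPLE_XML))  # [(0.5, 0.5, 0.25, 0.1)]
```

Dividing by a fixed 300.0 only gives values in [0, 1] when the annotated image is 300 pixels on each side; dividing by the stored resolution does so for any image size.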

How do I run runner.image?

1. When I run "main.py --mode = image", the following error occurs. How can I solve it? If you know, please let me know. Thank you @shinjayne.

OpenCV(3.4.1) Error: Assertion failed (depth == 0 || depth == 2 || depth == 5) in cvtColor, file /io/opencv/modules/imgproc/src/color.cpp, line 11109
Traceback (most recent call last):
  File "main.py", line 34, in <module>
    runner.image()
  File "/home/dell/gen/shinTB/shintb/runner.py", line 260, in image
    self.outputdrawer.draw_outputs(test_img, output_boxes , output_confidence , wait=1)
  File "/home/dell/gen/shinTB/shintb/output_drawer.py", line 100, in draw_outputs
    I = cv2.cvtColor(I, cv2.COLOR_RGB2BGR)
cv2.error: OpenCV(3.4.1) /io/opencv/modules/imgproc/src/color.cpp:11109: error: (-215) depth == 0 || depth == 2 || depth == 5 in function cvtColor
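
The assertion lists the array depths that this cvtColor call accepts: 0, 2 and 5, which are OpenCV's codes for 8-bit unsigned, 16-bit unsigned and 32-bit float data. A 64-bit float array (NumPy's default float type) does not pass that check, so it is a plausible cause here, although the issue does not show the array's dtype. A minimal sketch of converting an image to uint8 before the color conversion, assuming floating-point images hold intensities in [0, 1]; the helper name `to_uint8_image` is hypothetical.

```python
import numpy as np


def to_uint8_image(img):
    """Convert an image array to uint8 so OpenCV color conversion accepts it.

    Assumption: floating-point images store intensities in [0, 1].
    """
    img = np.asarray(img)
    if img.dtype == np.uint8:
        return img
    if np.issubdtype(img.dtype, np.floating):
        img = np.clip(img, 0.0, 1.0) * 255.0
    return np.clip(np.rint(img), 0, 255).astype(np.uint8)


float_img = np.ones((2, 2, 3), dtype=np.float64)
converted = to_uint8_image(float_img)
print(converted.dtype, converted.max())
```

With such a helper, the failing line could become `I = cv2.cvtColor(to_uint8_image(I), cv2.COLOR_RGB2BGR)`. If the image values are already in the 0 to 255 range, casting with `I.astype(np.uint8)` or `I.astype(np.float32)` would be the simpler change.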

2. When I call "runner.image", I get the following output. How can I solve it? If you know, please let me know. Thank you @shinjayne.

Textboxes information!
rect_start : (0, 0) // rect_end : (0, 0)
confidence: 0.499774
Textboxes information!
rect_start : (0, 0) // rect_end : (0, 0)
confidence: 0.499774
Textboxes information!
rect_start : (0, 0) // rect_end : (0, 0)
confidence: 0.499774
Textboxes information!
rect_start : (0, 0) // rect_end : (0, 0)
confidence: 0.499774
Textboxes information!
rect_start : (0, 0) // rect_end : (0, 0)
confidence: 0.499774
Textboxes information!
rect_start : (0, 0) // rect_end : (0, 0)
confidence: 0.499774
Textboxes information!
rect_start : (0, 0) // rect_end : (0, 0)
confidence: 0.499774
Textboxes information!
rect_start : (0, 0) // rect_end : (0, 0)
confidence: 0.499774

I'm just wondering...

Has anybody succeeded in running this code normally? I ask because:

  1. There are some issues regarding the test / image functions.
  2. From the loss curve image, it seems to diverge, not converge.

I trained on TensorFlow 1.4 and made many attempts at hyperparameter tuning (learning rate, batch size, batch normalization, etc.).

But none of my efforts worked.

If someone has succeeded in training this model, could you upload or commit some trained checkpoints?
Otherwise, maybe I should write all of the code from scratch.
I get an error when I use 'runner.test'. Can you tell me how to solve it?

When I run "runner.test", I get this error:

cv2.namedWindow("outputs", cv2.WINDOW_NORMAL)
cv2.error: /io/opencv/modules/highgui/src/window.cpp:565: error: (-2) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Carbon support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function cvNamedWindow

After deleting the line cv2.namedWindow("outputs", cv2.WINDOW_NORMAL) from runner.test, I get this instead:

  File "/home/tian/tensorflow/example/11/shinTB-master/main.py", line 22, in <module>
    runner.test(1000)
  File "/home/tian/tensorflow/example/11/shinTB-master/shintb/runner.py", line 196, in test
    self.outputdrawer.draw_outputs(test_imgs[0], output_boxes, output_confidence, wait=1)
  File "/home/tian/tensorflow/example/11/shinTB-master/shintb/output_drawer.py", line 104, in draw_outputs
    I = cv2.rectangle(I, rect_start, rect_end, (255, 0, 0) , 5 )
TypeError: integer argument expected, got float

The printed values are:

FILTERED BOXES INFO : [([-inf, -inf, inf, inf], 0.49975014, 0)...
PICKED BOXES INFO : [([-inf, -inf, inf, inf], 0.49975014, 0)]

Can you tell me how to solve this problem?
Thank you for your help.
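
The TypeError arises because cv2.rectangle expects integer point coordinates, while rect_start and rect_end hold floats. A plain int() cast would not be enough for the boxes printed above, because int() raises an error on infinite values. A minimal sketch of a guard that could run before drawing follows. It assumes boxes are (x, y, w, h) in normalized [0, 1] units, which the issue does not confirm, and the helper name `to_pixel_rect` is hypothetical.

```python
import math


def to_pixel_rect(box, image_width, image_height):
    """Turn a normalized (x, y, w, h) box into integer corner points.

    Returns None when any value is NaN or infinite, so the caller can skip it.
    """
    if not all(math.isfinite(v) for v in box):
        return None
    x, y, w, h = box
    # Clamp to the image so the corners are valid pixel coordinates.
    left = min(max(int(round(x * image_width)), 0), image_width - 1)
    top = min(max(int(round(y * image_height)), 0), image_height - 1)
    right = min(max(int(round((x + w) * image_width)), 0), image_width - 1)
    bottom = min(max(int(round((y + h) * image_height)), 0), image_height - 1)
    return (left, top), (right, bottom)


print(to_pixel_rect((0.1, 0.2, 0.3, 0.4), 300, 300))  # ((30, 60), (120, 180))
print(to_pixel_rect((float("-inf"), float("-inf"), float("inf"), float("inf")), 300, 300))  # None
```

Skipping non-finite boxes only hides the symptom. Predictions of -inf and inf suggest the network output itself has diverged, which drawing code cannot repair.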

Question about runner.test()

Hi, thank you for the great code! I am a newbie with TextBoxes.
After training with runner.train(..), I ran runner.test(2) and got an error like this:


TypeError Traceback (most recent call last)
in ()
----> 1 runner.test(2)

E:\shinTB\shintb\runner.py in test(self, iter)
194 output_boxes, output_confidence = self.outputdrawer.format_output(pred_conf[0], pred_loc[0])
195
--> 196 self.outputdrawer.draw_outputs(test_imgs[0], output_boxes, output_confidence, wait=1)
197
198 step += 1

E:\shinTB\shintb\output_drawer.py in draw_outputs(self, img, boxes, confidences, wait)
102 rect_start = (x,y)
103 rect_end = (x+w, y+h)
--> 104 I = cv2.rectangle(I, rect_start, rect_end, (255, 0, 0) , 5 )
105
106 print("Textboxes information!")

TypeError: integer argument expected, got float

Can you fix it?

Trained model

Would you mind sharing the trained model (ckpt file)?

Errors when running test

The following is the code that I used to test the model after training:
from config import config
from shintb import graph_drawer, default_box_control, svt_data_loader, runner,output_drawer
import tensorflow as tf
flags = tf.app.flags
FLAGS = flags.FLAGS
graphdrawer = graph_drawer.GraphDrawer(config)
dataloader = svt_data_loader.SVTDataLoader('./svt2/train.xml','./svt2/test.xml')
dbcontrol = default_box_control.DefaultBoxControl(config, graphdrawer)
outputdrawer = output_drawer.OutputDrawer(config, dbcontrol)
runner = runner.Runner(config, graphdrawer, dataloader, dbcontrol,outputdrawer)
flags.DEFINE_integer("iter", 10000, "iteration for job")
runner.test(FLAGS.iter)

But I get the following errors:

/home/jsj/shinTB/shintb/utils/box_calculation.py:52: RuntimeWarning: invalid value encountered in double_scalars
  right = min(r1[0] + r1[2], r2[0] + r2[2])
/home/jsj/shinTB/shintb/utils/box_calculation.py:53: RuntimeWarning: invalid value encountered in double_scalars
  bottom = min(r1[1] + r1[3], r2[1] + r2[3])
PICKED BOXES INFO : [([-inf, 4260720.561588916, inf, 0.044721359549995794], 0.49975014, 0)]
/home/jsj/shinTB/shintb/output_drawer.py:103: RuntimeWarning: invalid value encountered in double_scalars
  rect_end = (x+w, y+h)
Traceback (most recent call last):
  File "/home/jsj/shinTB/new_file_for_test.py", line 12, in <module>
    runner.test(FLAGS.iter)
  File "/home/jsj/shinTB/shintb/runner.py", line 196, in test
    self.outputdrawer.draw_outputs(test_imgs[0], output_boxes, output_confidence, wait=1)
  File "/home/jsj/shinTB/shintb/output_drawer.py", line 104, in draw_outputs
    I = cv2.rectangle(I, rect_start, rect_end, (255, 0, 0) , 5 )
TypeError: integer argument expected, got float.

Can you help me fix the problem?
@shinjayne
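
The RuntimeWarning lines are consistent with the picked box shown in the log: its x is -inf and its width is inf, and -inf + inf evaluates to NaN, which is what "invalid value encountered" reports. A minimal sketch of an overlap computation that rejects such boxes up front follows. It is not the repository's box_calculation.py; the function name `iou` and the choice to return 0.0 for unusable input are assumptions.

```python
import math


def iou(r1, r2):
    """Intersection over union of two (x, y, w, h) boxes; 0.0 for unusable input."""
    if not all(math.isfinite(v) for v in (*r1, *r2)):
        return 0.0  # -inf + inf would otherwise produce NaN
    left = max(r1[0], r2[0])
    top = max(r1[1], r2[1])
    right = min(r1[0] + r1[2], r2[0] + r2[2])
    bottom = min(r1[1] + r1[3], r2[1] + r2[3])
    intersection = max(0.0, right - left) * max(0.0, bottom - top)
    union = r1[2] * r1[3] + r2[2] * r2[3] - intersection
    return intersection / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 2, 2)))
print(iou((float("-inf"), 0, float("inf"), 1), (0, 0, 1, 1)))  # 0.0
```

The first call overlaps a 1 by 1 region out of a union of 7, so it evaluates to 1/7. The guard keeps NaN out of later comparisons, but boxes with infinite coordinates still point to a problem in the predicted offsets themselves.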

Model not correct

I tried to train and test the model, but I think there are many errors in it. The most important is that confidence and delta are predicted as very large numbers (e.g. 1.8e+8). This makes the project unusable until it has been analyzed in depth.

missing 1 required positional argument: 'outputdrawer'

I get this error when I run "runner = runner.Runner(config, graphdrawer, dataloader, dbcontrol)":

Traceback (most recent call last):
  File "", line 1, in
TypeError: __init__() missing 1 required positional argument: 'outputdrawer'

An error occurs while saving a checkpoint. How can I fix it? Thanks a lot.

GLOBAL STEP : 1000 / LEARNING RATE : 0.0008 / LOSS : 1.18235 ( 0.27370095253 secs)
Traceback (most recent call last):
  File "main.py", line 30, in <module>
    runner.train(FLAGS.jobname, FLAGS.iter)
  File "/home/wanghz/shinTB/shintb/runner.py", line 101, in train
    ckpt_path = self.saver.save(self.sess, "%s.ckpt" % (c["saved_dir"]+"/"+jobname),global_step)
  File "/opt/DL/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1369, in save
    self.last_checkpoints, latest_filename)
  File "/opt/DL/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 796, in update_checkpoint_state
    text_format.MessageToString(ckpt))
  File "/opt/DL/tensorflow/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 353, in atomic_write_string_to_file
    rename(temp_pathname, filename, overwrite=True)
  File "/opt/DL/tensorflow/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 335, in rename
    compat.as_bytes(oldname), compat.as_bytes(newname), overwrite, status)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/opt/DL/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.FailedPreconditionError: saved/checkpoint.tmp205d246fabab4cb488968fcc6989f6d7
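
The failure happens while renaming a temporary file inside the `saved` directory. One plausible cause, which the issue does not confirm, is that this directory does not exist when the saver runs. A minimal sketch of creating the directory before building the checkpoint prefix; the helper name `checkpoint_prefix` and the job name are hypothetical.

```python
import os
import tempfile


def checkpoint_prefix(saved_dir, jobname):
    """Create saved_dir if needed and return the '<saved_dir>/<jobname>.ckpt' prefix."""
    os.makedirs(saved_dir, exist_ok=True)  # Python 3.2+; no error if it already exists
    return os.path.join(saved_dir, "%s.ckpt" % jobname)


# Usage with a throwaway directory standing in for the configured "saved" folder.
base = tempfile.mkdtemp()
prefix = checkpoint_prefix(os.path.join(base, "saved"), "testjob")
print(prefix)
```

The traceback above comes from Python 2.7, where `os.makedirs` has no `exist_ok` parameter; there the call would need an `os.path.isdir(saved_dir)` check in front of it.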

Module output_drawer has no attribute 'format_output'

Has anyone encountered this error when trying to run "test" or "image"?

Traceback (most recent call last):
  File "testShinTB.py", line 15, in <module>
    runner.test(5)
  File "/home/great/Documentos/spi/shinTB/shintb/runner.py", line 194, in test
    output_boxes, output_confidence = self.outputdrawer.format_output(pred_conf[0], pred_loc[0])
AttributeError: module 'shintb.output_drawer' has no attribute 'format_output'

I am just trying to run an example based on the README.
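
The message says that `self.outputdrawer` is the module `shintb.output_drawer` rather than an object created from it. That would happen if the module itself were passed to the Runner, where the test script in another issue on this page passes `output_drawer.OutputDrawer(config, dbcontrol)`. The difference can be reproduced with stand-ins; the fake module and the stub class below are illustrative only and do not come from the repository.

```python
import types

# Stand-in for the shintb.output_drawer module (hypothetical, for illustration only).
output_drawer = types.ModuleType("output_drawer")


class OutputDrawer:
    """Stub with the one method the traceback refers to."""

    def format_output(self, pred_conf, pred_loc):
        return pred_loc, pred_conf


output_drawer.OutputDrawer = OutputDrawer

# Passing the module itself: the method lives on the class, not on the module.
print(hasattr(output_drawer, "format_output"))  # False
# Passing an instance: the method is available.
print(hasattr(output_drawer.OutputDrawer(), "format_output"))  # True
```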

Some questions about code, HELP!

@shinjayne
Thanks a lot for your implementation and contribution!

First of all, I like the way you write programs. I spent a couple of days reading your code. However, there are some questions I could not answer after careful thought.

Here are the questions:

  1. Lines 55-56 in default_box_control.py: I printed the scale, w, and h values. What is the real meaning of default_w and default_h, given that in lines 57-58 c_x and c_y are normalized between 0 and 1? Is it possible that w and h are normalized with respect to the width and height of the original image?
    default_w = scale * np.sqrt(rs[i])
    default_h = scale / np.sqrt(rs[i])

scale: 0.10, box_ratio: 1.00, w: 0.10 h: 0.10
scale: 0.10, box_ratio: 2.00, w: 0.14 h: 0.07
scale: 0.10, box_ratio: 3.00, w: 0.17 h: 0.06
scale: 0.10, box_ratio: 5.00, w: 0.22 h: 0.04
scale: 0.10, box_ratio: 7.00, w: 0.26 h: 0.04
scale: 0.10, box_ratio: 10.00, w: 0.32 h: 0.03
scale: 0.27, box_ratio: 1.00, w: 0.27 h: 0.27
scale: 0.27, box_ratio: 2.00, w: 0.38 h: 0.19
scale: 0.27, box_ratio: 3.00, w: 0.47 h: 0.16
scale: 0.27, box_ratio: 5.00, w: 0.60 h: 0.12
scale: 0.27, box_ratio: 7.00, w: 0.71 h: 0.10
scale: 0.27, box_ratio: 10.00, w: 0.85 h: 0.09
scale: 0.44, box_ratio: 1.00, w: 0.44 h: 0.44
scale: 0.44, box_ratio: 2.00, w: 0.62 h: 0.31
scale: 0.44, box_ratio: 3.00, w: 0.76 h: 0.25
scale: 0.44, box_ratio: 5.00, w: 0.98 h: 0.20
scale: 0.44, box_ratio: 7.00, w: 1.16 h: 0.17
scale: 0.44, box_ratio: 10.00, w: 1.39 h: 0.14
scale: 0.61, box_ratio: 1.00, w: 0.61 h: 0.61
scale: 0.61, box_ratio: 2.00, w: 0.86 h: 0.43
scale: 0.61, box_ratio: 3.00, w: 1.06 h: 0.35
scale: 0.61, box_ratio: 5.00, w: 1.36 h: 0.27
scale: 0.61, box_ratio: 7.00, w: 1.61 h: 0.23
scale: 0.61, box_ratio: 10.00, w: 1.93 h: 0.19
scale: 0.78, box_ratio: 1.00, w: 0.78 h: 0.78
scale: 0.78, box_ratio: 2.00, w: 1.10 h: 0.55
scale: 0.78, box_ratio: 3.00, w: 1.35 h: 0.45
scale: 0.78, box_ratio: 5.00, w: 1.74 h: 0.35
scale: 0.78, box_ratio: 7.00, w: 2.06 h: 0.29
scale: 0.78, box_ratio: 10.00, w: 2.47 h: 0.25
scale: 0.95, box_ratio: 1.00, w: 0.95 h: 0.95
scale: 0.95, box_ratio: 2.00, w: 1.34 h: 0.67
scale: 0.95, box_ratio: 3.00, w: 1.65 h: 0.55
scale: 0.95, box_ratio: 5.00, w: 2.12 h: 0.42
scale: 0.95, box_ratio: 7.00, w: 2.51 h: 0.36
scale: 0.95, box_ratio: 10.00, w: 3.00 h: 0.30

  2. Lines 35-38 in svt_data_loader.py: why is h divided by 300 instead of by the height and width of the original image?
    h = float(rectangle.get('height')) / 300.0
    w = float(rectangle.get('width')) / 300.0
    x = float(rectangle.get('x')) / 300.0
    y = float(rectangle.get('y')) / 300.0
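
The listing in question 1 can be regenerated from the two formulas quoted there. The sketch below assumes six evenly spaced scales from 0.10 to 0.95 and the aspect ratios 1, 2, 3, 5, 7 and 10, all inferred from the printed values rather than read from the repository's config. The function names are hypothetical. Each box has area scale squared and width-to-height ratio equal to the aspect ratio.

```python
import math

ASPECT_RATIOS = [1, 2, 3, 5, 7, 10]  # inferred from the printed box_ratio values


def layer_scales(s_min=0.10, s_max=0.95, layers=6):
    """Evenly spaced scales from s_min to s_max, one per prediction layer."""
    step = (s_max - s_min) / (layers - 1)
    return [s_min + step * k for k in range(layers)]


def default_box_sizes(scale, ratios=ASPECT_RATIOS):
    """(ratio, w, h) per aspect ratio; w * h == scale**2 and w / h == ratio."""
    return [(r, scale * math.sqrt(r), scale / math.sqrt(r)) for r in ratios]


for scale in layer_scales():
    for ratio, w, h in default_box_sizes(scale):
        print("scale: %.2f, box_ratio: %.2f, w: %.2f h: %.2f" % (scale, ratio, w, h))
```

If w and h are read in the same normalized units as c_x and c_y, a width above 1.0 (for example at scale 0.95 with ratio 10) describes a default box wider than the whole image, so such boxes would need clipping before being drawn or matched.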

BTW, how many epochs does it take to converge on the given training data?

Looking forward to your reply. Thanks a lot.
