eragonruan / text-detection-ctpn

text detection mainly based on ctpn model in tensorflow, id card detect, connectionist text proposal network

License: MIT License

Python 91.50% Shell 0.07% Cython 5.17% C++ 0.09% Cuda 3.17%
ctpn id-card ocr robust-reading tensorflow text-detection

text-detection-ctpn's People

Contributors: eragonruan

text-detection-ctpn's Issues

fatal error: numpy/arrayobject.h: No such file or directory

When I try to build the nms implementation and run ./make.sh, this error occurs:
bbox.c:517:31: fatal error: numpy/arrayobject.h: No such file or directory
It seems to be a path problem, but where should I modify the path?
The numpy path in my environment is:
'/usr/local/lib/python2.7/dist-packages/numpy/core/include'
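One common fix (a minimal sketch, not the repo's own setup.py) is to hand numpy's include directory to the extension build explicitly, or equivalently to export CFLAGS=-I<your numpy path>/core/include before running ./make.sh:

# setup.py -- minimal sketch, assuming the Cython sources sit next to it in
# lib/utils as in this repo. The key point is passing numpy's include
# directory, which contains numpy/arrayobject.h, to the extension build.
from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize
import numpy as np

extensions = [
    Extension("bbox", ["bbox.pyx"], include_dirs=[np.get_include()]),
    Extension("cython_nms", ["cython_nms.pyx"], include_dirs=[np.get_include()]),
]

setup(ext_modules=cythonize(extensions))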

kernel not found in checkpoint

Hi, I got this problem:
Loading network VGGnet_test...
Traceback (most recent call last):
File "./ctpn/demo.py", line 90, in
saver.restore(sess, os.path.join(os.getcwd(),"checkpoints/model_final.ckpt"))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1548, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key lstm_o/rnn/basic_lstm_cell/kernel not found in checkpoint
[[Node: save/RestoreV2_28 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_28/tensor_names, save/RestoreV2_28/shape_and_slices)]]

Caused by op u'save/RestoreV2_28', defined at:
File "./ctpn/demo.py", line 89, in
saver = tf.train.Saver()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1139, in init
self.build()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1170, in build
restore_sequentially=self._restore_sequentially)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 691, in build
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 407, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 247, in restore_op
[spec.tensor.dtype])[0])
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 640, in restore_v2
dtypes=dtypes, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1269, in init
self._traceback = _extract_stack()

NotFoundError (see above for traceback): Key lstm_o/rnn/basic_lstm_cell/kernel not found in checkpoint
[[Node: save/RestoreV2_28 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_28/tensor_names, save/RestoreV2_28/shape_and_slices)]]

So I searched the web and found this issue. I changed your demo code like this:

if __name__ == '__main__':
    if os.path.exists("data/results/"):
        shutil.rmtree("data/results/")
    os.makedirs("data/results/")

    cfg.TEST.HAS_RPN = True  # Use RPN for proposals
    # init session
    config = tf.ConfigProto(allow_soft_placement=True)
    sess = tf.Session(config=config)
    # load network
    net = get_network("VGGnet_test")
    # load model
    print('Loading network {:s}... '.format("VGGnet_test")),

I changed this part

OLD_CHECKPOINT_FILE = "checkpoints/model_final.ckpt"
#NEW_CHECKPOINT_FILE = "model_final.ckpt"
vars_to_rename = {
    "lstm/basic_lstm_cell/weights": "lstm/basic_lstm_cell/kernel",
    "lstm/basic_lstm_cell/biases": "lstm/basic_lstm_cell/bias",
}
new_checkpoint_vars = {}
reader = tf.train.NewCheckpointReader(OLD_CHECKPOINT_FILE)
for old_name in reader.get_variable_to_shape_map():
    if old_name in vars_to_rename:
        new_name = vars_to_rename[old_name]
    else:
        new_name = old_name
    new_checkpoint_vars[new_name] = tf.Variable(reader.get_tensor(old_name))

saver = tf.train.Saver(new_checkpoint_vars)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    saver.restore(sess, os.path.join(os.getcwd(), "checkpoints/model_final.ckpt"))
    print 'done'
    im = 128 * np.ones((300, 300, 3), dtype=np.uint8)
    for i in xrange(2):
        _, _ = test_ctpn(sess, net, im)
    im_names = glob.glob(os.path.join(cfg.DATA_DIR, 'demo', '*.png')) + \
               glob.glob(os.path.join(cfg.DATA_DIR, 'demo', '*.jpg'))
    for im_name in im_names:
        print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')
        #print('Demo for {:s}'.format(im_name))
        ctpn(sess, net, im_name)

It runs now, but the ctpn output is empty. I thought the threshold might be a little high, so I lowered it, but the bounding boxes still do not find the right proposals.

By the way, can you speak Chinese? :)
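For reference, the workaround this NotFoundError usually calls for does the rename offline and writes a new checkpoint, which demo.py then restores with its own tf.train.Saver(). A minimal sketch, assuming TF 1.x and that only the basic_lstm_cell weights/biases names changed (the output path below is hypothetical):

import tensorflow as tf

OLD_CHECKPOINT_FILE = "checkpoints/model_final.ckpt"
NEW_CHECKPOINT_FILE = "checkpoints/model_final_renamed.ckpt"  # hypothetical output path

# Old RNN variable suffixes vs. the names newer TF versions expect.
RENAMES = {"weights": "kernel", "biases": "bias"}

reader = tf.train.NewCheckpointReader(OLD_CHECKPOINT_FILE)
new_vars = {}
for old_name in reader.get_variable_to_shape_map():
    new_name = old_name
    for old_suffix, new_suffix in RENAMES.items():
        if old_name.endswith("basic_lstm_cell/" + old_suffix):
            new_name = old_name[:-len(old_suffix)] + new_suffix
    # Hold the old tensor in a variable registered under the new name.
    new_vars[new_name] = tf.Variable(reader.get_tensor(old_name))

saver = tf.train.Saver(new_vars)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Write a fresh checkpoint; point demo.py's saver.restore() at this file.
    saver.save(sess, NEW_CHECKPOINT_FILE)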

RuntimeError: Graph is finalized and cannot be modified.

Using config:
{'ANCHOR_SCALES': [16],
'DATA_DIR': '/home/text-detection-ctpn/data',
'DEDUP_BOXES': 0.0625,
'EPS': 1e-14,
'EXP_DIR': 'ctpn_end2end',
'GPU_ID': 0,
'IS_EXTRAPOLATING': True,
'IS_MULTISCALE': False,
'IS_RPN': True,
'LOG_DIR': 'ctpn',
'MATLAB': 'matlab',
'MODELS_DIR': '/home/peng_yuxiang/tensorflow_tests/code/CPTN_Tensorflow/text-detection-ctpn/models/pascal_voc',
'NCLASSES': 2,
'NET_NAME': 'VGGnet',
'PIXEL_MEANS': array([[[ 102.9801, 115.9465, 122.7717]]]),
'REGION_PROPOSAL': 'RPN',
'RNG_SEED': 3,
'ROOT_DIR': '/home/peng_yuxiang/tensorflow_tests/code/CPTN_Tensorflow/text-detection-ctpn',
'SUBCLS_NAME': 'voxel_exemplars',
'TEST': {'BBOX_REG': True,
'HAS_RPN': True,
'MAX_SIZE': 1000,
'NMS': 0.3,
'PROPOSAL_METHOD': 'selective_search',
'RPN_MIN_SIZE': 8,
'RPN_NMS_THRESH': 0.7,
'RPN_POST_NMS_TOP_N': 1000,
'RPN_PRE_NMS_TOP_N': 12000,
'SCALES': [600],
'SVM': False},
'TRAIN': {'ASPECTS': [1],
'ASPECT_GROUPING': True,
'BATCH_SIZE': 300,
'BBOX_INSIDE_WEIGHTS': [1, 1, 1, 1],
'BBOX_NORMALIZE_MEANS': [0.0, 0.0, 0.0, 0.0],
'BBOX_NORMALIZE_STDS': [0.1, 0.1, 0.2, 0.2],
'BBOX_NORMALIZE_TARGETS': True,
'BBOX_NORMALIZE_TARGETS_PRECOMPUTED': True,
'BBOX_REG': True,
'BBOX_THRESH': 0.5,
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0.0,
'DISPLAY': 10,
'DONTCARE_AREA_INTERSECTION_HI': 0.5,
'FG_FRACTION': 0.3,
'FG_THRESH': 0.5,
'GAMMA': 0.1,
'HAS_RPN': True,
'IMS_PER_BATCH': 1,
'KERNEL_SIZE': 5,
'LEARNING_RATE': 0.001,
'LOG_IMAGE_ITERS': 100,
'MAX_SIZE': 1000,
'MOMENTUM': 0.9,
'OHEM': False,
'PRECLUDE_HARD_SAMPLES': True,
'PROPOSAL_METHOD': 'gt',
'RANDOM_DOWNSAMPLE': False,
'RPN_BATCHSIZE': 256,
'RPN_BBOX_INSIDE_WEIGHTS': [1, 1, 1, 1],
'RPN_CLOBBER_POSITIVES': False,
'RPN_FG_FRACTION': 0.5,
'RPN_MIN_SIZE': 8,
'RPN_NEGATIVE_OVERLAP': 0.3,
'RPN_NMS_THRESH': 0.7,
'RPN_POSITIVE_OVERLAP': 0.7,
'RPN_POSITIVE_WEIGHT': -1.0,
'RPN_POST_NMS_TOP_N': 2000,
'RPN_PRE_NMS_TOP_N': 12000,
'SCALES': [600],
'SCALES_BASE': [0.25, 0.5, 1.0, 2.0, 3.0],
'SNAPSHOT_INFIX': '',
'SNAPSHOT_ITERS': 1000,
'SNAPSHOT_PREFIX': 'VGGnet_fast_rcnn',
'SOLVER': 'Momentum',
'STEPSIZE': 50,
'USE_FLIPPED': True,
'USE_PREFETCH': False,
'WEIGHT_DECAY': 0.0005},
'USE_GPU_NMS': True}
<bound method pascal_voc.default_roidb of <lib.datasets.pascal_voc.pascal_voc object at 0x7f04ee1b6910>>
Loaded dataset voc_2007_trainval for training
Appending horizontally-flipped training examples...
voc_2007_trainval gt roidb loaded from /home/peng_yuxiang/tensorflow_tests/code/CPTN_Tensorflow/text-detection-ctpn/data/cache/voc_2007_trainval_gt_roidb.pkl
done
Preparing training data...
done
Output will be saved to /home/peng_yuxiang/tensorflow_tests/code/CPTN_Tensorflow/text-detection-ctpn/output/ctpn_end2end/voc_2007_trainval
Logs will be saved to /home/peng_yuxiang/tensorflow_tests/code/CPTN_Tensorflow/text-detection-ctpn/logs/ctpn/voc_2007_trainval/2017-10-26-16-03-04
/gpu:0
Tensor("data:0", shape=(?, ?, ?, 3), dtype=float32)
Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_conv/3x3/rpn_conv/3x3:0", shape=(?, ?, ?, 512), dtype=float32)
WARNING:tensorflow:<tensorflow.contrib.rnn.python.ops.core_rnn_cell_impl.BasicLSTMCell object at 0x7f0516b88450>: Using a concatenated state is slower and will soon be deprecated. Use state_is_tuple=True.
Tensor("lstm_o/Reshape:0", shape=(?, ?, ?, 128), dtype=float32)
Tensor("lstm_o/Reshape:0", shape=(?, ?, ?, 128), dtype=float32)
Tensor("rpn_cls_score/Reshape:0", shape=(?, ?, ?, 20), dtype=float32)
Tensor("gt_boxes:0", shape=(?, 5), dtype=float32)
Tensor("gt_ishard:0", shape=(?,), dtype=int32)
Tensor("dontcare_areas:0", shape=(?, 4), dtype=float32)
Tensor("im_info:0", shape=(?, 3), dtype=float32)
Tensor("rpn_cls_score/Reshape:0", shape=(?, ?, ?, 20), dtype=float32)
Tensor("rpn_cls_prob:0", shape=(?, ?, ?, ?), dtype=float32)
Tensor("Reshape_5:0", shape=(?, ?, ?, 20), dtype=float32)
Tensor("rpn_bbox_pred/Reshape:0", shape=(?, ?, ?, 40), dtype=float32)
Tensor("im_info:0", shape=(?, 3), dtype=float32)
Tensor("rpn_rois_data/Reshape:0", shape=(?, 5), dtype=float32)
Tensor("rpn_rois_data/PyFunc:1", dtype=float32)
Tensor("gt_boxes:0", shape=(?, 5), dtype=float32)
Tensor("gt_ishard:0", shape=(?,), dtype=int32)
Tensor("dontcare_areas:0", shape=(?, 4), dtype=float32)
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
Computing bounding-box regression targets...
bbox target means:
[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]]
[ 0. 0. 0. 0.]
bbox target stdevs:
[[ 0.1 0.1 0.2 0.2]
[ 0.1 0.1 0.2 0.2]]
[ 0.1 0.1 0.2 0.2]
Normalizing targets
done
Solving...
/home/peng_yuxiang/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py:91: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Loading pretrained model weights from data/pretrain/VGG_imagenet.npy
assign pretrain model weights to conv5_1
assign pretrain model biases to conv5_1
ignore fc6
ignore fc6
assign pretrain model weights to conv5_3
assign pretrain model biases to conv5_3
ignore fc7
ignore fc7
ignore fc8
ignore fc8
assign pretrain model weights to conv5_2
assign pretrain model biases to conv5_2
assign pretrain model weights to conv4_1
assign pretrain model biases to conv4_1
assign pretrain model weights to conv4_2
assign pretrain model biases to conv4_2
assign pretrain model weights to conv4_3
assign pretrain model biases to conv4_3
assign pretrain model weights to conv3_3
assign pretrain model biases to conv3_3
assign pretrain model weights to conv3_2
assign pretrain model biases to conv3_2
assign pretrain model weights to conv3_1
assign pretrain model biases to conv3_1
assign pretrain model weights to conv1_1
assign pretrain model biases to conv1_1
assign pretrain model weights to conv1_2
assign pretrain model biases to conv1_2
assign pretrain model weights to conv2_2
assign pretrain model biases to conv2_2
assign pretrain model weights to conv2_1
assign pretrain model biases to conv2_1

iter: 0 / 180000, total loss: 1.8194, rpn_loss_cls: 0.6936, rpn_loss_box: 0.0755, rpn_loss: 1.0502, lr: 0.001000
speed: 18.390s / iter
image: img_5836.jpg iter: 10 / 180000, total loss: 1.6851, rpn_loss_cls: 0.6919, rpn_loss_box: 0.0470, rpn_loss: 0.9462, lr: 0.001000
speed: 20.903s / iter
image: img_2801.jpg iter: 20 / 180000, total loss: 2.7015, rpn_loss_cls: 0.6911, rpn_loss_box: 0.0254, rpn_loss: 1.9850, lr: 0.001000
speed: 18.728s / iter
image: img_1023.jpg iter: 30 / 180000, total loss: 1.9418, rpn_loss_cls: 0.6903, rpn_loss_box: 0.0254, rpn_loss: 1.2261, lr: 0.001000
speed: 14.129s / iter
image: img_3455.jpg iter: 40 / 180000, total loss: 1.7795, rpn_loss_cls: 0.6869, rpn_loss_box: 0.0630, rpn_loss: 1.0296, lr: 0.001000
speed: 16.138s / iter
image: img_1155.jpg

Traceback (most recent call last):
File "ctpn/train_net.py", line 39, in
restore=bool(int(0)))
File "/home/text-detection-ctpn/lib/fast_rcnn/train.py", line 338, in train_net
sw.train_model(sess, max_iters, restore=restore)
File "/home/text-detection-ctpn/lib/fast_rcnn/train.py", line 167, in train_model
sess.run(tf.assign(lr, lr.eval() * cfg.TRAIN.GAMMA))
File "/home/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
use_locking=use_locking, name=name)
File "/home/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 491, in apply_op
preferred_dtype=default_dtype)
File "/home/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 716, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/home/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 176, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/home/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 169, in constant
attrs={"value": tensor_value, "dtype": dtype_value}, name=name).outputs[0]
File "/home/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2354, in create_op
self._check_not_finalized()
File "/home/tensorflow1.0/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2077, in _check_not_finalized
raise RuntimeError("Graph is finalized and cannot be modified.")
RuntimeError: Graph is finalized and cannot be modified.

(1) Since I am testing on CPU, I modified text.yml -> STEPSIZE=50 so that only a few training steps run as a test.
(2) The model loads properly and the training loss changes, so the training step should be right.
But at the end: RuntimeError: Graph is finalized and cannot be modified.
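The traceback shows why: tf.assign(lr, ...) is called inside the training loop, which tries to add a new op to a graph that has already been finalized. A minimal sketch of the usual fix (illustrative names, assuming TF 1.x, not the repo's exact code) builds the assign op once, before the loop, and only feeds it afterwards:

import tensorflow as tf

# Build everything that touches the graph up front.
lr = tf.Variable(0.001, trainable=False, name="learning_rate")
new_lr = tf.placeholder(tf.float32, shape=[], name="new_learning_rate")
update_lr = tf.assign(lr, new_lr)  # created once, outside the training loop

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.graph.finalize()  # after this point no new ops may be added

    for step in range(1, 101):
        # ... run the training op here ...
        if step % 50 == 0:  # e.g. cfg.TRAIN.STEPSIZE from the config above
            # Feeds the pre-built op, so the finalized graph is left untouched.
            sess.run(update_lr, feed_dict={new_lr: sess.run(lr) * 0.1})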

AttributeError: 'module' object has no attribute 'ndarray'

When I try to train on my own data, I get this error:

Traceback (most recent call last):
File "ctpn/train_net.py", line 41, in
restore=bool(int(1)))
File "/data/resys/lvyi/exp/OCR_detect/text-detection-ctpn/lib/fast_rcnn/train.py", line 234, in train_net
sw = SolverWrapper(sess, network, imdb, roidb, output_dir, logdir= log_dir, pretrained_model=pretrained_model)
File "/data/resys/lvyi/exp/OCR_detect/text-detection-ctpn/lib/fast_rcnn/train.py", line 27, in init
self.bbox_means, self.bbox_stds = rdl_roidb.add_bbox_regression_targets(roidb)
File "/data/resys/lvyi/exp/OCR_detect/text-detection-ctpn/lib/roi_data_layer/roidb.py", line 58, in add_bbox_regression_targets
_compute_targets(rois, max_overlaps, max_classes)
File "/data/resys/lvyi/exp/OCR_detect/text-detection-ctpn/lib/roi_data_layer/roidb.py", line 136, in _compute_targets
targets[ex_inds, 1:] = bbox_transform(ex_rois, gt_rois)
File "/data/resys/lvyi/exp/OCR_detect/text-detection-ctpn/lib/fast_rcnn/bbox_transform.py", line 21, in bbox_transform
assert np.min(ex_widths) > 0.1 and np.min(ex_heights) > 0.1,
File "/data/resys/var/python2.7.3/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 2362, in amin
if type(a) is not mu.ndarray:
AttributeError: 'module' object has no attribute 'ndarray'
Segmentation fault (core dumped)

Running with CPU only

Hello everyone, it is the first time I could successfully run a demo. Many thanks to the author.

To use the CPU only, I followed the author's instructions and made the following modifications:
(1) Set "USE_GPU_NMS" in the file ./ctpn/text.yml to "False";
(2) Set "__C.USE_GPU_NMS" in the file ./lib/fast_rcnn/config.py to "False";
(3) Comment out the line "from lib.utils.gpu_nms import gpu_nms" in the file ./lib/fast_rcnn/nms_wrapper.py;
(4) Rebuild with setup.py:

The author provides the new code of setup.py for cpu only:

from Cython.Build import cythonize
import numpy as np
from distutils.core import setup

try:
    numpy_include = np.get_include()
except AttributeError:
    numpy_include = np.get_numpy_include()

setup(
    ext_modules=cythonize(["bbox.pyx", "cython_nms.pyx"], include_dirs=[numpy_include]),
)

(a) execute export CFLAGS=-I/home/zhao181/ProGram1/anaconda2/lib/python2.7/site-packages/numpy/core/include
you should use your own numpy path.

(b) cd xxx/text-detection-ctpn-master/lib/utils
and execute:python setup.py build

(c) copy the .so file from the "build" directory to the
xxx/text-detection-ctpn-master/lib/utils.

(5) cd xxx/text-detection-ctpn-master
and execute: python ./ctpn/demo.py

By the way, I am running under ubuntu 16.04 with
Anaconda2-4.2.0-Linux-x86_64.sh and tensorflow-1.3.0-cp27-cp27mu-manylinux1_x86_64.whl(cpu).

UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.

When I try to train on my own dataset, I get this core dump:

Computing bounding-box regression targets...
bbox target means:
[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]]
[ 0. 0. 0. 0.]
bbox target stdevs:
[[ 0.1 0.1 0.2 0.2]
[ 0.1 0.1 0.2 0.2]]
[ 0.1 0.1 0.2 0.2]
Normalizing targets
done
Solving...
/data/resys/var/python2.7.3/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py:91: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Segmentation fault (core dumped)

This fault is caused by the function train_model() in lib/fast_rcnn/train.py, when it runs train_op = opt.apply_gradients(list(zip(grads, tvars)), global_step=global_step).

Have you ever faced this error?

There are confusing parts.

Hello, eragonruan!
I have some questions.

I tried running the "./ctpn/train_net.py" file on Google Cloud Compute.
I modified max_iters = 60000 to max_iters = 500 and ran it (as a test).
The output folder is created and a new checkpoint is created.
I moved the new checkpoint folder to "text-detection-ctpn/checkpoint". Then I ran "demo.py" and it did not raise any errors.
But there is nothing in the results folder.

(1) Is it because I made the training too short? (max_iters = 500)
(2) A new training run creates a new checkpoint, but what is the "VGG_imagenet.npy" file?
You have provided two pre-trained files:
one, checkpoints, e.g. VGGnet_fast_rcnn_iter_50000.ckpt.data-00000-of-00001
two, data/pretrain, e.g. VGG_imagenet.npy
checkpoints/training information vs data/pretrain/VGG_imagenet.npy
What is the difference between the two?
And why do I need VGG_imagenet.npy, and how do I make my own VGG_imagenet.npy?

I would be grateful if you could answer the question.
Thank you sooo much.

Bug in CPU-only mode

When I run the demo on Windows, it fails with: Process finished with exit code -1073741571. How can I solve this?

about the detection speed

It takes about 2 s per picture on average; normally, how long should it take? If I want to switch it to the GPU, what do I need to modify?
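Roughly 2 s per image usually means inference is running on the CPU. A minimal sketch (assuming TF 1.x with a CUDA build of TensorFlow installed) to check whether a GPU is visible and where ops are placed; the CUDA nms kernel itself is switched with USE_GPU_NMS in ctpn/text.yml:

import tensorflow as tf
from tensorflow.python.client import device_lib

# If no /gpu:0 device shows up here, TensorFlow is falling back to the CPU.
print(device_lib.list_local_devices())

# log_device_placement prints which device each op actually lands on.
config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
sess = tf.Session(config=config)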

model can't converge

Using the same dataset, after updating to the newest code it no longer converges. Why?

Now it works! But I have a question about VGG16Net.

After failing to train for the first time, I did not sleep for two days and did my research.
I found the optimal values by modifying the parameters in the text.yml file (for my computer).
I was able to solve it with your help. Thank you so much.

I am currently researching how to detect the text in business cards.
A business card image contains text, a face, and a company logo.
It is very different from natural images; the elements are very limited.

Anyway, (Q.1) what data did you use to pre-train VGG16Net?
(Q.2) Would it be okay if both VGG16Net and fast_rcnn (CTPN) were trained on business card images?
What are your thoughts?

example)
100 million text, human face, QR code, and company logo images -> training -> VGGnet.
100 million business card images -> training -> CTPN.

(Q.3) If I train VGG16Net, which source code should I use?

Thank you for your continued help.
You will receive many blessings.

output of classification is always the same !!!

@eragonruan I am sorry to bother you again! I have about 3500 images, and I set the batch size to 256 and lr to 0.0001. After 5000 iterations, the output of the classification is always the same. I do not think it is caused by overfitting, because when I test on images from the training dataset it detects nothing, yet the classification output was fine early in training. Do you know what happened?

After CTPN. What is your idea?

Hello. eragonruan!
I was very impressed with your code.
One more time. Thank you so much.

I have a problem.
When an image passes through CTPN, a green box is drawn on it.
If I then put this image into the OCR engine, how should I separate out the (green box) regions?
My idea is to use OpenCV.
But the green box is drawn on top of the text, so it hides the text.
Can I draw a thinner line to solve the problem?
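One way around this (a minimal sketch using OpenCV; it assumes each box is an axis-aligned (xmin, ymin, xmax, ymax) tuple in pixel coordinates, which may not be the exact format the detector returns) is not to draw the box at all, but to crop each detected region from the clean image and hand the crops to the OCR engine:

import os
import cv2

def crop_text_regions(image_path, boxes, out_dir="data/results/crops"):
    """Crop each detected region from the original image instead of drawing on it."""
    if not os.path.exists(out_dir):
        os.makedirs(out_dir)
    img = cv2.imread(image_path)  # the clean image, without boxes drawn on it
    for i, (xmin, ymin, xmax, ymax) in enumerate(boxes):
        crop = img[int(ymin):int(ymax), int(xmin):int(xmax)]
        cv2.imwrite(os.path.join(out_dir, "crop_%03d.png" % i), crop)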

BiLSTM and Training Time

Thanks for sharing your implementation with us. I implemented CTPN with Caffe, but it failed to converge once I added the LSTM.
First, I want to ask whether you have added a BiLSTM in your code or not. I am new to TensorFlow; after looking at the code, I think you implemented only a unidirectional LSTM, not a BiLSTM. Is that right?
Second, I want to ask how long you trained your model. I have run your training script on a GPU device, and it looks like it would take 5-6 days to finish the first 180000 iterations.

Thanks very much.
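For context, a bidirectional LSTM over each feature-map row would look roughly like the sketch below in TF 1.x (purely illustrative; whether the repo's lstm_o layer is uni- or bidirectional is exactly the question being asked here):

import tensorflow as tf

def bilstm_layer(features, hidden_units=128):
    """features: [batch * height, width, channels] -- one sequence per feature-map row."""
    fw_cell = tf.contrib.rnn.BasicLSTMCell(hidden_units, state_is_tuple=True)
    bw_cell = tf.contrib.rnn.BasicLSTMCell(hidden_units, state_is_tuple=True)
    # One LSTM runs left-to-right, the other right-to-left, over each row.
    (out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
        fw_cell, bw_cell, features, dtype=tf.float32)
    # Concatenate both directions: [batch * height, width, 2 * hidden_units].
    return tf.concat([out_fw, out_bw], axis=-1)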

Something goes wrong when I train the model

Hi, I am training the model with the VOC2007 dataset. I modified NCLASSES from 2 to 21 in ctpn/text.yml,
but it goes wrong after about 100 batches.
Here is the output:

assign pretrain model biases to conv1_1
assign pretrain model weights to conv1_2
assign pretrain model biases to conv1_2
assign pretrain model weights to conv2_2
assign pretrain model biases to conv2_2
assign pretrain model weights to conv2_1
assign pretrain model biases to conv2_1
iter: 0 / 180000, total loss: 3.0214, rpn_loss_cls: 0.6958, rpn_loss_box: 0.7112, rpn_loss: 1.6144, lr: 0.001000
speed: 4.622s / iter
image: 003311.jpg
iter: 10 / 180000, total loss: 3.3073, rpn_loss_cls: 0.6902, rpn_loss_box: 0.8598, rpn_loss: 1.7573, lr: 0.001000
speed: 3.948s / iter
image: 008127.jpg
iter: 20 / 180000, total loss: 2.6549, rpn_loss_cls: 0.6904, rpn_loss_box: 0.5334, rpn_loss: 1.4311, lr: 0.001000
speed: 1.100s / iter
image: 009458.jpg
iter: 30 / 180000, total loss: 3.3055, rpn_loss_cls: 0.6721, rpn_loss_box: 0.8771, rpn_loss: 1.7563, lr: 0.001000
speed: 1.115s / iter
image: 003082.jpg
iter: 40 / 180000, total loss: 3.7248, rpn_loss_cls: 0.6609, rpn_loss_box: 1.0980, rpn_loss: 1.9660, lr: 0.001000
speed: 3.983s / iter
image: 007029.jpg
iter: 50 / 180000, total loss: 2.5844, rpn_loss_cls: 0.6593, rpn_loss_box: 0.5294, rpn_loss: 1.3956, lr: 0.001000
speed: 1.080s / iter
2017-09-27 10:41:52.117726: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: exceptions.ValueError: zero-size array to reduction operation minimum which has no identity
2017-09-27 10:41:52.118615: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: exceptions.ValueError: zero-size array to reduction operation minimum which has no identity
	 [[Node: roi-data/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], token="pyfunc_2", _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_rois_data/Reshape/_123, rpn_rois_data/PyFunc:1, _arg_gt_boxes_0_3, _arg_gt_ishard_0_4, _arg_dontcare_areas_0_2, roi-data/PyFunc/input_5)]]
2017-09-27 10:41:52.118645: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: exceptions.ValueError: zero-size array to reduction operation minimum which has no identity
	 [[Node: roi-data/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], token="pyfunc_2", _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_rois_data/Reshape/_123, rpn_rois_data/PyFunc:1, _arg_gt_boxes_0_3, _arg_gt_ishard_0_4, _arg_dontcare_areas_0_2, roi-data/PyFunc/input_5)]]
2017-09-27 10:41:52.118648: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: exceptions.ValueError: zero-size array to reduction operation minimum which has no identity
	 [[Node: roi-data/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], token="pyfunc_2", _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_rois_data/Reshape/_123, rpn_rois_data/PyFunc:1, _arg_gt_boxes_0_3, _arg_gt_ishard_0_4, _arg_dontcare_areas_0_2, roi-data/PyFunc/input_5)]]
2017-09-27 10:41:52.118656: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: exceptions.ValueError: zero-size array to reduction operation minimum which has no identity
	 [[Node: roi-data/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], token="pyfunc_2", _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_rois_data/Reshape/_123, rpn_rois_data/PyFunc:1, _arg_gt_boxes_0_3, _arg_gt_ishard_0_4, _arg_dontcare_areas_0_2, roi-data/PyFunc/input_5)]]

and the exception is below:

File "/home/text-detection-ctpn/ctpn/train_net.py", line 39, in <module>
  restore=bool(int(0)))
File "/home/text-detection-ctpn/lib/fast_rcnn/train.py", line 341, in train_net
  sw.train_model(sess, max_iters, restore=restore)
File "/home/text-detection-ctpn/lib/fast_rcnn/train.py", line 215, in train_model
  summary_str, _= sess.run(fetches=fetch_list, feed_dict=feed_dict)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
  run_metadata_ptr)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
  feed_dict_tensor, options, run_metadata)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
  options, run_metadata)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
  raise type(e)(node_def, op, message)

tensorflow.python.framework.errors_impl.InvalidArgumentError: exceptions.ValueError: zero-size array to reduction operation minimum which has no identity
[[Node: roi-data/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], token="pyfunc_2", _device="/job:localhost/replica:0/task:0/cpu:0"](rpn_rois_data/Reshape/_123, rpn_rois_data/PyFunc:1, _arg_gt_boxes_0_3, _arg_gt_ishard_0_4, _arg_dontcare_areas_0_2, roi-data/PyFunc/input_5)]]
[[Node: roi-data/PyFunc/_137 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_630_roi-data/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
Caused by op u'roi-data/PyFunc', defined at:
File "/usr/lib/wingide6/bin/wingdb.py", line 978, in <module>
main()
File "/usr/lib/wingide6/bin/wingdb.py", line 918, in main
netserver.abstract.kFileSystemEncoding, orig_sys_path)
File "/usr/lib/wingide6/bin/wingdb.py", line 766, in DebugFile

I am exhausted from trying to fix this bug. Any suggestions?

can not detect anything

I used your latest code and your data, but it does not converge at all. Please tell me why.

Bus error: 10

When I run demo.py, this error appears:
Loading network VGGnet_test... Restoring from checkpoints/VGGnet_fast_rcnn_iter_50000.ckpt... done
done.
Bus error: 10

I compiled the code on a Mac Pro.

What is the format of gt file?

I read the code in split_label.py and found that the label information for each image is in a gt file under gt_path, but I do not know the format you are using in this file. Could you kindly share the format? Thanks.

What files should be in the gt_path folder?

Hello. eragonruan !
I was very surprised to see your code.
Thank you for sharing a good source.
I want to train on my own image data.
You told me to edit the file split_label.py ("path" and "gt_path").
I think the images go in the "path" folder,
but I don't know about "gt_path".

What files should be in the "gt_path" folder, and
can you give me a sample? (one image and gt_path.txt)

Thank you!

Problem with CPU model

Hi author, thanks. I have only a CPU and want to run demo.py.
According to the README.md, it seems that, in CPU mode, we need to
do the following two things before running:
(a) Set USE_GPU_NMS=False in ctpn/text.yml;
(b) Execute chmod +x make.sh and ./make.sh in the directory lib/utils.
Is this right?

However, when I run the command ./make.sh, it complains as follows:

Traceback (most recent call last):
File "setup.py", line 39, in
CUDA = locate_cuda()
File "setup.py", line 26, in locate_cuda
raise EnvironmentError('The nvcc binary could not be '
OSError: The nvcc binary could not be located in your $PATH. Either add it to your path, or set $CUDAHOME
mv: cannot stat 'utils/*': No such file or directory

Then what should I do to run the demo with only a CPU?

I have been stuck on this for a whole day. I am looking forward to your reply.

There are no results saved.

I ran demo.py, but no results are saved.

The parameter 'inds' in the function 'save_results' is always equal to 0.

Any idea what the problem is?

how to deal with skew box

Hello, I saw your blog and that you optimized CTPN so that it can detect skewed words, but I want to know: if you add an angle to the loss function, how do you handle the ground truth? Do you divide the ground truth into small boxes along the skew direction? @eragonruan

How to set up the training data

Excuse me, sorry, I'm a green hand. When I use this program, I don't know how to set up the training data. When I follow the "Prepare" step, I cannot find the training data on "Baidu Yun". Can you tell me where the training data is (which link I can get it from), or how I can set up the training data myself?

why is lr so low?

Hello, I have a question. I tried to train the model: if I set lr to 0.00001 it converges slowly, but if I set it higher it never converges and the loss even becomes nan. In fact, I feel 0.00001 is so low that the model can learn nothing. Why does this happen?

loss becomes nan, invalid value encountered in log

text-detection-ctpn-master1/lib/fast_rcnn/bbox_transform.py:29: RuntimeWarning: invalid value encountered in log
targets_dh = np.log(gt_heights / ex_heights)

In my training process this warning comes up and the loss becomes nan. Why do I get this error?
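The warning means np.log(gt_heights / ex_heights) received a non-positive ratio, i.e. a zero- or negative-height box somewhere in the anchors or labels. A minimal sketch of the usual guard (an assumption about the fix, not the repo's code) clamps the heights before the log; it is also worth checking the ground-truth files for degenerate boxes:

import numpy as np

def safe_dh_targets(gt_heights, ex_heights, eps=1e-6):
    """Height regression targets with non-positive heights clamped away (eps is arbitrary)."""
    gt_heights = np.maximum(np.asarray(gt_heights, dtype=np.float64), eps)
    ex_heights = np.maximum(np.asarray(ex_heights, dtype=np.float64), eps)
    return np.log(gt_heights / ex_heights)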

Problem

2017-11-05 04:40:59.256258: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /media/D/code/OCR/text-detection-ctpn/output/ctpn_end2end/voc_2007_trainval/VGGnet_fast_rcnn_iter_50000.ckpt: Not found: /media/D/code/OCR/text-detection-ctpn/output/ctpn_end2end/voc_2007_trainval; No such file or directory

how can I fix it?

ModuleNotFoundError: No module named 'lib'

When I try to execute demo.py, I receive this error:

File "demo.py", line 9, in
from lib.networks.factory import get_network
ModuleNotFoundError: No module named 'lib'

Can someone please help me with this?
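This usually just means Python cannot see the repository root (the directory containing the lib package). Either run the script from the repo root as python ./ctpn/demo.py, or add the root to the import path before the lib imports, as in this minimal sketch (assuming demo.py sits in <repo>/ctpn/):

import os
import sys

# Put the repository root (the directory containing lib/) on the import path.
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from lib.networks.factory import get_network  # now resolvable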

Key lstm_o/rnn/basic_lstm_cell/kernel not found in checkpoint

Dear eragonruan,
I think your implementation of CTPN is an excellent job. When I run your demo, all layers load successfully except the LSTM layer. It shows the following output:
`Loading network VGGnet_test...
Traceback (most recent call last):
File "ctpn/demo.py", line 89, in
saver.restore(sess, os.path.join(os.getcwd(),"checkpoints/model_final.ckpt"))
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1560, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
options, run_metadata)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key lstm_o/rnn/basic_lstm_cell/kernel not found in checkpoint
[[Node: save/RestoreV2_28 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_28/tensor_names, save/RestoreV2_28/shape_and_slices)]]

Caused by op u'save/RestoreV2_28', defined at:
File "ctpn/demo.py", line 88, in
saver = tf.train.Saver()
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1140, in init
self.build()
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1172, in build
filename=self._filename)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 688, in build
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 407, in _AddRestoreOps
tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 247, in restore_op
[spec.tensor.dtype])[0])
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 663, in restore_v2
dtypes=dtypes, name=name)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

NotFoundError (see above for traceback): Key lstm_o/rnn/basic_lstm_cell/kernel not found in checkpoint
[[Node: save/RestoreV2_28 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_28/tensor_names, save/RestoreV2_28/shape_and_slices)]]`

Could you please tell me where I went wrong? The version of TensorFlow that I am using is 1.3.

How to freeze the model graph from ckpt?

I want to freeze the parameters of the model you pretrained into one pb file.
I added some code to demo.py, but I failed.
This requires knowing the output node of the model. I printed out all nodes in the pretrained model, and I think these two nodes may be the output nodes: rois/Reshape/shape, rois/Reshape. Is that right?
Here is my code:

if __name__ == '__main__':
    if os.path.exists("data/results/"):
        shutil.rmtree("data/results/")
    os.makedirs("data/results/")

    cfg_from_file('ctpn/text.yml')

    # init session
    config = tf.ConfigProto(allow_soft_placement=True)
    sess = tf.Session(config=config)
    # load network
    net = get_network("VGGnet_test")
    # load model
    print(('Loading network {:s}... '.format("VGGnet_test")), end=' ')
    saver = tf.train.Saver()

    try:
        ckpt = tf.train.get_checkpoint_state(cfg.TEST.checkpoints_path)
        #ckpt=tf.train.get_checkpoint_state("output/ctpn_end2end/voc_2007_trainval/")
        print('Restoring from {}...'.format(ckpt.model_checkpoint_path), end=' ')
        saver.restore(sess, ckpt.model_checkpoint_path)
        print('done')
    except:
        raise RuntimeError('Check your pretrained {:s}'.format(ckpt.model_checkpoint_path))
    print(' done.')

    print('all nodes are:\n')
    graph = tf.get_default_graph()
    input_graph_def = graph.as_graph_def()
    node_names = [node.name for node in input_graph_def.node]
    for x in node_names:
        print(x)
    output_node_names = 'rois/Reshape/shape,rois/Reshape'
    # needs: from tensorflow.python.framework import graph_util
    output_graph_def = graph_util.convert_variables_to_constants(sess, input_graph_def, output_node_names.split(','))
    output_graph = 'ctpn.pb'
    with tf.gfile.GFile(output_graph, 'wb') as f:
        f.write(output_graph_def.SerializeToString())
    sess.close()
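If the freeze goes through, a quick way to sanity-check the chosen output nodes is to load the pb back and look the tensors up by name. A minimal sketch (TF 1.x; the tensor name below is a guess based on the printed node list and must match your graph):

import tensorflow as tf

graph_def = tf.GraphDef()
with tf.gfile.GFile("ctpn.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name="")

with tf.Session(graph=graph) as sess:
    rois = graph.get_tensor_by_name("rois/Reshape:0")  # guessed name, verify it
    # sess.run(rois, feed_dict={...})  # feed the input placeholders as demo.py does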

TypeError: 'NoneType' object is not subscriptable

When I run demo.py, I get:

InvalidArgumentError (see above for traceback): TypeError: 'NoneType' object is not subscriptable

[[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5/_75, rpn_bbox_pred/Reshape/_77, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]

[[Node: rois/PyFunc/_79 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_290_rois/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]


Bug in cpu only mode, maximum recursion depth exceeded in comparison

Caused by op 'rois/PyFunc', defined at:
File "D:/Project_PictureDetectiveSystem/src/PD-Service/deeplearn/ocr/textdetectionctpn/ctpn/demo.py", line 86, in
net = get_network("VGGnet_test")
File "D:\Project_PictureDetectiveSystem\src\PD-Service\deeplearn\ocr\textdetectionctpn\lib\networks\factory.py", line 8, in get_network
return VGGnet_test()
File "D:\Project_PictureDetectiveSystem\src\PD-Service\deeplearn\ocr\textdetectionctpn\lib\networks\VGGnet_test.py", line 14, in init
self.setup()
File "D:\Project_PictureDetectiveSystem\src\PD-Service\deeplearn\ocr\textdetectionctpn\lib\networks\VGGnet_test.py", line 54, in setup
.proposal_layer(_feat_stride, anchor_scales, 'TEST', name='rois'))
File "D:\Project_PictureDetectiveSystem\src\PD-Service\deeplearn\ocr\textdetectionctpn\lib\networks\network.py", line 23, in layer_decorated
layer_output = op(self, layer_input, *args, **kwargs)
File "D:\Project_PictureDetectiveSystem\src\PD-Service\deeplearn\ocr\textdetectionctpn\lib\networks\network.py", line 178, in proposal_layer
[tf.float32,tf.float32])
File "D:\Program Files\Python3\lib\site-packages\tensorflow\python\ops\script_ops.py", line 203, in py_func
input=inp, token=token, Tout=Tout, name=name)
File "D:\Program Files\Python3\lib\site-packages\tensorflow\python\ops\gen_script_ops.py", line 36, in _py_func
name=name)
File "D:\Program Files\Python3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op
op_def=op_def)
File "D:\Program Files\Python3\lib\site-packages\tensorflow\python\framework\ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "D:\Program Files\Python3\lib\site-packages\tensorflow\python\framework\ops.py", line 1204, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

UnknownError (see above for traceback): RecursionError: maximum recursion depth exceeded in comparison
[[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5, rpn_bbox_pred/Reshape, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]

It throws an error here. In the same environment, the previous version did not have this problem. Could you tell me what is going on?
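If the recursion in the proposal code is merely deep rather than unbounded, raising Python's recursion limit before building the network can work around it; this is a blunt mitigation and only a guess at the cause here:

import sys

# Blunt workaround: raise the recursion limit before importing/building the network.
# 10000 is an arbitrary value; if the error persists, the recursion is likely
# unbounded and needs a real fix in the proposal/text-connector code.
sys.setrecursionlimit(10000)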

tensorboard

Dear author, could you add some summaries to the code so that the details of the model structure and dataflow can be visualized in TensorBoard? We think it would help a lot. Thanks.
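In the meantime, a minimal sketch of what such summaries could look like in TF 1.x (illustrative names, not the repo's existing code):

import tensorflow as tf

def add_loss_summaries(total_loss, rpn_loss_cls, rpn_loss_box, logdir="logs/ctpn"):
    """Attach scalar summaries and a FileWriter so the losses show up in TensorBoard."""
    tf.summary.scalar("total_loss", total_loss)
    tf.summary.scalar("rpn_loss_cls", rpn_loss_cls)
    tf.summary.scalar("rpn_loss_box", rpn_loss_box)
    merged = tf.summary.merge_all()
    writer = tf.summary.FileWriter(logdir, tf.get_default_graph())
    return merged, writer

# In the training loop (sketch):
#   summary_str = sess.run(merged, feed_dict=feed_dict)
#   writer.add_summary(summary_str, global_step=step)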

Hello, I find CTPN has some trouble with small words!

Hello, I find CTPN has some trouble with small words, just like the image below. I have about 1500 images, maybe 100 of which contain small words. I finetuned the model you provide with batch size 512, and after 2000 iterations the total loss is 0.5.
00012_00167

I tried resizing the image to 1000x2000 as in split_label.py and feeding it into the network, but it cannot detect the text correctly; in the image, each word is maybe 20x20 pixels.

Did you train your model on a Chinese database?

@eragonruan Hi, thank you for sharing. I have a question: did you train your model on a Chinese database? I tested your model on some pictures with Chinese text and it did a good job. Could you tell me which database you chose and how to get it? Thank you very much.

I try to change code from python2 to python3

I tried to change the code from Python 2 to Python 3, but after fixing all the mistakes and running the code, it raised the error below. I do not know where it comes from or how to work around it. What can I do? Thank you so much.

2017-10-26 14:07:03.163176: W tensorflow/core/framework/op_kernel.cc:1158] Unknown: KeyError: b'TEST'
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn
status, run_metadata)
File "/usr/local/lib/python3.6/contextlib.py", line 88, in exit
next(self.gen)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.UnknownError: KeyError: b'TEST'
[[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5/_75, rpn_bbox_pred/Reshape/_77, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]
[[Node: rois/PyFunc/_79 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_290_rois/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "ctpn/demo.py", line 95, in
_, _ = test_ctpn(sess, net, im)
File "/root/chengjuntao/text-detection-ctpn/lib/fast_rcnn/test.py", line 171, in test_ctpn
rois = sess.run([net.get_output('rois')[0]],feed_dict=feed_dict)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: KeyError: b'TEST'
[[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5/_75, rpn_bbox_pred/Reshape/_77, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]
[[Node: rois/PyFunc/_79 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_290_rois/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]

Caused by op 'rois/PyFunc', defined at:
File "ctpn/demo.py", line 85, in
net = get_network("VGGnet_test")
File "/root/chengjuntao/text-detection-ctpn/lib/networks/factory.py", line 20, in get_network
return VGGnet_test()
File "/root/chengjuntao/text-detection-ctpn/lib/networks/VGGnet_test.py", line 14, in init
self.setup()
File "/root/chengjuntao/text-detection-ctpn/lib/networks/VGGnet_test.py", line 68, in setup
.proposal_layer(_feat_stride, anchor_scales, 'TEST', name='rois'))
File "/root/chengjuntao/text-detection-ctpn/lib/networks/network.py", line 28, in layer_decorated
layer_output = op(self, layer_input, *args, **kwargs)
File "/root/chengjuntao/text-detection-ctpn/lib/networks/network.py", line 241, in proposal_layer
[tf.float32,tf.float32])
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/script_ops.py", line 198, in py_func
input=inp, token=token, Tout=Tout, name=name)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/gen_script_ops.py", line 38, in _py_func
name=name)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in init
self._traceback = _extract_stack()

UnknownError (see above for traceback): KeyError: b'TEST'
[[Node: rois/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_STRING, DT_INT32, DT_INT32], Tout=[DT_FLOAT, DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_5/_75, rpn_bbox_pred/Reshape/_77, _arg_Placeholder_1_0_1, rois/PyFunc/input_3, rois/PyFunc/input_4, rois/PyFunc/input_5)]]
[[Node: rois/PyFunc/_79 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_290_rois/PyFunc", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]]
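For what it is worth, KeyError: b'TEST' is a classic Python 2 to 3 symptom: the cfg_key string handed through tf.py_func arrives in proposal_layer as bytes, so cfg[cfg_key] no longer matches the 'TEST' key. A minimal sketch of a guard (an assumption about where to patch, based on the traceback above):

def as_text(cfg_key):
    """tf.py_func hands string tensors to Python 3 code as bytes; normalize to str."""
    if isinstance(cfg_key, bytes):
        return cfg_key.decode("utf-8")
    return cfg_key

# inside the proposal layer, index the config with cfg[as_text(cfg_key)]
# instead of cfg[cfg_key]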
