scenereco's People

Contributors

bear63

scenereco's Issues

When training, the loss becomes NaN after a while

I am continuing training on another dataset, starting from your pretrained model.
I set a very small learning rate and batch size, but the loss often becomes inf during training and later turns to NaN.
Can you share your training strategy? @bear63 Thank you.
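
One frequent cause of an infinite CTC loss, offered here as a possibility rather than a diagnosis of this report, is a sample whose label cannot fit into the number of time steps the network outputs: CTC needs one frame per label character plus one blank between each pair of identical neighbouring characters. The sketch below filters such samples before training. The function names and the `(image_id, label)` sample layout are illustrative assumptions, not part of sceneReco.

```python
def ctc_min_frames(label):
    """Smallest number of time steps CTC needs to emit `label`."""
    # Each pair of identical neighbours needs a blank frame between them.
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


def is_ctc_feasible(num_frames, label):
    """True when `label` can be aligned to `num_frames` time steps."""
    return num_frames >= ctc_min_frames(label)


def drop_infeasible(samples, num_frames):
    """Keep only (image_id, label) pairs whose label fits in num_frames."""
    return [(image_id, label) for image_id, label in samples
            if is_ctc_feasible(num_frames, label)]
```

Samples rejected by such a check would otherwise contribute an infinite loss, which can then turn the weights into NaN on the next update.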

ctpn result

How can I get the coordinates of 4 points from ctpn?
Each bounding box of detected text currently has only 2 points.
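
If the two points are the top-left and bottom-right corners of an axis-aligned box (an assumption, since the issue does not say which two points are returned), the other two corners follow directly. `corners_from_two_points` is a hypothetical helper name used only for this sketch.

```python
def corners_from_two_points(top_left, bottom_right):
    """Return the 4 corners of an axis-aligned box.

    Order: top-left, top-right, bottom-right, bottom-left
    (clockwise in image coordinates, where y grows downwards).
    """
    (x1, y1), (x2, y2) = top_left, bottom_right
    return [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]
```

This only covers upright boxes; a rotated box needs extra information such as an angle or a fitted line.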

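How to train on new data

Hello author, I am working on Chinese recognition with crnn and have my own Chinese dataset, but I am not sure how to train on it. Should the training data be sorted by length, or is random order fine? Could you share the code for Chinese training?

The CTPN link is dead

Thank you very much for sharing, but the ctpn link is dead. May I ask whether you trained this ctpn model yourself? Did you generate the data yourself? Thanks.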

IndexError: index 6 is out of bounds for axis 0 with size 5

input exit break
please input file name:1755.jpg
start CTPN
text_lines length:3
Number of the detected text lines: 3
Traceback (most recent call last):
File "demo.py", line 23, in
img,text_recs = getCharBlock(text_detector,im)
File "/mnt/data/sda1/sceneReco-master-tj/ctpnport.py", line 55, in getCharBlock
text_recs = draw_boxes(tmp, text_lines, is_display=False, caption='im_name', wait=False)
File "./CTPN/src/other.py", line 31, in draw_boxes
b1 = box[6] - box[7] / 2
IndexError: index 6 is out of bounds for axis 0 with size 5

I cannot import caffe and torch at the same time

I can import caffe and use the ctpn module successfully, but when I import both the ctpn and crnn modules, there is an error:

Traceback (most recent call last):
File "", line 1, in
File "/data/resys/lvyi/exp/OCR_detect/CTPN/caffe/python/caffe/__init__.py", line 1, in
from .pycaffe import Net, SGDSolver
File "/data/resys/lvyi/exp/OCR_detect/CTPN/caffe/python/caffe/pycaffe.py", line 13, in
from ._caffe import Net, SGDSolver
ImportError: dlopen: cannot load any more object with static TLS

I guess it is a conflict between caffe and torch?
I tried changing the order of import torch and import caffe, but it doesn't work.

What is the version of caffe?

/root/sceneReco/CTPN$ python ./tools/demo.py --no-gpu
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0709 14:38:45.132715 2023 _caffe.cpp:139] DEPRECATION WARNING - deprecated use of Python interface
W0709 14:38:45.134696 2023 _caffe.cpp:140] Use this instead (with the named "weights" parameter):
W0709 14:38:45.135150 2023 _caffe.cpp:142] Net('models/deploy.prototxt', 1, weights='models/ctpn_trained_model.caffemodel')
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 387:19: Message type "caffe.LayerParameter" has no field named "transpose_param".
F0709 14:38:45.140238 2023 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: models/deploy.prototxt
*** Check failure stack trace: ***
Aborted (core dumped)

Why is this problem still happening?

Net('/home/yang/sceneReco/CTPN/models/deploy.prototxt', 1, weights='/home/yang/sceneReco/CTPN/models/ctpn_trained_model.caffemodel')
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 387:19: Message type "caffe.LayerParameter" has no field named "transpose_param".
F0128 15:30:29.198107 2556 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: /home/yang/sceneReco/CTPN/models/deploy.prototxt
*** Check failure stack trace: ***

ImportError: No module named torch

I ran python demo.py and got an error. Can you help me?

xiaomingli15:sceneReco yyh$ python demo.py
objc[2455]: Class CaptureDelegate is implemented in both /Library/Python/2.7/site-packages/cv2/cv2.so (0x10ff5ddc0) and /usr/local/opt/opencv/lib/libopencv_videoio.3.3.dylib (0x11235a618). One of the two will be used. Which one is undefined.
Traceback (most recent call last):
  File "demo.py", line 3, in <module>
    from crnnport import *
  File "/Users/yyh/Documents/study/sceneReco/crnnport.py", line 6, in <module>
    import torch
ImportError: No module named torch

Training dataset

Hello, I would like to know which dataset you used to train the model. Is it your own dataset or the ICDAR dataset? Thank you.

Could you tell me the size of your GPU RAM?

When I ran python demo.py with 1755.jpg, the program terminated and printed:
F1201 16:53:36.329907 8940 syncedmem.cpp:58] Check failed: error == cudaSuccess (2 vs. 0) out of memory
After searching online, the problem seems to be a lack of GPU memory.
PS: My video card is a 1050 Ti with 4 GB of RAM.

Running demo.py shows an error

PC:/media/media_share/linkfile/sceneReco$ python demo.py
Traceback (most recent call last):
File "demo.py", line 3, in
from crnnport import *
File "/media/media_share/linkfile/sceneReco/crnnport.py", line 6, in
import torch
File "/usr/local/lib/python2.7/dist-packages/torch/__init__.py", line 53, in
from torch._C import *
ImportError: dlopen: cannot load any more object with static TLS

Could you tell me how to fix that?

Cannot recognize Chinese words in the picture

Unexpected output: "input does not need backward computation", "conv1_1 does not need backward computation". I am confused.

The project frames the regions of text correctly, but there is no recognition result for the Chinese characters. It seems that crnn has no effect. Why?

cpu-only

Can this project run completely in CPU-only mode?

Cannot use GPU in CPU-only Caffe: check mode

I built caffe in CPU-only mode.

root@ubuntu:/home/user/sceneReco# python demo.py
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0410 03:50:20.793697 5050 common.cpp:66] Cannot use GPU in CPU-only Caffe: check mode.
*** Check failure stack trace: ***
Aborted (core dumped)

Chinese lexicon

Hello, could you provide the Chinese corpus you used for training?

RuntimeError: dimension out of range (expected to be in range of [-2, 1], but got 2)

When I input 21.bmp, I got an error:
please input file name:21.bmp
Traceback (most recent call last):
File "demo.py", line 23, in
crnnRec(model,converter,img,text_recs)
File "/home/ycp/sceneReco-master/crnnport.py", line 78, in crnnRec
preds = preds.squeeze(2)
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 750, in squeeze
return Squeeze.apply(self, dim)
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/_functions/tensor.py", line 378, in forward
result = input.squeeze(dim)
RuntimeError: dimension out of range (expected to be in range of [-2, 1], but got 2)

Differences between the CTPN in your repository and the source version

First of all, this is not an issue with your code.
Sorry, I could not find your personal email, so I am opening an issue here. Please close it after you have read it, thanks.

I noticed that the CTPN code in your repository differs from the source version: https://github.com/tianzhi0549/CTPN.
The difference is mainly in the draw_boxes() function in CTPN_ROOT/src/other.py.

1. I think you rewrote draw_boxes() so that it can display rotated rectangles, but I would like to confirm this with you. Is this right?

2. Differences in the text line detection coordinates.
I tested the same image on both your CTPN and the source version. In the text line detection step, your CTPN returns 8 values for each box, like this:
[ 1.60000000e+01 1.61765766e+01 9.59000000e+02 7.64572601e+01
9.74919617e-01 2.58991048e-02 3.36731377e+01 3.87520103e+01]
However, the source version returns only 5 values for each box, like this:
[ 16. 16.17657661 959. 76.45726013 0.97491962]
2.1 Why does the source version lack the last 3 values? (If you are not sure either, please skip this question.)
2.2 Apart from draw_boxes(), the two copies of CTPN are almost the same, so why do the text line detection results differ? Am I missing some other difference?
2.3 For the box result, I know that box[0:2] and box[2:4] are the coordinates of the top-left and bottom-right corners respectively; what do the remaining 4 values, box[4:8], mean?

Thank you for reading. I am eagerly looking forward to your reply (and to replies from anyone else who knows the answers).

Best regards,
Hui

(I would be really glad to discuss this in Chinese if you can.)
(Finally, your sample images are really classic.)
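
A traceback in an earlier issue shows draw_boxes() computing `b1 = box[6] - box[7] / 2`, which suggests, without confirming, that the extra values describe a fitted centre line and a box height. The sketch below assumes `box[4]` is the score, `box[5]` the slope of the centre line, `box[6]` its intercept, and `box[7]` the box height. That layout is a guess from the single line of code visible in this thread, not something the author has confirmed, and `rotated_corners` is a hypothetical name.

```python
def rotated_corners(box):
    """Corners (top-left, top-right, bottom-right, bottom-left) of a box.

    ASSUMED layout: [x_left, y_top, x_right, y_bottom, score,
                     slope, intercept, height].
    """
    if len(box) < 8:
        # The source-version CTPN returns 5 values, which is not enough here.
        raise ValueError("expected 8 values, got %d" % len(box))
    x_left, x_right = box[0], box[2]
    slope, intercept, height = box[5], box[6], box[7]
    top = intercept - height / 2.0     # intercept of the upper edge
    bottom = intercept + height / 2.0  # intercept of the lower edge
    return [
        (x_left, slope * x_left + top),
        (x_right, slope * x_right + top),
        (x_right, slope * x_right + bottom),
        (x_left, slope * x_left + bottom),
    ]
```

Under this reading, a 5-value box from the source version carries no slope or height, which would explain an IndexError when code written for 8 values receives it.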

How to run demo.py on the CPU

Hi, I want to run demo.py on the CPU. What should I do?
I changed ctpnport.py like this:

caffe.set_mode_cpu()
# caffe.set_device(cfg.TEST_GPU_ID)

CTPN then works, but after I changed crnnport.py to use the CPU, I get this error:

Traceback (most recent call last):
  File "demo.py", line 14, in <module>
    model,converter = crnnSource()
  File "/data/sceneReco/crnnport.py", line 47, in crnnSource
    model.load_state_dict(torch.load(path))
  File "/usr/lib64/python2.7/site-packages/torch/serialization.py", line 229, in load
    return _load(f, map_location, pickle_module)
  File "/usr/lib64/python2.7/site-packages/torch/serialization.py", line 377, in _load
    result = unpickler.load()
  File "/usr/lib64/python2.7/site-packages/torch/serialization.py", line 348, in persistent_load
    data_type(size), location)
  File "/usr/lib64/python2.7/site-packages/torch/serialization.py", line 85, in default_restore_location
    result = fn(storage, location)
  File "/usr/lib64/python2.7/site-packages/torch/serialization.py", line 67, in _cuda_deserialize
    return obj.cuda(device_id)
  File "/usr/lib64/python2.7/site-packages/torch/_utils.py", line 57, in _cuda
    with torch.cuda.device(device):
  File "/usr/lib64/python2.7/site-packages/torch/cuda/__init__.py", line 124, in __enter__
    _lazy_init()
  File "/usr/lib64/python2.7/site-packages/torch/cuda/__init__.py", line 84, in _lazy_init
    _check_driver()
  File "/usr/lib64/python2.7/site-packages/torch/cuda/__init__.py", line 58, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError: 
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx

I wrote this demo code:

model = crnn.CRNN(32, 1, len(alphabet)+1, 256)
if torch.cuda.is_available():
    model = model.cuda()
print('loading pretrained model from %s' % model_path)
model.load_state_dict(torch.load(model_path))
print(model)

When I load other .pth model files from https://github.com/meijieru/crnn.pytorch, there is no error.
Can you provide CPU model files for crnn?
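
The traceback ends in `_cuda_deserialize`, which indicates that the checkpoint stores CUDA tensors and torch.load is trying to put them back on a GPU. torch.load accepts a `map_location` argument that can remap them to the CPU instead, so a separate CPU model file may not be needed. The following is a minimal sketch with illustrative helper names (`keep_on_cpu`, `load_checkpoint_on_cpu`); it has not been tested against sceneReco's model file.

```python
def keep_on_cpu(storage, location):
    """map_location callback: leave each storage on the CPU.

    `location` is the device tag saved in the checkpoint (for example
    "cuda:0"); returning `storage` unchanged skips the move to that device.
    """
    return storage


def load_checkpoint_on_cpu(path):
    """Load a checkpoint saved from a GPU on a machine without CUDA."""
    import torch  # imported here so the helper above works without PyTorch
    return torch.load(path, map_location=keep_on_cpu)
```

In the demo code above, the load line would then read `model.load_state_dict(load_checkpoint_on_cpu(model_path))`.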

crnnport.py IndexError: tuple index out of range

There are some errors, shown below:
please input file name:1755.jpg
text_recs: [((16.0, 768.6111), (159.0, 790.30804)), ((48.0, 579.4411), (511.0, 690.03076)), ((528.0, 790.0657), (591.0, 800.4505))]
rec: ((16.0, 768.6111), (159.0, 790.30804))
pt1: (16.0, 768.6111)
pt2: (159.0, 790.30804)
Traceback (most recent call last):
File "demo.py", line 26, in
crnnRec(model,converter,img,text_recs)
File "/home/luwei/ML/CRNN/sceneReco-master/crnnport.py", line 57, in crnnRec
pt3 =(rec[6],rec[7])
IndexError: tuple index out of range

Q: What is the function of this block?
pt3 =(rec[6],rec[7])
pt4 =(rec[4],rec[5])
partImg = dumpRotateImage(im,degrees(atan2(pt2[1]-pt1[1],pt2[0]-pt1[0])),pt1,pt2,pt3,pt4)
mahotas.imsave('%s.jpg'%index, partImg)

Thanks for your contribution!
Best regards!
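
The quoted block indexes `rec[4]` to `rec[7]`, so it expects each rec to be a flat sequence of 8 numbers (four corner points), while the printed text_recs hold only two points per rec, hence the IndexError. The angle passed to dumpRotateImage is the tilt of the line from pt1 to pt2. A small sketch of both observations, using hypothetical helper names:

```python
from math import atan2, degrees


def line_angle_degrees(pt1, pt2):
    """Angle of the segment pt1 -> pt2 relative to the x axis, in degrees."""
    return degrees(atan2(pt2[1] - pt1[1], pt2[0] - pt1[0]))


def has_four_corners(rec):
    """True when rec is long enough for the rec[0]..rec[7] indexing."""
    return len(rec) >= 8
```

Checking `has_four_corners(rec)` before the block runs would turn the IndexError into an explicit message about the unexpected box format.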

How to run the demo code with multiprocessing in Python (I changed it but get an error)

from ctpnport import *
from crnnport import *
from multiprocessing import Pool as ProcessPool
from multiprocessing.dummy import Pool as ThreadPool
import time

#ctpn
text_detector = ctpnSource()
#crnn
model,converter = crnnSource()

timer=Timer()

def loop_by_input_filename():
    print "\ninput exit break\n"
    while 1 :
        im_name = raw_input("\nplease input file name:")
        if im_name == "exit":
            break
        im_path = "./img/" + im_name
        im = cv2.imread(im_path)
        if im is None:
            continue
        timer.tic()
        img,text_recs = getCharBlock(text_detector,im)
        print 'after demo.py getCharBlock(text_detector,im)'
        crnnRec(model,converter,img,text_recs)
        print "Time: %f"%timer.toc()
        #cv2.waitKey(0)

def do_recognition(file):
    im = cv2.imread(file)
    timer.tic()
    img,text_recs = getCharBlock(text_detector,im)
    crnnRec(model,converter,img,text_recs)
    print "Time: %f"%timer.toc()

def get_all_files_in_dir(dir='./img/'):
    import glob
    files = glob.glob(dir + '*.jpg')
    return files
def stress_test():
    files = get_all_files_in_dir()
    print files
    start = time.time()
    pool = ProcessPool(1)
    #pool = ThreadPool(1)
    ret = pool.map(do_recognition, files)
    pool.close()
    pool.join()
    print 'time consume %s' % (time.time() - start)
if __name__ == '__main__':
    stress_test()
    #loop_by_input_filename()
    #do_recognition('./img/22.jpg')

When I run python demo.py I get the error below, using just one process:

"""
F0728 16:03:35.364833 4085 syncedmem.hpp:19] Check failed: error == cudaSuccess (3 vs. 0) initialization error
*** Check failure stack trace: ***
E0728 16:03:36.586454 4090 common.cpp:103] Cannot create Cublas handle. Cublas won't be available.
E0728 16:03:36.586565 4090 common.cpp:110] Cannot create Curand generator. Curand won't be available.
Number of the detected text lines: 3
Time: 22.920991
THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THC/generic/THCStorage.c line=55 error=3 : initialization error
Traceback (most recent call last):
File "demo.py", line 54, in
stress_test()
File "demo.py", line 49, in stress_test
ret = pool.map(do_recognition, files)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 251, in map
return self.map_async(func, iterable, chunksize).get()
File "/usr/lib/python2.7/multiprocessing/pool.py", line 558, in get
raise self._value
RuntimeError: cuda runtime error (3) : initialization error at /b/wheel/pytorch-src/torch/lib/THC/generic/THCStorage.c:55
"""

If I change the code as below, I also get an error:

    #pool = ProcessPool(1)
    pool = ThreadPool(1)

"""
*** Error in `python': free(): invalid pointer: 0x0000000204800000 ***
Aborted (core dumped)
"""

run demo.py wrong

Hi Dear

when I run
python demo.py
it shows the following error:
[root@localhost sceneReco]# python demo.py
Traceback (most recent call last):
File "demo.py", line 2, in
from ctpnport import *
File "/root/Desktop/sceneReco/ctpnport.py", line 29, in
from detectors import TextProposalDetector, TextDetector
File "./CTPN/src/detectors.py", line 4, in
from utils.cpu_nms import cpu_nms as nms
ImportError: No module named cpu_nms

How could I handle that?

Thanks
weizhen

ImportError: No module named cpu_nms

When I run python demo.py, I get this error:
Traceback (most recent call last):
File "demo.py", line 3, in
from ctpnport import *
File "/home/liuchang/git/sceneReco/ctpnport.py", line 29, in
from detectors import TextProposalDetector, TextDetector
File "./CTPN/src/detectors.py", line 4, in
from utils.cpu_nms import cpu_nms as nms
ImportError: No module named cpu_nms

Failed to parse NetParameter file: CTPN/models/deploy.prototxt

When I run the code, it shows:
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0714 00:59:40.417732 2919 _caffe.cpp:139] DEPRECATION WARNING - deprecated use of Python interface
W0714 00:59:40.417752 2919 _caffe.cpp:140] Use this instead (with the named "weights" parameter):
W0714 00:59:40.417757 2919 _caffe.cpp:142] Net('CTPN/models/deploy.prototxt', 1, weights='CTPN/models/ctpn_trained_model.caffemodel')
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 387:19: Message type "caffe.LayerParameter" has no field named "transpose_param".
F0714 00:59:40.419016 2919 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: CTPN/models/deploy.prototxt
*** Check failure stack trace: ***
Aborted (core dumped)

Running demo.py always ends in a core dump

When I run ROOT/demo.py,

it always shows these errors:

I0717 11:21:32.941676 3793 layer_factory.hpp:77] Creating layer input
I0717 11:21:32.941694 3793 net.cpp:100] Creating Layer input
I0717 11:21:32.941701 3793 net.cpp:408] input -> data
I0717 11:21:32.941712 3793 net.cpp:408] input -> im_info
I0717 11:21:34.105748 3793 net.cpp:150] Setting up input
I0717 11:21:34.105787 3793 net.cpp:157] Top shape: 1 3 600 900 (1620000)
I0717 11:21:34.105795 3793 net.cpp:157] Top shape: 1 3 (3)
I0717 11:21:34.105801 3793 net.cpp:165] Memory required for data: 6480012
I0717 11:21:34.105810 3793 layer_factory.hpp:77] Creating layer conv1_1
I0717 11:21:34.105829 3793 net.cpp:100] Creating Layer conv1_1
I0717 11:21:34.105844 3793 net.cpp:434] conv1_1 <- data
I0717 11:21:34.105852 3793 net.cpp:408] conv1_1 -> conv1_1
F0717 11:21:34.430495 3793 cudnn.hpp:113] Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM
*** Check failure stack trace: ***
Aborted (core dumped)
hc@hc-PC:/media/media_share/linkfile/sceneReco$

Can you tell me how to solve it?

thank you very much

Confusion about squeeze(2) in crnnport.py

In my experiments, preds (line 75) has 3 dimensions. After preds.max(2) it has 2 dimensions. As a result, preds.squeeze(2) throws an exception, because axis 2 is outside the range of valid dimensions.

The command line therefore reports: "RuntimeError: dimension out of range (expected to be in range of [-2, 1], but got 2)"
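
The shape arithmetic described above can be reproduced without PyTorch. The sketch below uses plain nested lists with illustrative helper names: taking the maximum over the last axis of a 3-D array leaves a 2-D result, and for a 2-D result the valid axis indices are -2 to 1, which matches the range in the error message. Older PyTorch releases kept the reduced axis by default, which is probably why the original code calls squeeze(2), although that is an inference and not something stated in the repository.

```python
def shape_of(nested):
    """Shape of a rectangular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(shape)


def argmax_last_axis(preds):
    """Index of the largest score along the last axis of a 3-D nested list."""
    return [[max(range(len(scores)), key=scores.__getitem__)
             for scores in step]
            for step in preds]


def valid_dim_range(ndim):
    """Inclusive range of axis indices accepted for an ndim-dimensional array."""
    return (-ndim, ndim - 1)
```

With a 2-D result there is no axis 2 left to squeeze, so dropping the squeeze call, or guarding it on the number of dimensions, avoids the exception.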

Running demo.py with cuDNN, I get "Check failed: error == cudaSuccess (35 vs. 0) CUDA driver version is insufficient for CUDA runtime version"

./NVIDIA_CUDA-8.0_Samples/bin/x86_64/linux/release/deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1080"
CUDA Driver Version / Runtime Version 8.0 / 8.0
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 8105 MBytes (8499167232 bytes)
(20) Multiprocessors, (128) CUDA Cores/MP: 2560 CUDA Cores
GPU Max Clock rate: 1848 MHz (1.85 GHz)
Memory Clock rate: 5005 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 1080
Result = PASS
