mjq11302010044 / rrpn
Arbitrary-Oriented Scene Text Detection via Rotation Proposals (TMM 2018)
License: Other
@mjq11302010044
If CLASSES = ('chinese', 'japenese',"English","germany"), I want to get a separate result for each class. How can I modify the code below?
# Visualize detections for each class
CONF_THRESH = conf
NMS_THRESH = 0.3
for cls_ind, cls in enumerate(CLASSES[1:]):
    cls_ind += 1  # because we skipped background
    cls_boxes = boxes[:, 5 * cls_ind:5 * (cls_ind + 1)]  # D
    cls_scores = scores[:, cls_ind]
    dets = np.hstack((cls_boxes,
                      cls_scores[:, np.newaxis])).astype(np.float32)
    keep = rotate_gpu_nms(dets, NMS_THRESH)  # D
    dets = dets[keep, :]
    # dets = dets[0:20]
    # dets[:, 4] = dets[:, 4] * 0.45
    dets[:, 2] = dets[:, 2] / cfg.TEST.GT_MARGIN
    dets[:, 3] = dets[:, 3] / cfg.TEST.GT_MARGIN
    results = write_result_ICDAR(
        im_file,
        dets,
        CONF_THRESH,
        result_dir,
        im_height,
        im_width)
return results
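One way to keep the results of each class apart is to collect the detections in a dictionary keyed by class name instead of overwriting a single `results` variable. The sketch below is only an illustration, not code from the repository: `split_detections_by_class` is a hypothetical helper, the class names are placeholders, and it assumes `CLASSES[0]` is a background entry (the loop above skips index 0, so a tuple of four languages would need a background name prepended). The `rotate_gpu_nms` step is left out so that the example runs with NumPy alone.

```python
import numpy as np

def split_detections_by_class(boxes, scores, classes, conf_thresh):
    """Return {class_name: (K, 6) array of [5 box values, score]}.

    boxes:  (N, 5 * len(classes)) array, five box values per class
    scores: (N, len(classes)) array, column 0 is the background score
    """
    per_class = {}
    for cls_ind, cls in enumerate(classes[1:], start=1):  # skip background
        cls_boxes = boxes[:, 5 * cls_ind:5 * (cls_ind + 1)]
        cls_scores = scores[:, cls_ind]
        dets = np.hstack((cls_boxes,
                          cls_scores[:, np.newaxis])).astype(np.float32)
        # In the real demo, rotated NMS would be applied to dets here.
        per_class[cls] = dets[dets[:, 5] >= conf_thresh]
    return per_class

# Placeholder class list with an explicit background entry (assumption).
CLASSES = ('__background__', 'chinese', 'japanese', 'english', 'german')
rng = np.random.default_rng(0)
boxes = rng.random((4, 5 * len(CLASSES)))
scores = rng.random((4, len(CLASSES)))

results = split_detections_by_class(boxes, scores, CLASSES, conf_thresh=0.5)
for name, dets in results.items():
    print(name, dets.shape)
```

Each per-class array could then be passed to `write_result_ICDAR` with a class-specific output directory.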
Hi,
I tested your pretrained VGG16 model on ICDAR2015 without changing any parameters. The following is the result of the official ICDAR2015 evaluation script. All of these values are lower than those claimed in the paper.
Calculated!{"recall": 0.6721232546942706, "precision": 0.7977142857142857, "hmean": 0.7295531748105567, "AP": 0}
In rotate_polygon_nms.pyx and rbbox_overlaps.pyx, is the order of the input boxes [x_ctr, y_ctr, w, h, theta] or [y_ctr, x_ctr, h, w, theta]?
I am confused about it. Can you give me some help? @mjq11302010044
Thank you for your great work! I wanted kindly to ask if there is an option to run the demo on CPU only? :)
Best Regards
Valentin
I am using rbbx_overlaps (from rotation.rbbox_overlaps import rbbx_overlaps as abc) to calculate box overlap, but I get nan values. What causes this code to produce nan? Thanks!
The following example produces IOU greater than 1.
b1 = np.array([[46.83, 44.03, 3.9, 1.63, 0]], dtype=np.float32)
b2 = np.array([[46.83, 44.03, 1.63, 3.9, 1.45]], dtype=np.float32)
rbbox_overlaps(b1, b2) # returns 1.35
The expected IoU should be near 1, and an IoU can never exceed 1. Is there something I'm missing here?
Note that I changed the code to use angle in radians.
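To cross-check a suspicious overlap value, it can help to compute the rotated IoU independently of the CUDA kernel. The following is a pure-Python sketch using Sutherland-Hodgman polygon clipping. It is not the repository's implementation: `rotated_iou` and its helpers are hypothetical names, boxes are assumed to be (x_ctr, y_ctr, w, h, theta) with theta in radians, and both boxes must use the same angle convention.

```python
import math

def rect_corners(cx, cy, w, h, theta):
    """Corners of a rotated rectangle, in counter-clockwise order."""
    c, s = math.cos(theta), math.sin(theta)
    half = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]

def cross(o, a, b):
    """Z component of (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def polygon_area(pts):
    """Shoelace formula; returns 0.0 for an empty polygon."""
    total = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def clip_polygon(subject, clipper):
    """Clip `subject` against a convex counter-clockwise `clipper`."""
    output = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        inp, output = output, []
        if not inp:
            break
        prev = inp[-1]
        for cur in inp:
            d_prev, d_cur = cross(a, b, prev), cross(a, b, cur)
            if (d_cur >= 0) != (d_prev >= 0):
                # The edge prev->cur crosses the clip line: add the crossing.
                t = d_prev / (d_prev - d_cur)
                output.append((prev[0] + t * (cur[0] - prev[0]),
                               prev[1] + t * (cur[1] - prev[1])))
            if d_cur >= 0:
                output.append(cur)
            prev = cur
    return output

def rotated_iou(box_a, box_b):
    """IoU of two rotated rectangles; always within [0, 1]."""
    pa, pb = rect_corners(*box_a), rect_corners(*box_b)
    inter = polygon_area(clip_polygon(pa, pb))
    union = polygon_area(pa) + polygon_area(pb) - inter
    return inter / union if union > 0 else 0.0

b1 = (46.83, 44.03, 3.9, 1.63, 0.0)
b2 = (46.83, 44.03, 1.63, 3.9, 1.45)
print(rotated_iou(b1, b2))
```

If the value printed here stays at or below 1 while the kernel returns 1.35, the discrepancy is more likely in the kernel (or in the degree-to-radian change) than in the inputs.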
Hello! I ran into the following problem when running rotation_demo.py:
Demo for data/demo/./data/1000train/test_100image/TB1coNGLXXXXXXPXFXXunYpLFXX.jpg
Detection took 0.175s for 300 object proposals
QObject::moveToThread: Current thread (0x7fe39117a7f0) is not the object's thread (0x7fe3904d2960).
Cannot move to target thread (0x7fe39117a7f0)
Segmentation fault (core dumped)
How should I solve this? Thank you very much!
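@mjq11302010044 Hello, I would like to ask how the rotation angle range [-π/4, 3π/4] in the paper was determined.
My understanding is that the angle is always the one between the long side (w) and the positive x-axis, and that when the angle falls in [3π/4, π], angle = angle - π.
Is this understanding correct?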
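The wrapping described in the question can be written as a single modulo operation. This is only a sketch of the asker's reading, not a confirmed description of the paper or the code: `normalize_angle` is a hypothetical helper, angles are in radians, and the interval is treated as half-open, [-π/4, 3π/4).

```python
import math

def normalize_angle(theta):
    """Map an angle (radians) into [-pi/4, 3*pi/4) by adding or subtracting pi."""
    return (theta + math.pi / 4) % math.pi - math.pi / 4

# An angle of 0.9*pi lies in [3*pi/4, pi], so it is shifted down by pi.
print(normalize_angle(0.9 * math.pi) / math.pi)
```

Shifting by π leaves the rectangle unchanged, because a rectangle rotated by 180° covers the same region.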
Hi all,
I am trying to re-implement the Rotate Roi Pooling Layer and I can't quite understand where the variables of AB,AC,ABAB,ACAC, ABAP,ACAP come from in the paper. The problem is that because of the additional checking condition with those variables , there is a very rare update.
Just trying to understand, any help is highly appreciated.
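One common reading of these names, offered here as an assumption rather than a statement about the paper, is the dot-product test for whether a point P lies inside a rotated rectangle: A is one corner, B and C are the two corners adjacent to A, and AB, AC, AP are vectors from A. P is inside when its projections onto AB and AC both fall within the side lengths, i.e. 0 <= AB·AP <= AB·AB and 0 <= AC·AP <= AC·AC. A minimal sketch (the function name is hypothetical):

```python
def point_in_rotated_rect(p, a, b, c):
    """True if p lies inside the rectangle with corner a and adjacent corners b, c."""
    ab = (b[0] - a[0], b[1] - a[1])
    ac = (c[0] - a[0], c[1] - a[1])
    ap = (p[0] - a[0], p[1] - a[1])
    abab = ab[0] * ab[0] + ab[1] * ab[1]   # squared length of side AB
    acac = ac[0] * ac[0] + ac[1] * ac[1]   # squared length of side AC
    abap = ab[0] * ap[0] + ab[1] * ap[1]   # projection of AP onto AB (scaled)
    acap = ac[0] * ap[0] + ac[1] * ap[1]   # projection of AP onto AC (scaled)
    return 0 <= abap <= abab and 0 <= acap <= acac

# Rectangle rotated by 45 degrees: A=(0,0), B=(2,2), C=(-1,1).
print(point_in_rotated_rect((0.5, 1.5), (0, 0), (2, 2), (-1, 1)))
print(point_in_rotated_rect((3.0, 0.0), (0, 0), (2, 2), (-1, 1)))
```

With a check like this inside a pooling loop, only feature-map cells whose position passes the test contribute to a bin, which would explain why updates are rare for thin or small rotated regions.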
I know how to make annotations in (x,y,w,h) format with "labelimg". Can you tell me how to make annotations with 8 numbers (x1,y1,x2,y2,x3,y3,x4,y4)?
In the forward function, an illegal memory access was encountered.
Does the author have other versions of the code, for example in PyTorch or TensorFlow? Installing Caffe keeps failing, which is very frustrating.
root@63b86f5db157:/workspace/gitzkl/RRPN# python ./tools/rotation_demo.py
Traceback (most recent call last):
File "./tools/rotation_demo.py", line 18, in <module>
from fast_rcnn.test import im_detect
File "/workspace/gitzkl/RRPN/tools/../lib/fast_rcnn/test.py", line 16, in <module>
import caffe
File "/workspace/gitzkl/RRPN/tools/../caffe-fast-rcnn/python/caffe/__init__.py", line 1, in <module>
from .pycaffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, RMSPropSolver, AdaDeltaSolver, AdamSolver
File "/workspace/gitzkl/RRPN/tools/../caffe-fast-rcnn/python/caffe/pycaffe.py", line 13, in <module>
from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver,
ImportError: No module named _caffe
However, in the same container, starting python and running
import caffe
works normally, without any error.
I then rebuilt pycaffe:
/opt/caffe/build/tools/caffe
cd /opt/caffe/build
make pycaffe
The build succeeded:
[100%] Built target pycaffe
But running python ./tools/rotation_demo.py still produces the error above.
The environment was created with Docker: nvidia-docker run -ti bvlc/caffe:gpu caffe --version
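The traceback shows the demo importing Caffe from /workspace/gitzkl/RRPN/caffe-fast-rcnn/python, while the rebuild was done in /opt/caffe, so these may be two different copies. "No module named _caffe" means the compiled extension is missing from the package directory that was actually imported. A small check, assuming the extension file is named `_caffe*.so` on Linux (`has_compiled_pycaffe` is a hypothetical helper):

```python
import glob
import os

def has_compiled_pycaffe(caffe_python_dir):
    """True if a compiled _caffe extension exists under <dir>/caffe/."""
    pattern = os.path.join(caffe_python_dir, "caffe", "_caffe*.so")
    return len(glob.glob(pattern)) > 0

# Paths taken from the traceback and the build log above.
for candidate in ("/workspace/gitzkl/RRPN/caffe-fast-rcnn/python",
                  "/opt/caffe/python"):
    print(candidate, has_compiled_pycaffe(candidate))
```

If only the /opt/caffe copy reports True, building pycaffe inside the bundled caffe-fast-rcnn directory is the likely next step.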
Hello @mjq11302010044. I'm trying to train a model of my own, like the trained model that you shared.
But training is very slow: each iteration takes around 5 seconds and the iteration count is 490000.
Do you have any tricks to speed up training?
What is the minimum iteration count needed to get the demo working?
When I predict a text area, rotate the image, and crop it, I often find that if the text area's angle is 225°, RRPN predicts 45°. When I feed the image cropped at 45° to CRNN, the recognition result is wrong. Can the author or others suggest a way to determine whether the correct prediction is 45° or 225°?
Hi @mjq11302010044,
Thanks for your great work.
I have a question: does the code support multi-GPU training?
Thanks.
I am trying to use your code to replicate the training process with my custom dataset. I am using VGG16 as the backbone. I get the following error:
I0120 19:31:06.732738 15049 solver.cpp:77] Creating training net from train_net file: /data/RRPN/models/rrpn/VGG16/faster_rcnn_end2end/train.prototxt
[libprotobuf ERROR google/protobuf/text_format.cc:274] Error parsing text-format caffe.NetParameter: 471:24: Message type "caffe.LayerParameter" has no field named "smooth_l1_loss_param".
F0120 19:31:06.733158 15049 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: /data/RRPN/models/rrpn/VGG16/faster_rcnn_end2end/train.prototxt
*** Check failure stack trace: ***
After further searching, I found that this issue was related to the caffe-fast-rcnn provided in your code lacking certain layers, so I tried building it again using the instructions on the BVLC Caffe website.
make all -j8 proceeds without any failures, but make test fails as shown below:
ubuntu@ip-172-31-1-22:/data/RRPN/caffe-fast-rcnn$ sudo make test -j8
CXX src/caffe/test/test_concat_layer.cpp
CXX src/caffe/test/test_multinomial_logistic_loss_layer.cpp
CXX src/caffe/test/test_platform.cpp
CXX src/caffe/test/test_deconvolution_layer.cpp
CXX src/caffe/test/test_random_number_generator.cpp
CXX src/caffe/test/test_convolution_layer.cpp
CXX src/caffe/test/test_euclidean_loss_layer.cpp
CXX src/caffe/test/test_hinge_loss_layer.cpp
CXX src/caffe/test/test_stochastic_pooling.cpp
CXX src/caffe/test/test_io.cpp
CXX src/caffe/test/test_softmax_layer.cpp
CXX src/caffe/test/test_upgrade_proto.cpp
CXX src/caffe/test/test_bias_layer.cpp
CXX src/caffe/test/test_mvn_layer.cpp
CXX src/caffe/test/test_roi_pooling_layer.cpp
CXX src/caffe/test/test_net.cpp
CXX src/caffe/test/test_benchmark.cpp
CXX src/caffe/test/test_filler.cpp
CXX src/caffe/test/test_argmax_layer.cpp
CXX src/caffe/test/test_contrastive_loss_layer.cpp
CXX src/caffe/test/test_smooth_L1_loss_layer.cpp
src/caffe/test/test_smooth_L1_loss_layer.cpp:11:35: fatal error: caffe/vision_layers.hpp: No such file or directory
compilation terminated.
Makefile:581: recipe for target '.build_release/src/caffe/test/test_smooth_L1_loss_layer.o' failed
make: *** [.build_release/src/caffe/test/test_smooth_L1_loss_layer.o] Error 1
make: *** Waiting for unfinished jobs....
Can you please tell me how you resolved this issue and trained your model?
Thanks
My computer's motherboard (ASUS Z370) does not seem to be compatible with Linux. Does anyone have tips for compiling the lib so that it can run on Windows 10?
thank you!!!
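May I ask how many iterations are generally needed before the model shows clear results? Our dataset has about 350 images, each with about 15 text boxes. Training has now run for 200,000 iterations, but the results still seem wrong: the text detections are very inaccurate.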
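Hi, hello author.
First of all, thank you very much for sharing. Following your training instructions, I trained on the MSRA-TD500 data, but running rotation_demo.py produces no output. Debugging shows that the box confidences are all very low, none above 0.1, so they are all filtered out. In the training log, the loss starts at 1.6 and is 0.35 after 200,000 iterations. It looks like training converged, but I don't know why no boxes are detected.
My questions:
1. Has training really converged in this case, or does something need to be adjusted?
2. Does the training data need any transformation? I used the original MSRA-TD500 annotations (locations.xml) without any further processing.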
I ran the script ./data/scripts/fetch_imagenet_models.sh but only got a 34 KB imagenet_models.tgz.
Error information:
gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
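A 34 KB file that gzip rejects is often an HTML error or redirect page saved under the archive's name, although that is a guess here. Gzip files start with the two magic bytes 0x1f 0x8b, so the download can be checked before running tar. A minimal sketch (`looks_like_gzip` is a hypothetical helper, and the two demo files are created only for illustration):

```python
import gzip
import os
import tempfile

def looks_like_gzip(path):
    """True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"

with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.tgz")
    bad = os.path.join(tmp, "bad.tgz")
    with gzip.open(good, "wb") as f:
        f.write(b"payload")
    with open(bad, "wb") as f:
        f.write(b"<html>error page</html>")
    good_ok = looks_like_gzip(good)
    bad_ok = looks_like_gzip(bad)

print(good_ok, bad_ok)  # True False
```

Running `looks_like_gzip` on the downloaded imagenet_models.tgz, or simply opening the file in a text editor, would show whether the server returned a web page instead of the archive.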
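I used ./experiment/scripts/faster_rcnn_end2end.sh 1 VGG16 rrpn to train the model. Training went fine and I obtained the final model, but the test stage failed. The error is raised when test_net.py calls the get_imdb function: KeyError: 'Unknown dataset: MSRA_TEST'. How should I fix this error?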
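I don't quite understand the output in demo.py.
Hello, I would like to use your method to detect general, roughly rectangular objects. What is the annotation format of the dataset? Is there a dedicated tool for creating it? Thanks.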
I get the following error when I try to run the demo file:
TypeError: No registered converter was able to extract a C++ pointer to type char from this Python object of type str
Traceback (most recent call last):
File "./tools/rotation_demo.py", line 572, in <module>
net = caffe.Net(prototxt, caffemodel, caffe.TEST)
SystemError: <Boost.Python.function object at 0x24faeb0> returned NULL without setting an error
make -j4 and make pycaffe complete without any error. I am using Python 3.6.
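According to the paper, the angle theta should be measured counter-clockwise, but judging from the formula in the paper, the affine transformation seems to be computed with the clockwise formula. I don't quite understand this formula. Perhaps I have misunderstood it? @mjq11302010044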
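One possible source of this kind of confusion, not confirmed for the paper's formula, is the coordinate frame. The standard rotation matrix [[cos θ, -sin θ], [sin θ, cos θ]] turns a point counter-clockwise when the y-axis points up, but in image coordinates, where y points down, the same matrix looks clockwise on screen. A small sketch (`rotate` is a hypothetical helper):

```python
import math

def rotate(point, theta):
    """Apply the standard 2D rotation matrix to a point."""
    x, y = point
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

# Rotating the unit x vector by +90 degrees increases y.
x, y = rotate((1.0, 0.0), math.pi / 2)
print(round(x, 6), round(y, 6))
```

With y pointing up, an increase in y is a counter-clockwise turn; with y pointing down, as in image pixel coordinates, the same increase moves the point below the x-axis, which appears clockwise.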
I tried running with CPU but I'm getting this error when I try to build the Cython modules:
EnvironmentError: The nvcc binary could not be located in your $PATH. Either add it to your path, or set $CUDAHOME
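When running faster_rcnn_end2end.sh, the output on screen is:
real 0m0.000s
user 0m0.000s
sys 0m0.000s
real 0m0.000s
user 0m0.000s
sys 0m0.000s
In other words, it never enters the train_net.py script and exits faster_rcnn_end2end.sh right away. What could be the reason? Any advice would be appreciated. Thanks!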
When I run ./tools/rotation_demo.py, I get the following error:
Loaded network /RRPN/data/faster_rcnn_models/vgg16_faster_rcnn.caffemodel
Memory need is 426752000
Memory need is 426752000
Memory need is 106752000
Memory need is 106752000
Memory need is 213504000
Memory need is 213504000
Memory need is 53376000
Memory need is 53376000
Memory need is 106752000
F1010 11:27:03.680461 8528 syncedmem.cpp:57] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
Aborted (core dumped)
I have tried every solution I could find, but none of them resolved this. I am compiling with cuDNN 7 and CUDA 9.0; I also downgraded both, and the problem was still not solved. I am using a GT 710 with 2 GB. Can anyone help me here? I am not even sure whether this is a bug or a genuine hardware limitation, so before I go and buy a new GPU I would appreciate your help.
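As a rough sanity check, the "Memory need is" values from the log above can be added up. The log does not state the unit; the sketch assumes the values are byte counts and that the allocations are all alive at the same time, neither of which is confirmed.

```python
# Values copied from the "Memory need is" lines in the log above.
memory_need = [
    426752000, 426752000,
    106752000, 106752000,
    213504000, 213504000,
    53376000, 53376000,
    106752000,
]

total_bytes = sum(memory_need)
total_gib = total_bytes / 2**30  # assumes the unit is bytes
print(total_bytes, round(total_gib, 2))
```

Under those assumptions the listed buffers alone come to roughly 1.6 GiB before the network weights and activations are counted, which would leave very little headroom on a 2 GB card.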
@mjq11302010044 I finally managed to run it :) thank you for your great support.
I do not have cuDNN installed ... that might be an issue with the memory required by the demo?
I am using a fairly new GPU, NVIDIA QUADRO-M4000 with 8GB! How is it possible I am running out of Memory? Also, I've resized your images, but still...:
libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.
Loaded network /home/vale/masterarbeit/H_AOSTD/data/faster_rcnn_models/vgg16_faster_rcnn.caffemodel
F1109 05:23:29.260838 22269 syncedmem.cpp:64] Check failed: error == cudaSuccess (2 vs. 0) out of memory
I guess I could optimize utilization with cuDNN. But is it necessary? Quite bad experience with cuDNN...
I0107 11:43:32.129838 14961 layer_factory.hpp:77] Creating layer rpn_loss_cls
I0107 11:43:32.132026 14961 net.cpp:150] Setting up rpn_loss_cls
I0107 11:43:32.132074 14961 net.cpp:157] Top shape: (1)
I0107 11:43:32.132081 14961 net.cpp:160] with loss weight 1
I0107 11:43:32.132100 14961 net.cpp:165] Memory required for data: 298545136
I0107 11:43:32.132104 14961 layer_factory.hpp:77] Creating layer rpn_loss_bbox
I0107 11:43:32.132115 14961 net.cpp:106] Creating Layer rpn_loss_bbox
I0107 11:43:32.132118 14961 net.cpp:454] rpn_loss_bbox <- rpn_bbox_pred_rpn_bbox_pred_0_split_0
I0107 11:43:32.132123 14961 net.cpp:454] rpn_loss_bbox <- rpn_bbox_targets
I0107 11:43:32.132127 14961 net.cpp:454] rpn_loss_bbox <- rpn_bbox_inside_weights
I0107 11:43:32.132129 14961 net.cpp:454] rpn_loss_bbox <- rpn_bbox_outside_weights
I0107 11:43:32.132133 14961 net.cpp:411] rpn_loss_bbox -> rpn_loss_bbox
F0107 11:43:32.132158 14961 smooth_L1_loss_layer.cpp:28] Check failed: bottom[0]->channels() == bottom[1]->channels() (225 vs. 270)
*** Check failure stack trace: ***
./experiments/scripts/faster_rcnn_end2end.sh: line 78: 14961 Aborted (core dumped) ./tools/train_net.py --gpu ${GPU_ID} --solver /data/wuxl/RRPN2/models/${PT_DIR}/${NET}/faster_rcnn_end2end/solver.prototxt --weights data/imagenet_models/${NET}.v2.caffemodel --imdb ${TRAIN_IMDB} --iters ${ITERS} --cfg experiments/cfgs/faster_rcnn_end2end.yml ${EXTRA_ARGS}
real 0m18.619s
user 0m17.324s
sys 0m1.984s
How can I solve this problem?
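The failed check compares the channel counts of the two inputs of rpn_loss_bbox: the prediction (225) and rpn_bbox_targets (270). If each anchor carries five regression values (x, y, w, h, theta), which is an assumption based on the 5-wide boxes used elsewhere in this code, then 225 corresponds to 45 anchors and 270 to 54. That would suggest the anchor settings used to build the targets and the num_output in the prototxt disagree. The sketch below assumes the anchor count is the product of the numbers of ratios, scales, and angles; `rpn_bbox_channels` is a hypothetical helper.

```python
def rpn_bbox_channels(ratios, scales, angles, values_per_box=5):
    """Return (number of anchors, expected rpn_bbox_pred channels)."""
    num_anchors = len(ratios) * len(scales) * len(angles)
    return num_anchors, num_anchors * values_per_box

# Channel counts from the failed check, divided by five values per anchor.
print(225 // 5, 270 // 5)

# Example: 3 ratios x 3 scales x 6 angles (illustrative values only).
print(rpn_bbox_channels([0.125, 0.2, 0.5], [8, 16, 32],
                        [-30, 0, 30, 60, 90, 120]))
```

Whichever side was edited (the anchor generator or the prototxt), the other side needs the matching anchor count.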
The function convert_region in rotate_polygon_nms_kernel.cu and rbbox_overlaps_kernel.cu seems to expect the angle as an anti-clockwise value in degrees.
Is my understanding correct?
I compared the result with OpenCV 3's rotatedRectangleIntersection, and the results are equal.
Hi! Thank you for open-sourcing the code.
My training config is set as follows:
'TRAIN': {'ASPECT_GROUPING': True,
'BATCH_SIZE': 64,
'BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
'BBOX_NORMALIZE_MEANS': [0.0, 0.0, 0.0, 0.0],
'BBOX_NORMALIZE_STDS': [0.1, 0.1, 0.2, 0.2],
'BBOX_NORMALIZE_TARGETS': True,
'BBOX_NORMALIZE_TARGETS_PRECOMPUTED': True,
'BBOX_REG': True,
'BBOX_THRESH': 0.5,
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0.0,
'FG_FRACTION': 0.25,
'FG_THRESH': 0.5,
'GT_MARGIN': 1.4,
'HAS_RPN': True,
'IMS_PER_BATCH': 1,
'MAX_SIZE': 1000,
'PROPOSAL_METHOD': 'gt',
'RBBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0, 1.0],
'RBBOX_NORMALIZE_MEANS': [0.0, 0.0, 0.0, 0.0, 0.0],
'RBBOX_NORMALIZE_STDS': [0.1, 0.1, 0.2, 0.2, 1],
'RBBOX_NORMALIZE_TARGETS_PRECOMPUTED': True,
'RPN_BATCHSIZE': 256,
'RPN_BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
'RPN_CLOBBER_POSITIVES': False,
'RPN_FG_FRACTION': 0.5,
'RPN_MIN_SIZE': 16,
'RPN_NEGATIVE_OVERLAP': 0.3,
'RPN_NMS_THRESH': 0.7,
'RPN_POSITIVE_OVERLAP': 0.7,
'RPN_POSITIVE_WEIGHT': -1.0,
'RPN_POST_NMS_TOP_N': 2000,
'RPN_PRE_NMS_TOP_N': 12000,
'RPN_RBBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0, 1.0],
'R_NEGATIVE_ANGLE_FILTER': 15,
'R_POSITIVE_ANGLE_FILTER': 15,
'SCALES': [600],
'SNAPSHOT_INFIX': '',
'SNAPSHOT_ITERS': 10000,
'USE_FLIPPED': False,
'USE_PREFETCH': False},
'USE_GPU_NMS': True}
I use the ZF network.
# Images to use per minibatch
__C.TRAIN.IMS_PER_BATCH = 1
I thought too many anchors were being generated, so I reduced the sizes, ratios, and angles of the anchors as follows:
def generate_anchors(base_size=16, ratios=[0.2,1],
                     scales=2 ** np.arange(3,5), angle=[0.0,30.0]):
But the same error always occurs:
Check failed: error == cudaSuccess (11 vs. 0) invalid argument
Then I set A to 64, as follows:
# Minibatch size (number of regions of interest [ROIs])
A = __C.TRAIN.BATCH_SIZE = 64
The same error still occurs:
Check failed: error == cudaSuccess (11 vs. 0) invalid argument
Since I cannot download the VGG16 model you released, I use ZF.v2.caffemodel (from imagenet_models) for fine-tuning.
Are there any parameters I need to set, or something I have overlooked?
Thank you!
make fails with the error: recipe for target '.build_release/src/caffe/layers/rotate_roi_polling_layer.o' failed
I want to use the skew-NMS algorithm from this repository in a different project. Which parameters can I control, and which files do I need to import?
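Judging only from the demo code quoted earlier on this page, the NMS call takes rows of [five box values, score] plus a single overlap threshold (rotate_gpu_nms(dets, NMS_THRESH)). For a project that cannot build the CUDA kernel, the greedy procedure can be sketched in plain Python with a pluggable overlap function. This is a generic greedy NMS, not the repository's kernel: `greedy_nms` and `axis_aligned_iou` are hypothetical names, and the axis-aligned IoU is only a stand-in for a rotated IoU.

```python
def axis_aligned_iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes; stand-in for a rotated IoU."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, thresh, iou_fn=axis_aligned_iou):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any kept box too much.
        if all(iou_fn(boxes[i], boxes[k]) <= thresh for k in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores, thresh=0.3))  # [0, 2]
```

Passing a rotated overlap function as `iou_fn` turns this into a rotated-box NMS; the threshold is the only tunable parameter in this sketch.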
I trained on the ICDAR 2003 dataset and generated a caffemodel, but during testing I ran into this problem:
Traceback (most recent call last):
File "./tools/test_net.py", line 86, in <module>
imdb = get_imdb(args.imdb_name)
File "/home/xuy/code/RRPN/tools/../lib/datasets/factory.py", line 37, in get_imdb
raise KeyError('Unknown dataset: {}'.format(name))
KeyError: 'Unknown dataset: MSRA_TEST'
From your faster_rcnn_end2end.sh I can see the following:
case $DATASET in
pascal_voc)
TRAIN_IMDB="voc_2007_trainval"
TEST_IMDB="voc_2007_test"
PT_DIR="pascal_voc"
ITERS=70000
;;
coco)
# This is a very long and slow training schedule
# You can probably use fewer iterations and reduce the
# time to the LR drop (set in the solver to 350,000 iterations).
TRAIN_IMDB="coco_2014_train"
TEST_IMDB="coco_2014_minival"
PT_DIR="coco"
ITERS=490000
;;
rrpn)
# This is a very long and slow training schedule
# You can probably use fewer iterations and reduce the
# time to the LR drop (set in the solver to 350,000 iterations).
TRAIN_IMDB="MSRA_TRAIN"
TEST_IMDB="MSRA_TEST"
PT_DIR="rrpn"
ITERS=490000
;;
rrpn_vehicle)
# This is a very long and slow training schedule
# You can probably use fewer iterations and reduce the
# time to the LR drop (set in the solver to 350,000 iterations).
TRAIN_IMDB="MSRA_TRAIN"
TEST_IMDB="MSRA_TEST"
PT_DIR="rrpn_vehicle"
ITERS=490000
;;
*)
echo "No dataset given"
exit
;;
esac
So, what should I do?
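The traceback shows get_imdb in lib/datasets/factory.py raising the KeyError, which suggests that the name the script passes as TEST_IMDB ("MSRA_TEST") was never registered there. The repository's factory.py is not shown on this page, so the following is only a stand-in that mimics a name-to-constructor lookup; `register`, `_sets`, and the lambda bodies are hypothetical.

```python
_sets = {}

def register(name, factory):
    """Associate a dataset name with a zero-argument constructor."""
    _sets[name] = factory

def get_imdb(name):
    """Look up a dataset by name, failing the same way as the traceback."""
    if name not in _sets:
        raise KeyError('Unknown dataset: {}'.format(name))
    return _sets[name]()

register('MSRA_TRAIN', lambda: 'stand-in imdb for MSRA_TRAIN')

try:
    get_imdb('MSRA_TEST')
except KeyError as err:
    print('lookup failed:', err)

# Registering the test split under the name the script uses fixes the lookup.
register('MSRA_TEST', lambda: 'stand-in imdb for MSRA_TEST')
print(get_imdb('MSRA_TEST'))
```

In practice this would mean either adding a test-split entry under that name in factory.py or changing TEST_IMDB in the shell script to a name that is already registered.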
When I run nms_test.py using the GPU, I get the result [0, 3]. When I run the same code using the CPU, I get [0, 1, 3].
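Hello, I am an undergraduate student and have recently been studying R-CNN-related work. I have previously trained CNN models with Caffe, and I would now like to clone your code to learn about R-CNN. My earlier training was done on Intel's DevCloud, so locally I only train small datasets on the CPU (my machine is a Mac). So I would like to ask: does the demo in your README have to be run in a GPU environment with CUDA driver support? When I run it on my Mac (which has a CUDA environment but no NVIDIA GPU driver support), I get an error.
The error message is as follows: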
nms/gpu_nms.cpp:2074:3: error: no matching function for call to '_nms'
_nms((&(*__Pyx_BufPtrStrided1d(__pyx_t_5numpy_int32_t *, __pyx_pybuffernd_keep.rcbuffer->pybuffer.buf, __pyx_t_10, __pyx_pybuffernd_keep.diminfo[0].strides))), (&__pyx_v_num_out), (&(...
^~~~
nms/gpu_nms.hpp:1:6: note: candidate function not viable: no known conversion from '__pyx_t_5numpy_int32_t ' (aka 'long ') to 'int ' for 1st argument
void _nms(int keep_out, int num_out, const float boxes_host, int boxes_num,
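It looks like a GPU driver problem.
So I would like to ask: can the demo you tested only be built successfully with make in a GPU environment that supports CUDA?
Many thanks.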
cannot import name rt_test_net
tools/test_net.py needs to call the rt_test_net function from lib/rotation/rt_test.py, but the function cannot be found.
@mjq11302010044 Hi, hello, I would like to ask,
Hello, I am evaluating the code on the official rrc webpage but the results are worse than in the paper. Is the code made by the authors or externally? Any reason for the decreased performance? :)
running on GPU.
Best Regards
Valentin
Hello, I am using my own training data. When I run the script ./experiments/scripts/faster_rcnn_end2end.sh 0 VGG16 rrpn, I get this error:
./experiments/scripts/faster_rcnn_end2end.sh: line 78: 38172 Floating point exception(core dumped) ./tools/train_net.py --gpu ${GPU_ID} --solver models/${PT_DIR}/${NET}/faster_rcnn_end2end/solver.prototxt --weights data/imagenet_models/${NET}.v2.caffemodel --imdb ${TRAIN_IMDB} --iters ${ITERS} --cfg experiments/cfgs/faster_rcnn_end2end.yml ${EXTRA_ARGS}.
How can I solve this error? Thanks.
Hello,
How do I create the dataset?
Download pre-computed RRPN detectors
Trained VGG16 model download link: https://drive.google.com/open?id=0B5rKZkZodGIsV2RJUjVlMjNOZkE
Is it possible to list the major changes in the version of Caffe you use? I want to know the potential issues or conflicts when merging it with a newer version of Caffe.