Comments (14)
@lillyPJ You should change the batch size to 32. With a batch size of 8 and 50k iterations, training sees only 400,000 images, which is half an epoch over the 800,000 training images. This may be one of the problems.
By the way, did you use data augmentation?
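A quick way to check the epoch arithmetic in this comment, as a minimal sketch. The helper name `epochs_seen` is made up for illustration; the 800,000-image dataset size and the 50k iteration count are the figures quoted above.

```python
def epochs_seen(batch_size: int, iterations: int, dataset_size: int) -> float:
    """Return how many passes over the dataset the given training run makes."""
    return batch_size * iterations / dataset_size


DATASET_SIZE = 800_000  # number of training images quoted in the comment above
ITERATIONS = 50_000     # iteration count quoted in the comment above

for batch_size in (8, 16, 32):
    epochs = epochs_seen(batch_size, ITERATIONS, DATASET_SIZE)
    print(f"batch_size={batch_size}: {epochs} epochs")
```

The same formula shows that halving the batch size and doubling the iterations leaves the number of epochs unchanged.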
from textboxes.
Thanks for your reply, @MhLiao. I have changed the batch size to 16 (due to my GPU limit) and doubled the number of iterations (the learning-rate step was changed as well). But it did not work. Even with the batch size set to 32, training covers only about 2 epochs. Is that enough for training?
The data augmentation I used follows your paper and SSD:
layer {
  name: "data"
  type: "AnnotatedData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: false
    mean_value: 104
    mean_value: 117
    mean_value: 123
    resize_param {
      prob: 1
      resize_mode: WARP
      height: 300
      width: 300
      interp_mode: LINEAR
      interp_mode: AREA
      interp_mode: NEAREST
      interp_mode: CUBIC
      interp_mode: LANCZOS4
    }
    emit_constraint {
      emit_type: CENTER
    }
  }
  data_param {
    source: "examples/VGG/VGG_train_lmdb"
    batch_size: 16
    backend: LMDB
  }
  annotated_data_param {
    batch_sampler {
      max_sample: 1
      max_trials: 1
    }
    batch_sampler {
      sampler {
        min_scale: 0.3
        max_scale: 1.0
        min_aspect_ratio: 0.3
        max_aspect_ratio: 2.0
      }
      sample_constraint {
        min_jaccard_overlap: 0.1
      }
      max_sample: 1
      max_trials: 50
    }
    batch_sampler {
      sampler {
        min_scale: 0.3
        max_scale: 1.0
        min_aspect_ratio: 0.3
        max_aspect_ratio: 2.0
      }
      sample_constraint {
        min_jaccard_overlap: 0.3
      }
      max_sample: 1
      max_trials: 50
    }
    batch_sampler {
      sampler {
        min_scale: 0.3
        max_scale: 1.0
        min_aspect_ratio: 0.3
        max_aspect_ratio: 2.0
      }
      sample_constraint {
        min_jaccard_overlap: 0.5
      }
      max_sample: 1
      max_trials: 50
    }
    batch_sampler {
      sampler {
        min_scale: 0.3
        max_scale: 1.0
        min_aspect_ratio: 0.3
        max_aspect_ratio: 2.0
      }
      sample_constraint {
        min_jaccard_overlap: 0.7
      }
      max_sample: 1
      max_trials: 50
    }
    batch_sampler {
      sampler {
        min_scale: 0.3
        max_scale: 1.0
        min_aspect_ratio: 0.3
        max_aspect_ratio: 2.0
      }
      sample_constraint {
        min_jaccard_overlap: 0.9
      }
      max_sample: 1
      max_trials: 50
    }
    batch_sampler {
      sampler {
        min_scale: 0.3
        max_scale: 1.0
        min_aspect_ratio: 0.3
        max_aspect_ratio: 2.0
      }
      sample_constraint {
        max_jaccard_overlap: 1.0
      }
      max_sample: 1
      max_trials: 50
    }
    label_map_file: "data/VGG/labelmap.prototxt"
  }
}
from textboxes.
Pretraining (100k iterations on the VGG dataset) performance: recall = 0.67, precision = 0.62, f-measure = 0.65.
Training (4k iterations on ICDAR2013) performance: recall = 0.77, precision = 0.63, f-measure = 0.69.
It seems many words have multiple overlapping (but imprecise) bounding boxes.
from textboxes.
Could you provide your recall, precision, final train-loss of the pretraining stage and training stage?
from textboxes.
@lillyPJ The final performance is recall=0.74, precision=0.86, f-measure=0.80 when using 700*700 input images. You may try adjusting the detection threshold and the NMS threshold to achieve better performance.
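To make the two thresholds concrete, here is a minimal sketch of score filtering followed by greedy non-maximum suppression. It is not the repository's test code: the function names are hypothetical, and the default `nms_thresh=0.45` is only a placeholder value (the `score_thresh=0.6` default mirrors the single-scale score cutoff mentioned later in this thread).

```python
def iou(a, b):
    """Jaccard overlap of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, score_thresh=0.6, nms_thresh=0.45):
    """Apply a score threshold, then greedy NMS.

    dets is a list of (xmin, ymin, xmax, ymax, score) tuples.
    Both threshold defaults are illustrative assumptions, not tuned values.
    """
    # Drop low-confidence boxes, then visit the rest from highest score down.
    candidates = sorted(
        (d for d in dets if d[4] >= score_thresh),
        key=lambda d: d[4],
        reverse=True,
    )
    kept = []
    for det in candidates:
        # Keep a box only if it does not overlap a kept box too strongly.
        if all(iou(det[:4], k[:4]) <= nms_thresh for k in kept):
            kept.append(det)
    return kept


detections = [
    (0, 0, 10, 10, 0.9),
    (1, 1, 11, 11, 0.8),    # heavily overlaps the first box
    (50, 50, 60, 60, 0.7),
    (0, 0, 10, 10, 0.3),    # below the score threshold
]
print(filter_detections(detections))
```

Raising the score threshold tends to trade recall for precision, while lowering the NMS threshold removes more of the overlapping boxes described above.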
from textboxes.
@lillyPJ Hi, the text in the VGG synthetic data is oriented and its labels have 4 points, right? Can you tell me how to use this data for training? Thanks a lot!
from textboxes.
For simplicity, you can use xmin = min(x1, x2, x3, x4), ymin = min(y1, y2, y3, y4), xmax = max(x1, x2, x3, x4), ymax = max(y1, y2, y3, y4) for training. @lufo816
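The min/max rule above can be sketched as follows. The function name `quad_to_bbox` and the sample coordinates are made up for illustration; the actual label layout in the dataset is not shown here.

```python
def quad_to_bbox(xs, ys):
    """Return the axis-aligned box (xmin, ymin, xmax, ymax) enclosing a quadrilateral.

    xs and ys hold the four corner coordinates (x1..x4 and y1..y4).
    """
    return min(xs), min(ys), max(xs), max(ys)


# Example with made-up corner coordinates of a slightly rotated word.
xmin, ymin, xmax, ymax = quad_to_bbox([10, 50, 48, 8], [20, 25, 45, 40])
print(xmin, ymin, xmax, ymax)
```

For strongly rotated text, the enclosing box is noticeably larger than the word itself, which is the price of this simplification.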
from textboxes.
@lillyPJ Thanks!
from textboxes.
@lillyPJ Hi, can you tell me how to calculate the precision, recall and f-measure?
Could you provide me with the source code (MATLAB or Python)?
from textboxes.
@lillyPJ Hi,
In your step 1 and step 2, what test data and what test batch size did you use, respectively?
Did you split SynthText into train and test sets?
Did you use the paper's default training code (train_icdar13.py)?
from textboxes.
@MhLiao Hi,
Regarding the results you mentioned above ("@lillyPJ The final performance: recall=0.74, precision=0.86, f-measure=0.80 when use 700*700 input images. You may try to adjust the detection threshold and NMS threshold to achieve better performance."):
Which protocol did you use to get these results: the evaluation_nms.m file in your code, or the ICDAR 2013 protocol?
I tested TextBoxes_icdar13.caffemodel (the model you provided) under both protocols.
For single scale (700*700 input, score>0.6):
evaluation_nms.m: recall=0.7641, precision=0.8528, f-measure=0.7959
ICDAR 2013: recall=0.7273, precision=0.8276, f-measure=0.7742
For multiple scales (score>0.9):
evaluation_nms.m: recall=0.8292, precision=0.8764, f-measure=0.8562
ICDAR 2013: recall=0.8046, precision=0.8402, f-measure=0.8220
Performance
Using the given test code, you can achieve an F-measure of about 80% on ICDAR 2013 with a single scale.
Using the given multi-scale test code, you can achieve an F-measure of about 85% on ICDAR 2013 with non-maximum suppression.
It seems that only testing with evaluation_nms.m achieves these results.
P.S. I wrote the ICDAR 2013 protocol implementation used above myself, so I may have made mistakes in it.
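For reference, a minimal sketch of the f-measure as the harmonic mean of precision and recall. This covers only the final combination step; how detections are matched to ground truth is protocol-specific and not shown. The function name is hypothetical. The harmonic mean reproduces the two ICDAR 2013 rows above, while the evaluation_nms.m rows come out slightly different, so that script may combine the two values in another way.

```python
def f_measure(recall: float, precision: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# Recall/precision pairs taken from the ICDAR 2013 rows quoted above.
for recall, precision in [(0.7273, 0.8276), (0.8046, 0.8402)]:
    print(f"recall={recall}, precision={precision}, "
          f"f-measure={f_measure(recall, precision):.4f}")
```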
from textboxes.
@HelloTobe For single-scale input, you can upload the results to the ICDAR 2013 website for evaluation; for multi-scale input, apply NMS first and then upload the results to the same website. The website is: http://rrc.cvc.uab.es
from textboxes.
@MhLiao Thanks.
from textboxes.