tl_ssd's Issues
About IoU
Model with offsets adaption
Hi,
Figure 6 in the paper shows that it could improve the performance a lot by using the adaption of offsets (section III-C in the paper).
I am wondering if the adaptions are the configuration mentioned in the README (i.e., the offset_w values), which I have pasted below. If so, is it possible to get the model structure and the weights? Thank you!
layer {
  name: "inception_b4_concat_norm_mbox_priorbox"
  type: "PriorBox"
  bottom: "inception_b4_concat_norm"
  bottom: "data"
  top: "inception_b4_concat_norm_mbox_priorbox"
  prior_box_param {
    min_size: 7
    min_size: 10
    min_size: 15
    min_size: 25
    min_size: 35
    min_size: 50
    min_size: 70
    aspect_ratio: 0.3
    flip: false
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset_w: 0.2
    offset_w: 0.4
    offset_w: 0.6
    offset_w: 0.8
    offset_h: 0.5
  }
}
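If I read the adaption correctly, each offset_w/offset_h value places a shifted copy of the prior grid, multiplying the number of priors per feature-map cell. A hedged sketch of that arithmetic (my own reading of the config, not the authors' code):

```python
# Hedged sketch: how the offset adaption scales the prior count.
# Assumption (mine): every (offset_w, offset_h) combination places a
# shifted copy of each base prior, so the offsets multiply the
# baseline number of priors per cell.
offsets_w = [0.2, 0.4, 0.6, 0.8]
offsets_h = [0.5]

offset_multiplier = len(offsets_w) * len(offsets_h)
print(offset_multiplier)  # 4x as many priors as without the adaption
```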
Question about the number of states
Hi,
I have a question about the number of states in your work. Based on my understanding of the documentation, I think there are 4 states in total:
Specify the number of states. Please note that an additional background state is predicted as well. In other words, if your dataset contains the states red, yellow, green, you have to set the num_states to 3 + 1 = 4.
However, based on the code, there are 6 states:
- In the prototxt of the model structure, it shows num_states: 4: https://github.com/julimueller/tl_ssd/blob/master/prototxt/deploy.prototxt#L3767
- The channel count of inception_b4_concat_norm_mbox_state is 42 (7 * 6), which also suggests that there are 6 states: https://github.com/julimueller/tl_ssd/blob/master/prototxt/deploy.prototxt#L3615
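For what it's worth, the channel arithmetic quoted above can be sanity-checked directly (7 priors per location is my reading, not confirmed from the prototxt):

```python
# Sanity check of the state-branch channel count quoted above.
# Assumption (mine): the conv output has one score per (prior, state) pair.
num_priors_per_location = 7   # my reading; not confirmed from the prototxt
num_states = 6
channels = num_priors_per_location * num_states
print(channels)  # 42, matching inception_b4_concat_norm_mbox_state
```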
If there are 6 states, what are the other two besides the red, yellow, green, and background? Thank you very much!
Traffic light dataset
Where did you find the traffic light dataset?
A question about batchnorm layer
Hi,
I would like to ask about the motivation for using use_global_stats=false for BatchNorm in deploy.prototxt. Based on the documentation, it is suggested to set it to true in the testing phase:
// If false, normalization is performed over the current mini-batch
// and global statistics are accumulated (but not yet used) by a moving
// average.
// If true, those accumulated mean and variance values are used for the
// normalization.
// By default, it is set to false when the network is in the training
// phase and true when the network is in the testing phase.
optional bool use_global_stats = 1;
Thank you!
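For reference, the testing-phase setting that the quoted documentation suggests would look something like the following (a sketch only; the layer and blob names here are hypothetical, not taken from deploy.prototxt):

```
layer {
  name: "conv1_bn"          # hypothetical layer name
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  batch_norm_param {
    use_global_stats: true  # use the accumulated mean/variance at test time
  }
}
```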
"caffe.PriorBoxParameter" has no field named "offset_w".
I used the DTLD dataset for testing and didn't change deploy.prototxt or the caffemodel.
But when ssd_dtld_test.py runs caffe.Net(), it raises this exception:
[libprotobuf ERROR google/protobuf/text_format.cc:274] Error parsing text-format caffe.NetParameter: 3668:13: Message type "caffe.PriorBoxParameter" has no field named "offset_w".
Can anyone help me?
Do we need the labelmap.prototxt file when we use the class "112340" as the label, as you mentioned for the training phase?
I want to prepare the DTLD dataset for training the model. I think I can use the class label (e.g. 112340) in the txt files directly, as you recommended, and then create lmdb-format data from lists of .jpg and .txt files. So I don't need the labelmap file.
By the way, do you use the same method for data augmentation as the original SSD? It would help a lot if you could share the data layer from your train.prototxt.
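On the 6-digit class IDs: judging from the state_digit: 4 setting shown in the README's MultiBoxLoss layer, one digit position of the class ID appears to encode the light state. A minimal sketch of that decoding (the digit-position convention here is my assumption, not the authors' documented scheme):

```python
# Hedged sketch: pulling a state digit out of a DTLD-style class ID.
# Assumption (mine): state_digit: 4 means the character at index 4
# of the 6-digit class string encodes the light state.
def state_from_class_id(class_id: str, state_digit: int = 4) -> int:
    """Return the digit at position `state_digit` of a class ID like '112340'."""
    return int(class_id[state_digit])

print(state_from_class_id("112340"))  # -> 4
```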
error: (-215:Assertion failed) (numPriors * _numLocClasses * 4) == total(inputs[0], 1)
I ran your code on my own image. I didn't change anything in your deploy.prototxt, and the input shape is correct, but I encountered an error when I ran net.forward():
[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3066)
cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively OPENCV/DNN: [DetectionOutput]:(detection_out): getMemoryShapes() throws exception. inputs=4 outputs=0/0
[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3069) cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively input[0] = [ 1 110236 ]
[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3069) cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively input[1] = [ 1 55118 ]
[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3069) cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively input[2] = [ 1 2 220472 ]
[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3069) cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively input[3] = [ 1 165354 ]
error: (-215:Assertion failed) (numPriors * _numLocClasses * 4) == total(inputs[0], 1) in function 'cv::dnn::DetectionOutputLayerImpl::getMemoryShapes'
Can you help me? I just want to reuse your code to test images, so I didn't change anything. Where am I wrong, or what should I modify in the code? Sorry for my bad English, and thank you so much.
Index out of bound error
python ssd_dtld_test.py --predictionmap_file /home/vishwa/Downloads/project/ssd_try5/tl_ssd/test_on_dtld/prediction_map_ssd_states.json --confidence 0.2 --deploy_file /home/vishwa/Downloads/project/ssd_try5/tl_ssd/prototxt/deploy.prototxt --caffemodel_file /home/vishwa/Downloads/project/ssd_try5/tl_ssd/caffemodel/SSD_DTLD_iter_90000.caffemodel --test_file /home/vishwa/Downloads/DTLD/label/Bochum_all.yml
('CHECK: ', 5)
Traceback (most recent call last):
File "ssd_dtld_test.py", line 201, in
main(parse_args())
File "ssd_dtld_test.py", line 156, in main
result = detection.detect(img_color, args.confidence)
File "ssd_dtld_test.py", line 85, in detect
det_xmin = detections[0,0,:,3 + num_states + 1]
IndexError: index 9 is out of bounds for axis 3 with size 7
@julimueller Since number of states is 5, I am getting index out of bound error.
Do you have any idea how to fix it?
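For what it's worth, the crash is plain index arithmetic: the script assumes each detection row carries num_states extra columns, but the blob's last axis only has the 7 standard SSD columns. A sketch of the mismatch (the row layout is my inference from the traceback, not verified against the code):

```python
# Shape implied by the traceback: last axis has only the 7 standard
# SSD columns [img_id, label, score, xmin, ymin, xmax, ymax].
detections_shape = (1, 1, 10, 7)   # hypothetical batch of 10 detections

num_states = 5
idx = 3 + num_states + 1           # = 9, the index the script computes
print(idx, detections_shape[3])    # 9 vs. 7 -> IndexError in the script
```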
Inference on different image sizes
Hello,
I want to run inference on a different dataset, consisting of 256x512 images, using the pretrained model from the repo.
Before that I tried 1024x768 images and found the model very robust, but with 256x512 images its accuracy dropped to nearly zero.
So to run inference on different image sizes, I am changing:
So what should I do? Is there anything that needs to be changed further?
Have a great day.
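One possible factor (a guess on my part, not a confirmed diagnosis): the PriorBox min_size values are in pixels, so on a much smaller image the same priors cover a far larger fraction of the frame than on the wide DTLD images the model was trained on. A quick sketch:

```python
# Hedged sketch: relative width of a 7 px prior on two image widths.
# Guesswork, not a confirmed explanation of the accuracy drop.
min_size_px = 7
for img_width in (2048, 512):
    # 2048 -> 0.0034, 512 -> 0.0137: a 4x larger relative prior
    print(f"{img_width}: prior covers {min_size_px / img_width:.4f} of the width")
```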
PyTorch replication
Hi,
I have replicated this work in PyTorch (py_tl_ssd). However, I cannot verify the correctness of my replication due to compilation issues with this project (CUDA driver problems). Is it possible to get tl_ssd's inference results on the DTLD dataset, so that I can compare the results and verify my replication? Thanks for your help!
forward pass got killed
Why we use raw_scale in preprocessing?
Hello everyone,
Thank you for this repo first of all.
- I am confused about why we are using pure SSD preprocessing, especially raw_scale, which is 255: https://github.com/weiliu89/caffe/blob/4817bf8b4200b35ada8ed0dc378dceaf38c539e4/python/caffe/io.py#L223
- As far as I know, it is used for images represented in the [0...1] range to scale them up to [0...255].
- I double-checked using DTLD data, and the pixel ranges are [0...255]. So by using this raw_scale we end up with odd values for the images:
array([[[11730., 11730., 11730., ..., 9435., 8415., 8415.],
[11730., 11730., 11730., ..., 9435., 8415., 8415.],
[11730., 11730., 11475., ..., 9180., 9945., 9945.],
...,
[10200., 10200., 10965., ..., 6885., 6885., 6885.],
[10200., 10200., 10200., ..., 6120., 5865., 5865.],
[ 9945., 9945., 9690., ..., 5100., 5100., 5100.]],
[[10455., 10455., 10965., ..., 16575., 13770., 13770.],
[10455., 10455., 10965., ..., 16575., 13770., 13770.],
[ 9690., 9690., 9435., ..., 14280., 15300., 15300.],
...,
[10455., 10455., 10200., ..., 11220., 10455., 10455.],
[10200., 10200., 9945., ..., 9435., 8670., 8670.],
[10455., 10455., 10200., ..., 7905., 8415., 8415.]],
[[ 8160., 8160., 8160., ..., 13260., 13260., 13260.],
[ 8160., 8160., 8160., ..., 13260., 13260., 13260.],
[ 7650., 7650., 7650., ..., 10200., 13005., 13005.],
...,
[ 8160., 8160., 8160., ..., 9435., 9180., 9180.],
[ 8160., 8160., 8160., ..., 7905., 7905., 7905.],
[ 8415., 8415., 8415., ..., 7140., 7395., 7395.]]],
dtype=float32)
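The odd values above are consistent with double scaling: caffe.io's Transformer multiplies by raw_scale, so uint8 pixels already in [0...255] get multiplied by 255 a second time. A sketch of the arithmetic (reading one dumped value back):

```python
# A dumped value of 11730 is consistent with a [0..255] pixel being
# multiplied by raw_scale=255 again.
raw_scale = 255
observed = 11730.0
original_pixel = observed / raw_scale
print(original_pixel)  # 46.0 -> a plausible uint8 intensity
```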
Forgive my curiosity, what am I missing here?
Have a great day.
How to train the model (tl_ssd) with our own dataset?
About max stride
In the paper, I don't understand this sentence: "In consequence, a maximum stride of 0.34·5 pixels = 1.7 pixels is needed to guarantee a detection of objects with a width of 5 pixels. As seen in Table I, only layers conv1 - conv3 can satisfy this condition." Can you explain it? I really want to know the answer. Thanks a lot.
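One way to read that sentence (a sketch of my own, not the paper's exact derivation): for two intervals of equal width w whose centers are offset by d, the 1-D overlap ratio is (w - d) / (w + d), and an offset of 0.34·w still leaves roughly a 0.5 overlap:

```python
# Hedged sketch (my reading, not the authors' derivation): 1-D IoU of
# two equal-width intervals whose centers differ by d.
def iou_1d(w: float, d: float) -> float:
    inter = max(0.0, w - d)
    union = w + d
    return inter / union

w = 5.0        # object width in pixels
d = 0.34 * w   # the paper's maximum stride, 1.7 px
print(round(iou_1d(w, d), 3))  # ~0.493: still roughly a 0.5 overlap
```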
Model structure question
Hi,
I have a question about the model structure. The paper shows that there are two inception_c blocks, and inception_a3, inception_b4, and inception_c2 are concatenated. However, the model file in the repo (https://github.com/julimueller/tl_ssd/blob/master/prototxt/deploy.prototxt) does not contain these. Did I misunderstand anything?
Thank you!
mAP results of the proposed SSD on DTLD
Hello,
First of all, thank you very much for your work and explanations.
I was wondering if you have mAP AP50 AP0.5:0.95 results?
Thanks again.
How do I compress the training data when write train script?
Hi:
I want to train the model using the DriveU traffic light dataset, but I don't know how to generate the input data. Would you please post your training script?
Thanks
Can't find the inception_c and concatenation part
Hi Julian, in your paper you use the concat layer from inception_a3, inception_b4 and inception_c2, but I didn't find these operations in deploy.prototxt. Are these net structures in train.prototxt only? And will you publish your training code later? Thanks for your work.
Message type "caffe.MultiBoxLossParameter" has no field named "state_digit".
Hi,
When I use the prior box adaptions AND state detection for DTLD, I get "Error parsing text-format caffe.NetParameter: 3918:16: Message type "caffe.MultiBoxLossParameter" has no field named "state_digit"." But I have already replaced the Caffe files with yours. What's the problem?
Thank you~
error in build "'FocalLossParameter' does not name a type"
Hi, julimueller,
I have a problem building the code. When I built it in step 2, it failed with the following:
root@104b0b60593a:/opt/caffe# make all
CXX src/caffe/solver.cpp
In file included from src/caffe/solver.cpp:9:0:
./include/caffe/util/bbox_util.hpp:311:69: error: 'FocalLossParameter' does not name a type
     const map<int, vector<NormalizedBBox> >& all_gt_bboxes, const FocalLossParameter focal_param,
                                                                   ^
./include/caffe/util/bbox_util.hpp:326:74: error: 'FocalLossParameter' does not name a type
     const int background_label_id, const ConfLossType loss_type, const FocalLossParameter focal_param,
                                                                        ^
In file included from src/caffe/solver.cpp:9:0:
./include/caffe/util/bbox_util.hpp:540:69: error: 'FocalLossParameter' does not name a type
     const map<int, vector<NormalizedBBox> >& all_gt_bboxes, const FocalLossParameter focal_param,
                                                                   ^
Makefile:575: recipe for target '.build_release/src/caffe/solver.o' failed
make: *** [.build_release/src/caffe/solver.o] Error 1
And I haven't found the definition of FocalLossParameter anywhere in the code; I think it's a new type in tl_ssd. Looking forward to your reply!
A train issue about "Number of priors must match number of location predictions."
Hi, I want to train the model using the DTLD dataset. So I generated a train.prototxt by adapting deploy.prototxt, following these 3 steps:
- change the input to:

layer {
  name: "data"
  type: "AnnotatedData"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    crop_h: 512
    crop_w: 2048
    mirror: true
    mean_value: 60
    mean_value: 60
    mean_value: 60
  }
  data_param {
    source: "/home/sc03/datasets/DLTD/Berlin/VOC0712/lmdb/VOC0712_trainval_lmdb"
    batch_size: 1
    backend: LMDB
  }
}

- add the MultiBoxLoss layer from your README:

layer {
  name: "mbox_loss"
  type: "MultiBoxLoss"
  bottom: "mbox_loc"
  bottom: "mbox_conf"
  bottom: "mbox_priorbox"
  bottom: "label"
  bottom: "mbox_state"
  top: "mbox_loss"
  include { phase: TRAIN }
  propagate_down: true
  propagate_down: true
  propagate_down: false
  propagate_down: false
  propagate_down: true
  loss_param { normalization: VALID }
  multibox_loss_param {
    loc_loss_type: SMOOTH_L1
    conf_loss_type: SOFTMAX
    loc_weight: 1.0
    num_classes: 2
    share_location: true
    match_type: PER_PREDICTION
    overlap_threshold: 0.3
    use_prior_for_matching: true
    background_label_id: 0
    use_difficult_gt: true
    neg_pos_ratio: 3.0
    neg_overlap: 0.5
    code_type: CENTER_SIZE
    ignore_cross_boundary_bbox: false
    mining_type: MAX_NEGATIVE
    state_weight: 1.0
    do_state_prediction: true
    num_states: 6
    background_state_id: 0
    state_digit: 4
    state_loss_type: LOGISTIC
  }
}

- apply the prior box adaptions as in your README:

layer {
  name: "inception_b4_concat_norm_mbox_priorbox"
  type: "PriorBox"
  bottom: "inception_b4_concat_norm"
  bottom: "data"
  top: "inception_b4_concat_norm_mbox_priorbox"
  prior_box_param {
    min_size: 7
    min_size: 10
    min_size: 15
    min_size: 25
    min_size: 35
    min_size: 50
    min_size: 70
    aspect_ratio: 0.3
    flip: false
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset_w: 0.2
    offset_w: 0.4
    offset_w: 0.6
    offset_w: 0.8
    offset_h: 0.5
  }
}
But when I train the network, I get this error:
multibox_loss_layer.cpp:242] Check failed: num_priors_ * loc_classes_ * 4 == bottom[0]->channels() (440944 vs. 110236) Number of priors must match number of location predictions.
Could you tell me what I should adapt to fix the error?
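The numbers in the failed check hint at the cause (my inference, not a confirmed fix): 440944 / 110236 = 4, which is exactly the number of offset_w values, so the PriorBox layers now emit 4x as many priors while the loc conv layers still predict for the original count. A sketch of that arithmetic:

```python
# Hedged sketch: the mismatch factor in the error message.
num_priors_times_4 = 440944   # num_priors_ * loc_classes_ * 4 from the check
loc_channels = 110236         # bottom[0]->channels() from the check
factor = num_priors_times_4 // loc_channels
print(factor)  # 4 == number of offset_w values in the prior box adaption
```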
Number of priors must match number of location predictions
Hello:
I compiled the code according to the README successfully.
But when I run ssd_dtld_test.py, I get this error:
F0215 12:27:08.644311 15733 detection_output_layer.cpp:164] Check failed: num_priors_ * num_loc_classes_ * 4 == bottom[0]->channels() (220472 vs. 110236) Number of priors must match number of location predictions.
Can anyone help me solve this problem?
Detection or classification
Hello Julian Müller,
I was testing the model to improve my understanding. I could use it to detect traffic lights, but I couldn't get classification to work. Do I need to change deploy.prototxt in order to handle classification?
Thanks,
Vishwa
Layer mbox_loss error when only use the prior box adaptions
When I use the original mbox_loss layer as you recommend, it prints:
I0613 03:33:26.476109 16066 net.cpp:434] mbox_state <- inception_b4_concat_norm_mbox_state_flat
I0613 03:33:26.476140 16066 net.cpp:408] mbox_state -> mbox_state
I0613 03:33:26.476225 16066 net.cpp:150] Setting up mbox_state
I0613 03:33:26.476246 16066 net.cpp:157] Top shape: 8 165354 (1322832)
I0613 03:33:26.476255 16066 net.cpp:165] Memory required for data: 19659154720
F0613 03:33:26.476299 16066 net.cpp:88] Check failed: layer_param.propagate_down_size() == layer_param.bottom_size() (5 vs. 4) propagate_down param must be specified either 0 or bottom_size times
But when I remove all the propagate_down params, it prints the following:
I0613 04:03:53.034931 16106 net.cpp:100] Creating Layer mbox_loss
I0613 04:03:53.034948 16106 net.cpp:434] mbox_loss <- mbox_loc
I0613 04:03:53.034972 16106 net.cpp:434] mbox_loss <- mbox_conf
I0613 04:03:53.035014 16106 net.cpp:434] mbox_loss <- mbox_priorbox
I0613 04:03:53.035034 16106 net.cpp:434] mbox_loss <- label
I0613 04:03:53.035068 16106 net.cpp:408] mbox_loss -> mbox_loss
F0613 04:03:53.035151 16106 layer.hpp:374] Check failed: ExactNumBottomBlobs() == bottom.size() (5 vs. 4) MultiBoxLoss Layer takes 5 bottom blob(s) as input.
This occurs when I only use the prior box adaptions and give all boxes the same class label. It seems that the mbox_loss layer must have 5 bottom blobs.
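Reading the two logs together: the first run has five propagate_down entries but only four bottoms (mbox_state is missing), and the second run confirms MultiBoxLoss expects five bottoms. A hedged guess at the fix, sketched from the README's layer (whether a state-free variant of the layer exists, I don't know):

```
layer {
  name: "mbox_loss"
  type: "MultiBoxLoss"
  bottom: "mbox_loc"
  bottom: "mbox_conf"
  bottom: "mbox_priorbox"
  bottom: "label"
  bottom: "mbox_state"    # fifth bottom; without it the counts mismatch
  top: "mbox_loss"
  propagate_down: true    # mbox_loc
  propagate_down: true    # mbox_conf
  propagate_down: false   # mbox_priorbox
  propagate_down: false   # label
  propagate_down: true    # mbox_state
  ...
}
```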