tl_ssd's Issues
About IoU
Model with offsets adaption
Hi,
Figure 6 in the paper shows that it could improve the performance a lot by using the adaption of offsets (section III-C in the paper).
I am wondering if the adaptions are the configuration mentioned in the README (i.e., the offset_w values), which I have pasted below. If so, is it possible to get the model structure and the weights? Thank you!
layer {
  name: "inception_b4_concat_norm_mbox_priorbox"
  type: "PriorBox"
  bottom: "inception_b4_concat_norm"
  bottom: "data"
  top: "inception_b4_concat_norm_mbox_priorbox"
  prior_box_param {
    min_size: 7
    min_size: 10
    min_size: 15
    min_size: 25
    min_size: 35
    min_size: 50
    min_size: 70
    aspect_ratio: 0.3
    flip: false
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset_w: 0.2
    offset_w: 0.4
    offset_w: 0.6
    offset_w: 0.8
    offset_h: 0.5
  }
}
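If I read the adaption correctly, each offset_w/offset_h value places a shifted copy of the prior grid, multiplying the number of priors per feature-map cell. A hedged sketch of that arithmetic (my own reading of the config, not the authors' code):

```python
# Hedged sketch: how the offset adaption scales the prior count.
# Assumption (mine): every (offset_w, offset_h) combination places a
# shifted copy of each base prior, so the offsets multiply the
# baseline number of priors per cell.
offsets_w = [0.2, 0.4, 0.6, 0.8]
offsets_h = [0.5]

offset_multiplier = len(offsets_w) * len(offsets_h)
print(offset_multiplier)  # 4x as many priors as without the adaption
```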
Question about the number of states
Hi,
I have a question about the number of states in your work. Based on my understanding of the documentation, I think there are 4 states in total:
Specify the number of states. Please note that an additional background state is predicted as well. In other words, if your dataset contains the states red, yellow, green, you have to set the num_states to 3 + 1 = 4.
However, based on the code, there are 6 states:
- In the prototxt of the model structure, it shows num_states: 4: https://github.com/julimueller/tl_ssd/blob/master/prototxt/deploy.prototxt#L3767
- The channel count of inception_b4_concat_norm_mbox_state is 42 (7 * 6), which also suggests that there are 6 states: https://github.com/julimueller/tl_ssd/blob/master/prototxt/deploy.prototxt#L3615
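For what it's worth, the channel arithmetic quoted above can be sanity-checked directly (7 priors per location is my reading, not confirmed from the prototxt):

```python
# Sanity check of the state-branch channel count quoted above.
# Assumption (mine): the conv output has one score per (prior, state) pair.
num_priors_per_location = 7   # my reading; not confirmed from the prototxt
num_states = 6
channels = num_priors_per_location * num_states
print(channels)  # 42, matching inception_b4_concat_norm_mbox_state
```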
If there are 6 states, what are the other two besides the red, yellow, green, and background? Thank you very much!
Traffic light dataset
Where did you find the traffic light dataset?
A question about batchnorm layer
Hi,
I would like to ask about the motivation for using use_global_stats=false for BatchNorm in deploy.prototxt. Based on the documentation, it is suggested to set it to true in the testing phase:
// If false, normalization is performed over the current mini-batch
// and global statistics are accumulated (but not yet used) by a moving
// average.
// If true, those accumulated mean and variance values are used for the
// normalization.
// By default, it is set to false when the network is in the training
// phase and true when the network is in the testing phase.
optional bool use_global_stats = 1;
Thank you!
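For reference, the testing-phase setting that the quoted documentation suggests would look something like the following (a sketch only; the layer and blob names here are hypothetical, not taken from deploy.prototxt):

```
layer {
  name: "conv1_bn"          # hypothetical layer name
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  batch_norm_param {
    use_global_stats: true  # use the accumulated mean/variance at test time
  }
}
```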
"caffe.PriorBoxParameter" has no field named "offset_w".
I used the DTLD dataset for testing and didn't change deploy.prototxt or the caffemodel.
But when ssd_dtld_test.py runs caffe.Net(), it raises this exception:
[libprotobuf ERROR google/protobuf/text_format.cc:274] Error parsing text-format caffe.NetParameter: 3668:13: Message type "caffe.PriorBoxParameter" has no field named "offset_w".
Can anyone help me?
Do we need the labelmap.prototxt file when we use the class "112340" as the label, as you mentioned for the training phase?
I want to prepare the DTLD dataset for training the model. I think I can use the class label (e.g. 112340) in the txt files directly, as you recommended, and then create lmdb-format data from lists of .jpg and .txt files. So I don't need the labelmap file.
By the way, do you use the same method for data augmentation as the original SSD? It would help a lot if you could share the data layer from your train.prototxt.
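On the 6-digit class IDs: judging from the state_digit: 4 setting shown in the README's MultiBoxLoss layer, one digit position of the class ID appears to encode the light state. A minimal sketch of that decoding (the digit-position convention here is my assumption, not the authors' documented scheme):

```python
# Hedged sketch: pulling a state digit out of a DTLD-style class ID.
# Assumption (mine): state_digit: 4 means the character at index 4
# of the 6-digit class string encodes the light state.
def state_from_class_id(class_id: str, state_digit: int = 4) -> int:
    """Return the digit at position `state_digit` of a class ID like '112340'."""
    return int(class_id[state_digit])

print(state_from_class_id("112340"))  # -> 4
```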
error: (-215:Assertion failed) (numPriors * _numLocClasses * 4) == total(inputs[0], 1)
I ran your code on my own image. I didn't change anything in your deploy.prototxt, and the input shape is correct, but I encountered an error when I ran net.forward():
[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3066)
cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively OPENCV/DNN: [DetectionOutput]:(detection_out): getMemoryShapes() throws exception. inputs=4 outputs=0/0
[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3069) cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively input[0] = [ 1 110236 ]
[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3069) cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively input[1] = [ 1 55118 ]
[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3069) cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively input[2] = [ 1 2 220472 ]
[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3069) cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively input[3] = [ 1 165354 ]
error: (-215:Assertion failed) (numPriors * _numLocClasses * 4) == total(inputs[0], 1) in function 'cv::dnn::DetectionOutputLayerImpl::getMemoryShapes'
Can you help me? I just want to reuse your code to test images, so I didn't change anything. Where am I wrong, or what should I modify in the code? Sorry for my bad English, and thank you so much.
Index out of bound error
python ssd_dtld_test.py --predictionmap_file /home/vishwa/Downloads/project/ssd_try5/tl_ssd/test_on_dtld/prediction_map_ssd_states.json --confidence 0.2 --deploy_file /home/vishwa/Downloads/project/ssd_try5/tl_ssd/prototxt/deploy.prototxt --caffemodel_file /home/vishwa/Downloads/project/ssd_try5/tl_ssd/caffemodel/SSD_DTLD_iter_90000.caffemodel --test_file /home/vishwa/Downloads/DTLD/label/Bochum_all.yml
('CHECK: ', 5)
Traceback (most recent call last):
File "ssd_dtld_test.py", line 201, in
main(parse_args())
File "ssd_dtld_test.py", line 156, in main
result = detection.detect(img_color, args.confidence)
File "ssd_dtld_test.py", line 85, in detect
det_xmin = detections[0,0,:,3 + num_states + 1]
IndexError: index 9 is out of bounds for axis 3 with size 7
@julimueller Since number of states is 5, I am getting index out of bound error.
Do you have any idea how to fix it?
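For what it's worth, the crash is plain index arithmetic: the script assumes each detection row carries num_states extra columns, but the blob's last axis only has the 7 standard SSD columns. A sketch of the mismatch (the row layout is my inference from the traceback, not verified against the code):

```python
# Shape implied by the traceback: last axis has only the 7 standard
# SSD columns [img_id, label, score, xmin, ymin, xmax, ymax].
detections_shape = (1, 1, 10, 7)   # hypothetical batch of 10 detections

num_states = 5
idx = 3 + num_states + 1           # = 9, the index the script computes
print(idx, detections_shape[3])    # 9 vs. 7 -> IndexError in the script
```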
Inference on different image sizes
Hello,
I want to run inference on a different dataset, consisting of 256x512 images, using the pretrained model from the repo.
Before that I tried 1024x768 images and found the model very robust, but with 256x512 images its accuracy dropped to nearly zero.
So to run inference on different image sizes, I am changing:
So what should I do? Is there anything that needs to be changed further?
Have a great day.
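One possible factor (a guess on my part, not a confirmed diagnosis): the PriorBox min_size values are in pixels, so on a much smaller image the same priors cover a far larger fraction of the frame than on the wide DTLD images the model was trained on. A quick sketch:

```python
# Hedged sketch: relative width of a 7 px prior on two image widths.
# Guesswork, not a confirmed explanation of the accuracy drop.
min_size_px = 7
for img_width in (2048, 512):
    # 2048 -> 0.0034, 512 -> 0.0137: a 4x larger relative prior
    print(f"{img_width}: prior covers {min_size_px / img_width:.4f} of the width")
```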
PyTorch replication
Hi,
I have replicated this work in PyTorch (py_tl_ssd). However, I cannot verify the correctness of my replication due to compilation issues with this project (CUDA driver problems). Is it possible to get tl_ssd's inference results on the DTLD dataset, so that I can compare the results and verify my replication? Thanks for your help!
forward pass got killed
Why we use raw_scale in preprocessing?
Hello everyone,
Thank you for this repo first of all.
- I am confused about why we are using pure SSD preprocessing, especially raw_scale, which is 255: https://github.com/weiliu89/caffe/blob/4817bf8b4200b35ada8ed0dc378dceaf38c539e4/python/caffe/io.py#L223
- As far as I know, it is used for images represented in the [0...1] range to scale them up to [0...255].
- I double-checked using DTLD data, and the pixel ranges are [0...255]. So by using this raw_scale we end up with odd values for the images:
array([[[11730., 11730., 11730., ..., 9435., 8415., 8415.],
[11730., 11730., 11730., ..., 9435., 8415., 8415.],
[11730., 11730., 11475., ..., 9180., 9945., 9945.],
...,
[10200., 10200., 10965., ..., 6885., 6885., 6885.],
[10200., 10200., 10200., ..., 6120., 5865., 5865.],
[ 9945., 9945., 9690., ..., 5100., 5100., 5100.]],
[[10455., 10455., 10965., ..., 16575., 13770., 13770.],
[10455., 10455., 10965., ..., 16575., 13770., 13770.],
[ 9690., 9690., 9435., ..., 14280., 15300., 15300.],
...,
[10455., 10455., 10200., ..., 11220., 10455., 10455.],
[10200., 10200., 9945., ..., 9435., 8670., 8670.],
[10455., 10455., 10200., ..., 7905., 8415., 8415.]],
[[ 8160., 8160., 8160., ..., 13260., 13260., 13260.],
[ 8160., 8160., 8160., ..., 13260., 13260., 13260.],
[ 7650., 7650., 7650., ..., 10200., 13005., 13005.],
...,
[ 8160., 8160., 8160., ..., 9435., 9180., 9180.],
[ 8160., 8160., 8160., ..., 7905., 7905., 7905.],
[ 8415., 8415., 8415., ..., 7140., 7395., 7395.]]],
dtype=float32)
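The odd values above are consistent with double scaling: caffe.io's Transformer multiplies by raw_scale, so uint8 pixels already in [0...255] get multiplied by 255 a second time. A sketch of the arithmetic (reading one dumped value back):

```python
# A dumped value of 11730 is consistent with a [0..255] pixel being
# multiplied by raw_scale=255 again.
raw_scale = 255
observed = 11730.0
original_pixel = observed / raw_scale
print(original_pixel)  # 46.0 -> a plausible uint8 intensity
```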
Forgive my curiosity, what am I missing here?
Have a great day.
How to train the model (tl_ssd) with our own dataset?
About max stride
In the paper, I don't understand this sentence: "In consequence, a maximum stride of 0.34·5 pixels = 1.7 pixels is needed to guarantee a detection of objects with a width of 5 pixels. As seen in Table I, only layers conv1 - conv3 can satisfy this condition." Can you explain it? I really want to know the answer. Thanks a lot.
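One way to read that sentence (a sketch of my own, not the paper's exact derivation): for two intervals of equal width w whose centers are offset by d, the 1-D overlap ratio is (w - d) / (w + d), and an offset of 0.34·w still leaves roughly a 0.5 overlap:

```python
# Hedged sketch (my reading, not the authors' derivation): 1-D IoU of
# two equal-width intervals whose centers differ by d.
def iou_1d(w: float, d: float) -> float:
    inter = max(0.0, w - d)
    union = w + d
    return inter / union

w = 5.0        # object width in pixels
d = 0.34 * w   # the paper's maximum stride, 1.7 px
print(round(iou_1d(w, d), 3))  # ~0.493: still roughly a 0.5 overlap
```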
Model structure question
Hi,
I have a question about the model structure. The paper shows that there are two inception_c blocks, and inception_a3, inception_b4, and inception_c2 are concatenated. However, the model file in the repo (https://github.com/julimueller/tl_ssd/blob/master/prototxt/deploy.prototxt) does not contain these. Did I misunderstand anything?
Thank you!
mAP results of the proposed SSD on DTLD
Hello,
First of all, thank you very much for your work and explanations.
I was wondering if you have mAP AP50 AP0.5:0.95 results?
Thanks again.
How do I compress the training data when write train script?
Hi:
I want to train the model using the DriveU traffic light dataset, but I don't know how to generate the input data. Would you please post your training script?
Thanks
Can't find the inception_c and concatenation part
Hi Julian, in your paper you use the concat layer from inception_a3, inception_b4 and inception_c2, but I didn't find these operations in deploy.prototxt. Are these net structures in train.prototxt only? And will you publish your training code later? Thanks for your work.
Message type "caffe.MultiBoxLossParameter" has no field named "state_digit".
Hi,
When I use the prior box adaptions AND state detection for DTLD, I get "Error parsing text-format caffe.NetParameter: 3918:16: Message type "caffe.MultiBoxLossParameter" has no field named "state_digit"." But I have already replaced the Caffe files with yours. What's the problem?
Thank you~
error in build "'FocalLossParameter' does not name a type"
Hi, julimueller,
I have a problem building the code. When I built it in step 2, it failed with the following:
root@104b0b60593a:/opt/caffe# make all
CXX src/caffe/solver.cpp
In file included from src/caffe/solver.cpp:9:0:
./include/caffe/util/bbox_util.hpp:311:69: error: 'FocalLossParameter' does not name a type
     const map<int, vector<NormalizedBBox> >& all_gt_bboxes, const FocalLossParameter focal_param,
                                                                   ^
./include/caffe/util/bbox_util.hpp:326:74: error: 'FocalLossParameter' does not name a type
     const int background_label_id, const ConfLossType loss_type, const FocalLossParameter focal_param,
                                                                        ^
In file included from src/caffe/solver.cpp:9:0:
./include/caffe/util/bbox_util.hpp:540:69: error: 'FocalLossParameter' does not name a type
     const map<int, vector<NormalizedBBox> >& all_gt_bboxes, const FocalLossParameter focal_param,
                                                                   ^
Makefile:575: recipe for target '.build_release/src/caffe/solver.o' failed
make: *** [.build_release/src/caffe/solver.o] Error 1
And I haven't found the definition of FocalLossParameter anywhere in the code; I think it's a new type in tl_ssd. Looking forward to your reply!
A train issue about "Number of priors must match number of location predictions."
Hi, I want to train the model using the DTLD dataset. So I generated a train.prototxt by adapting deploy.prototxt, following these 3 steps:
- change the input to:

layer {
  name: "data"
  type: "AnnotatedData"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    crop_h: 512
    crop_w: 2048
    mirror: true
    mean_value: 60
    mean_value: 60
    mean_value: 60
  }
  data_param {
    source: "/home/sc03/datasets/DLTD/Berlin/VOC0712/lmdb/VOC0712_trainval_lmdb"
    batch_size: 1
    backend: LMDB
  }
}

- add the MultiBoxLoss layer from your README:

layer {
  name: "mbox_loss"
  type: "MultiBoxLoss"
  bottom: "mbox_loc"
  bottom: "mbox_conf"
  bottom: "mbox_priorbox"
  bottom: "label"
  bottom: "mbox_state"
  top: "mbox_loss"
  include { phase: TRAIN }
  propagate_down: true
  propagate_down: true
  propagate_down: false
  propagate_down: false
  propagate_down: true
  loss_param { normalization: VALID }
  multibox_loss_param {
    loc_loss_type: SMOOTH_L1
    conf_loss_type: SOFTMAX
    loc_weight: 1.0
    num_classes: 2
    share_location: true
    match_type: PER_PREDICTION
    overlap_threshold: 0.3
    use_prior_for_matching: true
    background_label_id: 0
    use_difficult_gt: true
    neg_pos_ratio: 3.0
    neg_overlap: 0.5
    code_type: CENTER_SIZE
    ignore_cross_boundary_bbox: false
    mining_type: MAX_NEGATIVE
    state_weight: 1.0
    do_state_prediction: true
    num_states: 6
    background_state_id: 0
    state_digit: 4
    state_loss_type: LOGISTIC
  }
}

- apply the prior box adaptions as in your README:

layer {
  name: "inception_b4_concat_norm_mbox_priorbox"
  type: "PriorBox"
  bottom: "inception_b4_concat_norm"
  bottom: "data"
  top: "inception_b4_concat_norm_mbox_priorbox"
  prior_box_param {
    min_size: 7
    min_size: 10
    min_size: 15
    min_size: 25
    min_size: 35
    min_size: 50
    min_size: 70
    aspect_ratio: 0.3
    flip: false
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset_w: 0.2
    offset_w: 0.4
    offset_w: 0.6
    offset_w: 0.8
    offset_h: 0.5
  }
}
But when I train the network, I get this error:
multibox_loss_layer.cpp:242] Check failed: num_priors_ * loc_classes_ * 4 == bottom[0]->channels() (440944 vs. 110236) Number of priors must match number of location predictions.
Could you tell me what I should adapt to fix the error?
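The numbers in the failed check hint at the cause (my inference, not a confirmed fix): 440944 / 110236 = 4, which is exactly the number of offset_w values, so the PriorBox layers now emit 4x as many priors while the loc conv layers still predict for the original count. A sketch of that arithmetic:

```python
# Hedged sketch: the mismatch factor in the error message.
num_priors_times_4 = 440944   # num_priors_ * loc_classes_ * 4 from the check
loc_channels = 110236         # bottom[0]->channels() from the check
factor = num_priors_times_4 // loc_channels
print(factor)  # 4 == number of offset_w values in the prior box adaption
```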
Number of priors must match number of location predictions
Hello:
I compiled the code according to the README successfully.
But when I run ssd_dtld_test.py, I get this error:
F0215 12:27:08.644311 15733 detection_output_layer.cpp:164] Check failed: num_priors_ * num_loc_classes_ * 4 == bottom[0]->channels() (220472 vs. 110236) Number of priors must match number of location predictions.
Can anyone help me solve this problem?
Detection or classification
Hello Julian Müller,
I was testing the model to improve my understanding. I could use it to detect traffic lights, but I couldn't get classification to work. Do I need to change deploy.prototxt in order to handle classification?
Thanks,
Vishwa
Layer mbox_loss error when only use the prior box adaptions
When I use the original mbox_loss layer as you recommend, it prints:
I0613 03:33:26.476109 16066 net.cpp:434] mbox_state <- inception_b4_concat_norm_mbox_state_flat
I0613 03:33:26.476140 16066 net.cpp:408] mbox_state -> mbox_state
I0613 03:33:26.476225 16066 net.cpp:150] Setting up mbox_state
I0613 03:33:26.476246 16066 net.cpp:157] Top shape: 8 165354 (1322832)
I0613 03:33:26.476255 16066 net.cpp:165] Memory required for data: 19659154720
F0613 03:33:26.476299 16066 net.cpp:88] Check failed: layer_param.propagate_down_size() == layer_param.bottom_size() (5 vs. 4) propagate_down param must be specified either 0 or bottom_size times
But when I remove all the propagate_down params, it prints the following:
I0613 04:03:53.034931 16106 net.cpp:100] Creating Layer mbox_loss
I0613 04:03:53.034948 16106 net.cpp:434] mbox_loss <- mbox_loc
I0613 04:03:53.034972 16106 net.cpp:434] mbox_loss <- mbox_conf
I0613 04:03:53.035014 16106 net.cpp:434] mbox_loss <- mbox_priorbox
I0613 04:03:53.035034 16106 net.cpp:434] mbox_loss <- label
I0613 04:03:53.035068 16106 net.cpp:408] mbox_loss -> mbox_loss
F0613 04:03:53.035151 16106 layer.hpp:374] Check failed: ExactNumBottomBlobs() == bottom.size() (5 vs. 4) MultiBoxLoss Layer takes 5 bottom blob(s) as input.
This occurs when I only use the prior box adaptions and give all boxes the same class label. It seems that the mbox_loss layer must have 5 bottom blobs.
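Reading the two logs together: the first run has five propagate_down entries but only four bottoms (mbox_state is missing), and the second run confirms MultiBoxLoss expects five bottoms. A hedged guess at the fix, sketched from the README's layer (whether a state-free variant of the layer exists, I don't know):

```
layer {
  name: "mbox_loss"
  type: "MultiBoxLoss"
  bottom: "mbox_loc"
  bottom: "mbox_conf"
  bottom: "mbox_priorbox"
  bottom: "label"
  bottom: "mbox_state"    # fifth bottom; without it the counts mismatch
  top: "mbox_loss"
  propagate_down: true    # mbox_loc
  propagate_down: true    # mbox_conf
  propagate_down: false   # mbox_priorbox
  propagate_down: false   # label
  propagate_down: true    # mbox_state
  ...
}
```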