
dsod's People

Contributors

liuzhuang13, szq0214


dsod's Issues

the last pooling size mismatch

The size of the last pooling layer is inconsistent with the paper.
In the paper:
the size of the pooling7 feature map = 1x1
In model_libs.py:
the size of the pooling7 feature map = 2x2

Also, the DSOD prediction layers differ from Figure 1 in the paper.

About the test GPU memory?

@szq0214 @liuzhuang13 When I test on video, DSOD300 occupies about 1500 MB of GPU memory, about the same as SSD_300_ResNet101. I have tried the official memory-optimized version of DenseNet, and its GPU memory footprint is not particularly large. Is this a problem with how this version is optimized?

Can not detect the small object?

Hi,
Thank you for your work. It is great.

Some questions:
1. The pretrained model cannot detect small objects.
2. Is it better than RON?

Thank you

Question about pooling layer in your DSOD300

Thanks for sharing your code. I want to re-implement this network in another framework, but the definition of the pooling layer there differs from the one in Caffe.

In Caffe, I think the output size is computed with a ceil function, as shown in most of your code. But at the end, I don't know why it becomes a floor function.

I mean that the process should be

300x300→150x150→75x75→38x38→19x19→10x10→5x5→3x3→2x2

But in your code,

model2 = add_bl_layer2(model1, 256, dropout, 1) # pooling4: 10x10
net.Third = model2
model3 = add_bl_layer2(model2, 128, dropout, 1) # pooling5: 5x5
net.Fourth = model3
model4 = add_bl_layer2(model3, 128, dropout, 1) # pooling6: 3x3
net.Fifth = model4
model5 = add_bl_layer2(model4, 128, dropout, 1) # pooling7: 1x1

I don't know why it goes 3x3→1x1. Could you give me some suggestions?
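The difference between the two roundings can be checked with a few lines of arithmetic. This is a minimal sketch, assuming every downsampling step is a pooling layer with kernel 2, stride 2, and no padding (the real network mixes convolutions and pooling, so the chain here is illustrative only). The helper names are made up for this example.

```python
import math

def caffe_pool_out(size, kernel=2, stride=2, pad=0):
    """Output size of a Caffe pooling layer, which rounds up."""
    out = int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1
    if pad > 0 and (out - 1) * stride >= size + pad:
        out -= 1  # Caffe drops a window that would start inside the padding
    return out

def floor_pool_out(size, kernel=2, stride=2, pad=0):
    """Same layer with round-down arithmetic, as some other frameworks use by default."""
    return (size + 2 * pad - kernel) // stride + 1

def chain(size, steps, fn):
    """Apply fn repeatedly and record every intermediate size."""
    sizes = [size]
    for _ in range(steps):
        sizes.append(fn(sizes[-1]))
    return sizes

ceil_sizes = chain(300, 8, caffe_pool_out)
floor_sizes = chain(300, 8, floor_pool_out)
print(ceil_sizes)   # [300, 150, 75, 38, 19, 10, 5, 3, 2]
print(floor_sizes)  # [300, 150, 75, 37, 18, 9, 4, 2, 1]
```

With round-up arithmetic the last map is 2x2, not 1x1, so the "pooling7: 1x1" comment in the quoted code does not match what a ceil-mode pooling of a 3x3 input produces.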

Transition w/o Pooling Layer size mismatch

It seems your model graph is inconsistent with the paper (Table 1, Output Size) for the Transition w/o Pooling Layers (1) and (2).
In the paper:
Transition w/o Pooling Layer (1) channel = 1120
Transition w/o Pooling Layer (2) channel = 1568
In the model graph:
Convolution49 num output = 1184
Convolution66 num output = 256

Also, I don't quite understand the purpose of Transition w/o Pooling Layer (1): you neither compress nor expand its number of filters (num input = num output), and you don't branch it out for prediction. By removing it (Convolution49 + BN50 + ReLU50) you would have a compact Dense Block (3+4) with 8 x 2 = 16 dense layers. So what is the reason for explicitly injecting such an extra (BN + ReLU + 1x1 Conv) block in between?

No such File or Directory - Core Dumped

I am getting the following message when running the training command:

python examples/dsod/DSOD300_pascal.py

I0312 18:13:06.186707 31109 layer_factory.hpp:77] Creating layer data
I0312 18:13:06.186813 31109 net.cpp:100] Creating Layer data
I0312 18:13:06.186830 31109 net.cpp:408] data -> data
I0312 18:13:06.186846 31109 net.cpp:408] data -> label
F0312 18:13:06.189031 31210 db_lmdb.hpp:15] Check failed: mdb_status == 0 (2 vs. 0) No such file or directory
*** Check failure stack trace: ***
@ 0x7f8d26f785cd google::LogMessage::Fail()
@ 0x7f8d26f7a433 google::LogMessage::SendToLog()
@ 0x7f8d26f7815b google::LogMessage::Flush()
@ 0x7f8d26f7ae1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f8d2784b770 caffe::db::LMDB::Open()
@ 0x7f8d2768a396 caffe::DataReader<>::Body::InternalThreadEntry()
@ 0x7f8d2767c465 caffe::InternalThread::entry()
@ 0x7f8d1cc4f5d5 (unknown)
@ 0x7f8d159ee6ba start_thread
@ 0x7f8d25fcf3dd clone
@ (nil) (unknown)
Aborted (core dumped)

Any help would be appreciated. All steps before running the training command were successful.
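The fatal line in the log comes from caffe::db::LMDB::Open, and status 2 is "No such file or directory", so the data layer is pointing at an LMDB path that does not exist from the directory where the script was launched. A small check like the following can confirm that before training starts. It is only a sketch: the two paths are hypothetical placeholders, and the real ones should be copied from the training script.

```python
import os

def missing_lmdb_dirs(paths):
    """Return the paths that do not look like an LMDB environment directory."""
    missing = []
    for path in paths:
        # An LMDB environment opened as a directory keeps its data in data.mdb.
        if not os.path.isfile(os.path.join(path, "data.mdb")):
            missing.append(path)
    return missing

if __name__ == "__main__":
    # Hypothetical locations; replace with the train/test LMDB paths from your script.
    candidates = [
        "examples/VOC0712/VOC0712_trainval_lmdb",
        "examples/VOC0712/VOC0712_test_lmdb",
    ]
    for path in missing_lmdb_dirs(candidates):
        print("LMDB not found:", path)
```

Relative paths are resolved against the current working directory, so running the check from the same directory as the training command matters.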

Video Detection

Hey guys!

Is it also possible to do video detection with this model, as in the SSD implementation by Wei Liu?

Best Wishes

The net that the Python script creates is different from every pre-trained model offered in this README.md?

Hi, I was trying to fine-tune a pre-trained model on my dataset, and I need to change the number of classes from 21 to 2. So I planned to modify the Python script instead of editing the prototxt files. But I found that the model the Python script creates is about "28.6M", which is different from every model offered in this repository. If I want to train a model with 2 classes, should I train it without a pre-trained model?

Many thanks!

Any Dockerfile?

Could you provide a Dockerfile? Does anyone have a Dockerfile?

How to prepare voc12 test lmdb

I want to know how to prepare the VOC12 test LMDB to run training on the VOC07++12 dataset. Can anyone help me? Thanks a lot.

About 1 Channel images

Hi,
I would like to train on my own dataset, which consists only of gray-level images. Could you tell me how I could adapt DSOD to work with only 1 channel? Thanks!

training on new dataset

Hello,

First of all I want to say thank you for releasing the code.
Can you please tell me if I can train with my own custom dataset?
Because it is not clear to me.

Thank you

Couldn't find any detections

When I run python DSOD300_pascal.py, I get many messages like: I0809 19:00:06.018213 8332 detection_output_layer.cu:113] Couldn't find any detections.
What should I do?

Have you tried DSOD512?

Hi,

Recently, I tried to train a DSOD512 version that follows the original SSD512 settings except for the DSOD backbone.

But the accuracy was not as good as DSOD300.

Have you tried to train DSOD512?

Thanks :)

No advantage over VGG-SSD?

SSD300S† 07+12 ✗ VGGNet Plain 46 26.3M 300 ×300 69.6
SSD300S† 07+12 ✗ VGGNet Dense 37 26.0M 300 ×300 70.4

In Table 4 of your paper, Dense-SSD seems to have no advantage over VGG-SSD: similar precision, but slower.

small batchsize has a lower mAP.

Hi, @szq0214:
I only have two GTX 1080 GPUs. I want to reproduce your GRP-DSOD results. When I change batch_size and accum_batch_size to 6 and 30, the mAP is just 63%. What should I do to get the results in your paper?
Thanks.
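For reference, the arithmetic that SSD-style training scripts typically apply to these two settings can be sketched as below. This assumes the script derives a per-GPU batch and an iter_size (gradient accumulation steps) from batch_size, accum_batch_size, and the GPU count; the function name is made up for this example and the exact formula in DSOD300_pascal.py should be checked against the script itself.

```python
import math

def solver_batching(batch_size, accum_batch_size, num_gpus):
    """Derive per-GPU batch size, iter_size, and the accumulated batch size."""
    per_gpu = int(math.ceil(float(batch_size) / num_gpus))
    iter_size = int(math.ceil(float(accum_batch_size) / (per_gpu * num_gpus)))
    effective = per_gpu * num_gpus * iter_size
    return per_gpu, iter_size, effective

print(solver_batching(6, 30, 2))  # (3, 5, 30)
```

With batch_size 6 on two GPUs, each forward pass sees 3 images per GPU, and gradients are accumulated over 5 passes. Accumulation restores the gradient batch of 30, but it does not change how many images each BatchNorm layer sees per forward pass, which may be relevant when training from scratch.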

NameError: name 'DSOD300_V3_Body' is not defined. How to deal with it?

When I'm training a DSOD model on VOC 07+12 with python examples/dsod/DSOD300_pascal.py, I encounter

Traceback (most recent call last):
File "examples/dsod/DSOD300_pascal.py", line 380, in
DSOD300_V3_Body(net, from_layer='data')
NameError: name 'DSOD300_V3_Body' is not defined

What should I do to deal with it? Thank you~

The training time

How long is your training time based on one TitanX GPU or 8 GPUs?

How to get VOC12 annotations?

Hi

I just wonder how you produced
VOC0712Plus_test_lmdb
We can download the images from the official website, but not the VOC12 annotations.
How did you compute the VOC12 mAP outside the evaluation platform without annotations?

DSOD Visualization Problem on Video Test

I have downloaded the DSOD_voc+coco model and modified the corresponding prototxt according to the video test in the SSD project. While it works well in the SSD project, the test fails when setting up the DSOD network, throwing the following error:

F0922 17:37:27.110465 13992 bbox_util.cpp:2197] Check failed: label < colors.size() (2 vs. 0)
*** Check failure stack trace: ***
    @     0x7f6b48c805cd  google::LogMessage::Fail()
    @     0x7f6b48c82433  google::LogMessage::SendToLog()
    @     0x7f6b48c8015b  google::LogMessage::Flush()
    @     0x7f6b48c82e1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f6b493cb414  caffe::VisualizeBBox<>()
    @     0x7f6b49765844  caffe::DetectionOutputLayer<>::Forward_gpu()
    @     0x7f6b494689e1  caffe::Net<>::ForwardFromTo()
    @     0x7f6b49468ad7  caffe::Net<>::Forward()
    @           0x4199a3  test()
    @           0x415aa5  main
    @     0x7f6b47688830  __libc_start_main
    @           0x416679  _start
    @              (nil)  (unknown)

And here is the modified part of the DSOD prototxt (I mainly modified the input layer and the detection output layer according to the SSD settings).
The input layer is:

layer {
  name: "data"
  type: "VideoData"
  top: "data"
  transform_param {
    mean_value: 104
    mean_value: 117
    mean_value: 123
    resize_param {
      prob: 1
      resize_mode: WARP
      height: 300
      width: 300
      interp_mode: LINEAR
    }
  }
  data_param {
    batch_size: 1
  }
  video_data_param {
    video_type: VIDEO
    video_file: "examples/videos/ILSVRC2015_train_00755001.mp4"
    skip_frames: 1
  }
}

And the detection layer is:

layer {
  name: "detection_out"
  type: "DetectionOutput"
  bottom: "mbox_loc"
  bottom: "mbox_conf_flatten"
  bottom: "mbox_priorbox"
  bottom: "data"
  top: "detection_out"
  include {
    phase: TEST
  }
  transform_param {
    mean_value: 104
    mean_value: 117
    mean_value: 123
    resize_param {
      prob: 1
      resize_mode: WARP
      height: 576
      width: 1024
      interp_mode: LINEAR
    }
  }
  detection_output_param {
    num_classes: 21
    share_location: true
    background_label_id: 0
    nms_param {
      nms_threshold: 0.449999988079
      top_k: 400
    }
    save_output_param {
      output_directory: "data/VOC0712/dsod_labelmap_voc.prototxt"
    }
    code_type: CENTER_SIZE
    keep_top_k: 200
    confidence_threshold: 0.00999999977648
    visualize: true
    visualize_threshold: 0.3
  }
}

Interestingly, when I turn off visualization by setting visualize: false, the network works, but I can't tell whether the results are right without the visualized video. Has anyone met the same problem, and how did you deal with it?

How to measure the inference time?

Hi,

I want to know how to measure the inference time.

Did you use the caffe time tool? Or did you measure the total time over the 4952 VOC test images?

Thanks in advance :)

Test on VOC2012

Hi, @szq0214. Sorry for bothering you again. Can you tell me what I should change to test on VOC2012? The default is 2007.

Sixth_norm_mbox_priorbox step parameter is wrong

Hi, with your changes to the SSD model, the last layer has a 2x2 spatial size, not 1x1 anymore. This is because the last 3×3×128 conv layer has padding 1, and the parallel pooling branch, with kernel size 2, also outputs a 2x2 feature map instead of 1x1. You can double-check this by reading the code in Caffe's conv_layer.cpp:

const int output_dim = (input_dim + 2 * pad_data[i] - kernel_extent) / stride_data[i] + 1;

output_dim = (3 + 2 * 1 - 3) / 2 + 1 = 2 / 2 + 1 = 1 + 1 = 2

Also, Caffe's output reflects this:

I1023 13:46:00.587738    43 net.cpp:100] Creating Layer Sixth
I1023 13:46:00.587746    43 net.cpp:434] Sixth <- Convolution77
I1023 13:46:00.587751    43 net.cpp:434] Sixth <- Convolution79
I1023 13:46:00.587757    43 net.cpp:408] Sixth -> Sixth
I1023 13:46:00.587786    43 net.cpp:150] Setting up Sixth
I1023 13:46:00.587792    43 net.cpp:157] Top shape: 2 256 2 2 (2048)

Given this, I think the step size in the Sixth_norm_mbox_priorbox should be 150 (= 300/2) instead of 300 (=300/1).

EDIT: I should also point out that I have made NO modification whatsoever to the source code.
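The same calculation can be reproduced outside Caffe. The sketch below reimplements the quoted output_dim formula and derives the prior box step from it, assuming a stride of 2 for the last 3x3 convolution (as in the worked numbers above) and a 300-pixel input. The helper names are made up for this example.

```python
def conv_out(size, kernel, stride, pad, dilation=1):
    """Caffe convolution output size (integer division rounds down)."""
    kernel_extent = dilation * (kernel - 1) + 1
    return (size + 2 * pad - kernel_extent) // stride + 1

def prior_step(image_size, feature_size):
    """Spacing between prior box centres, in input pixels."""
    return image_size / float(feature_size)

last = conv_out(3, kernel=3, stride=2, pad=1)
print(last, prior_step(300, last))  # 2 150.0
```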

Couldn't find any detections

I am getting the following message when running the training command:
python examples/dsod/DSOD300_pascal.py

32554 detection_output_layer.cu:113] Couldn't find any detections
32554 detection_output_layer.cu:113] Couldn't find any detections
32554 detection_output_layer.cu:113] Couldn't find any detections
32554 detection_output_layer.cu:113] Couldn't find any detections
32554 detection_output_layer.cu:113] Couldn't find any detections

The result after relu activation function isn't used in grp-dsod.

Thanks for sharing the code of GRP-DSOD. I read the code and found that the result of the ReLU is not used in this part.

def global_level(net, from_layer, relu_name):
    fc = L.InnerProduct(net[relu_name], num_output=1)
    sigmoid = L.Sigmoid(fc, in_place=True)
    att_name = "{}_att".format(from_layer)
    sigmoid = L.Reshape(sigmoid, reshape_param=dict(shape=dict(dim=[-1])))
    scale = L.Scale(net[att_name], sigmoid, axis=0, bias_term=False, bias_filler=dict(value=0))
    relu = L.ReLU(scale, in_place=True)
    residual = L.Eltwise(net[from_layer], scale)
    gatt_name = "{}_gate".format(from_layer)
    net[gatt_name] = residual
    return net

relu = L.ReLU(scale, in_place=True)
Is it a mistake? Or is it discarded?
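One way to reason about this: if the network definition is serialised by walking backwards from the tops that were assigned on net (which appears to be how pycaffe's NetSpec works, though it is worth confirming by inspecting the generated prototxt), then a layer whose output is held only in a local variable and consumed by nothing never gets emitted. The toy model below is NOT pycaffe; the Top class and reachable_layers function are invented purely to illustrate that reachability argument on the data flow of global_level.

```python
class Top(object):
    """Toy stand-in for a layer output; not the pycaffe class."""
    def __init__(self, layer_type, inputs=()):
        self.layer_type = layer_type
        self.inputs = list(inputs)

def reachable_layers(named_tops):
    """Collect layer types reachable by walking inputs back from the named tops."""
    seen, order = set(), []
    def visit(top):
        if id(top) in seen:
            return
        seen.add(id(top))
        for inp in top.inputs:
            visit(inp)
        order.append(top.layer_type)
    for top in named_tops:
        visit(top)
    return order

# Mirror the data flow of global_level (input names are illustrative).
feat = Top("Input:from_layer")
att = Top("Input:att")
relu_in = Top("Input:relu_name")
fc = Top("InnerProduct", [relu_in])
sigmoid = Top("Sigmoid", [fc])
reshape = Top("Reshape", [sigmoid])
scale = Top("Scale", [att, reshape])
relu = Top("ReLU", [scale])               # held in a local variable only
residual = Top("Eltwise", [feat, scale])  # consumes scale, not relu

emitted = reachable_layers([residual])
print("ReLU" in emitted)  # False
```

Under that assumption, the ReLU after the Scale layer would be silently dropped, because the Eltwise takes scale rather than relu as its input.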
