
Comments (9)

yhenon commented on August 15, 2024

Thanks for reporting this. I'm now aware of some issues with the tensorflow backend; they should be fixed over the next day or two.

from keras-spp.

yhenon commented on August 15, 2024

@fferroni yes, that's correct.
One advantage of RoiPooling is that you can pool at multiple sizes, since the output is all flat. For example, with RoiPooling you can pool at 1x1, 2x2 and 4x4 and flatten everything into a single array of length 1+4+16=21 per channel.
With RoiPoolingConv, on the other hand, a 7x7 output can't be combined with an output of a different size, since the resulting array would be uneven.
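To make the 1+4+16=21 arithmetic concrete, here is a minimal NumPy sketch of multi-scale max pooling over a single-channel RoI. The helper names (`pool_region`, `multi_scale_pool`) and the floor/ceil bin-edge rule are assumptions chosen for illustration; they are not taken from the keras-spp source, which may compute bin boundaries differently.

```python
import numpy as np

def pool_region(fmap, bins):
    """Max-pool a 2D (H, W) map into a bins x bins grid, returned flattened."""
    h, w = fmap.shape
    out = []
    for i in range(bins):
        # floor/ceil edges keep every bin non-empty (illustrative choice)
        r0 = int(np.floor(i * h / bins))
        r1 = int(np.ceil((i + 1) * h / bins))
        for j in range(bins):
            c0 = int(np.floor(j * w / bins))
            c1 = int(np.ceil((j + 1) * w / bins))
            out.append(fmap[r0:r1, c0:c1].max())
    return np.array(out)

def multi_scale_pool(fmap, pool_list=(1, 2, 4)):
    """Concatenate the flattened poolings at every grid size in pool_list."""
    return np.concatenate([pool_region(fmap, b) for b in pool_list])

# A hypothetical 8x8 single-channel RoI
roi = np.arange(64, dtype=float).reshape(8, 8)
vec = multi_scale_pool(roi)
print(vec.shape)  # (21,)
```

Because each grid is flattened before concatenation, grids of different sizes fit into one vector. A 7x7 map and a 4x4 map, kept as 2D maps, could not be stacked into one regular array in the same way.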

Note that I'm still working on this. I've just submitted an issue that needs to be fixed to get the tensorflow backend working.


yhenon commented on August 15, 2024

Hey,
The trick is to use TimeDistributed, which applies the wrapped layer to each RoI separately. It's kind of awkward, but I see no other way to do it. I am working on a prototype faster-rcnn implementation right now; you can see the model definition here.
Basically, the layers look like this:

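# Import paths are assumptions: Keras 1.x and the keras-spp package layout.
from keras.layers import Convolution2D, Dense, Flatten, TimeDistributed
from spp.RoiPoolingConv import RoiPoolingConv

def classifier(base_layers, input_rois, num_rois, nb_classes=21):
    pooling_regions = 7  # each RoI is pooled to a 7x7 feature map

    # Pool every RoI from the shared feature map
    out_roi_pool = RoiPoolingConv(pooling_regions, num_rois)([base_layers, input_rois])
    # TimeDistributed applies the wrapped layer to each RoI independently
    out_class = TimeDistributed(Convolution2D(...))(out_roi_pool)
    ...
    out_class = TimeDistributed(Flatten())(out_class)
    out_class = TimeDistributed(Dense(nb_classes, activation='softmax'),
                                name='dense_{}'.format(nb_classes))(out_class)

    return out_class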

For RoiPooling, use TimeDistributed(Dense(...)) instead of TimeDistributed(Convolution2D(...)), since its output is already flat.
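As a rough illustration of what wrapping a Dense layer in TimeDistributed amounts to, the NumPy sketch below applies one shared weight matrix to every RoI, first with an explicit loop and then in vectorised form. The sizes, the random weights and the `softmax` helper are all made up for the example; this is not Keras code and does not reproduce the layer's internals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 2 images, 4 RoIs each, 21 pooled features, 3 classes
batch, num_rois, features, nb_classes = 2, 4, 21, 3
x = rng.normal(size=(batch, num_rois, features))  # flat per-RoI feature vectors

# One shared weight matrix and bias, as a single dense layer would hold
W = rng.normal(size=(features, nb_classes))
b = np.zeros(nb_classes)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Explicit loop: the same dense layer applied to every RoI separately
looped = np.stack([
    np.stack([softmax(x[n, r] @ W + b) for r in range(num_rois)])
    for n in range(batch)
])

# Vectorised equivalent: matmul broadcasts over the (batch, RoI) axes
vectorised = softmax(x @ W + b)

print(looped.shape)                     # (2, 4, 3)
print(np.allclose(looped, vectorised))  # True
```

The point of the sketch is that the weights are shared across RoIs while each RoI gets its own class distribution, which is the per-RoI behaviour asked for in this thread.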


fferroni commented on August 15, 2024

Is RoiPooling essentially a "flattened" version of RoiPoolingConv for each RoI? In other words, does it return the RoI feature vectors rather than the RoI feature maps?


fferroni commented on August 15, 2024

Thanks. Yes, that sounds nice.
Given that you mention e.g. Fast RCNN in your readme, the output of this RoiPooling layer would then be passed through some FC layers and split into a softmax output and a bbox regressor output. But in Keras, how do you do this for multiple RoIs? The output of this RoiPooling layer would be (None, nb_roi, nb_channels * num_outputs_per_channel), but the classification and regression need to be done for each RoI. Maybe TimeDistributed?


fferroni commented on August 15, 2024

Ha! You beat me to it ;-)


yhenon commented on August 15, 2024

Ha, not really; mine is still very much a work in progress. It is quite a large task, so I would certainly be interested if you want to collaborate or submit some pull requests.


yhenon commented on August 15, 2024

I've finally fixed this. All tests should now pass for both the theano and tensorflow backends (tested with keras 1.2.2, theano 0.8.2, tensorflow 1.0.0). Note that there are small differences in how rounding is handled by default, so results may not be exactly reproducible between pure numpy, theano and tensorflow if the pooling sizes don't divide exactly (e.g. pooling a size of 16 down to a size of 3).
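A small sketch of why the 16-into-3 case is sensitive to rounding: the ideal bin boundaries fall at 16/3 and 32/3, so different rounding rules place the integer edges differently. The floor and round rules below are assumptions picked for illustration; the thread does not say which rule each backend actually applies.

```python
import numpy as np

def bin_edges(size, bins, rounder):
    """Integer bin boundaries for splitting `size` positions into `bins` bins."""
    return [int(rounder(i * size / bins)) for i in range(bins + 1)]

def max_pool_1d(values, edges):
    """Max over each [start, end) slice defined by consecutive edges."""
    return [float(values[a:b].max()) for a, b in zip(edges[:-1], edges[1:])]

size, bins = 16, 3
floor_edges = bin_edges(size, bins, np.floor)  # [0, 5, 10, 16]
round_edges = bin_edges(size, bins, np.round)  # [0, 5, 11, 16]

# A single activation at index 10 lands in different bins under each rule
values = np.zeros(size)
values[10] = 1.0

print(max_pool_1d(values, floor_edges))  # [0.0, 0.0, 1.0]
print(max_pool_1d(values, round_edges))  # [0.0, 1.0, 0.0]
```

When the input size is a multiple of the number of bins, both rules give the same edges and the discrepancy disappears.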

For a more complete example of using these layers, see the work on frcnn I've begun here:
https://github.com/yhenon/keras-frcnn

Please feel free to reopen this if you have issues.


kirk86 commented on August 15, 2024

@yhenon are you planning to submit a pull request to keras?

