
roi_pooling's Introduction

Pytorch ROIPooling

Welcome!

This is a generic implementation of ROIpooling operation used in the context of object detection.

Features

  • Modularized

  • JIT compilation with cupy

  • Works well with batches of images 😉
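To make the operation concrete, here is a minimal pure-Python sketch of ROI max pooling over a batch of feature maps. It is not this repository's CUDA kernel, only an illustration of Fast R-CNN-style binning. The function name `roi_max_pool` is hypothetical, and the ROI layout `[batch_index, x_min, y_min, x_max, y_max]` with inclusive bounds is an assumption.

```python
import math


def roi_max_pool(features, rois, output_size, spatial_scale=1.0):
    """Naive ROI max pooling (illustrative sketch, not the repo's kernel).

    features: nested lists shaped [batch][channel][height][width].
    rois: rows of [batch_index, x_min, y_min, x_max, y_max]; this layout and
          the inclusive bounds are assumptions made for the example.
    Returns nested lists shaped [num_rois][channel][out_h][out_w].
    """
    out_h, out_w = output_size
    height, width = len(features[0][0]), len(features[0][0][0])
    pooled = []
    for batch_index, x_min, y_min, x_max, y_max in rois:
        image = features[int(batch_index)]
        # Scale the box to feature-map coordinates (round half up).
        x0 = math.floor(x_min * spatial_scale + 0.5)
        y0 = math.floor(y_min * spatial_scale + 0.5)
        x1 = math.floor(x_max * spatial_scale + 0.5)
        y1 = math.floor(y_max * spatial_scale + 0.5)
        roi_h, roi_w = max(y1 - y0 + 1, 1), max(x1 - x0 + 1, 1)
        bin_h, bin_w = roi_h / out_h, roi_w / out_w
        roi_out = []
        for channel in image:
            rows = []
            for ph in range(out_h):
                # Vertical limits of this bin, clipped to the feature map.
                hs = min(max(y0 + math.floor(ph * bin_h), 0), height)
                he = min(max(y0 + math.ceil((ph + 1) * bin_h), 0), height)
                row = []
                for pw in range(out_w):
                    ws = min(max(x0 + math.floor(pw * bin_w), 0), width)
                    we = min(max(x0 + math.ceil((pw + 1) * bin_w), 0), width)
                    cells = [channel[h][w]
                             for h in range(hs, he) for w in range(ws, we)]
                    # An empty bin yields 0.
                    row.append(max(cells) if cells else 0.0)
                rows.append(row)
            roi_out.append(rows)
        pooled.append(roi_out)
    return pooled


# Two images, one channel each, 4x4 feature maps.
batch = [
    [[[float(r * 4 + c) for c in range(4)] for r in range(4)]],
    [[[float(100 + r * 4 + c) for c in range(4)] for r in range(4)]],
]
rois = [[0, 0, 0, 3, 3], [1, 0, 0, 1, 1]]
out = roi_max_pool(batch, rois, (2, 2))
print(out[0])  # [[[5.0, 7.0], [13.0, 15.0]]]
print(out[1])  # [[[100.0, 101.0], [104.0, 105.0]]]
```

The result has one entry per ROI, and each ROI reads from the image named by its first element, which is what lets a single call serve a whole batch.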

Getting started

We need the following requirements: cuda, pytorch==1.0.1, and cupy=5.1.0. Most of them are available from anaconda.org through trusted channels.

  1. Install anaconda or miniconda.

    Skip this if you already have miniconda or anaconda installed in your system.

  2. Create a new environment

    conda create -n pytorch-extensions python=3.7 pytorch cupy -c pytorch

    This step creates a conda environment called pytorch-extensions. If you change the name, keep in mind to update the next lines accordingly.

  3. conda activate pytorch-extensions

  4. python example.py

    Hopefully everything runs like a breeze.
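If `python example.py` fails right away, a quick check that the environment can locate the two Python packages may help narrow things down. The helper below is a hypothetical convenience, not part of this repository. It only looks the packages up without importing them, so it does not verify CUDA itself.

```python
import importlib.util


def missing_packages(names=("torch", "cupy")):
    """Return the packages from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("torch and cupy were found; try `python example.py` next.")
```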

Can I use it in Colab?

Sure, take a look at this notebook. It provides a guide for the setup and usage of the roi_pooling Function.

LICENSE

MIT

We highly appreciate that you leave attribution notes when you copy portions of this codebase in yours.

Did you like it?

Support me: give me a ⭐ in the GitHub banner or buy me a ☕/🍺. If you are in academia, I would appreciate it if you cited my research:

@article{EscorciaDJGS18,
  author    = {Victor Escorcia and
               Cuong Duc Dao and
               Mihir Jain and
               Bernard Ghanem and
               Cees Snoek},
  title     = {Guess Where? Actor-Supervision for Spatiotemporal Action Localization},
  journal   = {CoRR},
  volume    = {abs/1804.01824},
  year      = {2018},
  url       = {http://arxiv.org/abs/1804.01824},
  archivePrefix = {arXiv},
  eprint    = {1804.01824}
}

This implementation was built on top of the legendary Faster-RCNN which you must cite:

@article{RenHG017,
  author    = {Shaoqing Ren and
               Kaiming He and
               Ross B. Girshick and
               Jian Sun},
  title     = {Faster {R-CNN:} Towards Real-Time Object Detection with Region Proposal
               Networks},
  journal   = {{IEEE} Trans. Pattern Anal. Mach. Intell.},
  volume    = {39},
  number    = {6},
  pages     = {1137--1149},
  year      = {2017},
  url       = {https://doi.org/10.1109/TPAMI.2016.2577031},
  doi       = {10.1109/TPAMI.2016.2577031}
}

This was also possible thanks to Chainer and the easy-to-follow pyinn.

FAQs

Do I need to buy an anaconda license?

Of course not! You can do everything with virtual environments. Indeed, I would be pleased to accept a PR with a recipe for virtual environments.

Why anaconda?

In short, due to the last five letters.

Why another ROIpooling operation?

Well, I tried many C extensions mainly taken from this repo but those did not fit my purpose of ROIPooling over batches of images.

Why?

You can clearly see here that when the batch size is greater than 1, the output is zero.

Does that mean that they are useless?

Of course not! I noticed that FastRCNN uses a batch size of 1. Probably, they did not need a more general implementation.

Why didn't you remove the conditional?

I tried in one of the repos, but it failed. I even removed all the binaries and compiled again, but it still returned zeros. Thus, I just moved on and pursued my personal motivation:

I was really curious about launching cupy kernels using data from pytorch tensors. It is simply amazing. Moreover, it was a great experience to explore CUDA and pytorch.autograd.

roi_pooling's People

Contributors

escorciav


roi_pooling's Issues

Trying to understand the batch_size

Thank you for your code. I am a little confused about the procedure; sorry if my question is stupid.
I am looking at the test case that you provided.

So, based on my understanding, the batch_size in the output y is the number of ROIs. Am I correct?
If that is the case, what happens to the batch size of the input? In other words, let's say my input has shape torch.Size([2, 4, 12, 8]) and I have 4 ROIs, so the output will have size torch.Size([4, 4, 5, 7]). Should the output instead have size torch.Size([2, 4, 4, 5, 7])?
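The shapes quoted in this question are consistent with the usual ROI pooling convention, in which the first output dimension counts ROIs rather than images. The sketch below encodes only that shape arithmetic. The helper name is hypothetical, and the convention is assumed rather than confirmed from this repository's code.

```python
def roi_pooling_output_shape(input_shape, num_rois, output_size):
    """Shape of the pooled result for an NCHW input and `num_rois` boxes.

    Assumes one pooled map per ROI, so the input batch size does not
    appear in the output shape.
    """
    _batch_size, channels, _height, _width = input_shape
    out_h, out_w = output_size
    return (num_rois, channels, out_h, out_w)


print(roi_pooling_output_shape((2, 4, 12, 8), num_rois=4, output_size=(5, 7)))
# (4, 4, 5, 7)
```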

About the format of rois

Is your ROI format [index, xMin, yMin, xMax, yMax] or [index, yMin, xMin, yMax, xMax]?

Trying to understand the output dimension of ROI pooling

Thank you for your code. I have a question about it.
I have seen issue #6, but I still have questions.

In "example.py", the input dimension is torch.Size([4, 4, 12, 8]), and the output dimension is torch.Size([4, 4, 5, 7]).
I see "x_np = np.arange(batch_size * n_channels * input_size[0] * input_size[1], dtype=np.float32)".
I guess the input means that the number of feature maps is "4", and the dimension of every feature map is "4*12*8".
So the output dimension should be "4 (the number of feature maps) * 4 (roi_num) * 4*5*7 (feature map dimension after roi pooling)". That is, the dimension of every feature map after roi_pooling is "4 (roi_num) *4*5*7", and the number of feature maps is "4".
So the result should be "4*4*4*5*7".

But your output dimension is torch.Size([4, 4, 5, 7]), and I don't understand why. If that's the case, let's say the input dimension is torch.Size([100, 4, 12, 8]), where "100" means 100 samples. The output dimension is still torch.Size([4, 4, 5, 7]), so the number of samples has been dropped.

I hope you can help. Thank you very much.
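If each ROI row starts with the index of the image it belongs to (an assumption, based on the `index` field mentioned in the ROI-format question), the per-sample structure can be recovered by grouping output rows by that index. The helper name and the ROI values below are illustrative only.

```python
from collections import defaultdict


def group_rois_by_image(rois):
    """Map each image index to the positions of its ROIs in the pooled output.

    Assumes every ROI row starts with the index of the image it belongs to.
    """
    rows_per_image = defaultdict(list)
    for row, roi in enumerate(rois):
        rows_per_image[int(roi[0])].append(row)
    return dict(rows_per_image)


# Made-up ROIs: the first value is the image index, the rest are box coordinates.
rois = [[0, 1, 1, 6, 6], [2, 6, 2, 7, 11], [1, 3, 1, 5, 10], [0, 3, 3, 3, 3]]
print(group_rois_by_image(rois))  # {0: [0, 3], 2: [1], 1: [2]}
```

With such a mapping, rows 0 and 3 of the pooled output would belong to image 0, so the sample dimension is not lost, only flattened into the ROI dimension.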

Problem in running your code

I tried running your code on my system, but I ran into the following error when executing the command python example.py:
Traceback (most recent call last):
  File "example.py", line 1, in <module>
    import torch
  File "/home/pg2017/cse/17071016/.conda/envs/myenv/lib/python3.7/site-packages/torch/__init__.py", line 102, in <module>
    from torch._C import *
ImportError: /home/pg2017/cse/17071016/.conda/envs/myenv/lib/python3.7/site-packages/torch/lib/libtorch.so.1: undefined symbol: nvrtcGetProgramLogSize

Please address this issue. Waiting for a reply.

report error of cupy

ERROR: test_backward_gpu (__main__.TestROIPooling2D)

Traceback (most recent call last):
File "/usr/local/anaconda3/lib/python3.6/site-packages/cupy/cuda/compiler.py", line 241, in compile
nvrtc.compileProgram(self.ptr, options)
File "cupy/cuda/nvrtc.pyx", line 98, in cupy.cuda.nvrtc.compileProgram
File "cupy/cuda/nvrtc.pyx", line 108, in cupy.cuda.nvrtc.compileProgram
File "cupy/cuda/nvrtc.pyx", line 53, in cupy.cuda.nvrtc.check_status
cupy.cuda.nvrtc.NVRTCError: NVRTC_ERROR_COMPILATION (6)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "test/test_roi_pooling.py", line 80, in test_backward_gpu
self.check_backward(x_var, rois_var)
File "test/test_roi_pooling.py", line 69, in check_backward
raise_exception=True))
File "/usr/local/anaconda3/lib/python3.6/site-packages/torch/autograd/gradcheck.py", line 190, in gradcheck
output = _differentiable_outputs(func(*inputs))
File "/home1/machen/roi_pooling/roi_pooling/functions/roi_pooling.py", line 257, in roi_pooling_2d
return ROIPooling2d.apply(input, rois, output_size, spatial_scale)
File "/home1/machen/roi_pooling/roi_pooling/functions/roi_pooling.py", line 201, in forward
pooled_height=self.output_h, pooled_width=self.output_w)
File "cupy/util.pyx", line 48, in cupy.util.memoize.decorator.ret
File "/usr/local/anaconda3/lib/python3.6/site-packages/pyinn/utils.py", line 20, in load_kernel
kernel_code = cupy.cuda.compile_with_cache(code)
File "/usr/local/anaconda3/lib/python3.6/site-packages/cupy/cuda/compiler.py", line 164, in compile_with_cache
ptx = compile_using_nvrtc(source, options, arch)
File "/usr/local/anaconda3/lib/python3.6/site-packages/cupy/cuda/compiler.py", line 82, in compile_using_nvrtc
ptx = prog.compile(options)
File "/usr/local/anaconda3/lib/python3.6/site-packages/cupy/cuda/compiler.py", line 245, in compile
raise CompileException(log, self.src, self.name, options)
cupy.cuda.compiler.CompileException: /tmp/tmp_vrmafnj/kern.cu(7): error: identifier "None" is undefined

1 error detected in the compilation of "/tmp/tmp_vrmafnj/kern.cu".

======================================================================
ERROR: test_forward_gpu (__main__.TestROIPooling2D)

Traceback (most recent call last):
File "/usr/local/anaconda3/lib/python3.6/site-packages/cupy/cuda/compiler.py", line 241, in compile
nvrtc.compileProgram(self.ptr, options)
File "cupy/cuda/nvrtc.pyx", line 98, in cupy.cuda.nvrtc.compileProgram
File "cupy/cuda/nvrtc.pyx", line 108, in cupy.cuda.nvrtc.compileProgram
File "cupy/cuda/nvrtc.pyx", line 53, in cupy.cuda.nvrtc.check_status
cupy.cuda.nvrtc.NVRTCError: NVRTC_ERROR_COMPILATION (6)

Add colab-setup to README

Showcase the operation with a colab notebook.

This may not run because of #4. In case you are eager to see it in action, a quick fix would be:

  1. Update roi_pooling.functions.roi_pooling.py, replacing Dtype(argmax_data) with int on the lines mentioned in #4.
  2. Remove the old file in Colab.
  3. Add the file with the update mentioned in step 1.
  4. Enjoy 🍻

Fix Dtype for torch.int

For some reason, the previous versions of pytorch and cupy didn't complain when pyinn.utils.Dtype returned None as the data type for integer tensors.

We must amend these two lines, one in forward and one in backward.

We should ask pyinn to extend the Dtype function.
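One way such an extension could look is sketched below, assuming the kernel source is rendered from a string template so that a `None` type ends up verbatim in the CUDA code (which would match the `identifier "None" is undefined` error reported above). The mapping and the `cuda_ctype` name are hypothetical and not pyinn's API. Keys are the strings produced by `str(tensor.dtype)`, so the example runs without torch.

```python
from string import Template

# Hypothetical mapping from str(tensor.dtype) to a CUDA C type name.
_CUDA_CTYPES = {
    "torch.float32": "float",
    "torch.float64": "double",
    "torch.int32": "int",
    "torch.int64": "long long",
}


def cuda_ctype(dtype_name):
    """Return the CUDA C type for a dtype name, failing loudly if unknown."""
    try:
        return _CUDA_CTYPES[dtype_name]
    except KeyError:
        raise TypeError(f"no CUDA type registered for {dtype_name}") from None


# Toy kernel, only to show the substitution step.
kernel_template = Template("__global__ void fill(${Dtype}* out) { out[0] = 0; }")

source = kernel_template.substitute(Dtype=cuda_ctype("torch.int32"))
broken = kernel_template.substitute(Dtype=None)  # what a silent None produces

print(source)  # __global__ void fill(int* out) { out[0] = 0; }
print(broken)  # __global__ void fill(None* out) { out[0] = 0; }
```

Raising on an unknown dtype, instead of returning `None`, moves the failure from the NVRTC compiler to the Python call site, where it is easier to diagnose.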

How to use RoIpooling for a particular application

I am implementing the paper 'Perceptual GAN for small object detection', where I need to do RoI pooling on a feature map with the RoI bounding box coordinates already known. The main architecture is shown in the attached figure (Capture2).

Can you tell me what modifications to make to your code to do the same? I am trying to modify example.py like this:

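import numpy as np
import torch
from roi_pooling.functions.roi_pooling import roi_pooling_2d

# VGG16 and img are assumed to be defined elsewhere (e.g. a Keras-style model
# and a preprocessed image batch); they are not part of roi_pooling.
model = VGG16()
feature_map = model.predict(img)

# A NumPy array has no .cuda() method, so convert it to a torch tensor first.
# Note: the tensor may need to be permuted to (batch, channels, height, width).
feature_map = torch.from_numpy(np.asarray(feature_map, dtype='float32')).cuda()

# Each row starts with the index of the image that the box belongs to.
rois = torch.FloatTensor([[0, 0, 0, 1, 3], [0, 2, 2, 3, 3], [0, 1, 0, 3, 2]])
rois = rois.cuda()

rpooling = roi_pooling_2d(feature_map, rois, (7, 7), spatial_scale=0.6)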

Thanks in advance. Waiting for a reply.
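On the choice of spatial_scale: in Fast R-CNN-style ROI pooling, boxes given in image coordinates are multiplied by spatial_scale and rounded to land on feature-map cells, so the value normally equals one over the backbone's total stride. The sketch below illustrates that mapping. The helper name, the box values, and the `[index, x_min, y_min, x_max, y_max]` order are assumptions, not something confirmed for this repository.

```python
import math


def scale_roi(roi, spatial_scale):
    """Map [index, x_min, y_min, x_max, y_max] from image to feature-map cells."""
    index, *box = roi
    # Round half up, like C's round() for non-negative values; Python's
    # built-in round() uses banker's rounding and would differ at exact .5.
    return [int(index)] + [math.floor(v * spatial_scale + 0.5) for v in box]


# VGG16's last convolutional block has a total stride of 16.
print(scale_roi([0, 32, 48, 223, 175], spatial_scale=1 / 16))  # [0, 2, 3, 14, 11]
```

If the boxes are already expressed in feature-map coordinates, a spatial_scale of 1.0 would leave them unchanged.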
