nvlabs / ssn_superpixels Goto Github PK


Superpixel Sampling Networks (ECCV2018)

Home Page: https://varunjampani.github.io/ssn/

License: Other

Python 61.08% Shell 0.61% C++ 17.85% Cuda 20.46%

eccv-2018 superpixels superpixel-segmentation superpixel-algorithm

ssn_superpixels's Introduction

Superpixel Sampling Networks

This is the code accompanying the ECCV 2018 publication on Superpixel Sampling Networks. Please visit the project website for more details about the paper and overall methodology.

License

Installation

Caffe Installation

Go to the 'lib' folder if you are not already there:

cd ssn_superpixels/lib/

We use layers from the 'Video Propagation Networks' Caffe repository and add additional layers for SSN superpixels:

git clone https://github.com/varunjampani/video_prop_networks.git

Manually copy all the source files (files in lib/include and lib/src folders) to the corresponding locations in the caffe repository. In the ssn_superpixels/lib directory:

cp src/caffe/layers/* video_prop_networks/lib/caffe/src/caffe/layers/.
cp src/caffe/test/* video_prop_networks/lib/caffe/src/caffe/test/.
cp src/caffe/proto/caffe.proto video_prop_networks/lib/caffe/src/caffe/proto/caffe.proto
cp include/caffe/layers/* video_prop_networks/lib/caffe/include/caffe/layers/.

Install Caffe following the installation instructions. In the ssn_superpixels/lib directory:

cd video_prop_networks/lib/caffe/
mkdir build
cd build
cmake ..
make -j
cd ../../../..

Note: If you install Caffe in some other folder, update CAFFEDIR in config.py accordingly.

Install a cython file

We use a cython script taken from 'scikit-image' for enforcing connectivity in superpixels. To compile this:

cd lib/cython/
python setup.py install --user
cd ../..

Usage: BSDS segmentation

Data download

Download the BSDS dataset into the data folder:

cd data
sh get_bsds.sh
cd ..

Superpixel computation

First download the trained segmentation models using the get_models.sh script in the models folder:

cd models
sh get_models.sh
cd ..

Use compute_ssn_spixels.py to compute superpixels on the BSDS dataset:

python compute_ssn_spixels.py --datatype TEST --n_spixels 100 --num_steps 10 --caffemodel ./models/ssn_bsds_model.caffemodel --result_dir ./bsds_100/

You can change the number of superpixels by changing the n_spixels argument above, and you can update the datatype to TRAIN or VAL to compute superpixels on the corresponding data splits.
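
To sweep several superpixel counts, the same command can be generated in a loop. This is a minimal sketch, assuming it is run from the repository root; the helper name `build_command` and the `./bsds_<n>/` output naming are illustrative and not part of the repository.

```python
def build_command(n_spixels, datatype="TEST", num_steps=10,
                  caffemodel="./models/ssn_bsds_model.caffemodel"):
    """Return the compute_ssn_spixels.py invocation as an argument list."""
    return [
        "python", "compute_ssn_spixels.py",
        "--datatype", datatype,
        "--n_spixels", str(n_spixels),
        "--num_steps", str(num_steps),
        "--caffemodel", caffemodel,
        # Hypothetical naming scheme: one result folder per superpixel count.
        "--result_dir", "./bsds_{}/".format(n_spixels),
    ]


if __name__ == "__main__":
    for n in (100, 200, 500):
        # Print the command; pass the list to subprocess.run(cmd, check=True) to execute it.
        print(" ".join(build_command(n)))
```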

If you want to compute superpixels on other datasets, update config.py accordingly.

Evaluation

For superpixel evaluation, we use scripts from here for computing ASA score and scripts from here for computing Precision-Recall and other evaluation metrics.

Training

Use train_ssn.py to train on the BSDS training dataset:

python train_ssn.py --l_rate=0.0001 --num_steps=10

Citation

Please consider citing the paper below if you make use of this work and/or the corresponding code:

@inproceedings{jampani18ssn,
	title = {Superpixel Sampling Networks},
	author={Jampani, Varun and Sun, Deqing and Liu, Ming-Yu and Yang, Ming-Hsuan and Kautz, Jan},
	booktitle = {European Conference on Computer Vision (ECCV)},
	month = September,
	year = {2018}
}


ssn_superpixels's Issues

I cannot understand “n*9” mentioned in the paper

Hi.
I am trying to implement this in PyTorch, but I cannot understand how to calculate Q.

The paper says
“we constrain the distance computations from each pixel to only 9 surrounding superpixels”
This confuses me.

Is there any chance that more or fewer than 9 superpixel centers fall inside the red box?
When Q is n x 9 and I is n x k (k is the feature dimension), QI is 9 x k, right? If so, how can I get the m x k superpixel centers?

I apologize for my poor English and thank you for your help.
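
One way to read the n x 9 matrix, sketched in NumPy: each row of Q holds the 9 nonzero entries of one row of a sparse n x m association matrix, so the m x k centers come from a scatter-add over superpixel indices rather than from a plain product of an n x 9 and an n x k matrix. The sketch assumes each pixel stores the indices of its 9 candidate superpixels (the 3 x 3 block of grid cells around its initial cell) and that candidates falling outside the image get zero weight. The names `neighbor_idx` and `update_centers` are illustrative and do not come from the repository.

```python
import numpy as np


def update_centers(pixel_feats, Q9, neighbor_idx, m):
    """Compute (m, k) superpixel centers from sparse soft associations.

    pixel_feats:  (n, k) pixel features.
    Q9:           (n, 9) soft association of each pixel to its 9 candidates.
    neighbor_idx: (n, 9) integer superpixel index of each candidate.
    """
    n, k = pixel_feats.shape
    weighted_sum = np.zeros((m, k))
    weight_total = np.zeros(m)
    for j in range(Q9.shape[1]):
        idx = neighbor_idx[:, j]
        # Scatter-add: repeated indices accumulate instead of overwriting.
        np.add.at(weighted_sum, idx, Q9[:, j:j + 1] * pixel_feats)
        np.add.at(weight_total, idx, Q9[:, j])
    # Column normalization: divide by the total weight each superpixel received.
    return weighted_sum / np.maximum(weight_total, 1e-12)[:, None]
```

With one-hot associations this reduces to the per-superpixel mean of the pixel features, which is the hard k-means/SLIC update.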

F0630 15:37:53.939426 12256 math_functions.cu:79] Check failed: error == cudaSuccess (74 vs. 0) misaligned address * Check failure stack trace: *

about the label and loss

The article says that R can be a semantic label (as a one-hot encoding) or an optical flow map,
but R does not appear anywhere in Algorithm 1.
I don't know what the data X and the label Y are for backpropagation in the network, and the loss ||R-R*|| in the article looks like an autoencoder loss, which doesn't need any label.
Is R the same as I or F in Algorithm 1?

About some custom caffe layer ?

Hi, @varunjampani
Thanks for sharing your great code. I have run the training code and got good results, but I'm not very familiar with Caffe, so could you please explain what the custom Caffe layers 'L.RelToAbsIndex' and 'L.Smear' do?

Some confusion about training with label

Hi, thanks for your nice work. I am very interested in it. After reading your paper, I have some points I want to confirm:
In your paper, you train the SSN with L=Lr+Lc, right? Lr is the reconstruction loss, which needs the segmentation ground truth (labels).
But a traditional algorithm such as SLIC doesn't need labels, right?

So can I understand it this way: SLIC is an unsupervised approach, while SSN is a supervised (or semi-supervised) approach to superpixels?

This is my question. Looking forward to your reply!

Question of the function of spix_init

Hi, thanks for your work.
I have a question about spix_init in create_ssn.py. It seems to be an unchangeable constant each time it is called (e.g. in Passoc_layer.cu and spixel_feature2_layer.cu, it is a const Dtype* and named index_data).
As I understand it, spix_init provides the valid labels of the superpixels surrounding the corresponding superpixel when accessing a pixel. However, in each iteration the label of each pixel may change as the distances between it and the surrounding cluster centers are recomputed, meaning the pixel may no longer belong to its initial superpixel after several iterations.
For instance (this may not be accurate), a pixel belongs to superpixel #57, and the surrounding superpixels are #46, #47, #48, #56, #58, #66, #67 and #68. After several iterations, this pixel may belong to #56, and the surrounding superpixels of #57 may become 8 other superpixels, one of which might be #69. But as the index is fixed, its label is fixed, meaning that this pixel always belongs to superpixel #57 (in spixel_feature2_layer.cu). Besides, the labels of the surrounding superpixels are fixed too. So I cannot understand this, and I think it may not reflect the clustering process.
Thanks for reading this long description. I look forward to your explanation of spix_init.

Questions about semantic segmentation testing

Hi, thank you for your excellent work. I am a little confused about your paper.
In your paper, you mention that SSN can be used for downstream tasks. Here is my question:
in semantic segmentation, you need the ground-truth pixel labels to get superpixels, but how do you get pixel labels at test time? Or is semantic segmentation here just a means to help SSN get superpixels?

I look forward to your reply. Thank you.

About the weights of the loss function

Hi,

Thanks for the great work and for sharing the code. I am trying to run your code, but I am not familiar with Caffe and am confused about the loss weights.

In the code, there are three losses: pos_loss (loss1), col_loss (loss2) and lossWithoutSoftmax (loss3), of which I believe only pos_loss and lossWithoutSoftmax are used. However, when I run it, I usually get output like the log posted at the end.

What is the loss value next to the iteration number (e.g. Iteration 6, loss = 1.16342)? I thought it was the total loss, but I found it can sometimes be less than the weighted loss1, as Iteration 6 shows.
I am trying to port your code to PyTorch, but it is not clear to me how the weight 1e-5 is applied. Based on the printed output, it seems to be something like

loss = torch.norm(pos_pix_feat - pos_recon_feat).sum() * 1e-5 + elem_wise_cross_tropy(rec_label, ori_label).mean()

where pos_loss uses the sum of the element-wise L2 norm, while the label loss uses the mean of the element-wise cross entropy. Is this true?
I tried using sum() or mean() for both sub-losses, but neither comes close to the loss value I get from your code.

Thanks in advance.

=======================
I0301 11:46:37.911602 9853 solver.cpp:228] Iteration 6, loss = 1.16342
I0301 11:46:37.911633 9853 solver.cpp:244] Train net output #0: loss1 = 119468 (* 1e-05 = 1.19468 loss)
I0301 11:46:37.911638 9853 solver.cpp:244] Train net output #1: loss2 = 112387
I0301 11:46:37.911644 9853 solver.cpp:244] Train net output #2: loss3 = 0.15799 (* 1 = 0.15799 loss)

I0301 11:46:38.340327 9853 solver.cpp:228] Iteration 7, loss = 1.14899
I0301 11:46:38.340358 9853 solver.cpp:244] Train net output #0: loss1 = 90096.4 (* 1e-05 = 0.900964 loss)
I0301 11:46:38.340363 9853 solver.cpp:244] Train net output #1: loss2 = 24685.4
I0301 11:46:38.340366 9853 solver.cpp:244] Train net output #2: loss3 = 0.147002 (* 1 = 0.147002 loss)

I0301 11:46:38.935927 9853 solver.cpp:228] Iteration 8, loss = 1.15263
I0301 11:46:38.935950 9853 solver.cpp:244] Train net output #0: loss1 = 109387 (* 1e-05 = 1.09387 loss)
I0301 11:46:38.935955 9853 solver.cpp:244] Train net output #1: loss2 = 112974
I0301 11:46:38.935958 9853 solver.cpp:244] Train net output #2: loss3 = 0.0878664 (* 1 = 0.0878664 loss)
I0301 11:46:38.935961 9853 sgd_solver.cpp:106] Iteration 8, lr = 0.0001
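
A note on reading this log: in Caffe, the value printed after "Iteration N, loss =" is a smoothed loss. The solver sums each loss output multiplied by its loss weight (loss2 is printed without a weight, so it does not contribute) and then averages that total over the last `average_loss` iterations, a solver setting that defaults to 1. This would explain why the displayed value can be lower than the weighted loss1 of the same iteration. The sketch below reproduces the averaging; the window size of 3 is an assumption, since the solver settings are not shown in the log.

```python
from collections import deque


def smoothed_losses(per_iter_losses, average_loss):
    """Running mean over the last `average_loss` per-iteration totals."""
    window = deque(maxlen=average_loss)
    smoothed = []
    for loss in per_iter_losses:
        window.append(loss)
        smoothed.append(sum(window) / len(window))
    return smoothed


# Per-iteration totals from the log above: loss1 * 1e-5 + loss3 * 1.
totals = [1.19468 + 0.15799, 0.900964 + 0.147002, 1.09387 + 0.0878664]

if __name__ == "__main__":
    print(totals)                      # raw weighted totals for iterations 6 to 8
    print(smoothed_losses(totals, 3))  # what a 3-iteration window would display
```

When comparing against a PyTorch port, it is safer to compare the individual `loss1` and `loss3` outputs than the smoothed value.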

where to install Caffe inside "ssn_superpixels"

Hi, I have a question about step 4 of the README.md file.

Step 4 reads, “install caffe following the installations instructions. In the ssn_superpixels/lib directory …”, after which the “Note” reads “if you install Caffe in some other folder, update CAFFEDIR in config.py accordingly”.

In the linked “instructions” there are multiple ways of installing caffe - ubuntu standard installation, docker etc, the last of which downloads and installs Caffe from source. How should Caffe be ideally installed and where should it be installed to? I’m asking this because it’s not very clear what you mean by “if you install Caffe in some other folder ...”; does this imply that when downloading Caffe from source, the “git clone https://github.com/BVLC/caffe.git” command should be performed inside the “ssn_superpixels/lib” directory?

Thanks in advance

The implementation of pytorch version

Recently, I work on a pytorch implementation of your paper. Some implementation results are shown as follows:

There is a straight line close to the image boundary, and I do not know why. Did you see a similar problem in your experiments? Can you give me some advice?
Another problem is that the pos_recon loss decreases at the beginning of training and then gradually increases over a long training period.

Some question about initialization of group conv layer 'concat_spixel_feat_50'

Hi! @varunjampani
Sorry to bother you again. When I try to use SSN as a module, I run into a problem: there is a group conv layer 'concat_spixel_feat_50' in the function "def decode_features" (in file 'create_net.py') which is used to concatenate the features of the 9 neighboring superpixels, is that right?
When I visualize the params of this layer in the pretrained model, I get conv params like this:
(which is the correct initialization for concatenating the 9 neighboring features)

However, after I add the same code to my own network, this layer 'concat_spixel_feat_50' is initialized to all zeros.

I have checked the Caffe documentation: when the param "weight_filler" is not set, the kernel params of this conv layer are initialized to all zeros. Obviously the all-zero init is not correct, but I cannot reproduce the correct initialization above. I use the same code and really can't find which setting is wrong.
Could you please help? Thanks a lot!

what does the training loss curve look like

I'm trying to train SSN via train_ssn.py, but after ~40,000 iterations the training loss jitters a lot without any meaningful decrease. I know from reading previous issues that convergence takes ~500K iterations, but with my computing resources it would take a few days to reach convergence.

So I was wondering whether the authors could kindly tell me / show me what the training loss curve looks like as a function of iteration number, starting from iteration 0 all the way to convergence.

Thank you in advance.

Training with self-made dataset

Hello, thank you very much for sharing the whole program. I want to train SSN with my own dataset, which consists of the original BSDS500 training images as input and their superpixel segmentations (produced by a classical superpixel algorithm) as ground truth.
Could you provide some documentation explaining how to plug in a user-defined training dataset in place of the BSDS500 training set? Thank you very much.

How can I get the feature vector associated with each superpixel?

Hello,

Thank you a lot for your work.

I'm wondering whether your network also learns a feature vector for each superpixel, or whether it just learns the superpixels.
If it does, how can I get them? Or do you keep just the raw pixels of each superpixel?

Thank you

Test for city_scapes dataset?

Error while testing on the Cityscapes dataset:

F1128 11:33:33.655436 27257 blob.cpp:34] Check failed: shape[i] <= 0x7fffffff / count_ (2048 vs. 1182) blob size exceeds INT_MAX
*** Check failure stack trace: ***
Aborted (core dumped)
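
For context, the failed check in blob.cpp guards against a blob whose total element count would overflow a 32-bit signed integer: while multiplying the dimensions together, each new dimension must not exceed INT_MAX divided by the running count. The message "2048 vs. 1182" means a dimension of 2048 was rejected because the product of the preceding dimensions was already about 1.8 million. A minimal sketch of that check follows; the example shapes are hypothetical and only chosen to show one passing and one failing case.

```python
INT_MAX = 0x7fffffff  # 2**31 - 1, the limit named in the error message


def blob_fits(shape):
    """Return True if every running product of dimensions stays within INT_MAX."""
    count = 1
    for dim in shape:
        # Same comparison as the failed check: shape[i] <= INT_MAX / count_.
        if count != 0 and dim > INT_MAX // count:
            return False
        count *= dim
    return True


if __name__ == "__main__":
    print(blob_fits((1, 3, 1024, 2048)))     # a plain RGB image blob
    print(blob_fits((1, 1774, 1024, 2048)))  # hypothetical blob that trips the check
```

Some intermediate blob therefore grows too large at this resolution, so running on smaller crops or downscaled images is a possible workaround, not something confirmed in this thread.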

optical flow with superpixel

Thank you for releasing the project; it is great work!

In your paper, there is a section on optical flow using SSN_{deep}, which gives good visual results. Could you please give some hints or a demo for this part?

input image size different for training and testing on the BSDS500 dataset??

Hi,

I was running "compute_ssn_spixels.py" when I found something curious that I wanted to ask the authors about.
The paper mentions that for training, the authors used "image patches of size 201 x 201" from the original BSDS500 dataset. However, when I printed the image sizes used for testing in compute_ssn_spixels.py via print(net.blobs['img'].data.shape), it returned (1, 3, 321, 481).
So I wanted to confirm whether different input image sizes were used for training and testing when computing superpixels on the BSDS500 dataset (it seems that the trained weights have been reused in the test network, whose activation sizes have been adjusted for the changed input image size?).

Thanks in advance :)

some question of paper about training

Hi!
Thanks for sharing this nice work. I still have some questions about the paper.

As mentioned in the paper, before the SSN module, the image is fed to a CNN to obtain k-dimensional pixel features, i.e. a tensor of size n × k. These pixel features are then clustered to obtain m superpixel representations of dimension k (m × k), and finally the superpixel representation (m × k) is mapped back to the pixel representation (n × k) using matrix multiplication in order to compute the pixel-wise loss. Is my interpretation correct?

In Section 4.3, the task-specific reconstruction loss is computed using the pixel properties (labels) R and the pixel representation R∗, so here R∗ is the pixel representation (n × k) derived from the superpixel representation mentioned above. Is that right?
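
The round trip described in this question can be sketched in NumPy. As I read Section 4.3, the quantity being mapped is the pixel property R (for example one-hot labels with c classes), not the k-dimensional deep features: R is pooled onto superpixels with the column-normalized association matrix and then spread back to pixels with the row-normalized one. The dense n x m matrix and the function name `reconstruct` are simplifications for illustration; the repository works with the sparse 9-neighbor form.

```python
import numpy as np


def reconstruct(R, Q):
    """Map pixel properties to superpixels and back.

    R: (n, c) pixel properties, e.g. one-hot semantic labels.
    Q: (n, m) soft pixel-to-superpixel associations.
    Returns (R_spixel, R_star) with shapes (m, c) and (n, c).
    """
    Q_col = Q / np.maximum(Q.sum(axis=0, keepdims=True), 1e-12)  # columns sum to 1
    Q_row = Q / np.maximum(Q.sum(axis=1, keepdims=True), 1e-12)  # rows sum to 1
    R_spixel = Q_col.T @ R     # pixel -> superpixel (weighted average per superpixel)
    R_star = Q_row @ R_spixel  # superpixel -> pixel (weighted mix per pixel)
    return R_spixel, R_star
```

The reconstruction loss then compares R with R_star, so it is small only when each superpixel groups pixels that share the same property.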

When I run your code using PyCharm, the error is..

Error parsing text-format caffe.NetParameter: 23:23: Message type "caffe.LayerParameter" has no field named "pixel_feature_param".
The field exists in the caffe.proto file.
I installed Caffe the right way, but this error always arises. Please help me! Thank you very much!

dataset

http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/BSR/BSR_full.tgz > BSDS.tgz
I cannot open this link.
Please provide another link or a cloud storage location.

superpixel results

Thank you for releasing the project.

But I have a question: when I run the pre-trained model directly, the results contain the 'new_spix_indices' and '_bdry.jpg' outputs from the code. How can I get results like this:

Looking forward to your reply, thank you!

how to run the code???

Does the code work on Windows or Linux? What should I do to run it? I would really appreciate your help! I am just a newbie.

Presence of hard association in the compactness portion of loss function

Hi, I wanted to ask you about a technicality in the compactness portion of the loss function.
The compactness loss uses inverse mapping onto the pixel representation using the hard associations H, instead of the soft association Q. My question is how is the non-differentiability of the hard association in the loss function resolved - there appears to be no reference to any form of Monte Carlo optimization (e.g. REINFORCE) to circumvent the non-differentiability issue (the very presence of hard association in the loss function also appears to go against the founding principles of differentiable SLIC).

I would really appreciate hearing your thoughts on this.
Thanks in advance :)
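
One possible reading, not confirmed by the authors in this thread: the hard association only acts as a lookup index, while the values being looked up (the superpixel centroids) are still computed from the soft associations, so a gradient path through Q remains even though the argmax itself is treated as a constant. The NumPy sketch below shows that structure for the forward computation only; the function name and the dense n x m form of Q are illustrative.

```python
import numpy as np


def compactness_loss(xy, Q):
    """Forward pass of a compactness term with hard inverse mapping.

    xy: (n, 2) pixel positions.
    Q:  (n, m) soft pixel-to-superpixel associations.
    """
    Q_col = Q / np.maximum(Q.sum(axis=0, keepdims=True), 1e-12)
    S_xy = Q_col.T @ xy   # (m, 2) soft centroids; these depend smoothly on Q
    H = Q.argmax(axis=1)  # (n,) hard assignment, used only as a gather index
    xy_bar = S_xy[H]      # (n, 2) each pixel receives its superpixel's centroid
    return np.linalg.norm(xy - xy_bar)
```

In an autodiff framework, the gather `S_xy[H]` passes gradients to `S_xy` and hence to Q, while no gradient flows through the choice of index.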

h_grid, w_grid is not used

It seems h_grid and w_grid are computed but not used here

Question About your paper

When you compute Q in Algorithm 1 of your paper, did you use a normalization technique? I ask because Q seems to be very small.

The BSDS result from the pre-trained model does not match the paper one

Hi,
Thanks for sharing the code and pre-trained models. I am trying to evaluate the BSDS500 results obtained from the pre-trained model, but I found the boundary recall (BR) does NOT match the result shown in the paper.

I tried to compute the BR using both evaluation methods (with default parameters) mentioned in the instructions, and because multiple ground truths are available on BSDS500, I also tried both simply averaging all the results and averaging only the best result per image.

Take the 500-superpixel result for example: the one in the paper is around 93% (based on Fig. 4 of the paper), but what I got is:

BR = 97.94% (Stutz's method, average all)
BR = 86.08% (Tu's method, average all)
BR = 89.21% (Tu's method, average the best)

Could you provide a more specific description of the evaluation method? Thanks!
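
Part of such a gap can come from the metric itself: boundary recall depends on how boundary pixels are defined and on the distance tolerance within which a ground-truth boundary pixel counts as recovered. The sketch below is a generic NumPy version with an explicit tolerance `r`; it is not a reimplementation of either toolkit, and both the boundary definition and the default tolerance are assumptions.

```python
import numpy as np


def boundary_map(labels):
    """Mark pixels whose right or lower neighbour carries a different label."""
    b = np.zeros(labels.shape, dtype=bool)
    b[:, :-1] |= labels[:, :-1] != labels[:, 1:]
    b[:-1, :] |= labels[:-1, :] != labels[1:, :]
    return b


def dilate(mask, r):
    """Grow a boolean mask by r pixels using a (2r+1) x (2r+1) square window."""
    h, w = mask.shape
    padded = np.pad(mask, r, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def boundary_recall(gt_labels, sp_labels, r=1):
    """Fraction of ground-truth boundary pixels within r pixels of a superpixel boundary."""
    gt_b = boundary_map(gt_labels)
    sp_b = dilate(boundary_map(sp_labels), r)
    total = gt_b.sum()
    return float((gt_b & sp_b).sum()) / total if total else 1.0
```

Running the same label maps with different values of `r` shows how strongly the reported number depends on the tolerance.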

converge very slowly

Hi Varun,
Many thanks for sharing this great work! I followed your guide and tried to train on the BSDS training dataset. However, the model does not seem to converge after a whole day of training. I am wondering whether this is because it needs a great many iterations to converge, since I see you set 'num_iter = 1000000' in train_ssn.py.
Looking forward to your reply :)

optical flow network

May I ask which optical flow network was used in the paper?

Some questions on source code

Thanks for your code. As I am unfamiliar with Caffe and CUDA C/C++, two questions puzzle me when reading the code:

On page 8 of the paper it is said that:

using row-normalized association matrix

Yet I have not found any row-normalization operation where it ought to be (I suppose it should be inside decode_features in create_net.py). Does this implementation make a simplification, or is the normalization done elsewhere?

Again in decode_features in create_net.py, I notice that the neighboring superpixel features concat_spixel_feat are concatenated in a rather neat and tricky way, namely by passing them through a group convolution. As I understand it, the convolution kernels would have to be fixed to specific values to achieve this. If my guess is right, the kernels of each group should look like this:

[1, 0, 0                              [0, 1, 0                                 [0, 0, 0
 0, 0, 0                               0, 0, 0                                  0, 0, 0
 0, 0, 0]    for channel 1,            0, 0, 0]    for channel 2, ...,          0, 0, 1]    for channel 9

But nowhere in the repo can I find where the initial values of this convolution layer are set. I wonder where you put it, or whether my guess is simply incorrect.
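
The guess above can be checked numerically. The sketch below builds nine one-hot 3 x 3 kernels and applies them as a zero-padded cross-correlation (which is what a Caffe convolution layer computes) to a single-channel superpixel grid, so that output channel j holds the value of neighbor j for every grid cell. It illustrates the idea only; the function names are illustrative, and how the repository actually stores or loads these weights is not shown here.

```python
import numpy as np


def one_hot_kernels():
    """Nine 3x3 kernels; kernel j has a single 1 at flat position j."""
    return np.eye(9).reshape(9, 3, 3)


def gather_neighbors(grid):
    """Apply the nine kernels to a (H, W) grid of superpixel values.

    Returns a (9, H, W) array whose channel j is the grid shifted so that
    each cell sees its neighbour j (zero where the neighbour is outside).
    """
    h, w = grid.shape
    padded = np.pad(grid, 1, mode="constant")
    kernels = one_hot_kernels()
    out = np.zeros((9, h, w))
    for j in range(9):
        for dy in range(3):
            for dx in range(3):
                # Cross-correlation: weight (dy, dx) multiplies the input at offset (dy-1, dx-1).
                out[j] += kernels[j, dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out
```

For a group convolution over C feature channels, the same nine kernels would be repeated per group, giving weights of shape (9 * C, 1, 3, 3) under the assumption of one input channel per group. In pycaffe, weights of a layer can be assigned through `net.params[layer_name][0].data[...]`, or they can simply be loaded from a pretrained caffemodel.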

custom training

Hello, thanks for sharing the code. Does it make sense to train the model with a dataset of images with varying size and resolution or would it never converge?

Caffe build configuration?

Can anyone provide a configuration that successfully builds Caffe? I keep failing......

Including:

OS version
cuda version
cudnn version
protobuf version
python version
gcc version

Thanks very much!

BTW, I really do not think using caffe is a good idea these days.

Pytorch Implementation License

Hello,

I recently uploaded a pytorch implementation of "Superpixel Sampling Networks" on my GitHub, and I was wondering which license I should be using for that repo, considering how "ssn_superpixels" is licensed under the Creative Commons license.

My concern is that my training script is based on that of another repo with a GPL-v3 license so I feel obliged to use a GPL-v3 license for my repo as well, but I was wondering whether this is in conflict with the CC license of the "ssn_superpixels" repo. Here is the link to my repo for reference (https://github.com/andrewsonga/ssn_pytorch).

I would love to hear your advice on this.
Andrew.

about Evaluation

On the BSDS dataset, I used the provided parameters (ssn_bsds_model.caffemodel) and got better ASA and BR results than those in the paper. May I ask whether the provided parameters are the ones used in the paper?

Why is random cropping / patching and random scaling applied continuously throughout training?

Hello,

According to the paper, random patches and random scaling of those patches are used as a means of data augmentation. My question is why are these forms of data augmentation applied continuously throughout the training process i.e. when the data is loaded during the training process.

Wouldn't it be enough to augment the BSDS500 data before training, and then run training on a fixed, yet augmented dataset (especially since there are multiple annotations for each image in the BSDS500 dataset)? It appears that applying data augmentation continuously throughout training may be the reason it takes so long to train SSN (~500K iterations).

Thank you in advance :)

Superpixel border issue on BSDS500 dataset

Hi, I'm trying to test your algorithm on the BSDS500 dataset. I followed the instructions in README.md and ran compute_ssn_spixels.py using your args. However, I get abnormal results like this.

It cannot assign the correct superpixel ID at the image boundary.
Did you encounter this problem? Or do you know the reason?

extract the features of the superpixel area

Can this project help me extract the features of a superpixel area, or the characteristics of each pixel in the image?
