
bilinear-cnn's People

Contributors

haomood, zhosteven


bilinear-cnn's Issues

About step1 and step2

Hi, is the step 2 network initialized from the FC parameters trained in step 1, or does it train a VGG-16 net from scratch?

How did you get the mean and std used by the Normalize transform in your code?

Hi, I have a question about your code. The mean and std of the Normalize transform in your code are [(0.485, 0.456, 0.406), (0.229, 0.224, 0.225)]. But when I computed the mean and std of the training data, I got [(0.4856, 0.4994, 0.4324), (0.1817, 0.1811, 0.1927)]. I then trained with the values I computed, but the test accuracy was lower than yours. At first I thought you might have used the mean and std of the whole dataset, so I computed those too, but the result was very close to what I got before. Can you tell me how you obtained the mean and std of the Normalize transform in your code? Thank you!
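
For reference, dataset-specific statistics of the kind described above are usually computed per channel over every pixel of every training image. The following is only a minimal sketch in NumPy (not the repository's code); `channel_stats` and the random `fake_train` array are hypothetical stand-ins, and the images are assumed to be RGB arrays already scaled to [0, 1]. Note that the second tuple passed to Normalize is a standard deviation, not a variance.

```python
import numpy as np


def channel_stats(images):
    """Return per-channel (mean, std) for images of shape (N, H, W, 3) in [0, 1]."""
    images = np.asarray(images, dtype=np.float64)
    # Reduce over the image, height and width axes; keep the channel axis.
    mean = images.mean(axis=(0, 1, 2))
    std = images.std(axis=(0, 1, 2))
    return mean, std


# Hypothetical stand-in for a training set: 16 random RGB images scaled to [0, 1].
rng = np.random.default_rng(0)
fake_train = rng.integers(0, 256, size=(16, 8, 8, 3)) / 255.0

mean, std = channel_stats(fake_train)
print("mean:", mean)
print("std: ", std)
```

If the images have different sizes, the same statistics can be accumulated from running sums of pixel values and squared pixel values instead of stacking everything into one array.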

Hi

Hi, I used your BCNN class as a module, but I get the same predicted class for different images.
Is something wrong?

Here is the output of the predicted class and the true class:

predict_class: tensor([61, 61, 61, 61, 61, 61, 61, 61], device='cuda:0')
truth_class: tensor([180, 151, 187, 33, 70, 36, 109, 54], device='cuda:0')

The training loop:

            data = data.to(opt.device)
            label = (label).to(opt.device)
            optimizer.zero_grad()
            score  = bcnn_model(data)
            loss = criterion(score,label)
            loss.backward()
            optimizer.step()

confusion about the parameters of Normalize

Regarding lines 129-130 in bilinear_cnn_fc.py, I'm confused about the magic numbers in

Normalize(mean=(0.485, 0.456, 0.406),
                 std=(0.229, 0.224, 0.225))

Where do these numbers come from?
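
For context, these values match the per-channel ImageNet statistics that torchvision documents for its pretrained models, which would fit a network initialized from ImageNet-pretrained VGG-16 weights; whether that was the author's reasoning is not stated in this thread. The sketch below only illustrates what a channel-wise normalization with these constants does. It is written in NumPy, and the `normalize` helper is a hypothetical stand-in rather than the torchvision implementation.

```python
import numpy as np

# Constants quoted in the issue (RGB order).
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])


def normalize(image_chw, mean=MEAN, std=STD):
    """Channel-wise (x - mean) / std for an image of shape (3, H, W) in [0, 1]."""
    return (image_chw - mean[:, None, None]) / std[:, None, None]


# An image whose pixels all equal the channel means maps to zeros.
image = np.broadcast_to(MEAN[:, None, None], (3, 2, 2))
print(normalize(image))
```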

Cannot reproduce accuracy 84% (after step2)

Hi Hao,

Thank you for a neat implementation.

I wonder whether training with the hyperparameters given in the README

 --base_lr 1e-2 \
 --batch_size 64 --epochs 25 --weight_decay 1e-5 \
 --model "model.pth" 

really gives 84.17% test accuracy?

I used exactly the commands you provide in the README:

    Step 1.
    $ CUDA_VISIBLE_DEVICES=0,1,2,3 ./src/bilinear_cnn_fc.py --base_lr 1.0 \
          --batch_size 64 --epochs 55 --weight_decay 1e-8 \
          | tee "[fc-] base_lr_1.0-weight_decay_1e-8-epoch_.log"

    Step 2. 
    $ CUDA_VISIBLE_DEVICES=0,1,2,3 ./src/bilinear_cnn_all.py --base_lr 1e-2 \
          --batch_size 64 --epochs 25 --weight_decay 1e-5 \
          --model "model.pth" \
          | tee "[all-] base_lr_1e-2-weight_decay_1e-5-epoch_.log"

I trained the step 1 model and got 76.67% accuracy on the test set. I then used it to initialize the step 2 model and fine-tuned all layers. But the accuracy saturates at 76.61% and does not improve further.

Are there any extra tricks to get the desired performance?

Signed square root

Hi Hao

First of all, thanks for the excellent implementation. I have used the code here as a reference for my own implementations.

In the original paper (http://vis-www.cs.umass.edu/bcnn/docs/bcnn_iccv15.pdf) the authors use a signed square root operation, something like:

X = torch.mul(torch.sign(X),torch.sqrt(torch.abs(X)+1e-5))

instead of the plain square root you have used, X = torch.sqrt(X + 1e-5).

Was there a particular reason for this choice?
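
To make the comparison concrete, here is a small NumPy sketch of the two variants. It rests on an assumption that the issue does not state: that the pooled features are non-negative (for example, taken after a ReLU). Under that assumption every entry of the bilinear matrix is a sum of products of non-negative numbers, so the sign is never negative and the two formulas agree wherever the entry is strictly positive. The function names and the random features are hypothetical.

```python
import numpy as np


def signed_sqrt(x, eps=1e-5):
    """sign(x) * sqrt(|x| + eps), as in the snippet quoted from the issue."""
    return np.sign(x) * np.sqrt(np.abs(x) + eps)


def plain_sqrt(x, eps=1e-5):
    """sqrt(x + eps); only valid when x >= -eps."""
    return np.sqrt(x + eps)


rng = np.random.default_rng(0)
# Hypothetical ReLU-like features: 4 channels x 9 spatial locations, all >= 0.
feats = np.maximum(rng.standard_normal((4, 9)), 0.0)
bilinear = feats @ feats.T / feats.shape[1]

positive = bilinear > 0
print(np.allclose(signed_sqrt(bilinear)[positive], plain_sqrt(bilinear)[positive]))
```

The variants differ when an entry is exactly zero (np.sign(0) is 0, so the signed version returns 0 while the plain version returns sqrt(eps)) and, more importantly, when entries can be negative, where the plain square root is undefined.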

Outer product question

X = torch.bmm(X, torch.transpose(X, 1, 2)) / (28**2) # Bilinear

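Feature map A has size (C, M) and feature map B has size (C, N).
The paper states:
If fA and fB extract features of size C × M and
C × N respectively, then Φ(I) is of size M × N.
But with your implementation, the result is C × C.
However, the result in the experiments section of the paper also seems to be your 512*512, i.e. C × C.
I am confused and hope you can clarify this. Thank you.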
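
One possible reading, offered as an assumption and not as the author's answer, is a clash of notation: the M and N in the quoted sentence describe the feature size at a single location, while the X in the code is laid out as (batch, channels, locations). The product over the location axis then sums the per-location outer products and gives a channels-by-channels matrix, which is 512 × 512 for two identical 512-channel streams. The NumPy sketch below checks that equivalence on small hypothetical sizes (D, H, W stand in for 512, 28, 28).

```python
import numpy as np

rng = np.random.default_rng(0)
D, H, W = 5, 3, 3  # small stand-ins for 512 channels and 28 x 28 locations
feats = rng.standard_normal((D, H * W))  # one image: D channels x L locations

# Sum over locations of the per-location outer product f_l f_l^T ...
outer_sum = sum(np.outer(feats[:, l], feats[:, l]) for l in range(H * W))

# ... equals a single matrix product that contracts the location axis.
matmul = feats @ feats.T

print(matmul.shape)                    # (5, 5)
print(np.allclose(outer_sum, matmul))  # True
```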

download the model.pth

Thank you very much for your code! Where can I find the model.pth used for fine-tuning? Or do I need to train it myself?

out of memory

Hi! After 2 epochs the backward pass runs out of memory :( The first epoch is fine, but it crashes during the second one. It seems that the graph or something similar is being kept in memory. I changed a few things, but it still crashes:

    for X, y in self._train_loader:
        # Data.
        X = X.cuda()
        y = y.cuda()

        # Forward pass.
        score = self._net(X)
        loss = self._criterion(score, y.long())

        # Statistics only; no graph is built inside this block.
        with torch.no_grad():
            epoch_loss += loss.item()
            # Prediction.
            prediction = torch.argmax(score, dim=1)
            num_total += y.size(0)
            num_correct += torch.sum(prediction == y.long()).item()

        # Backward pass: clear the existing gradients, then update.
        self._optimizer.zero_grad()
        loss.backward()
        self._optimizer.step()

        total_batches += 1
        del X, y, score, loss, prediction

Question about the bilinear pooling operation

In the forward function of the BCNN class, the bilinear operation is

X = torch.bmm(X, torch.transpose(X, 1, 2)) / (28**2) # Bilinear

Why does the result of the matrix multiplication need to be divided by (28 ** 2)?
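
A plausible interpretation, stated here as an assumption because the thread gives no answer: 28 ** 2 would be the number of spatial locations if the feature map is 28 × 28, so the division turns the sum of per-location outer products into their mean. The NumPy sketch below illustrates one consequence, namely that the averaged matrix does not grow with the size of the feature map. `bilinear_pool` and the random features are hypothetical.

```python
import numpy as np


def bilinear_pool(feats):
    """feats: (D, L) -> (D, D), averaged over the L spatial locations."""
    return feats @ feats.T / feats.shape[1]


rng = np.random.default_rng(0)
small = rng.standard_normal((4, 7 * 7))
# Repeating the same locations four times imitates a larger feature map
# with identical per-location statistics.
large = np.tile(small, (1, 4))

print(np.allclose(bilinear_pool(small), bilinear_pool(large)))  # True
print(np.allclose(small @ small.T, large @ large.T))            # False
```

Without the division, the entries scale with the number of locations, which would change the magnitude of the values entering the square root and normalization steps.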

About step 2

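@HaoMood, hello, and thank you very much for your work.
When I ran your code, the test accuracy in step 1 was 76%, and the best saved model was vgg_16_epoch_21.pth.
But when I load this 21.pth model in step 2, the train accuracy is 1% and the test accuracy is 0.
What could be causing this?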

Zombie process when using multiple GPUs

Hi, thanks a lot for your code! Everything works well when I use a single GPU by setting CUDA_VISIBLE_DEVICES=0 (for example). But when I use multiple GPUs by setting CUDA_VISIBLE_DEVICES=0,1 (for example), the process becomes a zombie: it is not actually training, yet it still holds the GPU and CPU resources. Worse, it cannot even be killed with "kill -9 PID"; the only option is a reboot. Have you come across this issue before? Thanks a lot!

About memory

In your README, I see you used 4 GPUs. How much GPU memory did step 1 use in total?

Bilinear sqrt with sign

Hi, this is a concise and useful implementation of bilinear CNNs. However, in the paper I read that
"elementwise signed square-root (x ← sign(x)√|x|) and l2 normalization is applied to the matrix A",
which means the square root should be multiplied by the sign. But this code just uses "X = torch.sqrt(X + 1e-5)".

Am I missing something? Even though this is not exactly the same operation, I got the same result (84.2%). Does that suggest the code is still correct?
