
racnn-pytorch's People

Contributors

jeong-tae, jtlee90

racnn-pytorch's Issues

I get an error. I changed logits, _, _ = net(images) to logits, cc, aa = net(images) because it raised an error, but the same error still occurs

THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "trainer2.py", line 305, in <module>
    train()
  File "trainer2.py", line 94, in train
    logits, cc, aa= net(images)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/Desktop/ww/RACNN-pytorch-master/models/RACNN.py", line 55, in forward
    conv5_4_A = self.b2.features:-1
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/batchnorm.py", line 49, in forward
    self.training or not self.track_running_stats, self.momentum, self.eps)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/functional.py", line 1194, in batch_norm
    training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f10b8d49dd8>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 349, in __del__
    self._shutdown_workers()
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 328, in _shutdown_workers
    self.worker_result_queue.get()
  File "/usr/lib/python3.5/multiprocessing/queues.py", line 345, in get
    return ForkingPickler.loads(res)
  File "/usr/local/lib/python3.5/dist-packages/torch/multiprocessing/reductions.py", line 70, in rebuild_storage_fd
    fd = df.detach()
  File "/usr/lib/python3.5/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/usr/lib/python3.5/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/usr/lib/python3.5/multiprocessing/connection.py", line 487, in Client
    c = SocketClient(address)
  File "/usr/lib/python3.5/multiprocessing/connection.py", line 614, in SocketClient
    s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused

Some questions about the loss

Hi @jeong-tae,
I followed your model for training. The number of pre-APN training steps is 2000, and the cls loss and the rank loss look strange, as shown.

[Images: screenshots of the cls loss and rank loss]

Also, the accuracy has stayed very low, never above 1. What could be the reason?

Some doubts about your program

I know nothing about PyTorch, so I read your code as if it were TensorFlow, and I found a strange loss calculation. In model/loss.py, the loss is computed as
F.cross_entropy(preds[i], labels)
Is this right? Should it perhaps be F.cross_entropy(preds[i], labels[i])?
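
One way to reason about the question above, as a sketch only: if preds is a list holding one batch of logits per scale and labels holds one class index per image, then preds[i] selects a scale while labels applies to every scale, so labels[i] would select a single image's label instead. The pure-Python cross_entropy below is a stand-in for F.cross_entropy, and the shapes and values are assumptions, not taken from model/loss.py.

```python
import math

def cross_entropy(logits_batch, labels):
    """Mean cross-entropy over a batch (plain-Python stand-in for F.cross_entropy)."""
    total = 0.0
    for logits, label in zip(logits_batch, labels):
        m = max(logits)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
        total += log_sum_exp - logits[label]
    return total / len(labels)

def multi_scale_loss(preds, labels):
    """Sum the classification loss over scales; every scale uses the same labels."""
    return sum(cross_entropy(scale_logits, labels) for scale_logits in preds)

# Assumed layout: 3 scales, a batch of 2 images, 3 classes.
preds = [
    [[2.0, 0.1, 0.1], [0.1, 2.0, 0.1]],  # scale 1
    [[1.5, 0.2, 0.3], [0.3, 1.5, 0.2]],  # scale 2
    [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5]],  # scale 3
]
labels = [0, 1]  # one class index per image, shared by all scales

loss = multi_scale_loss(preds, labels)
```

Under these assumptions the index i runs over scales, not over images, which would explain why labels is not indexed.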

Is there a bug?

When I run trainer.py, there is no problem during pre_apn_epoch, but during testing it runs out of memory, as in the figure. I am using a GT2010 Ti and have no idea what causes the problem. Is there tensor memory that is not being released?

Save model and predict

Hello @jeong-tae,
I am trying to use your RACNN-pytorch code to train on my own dataset. I want to save the trained model, but I don't know how. Can you give me some suggestions on saving the model and running prediction? Thanks.
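
A minimal sketch of the usual PyTorch pattern for saving and restoring weights, assuming a recent PyTorch version. The nn.Linear model is a hypothetical stand-in for the RACNN network, and the checkpoint file name is made up for illustration; the repository's own trainer may handle checkpoints differently.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-in for the trained network; any nn.Module works the same way.
net = nn.Linear(4, 3)

# Save only the learned parameters (the state dict), not the whole Python object.
ckpt_path = os.path.join(tempfile.mkdtemp(), "racnn_checkpoint.pth")
torch.save(net.state_dict(), ckpt_path)

# Later: rebuild the same architecture, then load the saved parameters into it.
restored = nn.Linear(4, 3)
restored.load_state_dict(torch.load(ckpt_path))
restored.eval()  # put layers such as batch norm and dropout into inference mode

# Predict without recording gradients.
with torch.no_grad():
    logits = restored(torch.randn(2, 4))
    predicted = logits.argmax(dim=1)  # predicted class index per input
```

For the real model, the restored object would be constructed the same way the trainer constructs the network before calling load_state_dict.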

One error in your code and a CUDA out-of-memory problem

Hello, line 228 of ./trainer.py,
response_map = F.upsample(response_map, size = [resize, resize])
should perhaps be
before_upsample = Variable(response_map.unsqueeze(0))
response_map = F.upsample(before_upsample, size = [resize, resize])
response_map = response_map.data.squeeze()

I also have a question. The code runs without problems on a single GPU, but it fails with "cuda error: out of memory" when I run it on multiple GPUs. Do you have the same problem, or do you know the reason?

About APN

Hi, I read your code, which is cool, and ran it. But the APN model as implemented from the paper is not concise, and it confused me.

I spent a lot of time thinking about it, and it suddenly became clear when I read Spatial Transformer Networks.

Every crop operation can be described by a 2x3 matrix such as [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]].

[Image: selection_052]

M00 and M11 give the size of the crop area; M02 and M12 give its coordinates.

[Image: selection_053]

This means we only need to train 2 parameters (M02 and M12) to find the attention area.

[Image: selection_054]

A Spatial Transformer Network also works with autograd, so we don't need to write a complex crop function.

I had not finished training the Spatial Transformer Network when I wrote this. I just wanted to share the STN idea with you. It's wonderful!

Hope this helps!
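
The 2x3 matrix described above can be checked by hand. The sketch below assumes the spatial-transformer convention in which the matrix maps output coordinates in [-1, 1] to the input coordinates that get sampled; the helper names apply_affine and crop_corners are made up for illustration.

```python
def apply_affine(theta, x, y):
    """Map an output coordinate (x, y) to the input coordinate it samples from."""
    (m00, m01, m02), (m10, m11, m12) = theta
    return (m00 * x + m01 * y + m02, m10 * x + m11 * y + m12)

def crop_corners(theta):
    """Input-space corners of the crop: where the output corners (-1,-1) and (1,1) land."""
    top_left = apply_affine(theta, -1.0, -1.0)
    bottom_right = apply_affine(theta, 1.0, 1.0)
    return top_left, bottom_right

# M00 = M11 = 0.5: the crop covers half of the image along each axis, centered.
centered = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]]

# Changing only M02 and M12 moves the same-size crop around the image.
shifted = [[0.5, 0.0, 0.25], [0.0, 0.5, -0.25]]

centered_box = crop_corners(centered)  # ((-0.5, -0.5), (0.5, 0.5))
shifted_box = crop_corners(shifted)    # ((-0.25, -0.75), (0.75, 0.25))
```

With M00 and M11 held fixed, the crop size stays constant and only the two translation entries need to be learned, which matches the observation in the post.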

TypeError: expected Tensor as element 0 in argument 0, but got list

I set batch_size=1 and ran into the following issue. Can you tell me how to solve it?

[*] pre_apn_epoch[13], || pre_apn_iter 19980 || pre_apn_loss: 0.1223 || Timer: 0.1521sec
[*] Swtich optimize parameters to Class
Traceback (most recent call last):
  File "/home/alex/.local/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 3265, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in <module>
    runfile('/home/alex/Datas/code/RACNN-pytorch/trainer.py', wdir='/home/alex/Datas/code/RACNN-pytorch')
  File "/usr/local/pycharm-2018.2.4/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
  File "/usr/local/pycharm-2018.2.4/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/alex/Datas/code/RACNN-pytorch/trainer.py", line 311, in <module>
    train()
  File "/home/alex/Datas/code/RACNN-pytorch/trainer.py", line 136, in train
    test(testloader, iteration)
  File "/home/alex/Datas/code/RACNN-pytorch/trainer.py", line 292, in test
    test_apn_losses = torch.stack(test_apn_losses).mean()
TypeError: expected Tensor as element 0 in argument 0, but got list
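
The message says that torch.stack received a list whose first element is itself a list rather than a tensor. One possible workaround, assuming each collected entry is a (possibly nested) list of scalar loss tensors, is to flatten before stacking. What the repository's loss function actually returns with batch_size=1 is not confirmed here, and mean_of_losses is a hypothetical helper.

```python
import torch

def mean_of_losses(losses):
    """Average scalar losses that may be nested inside lists or tuples."""
    flat = []
    pending = list(losses)
    while pending:
        item = pending.pop()
        if isinstance(item, (list, tuple)):
            pending.extend(item)  # unpack nested containers
        else:
            flat.append(torch.as_tensor(item, dtype=torch.float32))
    # torch.stack needs at least one tensor, so an empty input still raises an error.
    return torch.stack(flat).mean()

# Assumed shape of the collected values: one list of scalar tensors per test step.
collected = [[torch.tensor(1.0), torch.tensor(3.0)], [torch.tensor(2.0)]]
average = mean_of_losses(collected)
```

If the per-step value turns out to be an empty list, the underlying cause is more likely in the loss function itself than in the averaging step.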

Problem with the backward pass of APN

Thanks for sharing the code. It helped me understand the APN; I had been confused about how the authors crop the attention region.

In the backward code of the APN, I found that you use a fixed value for in_size (if I understand the code correctly). Do you backpropagate the gradient to a fixed location? If it is fixed, why? If not, how do you backpropagate the gradient to the attention location?

Thanks in advance.

def backward(self, grad_output):
    images, ret_tensor = self.saved_variables[0], self.saved_variables[1]
    in_size = 224
    ret = torch.Tensor(grad_output.size(0), 3).zero_()
    norm = -(grad_output * grad_output).sum(dim=1)
  
    
    x = torch.stack([torch.arange(0, in_size)] * in_size).t()
    y = x.t()
    long_size = (in_size/3*2)
    short_size = (in_size/3)
    mx = (x >= long_size).float() - (x < short_size).float()
    my = (y >= long_size).float() - (y < short_size).float()
    ml = (((x<short_size)+(x>=long_size)+(y<short_size)+(y>=long_size)) > 0).float()*2 - 1
    
    mx_batch = torch.stack([mx.float()] * grad_output.size(0))
    my_batch = torch.stack([my.float()] * grad_output.size(0))
    ml_batch = torch.stack([ml.float()] * grad_output.size(0))
    
    if isinstance(grad_output, torch.cuda.FloatTensor):
        mx_batch = mx_batch.cuda()
        my_batch = my_batch.cuda()
        ml_batch = ml_batch.cuda()
        ret = ret.cuda()
    
    ret[:, 0] = (norm * mx_batch).sum(dim=1).sum(dim=1)
    ret[:, 1] = (norm * my_batch).sum(dim=1).sum(dim=1)
    ret[:, 2] = (norm * ml_batch).sum(dim=1).sum(dim=1)
    return None, ret

Error at line 75 of RACNN.py when running train.py

I get an error at line 75 of RACNN.py when I run train.py:

h = lambda x: 1 / (1 + torch.exp(-10 * x))
RuntimeError: _exp_out is not implemented for type torch.cuda.LongTensor
It looks like torch.exp does not support a LongTensor.
Does anybody know why?
Thanks.
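
torch.exp is defined for floating-point tensors, so an integer coordinate grid has to be converted first. A minimal sketch of one way to do that, assuming the argument is an integer grid such as one produced by torch.arange; whether the conversion is best placed inside h or where the grid is created is a design choice left open here.

```python
import torch

def h(x):
    """Steep logistic function; the input is cast to float before torch.exp."""
    return 1 / (1 + torch.exp(-10 * x.float()))

x = torch.arange(0, 5)  # integer coordinates on recent PyTorch versions
values = h(x - 2)       # close to 0 left of the offset, 0.5 at it, close to 1 right of it
```

Calling .float() on a tensor that is already floating point is harmless, so the same h also works on versions where the grid is created as a float tensor.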

RuntimeError: CUDA out of memory.

Hi, if I use the code at line 236 of ./trainer.py,
response_map = F.interpolate(response_map.unsqueeze(0), size = [resize, resize])
it reports a CUDA out-of-memory error.
When I change it to
before_upsample = Variable(response_map.unsqueeze(0))
response_map = F.upsample(before_upsample, size = [resize, resize])
response_map = response_map.data.squeeze()
and
def train():
    net.train()
    with torch.no_grad():
it works, but I don't know whether this is right. Also, should lines 125 and 185 of ./trainer.py be
logits, _, _, _ = net(images)
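
A note on the torch.no_grad() change above: inside that context autograd records nothing, so memory use drops, but a loss computed there cannot be backpropagated. The sketch below shows the more common arrangement, with gradients enabled for the training step and disabled only for evaluation. The nn.Linear model and random data are hypothetical stand-ins for the RACNN network and its loader.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the network, optimizer, and one batch of data.
net = nn.Linear(4, 3)
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
images = torch.randn(2, 4)
labels = torch.tensor([0, 2])

# Training step: gradients must be recorded, so no torch.no_grad() here.
net.train()
optimizer.zero_grad()
loss = nn.functional.cross_entropy(net(images), labels)
loss.backward()
optimizer.step()

# Evaluation or visualization: no gradients are needed, which saves memory.
net.eval()
with torch.no_grad():
    eval_logits = net(images)
```

Applied to the trainer, this would mean keeping torch.no_grad() around the test and visualization code only.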

Some mistakes in APN

Hi, very nice code, but it seems that the APN doesn't work.
There are some problems in AttentionCropFunction, and I changed it as below:

            tx, ty, tl = locs[i][0], locs[i][1], locs[i][2]
            # tx = tx if tx > (in_size/3) else in_size/3
            # tx = tx if (in_size/3*2) < tx else (in_size/3*2)
            # ty = ty if ty > (in_size/3) else in_size/3
            # ty = ty if (in_size/3*2) < ty else (in_size/3*2)
            # tl = tl if tl > (in_size/3) else in_size/3
            ## this should generate a more reasonable anchor here
            tl = tl if tl > (in_size/3) else in_size/3
            tx = tx if tx > tl else tl
            tx = tx if tx < in_size-tl else in_size-tl
            ty = ty if ty > tl else tl
            ty = ty if ty < in_size-tl else in_size-tl

            w_off = int(tx-tl) if (tx-tl) > 0 else 0
            h_off = int(ty-tl) if (ty-tl) > 0 else 0
            w_end = int(tx+tl) if (tx+tl) < in_size else in_size
            h_end = int(ty+tl) if (ty+tl) < in_size else in_size

            mk = (h(x-w_off) - h(x-w_end)) * (h(y-h_off) - h(y-h_end))
            xatt = images[i] * mk

            # xatt_cropped = xatt[:, h_off : h_end, w_off : w_end]
            ## the h and w axes here seem to be reversed?
            xatt_cropped = xatt[:, w_off: w_end, h_off: h_end]

Hope that helps.
Great reproduction, by the way.
5380800a19d8bc3e09bd993f8f8ba61ea9d3458b

TypeError: 'module' object is not callable

Traceback (most recent call last):
  File "trainer.py", line 306, in <module>
    train()
  File "trainer.py", line 57, in train
    trainset = CUB200_loader(os.getcwd() + 'G:/RACNN-pytorch-master/RACNN-pytorch-master/data/CUB_200_2011/images', split = 'train')
TypeError: 'module' object is not callable

I need help
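
This error usually means that the name being called refers to an imported module rather than to the class defined inside it. The sketch below reproduces that situation with a hypothetical module object built in memory; the repository's real file and class names may differ.

```python
import types

# Hypothetical module that has the same name as the class it contains.
CUB200_loader = types.ModuleType("CUB200_loader")

class _Loader:
    """Stand-in for the dataset class defined inside the module."""
    def __init__(self, root, split="train"):
        self.root = root
        self.split = split

CUB200_loader.CUB200_loader = _Loader

# Calling the module itself reproduces the error from the traceback.
try:
    CUB200_loader("data/CUB_200_2011/images", split="train")
except TypeError as exc:
    message = str(exc)

# Calling the class inside the module works.
trainset = CUB200_loader.CUB200_loader("data/CUB_200_2011/images", split="train")
```

Separately, os.getcwd() + 'G:/...' joins the current directory and an absolute path into a single string, which is unlikely to be a valid path; passing the absolute path alone is probably what was intended.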

Possibly incorrect range limit for tx and ty?

Hello, line 88 of ./models/RACNN.py,
tx = tx if (in_size/3*2) < tx else (in_size/3*2)
should perhaps be
tx = tx if (in_size/3*2) > tx else (in_size/3*2)
and the same for ty?

By the way, would you mind briefly explaining line 115 of ./models/RACNN.py, in backward()? I am confused by it. Thank you!
norm = -(grad_output * grad_output).sum(dim=1)
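
On the range limit raised above: if the intent is to keep tx and ty between in_size/3 and in_size/3*2, which is how the question reads it, then the upper bound needs a "less than" test, as proposed. A single clamp helper makes both bounds explicit. This is a sketch under that assumption; clamp is a hypothetical helper, and in_size = 448 is used only for illustration.

```python
def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(value, high))

in_size = 448  # assumed input size, for illustration only
low, high = in_size / 3, in_size / 3 * 2

tx_small = clamp(10.0, low, high)    # below the range: raised to in_size/3
tx_large = clamp(440.0, low, high)   # above the range: lowered to in_size/3*2
tx_inside = clamp(200.0, low, high)  # already inside: unchanged
```

For tensors, torch.clamp does the same thing in one call.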

A mistake in the code

After line 183 in trainer.py, did you forget the line "logits, _, _ = net(images)"?

Hello, a question

Hi, thank you for sharing. When I followed your steps, a problem occurred, and I tried my best to fix it, but nothing helped. The problem is:
ImportError: No module named 'visual'. Could you please help me? Thank you.

Hi, I don't know what happened. RuntimeError: "exp" not implemented for 'torch.LongTensor'

[*] RACNN forward test...
Traceback (most recent call last):
  File "/home/dl2/Songly/RACNN-pytorch-master/models/RACNN.py", line 158, in <module>
    logits, conv5s, attens = net(x)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dl2/Songly/RACNN-pytorch-master/models/RACNN.py", line 49, in forward
    scaledA_x = self.crop_resize(x, atten1 * 448)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dl2/Songly/RACNN-pytorch-master/models/RACNN.py", line 152, in forward
    return AttentionCropFunction.apply(images, locs)
  File "/home/dl2/Songly/RACNN-pytorch-master/models/RACNN.py", line 99, in forward
    mk = (h(x-w_off) - h(x-w_end)) * (h(y-h_off) - h(y-h_end))
  File "/home/dl2/Songly/RACNN-pytorch-master/models/RACNN.py", line 76, in <lambda>
    h = lambda x: 1 / (1 + torch.exp(-10 * x))
RuntimeError: "exp" not implemented for 'torch.LongTensor'

About saving tx, ty, and tl

Hi, I'm interested in RACNN and your code, but I'm not familiar with PyTorch. How can I save tx, ty, and tl and use them to get the images that are fed to the second and third channels? Thank you!
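
A rough sketch of one way to do this, assuming tx, ty, and tl have already been pulled out of the network output as plain numbers (for a tensor, .tolist() gives that). The crop-box arithmetic follows the w_off/h_off/w_end/h_end lines quoted in the APN issue above; the file name, image name, and values are made up for illustration.

```python
import json
import os
import tempfile

def crop_box(tx, ty, tl, in_size):
    """Pixel box (w_off, h_off, w_end, h_end) centered at (tx, ty) with half-width tl."""
    w_off = max(int(tx - tl), 0)
    h_off = max(int(ty - tl), 0)
    w_end = min(int(tx + tl), in_size)
    h_end = min(int(ty + tl), in_size)
    return w_off, h_off, w_end, h_end

# Hypothetical attention output for one image, already in pixel units.
locs = {"image_0001.jpg": [224.0, 200.0, 100.0]}

# Save the values so they can be reused later.
out_path = os.path.join(tempfile.mkdtemp(), "attention_locs.json")
with open(out_path, "w") as f:
    json.dump(locs, f)

# Load them back and turn each (tx, ty, tl) into a crop box.
with open(out_path) as f:
    loaded = json.load(f)
boxes = {name: crop_box(*loc, in_size=448) for name, loc in loaded.items()}
```

Each box can then be used to slice the input image and resize the result to the network's input size, which yields the image for the next scale.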

Question about formula 8

Hi @jeong-tae,
I looked at your code, at the model from the original racnn-caffe source, and at the original paper. I have a question about the final loss function. The original racnn-caffe model concatenates pool5, pool5_A, and pool5_A_A (pow1, pow2, pow3) at the end, and then computes accuracy1+2+3 through the fc layer. Figure:
[Image]

I read your code and the original paper regarding this Lcls. You compute Lcls1, Lcls2, and Lcls3 separately and then add them up. Figure:
[Image]

I don't think these mean the same thing. If I want to use scale1+2+3 as in the original racnn-caffe, how do I do it?
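
One way to sketch the scale1+2+3 variant described above is to concatenate the pooled features of the three scales and feed them to a single fully connected layer. The feature size of 512, the 200 classes, and the FusedClassifier name are assumptions for illustration; any normalization or training schedule that racnn-caffe applies around this step is not reproduced here.

```python
import torch
import torch.nn as nn

class FusedClassifier(nn.Module):
    """Classify from the concatenated pooled features of several scales."""

    def __init__(self, feat_dim=512, num_scales=3, num_classes=200):
        super().__init__()
        self.fc = nn.Linear(feat_dim * num_scales, num_classes)

    def forward(self, pooled_feats):
        # pooled_feats: list of [batch, feat_dim] tensors, one per scale
        fused = torch.cat(pooled_feats, dim=1)
        return self.fc(fused)

# Hypothetical pooled features for a batch of 2 images at 3 scales.
feats = [torch.randn(2, 512) for _ in range(3)]
fused_logits = FusedClassifier()(feats)
```

A single cross-entropy loss on fused_logits would then replace, or be added to, the sum of the per-scale losses, depending on which behavior is wanted.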
