tengshaofeng / ResidualAttentionNetwork-pytorch

A PyTorch implementation of the Residual Attention Network. This code is based on two projects from
What versions of torch, torchvision, and Python are required? Can anyone explain?
Thanks for your work! I have a question about the expression of mixed attention. Is conv->relu->conv->sigmoid able to represent it?
Hello, how can I output the attention maps?
I ran your code and hit the error below:
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm). (message repeated four times)
Traceback (most recent call last):
  File "train_pre.py", line 52, in <module>
    for i, (images, labels) in enumerate(train_loader):
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 275, in __next__
    idx, batch = self._get_batch()
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 254, in _get_batch
    return self.data_queue.get()
  File "/usr/lib/python3.5/multiprocessing/queues.py", line 343, in get
    res = self._reader.recv_bytes()
  File "/usr/lib/python3.5/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/lib/python3.5/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.5/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 175, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 50) is killed by signal: Bus error.
The environment is Python 3.5, TensorFlow 1.0.1, and PyTorch 0.3.1. I have searched for solutions, and I think this may be caused by version conflicts.
Can you tell us your environment, or offer any other suggestions? Thanks.
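This bus error typically means the DataLoader's worker processes ran out of shared memory (common inside Docker containers with a small default /dev/shm). A minimal sketch of the usual workaround, using synthetic stand-in data rather than the repo's CIFAR pipeline:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for the CIFAR loader in train_pre.py (hypothetical data).
dataset = TensorDataset(torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,)))

# Workaround: num_workers=0 loads data in the main process, so no
# shared-memory segments are needed at all.
train_loader = DataLoader(dataset, batch_size=16, shuffle=True, num_workers=0)

for images, labels in train_loader:
    pass  # training step would go here
```

Alternatively, keep num_workers > 0 and enlarge shared memory, e.g. start the container with a larger --shm-size.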
I followed the instructions and ran CUDA_VISIBLE_DEVICES=0 python train.py, but I get:
TypeError: empty() received an invalid combination of arguments - got (tuple, dtype=NoneType, device=NoneType), but expected one of:
What is wrong with this code?
Topic closed.
Hi,
Can we implement the same network for 3D data by replacing the 2D layers with their 3D counterparts? What do you advise?
I want to use this code with another dataset. Which parameter ensures that my new data is used with the model trained on CIFAR?
And do you have any advice if the input dimensions are larger than CIFAR's, e.g. 100*100?
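A 3D variant is not part of this repository, but one common approach is to swap every 2D layer for its 3D counterpart. A hypothetical sketch, loosely mirroring the pre-activation ResidualBlock in basic_layers.py:

```python
import torch
import torch.nn as nn

class ResidualBlock3D(nn.Module):
    """Hypothetical 3D analogue of the repo's ResidualBlock:
    every 2D layer is replaced by its 3D counterpart."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.bn1 = nn.BatchNorm3d(in_ch)
        self.conv1 = nn.Conv3d(in_ch, out_ch // 4, kernel_size=1, bias=False)
        self.bn2 = nn.BatchNorm3d(out_ch // 4)
        self.conv2 = nn.Conv3d(out_ch // 4, out_ch // 4, kernel_size=3,
                               padding=1, bias=False)
        self.bn3 = nn.BatchNorm3d(out_ch // 4)
        self.conv3 = nn.Conv3d(out_ch // 4, out_ch, kernel_size=1, bias=False)
        # 1x1x1 projection shortcut when the channel count changes
        self.shortcut = (nn.Conv3d(in_ch, out_ch, kernel_size=1, bias=False)
                         if in_ch != out_ch else nn.Identity())
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.conv1(self.relu(self.bn1(x)))
        out = self.conv2(self.relu(self.bn2(out)))
        out = self.conv3(self.relu(self.bn3(out)))
        return out + self.shortcut(x)

x = torch.randn(1, 16, 8, 32, 32)  # (N, C, D, H, W)
y = ResidualBlock3D(16, 32)(x)
```

The pooling/upsampling pairs in the attention modules would need the same treatment (MaxPool3d, trilinear interpolation).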
The code has a stage 0 that doesn't exist in the paper.
I think the number of parameters of the CIFAR-10 residual attention network is incorrect; I find it is much larger than the number in the paper.
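To check a parameter count against the paper, the standard recipe is to sum p.numel() over the trainable parameters. A small sketch (the demo module is just a stand-in; the repo's model would be passed the same way):

```python
import torch.nn as nn

def count_params(model: nn.Module) -> int:
    """Number of trainable parameters in a module."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Stand-in example: a single conv layer.
demo = nn.Conv2d(3, 16, kernel_size=3)
print(count_params(demo))  # 3*16*3*3 weights + 16 biases = 448
```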
Hi,
Is model_92_sgd.pkl pretrained on CIFAR-10? Is there a pretrained model for ImageNet? Thanks.
When testing, the model does not need gradients, and this line caused an out-of-memory error for me.
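Wrapping evaluation in torch.no_grad() is the usual fix: no autograd graph is built, so activations are freed immediately. A minimal sketch with a stand-in model:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # stand-in for the trained attention model
model.eval()

inputs = torch.randn(4, 10)
with torch.no_grad():      # no graph is recorded, so memory stays flat
    outputs = model(inputs)
```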
When I run this code with Python 3.6, I get an error:
File "/home/user/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 33, in __init__
    out_channels, in_channels // groups, *kernel_size))
TypeError: torch.FloatTensor constructor received an invalid combination of arguments - got (float, int, int, int), but expected one of:
I am trying to use this code with a new dataset. I changed the class names and added a dataset class that yields a 448*448 image per iteration, with a list of labels matching the class-name list. I am using from model.residual_attention_network import ResidualAttentionModel_448input as.....
And I am getting this error :
Traceback (most recent call last):
  File "train.py", line 83, in <module>
    model = ResidualAttentionModel().cuda()
  File "/home/jayant/Documents/Marsh_Ann/ResidualAttentionNetwork-pytorch-master/model/residual_attention_network.py", line 24, in __init__
    self.residual_block0 = ResidualBlock(64, 128)
  File "/home/jayant/Documents/Marsh_Ann/ResidualAttentionNetwork-pytorch-master/model/basic_layers.py", line 20, in __init__
    self.bn3 = nn.BatchNorm2d(output_channels/4)
  File "/home/jayant/anaconda3/envs/saltmarsh/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 21, in __init__
    self.weight = Parameter(torch.Tensor(num_features))
TypeError: new(): data must be a sequence (got float)
@tengshaofeng Do you have an intuition about what I am doing wrong? I can also share my dataset class. Its __getitem__ method returns a 448*448 image.
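The traceback points at output_channels/4, which is a float under Python 3, and recent PyTorch versions reject float channel counts. The usual fix is floor division in basic_layers.py:

```python
import torch.nn as nn

output_channels = 128

# Broken on Python 3 + recent PyTorch: 128 / 4 == 32.0, a float,
# which BatchNorm2d (and Conv2d) reject as a channel count.
# bn3 = nn.BatchNorm2d(output_channels / 4)   # TypeError

# Fix: floor division keeps the channel count an int.
bn3 = nn.BatchNorm2d(output_channels // 4)
```

The same change applies to every `/4` in the Conv2d and BatchNorm2d constructors of basic_layers.py.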
Hi @tengshaofeng,
Do you know if this model can handle multi-label datasets like NUS-WIDE? Any idea how to do it? Thank you.
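Multi-label support is not in this repository, but the common adaptation is to keep the backbone and swap softmax cross-entropy for a per-label sigmoid via BCEWithLogitsLoss. A sketch with random stand-in logits and multi-hot targets:

```python
import torch
import torch.nn as nn

num_labels = 81  # NUS-WIDE has 81 concepts
logits = torch.randn(4, num_labels)                     # one logit per label
targets = torch.randint(0, 2, (4, num_labels)).float()  # multi-hot labels

criterion = nn.BCEWithLogitsLoss()  # sigmoid + binary cross-entropy per label
loss = criterion(logits, targets)

# At inference, threshold per-label probabilities instead of taking argmax.
preds = (torch.sigmoid(logits) > 0.5).int()
```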
File "/home//ResidualAttentionNetwork-pytorch-master/Residual-Attention-Network/model/attention_module.py", line 249, in forward
    out_interp3 = self.interpolation3(out_softmax3) + out_softmax2
RuntimeError: The size of tensor a (14) must match the size of tensor b (2) at non-singleton dimension 3
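This size mismatch usually means the input resolution doesn't match the model variant: the attention modules downsample to fixed intermediate sizes, so a 224-input model fed a different resolution trips exactly this kind of error. Assuming that is the cause here, resizing the input first avoids it:

```python
import torch
import torch.nn.functional as F

# Hypothetical input at the wrong resolution for a 224-input model.
x = torch.randn(1, 3, 100, 100)

# Resize to the resolution the model variant was built for (normally this
# would be done once in the dataset transform, e.g. transforms.Resize).
x = F.interpolate(x, size=(224, 224), mode="bilinear", align_corners=False)
```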
Epoch [32/300], Iter [100/254] Loss: 0.2530
Epoch [32/300], Iter [200/254] Loss: 0.1421
the epoch takes time: 40.39500594139099
evaluate test set:
Accuracy of the model on the test images: 87 %
Accuracy of the model on the test images: 0.8785185185185185
Accuracy of plane : 0 %
Accuracy of car : 0 %
Accuracy of bird : 1 %
Accuracy of cat : 0 %
Accuracy of deer : 0 %
Accuracy of dog : 3 %
Accuracy of frog : 0 %
Accuracy of horse : 0 %
Accuracy of ship : 0 %
Accuracy of truck : 1 %
Why is the per-class test accuracy so low, while each class still gets its own accuracy line? Is there a problem with my software versions? I am using Python 3.5 and PyTorch 1.1. Also, why is the best accuracy never printed? Thanks!
Hi, I am confused about the term softmax_blocks. Shouldn't it be soft mask blocks, as in the paper? I also checked the ResidualBlock class, and it does not contain normalization layers.
Hello, I studied your code carefully and found that the paper gives different formulas for mixed attention, channel attention, and spatial attention, but I don't see a formal representation of f(x_i, c) in your code. I have just started learning about deep networks. How do I modify the network if I want to express the different attention types? Thank you!
Thank you for sharing your code!
Can you provide the best pretrained model?
I have a question about the soft attention mask. I have implemented residual attention blocks for a specific domain (faces). How does the attention mask come to focus on specific regions of the face, such as the forehead?
Traceback (most recent call last):
  File "train.py", line 93, in <module>
    model = ResidualAttentionModel().cuda()
  File "/home/ResidualAttentionNetwork-pytorch-master/Residual-Attention-Network/model/residual_attention_network.py", line 136, in __init__
    self.residual_block1 = ResidualBlock(64, 256)
  File "/home/ResidualAttentionNetwork-pytorch-master/Residual-Attention-Network/model/basic_layers.py", line 16, in __init__
    self.conv1 = nn.Conv2d(input_channels, output_channels/4, 1, 1, bias = False)
  File "/home//.local/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 412, in __init__
    False, _pair(0), groups, bias, padding_mode)
  File "/home//.local/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 78, in __init__
    out_channels, in_channels // groups, *kernel_size))
TypeError: new() received an invalid combination of arguments - got (float, int, int, int), but expected one of:
Has anyone trained the residual attention network on ImageNet?
The paper doesn't give the batch size for ImageNet training, so I used the common setting batch_size=256 / lr=0.1, but my result (top-1 acc: 77.64) is much lower than the paper's (top-1 acc: 78.24)! More details of the hyperparameters are listed below. The epoch settings are converted from the iteration counts mentioned in the paper: with a batch size of 256 there are about 5k iterations per epoch, so the learning rate should decay at 200k/5k=40, 400k/5k=80, and 500k/5k=100 epochs, and training should stop at 530k/5k=106 epochs.
The learning rate is divided by 10 at 200k, 400k, 500k iterations. We terminate training at 530k iterations.
args.epochs = 106
args.batch_size = 256
### data transform: RandomResizeCrop(224)/HorizontalFlip(0.5)/ChangeLight(AlexNet color augmentation)/Normalize() are used in training
args.autoaugment = False
args.colorjitter = False
args.change_light = True # standard color augmentation from AlexNet
### optimizer
args.optimizer = 'SGD'
args.lr = 0.1
args.momentum = 0.9
args.weigh_decay_apply_on_all = True # TODO: weight decay apply on which params
args.weight_decay = 1e-4
args.nesterov = True
### criterion
args.labelsmooth = 0
### lr scheduler
args.scheduler = 'uneven_multistep'
args.lr_decay_rate = 0.1
args.lr_milestone = [40, 80, 100]
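Under the rounded 5k-iterations-per-epoch arithmetic above, this schedule maps directly onto PyTorch's MultiStepLR; a sketch with a stand-in parameter list:

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

iters_per_epoch = 5000  # ~1.28M ImageNet images / batch size 256, rounded
milestones = [200_000 // iters_per_epoch,
              400_000 // iters_per_epoch,
              500_000 // iters_per_epoch]  # -> [40, 80, 100], as above

params = [torch.nn.Parameter(torch.zeros(1))]  # stand-in for model.parameters()
optimizer = SGD(params, lr=0.1, momentum=0.9, weight_decay=1e-4, nesterov=True)
scheduler = MultiStepLR(optimizer, milestones=milestones, gamma=0.1)
# Call scheduler.step() once per epoch after optimizer.step().
```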
Hi @tengshaofeng, thanks! But I have a question: in attention_module.py, the input size of AttentionModule_stage0 is 112*112, while the input size of AttentionModule_stage1 is 56*56. Is a maxpool layer used in between? I don't think it's mentioned in the paper.
I have a trained residual attention model, and I want to visualize the masks shown in Figure 1. Any idea how the authors do that? @tengshaofeng If you have already done it, can you share the code to actually visualize the attention masks?
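The repo doesn't ship visualization code, but one common approach is a forward hook that captures the sigmoid output of a mask branch, which can then be upsampled and overlaid on the input image. A hypothetical sketch with a stand-in mask layer:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in: any module whose output is the soft mask
# (in this repo, the final sigmoid of each AttentionModule's mask branch).
mask_layer = nn.Sequential(nn.Conv2d(3, 1, kernel_size=1), nn.Sigmoid())

captured = {}
def save_mask(module, inputs, output):
    captured["mask"] = output.detach()  # stash the mask during forward()

handle = mask_layer.register_forward_hook(save_mask)
_ = mask_layer(torch.randn(1, 3, 56, 56))
handle.remove()

mask = captured["mask"][0, 0]  # (H, W), values in [0, 1], ready to overlay
```

To reproduce a Figure-1-style image, upsample the mask to the input resolution (e.g. F.interpolate) and alpha-blend it over the image with matplotlib.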
Excuse me, your code has been a big help in my research, but when I run train.py the following error appears. Do you know how to fix it? Thank you!
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
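This is the spawn-start-method issue (typical on Windows) that the message describes: code that creates DataLoader worker processes must sit under the __main__ guard. A sketch of the fixed train.py structure, with synthetic stand-in data:

```python
# The fix is structural: everything that spawns DataLoader workers
# must run under the __main__ guard, exactly as the error message says.
import torch
from torch.utils.data import DataLoader, TensorDataset

def main():
    # Hypothetical stand-in data; in train.py this is the CIFAR loader.
    dataset = TensorDataset(torch.randn(32, 3, 32, 32),
                            torch.randint(0, 10, (32,)))
    loader = DataLoader(dataset, batch_size=8, num_workers=2)
    batches = 0
    for images, labels in loader:
        batches += 1  # training step would go here
    return batches

if __name__ == "__main__":
    main()
```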
E TypeError: new() received an invalid combination of arguments - got (float, int, int, int), but expected one of:
E * (torch.device device)
E * (torch.Storage storage)
E * (Tensor other)
E * (tuple of ints size, torch.device device)
E * (object data, torch.device device)
How can I fix it? Thanks.
Hello, thank you for your code!
But I have a question about it. The addition below doesn't seem to appear in the paper, where the soft mask branch only has an addition at the skip connection. Could you help me resolve this question?
out_interp = self.interpolation1(out_middle_2r_blocks) + out_down_residual_blocks1
TypeError: new() received an invalid combination of arguments - got (float, int, int, int), but expected one of:
Hello, if I want to use my own dataset, is there a model pretrained on ImageNet?
Can you tell me whether your training and testing accuracies always tracked each other? I am implementing a smaller, modified version of the network you coded, and my test accuracy seems to have stagnated at 81%.
Also, I think you have coded a different architecture: you add the output of the pool layer as well as the output of the pool+conv layer to the upsampled input, while the paper's architecture only adds the pool+conv output to the upsampled layer. Is that making all the difference?
out_interp2 = self.interpolation2(out_up_residual_blocks1) + out_trunk
test code:
# print('Accuracy of the model on the test images:', correct.item()/total)
# print(correct.item())
# print(total)
# for i in range(10):
# print('%s :Accuracy of %5s : %2d %%' % (
# datetime.now(),classes[i], class_correct[i].item() / class_total[i]))
# print(class_correct[i].item())
# print(class_total[i])
# return correct / total
out:
D:\Microsoft Visual Studio\Shared\Anaconda3_64\envs\xk\lib\site-packages\torch\nn\modules\upsampling.py:129: UserWarning: nn.UpsamplingBilinear2d is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.{} is deprecated. Use nn.functional.interpolate instead.".format(self.name))
2020-03-31 15:32:25.001979 :Accuracy of the model on the test images: 95 %
Accuracy of the model on the test images: 0.954
9540
10000
2020-03-31 15:32:25.002979 :Accuracy of plane : 0 %
194
1000.0
2020-03-31 15:32:25.002979 :Accuracy of car : 0 %
206
1000.0
2020-03-31 15:32:25.002979 :Accuracy of bird : 0 %
169
1000.0
2020-03-31 15:32:25.002979 :Accuracy of cat : 0 %
136
1000.0
2020-03-31 15:32:25.002979 :Accuracy of deer : 0 %
187
1000.0
2020-03-31 15:32:25.002979 :Accuracy of dog : 0 %
159
1000.0
2020-03-31 15:32:25.003980 :Accuracy of frog : 0 %
204
1000.0
2020-03-31 15:32:25.003980 :Accuracy of horse : 0 %
197
1000.0
2020-03-31 15:32:25.003980 :Accuracy of ship : 0 %
205
1000.0
2020-03-31 15:32:25.003980 :Accuracy of truck : 0 %
203
1000.0
If I don't add '.item()', the output becomes:
D:\Microsoft Visual Studio\Shared\Anaconda3_64\envs\xk\lib\site-packages\torch\nn\modules\upsampling.py:129: UserWarning: nn.UpsamplingBilinear2d is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.{} is deprecated. Use nn.functional.interpolate instead.".format(self.name))
2020-03-31 15:38:02.784257 :Accuracy of the model on the test images: 95 %
Accuracy of the model on the test images: tensor(0, device='cuda:0')
9540
10000
2020-03-31 15:38:02.785258 :Accuracy of plane : 0 %
tensor(194, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.786259 :Accuracy of car : 0 %
tensor(206, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.786259 :Accuracy of bird : 0 %
tensor(169, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.787259 :Accuracy of cat : 0 %
tensor(136, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.787259 :Accuracy of deer : 0 %
tensor(187, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.788261 :Accuracy of dog : 0 %
tensor(159, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.789261 :Accuracy of frog : 0 %
tensor(204, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.789261 :Accuracy of horse : 0 %
tensor(197, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.789261 :Accuracy of ship : 0 %
tensor(205, device='cuda:0', dtype=torch.uint8)
1000.0
2020-03-31 15:38:02.790264 :Accuracy of truck : 0 %
tensor(203, device='cuda:0', dtype=torch.uint8)
1000.0
I hope to get your help. Thanks.
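Two things appear to be going on in the commented-out test code above (inferred from the numbers, so treat this as a diagnosis, not a statement about intent): the fraction correct/total is formatted with %2d without multiplying by 100, so every class prints "0 %"; and the per-class counters look like uint8 tensors, which wrap at 256 (the per-class counts 194, 206, ... sum to exactly 9540 once 768 is added back to each, e.g. 962 % 256 == 194). A small sketch of the fix:

```python
import torch

classes = ['plane', 'car']

# Per-sample correctness from a (stand-in) batch of predictions.
preds = torch.tensor([0, 0, 1, 1])
labels = torch.tensor([0, 1, 1, 1])
correct_per_sample = (preds == labels)

# Fix 1: accumulate counts as Python ints (or .long() tensors), never
# uint8 -- a uint8 counter wraps at 256, so 962 correct prints as 194.
class_correct = [0] * len(classes)
class_total = [0] * len(classes)
for c, ok in zip(labels.tolist(), correct_per_sample.tolist()):
    class_correct[c] += int(ok)
    class_total[c] += 1

# Fix 2: scale the fraction to a percentage before %d formatting.
for i, name in enumerate(classes):
    pct = 100.0 * class_correct[i] / class_total[i]
    print('Accuracy of %5s : %2d %%' % (name, pct))
```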
I find that the number of parameters is different from