titu1994 / keras-resnext
Implementation of ResNeXt models from the paper "Aggregated Residual Transformations for Deep Neural Networks" in Keras 2.0+.
License: MIT License
When I try to follow the original paper and instantiate the CIFAR-10 network (ResNeXt-29, 8x64d), the paper lists 34.4M parameters, but when I run:
ResNext(img_dim, depth=29, cardinality=8, width=64, classes=10)
the resulting model summary outputs:
Total params: 89,700,288
Trainable params: 89,599,808
Non-trainable params: 100,480
What is the reason for the big difference?
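One common source of discrepancies this large is how the grouped 3x3 convolution is realized: a true grouped convolution with cardinality C has 1/C the weights of a dense convolution over the same channels, so emulating groups with full-width convolutions inflates the count. A minimal sketch of the arithmetic (the helper name is hypothetical; biases are ignored):

```python
# Sketch: weight count of a 3x3 convolution, dense vs. grouped.
# With C groups, each kernel only sees in_ch/C input channels.
def conv3x3_params(in_ch, out_ch, groups=1):
    # (kernel_h * kernel_w * in_ch/groups) * out_ch, biases omitted
    return 3 * 3 * (in_ch // groups) * out_ch

dense = conv3x3_params(512, 512)              # fully connected channels
grouped = conv3x3_params(512, 512, groups=8)  # cardinality 8
# the grouped version uses 8x fewer weights for the same channel widths
```

Comparing these two counts layer by layer against model.summary() is a quick way to localize where the extra parameters come from.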
Hi titu, great to see a ResNeXt model. I'm curious whether you plan to implement a Multi-Attention Network on top of ResNeXt: https://arxiv.org/ftp/arxiv/papers/2009/2009.02130.pdf
I'm trying to extract weights from ImageNet for ResNeXt architecture.
While executing resnext.py, I receive the error below:
raise ValueError("unknown url type: %r" % self.full_url)
ValueError: unknown url type: ''
After checking the code in detail, I found that the variable IMAGENET_TF_WEIGHTS_PATH is not properly defined; it must contain a valid URL.
Can you please provide your suggestion to resolve this issue?
Thanks.
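Until the constant is populated, one workaround is to fail early with a clear message rather than letting urllib raise the opaque error above. A sketch (the guard function is hypothetical; only the constant name comes from the report):

```python
# Sketch: guard against an empty weights-URL constant instead of letting
# urllib raise ValueError: unknown url type: ''.
IMAGENET_TF_WEIGHTS_PATH = ""  # currently empty in the repository

def resolve_weights_url(url):
    # hypothetical helper: validate before handing the URL to a downloader
    if not url:
        raise ValueError(
            "No pretrained-weights URL is defined for this configuration; "
            "pass weights=None or point to a local .h5 file instead."
        )
    return url
```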
Hi Somshubra,
I find that you didn't mention where to download the pretrained ResNeXt model (the ImageNet-pretrained one without the top).
Best,
Lele
When running either cifar10.py or cifar100.py I get:
Using TensorFlow backend.
Traceback (most recent call last):
  File "cifar10.py", line 15, in <module>
    from resnext import hello_ResNeXt
ImportError: cannot import name 'hello_ResNeXt'
I've checked that I have both TensorFlow and Keras 2.0.8. I've searched on SO, but everything I tried didn't solve the issue. One of the things I tried was putting everything in a single file, but it keeps crashing for the same reason.
Any ideas? Thanks!
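An ImportError like this usually means the module simply does not define that name. Before guessing, it helps to list what the module actually exports; a sketch using the standard-library json module as a stand-in for resnext:

```python
# Sketch: inspect a module's public names to see what can be imported.
# 'json' stands in here for the repository's resnext module.
import json

exported = [name for name in dir(json) if not name.startswith("_")]
# a name the module really defines is present...
present = "loads" in exported
# ...while a misspelled or renamed one is not
absent = "hello_loads" not in exported
```

Running the same check on resnext shows whether the class is exported as ResNeXt, ResNext, or something else.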
Hi, if it's not too much trouble, I wish to implement ResNeXt-29, 16x64d according to the paper. May I know why the parameter count using (depth=29, cardinality=16, width=64) is 320,956,352 in model.summary() instead of the 68.1M reported in the paper? Similarly, (depth=29, cardinality=8, width=64) gives 89,700,288 parameters. Thanks a lot!
Can you review pull request #25 ? It will convert this code to the package which can be installed with pip.
Hi, I am running out of memory while running it for CIFAR-10 with a cardinality of 8 and a width of 64. I am using an Nvidia 1080 Ti for training with a batch size of 64. Is there any way I can avoid this issue? I tried using the multi_gpu option in Keras, but that slowed my training at least 10 times. Any suggestion would be really appreciated.
First I want to thank you for your implementation of Squeeze-and-Excitation Networks in keras.
I tried your network and it worked very well, but I got a question.
As I understand it, ResNet is very similar to ResNeXt.
ResNet reduces the size of the input 5 times, for example:
256 -> 128, 128 -> 64, 64 -> 32, 32 -> 16, 16 -> 8.
But when I used the code you provided to create a ResNeXt, I end up with only 3 reductions of the size:
256 -> 128, 128 -> 64, 64 -> 32
Is this normal? Does it affect the precision of the network?
I used these inputs to create the network:
resnet_base = SEResNext(input_shape=input_shape,
#depth=155,
depth=56,
cardinality=64,
width=4,
weight_decay=5e-4,
include_top=False,
weights=None,
input_tensor=input_layer,
pooling=None)
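The two reduction counts in the question are just repeated halvings, which is easy to sanity-check. A sketch (the helper is hypothetical; each stride-2 layer halves the feature-map side):

```python
# Sketch: spatial size after a given number of stride-2 layers.
def output_size(size, num_stride2_layers):
    for _ in range(num_stride2_layers):
        size //= 2
    return size

# An ImageNet-style ResNet halves 5 times: 256 -> 8.
imagenet_out = output_size(256, 5)
# The CIFAR-style constructor used here apparently halves 3 times: 256 -> 32.
cifar_out = output_size(256, 3)
```

So the difference comes from how many strided stages the chosen template contains, not from a bug in any single layer.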
https://github.com/titu1994/Keras-ResNeXt/blob/master/cifar10.py#L15 imports ResNeXt,
but https://github.com/titu1994/Keras-ResNeXt/blob/master/resnext.py#L37 defines ResNext.
ResNeXt != ResNext — typo?
According to the paper https://arxiv.org/pdf/1611.05431.pdf
"We use SGD with a mini-batch size of 256 on 8 GPUs (32
per GPU). The weight decay is 0.0001 and the momentum
is 0.9. We start from a learning rate of 0.1, and divide it by
10 for three times using the schedule in"
Does Adam work better?
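The quoted step schedule is easy to reproduce for comparison against Adam. A sketch (the epoch boundaries below are illustrative assumptions; the excerpt does not state them):

```python
# Sketch of the paper's schedule: start at lr 0.1 and divide by 10
# three times at fixed epoch boundaries (boundaries are assumed here).
def step_lr(epoch, base_lr=0.1, boundaries=(150, 225, 275)):
    return base_lr * (0.1 ** sum(epoch >= b for b in boundaries))
```

Such a function can be plugged into a Keras LearningRateScheduler callback alongside SGD with momentum 0.9 to match the paper's setup.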
Thanks for your implementation of ResNeXt. But I can't find 'resnext_imagenet_32_4_th_dim_ordering_th_kernels_no_top.h5'; I think this is the ImageNet weights file. How can I get it? Thank you very much!
Hello, I have read your code and have some doubts about the structure; it seems not the same as the paper, especially this part:
I can see the grouped conv with the 3x3 kernel, but I don't know where the 1x1 kernels' filters are in the residual block.
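For reference, the paper's bottleneck block is 1x1 reduce -> 3x3 grouped -> 1x1 expand, and its channel arithmetic can be written down directly. A sketch (the helper is hypothetical; the numbers match the paper's first ImageNet stage, 256-d in and out):

```python
# Sketch: channel widths of a ResNeXt bottleneck block
# (1x1 reduce -> 3x3 grouped -> 1x1 expand).
def bottleneck_channels(cardinality, width, expansion=2):
    mid = cardinality * width   # channels seen by the grouped 3x3 conv
    out = mid * expansion       # the final 1x1 conv expands back up
    return mid, out

# 32x4d: 32 groups of width 4 -> 128 grouped channels, 256 output channels
mid, out = bottleneck_channels(32, 4)
```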
Hello, ResNeXtImageNet() needs some .h5 files — where should these files be downloaded from?
Awesome work!
In figure 3 of the paper:
Can you shed light on how these three different forms of aggregated transforms are equivalent? From looking at your code, it looks like you chose to implement method (b) — is this accurate? Also, I saw another implementation that uses Lambda layers to do something more akin to item (c). That is, if the previous layer's channel dimension is 64-d, for instance, and C=32 (cardinality groups), then this results in 64/32 = 2 feature maps per cardinality group as input to the 32 different convolutions. These feature maps do not overlap, and the sum of them across the cardinality groups will always equal 64-d in our example.
How is this the same as having 32 different convolutions all with 64-d channels as input? Your thoughts would be much appreciated!
EDIT: Other implementation - https://gist.github.com/mjdietzx/0cb95922aac14d446a6530f87b3a04ce
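The equivalence the question asks about comes down to this: a grouped convolution is a full convolution whose weight matrix is block-diagonal, so convolving each non-overlapping channel slice separately and concatenating gives the same output. A sketch demonstrating it for 1x1 convolutions at a single spatial position (all numbers are made up for illustration):

```python
# Sketch: per-group convolutions + concatenation equal a single
# convolution with block-diagonal weights (1x1 conv, one position).
def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

C, d = 2, 2                          # cardinality 2, 2 channels per group
x = [1.0, 2.0, 3.0, 4.0]             # C*d input channels
Ws = [[[1.0, 0.5], [0.0, 2.0]],      # per-group 1x1 kernels
      [[3.0, 1.0], [1.0, 1.0]]]

# form (b)/(c): convolve each channel slice separately, then concatenate
per_path = []
for i, W in enumerate(Ws):
    per_path += matvec(W, x[i * d:(i + 1) * d])

# one convolution whose weight matrix is block-diagonal in the groups
W_block = [[1.0, 0.5, 0.0, 0.0],
           [0.0, 2.0, 0.0, 0.0],
           [0.0, 0.0, 3.0, 1.0],
           [0.0, 0.0, 1.0, 1.0]]
grouped = matvec(W_block, x)
# per_path and grouped are identical
```

Form (a)'s summation differs only in that the final 1x1 convolutions are merged into one, by linearity.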
I implemented your code but did not get the correct model definition. For ResNeXt-101, the filters in the last bottleneck layer should be 2048, but it gives 1024.
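In the ImageNet template, the block output width doubles at each of the four stages, which is why a correct ResNeXt-101 should end at 2048 channels. A one-line sanity check:

```python
# Sketch: ImageNet-style stage output widths double each stage,
# starting from 256-d after the first bottleneck stage.
stage_out = [256 * (2 ** i) for i in range(4)]
# a model ending at 1024 is missing one doubling
```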
I'm not sure about the implementation of the grouped convolution block. The attached image shows a different approach, if I'm not mistaken. The repository has only one convolution per tower, while the image shows 3 convolutions, each followed by BatchNormalization and ReLU, except for the last one, which is only followed by BatchNormalization.
Image found in: https://towardsdatascience.com/illustrated-10-cnn-architectures-95d78ace614d
Can you add a LICENSE file? I suggest Apache License 2.0.
If you look at the original ResNeXt code in Torch at facebookresearch/ResNeXt, and also in the paper, you can't see any LeakyReLU activations, but you always used LeakyReLU instead of ReLU. Did you miss this, or is there a different purpose behind it?
Thanks in advance...
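For readers comparing the two, the difference being asked about is only in how negative inputs are handled; the paper's blocks use plain ReLU. A sketch:

```python
# Sketch: ReLU zeroes negative inputs; LeakyReLU passes a scaled copy.
def relu(x):
    return x if x > 0 else 0.0

def leaky_relu(x, alpha=0.3):  # 0.3 is Keras's LeakyReLU default alpha
    return x if x > 0 else alpha * x

# positive inputs are unchanged by both; they differ only below zero
```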