
keras-resnext's Issues

Number of parameters

When I try to follow the original paper and instantiate the CIFAR-10 network (ResNeXt-29, 8x64d), the paper lists 34.4M parameters, but when I run ResNext(img_dim, depth=29, cardinality=8, width=64, classes=10), the resulting model summary outputs:

Total params: 89,700,288
Trainable params: 89,599,808
Non-trainable params: 100,480

What is the reason for the big difference?
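For reference, a minimal way to reproduce the reported count, assuming this repo's resnext.py is importable; the channels-last CIFAR-10 input shape below is my assumption:

from resnext import ResNext

# Reproduce the reported count: 89,700,288 here vs. 34.4M in the paper.
img_dim = (32, 32, 3)  # assumed channels-last CIFAR-10 input
model = ResNext(img_dim, depth=29, cardinality=8, width=64, classes=10)
print(model.count_params())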

IMAGENET_TF_WEIGHTS_PATH isn't defined in resnext.py

I'm trying to extract weights from ImageNet for ResNeXt architecture.

While executing resnext.py, I receive the error below.

raise ValueError("unknown url type: %r" % self.full_url)
ValueError: unknown url type: ''

After checking the code in detail, I found that the variable IMAGENET_TF_WEIGHTS_PATH is not properly defined; it must contain a valid URL.

Can you please provide your suggestion to resolve this issue?

Thanks.
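In the meantime, a possible workaround, sketched below, is to bypass the download path entirely: build the model with weights=None and load a locally obtained weights file by hand. Both the constructor name ResNextImageNet and the file name are assumptions here, not confirmed names from the repo.

from resnext import ResNextImageNet  # assumed name of the ImageNet variant

# Hedged workaround: skip the broken download URL and load local
# weights manually; the file name below is hypothetical.
model = ResNextImageNet(weights=None)
model.load_weights('resnext_imagenet_weights.h5')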

Where is the pretrained model?

Hi Somshubra,

I can't find where to download the pretrained ResNeXt model, specifically the ImageNet-pretrained weights without the top layers.

Best,
Lele

cannot import name 'ResNeXt'

When running either cifar10.py or cifar100.py I get:

Using TensorFlow backend.
Traceback (most recent call last):
  File "cifar10.py", line 15, in <module>
    from resnext import hello_ResNeXt
ImportError: cannot import name 'hello_ResNeXt'

I've checked that I have both TensorFlow and Keras 2.0.8. I've searched on SO, but nothing I tried solved the issue. One of the things I tried was putting everything in a single file, but it keeps crashing for the same reason.

Any ideas? Thanks!
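For what it's worth, a minimal import to check is sketched below, assuming resnext.py exports the ResNext name used in the first issue above; the ImportError indicates that hello_ResNeXt is simply not defined in the module. Run the script from the repo root so resnext.py is on the import path.

# Minimal sketch of the import to verify in cifar10.py; ResNext matches
# the constructor call shown in the first issue above.
from resnext import ResNext

model = ResNext((32, 32, 3), depth=29, cardinality=8, width=16, classes=10)
model.summary()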

Parameter Count

Hi, if it's not too much trouble, I wish to implement ResNeXt-29, 16x64d according to the paper. May I know why the parameter count using (depth = 29, cardinality = 16, width = 64) is 320,956,352 in model.summary() instead of 68.1M as reported in the paper? Similarly, (depth = 29, cardinality = 8, width = 64) gives 89,700,288 parameters. Thanks a lot!

Pull request pending

Can you review pull request #25? It converts this code into a package that can be installed with pip.

Running out of Memory

Hi, I am running out of memory while running it on CIFAR-10 with a cardinality of 8 and a width of 64. I am using an Nvidia 1080 Ti for training, with a batch size of 64. Is there any way I can avoid this issue? I tried using the multi_gpu option in Keras, but that slowed my training at least 10 times. Any suggestion would be really appreciated.
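One low-effort mitigation, sketched below with hypothetical variable names (model, x_train, etc.), is to shrink the batch size until the model fits in GPU memory, at the cost of noisier gradients and longer epochs:

# Hedged sketch: trade batch size for memory. All variable names here
# are assumed from a typical CIFAR-10 training script.
batch_size = 16  # down from 64; lower it further if OOM persists

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=100,
          validation_data=(x_test, y_test))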

Hello

First, I want to thank you for your implementation of Squeeze-and-Excitation Networks in Keras.
I tried your network and it worked very well, but I have a question.
As I understand it, ResNet is very similar to ResNeXt.
ResNet reduces the size of the input 5 times, for example:
256 -> 128, 128 -> 64, 64 -> 32, 32 -> 16, 16 -> 8.
But when I use the code you provided to create a ResNeXt, I end up with only 3 reductions of the size:
256 -> 128, 128 -> 64, 64 -> 32.
Is this normal?
Does this affect the precision of the network?

I used these inputs to create the network:

resnet_base = SEResNext(input_shape=input_shape,
                        # depth=155,
                        depth=56,
                        cardinality=64,
                        width=4,
                        weight_decay=5e-4,
                        include_top=False,
                        weights=None,
                        input_tensor=input_layer,
                        pooling=None)
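One way to check this directly, sketched below, is to walk the built model and print where each spatial downsampling happens; the number of halvings is then visible. This assumes resnet_base was built as in the snippet above with a 256x256 channels-last input, so layer output shapes are (None, H, W, C).

# Sketch: count the spatial downsamplings by inspecting each
# layer's output shape in the built functional model.
prev = None
for layer in resnet_base.layers:
    shape = layer.output_shape
    if isinstance(shape, tuple) and len(shape) == 4:
        size = shape[1]
        if prev is not None and size is not None and size < prev:
            print('downsample at %s: %d -> %d' % (layer.name, prev, size))
        prev = size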

Where is ImageNet Weights file?

Thanks for your implementation of ResNeXt. But I can't find 'resnext_imagenet_32_4_th_dim_ordering_th_kernels_no_top.h5', which I believe is the ImageNet weights file. How can I get this file? Thank you very much!

Is the model structure in the code the same as in the paper?

Hello, I have read your code and have some doubts about the structure; it doesn't seem the same as in the paper, especially this part:
I can see the grouped conv with the 3x3 kernel, but I don't know where the 1x1 kernels' filters are in the residual block.
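For reference, the paper's bottleneck places the grouped 3x3 convolution between two 1x1 convolutions: the first 1x1 reduces the channel count to the bottleneck width, and the second expands it back before the residual addition. A rough, illustrative sketch of such a block (not the repo's exact code):

from keras.layers import (Activation, BatchNormalization, Conv2D, Lambda,
                          add, concatenate)

def resnext_block(x, cardinality=8, width=64, out_filters=256):
    # Illustrative ResNeXt bottleneck; assumes x already has out_filters
    # channels (otherwise project the shortcut with a 1x1 conv).
    shortcut = x
    grouped_channels = cardinality * width

    # first 1x1 conv: reduce to the bottleneck width
    y = Conv2D(grouped_channels, (1, 1), padding='same', use_bias=False)(x)
    y = BatchNormalization()(y)
    y = Activation('relu')(y)

    # grouped 3x3 conv, emulated by slicing the channels per group
    groups = []
    for c in range(cardinality):
        s = Lambda(lambda z, c=c: z[:, :, :, c * width:(c + 1) * width])(y)
        groups.append(Conv2D(width, (3, 3), padding='same', use_bias=False)(s))
    y = concatenate(groups)
    y = BatchNormalization()(y)
    y = Activation('relu')(y)

    # second 1x1 conv: expand back to the block's output width
    y = Conv2D(out_filters, (1, 1), padding='same', use_bias=False)(y)
    y = BatchNormalization()(y)

    return Activation('relu')(add([y, shortcut]))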

Splitting Tensors/Grouped Convolutions

@titu1994 ,

Awesome work!

In figure 3 of the paper:

[Figure 3 of the ResNeXt paper: equivalent building blocks, (a) aggregated residual transformations, (b) early concatenation, (c) grouped convolutions]

Can you shed light on how these three different forms of aggregated transformations are equivalent? From looking at your code, it looks like you chose to implement method (b). Is this accurate? Also, I saw another implementation that uses Lambda layers to do something more akin to item (c). That is, if the previous layer's channel dimension is 64-d, for instance, and C = 32 (cardinality groups), then this would result in 64/32 = 2 feature maps per cardinality group as input to the 32 different convolutions. These feature maps would not overlap, and the sum of them across the cardinality groups will always equal 64-d in our example.

How is this the same as having 32 different convolutions all with 64-d channels as input? Your thoughts would be much appreciated!

EDIT: Other implementation - https://gist.github.com/mjdietzx/0cb95922aac14d446a6530f87b3a04ce
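For what it's worth, the (a)/(b) equivalence can be checked numerically: a 1x1 convolution is just a matrix multiply over the channel axis, so applying it to a concatenation of groups decomposes into a per-group sum. A minimal NumPy sketch:

import numpy as np

h, w, d = 8, 8, 4               # spatial size, channels per group
g1 = np.random.randn(h, w, d)   # output of group 1's transform
g2 = np.random.randn(h, w, d)   # output of group 2's transform
W = np.random.randn(2 * d, 16)  # 1x1 conv weights: (in_channels, out_channels)

# form (b): concatenate the groups, then apply one shared 1x1 conv
concat_then_conv = np.concatenate([g1, g2], axis=-1) @ W
# form (a): give each group its slice of the weights, then sum
sum_of_convs = g1 @ W[:d] + g2 @ W[d:]

print(np.allclose(concat_then_conv, sum_of_convs))  # True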

Add a license

Can you add a LICENSE file? I suggest Apache License 2.0.
