Keras-Classification-Models
A set of models which allow easy creation of Keras models to be used for classification purposes. Also contains modules which offer implementations of recent papers.
Sparse Neural Networks (SparseNets) in Keras
An implementation of "SparseNets" from the paper Sparsely Connected Convolutional Networks in Keras 2.0+.
SparseNets are a modification of DenseNet and its dense connectivity pattern to reduce memory requirements drastically while still having similar or better performance.
Progressive Neural Architecture Search in Keras
Basic implementation of Encoder RNN from the paper "Progressive Neural Architecture Search" (https://arxiv.org/abs/1712.00559), which is an improvement over the original Neural Architecture Search paper since it requires far less time and resources.
- Uses Keras to define and train children / generated networks, which are defined in Tensorflow by the Encoder RNN.
- Define a state space by using StateSpace, a manager which adds states and handles communication between the Encoder RNN and the user. Submit custom operations and parse locally as required.
- Encoder RNN trained using a modified Sequential Model Based Optimization algorithm from the paper. I made some stability modifications to prevent extreme variance during training from causing training to fail.
- NetworkManager handles the training and reward computation of a Keras model
Available at : Progressive Neural Architecture Search in Keras
Neural Architecture Search in Keras
Basic implementation of Controller RNN from the papers "Neural Architecture Search with Reinforcement Learning" and "Learning Transferable Architectures for Scalable Image Recognition".
- Uses Keras to define and train children / generated networks, which are defined in Tensorflow by the Controller RNN.
- Define a state space by using StateSpace, a manager which adds states and handles communication between the Controller RNN and the user.
- Reinforce manages the training and evaluation of the Controller RNN
- NetworkManager handles the training and reward computation of a Keras model
Available at : Neural Architecture Search in Keras
Non-Local Neural Networks in Keras
Keras implementation of Non-local blocks from the paper "Non-local Neural Networks".
- Support for "Gaussian", "Embedded Gaussian" and "Dot" instantiations of the Non-Local block.
- Support for shielded computation mode (reduces computation by 4x)
- The "Concatenation" instantiation will be supported when the authors release their code.
Available at : Non-Local Neural Networks in Keras
Neural Architecture Search Net (NASNet) in Keras
An implementation of "NASNet" models from the paper Learning Transferable Architectures for Scalable Image Recognition in Keras 2.0+.
Supports building NASNet Large (6 @ 4032), NASNet Mobile (4 @ 1056) and custom NASNets.
Available at : Neural Architecture Search Net (NASNet) in Keras
Squeeze and Excite Networks in Keras
Implementation of Squeeze and Excite networks in Keras. Supports ResNet and Inception v3 models currently. Support for Inception v4 and Inception-ResNet-v2 will also come once the paper comes out.
Available at : Squeeze and Excite Networks in Keras
Dual Path Networks in Keras
Implementation of Dual Path Networks, which combine the grouped convolutions of ResNeXt with the dense connections of DenseNet into two paths.
Available at : Dual Path Networks in Keras
MobileNets in Keras
Implementation of MobileNet models from the paper MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications in Keras 2.0+.
Contains code for building the MobileNet model (optimized for datasets similar to ImageNet) and weights for the model trained on ImageNet.
Also contains MobileNet V2 model implementations + weights.
Available at : MobileNets in Keras
ResNeXt in Keras
Implementation of ResNeXt models from the paper Aggregated Residual Transformations for Deep Neural Networks in Keras 2.0+.
Contains code for building the general ResNeXt model (optimized for datasets similar to CIFAR) and ResNeXtImageNet (optimized for the ImageNet dataset).
Available at : ResNeXt in Keras
Inception v4 in Keras
Implementations of the Inception-v4, Inception-ResNet-v1 and Inception-ResNet-v2 architectures in Keras using the Functional API. The paper on these architectures is available at "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning".
The models are plotted and shown in the architecture sub folder. Due to lack of suitable training data (ILSVRC 2015 dataset) and limited GPU processing power, the weights are not provided.
Contains : Inception v4, Inception-ResNet-v1 and Inception-ResNet-v2
Available at : Inception v4 in Keras
Wide Residual Networks in Keras
Implementation of Wide Residual Networks from the paper Wide Residual Networks
Usage
It can be used by importing the wide_residial_network script and using the create_wide_residual_network() method. There are several parameters which can be changed to increase the depth or width of the network.
Note that the number of layers can be calculated by the formula : nb_layers = 4 + 6 * N
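from keras.layers import Input
from keras.models import Model
import wide_residial_network as wrn

ip = Input(shape=(3, 32, 32))  # For CIFAR 10 (channels-first ordering)
# N=4 gives 4 + 6 * 4 = 28 layers; k=10 is the widening factor
wrn_28_10 = wrn.create_wide_residual_network(ip, nb_classes=10, N=4, k=10, dropout=0.0, verbose=1)
model = Model(ip, wrn_28_10)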
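The depth formula above can be checked with a small helper. This is only an illustrative sketch: the function names wrn_depth and wrn_n_for_depth are hypothetical and are not part of the wide residual network script.

```python
def wrn_depth(n):
    """Number of layers for a given N, using nb_layers = 4 + 6 * N."""
    return 4 + 6 * n


def wrn_n_for_depth(depth):
    """Invert the formula; raise if depth is not of the form 4 + 6 * N."""
    if depth < 10 or (depth - 4) % 6 != 0:
        raise ValueError("depth must equal 4 + 6 * N for a positive integer N")
    return (depth - 4) // 6


if __name__ == "__main__":
    for n in (2, 4, 6):
        print("N=%d gives WRN-%d" % (n, wrn_depth(n)))
```

For example, the WRN-16-8 and WRN-28-8 weights mentioned in this section correspond to N=2 and N=4 respectively under this formula.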
Contains weights for WRN-16-8 and WRN-28-8 models trained on the CIFAR-10 Dataset.
Available at : Wide Residual Network in Keras
DenseNet in Keras
Implementation of DenseNet from the paper Densely Connected Convolutional Networks.
Usage
- Run the cifar10.py script to train the DenseNet 40 model
- Comment out the model.fit_generator(...) line and uncomment the model.load_weights("weights/DenseNet-40-12-CIFAR10.h5") line to test the classification accuracy.
Contains weights for DenseNet-40-12 and DenseNet-Fast-40-12, trained on CIFAR 10.
Available at : DenseNet in Keras
Residual Networks of Residual Networks in Keras
Implementation of the paper "Residual Networks of Residual Networks: Multilevel Residual Networks"
Usage
To create RoR ResNet models, use the ror.py script:
from keras import backend as K
import ror

# Channels-first for the Theano ordering, channels-last otherwise
input_dim = (3, 32, 32) if K.image_dim_ordering() == 'th' else (32, 32, 3)
model = ror.create_residual_of_residual(input_dim, nb_classes=100, N=2, dropout=0.0) # creates RoR-3-110 (ResNet)
To create RoR Wide Residual Network models, use the ror_wrn.py script:
from keras import backend as K
import ror_wrn as ror

# Channels-first for the Theano ordering, channels-last otherwise
input_dim = (3, 32, 32) if K.image_dim_ordering() == 'th' else (32, 32, 3)
model = ror.create_pre_residual_of_residual(input_dim, nb_classes=100, N=6, k=2, dropout=0.0) # creates RoR-3-WRN-40-2 (WRN)
Contains weights for RoR-3-WRN-40-2 trained on CIFAR 10
Available at : Residual Networks of Residual Networks in Keras
Keras Segmentation Models
A set of models which allow easy creation of Keras models to be used for segmentation tasks.
Fully Connected DenseNets for Semantic Segmentation
Implementation of the paper The One Hundred Layers Tiramisu : Fully Convolutional DenseNets for Semantic Segmentation
Usage
Simply import the densenet_fc.py script and call the create method:
import densenet_fc as dc
model = dc.create_fc_dense_net(img_dim=(3, 224, 224), nb_dense_block=5, growth_rate=12,
                               nb_filter=16, nb_layers=4)
Keras Recurrent Neural Networks
A set of scripts which can be used to add custom Recurrent Neural Networks to Keras.
Chrono Initializer, Chrono LSTM and JANET
Keras implementation of JANET from the paper The unreasonable effectiveness of the forget gate, and of the Chrono initializer and Chrono LSTM from the paper Can Recurrent Neural Networks Warp Time?
JANET uses just 2 of the 4 gates in a regular LSTM, the forget (f) and context (c) gates, and uses Chrono Initialization to achieve better performance than regular LSTMs with fewer parameters and a less complicated gating structure.
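As a rough illustration of Chrono Initialization, the sketch below draws forget gate biases as log(U(1, T_max - 1)) and sets the input gate biases to their negation, following the description in Can Recurrent Neural Networks Warp Time?. The function name chrono_biases and its parameters are hypothetical; this is not the code in chrono_initializer.py, and JANET itself only needs the forget gate part.

```python
import math
import random


def chrono_biases(units, max_timesteps, seed=None):
    """Sketch of chrono initialization for gate biases (hypothetical helper).

    forget bias b_f ~ log(Uniform(1, max_timesteps - 1))
    input bias  b_i = -b_f
    """
    if max_timesteps < 2:
        raise ValueError("max_timesteps must be at least 2")
    rng = random.Random(seed)
    forget = [math.log(rng.uniform(1.0, max_timesteps - 1.0))
              for _ in range(units)]
    input_gate = [-b for b in forget]
    return forget, input_gate


if __name__ == "__main__":
    f, i = chrono_biases(units=4, max_timesteps=784, seed=0)
    print("forget biases:", f)
    print("input biases: ", i)
```

Because the biases depend on the maximum number of timesteps, the layer has to know the sequence length up front, which is why the initializer needs a max timesteps calculation.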
Usage
Simply import the janet.py file into your repo and use the JANET layer.
It is not advisable to use the JANETCell directly wrapped around an RNN layer, as this will not allow the max timesteps calculation that is needed for proper training using the Chrono Initializer for the forget gate.
The chrono_lstm.py script contains the ChronoLSTM model, as it requires minimal modifications to the original LSTM layer to use the ChronoInitializer for the forget and input gates.
The same usage restrictions apply as for the JANET layer: use the ChronoLSTM layer directly instead of the ChronoLSTMCell wrapped around an RNN layer.
from janet import JANET
from chrono_lstm import ChronoLSTM
...
To use just the ChronoInitializer, import the chrono_initializer.py script.
Independently Recurrent Neural Networks (IndRNN)
Implementation of the paper Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN for Keras 2.0+. IndRNN is a recurrent unit that can run over extremely long time sequences and is able to learn the adding problem over 5000 timesteps, where most other models fail.
Usage
Usage of IndRNNCells
from keras.layers import Input
from ind_rnn import IndRNNCell, RNN
cells = [IndRNNCell(128), IndRNNCell(128)]
ip = Input(...)
x = RNN(cells)(ip)
...
Usage of IndRNN layer
from keras.layers import Input
from ind_rnn import IndRNN
ip = Input(...)
x = IndRNN(128)(ip)  # apply the layer to the input tensor
...
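To make the "independently recurrent" idea concrete, here is a minimal pure-Python sketch of the recurrence described in the IndRNN paper, h_t = relu(W x_t + u * h_(t-1) + b), where u is a vector so that each unit only sees its own previous state. The function names and the tiny weights are hypothetical and are not taken from ind_rnn.py.

```python
def ind_rnn_step(x_t, h_prev, W, u, b):
    """One IndRNN step: elementwise recurrent weight u instead of a matrix."""
    h_new = []
    for i in range(len(h_prev)):
        # Input projection for unit i
        pre = sum(W[i][j] * x_t[j] for j in range(len(x_t)))
        # Recurrent term uses only unit i's own previous state
        pre += u[i] * h_prev[i] + b[i]
        h_new.append(max(0.0, pre))  # ReLU activation
    return h_new


def ind_rnn_run(xs, h0, W, u, b):
    """Run the recurrence over a sequence and return the final state."""
    h = list(h0)
    for x_t in xs:
        h = ind_rnn_step(x_t, h, W, u, b)
    return h


if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0]]
    u = [0.5, 2.0]
    b = [0.0, 0.0]
    print(ind_rnn_run([[1.0, 1.0], [1.0, 1.0]], [0.0, 0.0], W, u, b))
```

Stacking several such layers, as in the IndRNNCell example above, is how units from different positions get to interact.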
Simple Recurrent Unit (SRU)
Implementation of the paper Training RNNs as Fast as CNNs for Keras 2.0+. SRU is a recurrent unit that can run over 10 times faster than cuDNN LSTM, without loss of accuracy tested on many tasks, when implemented with a custom CUDA kernel.
This is a naive implementation with some speed gains over the generic LSTM cells; however, its speed is not yet 10x that of cuDNN LSTMs.
Multiplicative LSTM
Implementation of the paper Multiplicative LSTM for sequence modelling for Keras 2.0+. Multiplicative LSTMs have been shown to achieve state-of-the-art or close to SotA results for sequence modelling datasets. They also perform better than stacked LSTM models for the Hutter-prize dataset and the raw wikipedia dataset.
Usage
Add the multiplicative_lstm.py script into your repository, and import the MultiplicativeLSTM layer.
E.g., you can replace Keras LSTM layers with MultiplicativeLSTM layers.
from multiplicative_lstm import MultiplicativeLSTM
Minimal RNN
Implementation of the paper MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks for Keras 2.0+. Minimal RNNs are a new recurrent neural network architecture that achieves performance comparable to the popular gated RNNs with a simplified structure. It employs minimal updates within the RNN, which not only leads to efficient learning and testing but, more importantly, better interpretability and trainability.
Usage
Import minimal_rnn.py and use either the MinimalRNNCell or MinimalRNN layer
from keras.layers import Input
from minimal_rnn import MinimalRNN
# this imports the layer rather than the cell
ip = Input(...) # Rank 3 input shape
x = MinimalRNN(units=128)(ip)
...
Nested LSTM
Implementation of the paper Nested LSTMs for Keras 2.0+. Nested LSTMs add depth to LSTMs via nesting as opposed to stacking. The value of a memory cell in an NLSTM is computed by an LSTM cell, which has its own inner memory cell. Nested LSTMs outperform both stacked and single-layer LSTMs with similar numbers of parameters in the paper's experiments on various character-level language modeling tasks, and the inner memories of an LSTM learn longer term dependencies compared with the higher-level units of a stacked LSTM.
Usage
from keras.layers import Input
from nested_lstm import NestedLSTM
ip = Input(shape=(nb_timesteps, input_dim))
x = NestedLSTM(units=64, depth=2)(ip)
...
Keras Modules
A set of scripts which can be used to add advanced functionality to Keras.
Normalized Optimizers for Keras
Keras wrapper class for Normalized Gradient Descent from kmkolasinski/max-normed-optimizer, which can be applied to almost all Keras optimizers.
Partially implements Block-Normalized Gradient Method: An Empirical Study for Training Deep Neural Network for all base Keras optimizers, and allows flexibility to choose any normalizing function. It does not, however, implement adaptive learning rates.
Usage
from keras.optimizers import Adam, SGD
from optimizer import NormalizedOptimizer
sgd = SGD(0.01, momentum=0.9, nesterov=True)
sgd = NormalizedOptimizer(sgd, normalization='l2')
adam = Adam(0.001)
adam = NormalizedOptimizer(adam, normalization='l2')
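To give an intuition for what an 'l2' normalization of the gradient does, the sketch below divides a gradient by its L2 norm before a plain SGD step, so the step size depends only on the learning rate and the gradient direction. This is an assumption-laden illustration in pure Python: the helper names are hypothetical and it does not reproduce the internals of NormalizedOptimizer.

```python
import math


def l2_normalize_gradient(grad, eps=1e-7):
    """Scale a gradient (list of floats) to roughly unit L2 norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    return [g / (norm + eps) for g in grad]


def normalized_sgd_step(params, grad, lr=0.01):
    """Plain SGD update applied to the normalized gradient."""
    unit = l2_normalize_gradient(grad)
    return [p - lr * g for p, g in zip(params, unit)]


if __name__ == "__main__":
    params = [1.0, 1.0]
    # Two gradients with the same direction but very different magnitudes
    print(normalized_sgd_step(params, [3.0, 4.0]))
    print(normalized_sgd_step(params, [3000.0, 4000.0]))
```

Both calls move the parameters by almost the same amount, which is the point of normalizing the gradient before the base optimizer sees it.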
Tensorflow Eager with Keras APIs
A set of example notebooks and scripts which detail the usage and pitfalls of Eager Execution Mode in Tensorflow using Keras high level APIs.
One Cycle Learning Rate Policy for Keras
Implementation of One-Cycle Learning rate policy from the papers by Leslie N. Smith.
- A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay
- Super-Convergence: Very Fast Training of Residual Networks Using Large Learning Rates
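As a rough sketch of the idea behind the one-cycle policy, the function below ramps the learning rate linearly from a low value up to a maximum and back down over a single cycle, while the momentum moves in the opposite direction. The function name, the default values, and the omission of the final low learning rate phase described in the papers are all simplifying assumptions; this is not the callback shipped with the project.

```python
def one_cycle(step, total_steps, max_lr=0.1, div_factor=10.0,
              max_momentum=0.95, min_momentum=0.85):
    """Return (learning_rate, momentum) for a step of a triangular cycle."""
    if not 0 <= step < total_steps:
        raise ValueError("step must satisfy 0 <= step < total_steps")
    min_lr = max_lr / div_factor
    half = total_steps / 2.0
    # frac is 0 at the start of the cycle, 1 at the midpoint, near 0 at the end
    frac = 1.0 - abs(step - half) / half
    lr = min_lr + (max_lr - min_lr) * frac
    # Momentum is lowest when the learning rate is highest
    momentum = max_momentum - (max_momentum - min_momentum) * frac
    return lr, momentum


if __name__ == "__main__":
    for step in (0, 25, 50, 75, 99):
        lr, mom = one_cycle(step, total_steps=100)
        print("step %3d: lr=%.4f momentum=%.4f" % (step, lr, mom))
```

In Keras, a schedule like this would typically be applied per batch from a custom callback.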
Batch Renormalization
Batch Renormalization algorithm implementation in Keras 1.2.1. Original paper by Sergey Ioffe, Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models.
Usage
Add the batch_renorm.py script into your repository, and import the BatchRenormalization layer.
E.g., you can replace Keras BatchNormalization layers with BatchRenormalization layers.
from batch_renorm import BatchRenormalization
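For readers unfamiliar with the algorithm, here is a minimal pure-Python sketch of the training-time correction described in the Batch Renormalization paper, for a single feature. The batch statistics are corrected towards the moving statistics with the clipped factors r and d. The function name and default clipping limits are hypothetical, and the learned scale and shift, the moving average updates, and the gradient handling of r and d are left out; this is not the code in batch_renorm.py.

```python
import math


def _clip(value, low, high):
    return max(low, min(high, value))


def batch_renorm(batch, moving_mean, moving_std,
                 r_max=3.0, d_max=5.0, eps=1e-5):
    """Normalize one feature of a batch using batch renormalization."""
    n = len(batch)
    mu_b = sum(batch) / n
    sigma_b = math.sqrt(sum((x - mu_b) ** 2 for x in batch) / n + eps)
    # Correction factors; the paper treats them as constants in backprop
    r = _clip(sigma_b / moving_std, 1.0 / r_max, r_max)
    d = _clip((mu_b - moving_mean) / moving_std, -d_max, d_max)
    return [(x - mu_b) / sigma_b * r + d for x in batch]


if __name__ == "__main__":
    print(batch_renorm([1.0, 2.0, 3.0, 4.0], moving_mean=2.0, moving_std=1.0))
```

When r and d are not clipped, the output equals (x - moving_mean) / moving_std, so training and inference normalize in the same way; with r = 1 and d = 0 it reduces to ordinary batch normalization.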
Snapshot Ensembles in Keras
Implementation of the paper Snapshot Ensembles
Usage
The technique is simple to implement in Keras, using a custom callback. These callbacks can be built using the SnapshotCallbackBuilder class in snapshot.py. Other models can use this callback builder to train in a similar manner.
- Download the 6 WRN-16-4 weights that are provided in the Release tab of the project and place them in the weights directory
- Run the train_cifar_10.py script to train the WRN-16-4 model on CIFAR-10 dataset (not required since weights are provided)
- Run the predict_cifar_10.py script to make an ensemble prediction.
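The core of the technique is a cyclic cosine annealing learning rate that restarts once per snapshot. The sketch below follows the schedule given in the Snapshot Ensembles paper, applied per epoch rather than per iteration for simplicity. The function names are hypothetical and this is not the SnapshotCallbackBuilder implementation.

```python
import math


def snapshot_lr(epoch, total_epochs, n_snapshots, initial_lr=0.1):
    """Cosine annealed learning rate with restarts (epoch is 0-indexed)."""
    cycle_len = math.ceil(total_epochs / n_snapshots)
    position = epoch % cycle_len  # position inside the current cycle
    return initial_lr / 2.0 * (math.cos(math.pi * position / cycle_len) + 1.0)


def is_snapshot_epoch(epoch, total_epochs, n_snapshots):
    """True on the last epoch of each cycle, when weights would be saved."""
    cycle_len = math.ceil(total_epochs / n_snapshots)
    return (epoch + 1) % cycle_len == 0


if __name__ == "__main__":
    for epoch in range(0, 20):
        marker = " <- save snapshot" if is_snapshot_epoch(epoch, 50, 5) else ""
        print("epoch %2d: lr=%.5f%s" % (epoch, snapshot_lr(epoch, 50, 5), marker))
```

A function like snapshot_lr could be passed to a Keras learning rate scheduling callback, with the snapshot weights saved at the end of each cycle and averaged at prediction time.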
Contains weights for WRN-CIFAR100-16-4 and WRN-CIFAR10-16-4 (snapshot ensemble weights - ranging from 1-5 and including single best model)
Available at : Snapshot Ensembles in Keras