Hongyu Zhai (`hz2162`)
The project proposal was sent to Professor Wong on Nov 22. After that, I realized that the download link to the training dataset (LSVRC 2010) no longer works. With the professor's permission, I decided to use the CIFAR-100 dataset instead.
As a result of switching datasets, I made some changes to the AlexNet model as well. Specifically, I changed the `input_shape` of the first layer and the size of the output layer. More details can be found in `Implementing AlexNet.ipynb`.
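To make the effect of the smaller input concrete, here is a minimal sketch of how the spatial size shrinks through the strided layers. It assumes the layer settings used in the notebook (`padding='same'`, for which TensorFlow computes the output size as ceil(input / stride)). The helpers `same_out` and `trace` are hypothetical names used only for illustration.

```python
import math

def same_out(size: int, stride: int) -> int:
    """Output size of a conv/pool layer with padding='same' (TensorFlow convention)."""
    return math.ceil(size / stride)

# (layer name, stride) for every layer that can change the spatial size
strides = [("conv1", 4), ("pool1", 2), ("conv2", 1), ("pool2", 2),
           ("conv3", 1), ("conv4", 1), ("conv5", 1), ("pool5", 2)]

def trace(size: int):
    """Follow one spatial dimension through the network."""
    sizes = []
    for name, stride in strides:
        size = same_out(size, stride)
        sizes.append((name, size))
    return sizes

cifar_sizes = trace(32)  # CIFAR-100 images are 32x32
for name, size in cifar_sizes:
    print(f"{name}: {size}x{size}")
```

Under these assumptions the feature map is already 1×1 after the last pooling layer, so the flattened vector fed to the first fully-connected layer has 256 values.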
This project requires the following dependencies to work:
- `sklearn`: for assessing the performance
- `matplotlib`: for displaying images and making plots
- `tensorflow.keras`: for constructing the neural network
  - also for loading the CIFAR-100 dataset
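A quick way to check that these dependencies are importable before running the project is sketched below. It uses only the standard library. The helper `missing_modules` is a hypothetical name, and the list uses import names, which may differ from the package names used at install time (for example, `sklearn` is installed as `scikit-learn`).

```python
import importlib.util

# Top-level import names of the dependencies listed above.
required = ["sklearn", "matplotlib", "tensorflow"]

def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules(required)
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All dependencies are available.")
```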
Two versions are included: `Implementing AlexNet.py` and `Implementing AlexNet.ipynb`. The Jupyter Notebook version is recommended because the user can explore the code step by step.
# # CS 6433 Project 2: Implementing AlexNet
#
# Hongyu Zhai (`hz2162`)
import numpy as np
from tensorflow import keras
import matplotlib.pyplot as plt
# Loading the Dataset
# Keras provides a function to load CIFAR-100
from tensorflow.keras.datasets import cifar100
(X_train, y_train), (X_test, y_test) = cifar100.load_data(label_mode="fine")
n_test = X_test.shape[0]
n_train = X_train.shape[0]
img_shape = X_train.shape[1:]
print("Number of training examples:", n_train)
print("Number of testing examples:", n_test)
print("Shape of input images:", img_shape)
# First Convolution Layer
#
# > "The first convolutional layer filters the 224×224×3 input image with 96 kernels of size 11×11×3 with a stride of 4 pixels."
# > "The ReLU non-linearity is applied to the output of every convolutional and fully-connected layer."
# > "Response-normalization layers follow the first and second convolutional layers."
# > "Max-pooling layers, of the kind described in Section 3.4, follow both response-normalization layers as well as the fifth convolutional layer."
# layer #1 conv #1
l1_conv1 = keras.layers.Conv2D(
    input_shape=img_shape,
    filters=96,
    kernel_size=(11, 11),
    strides=(4, 4),
    padding='same',
    activation='relu',
)
# the paper places a response-normalization layer after the first conv layer;
# BatchNormalization is used here as a substitute
l1_conv1_norm = keras.layers.BatchNormalization()
# max pooling layer with s = 2, and z = 3
l1_conv1_pool = keras.layers.MaxPooling2D(
    pool_size=(3, 3),
    strides=(2, 2),
    padding='same',
)
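The code above uses `keras.layers.BatchNormalization` where the paper describes a response-normalization layer. For comparison, the following is a minimal NumPy sketch of the paper's local response normalization, using the constants reported there (k = 2, n = 5, α = 1e-4, β = 0.75). The function `local_response_norm` is a hypothetical helper written for illustration; it is not part of this project's model.

```python
import numpy as np

def local_response_norm(a, k=2.0, n=5, alpha=1e-4, beta=0.75):
    """Divide each channel by a term built from its n neighbouring channels.

    `a` has shape (height, width, channels).
    """
    channels = a.shape[-1]
    squared = a.astype(np.float64) ** 2
    out = np.empty(a.shape, dtype=np.float64)
    for i in range(channels):
        lo = max(0, i - n // 2)
        hi = min(channels - 1, i + n // 2)
        # sum of squared activations over the neighbouring channels
        denom = (k + alpha * squared[..., lo:hi + 1].sum(axis=-1)) ** beta
        out[..., i] = a[..., i] / denom
    return out

# Made-up activations with the shape produced by the first conv layer on CIFAR input
activations = np.ones((8, 8, 96))
normalized = local_response_norm(activations)
print(normalized.shape)
```

Unlike batch normalization, this operation has no learned parameters and does not depend on the other examples in the batch.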
# Second Convolution Layer
#
# > "The second convolutional layer takes as input the (response-normalized and pooled) output of the first convolutional layer and filters it with 256 kernels of size 5 × 5 × 48"
# layer #2 conv #2
l2_conv2 = keras.layers.Conv2D(
    filters=256,
    kernel_size=(5, 5),
    strides=(1, 1),
    padding='same',
    activation='relu',
)
# the paper places a response-normalization layer after the second conv layer;
# BatchNormalization is used here as a substitute
l2_conv2_norm = keras.layers.BatchNormalization()
# max pooling layer with s = 2, and z = 3
l2_conv2_pool = keras.layers.MaxPooling2D(
    pool_size=(3, 3),
    strides=(2, 2),
    padding='same',
)
# Third Convolution Layer
#
# > "The third, fourth, and fifth convolutional layers are connected to one another without any intervening pooling or normalization layers."
# > "The third convolutional layer has 384 kernels of size 3 × 3 × 256 connected to the (normalized, pooled) outputs of the second convolutional layer."
# layer #3 conv #3
l3_conv3 = keras.layers.Conv2D(
    filters=384,
    kernel_size=(3, 3),
    strides=(1, 1),
    padding='same',
    activation='relu',
)
# Fourth Convolution Layer
#
# > "The fourth convolutional layer has 384 kernels of size 3 × 3 × 192"
# layer #4 conv #4
l4_conv4 = keras.layers.Conv2D(
    filters=384,
    kernel_size=(3, 3),
    strides=(1, 1),
    padding='same',
    activation='relu',
)
# Fifth Convolution Layer
#
# > "the fifth convolutional layer has 256 kernels of size 3 × 3 × 192."
# layer #5 conv #5
l5_conv5 = keras.layers.Conv2D(
    filters=256,
    kernel_size=(3, 3),
    strides=(1, 1),
    padding='same',
    activation='relu',
)
# max pooling layer with s = 2, and z = 3
l5_conv5_pool = keras.layers.MaxPooling2D(
    pool_size=(3, 3),
    strides=(2, 2),
    padding='same',
)
# First Fully-Connected Layer
#
# > "The fully-connected layers have 4096 neurons each."
# > "We use dropout in the first two fully-connected layers."
# flatten before feeding to FC layers
l6_fc1_flat = keras.layers.Flatten()
# layer #6 fc #1
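# no input_shape here: this is not the first layer, so Keras infers
# the input size from the output of the Flatten layer
l6_fc1 = keras.layers.Dense(
    4096,
    activation='relu',
)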
# dropout with rate 0.5
l6_fc1_dropout = keras.layers.Dropout(0.5)
# Second Fully-Connected Layer
# layer #7 fc #2
l7_fc2 = keras.layers.Dense(
    4096,
    activation='relu',
)
# dropout with rate 0.5
l7_fc2_dropout = keras.layers.Dropout(0.5)
# Third Fully-Connected Layer
#
# > "The output of the last fully-connected layer is fed to a 1000-way softmax which produces a distribution over the 1000 class labels"
# layer #8 fc #3
l8_fc3 = keras.layers.Dense(
    100,  # CIFAR-100 has 100 classes (the paper's setup has 1000)
    activation='softmax',
)
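As an illustration of what the softmax activation in the output layer computes, here is a minimal pure-Python sketch. The function `softmax` below is a stand-alone helper written for this example, not the Keras implementation, and the input scores are made up.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)

# With 100 equal scores, every class gets probability 1/100
uniform = softmax([0.0] * 100)
print(uniform[0])
```

The predicted class is the index of the largest probability, which is why `np.argmax` is applied to the model output later on.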
# Put Everything Together
AlexNet = keras.models.Sequential()
# first conv layer
AlexNet.add(l1_conv1)
AlexNet.add(l1_conv1_norm)
AlexNet.add(l1_conv1_pool)
# second conv layer
AlexNet.add(l2_conv2)
AlexNet.add(l2_conv2_norm)
AlexNet.add(l2_conv2_pool)
# third conv layer
AlexNet.add(l3_conv3)
# fourth conv layer
AlexNet.add(l4_conv4)
# fifth conv layer
AlexNet.add(l5_conv5)
AlexNet.add(l5_conv5_pool)
# first fc layer
AlexNet.add(l6_fc1_flat)
AlexNet.add(l6_fc1)
AlexNet.add(l6_fc1_dropout)
# second fc layer
AlexNet.add(l7_fc2)
AlexNet.add(l7_fc2_dropout)
# third fc layer
AlexNet.add(l8_fc3)
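The size of the assembled model can be estimated by hand. The sketch below counts weights and biases for the convolution and fully-connected layers, assuming 32×32×3 inputs and `padding='same'` throughout (so the flattened vector has 1×1×256 values). The helpers `conv_params` and `dense_params` are hypothetical names, and the BatchNormalization parameters are left out.

```python
def conv_params(kernel, in_ch, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return kernel * kernel * in_ch * filters + filters

def dense_params(in_units, out_units):
    """Weights plus one bias per output unit for a Dense layer."""
    return in_units * out_units + out_units

counts = {
    "conv1": conv_params(11, 3, 96),
    "conv2": conv_params(5, 96, 256),
    "conv3": conv_params(3, 256, 384),
    "conv4": conv_params(3, 384, 384),
    "conv5": conv_params(3, 384, 256),
    "fc1": dense_params(1 * 1 * 256, 4096),  # flattened 1x1x256 feature map
    "fc2": dense_params(4096, 4096),
    "fc3": dense_params(4096, 100),
}
for name, n in counts.items():
    print(f"{name}: {n:,}")
print(f"total (excluding BatchNormalization): {sum(counts.values()):,}")
```

These figures can be compared against the output of `AlexNet.summary()`.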
# compile the Sequential model
AlexNet.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Testing Our Model
# one-hot encoding the labels
from tensorflow.keras.utils import to_categorical
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
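To show what one-hot encoding does to the labels, here is a minimal NumPy sketch. The function `one_hot` is a hypothetical helper that mimics the behaviour relied on here; it is not the actual implementation of `to_categorical`, and the example labels are made up.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Map integer class labels to rows with a single 1 at the label index."""
    labels = np.asarray(labels).reshape(-1)  # CIFAR labels arrive with shape (n, 1)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example = one_hot([[0], [2], [1]], num_classes=3)
print(example)
```

This is the label format that the `categorical_crossentropy` loss expects.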
# train the model using training images
AlexNet.fit(X_train, y_train, batch_size=32, epochs=10)
# make predictions on testing images
y_predicted = AlexNet.predict(X_test)
# report accuracy score of the model
from sklearn.metrics import accuracy_score
score = accuracy_score(
    np.argmax(y_test, axis=1),       # true labels come first
    np.argmax(y_predicted, axis=1),  # predicted labels
)
print("The accuracy score of our model is", score)
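The accuracy computed above is the fraction of test images whose most probable predicted class matches the true class. A minimal NumPy sketch of the same calculation follows. The function `argmax_accuracy` is a hypothetical helper, and the demo arrays are made-up values, not results from the model.

```python
import numpy as np

def argmax_accuracy(y_true_onehot, y_prob):
    """Fraction of rows whose most probable class matches the one-hot label."""
    true_labels = np.argmax(y_true_onehot, axis=1)
    predicted_labels = np.argmax(y_prob, axis=1)
    return float(np.mean(true_labels == predicted_labels))

# Made-up example: 4 samples, 3 classes
y_true_demo = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
y_prob_demo = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.8, 0.1],
                        [0.5, 0.3, 0.2],
                        [0.2, 0.6, 0.2]])
demo_score = argmax_accuracy(y_true_demo, y_prob_demo)
print(demo_score)
```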
- Original Paper: https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
- CIFAR-100 Dataset
  - Website: https://www.cs.toronto.edu/~kriz/cifar.html
  - Keras API to load data: https://keras.io/api/datasets/cifar100/
- Keras API
  - `Sequential` model: https://keras.io/api/models/sequential/
  - Convolution layer: https://keras.io/api/layers/convolution_layers/convolution2d/
  - Normalization layer: https://keras.io/api/layers/normalization_layers/batch_normalization/
  - Max Pooling layer: https://keras.io/api/layers/pooling_layers/max_pooling2d/
  - Flatten layer: https://keras.io/api/layers/reshaping_layers/flatten/
  - Fully-Connected layer: https://keras.io/api/layers/core_layers/dense/
  - Dropout layer: https://keras.io/api/layers/regularization_layers/dropout/
  - Compiling the model: https://keras.io/api/models/model_training_apis/
- Other functions used:
  - `keras.utils.to_categorical`: https://keras.io/api/utils/python_utils/
  - `numpy.random.random`: https://numpy.org/doc/stable/reference/random/generated/numpy.random.random.html
  - `sklearn.metrics.accuracy_score`: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html