Predicting Image Memorability with MemNet in Keras on PowerAI

This code pattern will enable you to build an application that predicts how "unique" or "memorable" images are. You'll do this through the Keras deep learning library, using the MemNet architecture. The dataset this neural network will be trained on is called "LaMem" (Large-scale Image Memorability), by MIT. In order to process the 45,000 training images and 10,000 testing images (227x227 RGB) efficiently, we'll be training the neural network on a PowerAI machine on NIMBIX, enabling us to benefit from NVLink (direct CPU-GPU memory interconnect) without needing any extra code. Once the model has been trained on PowerAI, we'll convert it to a CoreML model and expose it via a web application written in Swift, running on a Kitura server on macOS.

When the reader has completed this pattern, they'll understand how to:

Train a Keras model on PowerAI.
Use a custom loss function with a Keras model.
Convert tf.keras models that deal with images to CoreML models.
Use the Apple Vision framework with a CoreML model in Swift to get VNCoreMLFeatureValueObservations.
Host a web server with Kitura.
Expose a Mustache HTTP template through Kitura.

Flow

TODO: add flow diagram

A Keras model is trained with the LaMem dataset.
The Keras model is converted to a CoreML model.
The user uploads their image to the Kitura web app.
The Kitura web app uses the CoreML model for predictions.
The user receives the neural network's prediction.

Included Components

IBM Power Systems: A server built with open technologies and designed for mission-critical applications.
IBM PowerAI: A software platform that makes deep learning, machine learning, and AI more accessible and better performing.
Kitura: Kitura is a free and open-source web framework written in Swift, developed by IBM and licensed under Apache 2.0. It’s an HTTP server and web framework for writing Swift server applications.

Featured Technologies

Artificial Intelligence: Artificial intelligence can be applied to disparate solution spaces to deliver disruptive technologies.
Swift on the Server: Build powerful, fast and secure server side Swift apps for the Cloud.

Prerequisites

If you don't already have a PowerAI server, you can acquire one from Nimbix or from the PowerAI offering on IBM Cloud.
macOS 10.13 (High Sierra) or later

Steps

Clone the repo
Download and extract the LaMem data
Train the Keras model
Convert the Keras model to a CoreML model
Run the Kitura web app

1. Clone the repo

Clone the powerai-image-memorability repo onto both your PowerAI server and local macOS machine. In a terminal, run:

git clone https://www.github.com/IBM/powerai-image-memorability

2. Download and extract the LaMem data

To download the LaMem dataset, head over to the powerai_serverside directory, and run the following command:

wget http://memorability.csail.mit.edu/lamem.tar.gz

Once the dataset is done downloading, run the following command to extract that data:

tar -xvf lamem.tar.gz
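
After extraction, the archive should contain split files such as lamem/splits/train_1.txt (the path used in the training code quoted further down in this document). The following is a minimal sketch of reading such a file, assuming each line holds an image file name followed by a memorability score, separated by whitespace. The helpers `parse_split_lines` and `load_split_file` are hypothetical and are not necessarily how the repo's own `load_split` works.

```python
from typing import Iterable, List, Tuple


def parse_split_lines(lines: Iterable[str]) -> List[Tuple[str, float]]:
    """Parse '<image name> <score>' lines into (name, score) pairs.

    Assumption: whitespace-separated columns, one image per line.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, score = line.split()
        pairs.append((name, float(score)))
    return pairs


def load_split_file(path: str) -> List[Tuple[str, float]]:
    """Read a split file from disk (hypothetical helper)."""
    with open(path) as handle:
        return parse_split_lines(handle)


# Made-up sample lines, for illustration only.
sample = ["00000001.jpg 0.75\n", "\n", "00000002.jpg 0.5\n"]
pairs = parse_split_lines(sample)
print(pairs)
```

Printing a few parsed pairs before training is a quick way to confirm that the download and extraction produced the layout you expect.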

3. Train the Keras model

To train the Keras model, run the following command inside of the powerai_serverside directory:

python train.py

Once the Python script finishes running, you'll see a memnet_model.h5 file in the powerai_serverside directory. Copy that file into the webapp directory on the macOS machine where you'd like to run the frontend.
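
The learning goals mention a custom loss function, and the training call quoted further down in this document compiles the model with one named euclidean_distance_loss. Its definition is not shown here, so the following is only a plain-Python sketch of the arithmetic such a loss usually performs, assuming it is the square root of the summed squared differences per sample. The function names are hypothetical; a Keras version would express the same formula with tensor operations.

```python
import math
from typing import Sequence


def euclidean_distance(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Euclidean (L2) distance between one target vector and one prediction."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)))


def mean_euclidean_loss(batch_true, batch_pred) -> float:
    """Average the per-sample distances over a batch."""
    distances = [euclidean_distance(t, p) for t, p in zip(batch_true, batch_pred)]
    return sum(distances) / len(distances)


# With a single output value per image, the distance is just |pred - true|.
batch_true = [[0.5], [1.0]]
batch_pred = [[0.75], [0.5]]
loss = mean_euclidean_loss(batch_true, batch_pred)
print(loss)
```

Under this assumption, a network that ends in a single output unit is effectively trained on the absolute error of each predicted score.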

4. Convert the Keras model to a CoreML model

Inside of the webapp directory on your macOS machine, run the following Python script to convert your Keras model to a CoreML model:

python convert_model.py memnet_model.h5

This may take a few minutes, but when you're done, you should see a lamem.mlmodel file in the webapp directory.

5. Run the Kitura web app

You're now ready to roll! Run the following command to build and run your application:

swift build && swift run

Now, you can head over to localhost:3333 in your favourite web browser, upload an image, and calculate its memorability.

TODO: add screenshot

Predicting score problem

Hello, I am trying to use your code with Python only, without Kitura and Swift. No matter which image I use, every prediction has a memorability score of at least ~0.79; even a completely white image scores ~0.83. Please help. Here is my code:

...
model = Sequential()
model.add(Conv2D(96, (11, 11), (4, 4), activation="relu", input_shape=(227, 227, 3)))
model.add(MaxPooling2D((3, 3), (2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(256, (5, 5), activation="relu"))
model.add(ZeroPadding2D((2, 2)))
model.add(MaxPooling2D((3, 3), (2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(384, (3, 3), activation="relu"))
model.add(ZeroPadding2D((1, 1)))
model.add(Conv2D(384, (3, 3), activation="relu"))
model.add(ZeroPadding2D((1, 1)))
model.add(Conv2D(256, (3, 3), activation="relu"))
model.add(ZeroPadding2D((1, 1)))
model.add(MaxPooling2D((3, 3), (2, 2)))
model.add(GlobalAveragePooling2D())
model.add(Dense(4096, activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(4096, activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(1))

train_split = load_split("../lamem/splits/train_1.txt")
test_split = load_split("../lamem/splits/test_1.txt")
batch_size = 64 * 4

train_gen = lamem_generator(train_split, batch_size=batch_size)
test_gen = lamem_generator(test_split, batch_size=batch_size)

model.compile("adam", euclidean_distance_loss)
model.fit(train_gen, steps_per_epoch=int(len(train_split) / batch_size), epochs=5, verbose=1, validation_data=test_gen, 
validation_steps=int(len(test_split) / batch_size))

model.save("memnet_model2.h5")
# I also tried saving the weights separately.
model.save_weights('memnet_model2_w')

Then, in another script:

model = tf.keras.models.load_model(
    'memnet_model2.h5',
    custom_objects={'euclidean_distance_loss': euclidean_distance_loss},
)
# I also tried load_weights, hoping it might help.
# model.load_weights('memnet_model2_w')

def load_image(image_file):
    # Resize to the 227x227 network input, force RGB, and scale pixels to [0, 1].
    return np.array(Image.open(image_file).resize((227, 227)).convert("RGB"), dtype="float32") / 255.

test_img = mp.Pool().map(load_image, ['predict/7.png'])
test_img = np.array(test_img)
print(test_img.shape)
# test_img.reshape(-1, 227, 227, 3)
print(np.array(test_img).shape)

prediction = model.predict(np.array(test_img))
print(prediction)
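
One way to narrow down a symptom like this is to check whether the predictions barely vary across very different images and sit close to the mean training score, which would suggest the network has learned little beyond a constant. The following is a minimal sketch of that check in plain Python. The function `looks_constant`, its tolerance, and all sample numbers are hypothetical, and this is a diagnostic aid rather than a confirmed explanation of the problem.

```python
import statistics
from typing import Sequence


def looks_constant(predictions: Sequence[float],
                   train_scores: Sequence[float],
                   tolerance: float = 0.05) -> bool:
    """Return True when predictions vary little and hover near the label mean.

    tolerance is an arbitrary illustrative threshold, not a tuned value.
    """
    spread = statistics.pstdev(predictions)
    offset = abs(statistics.mean(predictions) - statistics.mean(train_scores))
    return spread < tolerance and offset < tolerance


# Made-up numbers for illustration only.
train_scores = [0.6, 0.7, 0.8, 0.9, 1.0]     # mean is 0.8
flat_predictions = [0.79, 0.80, 0.83, 0.81]  # nearly identical outputs
varied_predictions = [0.45, 0.62, 0.83, 0.95]

print(looks_constant(flat_predictions, train_scores))
print(looks_constant(varied_predictions, train_scores))
```

In the scripts above, the predictions would come from calling model.predict on several clearly different images, and the training scores from the loaded training split.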
