batserine / image_captioning

image-captioning keras vgg16 lstm datacleaning progressive-image-loading bleu-score flickr8k-dataset google-colab word2vec

Image_Captioning

Image captioning using deep learning models in Keras. The models were trained on the Flickr_8k dataset using Google Colab.

Objectives:

  1. Prepare photo and text data for training a deep learning model.
  2. Design and train a deep learning model.
  3. Evaluate the model.
  4. Use the model to generate captions for new pictures.
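One common way to evaluate generated captions (objective 3) is the BLEU score. The following is a minimal sketch of unigram BLEU (BLEU-1) in plain Python: clipped unigram precision multiplied by a brevity penalty. The function name `bleu1` and the sample captions are hypothetical and not taken from the repository; a real evaluation would more likely rely on a library implementation such as NLTK's `corpus_bleu`.

```python
import math
from collections import Counter


def bleu1(candidate, references):
    """Unigram BLEU for one candidate caption against several reference captions.

    candidate: list of tokens; references: list of token lists.
    """
    if not candidate:
        return 0.0

    cand_counts = Counter(candidate)

    # Clip each candidate word count by the largest count seen in any single reference.
    max_ref_counts = Counter()
    for ref in references:
        for word, count in Counter(ref).items():
            max_ref_counts[word] = max(max_ref_counts[word], count)
    clipped = sum(min(count, max_ref_counts[word]) for word, count in cand_counts.items())
    precision = clipped / len(candidate)

    # The brevity penalty uses the reference length closest to the candidate length.
    ref_len = min((len(ref) for ref in references),
                  key=lambda n: (abs(n - len(candidate)), n))
    bp = 1.0 if len(candidate) > ref_len else math.exp(1 - ref_len / len(candidate))
    return bp * precision


# Hypothetical captions, for illustration only.
refs = ["a dog runs on the grass".split(), "a dog is running outside".split()]
print(bleu1("a dog runs outside".split(), refs))
```

Clipping keeps a caption that repeats one correct word from scoring well, and the brevity penalty keeps very short captions from scoring well on precision alone.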

Captions are encoded using a word-to-index procedure.
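A word-to-index procedure can be sketched roughly as follows, assuming captions are already cleaned and wrapped in `startseq` / `endseq` markers. The helper names `build_vocab` and `encode`, the marker tokens, and the choice to reserve index 0 for padding are assumptions made for this illustration, not details confirmed by the repository.

```python
from collections import Counter


def build_vocab(captions, min_count=1):
    """Map each word that appears at least min_count times to an integer index.

    Index 0 is left unused so it can serve as the padding value.
    """
    counts = Counter(word for caption in captions for word in caption.split())
    words = sorted(w for w, c in counts.items() if c >= min_count)
    word_to_index = {w: i for i, w in enumerate(words, start=1)}
    index_to_word = {i: w for w, i in word_to_index.items()}
    return word_to_index, index_to_word


def encode(caption, word_to_index):
    """Turn a caption into a list of indices, skipping out-of-vocabulary words."""
    return [word_to_index[w] for w in caption.split() if w in word_to_index]


# Hypothetical captions, for illustration only.
captions = ["startseq a dog runs endseq", "startseq a child plays endseq"]
word_to_index, index_to_word = build_vocab(captions)
print(encode("startseq a dog plays endseq", word_to_index))
```

Raising `min_count` drops rare words, which shrinks the vocabulary and the output layer of the model at the cost of losing some detail in the captions.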

Steps:

  1. Data collection
  2. Understanding the data
  3. Data Cleaning
  4. Loading the training set
  5. Data Preprocessing: Images
  6. Data Preprocessing: Captions
  7. Data Preparation using Generator Function
  8. Word Embeddings
  9. Model Architecture
  10. Inference
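Step 7 is the least obvious of these, so here is a minimal sketch of how a generator function might expand each encoded caption into training samples: every prefix of the caption becomes an input and the word that follows it becomes the target. The names `pad_left` and `caption_pairs`, the image id, and the index values are hypothetical. In a full pipeline the image id would be swapped for that image's feature vector and the target would typically be one-hot encoded, both of which are omitted here.

```python
def pad_left(seq, length, pad_value=0):
    """Left-pad (or left-truncate) a list of indices to a fixed length."""
    seq = seq[-length:]
    return [pad_value] * (length - len(seq)) + seq


def caption_pairs(encoded_captions, max_length):
    """Yield (image_id, padded_prefix, next_word) one training sample at a time.

    encoded_captions: dict mapping image id -> list of index-encoded captions.
    """
    for image_id, captions in encoded_captions.items():
        for seq in captions:
            # Each prefix of the caption predicts the word that follows it.
            for i in range(1, len(seq)):
                yield image_id, pad_left(seq[:i], max_length), seq[i]


# Hypothetical encoding of "startseq a dog endseq".
encoded = {"img_1": [[7, 1, 3, 4]]}
for sample in caption_pairs(encoded, max_length=5):
    print(sample)
```

Yielding samples lazily avoids holding every prefix of every caption in memory at once, which matters when each image has several captions. Left padding mirrors the default behaviour of Keras `pad_sequences`.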

Dataset:

After requesting the dataset from the author's website, I received these two files:

  1. Flickr8k_Dataset: Contains 8092 photographs in JPEG format.
  2. Flickr8k_text: Contains several files with descriptions of the photographs from different sources.

The dataset has a pre-defined training dataset (6,000 images), development dataset (1,000 images), and test dataset (1,000 images).
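Loading and cleaning the descriptions might look roughly like the sketch below. It assumes each line of a description file has the form `name.jpg#n<TAB>caption`, which is not spelled out in this README, and it uses a small in-memory sample with made-up file names rather than the real files. The helper names `parse_captions` and `clean_caption` are hypothetical.

```python
import string


def parse_captions(text):
    """Group captions by image id from lines like 'name.jpg#0<TAB>caption text'."""
    captions = {}
    for line in text.strip().split("\n"):
        if "\t" not in line:
            continue
        key, caption = line.split("\t", 1)
        image_id = key.split("#")[0].rsplit(".", 1)[0]  # drop '#n' and the extension
        captions.setdefault(image_id, []).append(caption)
    return captions


def clean_caption(caption):
    """Lowercase, strip punctuation, and drop one-letter or non-alphabetic tokens."""
    table = str.maketrans("", "", string.punctuation)
    words = caption.lower().translate(table).split()
    return " ".join(w for w in words if len(w) > 1 and w.isalpha())


# Made-up sample in the assumed file format.
sample = (
    "img_001.jpg#0\tA child in a pink dress climbs the stairs .\n"
    "img_001.jpg#1\tA little girl goes into a wooden building .\n"
    "img_002.jpg#0\tTwo dogs play on the road !\n"
)
captions = {k: [clean_caption(c) for c in v] for k, v in parse_captions(sample).items()}
print(captions)
```

Restricting the resulting dictionary to the image ids listed in the pre-defined training split would then give the training captions.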

Deployment:

A basic web app built with Flask takes an image as input and generates a caption for it.

Comments:

  1. The results are not accurate because the model was trained for only 5 epochs, owing to limited GPU time on Google Colab.
  2. Using checkpoints could make a difference; this will be updated.
  3. Other techniques remain to be tried, such as different pretrained models for feature extraction and word2vec for token generation.
  4. This implementation follows the tutorial by Jason Brownlee (Machine Learning Mastery).

References:

  1. Flickr_8k Dataset
  2. Deep learning caption generation - Jason Brownlee
