ml-lab / meaning-guided-video-captioning-

This project is forked from captanlevi/meaning-guided-video-captioning-



Meaning-guided-video-captioning-

This is the code for our research paper on meaning-guided video captioning. The code is written in PyTorch. We describe a new approach to training a video captioning neural network that is based not only on the usual cross-entropy loss for the caption but also on the meaning of the caption.
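This idea can be sketched as a combined objective. The following is a minimal illustration only: the meaning term (here, one minus the cosine similarity of sentence embeddings), the source of those embeddings, and the weight `alpha` are all assumptions for the sketch, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def combined_loss(logits, targets, pred_embed, ref_embed, alpha=0.5):
    """Hypothetical combination of caption cross-entropy and a 'meaning' term.

    logits:     (batch, seq_len, vocab) raw scores from the captioning model
    targets:    (batch, seq_len) ground-truth token ids
    pred_embed: (batch, dim) sentence embedding of the generated caption
    ref_embed:  (batch, dim) sentence embedding of the reference caption
    alpha:      assumed trade-off weight, not specified here
    """
    # Standard token-level cross-entropy over the whole sequence.
    ce = F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    # Meaning loss sketched as 1 - cosine similarity between sentence embeddings.
    meaning = (1.0 - F.cosine_similarity(pred_embed, ref_embed, dim=-1)).mean()
    return ce + alpha * meaning
```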

All the code files should be run in Jupyter Notebook. The first cell of each notebook lists the required Python modules and the required files and their paths; set these according to your setup.

How to use the pretrained model: to generate a caption, open the caption.ipynb file and run all cells; the caption is generated after the last cell.
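Under the hood, caption generation amounts to autoregressive decoding from the video features. The following greedy-decoding loop is a sketch only; `step_fn`, `bos_id`, and `eos_id` are assumed names for illustration, not the notebook's actual API.

```python
import torch

def greedy_caption(step_fn, video_feats, bos_id, eos_id, max_len=20):
    """Greedy decoding: repeatedly feed back the argmax token until EOS.

    step_fn(video_feats, tokens) -> (batch, vocab) logits for the next token.
    Returns the (batch, generated_len) token-id tensor including BOS.
    """
    batch = video_feats.size(0)
    tokens = torch.full((batch, 1), bos_id, dtype=torch.long)
    for _ in range(max_len):
        logits = step_fn(video_feats, tokens)
        nxt = logits.argmax(dim=-1, keepdim=True)  # most likely next token
        tokens = torch.cat([tokens, nxt], dim=1)
        if (nxt == eos_id).all():  # stop once every sequence emitted EOS
            break
    return tokens
```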

How to train the model....

  1. First look at the extract_features file. It is responsible for extracting object features, VGG16 features, and ResNet features. Running this file saves all the extracted features at the specified destination.

  2. Pretrain the video model. In main_train.ipynb there are three methods to train the model: train1, train2, and train3. train1 uses regular cross-entropy loss. train2 lets the model generate a caption on its own and then applies the metric loss described in the paper. train3 also lets the model generate a caption, but applies cross-entropy loss.

During pretraining, use train1 only and let the model converge. Then use a combination of train1 and train3, as described in the paper.

  3. Pretrain the metric. The metric is a bidirectional GRU whose definition is provided in main_train itself. Pretraining it yields better results; the procedure is described in the paper.
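As an illustration of the metric network, here is a minimal bidirectional GRU sentence encoder in PyTorch. The vocabulary size and layer sizes are assumptions for the sketch, not the values used in main_train.

```python
import torch
import torch.nn as nn

class CaptionMetric(nn.Module):
    """Bidirectional GRU that maps a token-id sequence to a sentence embedding."""

    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim,
                          batch_first=True, bidirectional=True)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> embedding of shape (batch, 2 * hidden_dim)
        _, h_n = self.gru(self.embed(token_ids))
        # h_n: (2, batch, hidden_dim); concatenate final forward/backward states.
        return torch.cat([h_n[0], h_n[1]], dim=-1)
```

A sentence embedding like this can be compared between a generated and a reference caption (e.g. by cosine similarity) to score how close the two captions are in meaning.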

