Attend to You: Personalized Image Captioning with Context Sequence Memory Networks. In CVPR, 2017.

Attend2u

This project hosts the code for our CVPR 2017 paper.

  • Cesc Chunseong Park, Byeongchang Kim and Gunhee Kim. Attend to You: Personalized Image Captioning with Context Sequence Memory Networks. In CVPR, 2017. (Spotlight) [arxiv]

We address personalization issues of image captioning, which have not been discussed yet in previous research. For a query image, we aim to generate a descriptive sentence, accounting for prior knowledge such as the user's active vocabularies in previous documents. As applications of personalized image captioning, we tackle two post automation tasks: hashtag prediction and post generation, on our newly collected Instagram dataset, consisting of 1.1M posts from 6.3K users. We propose a novel captioning model named Context Sequence Memory Network (CSMN).
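As a rough illustration of what a user's "active vocabulary" could look like, the sketch below collects the most frequent words from a user's previous posts. This is only an assumption for illustration: the function name `active_vocabulary`, the whitespace tokenization, and the raw frequency ranking are hypothetical and are not taken from the paper or this repository, which may select words differently.

```python
from collections import Counter


def active_vocabulary(previous_posts, top_k=3):
    """Return the top_k most frequent lowercase tokens across a user's posts.

    Hypothetical helper: plain whitespace tokenization and raw counts,
    not the selection procedure used by the authors.
    """
    counts = Counter()
    for post in previous_posts:
        counts.update(post.lower().split())
    # most_common() orders tokens by descending count.
    return [word for word, _ in counts.most_common(top_k)]


posts = ["coffee time", "morning coffee", "coffee and morning run"]
print(active_vocabulary(posts, top_k=2))  # ['coffee', 'morning']
```

Such a word list could then serve as the per-user prior knowledge that a captioning model conditions on.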

Reference

If you use this code as part of any published research, please cite the following paper.

@inproceedings{attend2u:2017:CVPR,
    author    = {Cesc Chunseong Park and Byeongchang Kim and Gunhee Kim},
    title     = "{Attend to You: Personalized Image Captioning with Context Sequence Memory Networks}",
    booktitle = {CVPR},
    year      = 2017
}

Running Code

Get our code

git clone https://github.com/cesc-park/attend2u

Prerequisites

  1. Install the Python modules
     pip install -r requirements.txt
  2. Download the pre-trained ResNet checkpoint
     cd ${project_root}/scripts
     ./download_pretrained_resnet_101.sh
  3. Download our dataset (coming soon)
     cd ${project_root}/scripts
     ./download_dataset.sh
  4. Generate the formatted dataset and extract ResNet-101 pool5 features
     cd ${project_root}/scripts
     ./extract_features.sh
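The prerequisite steps above can also be chained from Python. The following is a minimal sketch, not part of the repository: the helper names `setup_commands` and `run_setup` are hypothetical, and it assumes that `requirements.txt` sits in the project root and that the three scripts are executable inside `scripts/`.

```python
import os
import subprocess

# Script names as listed in the prerequisites, in the order they are run.
SETUP_SCRIPTS = [
    "download_pretrained_resnet_101.sh",
    "download_dataset.sh",
    "extract_features.sh",
]


def setup_commands(project_root):
    """Return (argv, working_directory) pairs for each prerequisite step."""
    scripts_dir = os.path.join(project_root, "scripts")
    # Assumption: requirements.txt lives in the project root.
    commands = [(["pip", "install", "-r", "requirements.txt"], project_root)]
    commands += [(["./" + name], scripts_dir) for name in SETUP_SCRIPTS]
    return commands


def run_setup(project_root):
    """Run every step in order and stop at the first failure."""
    for argv, cwd in setup_commands(project_root):
        subprocess.run(argv, cwd=cwd, check=True)


# Dry run: only print what would be executed.
for argv, cwd in setup_commands("."):
    print(cwd, " ".join(argv))
```

Calling `run_setup("/path/to/attend2u")` would execute the steps; `check=True` makes a failed download abort the chain instead of continuing to feature extraction.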

Training

Run the training script. The model can be trained on multiple GPUs.

python -m train --num_gpus 4 --batch_size 200

Evaluation

Run the evaluation script. The model can be evaluated on multiple GPUs.

python -m eval --num_gpus 2 --batch_size 500
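When switching between machines with different GPU counts, it can help to build the two command lines programmatically. The sketch below is a hypothetical convenience wrapper, not part of the repository; it uses only the `--num_gpus` and `--batch_size` flags shown above and makes no assumption about how the batch is distributed across GPUs.

```python
import subprocess


def module_command(module, num_gpus, batch_size):
    """Build the argv for `python -m <module>` with the two documented flags."""
    if num_gpus < 1 or batch_size < 1:
        raise ValueError("num_gpus and batch_size must be positive")
    return [
        "python", "-m", module,
        "--num_gpus", str(num_gpus),
        "--batch_size", str(batch_size),
    ]


def train_then_eval(project_root, num_gpus, train_batch, eval_batch):
    """Run training, then evaluation, from the project root."""
    for module, batch in (("train", train_batch), ("eval", eval_batch)):
        subprocess.run(module_command(module, num_gpus, batch),
                       cwd=project_root, check=True)


# Reproduces the training command from this README as a list of arguments.
print(" ".join(module_command("train", 4, 200)))
```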

Personalized Image Captioning Dataset

Coming soon!

Examples

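Here are post generation examples:

[Figure: post generation examples]

Here are hashtag generation examples:

[Figure: hashtag generation examples]

Here are (slightly wrong but) interesting post generation examples:

[Figure: slightly wrong but interesting post generation examples]

Here are (slightly wrong but) interesting hashtag generation examples:

[Figure: slightly wrong but interesting hashtag generation examples]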

Acknowledgement

We implemented our model using the TensorFlow package. Thanks to the TensorFlow developers. :)

We also thank Instagram for their API and Instagram users for their valuable posts.

Additionally, we thank coco-caption developers for providing caption evaluation tools.

Authors

Cesc Chunseong Park, Byeongchang Kim and Gunhee Kim

Vision and Learning Lab @ Computer Science and Engineering, Seoul National University, Seoul, Korea

License

MIT license
