cesc-park / attend2u

๐Ÿ–ผ๏ธ Attend to You: Personalized Image Captioning with Context Sequence Memory Networks. In CVPR, 2017. Expanded : Towards Personalized Image Captioning via Multimodal Memory Networks. In IEEE TPAMI, 2018.

License: MIT License

Python 98.84% Shell 1.16%

attend2u's People

Contributors

bckim92, cesc-park

attend2u's Issues

Any Implementation on Baselines?

Thanks for the great work! It is creative, and the results shown are promising.
I saw in the paper that you compare your proposed CSMN against several baselines (e.g., 1-nearest neighbor to user contents, RNN seq2seq with active vocabulary).
Would you release the implementations of those baselines?

Besides that, given recent advances in NLP (Transformer, GPT-2, etc.), would you design CSMN differently in a modern context (as of 2020), and if so, how?

about superscript a and c

I read about the CSMN model in your paper, and I have trouble understanding the superscripts a and c. Would you mind explaining the purpose of adding the superscripts a and c to the image memory vector and the user context memory vector? Thanks a lot!

Doubt about steps and InstaPic dataset size

Do we need to train for 5 lakh (500,000) steps, as specified in the code, or for just 20 epochs, as stated in the paper?

Also, I downloaded the InstaPic dataset earlier, and it had only 1.03M samples instead of 1.1M. Why?

Hoping to get a reply soon. Thanks :)

Memory Error

Could you share the environment you ran this in?

Difference between papers

Hi @cesc-park
If I am correct, the only difference between the CVPR and TPAMI papers is the datasets used; both use the same CSMN architecture.
Can you or someone else confirm this?

getting error while training on CPU

Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/lalit/notebooks/Lalit/image_caption/attend2u/train.py", line 211, in <module>
    tf.app.run()
  File "/home/lalit/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/home/lalit/notebooks/Lalit/image_caption/attend2u/train.py", line 208, in main
    train()
  File "/home/lalit/notebooks/Lalit/image_caption/attend2u/train.py", line 134, in train
    loss = _tower_loss(inputs, scope)
  File "/home/lalit/notebooks/Lalit/image_caption/attend2u/train.py", line 33, in _tower_loss
    net = CSMN(inputs, ModelConfig(FLAGS))
  File "utils/configuration.py", line 33, in __init__
    super(ModelConfig, self).__init__(FLAGS)
  File "utils/configuration.py", line 12, in __init__
    attrs = FLAGS.__dict__['__flags']
KeyError: '__flags'
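
The KeyError above is raised where utils/configuration.py reads the private attribute FLAGS.__dict__['__flags']. A minimal sketch of a more tolerant lookup follows. It assumes a newer TensorFlow 1.x release in which tf.app.flags.FLAGS is backed by absl and exposes flag_values_dict(); the helper name flags_to_dict is hypothetical and not part of the repository.

```python
def flags_to_dict(flags):
    """Return flag names and values as a plain dict (hypothetical helper)."""
    # Newer, absl-backed FLAGS objects provide a public accessor.
    accessor = getattr(flags, "flag_values_dict", None)
    if callable(accessor):
        return dict(accessor())
    # Older TensorFlow releases kept parsed values in a private '__flags' dict.
    private = vars(flags).get("__flags")
    if private is not None:
        return dict(private)
    raise AttributeError("unrecognized FLAGS object: cannot list flag values")
```

Under that assumption, the line in utils/configuration.py could read attrs = flags_to_dict(FLAGS) instead of indexing the private dict.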

about the validation dataset part?

The Experiments section of the paper says, "we randomly split the dataset into 90% for training, 5k posts for test and the rest for validation". However, in the code the dataset is split into train.txt, test1.txt, and test2.txt, and test2.txt is not used anywhere in the code. Did I miss something?
Looking forward to your reply.
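
For reference, the split described in the quoted sentence could be sketched as below. This only illustrates "90% for training, 5k posts for test and the rest for validation"; the function name, the seed, and the default arguments are assumptions, and the sketch does not claim to match how the repository produces train.txt, test1.txt, and test2.txt.

```python
import random

def split_posts(posts, train_frac=0.9, n_test=5000, seed=0):
    """Shuffle posts, then cut them into train / test / validation parts."""
    posts = list(posts)
    random.Random(seed).shuffle(posts)  # fixed seed for a reproducible split
    n_train = int(len(posts) * train_frac)
    train = posts[:n_train]                  # 90% for training
    test = posts[n_train:n_train + n_test]   # next 5k posts for test
    val = posts[n_train + n_test:]           # whatever remains for validation
    return train, test, val
```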

Pretrained model inquiry

Is there a pretrained model I could try out?
I ran the code myself, and my GPU turned out to be less capable than I expected.

Thank you in advance.

Dataset cannot be downloaded

The YFCC100M dataset cannot be downloaded via Google Cloud Disk. Can you provide another download method? I also hope you can make the InstaPIC-1.1M dataset available again.

Errors in code for feature extraction

I cloned the repository and followed the instructions in this GitHub repo, but ran into several issues with the code. First, even though I installed the TensorFlow version given in requirements.txt, there is no module named preprocessing in slim. I worked around that, but then ran into the following error:

Traceback (most recent call last):
  File "cnn_feature_extractor.py", line 148, in <module>
    tf.app.run()
  File "/home/pdguest/Lalit/Selfie/HashtagPred/hashtag/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "cnn_feature_extractor.py", line 112, in main
    processed_images, num_classes=1000, is_training=False
TypeError: resnet_v1_101() got an unexpected keyword argument 'is_training'

Can you please check your code again and update it?

Problem about context information.

It seems that, for an image to be predicted, the context information comes from all of the user's tags, not only from the tags posted before the image was uploaded. Did I miss something?
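
The distinction raised here can be made concrete with a small sketch. It assumes each post carries a timestamp and a list of tags, which is a hypothetical data layout and not the repository's format; it only shows how a context built from earlier posts would differ from one built from all of a user's posts.

```python
def context_tags(user_posts, query_time=None):
    """Collect context tags from a user's posts.

    With query_time=None every post contributes; otherwise only posts
    strictly earlier than query_time do.
    """
    tags = []
    for post in sorted(user_posts, key=lambda p: p["time"]):
        if query_time is not None and post["time"] >= query_time:
            break  # posts are sorted, so all later ones can be skipped
        tags.extend(post["tags"])
    return tags

# Hypothetical posts of one user, in upload order.
posts = [
    {"time": 1, "tags": ["#beach"]},
    {"time": 2, "tags": ["#sunset"]},
    {"time": 3, "tags": ["#coffee"]},
]
```

Calling context_tags(posts) uses all three posts, while context_tags(posts, query_time=3) leaves out the tags of the post being predicted and anything after it.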

Error in train.py

When I ran train.py with num_gpus 1 and batch_size 10, it threw the following error:

[ERROR:2017-05-08 12:42:08,943] Exception in QueueRunner: 0-th value returned by pyfunc_0 is double, but expects float
	 [[Node: PyFunc = PyFunc[Tin=[DT_STRING], Tout=[DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](DecodeCSV)]]

Caused by op u'PyFunc', defined at:
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/HashtagPred/train.py", line 211, in <module>
    tf.app.run()
  File "/home/HashtagPred/hashtag/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/home/HashtagPred/train.py", line 208, in main
    train()
  File "/home/HashtagPred/train.py", line 98, in train
    tower_caption_mask = enqueue(False)
  File "utils/data_utils.py", line 240, in enqueue
    answer_id, context_mask, caption_mask = read_numpy_format_and_label( filename_queue)
  File "utils/data_utils.py", line 160, in read_numpy_format_and_label
    tf.float32
  File "/home/HashtagPred/hashtag/local/lib/python2.7/site-packages/tensorflow/python/ops/script_ops.py", line 189, in py_func
    input=inp, token=token, Tout=Tout, name=name)
  File "/home/HashtagPred/hashtag/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_script_ops.py", line 40, in _py_func
    name=name)
  File "/home/HashtagPred/hashtag/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op
    op_def=op_def)
  File "/home/HashtagPred/hashtag/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/HashtagPred/hashtag/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): 0-th value returned by pyfunc_0 is double, but expects float
	 [[Node: PyFunc = PyFunc[Tin=[DT_STRING], Tout=[DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"](DecodeCSV)]]

Exception in thread Thread-6:
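
The message "0-th value returned by pyfunc_0 is double, but expects float" suggests that the Python function wrapped by tf.py_func returned a float64 array while tf.float32 was declared as its output type. NumPy builds float64 arrays from Python floats by default, so one likely fix is an explicit cast before returning. The sketch below uses only NumPy; load_features is a hypothetical stand-in for whatever function utils/data_utils.py passes to tf.py_func.

```python
import numpy as np

def load_features(values):
    """Return the values as a float32 array (hypothetical stand-in)."""
    arr = np.array(values)          # float64 by default for Python floats
    return arr.astype(np.float32)   # match the declared Tout of tf.float32
```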
