Comments (11)

ruotianluo commented on August 19, 2024

Pair loss is worse, and VSEAttModel gives worse results too.

from disccaptioning.

ruotianluo commented on August 19, 2024

That's very common in the first several epochs. Try training a little longer, or just restart the training.

qq283215389 commented on August 19, 2024

OK, thanks a lot. What about the other VSE model (VSEAttModel) and the "pair loss", whose results aren't shown in your CVPR 2018 paper "Discriminability objective for training descriptive captions"?

qq283215389 commented on August 19, 2024

Thanks! If the retrieval model performs better (as in the paper "Stacked Cross Attention for Image-Text Matching"), can we get better results from the captioning model?

ruotianluo commented on August 19, 2024

I think it's very likely.

qq283215389 commented on August 19, 2024

Hello, Luo.
This is my result from pre-training the retrieval model by running "run_fc_con.sh". It still differs from the retrieval model result presented in your paper.
Result:
Average i2t Recall: 53.9
Image to text: 29.9 59.2 72.6 4.0 19.6
Average t2i Recall: 42.3
Text to image: 20.6 46.5 59.8 7.0 40.8
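
As a quick sanity check, the two averages above are consistent with the mean of the first three values on each line. This is a minimal sketch, assuming (it is not stated in this thread) that the five columns are R@1, R@5, R@10, median rank, and mean rank, as is common in VSE-style evaluation logs. `average_recall` is a hypothetical helper, not a function from the repository.

```python
def average_recall(metrics):
    """Mean of the first three values, assumed to be R@1, R@5, R@10."""
    r1, r5, r10 = metrics[:3]
    return (r1 + r5 + r10) / 3


# Values copied from the log above; the column meanings are an assumption.
i2t = [29.9, 59.2, 72.6, 4.0, 19.6]
t2i = [20.6, 46.5, 59.8, 7.0, 40.8]

print(f"Average i2t Recall: {average_recall(i2t):.1f}")
print(f"Average t2i Recall: {average_recall(t2i):.1f}")
```

Under that assumption, the printed values can be compared directly against the "Average i2t Recall" and "Average t2i Recall" lines in the log.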

ruotianluo commented on August 19, 2024

Did you download my pretrained model? Does it perform better, and does it match what's reported in the paper?
https://drive.google.com/open?id=1oQ_O-O2KoSQv1xdBPKaIOGt-VW0gS-42
These are my training curves, to give you a hint.

qq283215389 commented on August 19, 2024

I might have found the problem: I used a size of 7x7 for the COCO fc features. I think you used 14x14 for the COCO fc features?

ruotianluo commented on August 19, 2024

The fc feature doesn't have spatial dimensions; it's a vector.
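
To illustrate the point, here is a minimal sketch, assuming the fc feature is obtained by averaging a convolutional feature map over its spatial grid (an assumption about the preprocessing, not something confirmed in this thread). Whatever the grid size, the result is one value per channel. The helper names and the 2048-channel count are hypothetical.

```python
def global_average_pool(feature_map):
    """Average a C x H x W nested list over H and W, giving a length-C vector."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def make_feature_map(channels, size, fill=1.0):
    """Build a dummy channels x size x size feature map (placeholder data)."""
    return [[[fill] * size for _ in range(size)] for _ in range(channels)]


# 2048 channels is an assumption (typical for a ResNet final conv layer).
fc_from_7 = global_average_pool(make_feature_map(2048, 7))
fc_from_14 = global_average_pool(make_feature_map(2048, 14))

# Both results are flat vectors of the same length, with no spatial axes left.
print(len(fc_from_7), len(fc_from_14))
```

In this sketch a 7x7 or 14x14 setting would only matter for features that keep their spatial grid, not for the pooled vector.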

qq283215389 commented on August 19, 2024

I found that other papers use Karpathy's split for COCO, while your paper uses Rama's split. Are the test data the same? How can you compare your results with the results in self-critical?

ruotianluo commented on August 19, 2024

The splits are different. The self-critical one is my implementation on Rama's split. We use Rama's split because we need to compare our results to Rama's.
