
Comments (2)

kentonl commented on August 21, 2024
  1. 50 is the minimum frequency threshold for the vocabulary.
  2. Yes, your interpretation is exactly right. You can think of it as separating the keys and values of the attention mechanism. Here, the head_embeddings are the values, and the keys are determined by the context_embeddings. In hindsight, I probably should have removed this separation for simplicity in the final version, since the improvement wasn't very large.
  3. The window size refers to the hyperparameter on the x-axis in Figure 2b of the GloVe paper (https://nlp.stanford.edu/pubs/glove.pdf). The wording in our paper is a bit confusing. We were just trying to say that the context_embeddings have a window size of 10 and the head_embeddings have a window size of 2.
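
To make the key/value separation in point 2 concrete, here is a minimal sketch, not the actual e2e-coref code. It assumes each token in a span has two vectors: one derived from the context_embeddings, used only to score the token, and one from the head_embeddings, which is what gets averaged. The function names and the linear scorer `score_weights` are hypothetical stand-ins for whatever learned scorer the model uses.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def span_head_representation(context_vecs, head_vecs, score_weights):
    """Attention over the tokens of one span with separate keys and values.

    context_vecs:  per-token vectors used only to score tokens (the keys).
    head_vecs:     per-token vectors that are averaged (the values).
    score_weights: hypothetical linear scorer, assumed for illustration.
    """
    # Attention scores depend only on the context-side vectors.
    scores = [
        sum(w * x for w, x in zip(score_weights, vec)) for vec in context_vecs
    ]
    attn = softmax(scores)

    # The output is a weighted sum of the head-side vectors.
    dim = len(head_vecs[0])
    rep = [sum(a * vec[d] for a, vec in zip(attn, head_vecs)) for d in range(dim)]
    return rep, attn


# Toy span of two tokens; the key and value vectors may have different sizes.
context_vecs = [[0.0, 0.0], [0.0, 0.0]]
head_vecs = [[1.0, 0.0, 2.0], [3.0, 0.0, 4.0]]
rep, attn = span_head_representation(context_vecs, head_vecs, [1.0, 1.0])
```

Removing the separation would mean passing the same vectors as both `context_vecs` and `head_vecs`.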

Hope that helps!

from e2e-coref.

kalpitdixit commented on August 21, 2024

Thanks for the fast and complete answers!

For "3." above, I see why a smaller window size makes sense for the head_embeddings than for the context_embeddings: the head_embeddings are used to represent a span, which is typically a few tokens, whereas the context_embeddings are used to represent entire sentences.
Nice idea.
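
For anyone else reading, a rough illustration of what the window size controls when co-occurrence counts are collected. This is only a sketch under simplifying assumptions: it uses a symmetric window and unweighted counts, whereas the GloVe paper down-weights more distant pairs. The function name `cooccurrence_counts` is made up for this example.

```python
from collections import Counter


def cooccurrence_counts(tokens, window_size):
    """Count (word, neighbor) pairs within window_size tokens on either side."""
    counts = Counter()
    for i, word in enumerate(tokens):
        lo = max(0, i - window_size)
        hi = min(len(tokens), i + window_size + 1)
        for j in range(lo, hi):
            if j != i:  # a token is not its own neighbor
                counts[(word, tokens[j])] += 1
    return counts


tokens = "the cat sat on the mat".split()
narrow = cooccurrence_counts(tokens, window_size=1)
wide = cooccurrence_counts(tokens, window_size=2)

# A wider window records pairs that the narrow one never sees.
only_in_wide = sorted(pair for pair in wide if pair not in narrow)
```

A small window only sees immediate neighbors, while a large one also picks up looser, sentence-level associations.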

