BERT-based, or any transformer-based models output contextualized embeddings, which is

Why and how the same model for doc_embeddings and word_embeddings? (keybert, open issue)

Atharvalite commented on July 4, 2024

Why and how the same model for doc_embeddings and word_embeddings?

Comments (1)

MaartenGr commented on July 4, 2024

It depends on several things, including the tokenization scheme and the training data. In general, though, these models are quite capable of creating word embeddings even though no contextual information is available at inference time. As you might notice, this already produces quite good results, especially when combined with MMR (which, to a certain extent, does take the relationship between words into account).
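To make the MMR point concrete, here is a minimal sketch of Maximal Marginal Relevance over candidate words. It is not KeyBERT's actual implementation: the `mmr` and `cosine` helpers, the `diversity` weighting, and the toy 3-dimensional vectors are all assumptions made for illustration. In practice the document and word embeddings would both come from the same embedding model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr(doc_embedding, word_embeddings, words, top_n=2, diversity=0.5):
    """Greedily pick words that are close to the document
    but not too close to the words already picked."""
    selected = []
    remaining = list(range(len(words)))
    while remaining and len(selected) < top_n:

        def score(i):
            relevance = cosine(word_embeddings[i], doc_embedding)
            # Highest similarity to any already selected word (0 if none yet).
            redundancy = max(
                (cosine(word_embeddings[i], word_embeddings[j]) for j in selected),
                default=0.0,
            )
            return (1 - diversity) * relevance - diversity * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [words[i] for i in selected]


# Toy vectors (made up): two near-duplicate candidates and one distinct one.
doc_embedding = [1.0, 1.0, 0.0]
words = ["embedding", "embeddings", "keyword"]
word_embeddings = [
    [1.0, 0.9, 0.0],
    [1.0, 0.8, 0.0],
    [0.2, 1.0, 0.3],
]

print(mmr(doc_embedding, word_embeddings, words, top_n=2, diversity=0.5))
print(mmr(doc_embedding, word_embeddings, words, top_n=2, diversity=0.0))
```

With `diversity=0.0` the selection reduces to plain similarity ranking against the document, so near-duplicate candidates can both be returned. A higher `diversity` penalizes a candidate for resembling words that were already chosen, which is the sense in which MMR accounts for relationships between words.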

The BaseEmbedder indeed started out with an additional option to pass a word embedding model. However, both models need to be in the same dimensional space to be comparable, and that turned out to be something that could not easily be implemented: you can't really (or easily) compare the output embeddings of two different embedding models using distance functions. What has been on the list for a while is extracting the token embeddings from sentence-transformers before aggregation, but that again depends on the underlying model.
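The comparability problem can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `model_a` is a made-up 2-dimensional word embedding table, and `model_b` is simulated by rotating that same space by 90 degrees. Real models differ in far more complicated ways, but even this mild difference is enough to make cross-model distances meaningless.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rotate(v, degrees):
    """Rotate a 2-D vector counter-clockwise by the given angle."""
    t = math.radians(degrees)
    x, y = v
    return [x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t)]


# Hypothetical "model A": a tiny made-up embedding table.
model_a = {"cat": [1.0, 0.0], "kitten": [0.9, 0.1], "car": [0.0, 1.0]}

# Hypothetical "model B": the same geometry, but in a rotated coordinate system.
model_b = {word: rotate(vec, 90) for word, vec in model_a.items()}

# Within each model, the relationship between words is identical.
within_a = cosine(model_a["cat"], model_a["kitten"])
within_b = cosine(model_b["cat"], model_b["kitten"])

# Across models, the same word no longer matches itself...
cross_same_word = cosine(model_a["cat"], model_b["cat"])

# ...and can look identical to an unrelated word instead.
cross_other_word = cosine(model_a["car"], model_b["cat"])

print(within_a, within_b, cross_same_word, cross_other_word)
```

Both tables encode the same similarities internally, yet comparing a vector from one with a vector from the other gives misleading scores. This is why a document embedding and word embeddings generally need to come from the same model (or from spaces that have been explicitly aligned) before a distance function is applied.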

Any suggestions for implementations are appreciated!
