Hi, firstly thank you so much for this library. I've tried it and it does take some ti


Does GPU help? about bertopic HOT 3 CLOSED

maartengr commented on May 10, 2024

Does GPU help?

from bertopic.

Comments (3)

lppier commented on May 10, 2024

Apologies, I managed to try it on a GPU-enabled cloud server and it was significantly faster.


MaartenGr commented on May 10, 2024

Yes! Using a GPU is highly recommended to speed up inference at the sentence-transformers stage, where the documents are embedded.
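As a rough illustration of the GPU point, here is a minimal sketch that picks a device and embeds the documents with sentence-transformers on it. The helper names `pick_device` and `embed_documents` are hypothetical, and the model name `all-MiniLM-L6-v2` is only an assumed example, not something this thread prescribes.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def embed_documents(docs, model_name="all-MiniLM-L6-v2"):
    """Embed docs with sentence-transformers on the selected device.

    model_name is an assumed example; other sentence-transformers models
    can be substituted.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name, device=pick_device())
    # encode() returns one dense vector per document
    return model.encode(docs, show_progress_bar=True)


# Report which device would be used for the embedding step
print(pick_device())
```

The array returned by `embed_documents(docs)` could then be passed to BERTopic as precomputed embeddings, in the same way the TF-IDF matrix is passed to `fit_transform` in this comment.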

However, if you do not have a GPU available to you, then you can actually use TF-IDF instead since BERTopic allows for custom embeddings to be passed:

from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer

# Create TF-IDF sparse matrix
docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))['data']
vectorizer = TfidfVectorizer(min_df=5)
embeddings = vectorizer.fit_transform(docs)

# Run BERTopic with embeddings
model = BERTopic(allow_st_model=True)
topics, probabilities = model.fit_transform(docs, embeddings)

Note that I used the parameter allow_st_model, which uses a sentence-transformer model to fine-tune the topic representation. This should be very efficient whether or not you have a GPU, since you would only need to embed a few hundred words. However, you can set it to False if you do not want to use a sentence-transformer model at all.
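To see what the TF-IDF "embeddings" look like before handing them to BERTopic, the vectorizer step can be tried on its own. This is a minimal sketch on a made-up toy corpus; `toy_docs` is hypothetical, and `min_df=1` is used only because the corpus is tiny (the snippet above uses `min_df=5` on the full 20 newsgroups data).

```python
from scipy.sparse import issparse
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus (hypothetical), standing in for the 20 newsgroups documents
toy_docs = [
    "the gpu makes embedding documents much faster",
    "tf-idf vectors can be computed quickly on a cpu",
    "topic models group similar documents together",
    "a gpu is not required for tf-idf",
]

# min_df=1 keeps every term; with so few documents, min_df=5 would drop them all
vectorizer = TfidfVectorizer(min_df=1)
embeddings = vectorizer.fit_transform(toy_docs)

# One row per document, one column per vocabulary term, stored sparsely
n_docs, n_terms = embeddings.shape
print(n_docs, n_terms, issparse(embeddings))
```

The result is a sparse matrix with one row per document, which is the shape of input the `fit_transform(docs, embeddings)` call above expects, and it needs no GPU to compute.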

EDIT: I did not see your response, but I will leave this up for those who are interested in other embedding methods.


lppier commented on May 10, 2024

Thanks @MaartenGr! This was very useful.

