Feature Request: Choose Embedding Model about nhost HOT 16 CLOSED

osseonews commented on June 1, 2024

Feature Request: Choose Embedding Model

from nhost.

Comments (16)

ded-ditat commented on June 1, 2024

Also requesting this. We would like to use auto embeddings with the small model and not ada.

This is Dylan with revtron.ai btw.

Excellent call out.

dbarrosop commented on June 1, 2024

Thanks for reporting this, this makes sense and should be an easy addition. We will take a look as soon as possible.

dbarrosop commented on June 1, 2024

Would you mind testing v0.5.0-beta1? This version adds a new column, model, to the autoembeddings_configuration table. It can have one of the following values:

text-embedding-ada-002 (default)
text-embedding-3-small
text-embedding-3-large

We still don't have support in the dashboard, so you will have to update the value directly using the database tab.
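
For anyone scripting this change instead of editing the row by hand, here is a minimal sketch of how the update could be prepared. It only checks the model name against the three values listed above and builds a parameterized statement. The `name` column in the WHERE clause and the `movies` configuration name are assumptions for illustration, not something confirmed in this thread, so adjust them (and any schema prefix) to match your project.

```python
# Allowed values for the `model` column, as listed in this thread.
ALLOWED_MODELS = (
    "text-embedding-ada-002",  # default
    "text-embedding-3-small",
    "text-embedding-3-large",
)


def build_model_update(config_name: str, model: str) -> tuple[str, tuple[str, str]]:
    """Return a parameterized UPDATE statement and its parameters.

    ASSUMPTION: rows are identified by a `name` column; adjust the WHERE
    clause (and add a schema prefix if needed) to match your project.
    """
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unsupported model {model!r}; expected one of {ALLOWED_MODELS}")
    sql = "UPDATE autoembeddings_configuration SET model = %s WHERE name = %s"
    return sql, (model, config_name)


if __name__ == "__main__":
    # "movies" is a hypothetical configuration name.
    sql, params = build_model_update("movies", "text-embedding-3-small")
    print(sql, params)
```

The statement uses `%s` placeholders so it can be handed to a PostgreSQL driver together with the parameters; running the equivalent SQL in the database tab works just as well.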

osseonews commented on June 1, 2024

Sure, I can test, but I probably won't get to it until later next week. btw, how do we install v0.5.0-beta1?

dbarrosop commented on June 1, 2024

If you are using the CLI/TOML, just place it under:

[ai]
version=xxx

(it should already be there, so just update the version)

If you aren't, just go to dashboard->settings->ai and enter the version (don't worry if it doesn't show in the menu, just enter the custom value).

osseonews commented on June 1, 2024

I changed the version in the settings, but all I see now is that my nhost workspace is updating, and I don't see the "model" column added.

osseonews commented on June 1, 2024

And I just got this error in my project: "Error deploying the project most likely due to invalid configuration. Please review your project's configuration and logs for more information."

dbarrosop commented on June 1, 2024

Apologies, it should be 0.5.0-beta1.
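
The correction above reads as "drop the leading v", although the thread does not state outright that the prefix caused the deployment error. Under that assumption, a small sketch like the following could normalize a version string before it goes into the [ai] section. Both function names are hypothetical helpers, not part of any nhost tooling.

```python
def normalize_ai_version(version: str) -> str:
    """Strip surrounding whitespace and a leading 'v' from a version string."""
    cleaned = version.strip()
    if cleaned[:1] in ("v", "V"):
        cleaned = cleaned[1:]
    # Only a very loose sanity check: the result must start with a digit.
    if not cleaned or not cleaned[0].isdigit():
        raise ValueError(f"not a version string: {version!r}")
    return cleaned


def render_ai_section(version: str) -> str:
    """Render an [ai] section with the normalized version as a TOML string."""
    return f"[ai]\nversion = '{normalize_ai_version(version)}'\n"


if __name__ == "__main__":
    print(render_ai_section("v0.5.0-beta1"))
```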

osseonews commented on June 1, 2024

OK, it updated and I changed the embeds. I'll test it soon.

osseonews commented on June 1, 2024

Quick question: when we set the query for the autoembedding, it says that the "id" field is required, but what about the other fields in the query? Are those the fields used to create the actual vector, so we should include only the fields whose text we want to embed? Take your sample query below, for example: will the embedding be created from the "name", "genre", and "overview" fields? Are these fields concatenated and a vector created from the result? And if we only want to embed, say, "overview", would we just keep that field in the query and remove the others?

query GetOutdatedMovies {
  movies(where: {
    _or: [
      {embeddings: {_is_null: true}}, # new rows without embeddings
      {outdated: {_eq: true}},        # existing rows with changed data
    ]
  }) {
    id                                # id column is mandatory
    name
    genre
    overview
  }
}

osseonews commented on June 1, 2024

BTW, I just ran the embeds with the small embedding model, and the vector searches are mostly meaningless. For example, I ran a graphite search with the keyword "supercalifragilisticexpialidocious" and it returned a result, even though none of our content has anything to do with that. I would have expected it to return nothing. Other, real searches also return results that don't match the query at all. I can't say whether this is a problem with the model or with pgvector, and it's possible the problem already existed before this update. There is just something wrong here.

dbarrosop commented on June 1, 2024

There is nothing wrong with the models. The queries return the best matches, even if they are unrelated. Another feature is coming alongside this one when 0.5.0 is released: it will let you set a maximum distance to avoid the issue you describe.

osseonews commented on June 1, 2024

OK, that makes sense. I was actually wondering what distance was used. Also, is there an easy way to turn off embeddings on a table without deleting the auto embeddings setup? I was thinking of just deleting the triggers we created for the outdated field based on the docs.

dbarrosop commented on June 1, 2024

You can try with 0.5.0. I still need to update documentation and we need to add the feature to the dashboard but it should be usable so no need to wait. Re maxDistance, you now have a maxDistance option when doing similarity/search queries. Something like:

graphiteXXXSearch(args: {
    query: "blah",
    amount: 10,
    maxDistance: 0.20,
}) {
    ...
}

maxDistance is a float between 0 (exact match) and 1 (nothing in common) and it defaults to 1 (for backwards compatibility).
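
To see why the default returns unrelated rows and what the cutoff changes, here is a minimal sketch in plain Python. It assumes the distance is cosine distance (1 minus cosine similarity); the thread does not say which metric is actually used. The `search` function, its parameters, and the toy two-dimensional vectors are illustrative only and are not the nhost implementation.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def search(query_vec, rows, amount=10, max_distance=1.0):
    """Return up to `amount` (id, distance) pairs, nearest first.

    Rows farther away than `max_distance` are dropped.
    """
    scored = [(row_id, cosine_distance(query_vec, vec)) for row_id, vec in rows]
    scored = [pair for pair in scored if pair[1] <= max_distance]
    scored.sort(key=lambda pair: pair[1])
    return scored[:amount]


if __name__ == "__main__":
    # Toy data: "a" matches the query, "b" is close, "c" is orthogonal.
    rows = [("a", [1.0, 0.0]), ("b", [0.8, 0.6]), ("c", [0.0, 1.0])]
    print(search([1.0, 0.0], rows))                    # all three rows, nearest first
    print(search([1.0, 0.0], rows, max_distance=0.5))  # only "a" and "b"
```

With the default of 1, the orthogonal row "c" still comes back because it is simply the next-best match; lowering the cutoff is what removes it.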

osseonews commented on June 1, 2024

Can we already use 0.5.0? We can just change the version in the AI Settings page?

dbarrosop commented on June 1, 2024

Yes
