Comments (16)
Also requesting this. We would like to use auto embeddings with the small model and not ada.
This is Dylan with revtron.ai btw.
Excellent call out.
from nhost.
Thanks for reporting this. It makes sense and should be an easy addition. We will take a look as soon as possible.
Would you mind testing v0.5.0-beta1? This version adds a new column, model, to the autoembeddings_configuration table. It can have one of the following values:
- text-embedding-ada-002 (default)
- text-embedding-3-small
- text-embedding-3-large
We don't have support for this in the dashboard yet, so you will have to update the value directly using the database tab.
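For anyone scripting this change instead of editing the row by hand, here is a minimal Python sketch that checks a model name against the three values listed above and builds a parameterized UPDATE. Only the table name, the model column, and the three values come from this thread. The "id" key column, the unqualified table name, and the helper name are assumptions, so check your actual schema first.

```python
# Allowed values for the "model" column, as listed in this thread.
ALLOWED_MODELS = (
    "text-embedding-ada-002",  # default
    "text-embedding-3-small",
    "text-embedding-3-large",
)


def build_model_update(row_id: str, model: str) -> tuple[str, tuple[str, str]]:
    """Return (sql, params) for switching one configuration row to `model`.

    Hypothetical helper: the "id" key column is an assumption, and the table
    may live in a schema that is not shown in this thread.
    """
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unsupported embedding model: {model!r}")
    sql = "UPDATE autoembeddings_configuration SET model = %s WHERE id = %s"
    return sql, (model, row_id)


sql, params = build_model_update("my-config-id", "text-embedding-3-small")
print(sql)
print(params)
```

Rejecting unknown names before the statement is built keeps a typo from ever reaching the database.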
Sure, I can test, but probably won't get to it until later next week. btw, how do we install the v0.5.0-beta1?
If you are using the cli/toml, just place it under:
[ai]
version=xxx
(it should already be there, so just update the version)
If you aren't, just go to dashboard->settings->ai and enter the version (don't worry if it doesn't show in the menu, just enter the custom value)
I changed the version in the settings, but all I see now is that my nhost workspace is updating, and I don't see the "model" column added.
And I just got this error in my project: "Error deploying the project most likely due to invalid configuration. Please review your project's configuration and logs for more information."
Apologies, it should be 0.5.0-beta1.
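The correction above suggests that the leading "v" was what the deploy rejected, though the thread does not confirm the exact cause. Under that assumption, a small pre-check before pasting a value into the [ai] version setting might look like this. The pattern and the helper name are illustrative and not part of nhost.

```python
import re

# Matches versions like "0.5.0" or "0.5.0-beta1", with no leading "v".
VERSION_RE = re.compile(r"\d+\.\d+\.\d+(-[0-9A-Za-z.]+)?")


def normalize_ai_version(raw: str) -> str:
    """Strip whitespace and a leading "v", then check the overall shape."""
    candidate = raw.strip()
    if candidate.startswith(("v", "V")):
        candidate = candidate[1:]
    if not VERSION_RE.fullmatch(candidate):
        raise ValueError(f"unexpected version string: {raw!r}")
    return candidate


print(normalize_ai_version("v0.5.0-beta1"))  # prints 0.5.0-beta1
print(normalize_ai_version("0.5.0"))         # prints 0.5.0
```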
OK, it updated and I changed the embeds. I'll test it soon.
Quick question: when we set the query for the auto-embedding, it says that the "id" field is required, but what about the other fields in the query? Are they the ones used to create the actual vector, so we should include only the fields containing the text we want to embed? For example, in your sample query below, is the embedding created from the "name", "genre", and "overview" fields, concatenated into a single vector? And if we only wanted to embed, say, "overview", would we include just that field in the query and remove the others?
query GetOutdatedMovies {
  movies(where: {
    _or: [
      {embeddings: {_is_null: true}}, # new rows without embeddings
      {outdated: {_eq: true}}, # existing rows with changed data
    ]}) {
    id # id column is mandatory
    name
    genre
    overview
  }
}
BTW, I just ran the embeds with the "small embedding" model, and the vector searches are mostly meaningless. For example, I did a graphite search with the keyword "supercalifragilisticexpialidocious" and it returned a result, even though none of our content has anything to do with that. I would have expected it to return nothing. For other, real searches, the results don't match the query at all either. I can't say whether this is a problem with the model or pgvector, and it's possible the problem already existed before this update. Just that there is something wrong here.
There is nothing wrong with the models. The queries return the best matches, even if they are unrelated. Another feature coming alongside this one when 0.5.0 is released will let you set a maximum distance to avoid the issue you describe.
OK, that makes sense. I was actually wondering what distance was used. Also, is there an easy way to turn off embeddings on a table without deleting the auto-embeddings setup? I was thinking of just deleting the triggers we created, based on the docs, for the outdated field.
You can try with 0.5.0. I still need to update the documentation and we need to add the feature to the dashboard, but it should be usable, so no need to wait. Re maxDistance, you now have a maxDistance option when doing similarity/search queries. Something like:
graphiteXXXSearch(args: {
  query: "blah",
  amount: 10,
  maxDistance: 0.20,
}) {
  ...
}
maxDistance is a float between 0 (exact match) and 1 (nothing in common), and it defaults to 1 (for backwards compatibility).
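To see why a nonsense query still returned a row before, and what a distance cutoff changes, here is a toy Python sketch of a top-N search with an optional threshold. It assumes cosine distance, which this thread does not confirm as the metric in use, and the function names and vectors are made up for illustration.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def search(query_vec, rows, amount, max_distance=1.0):
    """Return up to `amount` (distance, row_id) pairs within `max_distance`."""
    scored = sorted(
        (cosine_distance(query_vec, vec), row_id) for row_id, vec in rows.items()
    )
    return [(d, row_id) for d, row_id in scored if d <= max_distance][:amount]


# Toy 2-d "embeddings"; real ones have many more dimensions.
rows = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
query = [0.1, 1.0]  # close to "c", far from "a" and "b"

print([row_id for _, row_id in search(query, rows, amount=3)])
# prints ['c', 'b', 'a']
print([row_id for _, row_id in search(query, rows, amount=3, max_distance=0.2)])
# prints ['c']
```

With the default cutoff of 1, every row is returned in order of closeness, however unrelated. Lowering the cutoff drops the distant rows and leaves only the genuine match.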
Can we already use 0.5.0? We can just change the version in the AI Settings page?
Yes