
Comments (13)

snexus commented on August 23, 2024

Thanks for confirming, I will try to implement it in the next couple of days.

> Doing manual indexing isn't too much work, though it does ultimately give the application more of a "POC" feel when functionality is split between different arguments that need to be executed from the command line.

You are right, having it in the same UI makes for a better experience; I think it is worth the effort.

from llm-search.

Hisma commented on August 23, 2024

Agreed, it looked so clean when all I had to do was press a button to update! It will be a very nice UI/UX once this gets properly implemented. End users then only need to interact with the CLI to start the app (which is easy to automate with a script, as I have done), and can just leave the app running at all times.


Hisma commented on August 23, 2024

Just tested. It works great. I made sure to watch GPU memory & it was able to load & unload no problem. Thank you!


snexus commented on August 23, 2024

Thanks for the suggestions - your second idea shouldn't be too hard to implement - I will add it to the to-do list.

The embedding update feature was necessary for this app - before it, even a small-to-medium document base of 500 MB was painful to recreate from scratch.

I should mention that the update process works at the file level (a higher level than the update functionality in vector DBs, which works at the individual-chunk level) - it has to scan all the existing files (as specified in the configuration) and figure out what was changed/deleted/added based on the files' hashes. So the process is still not instant, but quick enough to do updates frequently, if needed.
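To illustrate, the file-level diff could be sketched roughly like this (a minimal illustration only - `file_hash`, `diff_files`, and the stored-hash dictionary are hypothetical names, not llm-search's actual internals):

```python
import hashlib
from pathlib import Path

def file_hash(path: Path) -> str:
    """Return the MD5 hex digest of a file's contents (cheap change marker)."""
    return hashlib.md5(path.read_bytes()).hexdigest()

def diff_files(current_hashes: dict, stored_hashes: dict):
    """Split files into added / changed / deleted by comparing hashes.

    `current_hashes` maps file path -> hash for files on disk now;
    `stored_hashes` is the same mapping saved at the last indexing run.
    """
    added = [f for f in current_hashes if f not in stored_hashes]
    changed = [f for f in current_hashes
               if f in stored_hashes and current_hashes[f] != stored_hashes[f]]
    deleted = [f for f in stored_hashes if f not in current_hashes]
    return added, changed, deleted
```

Only the files in `added`/`changed` need re-embedding, and chunks belonging to `deleted`/`changed` files get removed from the vector store - which is why the update is fast but still requires a full scan.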

About the API endpoint - I'm trying to understand the goal: do you want to host llmsearch in the cloud, with an exposed API that allows adding or removing documents from the internal vector store? So that when documents change in the cloud storage, some service or app would ping the API with a request to update the embeddings?


Hisma commented on August 23, 2024

Thanks! Yeah, the update button would be a nice convenience - you could at that point leave the application running at all times and do everything you need in the UI.

As for the API endpoint: yes, it would be for a cloud deployment, or a partial cloud deployment (i.e., the app runs on-premise but the corpus of data is in the cloud). Think if you had a team of researchers that used your app, and they all lived in different geographic locations... putting the corpus in the cloud where everyone can add to it would be the easiest solution, with a middleware service updating your ChromaDB any time a file change is detected. I see this as something that would be very low on the list, but something you could consider in the future when you have run out of things to do, haha.
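To make the middleware idea concrete, here is a rough sketch of what such an update endpoint's handler could look like. Everything here - the event shape, the `action`/`path` fields, and the queue - is made up for illustration; this is not an existing llm-search API:

```python
import json

def handle_change_event(body: str, queue: list) -> dict:
    """Parse a file-change notification from cloud storage and enqueue
    an embedding update for a background worker to pick up.
    """
    event = json.loads(body)
    # Only act on the change types the indexer knows how to handle.
    if event.get("action") not in {"added", "changed", "deleted"}:
        return {"status": "ignored"}
    queue.append((event["action"], event["path"]))
    return {"status": "queued", "pending": len(queue)}
```

A real deployment would wrap this in a small web framework and have the worker call the same file-level update logic the CLI uses, but the core of the idea is just this: storage pings the endpoint, the endpoint queues a re-index.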


snexus commented on August 23, 2024

The update button feature should be available in this branch - https://github.com/snexus/llm-search/tree/feature-webapp-embeddings-udpate - please test whether it works for you.

> As for the API endpoint: yes, it would be for a cloud deployment, or a partial cloud deployment (i.e., the app runs on-premise but the corpus of data is in the cloud). Think if you had a team of researchers that used your app, and they all lived in different geographic locations... putting the corpus in the cloud where everyone can add to it would be the easiest solution, with a middleware service updating your ChromaDB any time a file change is detected. I see this as something that would be very low on the list, but something you could consider in the future when you have run out of things to do, haha.

I will think about it :)


Hisma commented on August 23, 2024

Busy day yesterday, sorry.

[screenshot: error message]

I get this error. I think this may just mean I need to first recreate the index with the new version of the app before I can use the update feature, correct?


Hisma commented on August 23, 2024

ooof.

When I re-ran the ingest script to re-create the embeddings index, I ran the same test query I always ran, and got a result I was not expecting. In essence, it couldn't answer the question.
Any theory as to why the LLM could have gotten "dumber"?


snexus commented on August 23, 2024

> When I re-ran the ingest script to re-create the embeddings index, I ran the same test query I always ran, and got a result I was not expecting. In essence, it couldn't answer the question.
> Any theory as to why the LLM could have gotten "dumber"?

It shouldn't happen; the new functionality doesn't touch the embedding logic or the querying logic. Can you see the proper chunks retrieved when using this version of the app?


Hisma commented on August 23, 2024

> > When I re-ran the ingest script to re-create the embeddings index, I ran the same test query I always ran, and got a result I was not expecting. In essence, it couldn't answer the question.
> > Any theory as to why the LLM could have gotten "dumber"?
>
> It shouldn't happen; the new functionality doesn't touch the embedding logic or the querying logic. Can you see the proper chunks retrieved when using this version of the app?

Yep, this was on me. I added some new files to the DB, and adding them made the search results worse. When I deleted the new files and re-ran the search, I got the results I wanted.

However, I tried adding a new file using the on-the-fly update button, and got this error -
[screenshot: out-of-memory error]

Something caused my GPU to run out of memory.


snexus commented on August 23, 2024

> Yep, this was on me. I added some new files to the DB, and adding them made the search results worse. When I deleted the new files and re-ran the search, I got the results I wanted.

Good to hear, you got me concerned for a moment, haha.

> However, I tried adding a new file using the on-the-fly update button, and got this error

Updating the index requires additional GPU memory (since it uses the embedding model). Is it possible that your memory was almost full, and running the update caused an out-of-memory error as a result? Could you please test with a smaller model just to confirm that that's the problem (or monitor GPU VRAM usage to confirm it)?

If that's the case, I am not sure what a good solution would be. Perhaps unload the model, do the indexing, then reload the model?
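That unload-index-reload flow could be sketched like this (names are illustrative, not llm-search's actual code; in practice the unload step would also need `gc.collect()` and `torch.cuda.empty_cache()` to actually release VRAM):

```python
class ModelHolder:
    """Holds a loaded model and can drop it to free (GPU) memory."""

    def __init__(self, loader):
        self._loader = loader        # callable that loads and returns the model
        self.model = loader()

    def unload(self):
        # Drop the reference; real code would also run gc.collect() and
        # torch.cuda.empty_cache() here so the VRAM is actually returned.
        self.model = None

    def reload(self):
        self.model = self._loader()

def update_with_reload(holder: ModelHolder, update_index) -> None:
    """Unload the LLM, run the embedding update, then reload the LLM."""
    holder.unload()
    try:
        update_index()               # embedding model now has the freed VRAM
    finally:
        holder.reload()              # reload even if indexing raised
```

The `try/finally` matters: if indexing fails halfway, the LLM still comes back and the app stays usable.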


Hisma commented on August 23, 2024

I have 2 GPUs and llama.cpp does a great job of auto-splitting models between the 2 cards. Here's what the model looked like before doing the embeddings update -
[screenshot: GPU memory usage before the update]

Here's what happened when I tried to do the update -
[screenshot: out-of-memory error during the update]

It loaded all the embeddings onto the primary GPU, rather than recognizing that there was a lot of empty space on the 2nd GPU.

So yes, I think the only way to do this elegantly is to unload the model, index, then reload.
I don't know how much effort that would take on your end, or whether you think it's worth it. Doing manual indexing isn't too much work, though it does ultimately give the application more of a "POC" feel when functionality is split between different arguments that need to be executed from the command line.


snexus commented on August 23, 2024

Pushed the changes to the same branch which hopefully will solve the problem - please test on your side when you have time. Thank you!

