andrewnguonly / lumos

A RAG LLM co-pilot for browsing the web, powered by local LLMs

License: MIT License

TypeScript 96.33% JavaScript 2.11% CSS 1.55%
chrome-extension langchain langchain-js llm ollama react typescript webpack

lumos's People

Contributors

andrewnguonly, billykern, draculabo, eltociear

lumos's Issues

Refactor RAG workflow

  • skip embedding if page content is already embedded
  • add configurable search parameters
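
One way the "skip embedding" item could look, as a rough sketch only: keep a fingerprint of the page content per URL and re-embed only when the fingerprint changes or the entry expires. The names `CachedPage`, `fingerprint`, `needsEmbedding` and `recordEmbedding` are hypothetical and not taken from the Lumos source; the FNV-1a hash is just a cheap stand-in for any content hash.

```typescript
// Hypothetical sketch: none of these names come from the Lumos codebase.
interface CachedPage {
  fingerprint: string; // hash of the page content that was embedded
  expiresAt: number; // epoch milliseconds after which the entry is stale
}

// 32-bit FNV-1a hash, used here only as a cheap content fingerprint.
function fingerprint(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// True when the page has no entry, the entry expired, or the content changed.
function needsEmbedding(
  cache: Map<string, CachedPage>,
  url: string,
  content: string,
  now: number,
): boolean {
  const entry = cache.get(url);
  if (entry === undefined) return true;
  if (entry.expiresAt <= now) return true;
  return entry.fingerprint !== fingerprint(content);
}

// Record that the given content was embedded at time `now`.
function recordEmbedding(
  cache: Map<string, CachedPage>,
  url: string,
  content: string,
  now: number,
  ttlMs: number,
): void {
  cache.set(url, { fingerprint: fingerprint(content), expiresAt: now + ttlMs });
}
```

The caller would check `needsEmbedding` before sending chunks to the embeddings endpoint and call `recordEmbedding` once the vectors are stored.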

Separate (generally smaller) embeddings model?

Using tinyllama

[GIN] 2024/02/10 - 12:33:07 | 200 |   51.673542ms |       127.0.0.1 | POST     "/api/embeddings"
[GIN] 2024/02/10 - 12:33:07 | 200 |    51.70225ms |       127.0.0.1 | POST     "/api/embeddings"
[GIN] 2024/02/10 - 12:33:07 | 200 |   51.951042ms |       127.0.0.1 | POST     "/api/embeddings"
[GIN] 2024/02/10 - 12:33:07 | 200 |   43.755125ms |       127.0.0.1 | POST     "/api/embeddings"

Maybe you can "get away with" using a smaller model for quick embeddings, to make things a bit more responsive?
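
A minimal sketch of what a separate embeddings model setting might look like, assuming the embeddings call goes to Ollama's /api/embeddings endpoint with a `model` and `prompt` in the JSON body. `ModelConfig`, `embeddingRequest` and the model names in the usage note are hypothetical; nothing here is taken from the Lumos source.

```typescript
// Hypothetical sketch: the config shape and function name are assumptions.
interface ModelConfig {
  baseUrl: string; // e.g. the local Ollama address
  chatModel: string; // model used to answer the prompt
  embeddingModel?: string; // optional, generally smaller, model for embeddings
}

// Build the URL and JSON body for an embeddings request.
// Falls back to the chat model when no embeddings model is configured.
function embeddingRequest(
  config: ModelConfig,
  text: string,
): { url: string; body: string } {
  const model = config.embeddingModel ?? config.chatModel;
  return {
    url: `${config.baseUrl}/api/embeddings`,
    body: JSON.stringify({ model, prompt: text }),
  };
}
```

For example, `embeddingRequest({ baseUrl: "http://127.0.0.1:11434", chatModel: "llama2", embeddingModel: "tinyllama" }, chunk)` would embed with the smaller model while chat stays on the larger one. Vectors from different models are not comparable, so any cached embeddings would have to be discarded whenever the embeddings model changes.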

Getting HTTP 400 errors on /api/embeddings

Hey! Thanks for your time on this PoC.

It's not working for me, and I'm waiting for instructions to debug this. The installation is not user friendly.

Context

  • Apple M1 Pro

  • Sonoma 14.1.1

  • 16GB RAM

  • Ollama working well:

ollama -v
ollama version 0.1.12
OLLAMA_ORIGINS=chrome-extension://* ollama serve
  • Extension built OK.
  • Extension installed in chrome OK:
    • Site access: This extension can read and change your data on sites. You can control which sites the extension can access.
    • Automatically allow access on the following sites: ENABLED.
    • Allow in Incognito: DISABLED
    • Allow access to file URLs: ENABLED
    • Collect errors: ENABLED.
    • Source: Unpacked extension Loaded from: ~/Lumos/dist

Error

[GIN] 2023/11/30 - 11:55:01 | 400 |     174.833µs |       127.0.0.1 | POST     "/api/embeddings"
[GIN] 2023/11/30 - 11:55:03 | 400 |     411.917µs |       127.0.0.1 | POST     "/api/embeddings"
[GIN] 2023/11/30 - 11:55:06 | 400 |     285.208µs |       127.0.0.1 | POST     "/api/embeddings"
[GIN] 2023/11/30 - 11:55:11 | 400 |     375.584µs |       127.0.0.1 | POST     "/api/embeddings"
[GIN] 2023/11/30 - 11:55:19 | 400 |     382.666µs |       127.0.0.1 | POST     "/api/embeddings"

Blocked API Access due to CORS Policy - Disable CORS Checking for Specific Request.

When I try to use the extension, I receive the following error. I have Ollama running locally and can query it from Emacs and receive responses.

Access to fetch at 'http://127.0.0.1:11434/api/embeddings' from origin 'chrome-extension://asfdasfasdfs' has been blocked by CORS policy: Response to preflight request doesn't pass access control check: No 'Access-Control-Allow-Origin' header is present on the requested resource. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.

I fed this to Llama and it suggested the following fixes. However, I don't see where in background.ts the fetch call is, so I can't add a no-cors mode.

The error message you encountered is related to the Cross-Origin
Resource Sharing (CORS) policy in web browsers. The browser is
blocking access to the http://127.0.0.1:11434/api/embeddings URL
because the request originates from a different origin than the one
that served the HTML document.

The error message specifically states that there is no
Access-Control-Allow-Origin header present in the response to the
preflight request (which is a request made by the browser to check if
the server supports CORS). As a result, the browser is blocking the
request from proceeding.

To fix this issue, you have two options:

  1. Add the Access-Control-Allow-Origin header to the server
    response. This can be done by adding the following line to the
    server-side code that handles the API request:
    #+BEGIN_SRC php
    header('Access-Control-Allow-Origin: *');
    #+END_SRC
    This will allow the browser to make requests to the API from any origin.

  2. Disable CORS checking for the specific API request by setting the
    mode parameter to 'no-cors' in the fetch() function. This can be
    done like this:
    #+BEGIN_SRC javascript
    fetch('http://127.0.0.1:11434/api/embeddings', {
      mode: 'no-cors'
    })
    #+END_SRC
    This will disable CORS checking for the specific API request, allowing
    it to proceed even though there is no Access-Control-Allow-Origin
    header present in the response.

Note that disabling CORS checking can be a security risk, as it allows
requests from any origin to access the API. You should only use this
option when you have verified that the API is being accessed from a
trusted source.

Make the popover longer (?)

This is more of a suggestion: I feel like the popover (or popup in Arc) is a bit short.
Maybe make it at least resizable?

In this example I can barely see the conversation:

Screen Shot 2024-02-05 at 16 26 59@2x

Thank you,

Implement custom content chunking for domains

Each domain should have its own chunkSize and chunkOverlap values. These values should be passed to the background script for processing.

Also, investigate whether it's useful for each domain to have its own vectorstore retrieval config (e.g. number of documents to return).
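
A rough sketch of how per-domain values could be looked up before being passed to the background script. The `chunkSize` and `chunkOverlap` names come from the issue text; `topK`, `DomainConfig`, `configForUrl` and every numeric value below are hypothetical placeholders, not settings from Lumos.

```typescript
// Hypothetical sketch: all values are placeholders, not tuned settings.
interface DomainConfig {
  chunkSize: number;
  chunkOverlap: number;
  topK: number; // number of documents to return from the vector store
}

const DEFAULT_CONFIG: DomainConfig = { chunkSize: 500, chunkOverlap: 0, topK: 4 };

// Per-domain overrides, keyed by hostname.
const DOMAIN_CONFIGS = new Map<string, DomainConfig>([
  ["news.ycombinator.com", { chunkSize: 300, chunkOverlap: 50, topK: 8 }],
]);

// Resolve the config for a page URL, falling back to the default
// when the URL is malformed or the domain has no override.
function configForUrl(
  url: string,
  overrides: Map<string, DomainConfig> = DOMAIN_CONFIGS,
): DomainConfig {
  let hostname: string;
  try {
    hostname = new URL(url).hostname;
  } catch {
    return DEFAULT_CONFIG;
  }
  return overrides.get(hostname) ?? DEFAULT_CONFIG;
}
```

The popup could call `configForUrl(activeTabUrl)` and include the resulting values in the message it sends to the background script.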

Ollama started service successfully but browser did not respond

First of all, thank you for your excellent work. I have run into an issue with my local deployment. My Docker backend has successfully started the local model service, and access via curl on the command line works fine, but after I set OLLAMA_BASE_URL and OLLAMA_MODEL in script/background.ts, the plugin does not respond in the browser.

Specifically, I run Ollama inside Docker to serve the local model. From outside Docker, curl requests return normally, but after setting the two parameters, the plugin inside the browser gives no response.

On long pages it seems to get stuck

On long pages it seems to halt (e.g. https://news.ycombinator.com/item?id=39190468)

image

Maybe fixed in new versions

Might be nice to have some indication of the amount of work it's doing
Progress bar or something
I mean you know how many chunks it needs to embed, right?

I don't know the feasibility, but wondering if you can do embedding in parallel somehow?
I suppose with an mmap'd model shared by multiple processes it could be?
But that's more of an ollama question perhaps?
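
Since the number of chunks is known before embedding starts, both ideas (a progress indication and sending several requests at once) can be sketched on the client side. This is only an illustration: `embedWithProgress`, `EmbedFn` and `ProgressFn` are hypothetical names, the `embed` callback stands in for whatever actually calls the embeddings endpoint, and whether Ollama processes the requests concurrently on its side remains an open question.

```typescript
// Hypothetical sketch: not taken from the Lumos source.
type EmbedFn = (chunk: string) => Promise<number[]>;
type ProgressFn = (done: number, total: number) => void;

// Embed all chunks with at most `concurrency` requests in flight,
// reporting progress after each chunk. Results keep the input order.
async function embedWithProgress(
  chunks: string[],
  embed: EmbedFn,
  concurrency: number,
  onProgress: ProgressFn,
): Promise<number[][]> {
  const results: number[][] = new Array(chunks.length);
  let next = 0; // index of the next chunk to pick up
  let done = 0; // number of chunks finished so far

  async function worker(): Promise<void> {
    while (next < chunks.length) {
      const index = next++;
      results[index] = await embed(chunks[index]);
      done++;
      onProgress(done, chunks.length);
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, chunks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

The `onProgress` callback could drive a progress bar in the popup, for example by forwarding `done / total` in a message from the background script.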

Thanks

Add LICENSE

As the title says, can you add the License for this awesome project?

Embeddings cache regression (^h^h^h confusion)

I just updated to commit 72439bf but it seems like there is a regression?

image

The TTL is 60 minutes but it seems like it's requesting a series of embeddings for each query.

Ok, so I uninstalled it, then reinstalled it, in case my chrome storage options got in a wonky state somehow.

It's then not showing the connection indicator (which I /was/ seeing at first!) for the model 404

image image

So, back to the embeddings, I've removed/installed. Once I select a model in options hopefully we are good?

Response:

image

Lots of embeddings (long page):
image

Hrmmm, it definitely seems like it's calling the embedding endpoint many times for each query. I could have sworn you were caching, that's what the TTL means, right!?

Oh, it's not cached when isHighlightedContent:

          chrome.runtime.sendMessage({
            prompt: prompt,
            skipRAG: false,
            chunkSize: config.chunkSize,
            chunkOverlap: config.chunkOverlap,
            url: activeTabUrl.toString(),
            skipCache: isHighlightedContent,
            imageURLs: imageURLs,
          });

Based on:

const getHighlightedContent = (): string => {
  const selection = window.getSelection();
  return selection ? selection.toString().trim() : "";
};

Oh, I see! I guess it's a bit complicated to use the cache easily, eh?

Hrmmmmmm, there are other optimizations you could do, but compared to creating completely new embeddings, how about a simple linear search over the highlighted string to see if it contains any of the chunks that would otherwise be returned by the configured parser (i.e. "canonical" chunks)?
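
The linear search suggested here could be sketched roughly as follows. `cachedChunksInSelection` is a hypothetical helper, not part of Lumos, and the whitespace normalization is an assumption: text from window.getSelection() may not match the parsed page content character for character.

```typescript
// Hypothetical sketch: not taken from the Lumos source.
// Returns the indexes of the canonical chunks that appear verbatim
// (ignoring whitespace differences) inside the highlighted text.
function cachedChunksInSelection(
  selection: string,
  canonicalChunks: string[],
): number[] {
  const normalize = (s: string): string => s.replace(/\s+/g, " ").trim();
  const haystack = normalize(selection);
  const hits: number[] = [];
  canonicalChunks.forEach((chunk, index) => {
    const needle = normalize(chunk);
    if (needle.length > 0 && haystack.includes(needle)) {
      hits.push(index);
    }
  });
  return hits;
}
```

Chunks found this way could reuse their cached embeddings, so that only the parts of the selection not covered by a canonical chunk would need new ones.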

Function calling support?

Can it do function calling? It will automate so much stuff if it can. Please close this if it already can do that.

I have played with function calling on ChatGPT and tried to make a local ChatGPT based tool. But I can't just let ChatGPT go to pages and do research for me (will be too expensive).

With function calling, Lumos would be able to answer any question by sending a request to the appropriate tool.

"Connection" indicator

Played with this again, did a git reset hard to origin to update (a bit mindlessly, oops), then of course it overwrote DEFAULT_MODEL (Oh, I see there's a gui for that now in options)

An indicator somewhere for:

  • Is Ollama alive? Needs starting? Responding at all?
  • Is the model available? (I guess on-the-fly model config is a whole other issue.)
  • Needs origins configuring? 403 (or whatever) Forbidden responses?

I guess you could even use the browser action icon.

The thing is it just seems to quietly go about doing nothing when ollama is not running.
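
A small sketch of how the states listed above could be told apart, so the extension reports something instead of quietly doing nothing. It assumes the extension probes Ollama with a request that lists the installed models (for instance GET /api/tags) and that a disallowed origin shows up as a 403; `ProbeResult`, `Indicator` and `classifyConnection` are hypothetical names, not part of Lumos.

```typescript
// Hypothetical sketch: not taken from the Lumos source.
type ProbeResult =
  | { kind: "network-error" } // the request never got a response
  | { kind: "http"; status: number; models: string[] };

type Indicator =
  | "ok"
  | "ollama-unreachable"
  | "origin-forbidden"
  | "model-missing"
  | "unknown-error";

// Map the outcome of a probe request to a state the UI can display.
// Model names are compared including their tag, e.g. "llama2:latest".
function classifyConnection(probe: ProbeResult, wantedModel: string): Indicator {
  if (probe.kind === "network-error") return "ollama-unreachable";
  if (probe.status === 403) return "origin-forbidden";
  if (probe.status < 200 || probe.status >= 300) return "unknown-error";
  return probe.models.includes(wantedModel) ? "ok" : "model-missing";
}
```

Each state could then map to a badge or icon variant on the browser action, with "origin-forbidden" pointing the user at the OLLAMA_ORIGINS setting.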

image

image

re: models
image

I have various quants of the same model. Dunno how often people would actually do that, but yeah, I guess if there are several, maybe it should show the tag as a discriminator?
