andrewnguonly / lumos
A RAG LLM co-pilot for browsing the web, powered by local LLMs
License: MIT License
Using tinyllama
[GIN] 2024/02/10 - 12:33:07 | 200 | 51.673542ms | 127.0.0.1 | POST "/api/embeddings"
[GIN] 2024/02/10 - 12:33:07 | 200 | 51.70225ms | 127.0.0.1 | POST "/api/embeddings"
[GIN] 2024/02/10 - 12:33:07 | 200 | 51.951042ms | 127.0.0.1 | POST "/api/embeddings"
[GIN] 2024/02/10 - 12:33:07 | 200 | 43.755125ms | 127.0.0.1 | POST "/api/embeddings"
Maybe you can "get away with" using a smaller model just for embeddings to make things a bit more responsive?
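For example (just a sketch, assuming the extension uses LangChain's OllamaEmbeddings; the model name is only an illustration of a lighter-weight option), the embedding model could be decoupled from the chat model:

import { OllamaEmbeddings } from "@langchain/community/embeddings/ollama";

// Hypothetical split: a small model dedicated to embeddings, independent of
// whichever model is selected for chat/completions.
const embeddings = new OllamaEmbeddings({
  baseUrl: "http://localhost:11434",
  model: "all-minilm", // illustrative; any small embedding-capable model
});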
Hey! Thanks for your time on this POC.
It's not working for me. I'm waiting for instructions to debug this; the installation is not user friendly.
Apple M1 Pro
Sonoma 14.1.1
16GB RAM
Ollama working well:
ollama -v
ollama version 0.1.12
OLLAMA_ORIGINS=chrome-extension://* ollama serve
[GIN] 2023/11/30 - 11:55:01 | 400 | 174.833µs | 127.0.0.1 | POST "/api/embeddings"
[GIN] 2023/11/30 - 11:55:03 | 400 | 411.917µs | 127.0.0.1 | POST "/api/embeddings"
[GIN] 2023/11/30 - 11:55:06 | 400 | 285.208µs | 127.0.0.1 | POST "/api/embeddings"
[GIN] 2023/11/30 - 11:55:11 | 400 | 375.584µs | 127.0.0.1 | POST "/api/embeddings"
[GIN] 2023/11/30 - 11:55:19 | 400 | 382.666µs | 127.0.0.1 | POST "/api/embeddings"
When I try to use the extension, I receive the following error. I have Ollama running locally and can query it from Emacs and receive responses.
Access to fetch at 'http://127.0.0.1:11434/api/embeddings' from origin 'chrome-extension://asfdasfasdfs' has been blocked by CORS policy: Response to preflight request doesn't pass access control check: No 'Access-Control-Allow-Origin' header is present on the requested resource. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.
I fed this to Llama and it suggested the following fixes. However, I don't see where in background.ts the fetch call is for me to add a no-cors mode.
The error message you encountered is related to the Cross-Origin Resource Sharing (CORS) policy in web browsers. The browser is blocking access to the http://127.0.0.1:11434/api/embeddings URL because the request originates from a different domain than the one that served the HTML document. The error message specifically states that there is no Access-Control-Allow-Origin header present in the response to the preflight request (a request made by the browser to check if the server supports CORS). As a result, the browser is blocking the request from proceeding. To fix this issue, you have two options:

Add the Access-Control-Allow-Origin header to the server response. This can be done by adding the following line to the server-side code that handles the API request:
#+BEGIN_SRC php
header('Access-Control-Allow-Origin: *');
#+END_SRC
This will allow the browser to make requests to the API from any origin.

Disable CORS checking for the specific API request by setting the mode parameter to 'no-cors' in the fetch() function. This can be done like this:
#+BEGIN_SRC javascript
fetch('http://127.0.0.1:11434/api/embeddings', {
mode: 'no-cors'
})
#+END_SRC
This will disable CORS checking for the specific API request, allowing it to proceed even though there is no Access-Control-Allow-Origin header present in the response.

Note that disabling CORS checking can be a security risk, as it allows requests from any origin to access the API. You should only use this option when you have verified that the API is being accessed from a trusted source.
This is helpful for use cases where a user just wants to prompt the LLM without any content.
Also, consider implementing function calling to retrieve a binary yes or no response.
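As a rough sketch of the yes/no idea (not true function calling, just a classification prompt against Ollama's /api/generate endpoint; the helper name and model are illustrative):

// Hypothetical helper: ask the local model for a strict yes/no answer.
const classifyYesNo = async (question: string): Promise<boolean> => {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama2", // illustrative model name
      prompt: `Answer with a single word, "yes" or "no": ${question}`,
      stream: false,
    }),
  });
  const data = await res.json();
  // Ollama returns the completion in the "response" field.
  return data.response.trim().toLowerCase().startsWith("yes");
};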
Each domain should have its own chunkSize and chunkOverlap values, and these values should be passed to the background script for processing (see the sketch below). Also, investigate whether it would be useful for each domain to have its own vector store retrieval config (e.g. number of documents to return).
https://github.com/ollama/ollama/blob/v0.1.23/api/types.go#L45
Note: This may be dependent on LangChain changes.
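A minimal sketch of the per-domain idea (the hostnames, default values, and perDomainConfig name are all illustrative, not from the repo):

// Hypothetical per-domain chunking config, keyed by hostname.
const perDomainConfig: Record<string, { chunkSize: number; chunkOverlap: number }> = {
  "news.ycombinator.com": { chunkSize: 500, chunkOverlap: 0 },
  "en.wikipedia.org": { chunkSize: 2000, chunkOverlap: 200 },
};
const defaultChunking = { chunkSize: 1000, chunkOverlap: 100 };

// Content script side: resolve the config for the current domain and pass it
// along with the prompt, mirroring the existing chrome.runtime.sendMessage call.
const { chunkSize, chunkOverlap } =
  perDomainConfig[window.location.hostname] ?? defaultChunking;
chrome.runtime.sendMessage({ prompt: "summarize this page", chunkSize, chunkOverlap });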
First of all, thank you for your excellent work, but I have run into some issues with my local deployment. My Docker backend has successfully brought up the local model service, and curl requests from the command line work fine, but when I set OLLAMA_BASE_URL and OLLAMA_MODEL in script/background.ts, the plugin does not respond in the browser.
To be specific: I run Ollama inside Docker for local model serving. Accessing it with curl from outside Docker returns results normally, but after setting the two parameters in the plugin, there is no response from the plugin inside the browser.
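For reference, a configuration sketch (the port mapping, origins value, and model name are assumptions about a typical Docker setup, not taken from this report):

// script/background.ts (sketch): the base URL must be reachable from the host
// browser, which usually means the container is started with a port mapping
// such as -p 11434:11434 and with OLLAMA_ORIGINS="chrome-extension://*" set
// inside the container; otherwise requests fail CORS with no visible response.
const OLLAMA_BASE_URL = "http://localhost:11434";
const OLLAMA_MODEL = "llama2"; // illustrative model name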
Hi @andrewnguonly, can you please try to build another project with https://github.com/pieces-app/client?
Pieces allows using LLMs to get ChatGPT-like persistent chats that can answer any questions, and it supports local LLMs. There is a /chats API endpoint.
The Chrome extension is here: https://docs.pieces.app/extensions-plugins/chrome
Can you build a similar project, but with the LLM running through their TypeScript SDK?
On long pages it seems to halt (e.g. https://news.ycombinator.com/item?id=39190468).
Maybe this is fixed in newer versions.
Might be nice to have some indication of the amount of work it's doing, a progress bar or something. I mean, you know how many chunks it needs to embed, right?
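A rough sketch of what that could look like, since the chunk count is known up front (the message type and helper name are hypothetical):

// Sketch: embed chunks one at a time and report progress to the popup.
const embedWithProgress = async (
  chunks: string[],
  embed: (text: string) => Promise<number[]>
): Promise<number[][]> => {
  const vectors: number[][] = [];
  for (const [i, chunk] of chunks.entries()) {
    vectors.push(await embed(chunk));
    // "lumos-progress" is an illustrative message type, not an existing one.
    chrome.runtime.sendMessage({ type: "lumos-progress", done: i + 1, total: chunks.length });
  }
  return vectors;
};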
I don't know the feasibility, but wondering if you can do the embedding in parallel somehow?
I suppose with an mmap'd model shared by multiple processes it could be?
But that's more of an Ollama question perhaps?
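On the client side, the requests could at least be issued in small batches (a sketch; whether the Ollama server actually processes them concurrently is the server-side question above):

// Sketch: send embedding requests in batches with Promise.all.
const embedInBatches = async (
  chunks: string[],
  embed: (text: string) => Promise<number[]>,
  batchSize = 4
): Promise<number[][]> => {
  const vectors: number[][] = [];
  for (let i = 0; i < chunks.length; i += batchSize) {
    const batch = chunks.slice(i, i + batchSize);
    vectors.push(...(await Promise.all(batch.map(embed))));
  }
  return vectors;
};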
Thanks
As the title says, can you add the License for this awesome project?
I just updated to commit 72439bf but it seems like there is a regression?
The TTL is 60 minutes but it seems like it's requesting a series of embeddings for each query.
Ok, so I uninstalled it, then reinstalled it, in case my chrome storage options got in a wonky state somehow.
It's then not showing the connection indicator (which I /was/ seeing at first!) for the model 404.
So, back to the embeddings, I've removed/installed. Once I select a model in options hopefully we are good?
Response:
Lots of embeddings (long page):
Hrmmm, it definitely seems like it's calling the embedding endpoint many times for each query. I could have sworn you were caching, that's what the TTL means, right!?
Oh, it's not cached when isHighlightedContent:
chrome.runtime.sendMessage({
prompt: prompt,
skipRAG: false,
chunkSize: config.chunkSize,
chunkOverlap: config.chunkOverlap,
url: activeTabUrl.toString(),
skipCache: isHighlightedContent,
imageURLs: imageURLs,
});
Based on:
const getHighlightedContent = (): string => {
const selection = window.getSelection();
return selection ? selection.toString().trim() : "";
};
Oh, I see! I guess it's a bit complicated to use the cache easily, eh?
Hrmmm, there are other optimizations you could do, but compared to creating completely new embeddings, what about a simple linear search over the highlighted string to see if it contains any of the chunks that would otherwise be returned by the configured parser (i.e. the "canonical" chunks)?
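Something like this, perhaps (a sketch; it assumes the canonical chunks from a prior full-page pass are available):

// Sketch: reuse cached chunk embeddings whenever the highlighted text simply
// contains chunks that were already embedded for the full page.
const reusableChunks = (highlighted: string, cachedChunks: string[]): string[] =>
  cachedChunks.filter((chunk) => highlighted.includes(chunk));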
Can it do function calling? It would automate so much stuff if it could. Please close this if it already can do that.
I have played with function calling on ChatGPT and tried to make a local ChatGPT-based tool, but I can't just let ChatGPT go to pages and do research for me (it would be too expensive).
With function calling, Lumos would be able to answer any question by sending the request to the appropriate tool.
Played with this again; did a git reset --hard to origin to update (a bit mindlessly, oops), and of course it overwrote DEFAULT_MODEL (oh, I see there's a GUI for that now in Options).
An indicator somewhere for:
Is Ollama alive?
Does it need starting? Is it responding at all?
Is the model available? (I guess on-the-fly model config is a whole other issue.)
Do origins need configuring? 403 (or whatever) Forbidden responses?
I guess you could even use the browser action icon (see the sketch below).
The thing is it just seems to quietly go about doing nothing when ollama is not running.
I have various quants of the same model; I don't know how often people would actually do that, but yeah, if they do, maybe it should show the tag as a discriminator?
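A minimal sketch of the browser-action idea (the badge text and helper name are arbitrary; it only assumes Ollama answers plain GET requests at its base URL):

// Sketch: ping the Ollama base URL from the background script and surface the
// result on the extension's action icon.
const updateOllamaBadge = async (baseUrl: string): Promise<void> => {
  try {
    const res = await fetch(baseUrl); // Ollama replies "Ollama is running" at /
    await chrome.action.setBadgeText({ text: res.ok ? "" : "ERR" });
  } catch {
    await chrome.action.setBadgeText({ text: "OFF" });
  }
};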
This is a feature request. Since sending the entire page is quite time consuming, can the extension allow sending only the selected portion of a web page to Ollama?
This would be especially helpful for processing documents such as Google Docs, as well as large web pages.