transitive-bullshit / OpenOpenAI
Self-hosted version of OpenAI's new stateful Assistants API
License: MIT License
Is this project dead, or is there any chance it'll get updated again?
Can anyone recommend any alternative?
This is for the built-in `retrieval` tool.
Currently, the knowledge retrieval implementation is very naive: it simply returns the full contents of every attached file (source).
The current implementation also only supports plain-text file types like text/plain and markdown, since no preprocessing or conversions are done at the moment.
It shouldn't be too hard to add support for more legit knowledge retrieval approaches, which would require:
- `processForFileAssistant` - file ingestion pre-processing for files marked with `purpose: 'assistants'`
  - converting files to markdown (this is probably the hardest step to do well across all of the most common file types)
  - chunking and tracking which `file_id` each chunk comes from for filtering purposes
- `retrievalTool` - performs knowledge retrieval for a given `query` on a set of `file_ids` for RAG, taking `query` and `file_ids` as parameters
Integrations here with LangChain and/or LlamaIndex would be great for their flexibility, but we could also KISS and roll our own using https://github.com/dexaai/dexter
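As a starting point, the chunking-with-provenance step could look something like the following. This is a minimal sketch, not the project's actual code; `Chunk` and `chunkFile` are hypothetical names, and a real implementation would also embed each chunk and store it in a vector index.

```typescript
// Hypothetical sketch: split a file's contents into fixed-size, overlapping
// chunks, tagging each chunk with the file_id it came from so retrieval
// results can later be filtered by file.
type Chunk = { file_id: string; text: string }

function chunkFile(
  file_id: string,
  contents: string,
  size = 512,
  overlap = 64
): Chunk[] {
  const chunks: Chunk[] = []
  for (let i = 0; i < contents.length; i += size - overlap) {
    chunks.push({ file_id, text: contents.slice(i, i + size) })
  }
  return chunks
}
```

Keeping the `file_id` on every chunk is what lets the `retrievalTool` restrict a query to the set of `file_ids` attached to a given thread or assistant.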
Just installed, and kicked off dist/server and dist/runner.
Both start listening with no complaints.
Kick off a test with: OPENAI_API_BASE_URL='http://127.0.0.1:3000' npx tsx e2e
and get the below error on runner.
Note the only thing I can see wrong is that it's showing `server: cloudflare`, though I have S3 configured with an s3:// address.
According to this thread on the OpenAI forums, this can sometimes be caused by a missing "Authorization" header.
https://community.openai.com/t/getting-hit-with-429-slow-down-error/482704/16
Runner started for queue "openopenai" listening for "thread-run" jobs
Processing thread-run job "clrmgtt100005iwpwdp54i3mb" for run "clrmgtt100005iwpwdp54i3mb"
Job "clrmgtt100005iwpwdp54i3mb" run "clrmgtt100005iwpwdp54i3mb": >>> chat completion call {
messages: [
{ role: 'system', content: 'You are a helpful assistant.' },
{
role: 'user',
content: 'What is the weather in San Francisco today?'
}
],
model: 'gpt-4-1106-preview',
tools: [ { type: 'function', function: [Object] } ],
tool_choice: 'auto'
}
Error job "clrmgtt100005iwpwdp54i3mb" run "clrmgtt100005iwpwdp54i3mb": APIError: 429 "slow down"
at <anonymous> (/Users/chuckjewell/git/123/repos/ai_dev/OpenOpenAI/node_modules/.pnpm/[email protected]/node_modules/openai-fetch/src/fetch-api.ts:43:14)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at fn (/Users/chuckjewell/git/123/repos/ai_dev/OpenOpenAI/node_modules/.pnpm/[email protected]/node_modules/ky/source/core/Ky.ts:55:14)
at Promise.result.<computed> (/Users/chuckjewell/git/123/repos/ai_dev/OpenOpenAI/node_modules/.pnpm/[email protected]/node_modules/ky/source/core/Ky.ts:86:27)
at OpenAIClient.createChatCompletion (/Users/chuckjewell/git/123/repos/ai_dev/OpenOpenAI/node_modules/.pnpm/[email protected]/node_modules/openai-fetch/src/openai-client.ts:73:45)
at ChatModel.runModel (/Users/chuckjewell/git/123/repos/ai_dev/OpenOpenAI/node_modules/.pnpm/@[email protected]/node_modules/@dexaai/dexter/src/model/chat.ts:53:24)
at ChatModel.run (/Users/chuckjewell/git/123/repos/ai_dev/OpenOpenAI/node_modules/.pnpm/@[email protected]/node_modules/@dexaai/dexter/src/model/model.ts:132:24)
at Worker.Worker.connection (/Users/chuckjewell/git/123/repos/ai_dev/OpenOpenAI/src/runner/index.ts:270:21)
at async Worker.processJob (/Users/chuckjewell/git/123/repos/ai_dev/OpenOpenAI/node_modules/.pnpm/[email protected]/node_modules/bullmq/dist/cjs/classes/worker.js:350:28)
at async Worker.retryIfFailed (/Users/chuckjewell/git/123/repos/ai_dev/OpenOpenAI/node_modules/.pnpm/[email protected]/node_modules/bullmq/dist/cjs/classes/worker.js:537:24) {
status: 429,
headers: {
'alt-svc': 'h3=":443"; ma=86400',
'cf-ray': '8489bcf009a4efd2-PDX',
connection: 'keep-alive',
'content-length': '22',
'content-type': 'application/json',
date: 'Sat, 20 Jan 2024 19:31:29 GMT',
server: 'cloudflare',
'set-cookie': '__cf_bm=NIGNPPboVXD8fiY_rJBrhjexBEiM7x8_Q4Nkuof9gS4-1705779089-1-ATL5Ajgw+3zdT405fvUxZgi5nnc3+jPQ/+W+uEv1Tk8yuymYc0CDzU2F25WJ0zNVlrF4eVSTlCTvoHOZeG7ZH/A=; path=/; expires=Sat, 20-Jan-24 20:01:29 GMT; domain=.api.openai.com; HttpOnly; Secure; SameSite=None',
vary: 'Accept-Encoding'
},
error: 'slow down',
code: undefined,
param: undefined,
type: undefined
}
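One way to make the runner more resilient to transient 429 "slow down" responses would be to wrap the chat completion call in an exponential backoff. This is only a sketch under the assumption that the thrown error carries a `status` field (as in the stack trace above); `withBackoff` is a hypothetical helper, not part of OpenOpenAI.

```typescript
// Hypothetical retry wrapper with exponential backoff for 429 responses.
// Retries the given async function, doubling the delay on each attempt,
// and rethrows any non-429 error (or a 429 once retries are exhausted).
async function withBackoff<T>(fn: () => Promise<T>, retries = 5): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (err: any) {
      if (err?.status !== 429 || attempt >= retries) throw err
      const delayMs = 2 ** attempt * 500 // 0.5s, 1s, 2s, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs))
    }
  }
}
```

The runner could then call `withBackoff(() => client.createChatCompletion(params))` instead of invoking the client directly.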
This isn't supported in the official OpenAI API yet, but it was mentioned at the OpenAI dev day that it will be coming soon, possibly via websocket and/or webhook support.
See this related issue in the OpenAI developer community.
The toughest part of this is that the runner is completely separate from the HTTP server, as it should be, in order to process thread runs in an async task queue. The runner is responsible for making the chat completion calls, which are streamable, so we'd have to either:
- have the `createRun` or `createThreadAndRun` operations return a stream, and then pipe the chat completion calls into this stream, or
- add websocket and/or webhook support to the `createRun` / `createThreadAndRun` endpoints
OpenAI uses prefixed IDs for its resources, which would be great to replicate, except they're a pain to get working with Prisma.
See prisma/prisma#3391 and prisma/prisma#6719 for more details.
OpenAI's resource prefixes include asst_ for assistants, thread_ for threads, msg_ for messages, run_ for runs, and file_ for files.
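Since Prisma doesn't natively support prefixed primary keys (per the linked issues), one workaround is to generate the IDs at the application layer and pass them in on create. A minimal sketch, where `createPrefixedId` is a hypothetical helper:

```typescript
// Hypothetical sketch: generate OpenAI-style prefixed IDs (e.g. "run_...")
// in application code, since Prisma can't express prefixed IDs natively.
import { randomBytes } from 'node:crypto'

function createPrefixedId(prefix: string, bytes = 12): string {
  // base64url yields a compact, URL-safe suffix similar in shape to OpenAI's IDs
  return `${prefix}_${randomBytes(bytes).toString('base64url')}`
}
```

The model's `id` field would then be a plain `String @id` in the Prisma schema, populated with `createPrefixedId('run')` (and so on) at insert time.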
Currently, the model is hard-coded to use the OpenAI chat completion API, but it wouldn't be very difficult to support custom LLMs or external model providers.
The only real constraint is that the custom models need to support function calling and/or ideally parallel tool calling using OpenAI's tool_calls
format.
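For concreteness, this is the shape of the `tool_calls` format (field names as in OpenAI's Chat Completions API) that a custom model provider would need to emit; the `ToolCall` interface name and the example values are my own:

```typescript
// Sketch of OpenAI's tool_calls shape that custom model providers would
// need to produce. Note that `arguments` is a JSON-encoded string, not
// a parsed object.
interface ToolCall {
  id: string
  type: 'function'
  function: { name: string; arguments: string }
}

const example: ToolCall = {
  id: 'call_abc123',
  type: 'function',
  function: { name: 'get_weather', arguments: '{"location":"San Francisco"}' }
}
```

Parallel tool calling just means the model may return several of these objects in one assistant message, so the runner can execute the tools concurrently.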
Will consider implementing this depending on how much love this issue gets.
This is for the built-in `code_interpreter` tool.
Currently, the built-in `code_interpreter` tool is hard-coded to throw a 501 unsupported error (source).
At a minimum, we should support an integration with open-interpreter.
I believe that e2b also has some interpreter functionality.
OpenGPTs is an awesome OSS project by the LangChain team that has a decent amount of overlap with this project.
The main difference between the two is that this project is intended to have 100% API compatibility with the official OpenAI Assistants API, whereas OpenGPTs is based loosely on the functionality of OpenAI GPTs.
Seeing as we're all playing around in similar sandboxes, I figured it made sense to open an issue to see if there's any interest.