Comments (6)
I've now enabled multi-threaded serving in v0.0.4 so hopefully you won't find that the healthcheck endpoint becomes unresponsive while the system is working.
I've also forked the fauxpilot plugin for VSCode so that it shows a status wheel while it is thinking, which helps with the UX a bit. For now I've provided a custom build of the plugin that can be downloaded and installed via the "Install from VSIX" option (hopefully the upstream maintainer will incorporate my PR).
I've also noticed that the fauxpilot plugin is a bit fussy about keeping focus: if you click out of the window while it's thinking, it seems to silently throw the suggestions away.
from turbopilot.
Thanks for your ticket. Looking at the prediction logs you screenshotted, it's taking about 2 minutes to generate a response on your system. I'm not 100% sure, but it's possible you're hitting a timeout within VSCode. Are you seeing any timeout logs from the fauxpilot plugin? What system are you running on and how many threads are you using?
from turbopilot.
> What system are you running on and how many threads are you using?
Memory usage was around 4 GB. CPU: 12 x AMD Ryzen 5 3600X 6-Core Processor (1 Socket), with AVX2 support.
At the time I believe I was using 6 threads. I had also been using the 2B model.
> I'm not 100% sure, but it's possible you're hitting a timeout within VSCode. Are you seeing any timeout logs from the fauxpilot plugin?
The developer tools show the extension making requests to the server and show the requests completing, but no prediction is returned. There's no log there about a timeout that I can see. :/
I tried with 2 threads and saw a prediction (once) - trying to reproduce... but it seems very hit and miss (it was also nonsensical and pressing TAB didn't accept the prediction) O.o
Might need to open an issue on the fauxpilot vsc plugin repo.
One thing I notice with K8s: I add a readiness health check for startup, but while I am requesting predictions the pod often fails that health check, so K8s stops sending traffic to it. My CPU isn't pegged at 100%, so I think the web server is getting hung.
from turbopilot.
OK great, thank you for your report. Is it possible that k8s is trying to kill and restart the pod while a prediction is happening and the health endpoint is timing out? I imagine k8s would be a pretty common deployment case, so I definitely want to support a liveness check. I think the web server can only handle one request at a time at the moment. I will see if I can turn on Crow's multithreaded support.
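To illustrate the suspected failure mode, here is a minimal sketch using Python's standard library rather than Crow or the actual turbopilot server. It assumes a hypothetical `/predict` route that blocks for `SLOW_SECONDS` as a stand-in for a long prediction, and a hypothetical `/health` route. It times a health check sent while a prediction is in flight, once against a single-threaded server and once against a threaded one.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer, ThreadingHTTPServer

SLOW_SECONDS = 1.0                # assumed stand-in for a long-running prediction
slow_started = threading.Event()  # set once the slow handler is actually running


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/predict":   # hypothetical route name
            slow_started.set()
            time.sleep(SLOW_SECONDS)  # blocks whichever thread is serving this request
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def get(port, path):
    """Issue one GET against the local demo server and return the body."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=10)
    try:
        conn.request("GET", path)
        return conn.getresponse().read()
    finally:
        conn.close()


def health_latency(server_cls):
    """Time a /health request sent while a /predict request is in flight."""
    slow_started.clear()
    server = server_cls(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()

    slow = threading.Thread(target=get, args=(port, "/predict"))
    slow.start()
    slow_started.wait(timeout=5)  # make sure the prediction is being handled

    start = time.monotonic()
    get(port, "/health")
    elapsed = time.monotonic() - start

    slow.join()
    server.shutdown()
    server.server_close()
    return elapsed
```

Calling `health_latency(HTTPServer)` should take roughly as long as the simulated prediction, because the health request waits behind it, while `health_latency(ThreadingHTTPServer)` should return almost immediately. A probe with a short timeout would fail in the first case even though the CPU is not saturated.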
I also noticed that the fauxpilot plugin can be a little bit fussy about retaining focus - I think if you start a prediction, change to another window and change back it will cancel the suggestion popup. I might dive into the plugin itself over the weekend to see if I can make any improvements myself - from a UX point of view even a notification of progress would be welcome right?
from turbopilot.
Are you thinking about changing the way this extension works, so that it doesn't generate code automatically but instead waits for a command? For example, if I want to generate code, I press "CMD-K". This would cut out a lot of unnecessary requests to the server. What do you think?
from turbopilot.
@whatvn I think that'd be great - I'm looking into the feasibility at the moment. Within VSCode we're limited by what their API lets us do. The API Documentation states that "Providers are asked for completions either explicitly by a user gesture or implicitly when typing." so in theory it seems possible, but I am looking at how to make it work with a user gesture, or at least make it wait a little longer after you stop typing before sending a request.
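The "wait a little longer" idea is essentially a debounce. Below is a language-neutral sketch in Python, not the plugin's actual code (a real VSCode extension would be written in TypeScript against the VSCode API). The `Debouncer` class and its `trigger` and `request_now` methods are hypothetical names: `trigger` stands for a keystroke, `request_now` stands for an explicit gesture such as the suggested shortcut.

```python
import threading


class Debouncer:
    """Run `callback` only after `delay_seconds` have passed with no new trigger."""

    def __init__(self, delay_seconds, callback):
        self._delay = delay_seconds
        self._callback = callback
        self._timer = None
        self._lock = threading.Lock()

    def _cancel_pending(self):
        # Caller must hold the lock.
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None

    def trigger(self, text):
        """Called on every keystroke: restart the countdown with the latest text."""
        with self._lock:
            self._cancel_pending()
            self._timer = threading.Timer(self._delay, self._callback, args=(text,))
            self._timer.daemon = True
            self._timer.start()

    def request_now(self, text):
        """Called on an explicit user gesture: skip the wait and send immediately."""
        with self._lock:
            self._cancel_pending()
        self._callback(text)
```

With this shape, a burst of keystrokes results in a single request carrying the latest text, and an explicit gesture bypasses the delay entirely. The delay value is a tuning choice: longer delays mean fewer wasted requests but slower suggestions.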
from turbopilot.