russellcanfield / wingman-ai
An open source AI coding assistant VSCode extension. Works with Ollama, HuggingFace, OpenAI and Anthropic

License: MIT License


wingman-ai's Introduction

Wingman - AI Coding Assistant

The Wingman-AI extension brings high-quality, AI-assisted coding right to your computer. It's 100% free and private, which means data never leaves your machine!

Like the extension? Check out Squadron AI, our AI-assisted code reviewer.

🚀 Getting Started

Choosing an AI Provider

We recommend starting with Ollama using the Deepseek model(s); see why here or here.

  • Install this extension from the VS Code Marketplace: Wingman-AI
  • Install Ollama
  • Install the supported local models by running the following command(s), for example:
    • ollama pull deepseek-coder:6.7b-base-q8_0
    • ollama pull deepseek-coder:6.7b-instruct-q8_0

That's it! This extension will validate that the models are configured correctly in its VSCode settings upon launch. If you wish to customize which models run, see the FAQ section.

Features

Code Completion

The AI will look for natural pauses in typing to decide when to offer code suggestions (keep in mind the speed is limited by your machine). The code completion feature will also analyze comments you type and generate suggestions based on that context.

Wingman AI code completion example

Code Completion Disable / Hotkey

We understand that sometimes the code completion feature can be too aggressive, which may strain your system's resources during local development. To address this, we have introduced an option to disable automatic code completion. However, we also recognize the usefulness of on-demand completion. Therefore, we've implemented a hotkey that allows you to manually trigger code completion at your convenience.

When you need assistance, simply press Shift + Ctrl + Space. This will bring up a code completion preview right in the editor and a quick action will appear. If you're satisfied with the suggested code, you can accept it by pressing Enter. This provides you with the flexibility to use code completion only when you want it, without the overhead of automatic triggers.
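If Shift + Ctrl + Space conflicts with another binding on your machine, you can remap it like any other VS Code command through keybindings.json. The sketch below uses a placeholder command ID for illustration only; look up the actual command contributed by Wingman under Preferences > Keyboard Shortcuts before copying it.

// keybindings.json (Preferences > Keyboard Shortcuts > "Open Keyboard Shortcuts (JSON)")
// "wingman.triggerCodeComplete" is a hypothetical command ID used for illustration;
// substitute the real command listed for the Wingman extension.
[
  {
    "key": "ctrl+alt+space",
    "command": "wingman.triggerCodeComplete",
    "when": "editorTextFocus"
  }
]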

Interactive Chat

Talk to the AI naturally! It will use open files as context to answer your question, or simply select a section of code to use as context. Chat will also analyze comments you type and generate suggestions based on that context.

Wingman AI chat example


AI Providers

Ollama

Ollama is a free and open-source AI model provider, allowing users to run their own local models.

Why Ollama?

Ollama was chosen for its simplicity, allowing users to pull a number of models in different configurations and update them at will. Ollama will pull optimized models based on your system architecture; however, if you do not have a GPU-accelerated machine, models will run more slowly.

Setting up Ollama

Follow the directions on the Ollama website. Ollama has a number of open source models available that are capable of writing high quality code. See getting started for how to pull and customize models.

Supported Models

The extension uses a separate model for chat and code completion. This is because different types of models have different strengths; mixing and matching offers the best results.

NOTE - You can use any quantization of a supported model; you are not limited to a specific one.

Example: deepseek-coder:6.7b-instruct-q4_0

Supported Models for Code Completion:

Supported Models for Chat:

OpenAI

OpenAI is supported! You can use the following models:

  • GPT-4o
  • GPT-4 Turbo
  • GPT-4

NOTE - Unlike using Ollama, your data is not private and will not be sanitized prior to being sent.

Anthropic

Anthropic is supported! You can use the following models:

  • Claude 3.5 Sonnet
  • Claude 3 Opus

NOTE - Unlike using Ollama, your data is not private and will not be sanitized prior to being sent.

Hugging Face

Hugging Face supports hosting and training models, but it also lets you run many models (under 10GB) for free! All you have to do is create a free account.

Setting up Hugging Face

Once you have a Hugging Face account and an API key, all you need to do is open the VSCode settings pane for this extension "Wingman" (see FAQ).

Once it's open, select "HuggingFace" as the AI Provider and add your API key under the HuggingFace section:
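As a rough sketch of what the resulting entries look like in settings.json (the key names below are assumptions based on the extension's "Wingman" settings namespace; confirm the exact names in the Settings UI for your version):

// settings.json - illustrative only; verify the exact keys in the Wingman settings UI
{
  "Wingman.Provider": "HuggingFace",
  "Wingman.HuggingFace.apiKey": "hf_xxxxxxxxxxxxxxxx",
  "Wingman.HuggingFace.chatModel": "<a supported chat model>",
  "Wingman.HuggingFace.codeModel": "<a supported code completion model>"
}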

Supported Models

The extension uses a separate model for chat and code completion. This is because different types of models have different strengths; mixing and matching offers the best results.

Supported Models for Code Completion:

Supported Models for Chat:

NOTE - Unlike using Ollama, your data is not private and will not be sanitized prior to being sent.


FAQ

  • How can I change which models are being used? This extension uses settings like any other VSCode extension; see the examples below.

  • The AI models feel slow, why? As of pre-release 0.0.6 we've added an indicator in the bottom status bar to show you when an AI model is actively processing. If you aren't using GPU-accelerated hardware, you may need to look into Quantization.
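For example, changing which Ollama models are used looks roughly like the snippet below in settings.json. The chatModel/codeModel keys under "Wingman.Ollama" are referenced elsewhere in this document, but treat this as a sketch and confirm the exact key names in the extension's Settings UI:

// settings.json - a sketch of overriding the default models (verify key names first)
{
  "Wingman.Provider": "Ollama",
  "Wingman.Ollama.chatModel": "deepseek-coder:6.7b-instruct-q8_0",
  "Wingman.Ollama.codeModel": "deepseek-coder:6.7b-base-q8_0"
}

Remember that any model you reference must already be pulled locally (ollama pull <model>).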

Troubleshooting

This extension leverages Ollama due to its simplicity and its ability to deliver the right container optimized for your running environment. However, good AI performance relies on your machine specs, so if you do not have the ability to GPU accelerate, responses may be slow. During startup the extension will verify the models you have configured in the VSCode settings pane for this extension; the extension does have some defaults:

Code Model - deepseek-coder:6.7b-base-q8_0

Chat Model - deepseek-coder:6.7b-instruct-q8_0

The models above require enough RAM to run correctly; you should have at least 12GB of RAM on your machine if you are running these models. If you don't have enough RAM, choose a smaller model, but be aware that it won't perform as well. Also see the information on model Quantization.
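As a sketch of a lower-memory setup, you could swap the default q8_0 models for q4_0 variants (roughly half the weight size for a 6.7B model), again assuming the same "Wingman.Ollama" keys shown in the FAQ example above:

// settings.json - hypothetical lower-RAM configuration using smaller quantizations
{
  "Wingman.Ollama.chatModel": "deepseek-coder:6.7b-instruct-q4_0",
  "Wingman.Ollama.codeModel": "deepseek-coder:6.7b-base-q4_0"
}

Pull the matching models first, e.g. ollama pull deepseek-coder:6.7b-base-q4_0.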

Release Notes

To see the latest release notes, check out our releases page.


If you like the extension, please leave a review! If you don't, open an issue and we'd be happy to assist!

Enjoy!

wingman-ai's People

Contributors

russellcanfield, harlenalvarez


wingman-ai's Issues

VueJS2 and Linux Bash files not running

We are having problems getting Wingman with Ollama integration working with .vue and .sh files. We are able to get it to work with other filetypes, like .js, .json, and .php to name a few. With the latter filetypes we get the greyed-out code suggestions and a loading circle in the bottom right-hand corner of the screen, but with the former we get no response at all. We are running it as a VSCode extension.

Enhance code completion on/off

Code completion currently supports being disabled and enabled. Instead of a disabled option, let’s convert it to always on or hotkey.

Feature - Code Completion Toggle

Add a toggle in the settings to allow users to turn code completion on/off, to control traffic and potential data leakage in situations where you don't want to send part of, or the entire, file over to OpenAI as context.

Support other Ollama models

Remove the restriction and allow other values for chatModel and codeModel inside "Wingman.Ollama".
In version v0.3.1 only specific models are allowed. This is very limiting.

Allow use of llama3 models.

Not able to run the extension by installing it manually.

I just want to make some UI changes to fit my preferences, so I tried to install the extension from the repo and ran it. Everything appeared to work, but none of the functions, like chat or code completion, were working. I have Ollama installed too. I am trying this on my MacBook M1. It would be of great help if you could list the steps for installing through this repo instead of the VS Code Marketplace.

Auto completion not working and needs some improvements

I'm using a computer with AMD Ryzen 5 5600G with integrated Radeon Graphics × 6, and 32GB RAM, on Linux Mint 21.
This is the code I'm using to try Wingman-AI:

def isPrime(n):
    """ Test if n is a prime number """
    if 
        
def hello(i):
    for j in range(i):
        print(f"{j} is a value") 

At the isPrime function, I've been trying to type either if or for to see if Wingman-AI generates a code completion.

From what I understand, while typing code, if I pause, Wingman-AI is supposed to suggest a code completion. In my case, it isn't working. Here's the output:

27/02/2024, 22:01:01 - [info] Ollama - Code Completion submitting request with body: {"model":"deepseek-coder:6.7b-base-q8_0","prompt":"<|fim▁begin|>\ndef isPrime(n):\n    \"\"\" Test if n is a prime number \"\"\"\n    for i <|fim▁hole|>\n\n        \n\n\ndef hello(i):\n    for j in range(i):\n        print(f\"{j} is a value\")        \n\n<|fim▁end|>","stream":false,"raw":true,"options":{"temperature":0.6,"num_predict":-1,"top_k":30,"top_p":0.2,"repeat_penalty":1.1,"stop":["<|end▁of▁sentence|>","<|EOT|>","\\n","</s>"]}}
27/02/2024, 22:01:01 - [error] Ollama - code completion request with model deepseek-coder:6.7b-base-q8_0 failed with the following error: AbortError: This operation was aborted
27/02/2024, 22:01:01 - [info] Ollama - Code Completion execution time: 0.298 seconds
27/02/2024, 22:01:02 - [info] Ollama - Code Completion submitting request with body: {"model":"deepseek-coder:6.7b-base-q8_0","prompt":"<|fim▁begin|>\ndef isPrime(n):\n    \"\"\" Test if n is a prime number \"\"\"\n    for i in<|fim▁hole|>\n\n        \n\n\ndef hello(i):\n    for j in range(i):\n        print(f\"{j} is a value\")        \n\n<|fim▁end|>","stream":false,"raw":true,"options":{"temperature":0.6,"num_predict":-1,"top_k":30,"top_p":0.2,"repeat_penalty":1.1,"stop":["<|end▁of▁sentence|>","<|EOT|>","\\n","</s>"]}}
27/02/2024, 22:01:02 - [error] Ollama - code completion request with model deepseek-coder:6.7b-base-q8_0 failed with the following error: AbortError: This operation was aborted
27/02/2024, 22:01:02 - [info] Ollama - Code Completion execution time: 0.143 seconds
27/02/2024, 22:01:02 - [info] Ollama - Code Completion submitting request with body: {"model":"deepseek-coder:6.7b-base-q8_0","prompt":"<|fim▁begin|>\ndef isPrime(n):\n    \"\"\" Test if n is a prime number \"\"\"\n    for i in <|fim▁hole|>\n\n        \n\n\ndef hello(i):\n    for j in range(i):\n        print(f\"{j} is a value\")        \n\n<|fim▁end|>","stream":false,"raw":true,"options":{"temperature":0.6,"num_predict":-1,"top_k":30,"top_p":0.2,"repeat_penalty":1.1,"stop":["<|end▁of▁sentence|>","<|EOT|>","\\n","</s>"]}}
27/02/2024, 22:01:21 - [error] Ollama - code completion request with model deepseek-coder:6.7b-base-q8_0 failed with the following error: AbortError: This operation was aborted
27/02/2024, 22:01:21 - [info] Ollama - Code Completion execution time: 18.563 seconds

I noticed "num_predict":-1 and checked Wingman config. Sure enough, Code max tokens is -1 by default. Made the value 100, closed vscode, opened it again and tried. This time:

27/02/2024, 22:16:51 - [info] Ollama - Code Completion submitting request with body: {"model":"deepseek-coder:6.7b-base-q8_0","prompt":"<|fim▁begin|>\ndef isPrime(n):\n    \"\"\" Test if n is a prime number \"\"\"\n    <|fim▁hole|>\n        \n\ndef hello(i):\n    for j in range(i):\n        print(f\"{j} is a value\")        \n\n<|fim▁end|>","stream":false,"raw":true,"options":{"temperature":0.6,"num_predict":100,"top_k":30,"top_p":0.2,"repeat_penalty":1.1,"stop":["<|end▁of▁sentence|>","<|EOT|>","\\n","</s>"]}}
27/02/2024, 22:16:51 - [error] Ollama - code completion request with model deepseek-coder:6.7b-base-q8_0 failed with the following error: AbortError: This operation was aborted
27/02/2024, 22:16:51 - [info] Ollama - Code Completion execution time: 0.112 seconds
27/02/2024, 22:16:52 - [info] Ollama - Code Completion submitting request with body: {"model":"deepseek-coder:6.7b-base-q8_0","prompt":"<|fim▁begin|>\ndef isPrime(n):\n    \"\"\" Test if n is a prime number \"\"\"\n    if <|fim▁hole|>\n        \n\ndef hello(i):\n    for j in range(i):\n        print(f\"{j} is a value\")        \n\n<|fim▁end|>","stream":false,"raw":true,"options":{"temperature":0.6,"num_predict":100,"top_k":30,"top_p":0.2,"repeat_penalty":1.1,"stop":["<|end▁of▁sentence|>","<|EOT|>","\\n","</s>"]}}
27/02/2024, 22:17:30 - [error] Ollama - code completion request with model deepseek-coder:6.7b-base-q8_0 failed with the following error: AbortError: This operation was aborted
27/02/2024, 22:17:30 - [info] Ollama - Code Completion execution time: 37.988 seconds

So what would I need to do to get code completions?

A few other humble suggestions:

Conservative use of CPU required:

To save on power, users could prefer to have Wingman-AI query Ollama for a code completion only when using a key combination, like perhaps Ctrl+I. If Wingman-AI is going to try generating completions every time the user pauses, it consumes a huge amount of CPU power even when the user doesn't want it to. This not only adds up in terms of the electricity bill, it also puts the user's CPU fan under constant duress/wear. I understand that some people would prefer not having to press a key combo, so this could be a setting users choose: either a pause or a key combo.

Readme update required:

It'd help to update the readme to show users how they can view the output logs, and to include a screenshot of the Wingman config tab to show how easy it is to change settings. Also, most users won't know the consequences of changing settings like the code context window, code max tokens, chat context window, etc. So it'd be nice to explain those, and also explain why there's a separate code model and a separate chat model.

Light color theme issue

Wingman AI generated code is barely visible in the sidebar when a light color theme is used in VS Code.
I have tried a few different light themes and they all look the same. Bleached.

Screenshot from 2024-04-25 15-29-25

Remove throw on invalid config

Currently Wingman is set up to throw when invalid configuration scenarios are detected. The original intent was to prevent the user going to Chat and finding it unresponsive.

Since we have evolved Wingman to have a config pane, we can no longer fail the extension; consider removing the throw and maybe enhancing the existing error state so the config screen still loads.

Also consider the state of the extension when it does fail to start up. We probably want to hide the quick fix/code action options - or, on use, have them emit an error dialog. The same goes for chat, or we could load a chat shell that indicates it's not loaded properly.

No response after prompting Wingman AI

I'm using VS Code version 1.85.1 on Linux Mint 21 Cinnamon. I've installed Ollama using curl -fsSL https://ollama.com/install.sh | sh, downloaded this model with ollama pull deepseek-coder:6.7b-base-q8_0, and verified it's working on http://localhost:11434/. I have 32GB RAM. http://localhost:11434/api/show and http://localhost:11434/api/generate show a 404 page not found error.
In the Wingman AI extension pane where I can type a prompt, when I type one the loading icon just keeps circling forever. No code is generated. On the system monitor, I can see that the CPU isn't being used. I don't have a separate graphics card, and since Ollama detected that there's no NVIDIA graphics card, it printed a message that it'd run in CPU mode.
So is this merely an issue with the fact that there's no NVIDIA graphics card, or could the Ollama API paths have changed? I don't want to use the HuggingFace API or OpenAI API, so I won't be trying those.

Update: I also pulled deepseek-coder:6.7b-instruct-q8_0. Still the same problem.

[FEATURE REQUEST] Code Completion Tokens Limit

Hi,

I am using Wingman AI for VSCode on my 2018 MacBook Pro. Since it runs deepseek-coder on the CPU, it takes a while to generate the code. For autocompletion, after roughly a 1-second pause, I want to see the generated text.

According to the codebase, the current token limit is set to 1024, which is quite high. I want to request that it be user-adjustable; for me, 30-50 tokens would be the best limit, and they would get generated in about 2 seconds as well. Currently, it takes over 40 seconds to see the autocompletion. That is not practical at all.

Can you please implement it? Thank you!

Secondly, can we make code contributions?

Add ability to exclude context

Wingman AI by default uses the current file as context for the request to the LLM.

It would be nice to be able to send a general request/question without any context because:

  1. Context slows down the response from the LLM for general questions.
  2. If you do not want any context, you are forced to select blank line(s) in the text editor.

Refactor - enhance context

Refactor is currently in development but offers sub-par performance on complex refactors.

Two issues contribute: one is a lack of context understanding; the other is potentially brittle markdown parsing - consider a prompt tweak here.
