Home Page: https://koboldai.com

License: GNU Affero General Public License v3.0

KoboldAI - Your gateway to GPT writing

This is a browser-based front-end for AI-assisted writing with multiple local & remote AI models. It offers the standard array of tools, including Memory, Author's Note, World Info, Save & Load, adjustable AI settings, formatting options, and the ability to import existing AI Dungeon adventures. You can also turn on Adventure mode and play the game like AI Dungeon Unleashed.

Multiple ways to play

Stories can be played like a novel or a text adventure game, or used as a chatbot, with easy toggles to switch between the gameplay styles. This makes KoboldAI a writing assistant, a game and a platform for much more. The way you play and how good the AI will be depends on the model or service you decide to use. Whether you want to use the free, fast power of Google Colab, your own high-end graphics card, an online service you have an API key for (like OpenAI or InferKit), or would rather run it more slowly on your CPU, you will be able to find a way to use KoboldAI that works for you.

Adventure mode

By default KoboldAI will run in a generic mode optimized for writing, but with the right model you can play this like AI Dungeon without any issues. You can enable this in the settings and bring your own prompt, try generating a random prompt or download one of the prompts available at /aids/ Prompts.

The gameplay will be slightly different from the gameplay in AI Dungeon because we adopted the style of the Unleashed fork, which gives you full control over all the characters: we do not automatically adapt your sentences behind the scenes. This means you can more reliably control characters that are not you.

As a result, what you need to type is slightly different. In AI Dungeon you would type "take the sword", while in KoboldAI you would type it as a sentence, such as "You take the sword". This is best done with the word You instead of I.

To speak simply type : You say "We should probably gather some supplies first"
Just typing the quote might work, but the AI is at its best when you specify who does what in your commands.
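
The phrasing rule above can be sketched as a small helper. This is only an illustration: the function name and its behavior are hypothetical and not part of KoboldAI, which sends whatever you type to the model as-is.

```python
def to_adventure_action(command: str, speech: bool = False) -> str:
    """Turn a short command into the second-person sentence form described above.

    Hypothetical helper for illustration only; KoboldAI does not rewrite input.
    """
    text = command.strip()
    if speech:
        # Spoken lines name the speaker explicitly: You say "..."
        return f'You say "{text}"'
    if text.lower().startswith("you "):
        # The input already names the actor, so leave it alone.
        return text
    return f"You {text}"


print(to_adventure_action("take the sword"))
print(to_adventure_action("We should probably gather some supplies first", speech=True))
```

Naming the actor in every action is the point: the model is not asked to guess who is doing what.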

If you want to do this with your friends, we advise using the main character as You and referring to the other characters by name if you are playing on a model trained for Adventures. These models assume there is a You in the story. This mode usually does not perform well on Novel models because they do not know how to handle the input; those are best used with regular story writing where you take turns with the AI.

Writing assistant

If you want to use KoboldAI as a writing assistant, this is best done in the regular mode with a model optimized for Novels. These models do not assume that there is a You character and focus on Novel-like writing. For writing, these will often give you better results than Adventure or Generic models. That said, if you give it a good introduction to the story, a large generic model like 13B can be used if a more specific model is not available for what you wish to write. You can also try models that are not specific to what you wish to do, for example an NSFW Novel model for a SFW story if a SFW model is unavailable. You will have to correct the model more often because of its bias, but it can still produce good enough results if it is familiar enough with your topic.

Chatbot Mode

In chatbot mode you can use a suitable model as a chatbot. This mode automatically adds your name to the beginning of your sentences and prevents the AI from talking as you. To use it properly you must write your story opening as both characters in the following format (you can use your own text):

Bot : Hey!
You : Hey Botname, how have you been?
Bot : Been good! How about you?
You : Been great too, excited to try out KoboldAI
Bot : KoboldAI is really fun!
You : For sure! What is your favorite game?

It's recommended to have your own input be the last input. Especially in the beginning, it's possible that the AI mixes up the names. In that case either retry or manually correct the name. This behavior improves as the chat progresses. Some models may swap names if they are more familiar with a different name that is similar to the name you defined for the bot. In that case you can either do the occasional manual correction or choose a name for your chatbot that the AI likes better.
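
A minimal sketch of building such an opening from a list of turns is shown below. The function names and the "bot"/"user" labels are assumptions made for this example; only the "Name : text" line format and the advice to end on your own input come from the description above.

```python
def format_chat_opening(turns, bot_name="Bot", user_name="You"):
    """Render (speaker, text) pairs in the "Name : text" format shown above.

    speaker is "bot" or "user"; these labels are an assumption of this sketch.
    """
    lines = []
    for speaker, text in turns:
        name = bot_name if speaker == "bot" else user_name
        lines.append(f"{name} : {text}")
    return "\n".join(lines)


def ends_with_user(turns):
    """True when the last turn is the user's, as recommended above."""
    return bool(turns) and turns[-1][0] == "user"


opening = [
    ("bot", "Hey!"),
    ("user", "Hey, how have you been?"),
    ("bot", "Been good! How about you?"),
    ("user", "Great, what is your favorite game?"),
]
print(format_chat_opening(opening, bot_name="Kobold"))
```

Checking `ends_with_user(opening)` before pasting the text into the story box is a cheap way to follow the recommendation about the last input.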

This mode works best on either a Generic model or a chatbot model specifically designed for it. Some models, like the AvrilAI model, are instead designed to be used in Adventure mode and do not conform to the format above. These models typically ship with Adventure mode enabled by default and should not be switched over to chatbot mode.

Novel or Adventure models are not recommended for this feature. They might still work, but can quickly derail from the conversation format.

Play KoboldAI online for free on Google Colab (The easiest way to play)

If you would like to play KoboldAI online for free on a powerful computer you can use Google Colaboratory. We provide two editions, a TPU and a GPU edition, with a variety of models available. These run entirely on Google's servers and will automatically upload saves to your Google Drive if you choose to save a story (alternatively, you can choose to download your save instead so that it never gets stored on Google Drive). Detailed instructions on how to use them are at the bottom of the Colabs.

Each edition features different models and requires different hardware to run. This means that if you are unable to obtain a TPU or a GPU you might still be able to use the other version. The models you can use are listed underneath the edition. To open a Colab click the big link featuring the edition's name.

TPU Edition

| Model | Style | Description |
|---|---|---|
| Nerys by Mr Seeker | Novel/Adventure | Nerys is a hybrid model based on Pike (a newer Janeway). On top of the Pike dataset you also get some Light Novels, Adventure mode support and a little bit of Shinen thrown into the mix. The end result is a very diverse model that is heavily biased towards SFW novel writing, but one that can go beyond its novel training and make for an excellent adventure model too. Adventure mode is best played from a second person perspective, but can be played in first or third person as well. Novel writing is best done from the first or third person. |
| Erebus by Mr Seeker | NSFW | Erebus is our community's flagship NSFW model. It combines multiple large datasets, including Literotica, Shinen and erotic novels from Nerys, and features thorough tagging support, so it covers the vast majority of erotic writing styles. This model is capable of replacing both the Lit and Shinen models in terms of content and style and has been well received as (one of) the best NSFW models out there. If you wish to use this model for commercial or non research usage we recommend choosing the 20B version as that one is not subject to the restrictive OPT license. |
| Janeway by Mr Seeker | Novel | Janeway is a model created from Picard's dataset combined with a brand new collection of ebooks. This model is trained on 20% more content than Picard and has been trained on literature from various genres. Although the model is mainly focused on SFW, romantic scenes might involve a degree of nudity. |
| Shinen by Mr Seeker | NSFW | Shinen is an NSFW model trained on a variety of stories from the website Sexstories; it contains many different kinks. It has been merged into the larger (and better) Erebus model. |
| Skein by VE_FORBRYDERNE | Adventure | Skein is best used with Adventure mode enabled. It consists of a 4 times larger adventure dataset than the Adventure model, making it excellent for text adventure gaming. On top of that it also includes light novel training, further expanding its knowledge and writing capabilities. It can be used with the You filter bias if you wish to write Novels with it, but dedicated Novel models can perform better for this task. |
| Adventure by VE_FORBRYDERNE | Adventure | Adventure is a 6B model designed to mimic the behavior of AI Dungeon. It is exclusively for Adventure Mode and can take you on the epic and wacky adventures that AI Dungeon players love. It also features the many tropes of AI Dungeon as it has been trained on very similar data. It must be used in second person (You). |
| Lit (V2) by Haru | NSFW | Lit is a great NSFW model trained by Haru on both a large set of Literotica stories and high quality novels, along with tagging support, creating a high quality model for your NSFW stories. This model is exclusively a novel model and is best used in third person. |
| OPT by Metaseq | Generic | OPT is considered one of the best base models as far as content goes; its behavior has the strengths of both GPT-Neo and Fairseq Dense. Compared to Neo, duplicate and unnecessary content has been left out, while additional literature was added in, similar to the Fairseq Dense model. The Fairseq Dense model however lacks the broader data that OPT does have. The biggest downside of OPT is its license, which prohibits any commercial usage, or usage beyond research purposes. |
| Neo(X) by EleutherAI | Generic | NeoX is the largest EleutherAI model currently available. Being a generic model, it is not particularly trained towards anything and can do a variety of writing, Q&A and coding tasks. 20B's performance is close to that of the 13B models, and it is worth trying both, especially if you have a task that does not involve English writing. Its behavior will be similar to the GPT-J-6B model since they are trained on the same dataset, but with more sensitivity towards repetition penalty and with more knowledge. |
| Fairseq Dense | Generic | Trained by Facebook Researchers, this model stems from the MOE research project within Fairseq. This particular version has been converted by us for use in KoboldAI. It is known to be on par with the larger 20B model from EleutherAI and is considered better for pop culture and language tasks. Because the model has never seen a new line (enter) it may perform worse on formatting and paragraphing. Compared to other models the dataset focuses primarily on literature and contains little else. |
| GPT-J-6B by EleutherAI | Generic | This model serves as the basis for most other 6B models (some being based on Fairseq Dense instead). Being trained on the Pile and not biased towards anything in particular, it is suitable for a variety of tasks such as writing, Q&A and coding tasks. You will likely get better results with larger generic models or finetuned models. |

GPU Edition

| Model | Style | Description |
|---|---|---|
| Nerys by Mr Seeker | Novel/Adventure | Nerys is a hybrid model based on Pike (a newer Janeway). On top of the Pike dataset you also get some Light Novels, Adventure mode support and a little bit of Shinen thrown into the mix. The end result is a very diverse model that is heavily biased towards SFW novel writing, but one that can go beyond its novel training and make for an excellent adventure model too. Adventure mode is best played from a second person perspective, but can be played in first or third person as well. Novel writing is best done from the first or third person. |
| Tiefighter 13B by KoboldAI | Hybrid | Tiefighter 13B is a very versatile fiction Hybrid. It can write, chat and play adventure games and can also answer regular instructions (although we do not recommend this model for factual use due to its fictional nature). This is an excellent starting model. For the best results avoid using second person writing in your chats unless you want it to become a text adventure. |
| Janeway by Mr Seeker | Novel | Janeway is a model created from Picard's dataset combined with a brand new collection of ebooks. This model is trained on 20% more content than Picard and has been trained on literature from various genres. Although the model is mainly focused on SFW, romantic scenes might involve a degree of nudity. |
| Picard by Mr Seeker | Novel | Picard is a model trained for SFW Novels based on Neo 2.7B. It is focused on Novel style writing without the NSFW bias. While the name suggests a sci-fi model, this model is designed for Novels of a variety of genres. It is meant to be used in KoboldAI's regular mode. |
| AID by melastacho | Adventure | Also known as Adventure 2.7B, this is a clone of the AI Dungeon Classic model and is best known for the epic wacky adventures that AI Dungeon Classic players love. |
| OPT by Metaseq | Generic | OPT is considered one of the best base models as far as content goes; its behavior has the strengths of both GPT-Neo and Fairseq Dense. Compared to Neo, duplicate and unnecessary content has been left out, while additional literature was added in, similar to the Fairseq Dense model. The Fairseq Dense model however lacks the broader data that OPT does have. The biggest downside of OPT is its license, which prohibits any commercial usage, or usage beyond research purposes. |
| Fairseq Dense | Generic | Trained by Facebook Researchers, this model stems from the MOE research project within Fairseq. This particular version has been converted by us for use in KoboldAI. It is known to be on par with the larger models from EleutherAI and is considered better for pop culture and language tasks. Because the model has never seen a new line (enter) it may perform worse on formatting and paragraphing. Compared to other models the dataset focuses primarily on literature and contains little else. |
| MythoMax 13B by Gryphe | Roleplay | An improved, potentially even perfected variant of MythoMix, Gryphe's MythoLogic-L2 and Huginn merge, using a highly experimental tensor type merge technique. |
| Holomax 13B by KoboldAI | Adventure | This is an expansion merge to the well-praised MythoMax model from Gryphe (60%) using MrSeeker's KoboldAI Holodeck model (40%). The goal of this model is to enhance story-writing capabilities while preserving the desirable traits of the MythoMax model as much as possible (it does limit chat reply length). |
| Airoboros 13B by Jon Durbin | Generic | This is an instruction fine-tuned llama-2 model, using synthetic instructions generated by airoboros. |
| Emerhyst 13B by Undi | Roleplay | An attempt using BlockMerge_Gradient to get better results. In addition, LimaRP v3 was used. |
| Chronos 13B by Elinas | Generic | This model is primarily focused on chat, roleplay, and storywriting, but can accomplish other tasks such as simple reasoning and coding. Chronos generates very long outputs with coherent text, largely due to the human inputs it was trained on. |
| Spring Dragon by Henk717 | Adventure | This model is a recreation attempt of the AI Dungeon 2 Dragon model. To achieve this, the "text_adventures.txt" dataset was used, which was bundled with the original AI Dungeon 2 GitHub release prior to the online service. It is worth noting that the same dataset file was used to create the Dragon model, where Dragon is a GPT-3 175B Davinci model from 2020. |
| Holodeck by KoboldAI | Adventure | LLAMA2 13B-Holodeck is a finetune created using Meta's llama 2 model. The training data contains around 3000 ebooks in various genres. Most parts of the dataset have been prepended using the following text: [Genre: , |
| Neo by EleutherAI | Generic | This is the base model for all the other 2.7B models. It is best used when you have a use case that we have no other models available for, such as writing blog articles or programming. It can also be a good basis for the experience of some of the softprompts if your softprompt is not about a subject the other models cover. |
| Various 2.7b models by various | | Various smaller models are also possible to load in GPU colab. |

Styles

| Style | Description |
|---|---|
| Novel | For regular story writing, not compatible with Adventure mode or other specialty modes. |
| NSFW | Indicates that the model is strongly biased towards NSFW content and is not suitable for children, work environments or livestreaming. Most NSFW models are also Novel models in nature. |
| Adventure | These models are excellent for people willing to play KoboldAI like a Text Adventure game and are meant to be used with Adventure mode enabled. Even if you wish to use one as a Novel style model you should always have Adventure mode on and set it to story. These models typically have a strong bias towards the use of the word You, and without Adventure mode enabled they break the story flow and write actions on your behalf. |
| Hybrid | Hybrid models are a blend between different styles, for example they are trained on both Novel stories and Adventure stories. These models are great variety models that you can use for multiple different playstyles and modes, but depending on your usage you may need to enable Adventure Mode or the You bias (in userscripts). |
| Generic | Generic models are not trained towards anything specific and are typically used as a basis for other tasks and models. They can do everything the other models can do, but require much more handholding to work properly. Generic models are an ideal basis for tasks that we have no specific model for, or for experiencing a softprompt in its raw form. |

Tips to get the most out of Google Colab

  • Google will occasionally show a Captcha, typically after Colab has been open for 30 minutes, but it can be more frequent if you often use Colab. Make sure to complete these properly, or you risk getting your instance shut down and getting a lower priority towards the TPUs.
  • KoboldAI uses Google Drive to store your files and settings. If you wish to upload a softprompt or userscript this can be done directly on the Google Drive website. You can also use this to download backups of your KoboldAI related files or upload models of your own.
  • Don't want to save your stories on Google Drive for privacy reasons? Do not use KoboldAI's save function and instead click Download as .json. This will automatically download the story to your own computer without ever touching Google's hard drives. You can load this back through the Load from file option.
  • Google shut your instance down unexpectedly? You can still make use of the Download as .json button to recover your story as long as you did not close the KoboldAI window. You can then load this back up in your next session.
  • Done with KoboldAI? Go to the Runtime menu, click on Manage Sessions and terminate your open sessions that you no longer need. This trick can help you maintain higher priority towards getting a TPU.
  • Models stored on Google Drive typically load faster than models we need to download from the internet.

Install KoboldAI on your own computer

KoboldAI has a large number of dependencies you will need to install on your computer. Unfortunately, Python does not make it easy for us to provide instructions that work for everyone. The instructions below will work on most computers, but if you have multiple versions of Python installed conflicts can occur.

Downloading the latest version of KoboldAI

KoboldAI is a rolling release on our GitHub; the code you see is also the game. You can download the software by clicking the green Code button at the top of the page and clicking Download ZIP, or use the git clone command instead. On Windows, you then need to run install_requirements.bat (running it as administrator is recommended to avoid errors). Once that is done, or if you are on Linux, start KoboldAI with either play.bat/sh or remote-play.bat/sh.

The easiest way for Windows users is to use the offline installer below.

Installing KoboldAI offline bundle on Windows 7 or higher using the KoboldAI Offline Installer (Easiest)

  1. Download the latest offline installer from here
  2. Run the installer to place KoboldAI in a location of your choice. KoboldAI is portable software and is not bound to a specific hard drive. (Because of long paths inside our dependencies you may not be able to extract it many folders deep.)
  3. Update KoboldAI to the latest version with update-koboldai.bat if desired.
  4. Use KoboldAI offline using play.bat or remotely with remote-play.bat

Installing KoboldAI Github release on Windows 10 or higher using the KoboldAI Runtime Installer

  1. Extract the .zip to the location where you wish to install KoboldAI. You will need roughly 20GB of free space for the installation (this does not include the models).
  2. Open install_requirements.bat as administrator.
  3. Choose the regular version of Transformers (Option 1); finetuneanon is deprecated and no longer recommended.
  4. You will now be asked to choose the installation mode. We strongly recommend the Temporary B: drive option. This option eliminates most installation issues and also makes KoboldAI portable. The B: drive will be gone after a reboot and will automatically be recreated each time you play KoboldAI.
  5. The installation will now automatically install its requirements. Some stages may appear to freeze; do not close the installer until it asks you to press a key. Before pressing a key to exit the installer, please check whether errors occurred. Most problems with the game crashing are related to installation/download errors. Disabling your antivirus can help if you get errors.
  6. Use play.bat to start KoboldAI.

Installing KoboldAI on Linux using the KoboldAI Runtime (Easiest)

  1. Clone the URL of this Github repository (For example git clone https://github.com/koboldai/koboldai-client )
  2. AMD user? Make sure ROCm is installed if you want GPU support. Is yours not compatible with ROCm? Follow the usual instructions.
  3. Run play.sh or if your AMD GPU supports ROCm use play-rocm.sh

KoboldAI will now automatically configure its dependencies and start up. Everything is contained in its own conda runtime so we will not clutter your system. The files will be located in the runtime subfolder. If at any point you wish to force a reinstallation of the runtime you can do so with the install_requirements.sh file. While you can run this manually, it is not necessary.

Manual installation / Mac

We cannot provide a step by step guide for manual installation due to the vast differences between the existing software configurations and systems of our users.

If you would like to manually install KoboldAI you will need some Python/conda package management knowledge to carry out the following steps yourself (steps 1 and 2 are alternatives):

  1. Use our bundled environment files to install your own conda environment; this should also automatically install CUDA (Recommended, you can get Miniconda from https://docs.conda.io/en/latest/miniconda.html#latest-miniconda-installer-links). The recommended configuration is huggingface.yml for CUDA users and rocm.yml for ROCm users.
  2. If conda is proving difficult you could also look inside requirements.txt for the required dependencies and try to install them yourself. This will likely be a mixture of pip and your native package manager. Just installing our requirements.txt is not recommended since we assume local users will run conda to get all dependencies. For local installations definitely prioritize conda, as that is a better way for us to ensure that you have compatible versions.
  3. Clone our Github or download the zip file.
  4. Now start KoboldAI with aiserver.py and not with our play.bat or play.sh files.
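
As a rough sketch of the conda route, the helper below only builds the two commands involved and does not run anything. The `environments` folder name, the helper name and the "cuda"/"rocm" labels are assumptions of this example; `conda env create -f` is the standard conda command for creating an environment from a file, and the second command is meant to be run from inside that environment once it is activated.

```python
# Environment files named in the steps above, keyed by a label chosen for this sketch.
ENV_FILES = {"cuda": "huggingface.yml", "rocm": "rocm.yml"}


def manual_install_commands(backend: str, env_dir: str = "environments"):
    """Return the commands for a manual conda install as argument lists.

    env_dir is an assumed location for the bundled environment files.
    """
    if backend not in ENV_FILES:
        raise ValueError(f"unknown backend {backend!r}, expected one of {sorted(ENV_FILES)}")
    env_file = f"{env_dir}/{ENV_FILES[backend]}"
    return [
        # Create the conda environment from the bundled file.
        ["conda", "env", "create", "-f", env_file],
        # Start KoboldAI directly (not via play.bat / play.sh), inside the activated environment.
        ["python", "aiserver.py"],
    ]


for command in manual_install_commands("cuda"):
    print(" ".join(command))
```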

AMD GPUs (Linux only)

AMD GPUs have terrible compute support. This will currently not work on Windows and will only work for a select few GPUs on Linux. You can find a list of the compatible GPUs here. Any GPU that is not listed is guaranteed not to work with KoboldAI, and we will not be able to provide proper support on GPUs that are not compatible with the versions of ROCm we require. Make sure to first install ROCm on your Linux system using a guide for your distribution; after that you can follow the usual Linux instructions above.

Troubleshooting

There are multiple things that can go wrong with the way Python handles its dependencies, unfortunately we do not have direct step by step solutions for every scenario but there are a few common solutions you can try.

ModuleNotFoundError

This is ALWAYS either a download/installation failure or a conflict with other versions of Python. It is very common if you chose the subfolder option during the installation while putting KoboldAI in a location that has spaces in the path. Other common causes are an antivirus that sandboxes the installation or otherwise interferes with the downloads, low disk space, or an operating system that was not configured for Long File Paths (the installer will do this on Windows 10 and higher if you run it as administrator; anything other than Windows 10 is not supported by our installers).

Another reason the installation may have failed is conflicting installations of Python on your machine. If you press the Windows Key + R and enter %appdata% in the Run dialog, it will open the folder where Python installs dependencies on some systems. If you have a Python folder in this location, rename this folder and try to run the installer again. It should now no longer get stuck on existing dependencies. Try the game and see if it works well. If it does, you can try renaming the folder back to see if it remains functional.

The third reason the installation may have failed is that you have conda/mamba on your system for other reasons. In that case we recommend either removing your existing installations of python/conda if you do not need them and testing our installer again, or using conda itself with our bundled environment files to create the runtime manually. Keep in mind that if you go the manual route you should NEVER use play.bat but should instead run aiserver.py directly.

In general, the fewer versions of Python you have on your system, the higher your chances of it installing correctly. We are consistently trying to mitigate these installation conflicts in our installers, but for some users we cannot yet avoid all conflicts.
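
Two of the causes above, spaces in the path and low disk space, are easy to check before running the installer. The sketch below is not part of KoboldAI; the function names are made up for this example, and the 20 GB threshold is taken from the Windows installation steps (models not included).

```python
import shutil

REQUIRED_FREE_GB = 20  # rough figure from the installation steps, excluding models


def path_has_spaces(install_dir: str) -> bool:
    """True if the chosen install location contains a space anywhere in its path."""
    return " " in str(install_dir)


def free_space_gb(install_dir: str) -> float:
    """Free space on the drive that holds install_dir, in GiB (the folder must exist)."""
    return shutil.disk_usage(install_dir).free / 1024 ** 3


def preflight_warnings(install_dir: str, required_gb: float = REQUIRED_FREE_GB):
    """Collect human-readable warnings for the two checks above."""
    warnings = []
    if path_has_spaces(install_dir):
        warnings.append("install path contains spaces")
    if free_space_gb(install_dir) < required_gb:
        warnings.append(f"less than {required_gb} GB free on this drive")
    return warnings


print(preflight_warnings("."))
```

An empty list does not guarantee a clean install; antivirus interference and Python conflicts still have to be ruled out by hand.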

GPU not found errors

GPU not found errors can be caused by one of three things: you do not have a suitable Nvidia GPU (it needs Compute Capability 5.0 or higher to be able to play KoboldAI), your Nvidia GPU is supported by KoboldAI but is not supported by the latest version of CUDA, or you have a dependency conflict like the ones mentioned above.

Like with Python version conflicts we recommend uninstalling CUDA from your system if you have manually installed it and do not need it for anything else and trying again. If your GPU needs CUDA10 to function open environments\finetuneanon.yml and add a line that says - cudatoolkit=10.2 underneath dependencies: . After this you can run the installer again (Pick the option to delete the existing files) and it will download a CUDA10 compatible version.
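
The edit to the environment file can also be scripted. The sketch below works on the file's text and assumes the usual conda layout where each dependency is a two-space indented "- package" entry under `dependencies:`; the sample file contents are invented for the example and are not the real finetuneanon.yml.

```python
def add_cudatoolkit_pin(yaml_text: str, pin: str = "cudatoolkit=10.2") -> str:
    """Insert "- <pin>" directly underneath the "dependencies:" line.

    Returns the text unchanged if the pin is already present.
    Assumes two-space indentation for list entries.
    """
    lines = yaml_text.splitlines()
    if any(line.strip() == f"- {pin}" for line in lines):
        return yaml_text
    out = []
    inserted = False
    for line in lines:
        out.append(line)
        if not inserted and line.strip() == "dependencies:":
            out.append(f"  - {pin}")
            inserted = True
    if not inserted:
        raise ValueError("no 'dependencies:' line found")
    return "\n".join(out) + "\n"


# Invented sample contents, only to show the effect of the edit.
sample = "name: example\nchannels:\n  - conda-forge\ndependencies:\n  - python\n"
print(add_cudatoolkit_pin(sample))
```

Editing the file by hand in a text editor is just as good; the point is only that the new line goes immediately below `dependencies:`.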

If you do not have a suitable Nvidia GPU that can run on CUDA10 or higher and that supports Compute Capability 5.0 or higher, we cannot help you get the game detected on the GPU, unless you are following our ROCm guide with a compatible AMD GPU.
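
If PyTorch is already installed, the Compute Capability requirement can be checked with a few lines. This is a sketch, not something KoboldAI ships: the helper names are made up, and `detect_capability` simply returns None when PyTorch is missing or no CUDA device is visible.

```python
MIN_CAPABILITY = (5, 0)  # minimum Compute Capability stated above


def meets_minimum(capability, minimum=MIN_CAPABILITY) -> bool:
    """Compare a (major, minor) capability tuple against the minimum."""
    return tuple(capability) >= tuple(minimum)


def detect_capability():
    """Return the (major, minor) capability of the first CUDA device, or None."""
    try:
        import torch  # optional: only needed for detection
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


capability = detect_capability()
if capability is None:
    print("No CUDA device detected (or PyTorch is not installed).")
else:
    print(capability, "supported" if meets_minimum(capability) else "too old")
```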

vocab.json / config.json is not found error

If you get these errors you either did not select the correct folder for your custom model or the model you have downloaded is not (yet) compatible with KoboldAI. There exist a few models out there that are compatible and provide a pytorch_model.bin file but do not ship all the required files. In this case try downloading a compatible model of the same kind (For example another GPT-Neo if you downloaded a GPT-Neo model) and replace the pytorch_model.bin file with the one you are trying to run. Chances are this will work fine.
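
A quick way to see what a model folder is missing is sketched below. The list of file names contains only the three files mentioned in this section and is not a complete list of what every model needs; the function name is made up for the example.

```python
from pathlib import Path

# Only the files named in this section; real models may need more.
EXPECTED_FILES = ("config.json", "vocab.json", "pytorch_model.bin")


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in model_dir."""
    folder = Path(model_dir)
    return [name for name in expected if not (folder / name).is_file()]


print(missing_model_files("."))
```

If only pytorch_model.bin is present, that matches the case described above where the remaining files can be taken from a compatible model of the same kind.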

Softprompts

Softprompts (also known as Modules in other products) are addons that can change the output of existing models. For example you may load a softprompt that biases the AI towards a certain subject and style like transcripts from your favorite TV show.

Since these softprompts are often based on existing franchises we currently do not bundle any of them with KoboldAI due to copyright concerns (We do not want to put the entire project at risk). Instead look at community resources like #softprompts on the KoboldAI Discord or the community hosted mirror.

That way we are better protected from any DMCA claims as things can be taken down easier than directly on Github. If you have a copyright free softprompt that you made from scratch and is not based on existing IP that you would like to see officially bundled with KoboldAI issue a pull request with your softprompt.

Training softprompts can be done for free with the Easy Softprompt Tuner; in that case you can leave most of the settings at their defaults. Your source data needs to be a folder of text files that are UTF-8 encoded and use Unix line endings.
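
If your text files came from Windows tools, a small script can bring them into that shape. The sketch below is independent of the tuner itself: it rewrites every .txt file in a folder as UTF-8 with Unix line endings, and it assumes the files are already UTF-8 (with or without a byte order mark) unless you pass a different source encoding.

```python
from pathlib import Path


def normalize_text_file(path, source_encoding="utf-8-sig"):
    """Rewrite one file as UTF-8 (no BOM) with Unix line endings."""
    raw = Path(path).read_bytes()
    text = raw.decode(source_encoding)
    # Convert Windows (CRLF) and old Mac (CR) line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    Path(path).write_bytes(text.encode("utf-8"))


def normalize_folder(folder, source_encoding="utf-8-sig"):
    """Normalize every .txt file in folder and return the names that changed."""
    changed = []
    for path in sorted(Path(folder).glob("*.txt")):
        before = path.read_bytes()
        normalize_text_file(path, source_encoding)
        if path.read_bytes() != before:
            changed.append(path.name)
    return changed
```

Calling `normalize_folder("my_dataset")` on a copy of your data first is the safer choice, since the files are rewritten in place.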

Userscripts

Userscripts are scripts that can automate tasks in KoboldAI, or modify the AI behavior / input / output.
Scripting is done in Lua 5.4 (Lua does not need to be separately installed as long as you have all the Python requirements) and has sandboxing to help protect you from malicious behavior. Even with these measures in place we strongly advise you to only run userscripts from places you trust and/or understand; otherwise consult the community for advice on how safe the script might be.

Inside the userscripts folder you will find our kaipreset scripts. These are default scripts that we think will be useful for our users. These scripts are automatically overwritten when you update KoboldAI, so if you wish to modify them make sure to first rename them to something that does not contain kaipreset so your changes are not lost. The scripts range from a You Bias filter that prevents the AI from addressing characters as you, to ways of preventing the AI from using certain words, word replacements and more.
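
Making a personal copy under a new name can be done in any file manager; as a scripted sketch it looks like the following. The helper name, the "custom" prefix and the example file name are all hypothetical, and the copy is made next to the original so the preset itself stays untouched.

```python
import shutil
from pathlib import Path


def make_personal_copy(script_path, prefix="custom"):
    """Copy a kaipreset script to a name without "kaipreset" and return the new path."""
    source = Path(script_path)
    if "kaipreset" not in source.name:
        raise ValueError(f"{source.name} is not a kaipreset script")
    # Replace the marker in the file name so updates no longer overwrite the copy.
    target = source.with_name(source.name.replace("kaipreset", prefix))
    shutil.copy2(source, target)
    return target


# Example with a hypothetical file name:
# make_personal_copy("userscripts/kaipreset_example.lua")
```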

Along with our preset scripts we also ship examples in the examples folder that merely serve as a demonstration and do not enhance your usage of KoboldAI. To use these scripts make sure to move them out of the examples folder before either using or modifying the script.

Lastly, all the features of our userscript API are documented in the API Documentation files inside the userscripts folder.

For our TPU versions, keep in mind that scripts modifying AI behavior rely on a different way of processing that is slower than if you leave these userscripts disabled, even if your script only sporadically uses this modifier. If you want to partially use a script at its full speed, then you can enable "No Gen Modifiers" to ensure that the parts that would make the TPU slow are not active.

API

KoboldAI has a REST API that can be accessed by adding /api to the URL that Kobold provides you (For example http://127.0.0.1:5000/api).
When accessing this link in a browser you will be taken to the interactive documentation.
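
A minimal sketch of addressing the API from Python follows. Only the /api suffix comes from the text above; the /v1/generate path and the prompt and max_length fields are assumptions of this example and should be checked against the interactive documentation before use. The code only builds the request and does not send it.

```python
import json
from urllib.request import Request


def api_docs_url(base_url: str) -> str:
    """Append /api to the URL that KoboldAI gave you."""
    return base_url.rstrip("/") + "/api"


def build_generate_request(base_url: str, prompt: str, max_length: int = 80) -> Request:
    """Build (but do not send) a JSON POST request.

    ASSUMPTION: the endpoint path and field names below are illustrative;
    verify them in the interactive documentation.
    """
    payload = json.dumps({"prompt": prompt, "max_length": max_length}).encode("utf-8")
    return Request(
        api_docs_url(base_url) + "/v1/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_request("http://127.0.0.1:5000", "You take the sword")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to a running instance.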

Contributors

This project contains work from the following contributors :

  • The Gantian - Creator of KoboldAI, has created most features such as the interface, the different AI model / API integrations and in general the largest part of the project.
  • VE FORBRYDERNE - Contributed many features such as the Editing overhaul, Adventure Mode, expansions to the world info section, breakmodel integration, scripting support, API, softprompts and much more, as well as vastly improving the TPU compatibility and integrating external code into KoboldAI so we could use official versions of Transformers with virtually no downsides.
  • Henk717 - Contributed the installation scripts, this readme, random story generator, the docker scripts, the foundation for the commandline interface and other smaller changes as well as integrating multiple parts of the code of different forks to unite it all. He also optimized the model loading so that downloaded models get converted to efficient offline models and that in future models are more likely to work out of the box. Not all code Github attributes to Henk717 is by Henk717 as some of it has been integrations of other people's work. We try to clarify this in the contributors list as much as we can.
  • Ebolam - Automatic Saving, back/redo, pinning, web loading of models
  • one-some - Logits Viewer and Token Streaming
  • db0 - KoboldAI Horde
  • Frogging101 - top_k / tfs support (Part of this support was later redone by VE to integrate what was originally inside of finetuneanon's transformers)
  • UWUplus (Ralf) - Contributed storage systems for community colabs, as well as cleaning up and integrating the website dependencies/code better. He is also the maintainer of flask-cloudflared which we use to generate the cloudflare links.
  • Javalar - Initial Performance increases on the story_refresh
  • LexSong - Initial environment file adaptation for conda that served as a basis for the install_requirements.bat overhaul.
  • Arrmansa - Breakmodel support for other projects that served as a basis for VE FORBRYDERNE's integration.
  • Jojorne - Small improvements to the response selection for gens per action.
  • OccultSage (GooseAI) - Improved support for GooseAI/OpenAI

As well as various Model creators who will be listed near their models, and all the testers who helped make this possible!

Did we miss your contribution? Feel free to issue a commit adding your name to this list.

License

KoboldAI is licensed under the AGPL license. In short, this means that it can be used by anyone for any purpose. However, if you decide to make a publicly available instance, your users are entitled to a copy of the source code including all modifications that you have made (which needs to be available through an interface such as a button on your website). You may also not distribute this project in a form that does not contain the source code (such as compiling / encrypting the code and distributing this version without also distributing the source code that includes the changes that you made). You are allowed to distribute this in a closed form if you also provide a separate archive with the source code.

umamba.exe is bundled for convenience because we observed that many of our users had trouble with command line download methods, it is not part of our project and does not fall under the AGPL license. It is licensed under the BSD-3-Clause license. Other files with differing licenses will have a reference or embedded version of this license within the file. It has been sourced from https://anaconda.org/conda-forge/micromamba/files and its source code can be found here : https://github.com/mamba-org/mamba/tree/master/micromamba

koboldai-client's People

Contributors

adcar, crataco, db0, ebolam, gouvernathor, henk717, ioncorimenia, javalar, jojorne, koboldai, lightsaveus, marcusllewellyn, mrreplikant, mrseeker, nolialsea, one-some, pi6am, rahulmb, recoveredapparatus, relys, scott-ca, scythe000, smolbleat, uwuplus, vfbd, waffshappen, wbrown, yellowrosecx, zurnaz

koboldai-client's Issues

Feature: play.bat takes command line arguments to allow startup automation

It would be nice for play.bat to be able to take arguments to allow for a single command to skip the startup 'questionnaire', something like play.bat models/<model> gpu to start a custom model or play.bat gpt-2 cpu to load a default model. I'm playing around with writing a global command to run Kobold AI and it would be even better if it could get me right to the models loading up and the local server starting without any extra input from me.
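
A rough sketch of what the Python side of such a feature could look like, assuming play.bat simply forwarded its arguments (a batch file can do this with %*). The positional model and device arguments below are hypothetical and mirror the examples in this request; they are not existing KoboldAI options.

```python
import argparse


def parse_startup_args(argv=None):
    """Parse hypothetical startup arguments that would skip the questionnaire."""
    parser = argparse.ArgumentParser(
        description="Start KoboldAI without the interactive prompts (sketch)."
    )
    # Model name or folder, e.g. "gpt-2" or "models/<model>".
    parser.add_argument("model", help="model name or path to a model folder")
    # Optional device choice; "gpu" is an arbitrary default chosen for this sketch.
    parser.add_argument(
        "device",
        nargs="?",
        choices=["cpu", "gpu"],
        default="gpu",
        help="where to run generation (default: gpu)",
    )
    return parser.parse_args(argv)
```

With this in place, parse_startup_args(["gpt-2", "cpu"]) yields model "gpt-2" and device "cpu", and leaving out the device falls back to "gpu".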

Sanitize AI output

If I ask the AI to generate me a simple Hello World program in C, the prompt window is not sanitized:

Image

License

Please add a license file.

Feature: Allow the creation of a blacklist that prevents the generation of X characters in a row

Right now, some models (currently I only know that this happens with GPT-J-6B) output weird garbage that completely derail the story and can sometimes soft break generation. The AI will completely ignore the input tokens and get into a loop of repeating things like

***********************The next morning
!!!Her face falls
_______________________________________________________________________________________Chapter 2:

etc. Notice that all of these start with a line of either asterisks, exclamation marks or underscores. It should be possible to cause a group of more than X special characters (like asterisks or underscores) to trigger something that causes the line to be regenerated, which would likely solve the issue.
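
The check described above could be sketched roughly like this. The function name, the default character set and the default threshold are illustrative assumptions, and the caller would still have to decide how to regenerate a flagged line.

```python
import re


def has_special_run(line, max_run=2, chars="*_!"):
    """Return True if line contains more than max_run consecutive characters
    taken from chars (mixed runs such as "*_*" count as one run)."""
    pattern = "[" + re.escape(chars) + "]{" + str(max_run + 1) + ",}"
    return re.search(pattern, line) is not None


samples = [
    "***********************The next morning",
    "!!!Her face falls",
    "He said *hello* to her",
]
# Lines that would trigger a regeneration under this rule.
flagged = [line for line in samples if has_special_run(line)]
```

Here the first two samples are flagged, while ordinary emphasis such as *hello* passes through.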

Authentication error getting access to Google drive storage for colab client

I'm getting the error

Authorization Error
Error 400: policy_enforced
Advanced Protection prevented your Google Account from signing in. This security feature stops most non-Google apps and services from accessing your data to keep your account protected.

I'm looking to see if this can be fixed. This was working on this account a week ago.

Auto-save option, and undo-history for edits

I just lost a paragraph because it disappeared right after editing for some reason; then I went to the terminal to copy back what I had written previously, but the muscle memory made me hit Ctrl-C instead of Ctrl-Shift-C, that kills a running program in the terminal instead of copying the selected text. But since I had the browser window still open with the remaining text, I just booted up Kobold again; but it wiped what I had written completely when it loaded, even in the browser window that was already open from before.

I can still kinda recover the text from the terminal; but the formatting is a bit screwy with all the brackets and escaped characters and stuff; annoying.

It would be great if there was an option to have Kobold automatically save ongoing stories in a temp file or something of the sort, and offer to recover it next time it boots up if it wasn't manually saved before Kobold got closed; and additionally, it would be great to be able to roll back changes, so that if for some reason some part of the text goes bad after editing, you can go back to what it was previously.

ps: Do I need to make this two separate entries, or is it ok to have the whole thing here?

New line on action

I've been playing KoboldAI for some time now, and it's amazing. But one thing that bothers me is that by default when you make an input, it just gets appended to the last line and continues from there. I'd like to suggest adding a configuration that adds a new line when inputs are sent.

This way it's configurable and both people who like it in the same line and people who don't would have a good and easy playthrough \o/

Thanks! Looking forward to scripting!

Feature: repetition penalty slope

Let's face it. Repetition penalty 1.2 is good, but only for very short texts. Later it starts selecting irrelevant words, because the tokens need to be distinct from everything that was before.
NovelAI has Repetition Penalty Slope, where tokens further from the end of context don't need to be so distinct (meaning repetition penalty gradually fades to 0 the further from context the tokens are; the slope regulates the speed of fading).
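
To illustrate the idea only (this is not NovelAI's actual formula), the sketch below gives each context position its own penalty. It assumes a multiplicative penalty where 1.0 means "no penalty", so the penalty fades toward 1.0 for older tokens, and it assumes a slope greater than 0 that acts as an exponent controlling how quickly the fade happens.

```python
def sloped_penalties(num_tokens, penalty=1.2, slope=1.0):
    """Return one repetition penalty per context position, oldest token first.

    The newest token gets the full penalty and the oldest gets 1.0 (none).
    A larger slope makes the penalty fade faster for older tokens.
    """
    if num_tokens <= 0:
        return []
    if num_tokens == 1:
        return [penalty]
    result = []
    for position in range(num_tokens):
        # weight runs from 0.0 (oldest token) to 1.0 (newest token)
        weight = (position / (num_tokens - 1)) ** slope
        result.append(1.0 + (penalty - 1.0) * weight)
    return result
```

A sampler would then apply the per-position value instead of one flat penalty when it penalizes a token that already appears in the context.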

Set model to eval mode for better performance

From some reports, people got lower performance from my finetuned models, but copying over the original model's config seems to fix it. The only significant difference is that gradient checkpointing is enabled for my models, which should only cause a difference when training. However, it seems that KoboldAI doesn't set the model to evaluation mode. Adding the following on line 240 in aiserver.py should fix it: model = model.eval()

From what I see in the transformers code, adding the device argument to pipeline should not actually do anything when the model is already instantiated and passed in directly. In that case, calling model.cuda(0) instead should work to load it on the GPU.

Inferkit prompt not defined, InferKit API Error: 500

The inferkit integration seems to be broken. I get an error on line 877 of aiserver.py (prompt is referenced before assignment.)
I tried moving the assignment from 858 to 841, and that sort of fixed it.

But now I'm getting "InferKit API Error: 500 - INVALID_INPUT"

Multiple Sequence Generation Colab Bug

I've been trying to use the Multiple Sequence Generation feature that was added, and while it does work, it seems to end up freezing and no longer working.

After generating three times or so with the feature on. Colab and KoboldAI both freeze, Colab doesn't appear to receive any info to regenerate while KoboldAI continues to wait for a response that won't come.

I had the settings for 3 gens per action with 60 tokens, I also had the formatting settings (all of them on), maybe it could be that?
(just checked formatting settings, seems unrelated)

It seems very random when it eventually breaks, I don't know if the colab notebook is bugged or what, I am using the latest notebook for it.

Replication:
Use Colab's latest notebook, have it generate a paragraph and then retry a few times. Set 3 generations per action.

It might be the retrying too many times? Not sure.

Other settings I've had set
0.9 temp - No world info or memory used. - Repetition Penalty was around 1.1 and max tokens 512.

Bug: "Retry" deletes multiple steps of the story!

My Settings:

  • Model GPT Neo 2.7B
  • Temperature 0.9
  • Top P Sampling 0.8
  • Repetition Penalty 2
  • Amount to Generate 60
  • Max Tokens 2048
  • Gens Per Action 1
  • W Info Depth 5
  • Always Add Prompt YES
  • Trim Incomplete Sentences YES
  • Remove Blank Lines YES
  • Remove Special Characters NO
  • Add Sentence Spacing YES

Steps to Reproduce the Bug:

  1. First use my settings above.
  2. Type "1." in the prompt and submit it (this is just a random starter seed for the story).
  3. Press Submit a few times to generate multiple steps of AI output in the story.
  4. Change the "Gens Per Action" setting to 2 or higher.
  5. Submit again (blank line) to see two suggestions.
  6. Click Retry to re-generate those two suggestions. KoboldAI will now DELETE steps of the story.
  7. Every time you click Retry it deletes more and more steps of the story.
  8. This bug makes the "Gens Per Action" feature unusable at the moment, but it's a very cool feature so I look forward to using it in the future! :-) ❤️

Suggestion: Allow selection of a phrase of text to add to memory.

I really love how the Edit button works, allowing a person to highlight and select a section of text right from the main story area.

It might be nice to be able to leverage the ability to select text in the same fashion for the purpose of adding new information to memory as a person's story progresses.

Moving towards a dedicated multiplayer server implementation

First of all, great work. This project seems to be very promising in building a custom, self-hosted AIDungeon-type game.

I would like to run this as a simple webservice for my friends, in order to play D&D during Covid. The following list are some things which I believe are more or less necessary for this.

  • Remove the TK UI and instead use command line flags or a configuration file
  • In the same vein, make the server process completely non-interactive / headless
  • Modify the server to listen on 0.0.0.0 instead of localhost only
  • Modify the application Javascript to open the socket on the correct host instead of localhost
  • Add a name / account / session system for multiplayer games
  • Modify the websocket server to broadcast to all connected clients (use socketio.emit instead of emit)

A few things I also found could be useful:

  • Dedicated WSGI deployment (with a dedicated websocket module like e.g. eventlet)
  • Better multithreading support, currently the AI process hangs the webserver
  • Docker support
  • More UI options for verb selection
  • World info support
  • Proper truncation of responses
  • Code cleanup, maybe split into various files / classes

Currently, I am testing using my fork, but I'd like to upstream any contributions eventually. However I thought I'd first get your opinion on these ideas, any criticism is welcome.

By the way, most models running on pytorch run on AMD cards just as well using ROCm.

Edit: The 0.16 version addressed the most important points, see comments. I'll leave this issue open however to track further improvement of the standalone server.

Feature: Clean up punctuation.

There's some weird text generation sometimes, and the current "remove all strange characters" feature is too heavy-handed. Some sentence cleanup regex would be a much better solution.

Here are some examples of ideas for necessary text transformations:

  • He said that he was telling the truth. (This was false ---> He said that he was telling the truth. (This was false.)
  • "Wherever you want to go ---> "Wherever you want to go."
  • It's a wonderful world" ---> It's a wonderful world. or "It's a wonderful world."
  • It's a wonderful world!" ---> It's a wonderful world! or "It's a wonderful world!"
  • That's nice, ---> That's nice.

I wouldn't be surprised if there are already libraries out there for either Node.JS (for inspiration, it's the largest package repo in the world) or on PyPi for Python, that can do these kinds of sentence cleanup transformations.
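
As a starting point, the transformations listed above can be approximated with a few plain string rules. This is a small sketch that only handles a single trailing fragment and straight double quotes; the function names are made up for illustration, and where the list offers two acceptable results it picks the one that drops the stray quote.

```python
def _ends_sentence(text):
    """True if text already ends with sentence-final punctuation."""
    return text.endswith((".", "!", "?"))


def tidy_sentence(text):
    """Apply a few end-of-text punctuation repairs to generated text."""
    text = text.rstrip()
    if not text:
        return text
    # "That's nice," -> "That's nice."
    if text.endswith(","):
        text = text[:-1] + "."
    # An odd number of double quotes means one quote is unmatched.
    if text.count('"') % 2 == 1:
        if text.endswith('"'):
            # Stray closing quote: drop it and make sure the sentence ends.
            text = text[:-1].rstrip()
            if not _ends_sentence(text):
                text += "."
        else:
            # Unclosed opening quote: finish the sentence, then close it.
            if not _ends_sentence(text):
                text += "."
            text += '"'
    # Unclosed parenthesis: finish the sentence, then close it.
    if text.count("(") > text.count(")"):
        if not _ends_sentence(text):
            text += "."
        text += ")"
    return text
```

A fuller solution would need to handle nested quotes, curly quotes and multi-sentence output, which is where an existing library would help.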

Autosaving

It would be nice to have this so that chunks of the story are not lost when the Colab notebook stops.

Notebook automatically stops by performing ctrl-c

While attempting to start up koboldai notebook ran for 10 minutes and then ^C itself. No keystrokes were typed on the user side.

Welcome to the KoboldAI Client!
Select an AI model to continue:

Welcome to ColabKobold! The easiest way to run KoboldAI! We will now load the AI, once its done you will see a message to refresh the cloudflare page.
Looking for GPU support...FOUND!
Initializing Flask... OK!
Initializing transformers, please wait...
2021-07-21 12:34:24.124339: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
tcmalloc: large alloc 5302616064 bytes == 0x5605c8c30000 @  0x7f917b37cb6b 0x7f917b39c379 0x7f912076a25e 0x7f912076b9d2 0x7f91627858ed 0x7f91733a7280 0x7f9172fe5d39 0x56043d6dfbf8 0x56043d7536f2 0x56043d74e235 0x56043d6e073a 0x56043d74eb0e 0x56043d74e235 0x56043d6e034b 0x56043d6dfe59 0x56043d82725d 0x56043d796c3b 0x56043d6def01 0x56043d7d0c0d 0x56043d7530d8 0x56043d74e235 0x56043d61fe2c 0x56043d750318 0x56043d74dc35 0x56043d6e073a 0x56043d74f93b 0x56043d74dc35 0x56043d6e073a 0x56043d752f40 0x56043d74dc35 0x56043d74d933
^C

Possibly tensorflow/tensorflow#33255
tensorflow/models#7652
huggingface/transformers#4668

Unable to load AIDCAT Scenarios or Adventures

When trying to load scenarios from an unmodified export JSON from AIDCAT, this traceback is issued:

  File "C:\Users\foxyd\AppData\Local\Programs\Python\Python39\lib\threading.py", line 954, in _bootstrap_inner
    self.run()
  File "C:\Users\foxyd\AppData\Local\Programs\Python\Python39\lib\threading.py", line 892, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\foxyd\AppData\Local\Programs\Python\Python39\lib\site-packages\socketio\server.py", line 688, in _handle_event_internal
    r = server._trigger_event(data[0], namespace, sid, *data[1:])
  File "C:\Users\foxyd\AppData\Local\Programs\Python\Python39\lib\site-packages\socketio\server.py", line 712, in _trigger_event
    return self.handlers[namespace][event](*args)
  File "C:\Users\foxyd\AppData\Local\Programs\Python\Python39\lib\site-packages\flask_socketio\__init__.py", line 283, in _handler
    return self._handle_event(handler, message, namespace, sid,
  File "C:\Users\foxyd\AppData\Local\Programs\Python\Python39\lib\site-packages\flask_socketio\__init__.py", line 751, in _handle_event
    ret = handler(*args)
  File "C:\Users\foxyd\Documents\My Games\KoboldAI\KoboldAI-Client-main\aiserver.py", line 343, in get_message
    importRequest()
  File "C:\Users\foxyd\Documents\My Games\KoboldAI\KoboldAI-Client-main\aiserver.py", line 1230, in importRequest
    ob["acts"]  = len(story["actions"])
KeyError: 'actions'

When trying to load adventures from an unmodified export JSON, this traceback is issued:

  File "C:\Users\foxyd\AppData\Local\Programs\Python\Python39\lib\threading.py", line 954, in _bootstrap_inner
    self.run()
  File "C:\Users\foxyd\AppData\Local\Programs\Python\Python39\lib\threading.py", line 892, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\foxyd\AppData\Local\Programs\Python\Python39\lib\site-packages\socketio\server.py", line 688, in _handle_event_internal
    r = server._trigger_event(data[0], namespace, sid, *data[1:])
  File "C:\Users\foxyd\AppData\Local\Programs\Python\Python39\lib\site-packages\socketio\server.py", line 712, in _trigger_event
    return self.handlers[namespace][event](*args)
  File "C:\Users\foxyd\AppData\Local\Programs\Python\Python39\lib\site-packages\flask_socketio\__init__.py", line 283, in _handler
    return self._handle_event(handler, message, namespace, sid,
  File "C:\Users\foxyd\AppData\Local\Programs\Python\Python39\lib\site-packages\flask_socketio\__init__.py", line 751, in _handle_event
    ret = handler(*args)
  File "C:\Users\foxyd\Documents\My Games\KoboldAI\KoboldAI-Client-main\aiserver.py", line 343, in get_message
    importRequest()
  File "C:\Users\foxyd\Documents\My Games\KoboldAI\KoboldAI-Client-main\aiserver.py", line 1212, in importRequest
    vars.importjs = json.load(file)
  File "C:\Users\foxyd\AppData\Local\Programs\Python\Python39\lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
  File "C:\Users\foxyd\AppData\Local\Programs\Python\Python39\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 144658: character maps to <undefined>

I'm not sure how these particular issues could've happened, considering they're just the raw, exported files.

EDIT: Even the included sample story fails to load.
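
Both tracebacks point at how the file is read rather than at the export itself. The second one shows Python decoding the file with the Windows default codec (cp1252), in which byte 0x90 is undefined, and the first shows a lookup of an "actions" key that scenario exports evidently lack. A minimal sketch of a more tolerant loader follows; the helper names are hypothetical and this is not the project's actual fix.

```python
import json


def load_story_file(path):
    """Read an exported JSON file as UTF-8 instead of the platform default codec."""
    with open(path, "r", encoding="utf-8") as file:
        return json.load(file)


def count_actions(story):
    """Return the number of actions, treating a missing "actions" key as empty."""
    return len(story.get("actions", []))
```

Passing encoding="utf-8" explicitly makes the result independent of the system locale, which matters on Windows where the default is often cp1252.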

Editing text in the output screen

The current way of editing text is still a bit clunky. Couldn't you make it so that you didn't need to press an Edit button, but the mouse would highlight story chunks in normal operation mode too, but there wouldn't be a separate editor for those chunks and instead they were edited in the output window directly?

Cannot find module

Traceback (most recent call last):
  File "aiserver.py", line 155, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'

I got this error after following the instructions to run the program on my GPU. I don't know where to begin to even start fixing this, please help.

ValueError

ValueError: unable to parse D:/KoboldAI-Client-main\config.json as a URL or as a local path
(base) D:\KoboldAI-Client-main>File "K:\python\lib\site-packages\transformers\file_utils.py", line 1420, in cached_path

First tried using a temporary K disk, then the client folder. Both gave the same error.

Shinen model won't launch

trying to open the shinen model in colab gives me
OSError: /usr/local/lib/python3.7/dist-packages/torch/lib/libtorch_global_deps.so: cannot open shared object file: No such file or directory

Training the AI

How do I feed stories to the AI? I want to train it on worlds such as Pokemon and Five Nights at Freddys as well as create custom information for it to base other things on. I read the info but couldn't find anything about it.

Accessibility issues

I am finding it difficult to edit text after pressing "edit" with a screen reader as well as finding where to add an author's note. I can't seem to select what to edit when pressing the edit button, and where to add author's note isn't clear.
Thanks.

Easier to use predownloaded models

In a fair few AID2 forks there's a "models" directory where I could symbolically link the directories actually containing the models. So using that scheme, there was only a single copy of each model.

It seems KoboldAI has a different system. I tried to understand the code of what it does, but my best guess is that the transformers library is tasked to handle locating/downloading the models (unless one of those more "special" options is selected). But I still couldn't figure out where they are placed on my system so that I could bypass downloading the models and instead use already existing ones on my system. Could this be made easier somehow?

Feature: Implement tail free sampling.

Not sure if anyone wants to give it a try at implementing this feature. It sounds like an awesome technique for making story-driven AI stay on topic and generate coherent stories:

https://trentbrick.github.io/Tail-Free-Sampling/

It was invented by someone who wanted their D&D generated stories to be on-topic within an overarching narrative, yet be completely original. The paper is fascinating.
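
For anyone wanting to experiment, here is a pure-Python sketch of the filtering step as I understand it from the linked post: sort the token probabilities, take the absolute second differences of the sorted curve, normalize them, and cut the tail once their cumulative sum exceeds a threshold z. The function name and the default z are assumptions, and a real implementation would operate on tensors of logits.

```python
def tail_free_filter(probs, z=0.95):
    """Return the indices of the tokens kept by tail free sampling.

    probs holds one probability per token; indices come back most likely first.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    sorted_probs = [probs[i] for i in order]
    if len(sorted_probs) < 3:
        return order  # too few points for a second difference
    first = [b - a for a, b in zip(sorted_probs, sorted_probs[1:])]
    second = [abs(b - a) for a, b in zip(first, first[1:])]
    total = sum(second)
    if total == 0:
        return order  # no curvature at all, so nothing to cut
    keep = [order[0]]  # the most likely token is always kept
    cumulative = 0.0
    for rank, value in enumerate(second, start=1):
        cumulative += value / total
        if cumulative > z:
            break
        keep.append(order[rank])
    return keep
```

The probabilities of the kept tokens would then be renormalized before sampling. Note that this sketch never keeps the least likely token, because a second difference is only defined for the ranks before it.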

Colors in the Windows terminal

You need to call os.system('color') at the beginning of the script for the colors to work in the Windows terminal. Otherwise it will display colors like this:

←[96mWelcome to the KoboldAI Server!
Select an AI model to continue:←[0m

Some people do:

import os

#==================================================================#
# Variables & Storage
#==================================================================#

# System call
os.system('')

# Terminal tags for colored text
class colors:

Hangs on back -> retry

On the user's side:

  1. Make any prompt at the start
  2. Press "Back"
  3. Press "Retry"

This hangs, that is, never completes the query. I was using the smallest GPT-2 model, in case that matters.

A minor thing that doesn't really need a separate issue: in the server log, "Data recieved:" should probably be "Data received: ".

Syntax error in aiserver.py

When I try to run play.bat, I get the following error:

A:\Projects\KoboldAI>aiserver.py
  File "A:\Projects\KoboldAI\aiserver.py", line 117
    print("{0}Looking for GPU support...{1}".format(colors.HEADER, colors.ENDC), end="")
                                                                                    ^
SyntaxError: invalid syntax

I'm using Python 3.8.10 on Windows 10.

Ability to run without model

I am just using koboldAI to build a WI index, for a scenario, and don't require the AI to be functional. Is there a way to run it without inputting a model??

Running on linux

I was trying to get this to run on Pop OS when I encountered an issue.
The installation steps all went fine, but when I first tried to start the game using play-cuda.sh it didn't work because of this issue:

RUN apt update && apt install xorg -y

Resolved that by commenting out the line, because xorg was already installed.
Now when trying to run it I get this error:

Error response from daemon: error gathering device information while adding custom device "/dev/kfd": no such file or directory

Found out this might have something to do with rocm, which I don't have installed (Because I'm trying to run this on a 1080Ti).
Now I wonder if running this on linux is even supported since all the instructions are made for windows :)
I have to mention though, that it does run flawlessly in my Installation of Windows. No issues at any process step. Only tried the gpt neo 2.7B parameter set so far and runs fine on 11GB VRAM. Thanks for all the work that has already been put into this.

Encoding problem with accented and special characters

Hello! I've gone through two issues here. I was testing a model I finetuned to see if it spoke Portuguese well. I imported an AID adventure that had some Portuguese in it, and KAI started throwing errors because of the accented characters. I had to write a script that replaces them with their unaccented variation (i.e., ã became a).

Now, I had a similar problem with my CAT WIs. The triple bar doesn't seem to work with KAI. When the WI is triggered, the character gets messy.

How the WI got inserted into LMI: Received Data: [ Clavicus Vile description:< name ≡ Clavicus Vile& Vile>/< age ≡ primeval>. Clavicus Vile summary:< appears ≡ male>/< location ≡ The Fields of Regret>/< almost always with his hound Barbas by his side>. Clavicus Vile appearance:< skin ≡ yellow>/< long black horns>/< eyes ≡ red>. Clavicus Vile mental:< jokester& sarcastic& trickster& manipulative>. Clavicus Vile occupation:< Daedric Prince of Trickery and Bargains/ God of Trickery and Bargains>. Clavicus Vile speech:< mocking tone>.] (≡ where it should be )

With the accented character, the error was something like UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 867: character maps to <undefined>. After removing all of them, the import worked.

Thanks!

Can't run any GPT-J-6B model locally in CPU or GPU+CPU modes

Seems like there's no way to run GPT-J-6B models locally using CPU or CPU+GPU modes. I've tried both transformers versions (original and finetuneanon's) in both modes (CPU and GPU+CPU), but they all fail in one way or another.

First, I'll describe the error that appears when trying to use the gpt-j-6b-adventure-hf model locally in GPU+CPU hybrid mode. In this case KoboldAI raises the following error:

module 'keras.backend' has no attribute 'is_tensor'

Steps to reproduce

I'm testing this on Linux.

  1. Setup everything and start KoboldAI.
git clone https://github.com/KoboldAI/KoboldAI-Client.git kobold-local
cd kobold-local

python3 -m venv ./venv
source venv/bin/activate

pip install -r requirements.txt

mkdir -p models
cd models
wget 'https://api.wandb.ai/files/ve-forbryderne/adventure/carol-data/models/gpt-j-6b-adventure-hf.7z'
7za x gpt-j-6b-adventure-hf.7z
cd ..

python3 aiserver.py
  2. Choose 1 - Custom Neo (GPT-Neo / Converted GPT-J).

  3. Pick models/gpt-j-6b-adventure-hf.

  4. Choose 3 - Both (slower than GPU-only but uses less VRAM).

  5. Choose a number of blocks for the system RAM. In my case it was 24 (but later I used 20).

  6. Enter anything in the web GUI prompt and click Submit.

After some time the abovementioned error will appear.

I was using the bundled requirements.txt, so the finetuneanon's version of the transformers was used.

Full output:
❯ python3 aiserver.py
Welcome to the KoboldAI Server!
Select an AI model to continue:

    #   Model                           V/RAM
    =========================================
    1  - Custom Neo (GPT-Neo / Converted GPT-J)
    2  - Custom GPT-2 (eg CloverEdition)
    3  - GPT Neo 1.3B                   8GB
    4  - GPT Neo 2.7B                   16GB
    5  - GPT-2                          1GB
    6  - GPT-2 Med                      2GB
    7  - GPT-2 Large                    4GB
    8  - GPT-2 XL                       8GB
    9  - InferKit API (requires API key)
    10 - Google Colab
    11 - OpenAI API (requires API key)
    12 - Read Only (No AI)

Model #> 1
Please choose the folder where pytorch_model.bin is located:

Looking for GPU support...FOUND!
You're using a model that supports GPU-CPU hybrid generation!
Currently only GPT-Neo models and GPT-J-6B support this feature.
Use GPU or CPU for generation?:  (Default GPU)
    1 - GPU
    2 - CPU
    3 - Both (slower than GPU-only but uses less VRAM)

Mode> 3
Initializing Flask... OK!
Initializing transformers, please wait...

How many layers would you like to put into system RAM?
The more of them you put into system RAM, the slower it will run,
but it will require less VRAM
(roughly proportional to number of layers).
This model has 28 layers.

# of layers> 24
Will commit 24 of 28 layers to system RAM.
OK! NeoCustom pipeline created!
You may now connect with a browser at http://127.0.0.1:5000/
* Serving Flask app "aiserver" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
The WebSocket transport is not available, you must install a WebSocket server that is compatible with your async mode to enable it. See the documentation for details. (further occurrences of this error will be logged with level INFO)
Client connected!
Data received:{'cmd': 'submit', 'actionmode': 0, 'data': 'I see a shining light.'}
Min:7, Max:86, Txt:I see a shining light.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
module 'keras.backend' has no attribute 'is_tensor'

The generic gpt-j-6b model throws the same error.

Other errors

When I try to use finetuneanon's transformers in CPU mode, a different error occurs: "LayerNormKernelImpl" not implemented for 'Half'. This is documented, so it's "ok".

When I try to use the original transformers in GPU+CPU mode I get this error: Input, output and indices must be on the current device.

Full output:
❯ python3 aiserver.py
Welcome to the KoboldAI Server!
Select an AI model to continue:

    #   Model                           V/RAM
    =========================================
    1  - Custom Neo (GPT-Neo / Converted GPT-J)
    2  - Custom GPT-2 (eg CloverEdition)
    3  - GPT Neo 1.3B                   8GB
    4  - GPT Neo 2.7B                   16GB
    5  - GPT-2                          1GB
    6  - GPT-2 Med                      2GB
    7  - GPT-2 Large                    4GB
    8  - GPT-2 XL                       8GB
    9  - InferKit API (requires API key)
    10 - Google Colab
    11 - OpenAI API (requires API key)
    12 - Read Only (No AI)

Model #> 1
Please choose the folder where pytorch_model.bin is located:

Looking for GPU support...FOUND!
You're using a model that supports GPU-CPU hybrid generation!
Currently only GPT-Neo models and GPT-J-6B support this feature.
Use GPU or CPU for generation?:  (Default GPU)
    1 - GPU
    2 - CPU
    3 - Both (slower than GPU-only but uses less VRAM)

Mode> 3
Initializing Flask... OK!
Initializing transformers, please wait...
Some weights of the model checkpoint at /home/user/test/kobold-local/models/gpt-j-6b were not used when initializing GPTNeoForCausalLM: ['lm_head.bias']
- This IS expected if you are initializing GPTNeoForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing GPTNeoForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of GPTNeoForCausalLM were not initialized from the model checkpoint at /home/user/test/kobold-local/models/gpt-j-6b and are newly initialized: ['transformer.h.25.ln_2.weight', 'transformer.h.21.ln_2.bias', 'transformer.h.10.ln_2.weight', 'transformer.h.24.attn.attention.out_proj.bias', 'transformer.h.7.ln_2.bias', 'transformer.h.21.attn.attention.out_proj.bias', 'transformer.h.24.ln_2.bias', 'transformer.h.22.attn.attention.out_proj.bias', 'transformer.h.0.attn.attention.out_proj.bias', 'transformer.h.1.ln_2.bias', 'transformer.h.9.ln_2.bias', 'transformer.h.9.attn.attention.out_proj.bias', 'transformer.h.19.ln_2.weight', 'transformer.h.8.ln_2.weight', 'transformer.h.8.attn.attention.out_proj.bias', 'transformer.h.17.ln_2.bias', 'transformer.h.27.ln_2.bias', 'transformer.h.13.ln_2.weight', 'transformer.h.24.ln_2.weight', 'transformer.h.16.ln_2.bias', 'transformer.h.3.attn.attention.out_proj.bias', 'transformer.h.11.ln_2.bias', 'transformer.h.20.ln_2.weight', 'transformer.h.0.ln_2.bias', 'transformer.h.1.attn.attention.out_proj.bias', 'transformer.h.10.attn.attention.out_proj.bias', 'transformer.h.4.ln_2.bias', 'transformer.h.5.ln_2.bias', 'transformer.h.11.attn.attention.out_proj.bias', 'transformer.h.25.ln_2.bias', 'transformer.h.15.ln_2.bias', 'transformer.h.3.ln_2.weight', 'transformer.h.18.ln_2.weight', 'transformer.h.18.attn.attention.out_proj.bias', 'transformer.h.9.ln_2.weight', 'transformer.h.23.ln_2.bias', 'transformer.h.6.attn.attention.out_proj.bias', 'transformer.h.7.attn.attention.out_proj.bias', 'transformer.h.2.attn.attention.out_proj.bias', 'transformer.h.16.ln_2.weight', 'transformer.h.7.ln_2.weight', 'transformer.h.3.ln_2.bias', 'transformer.h.23.attn.attention.out_proj.bias', 'transformer.h.27.ln_2.weight', 'transformer.h.12.ln_2.weight', 'transformer.h.13.attn.attention.out_proj.bias', 'transformer.h.5.ln_2.weight', 'transformer.h.8.ln_2.bias', 'transformer.h.2.ln_2.weight', 'transformer.h.20.attn.attention.out_proj.bias', 
'transformer.h.4.ln_2.weight', 'transformer.h.26.ln_2.weight', 'transformer.h.6.ln_2.weight', 'transformer.h.22.ln_2.bias', 'transformer.h.14.attn.attention.out_proj.bias', 'transformer.h.20.ln_2.bias', 'transformer.h.13.ln_2.bias', 'transformer.h.18.ln_2.bias', 'transformer.h.25.attn.attention.out_proj.bias', 'transformer.h.26.attn.attention.out_proj.bias', 'transformer.h.26.ln_2.bias', 'transformer.h.19.ln_2.bias', 'transformer.h.17.ln_2.weight', 'transformer.h.14.ln_2.weight', 'transformer.h.4.attn.attention.out_proj.bias', 'transformer.h.17.attn.attention.out_proj.bias', 'transformer.h.27.attn.attention.out_proj.bias', 'transformer.h.6.ln_2.bias', 'transformer.h.5.attn.attention.out_proj.bias', 'transformer.h.23.ln_2.weight', 'transformer.h.15.ln_2.weight', 'transformer.h.21.ln_2.weight', 'transformer.h.19.attn.attention.out_proj.bias', 'transformer.h.2.ln_2.bias', 'transformer.h.10.ln_2.bias', 'transformer.h.1.ln_2.weight', 'transformer.h.22.ln_2.weight', 'transformer.h.11.ln_2.weight', 'transformer.h.14.ln_2.bias', 'transformer.h.0.ln_2.weight', 'transformer.h.15.attn.attention.out_proj.bias', 'transformer.h.12.attn.attention.out_proj.bias', 'transformer.wpe.weight', 'transformer.h.16.attn.attention.out_proj.bias', 'transformer.h.12.ln_2.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

How many layers would you like to put into system RAM?
The more of them you put into system RAM, the slower it will run,
but it will require less VRAM
(roughly proportional to number of layers).
This model has 28 layers.

# of layers> 20
Will commit 20 of 28 layers to system RAM.
OK! NeoCustom pipeline created!
You may now connect with a browser at http://127.0.0.1:5000/
* Serving Flask app "aiserver" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
The WebSocket transport is not available, you must install a WebSocket server that is compatible with your async mode to enable it. See the documentation for details. (further occurrences of this error will be logged with level INFO)
Client connected!
Client connected!
Client connected!
Data received:{'cmd': 'submit', 'actionmode': 0, 'data': 'I see a shining light.'}
Min:7, Max:86, Txt:I see a shining light.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Client connected!
Input, output and indices must be on the current device

And when I try to use the original transformers in CPU mode, there's no error, but the output is garbage. For example, when I input "I see a shining light." it gives me this:

Analog Disk Sellvest Lif medically brightest scalingieuEVURNprefix DISTRICT relay Samson Commission Fold recallAUmaps bumper PB dex Cullen Championships unp HERO Raspberry Ankalse Ness sustained invokevind Pikachu Volks Meth Lect EMP cyan steering Tens LET ENexplet laptops fliesATT InstituteERSON mitochond!

The original transformers also produce some warnings (truncated):

Some weights of the model checkpoint at /home/user/test/kobold-local/models/gpt-j-6b were not used when initializing GPTNeoForCausalLM: ['lm_head.bias']
- This IS expected if you are initializing GPTNeoForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing GPTNeoForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of GPTNeoForCausalLM were not initialized from the model checkpoint at /home/user/test/kobold-local/models/gpt-j-6b and are newly initialized: ['transformer.h.7.ln_2.weight', 'transformer.h.25.ln_2.bias', 'transformer.h.26.ln_2.bias', 'transformer.h.5.ln_2.bias', ...]
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Full output:
❯ python3 aiserver.py
Welcome to the KoboldAI Server!
Select an AI model to continue:

    #   Model                           V/RAM
    =========================================
    1  - Custom Neo (GPT-Neo / Converted GPT-J)
    2  - Custom GPT-2 (eg CloverEdition)
    3  - GPT Neo 1.3B                   8GB
    4  - GPT Neo 2.7B                   16GB
    5  - GPT-2                          1GB
    6  - GPT-2 Med                      2GB
    7  - GPT-2 Large                    4GB
    8  - GPT-2 XL                       8GB
    9  - InferKit API (requires API key)
    10 - Google Colab
    11 - OpenAI API (requires API key)
    12 - Read Only (No AI)

Model #> 1
Please choose the folder where pytorch_model.bin is located:

Looking for GPU support...FOUND!
You're using a model that supports GPU-CPU hybrid generation!
Currently only GPT-Neo models and GPT-J-6B support this feature.
Use GPU or CPU for generation?:  (Default GPU)
    1 - GPU
    2 - CPU
    3 - Both (slower than GPU-only but uses less VRAM)

Mode> 2
Initializing Flask... OK!
Initializing transformers, please wait...
Some weights of the model checkpoint at /home/user/test/kobold-local/models/gpt-j-6b were not used when initializing GPTNeoForCausalLM: ['lm_head.bias']
- This IS expected if you are initializing GPTNeoForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing GPTNeoForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of GPTNeoForCausalLM were not initialized from the model checkpoint at /home/user/test/kobold-local/models/gpt-j-6b and are newly initialized: ['transformer.h.7.ln_2.weight', 'transformer.h.25.ln_2.bias', 'transformer.h.26.ln_2.bias', 'transformer.h.5.ln_2.bias', 'transformer.h.18.attn.attention.out_proj.bias', 'transformer.h.1.ln_2.weight', 'transformer.h.13.ln_2.weight', 'transformer.h.21.ln_2.bias', 'transformer.h.8.ln_2.bias', 'transformer.h.19.attn.attention.out_proj.bias', 'transformer.h.23.attn.attention.out_proj.bias', 'transformer.h.8.ln_2.weight', 'transformer.h.19.ln_2.bias', 'transformer.h.2.attn.attention.out_proj.bias', 'transformer.h.11.ln_2.bias', 'transformer.h.5.ln_2.weight', 'transformer.h.3.attn.attention.out_proj.bias', 'transformer.h.6.attn.attention.out_proj.bias', 'transformer.h.22.ln_2.bias', 'transformer.h.17.ln_2.bias', 'transformer.h.16.attn.attention.out_proj.bias', 'transformer.h.14.ln_2.bias', 'transformer.h.27.attn.attention.out_proj.bias', 'transformer.h.16.ln_2.bias', 'transformer.h.0.ln_2.bias', 'transformer.h.2.ln_2.bias', 'transformer.h.6.ln_2.bias', 'transformer.h.8.attn.attention.out_proj.bias', 'transformer.h.15.attn.attention.out_proj.bias', 'transformer.h.13.ln_2.bias', 'transformer.h.0.ln_2.weight', 'transformer.h.12.ln_2.weight', 'transformer.h.10.ln_2.bias', 'transformer.h.7.ln_2.bias', 'transformer.h.20.ln_2.bias', 'transformer.h.14.attn.attention.out_proj.bias', 'transformer.h.4.ln_2.weight', 'transformer.h.26.ln_2.weight', 'transformer.h.26.attn.attention.out_proj.bias', 'transformer.h.4.ln_2.bias', 'transformer.h.10.attn.attention.out_proj.bias', 'transformer.wpe.weight', 'transformer.h.1.ln_2.bias', 'transformer.h.6.ln_2.weight', 'transformer.h.24.attn.attention.out_proj.bias', 'transformer.h.11.attn.attention.out_proj.bias', 'transformer.h.22.attn.attention.out_proj.bias', 'transformer.h.3.ln_2.weight', 'transformer.h.3.ln_2.bias', 'transformer.h.23.ln_2.bias', 
'transformer.h.25.attn.attention.out_proj.bias', 'transformer.h.27.ln_2.weight', 'transformer.h.23.ln_2.weight', 'transformer.h.9.ln_2.weight', 'transformer.h.0.attn.attention.out_proj.bias', 'transformer.h.1.attn.attention.out_proj.bias', 'transformer.h.9.attn.attention.out_proj.bias', 'transformer.h.13.attn.attention.out_proj.bias', 'transformer.h.24.ln_2.weight', 'transformer.h.17.attn.attention.out_proj.bias', 'transformer.h.12.ln_2.bias', 'transformer.h.24.ln_2.bias', 'transformer.h.2.ln_2.weight', 'transformer.h.25.ln_2.weight', 'transformer.h.18.ln_2.weight', 'transformer.h.19.ln_2.weight', 'transformer.h.21.attn.attention.out_proj.bias', 'transformer.h.7.attn.attention.out_proj.bias', 'transformer.h.16.ln_2.weight', 'transformer.h.27.ln_2.bias', 'transformer.h.20.ln_2.weight', 'transformer.h.15.ln_2.weight', 'transformer.h.10.ln_2.weight', 'transformer.h.9.ln_2.bias', 'transformer.h.18.ln_2.bias', 'transformer.h.12.attn.attention.out_proj.bias', 'transformer.h.5.attn.attention.out_proj.bias', 'transformer.h.22.ln_2.weight', 'transformer.h.11.ln_2.weight', 'transformer.h.20.attn.attention.out_proj.bias', 'transformer.h.4.attn.attention.out_proj.bias', 'transformer.h.15.ln_2.bias', 'transformer.h.14.ln_2.weight', 'transformer.h.17.ln_2.weight', 'transformer.h.21.ln_2.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
OK! NeoCustom pipeline created!
You may now connect with a browser at http://127.0.0.1:5000/
* Serving Flask app "aiserver" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
The WebSocket transport is not available, you must install a WebSocket server that is compatible with your async mode to enable it. See the documentation for details. (further occurrences of this error will be logged with level INFO)
Client connected!
Data received:{'cmd': 'submit', 'actionmode': 0, 'data': 'I see a shining light.'}
Min:7, Max:86, Txt:I see a shining light.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Client connected!
Analog Disk Sellvest Lif medically brightest scalingieuEVURNprefix DISTRICT relay Samson Commission Fold recallAUmaps bumper PB dex Cullen Championships unp HERO Raspberry Ankalse Ness sustained invokevind Pikachu Volks Meth Lect EMP cyan steering Tens LET ENexplet laptops fliesATT InstituteERSON mitochond!=EMP Meng BengEh KakERSON webs purchaser Sitting sunk liquphan%; accompanies lecturer Championships bumperrite sailorsasaki hammşATTarth Bash MAT Pupp

Summary

| Mode    | Transformers | Error                                                     |
|---------|--------------|-----------------------------------------------------------|
| CPU     | original     | Garbage output                                            |
| CPU     | finetuneanon | "LayerNormKernelImpl" not implemented for 'Half'          |
| GPU+CPU | original     | Input, output and indices must be on the current device   |
| GPU+CPU | finetuneanon | module 'keras.backend' has no attribute 'is_tensor'       |

If these errors are unfixable, I think they should at least be documented somewhere.

Other details:

  • gpt-j-6b-adventure-hf and gpt-j-6b models produce the same errors.

  • I've tested 2.7B models and they work fine in CPU and GPU+CPU modes.

  • I can't test 6B models in GPU-only mode (not enough VRAM).

System

  • GeForce GTX 1060 6GB
  • 32 GB RAM (+ pagefile since using CPU-only requires around 45GB)
  • Kubuntu 21.10
  • CUDA 11.3.109

Saving/Loading Issue

For some reason I'm unable to save any stories: when I go to load them, the client finds nothing.
I am currently using the latest version.

Reproduce issue:
Start a new story prompt, generate a few times, and then try to save. The file will "save" but won't be found by the load system.

Solution:
It turns out that the client isn't properly adding the .json extension to the end of the saved files, so they don't show up in the load list.

A band-aid fix is to add the .json extension to the extensionless files; they will then show up and load properly.
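The workaround described above can be scripted. The following is a minimal sketch, not part of KoboldAI itself: the function name `add_json_extension` is made up for illustration, and you have to point it at wherever your saves actually live. It only touches files that have no extension at all and never overwrites an existing file.

```python
from pathlib import Path


def add_json_extension(stories_dir):
    """Rename extensionless files in stories_dir to <name>.json; return the new paths."""
    renamed = []
    for path in sorted(Path(stories_dir).iterdir()):
        # Skip folders, hidden files, and anything that already has an extension.
        if not path.is_file() or path.name.startswith(".") or path.suffix:
            continue
        target = path.with_name(path.name + ".json")
        # Never overwrite an existing save with the same name.
        if target.exists():
            continue
        path.rename(target)
        renamed.append(target)
    return renamed
```

Usage would be something like `add_json_extension("path/to/your/saves")`. Note that a save whose name already contains a dot (for example "chapter 1.5") looks like it has an extension and is skipped, so such files would still need renaming by hand.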

How to clean up the cache to save space

I decided to use the full-precision setting today, and it has drastically reduced the free space on my hard drive. I assume it is storing something on my computer somewhere, since I didn't get the space back even after terminating the session. Help cleaning up the space the full-precision files took up would be appreciated.

Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found

I've followed the setup instructions, and then run through the instructions to enable GPU support. I have a GTX 1080, for reference. I installed CUDA 10.2, which you linked to in your instructions, along with the 2 updates they provided for it. During that installation, I did disable "Geforce Experience" (don't want to use it) and "Graphics Drivers" (I'd rather update those separately) from its installation list, only having it install everything under the CUDA section.
I then got the command line for installing PyTorch with CUDA 10.2 support at that link you provided, which turned out to be:
pip3 install torch==1.8.1+cu102 torchvision==0.9.1+cu102 torchaudio===0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
I ran that, which looked to have replaced the existing PyTorch which didn't have CUDA support.

However, I get an error when running KoboldAI which indicates it's not loading CUDA correctly and thus not using my GPU. Example output:

Model #> 3
Looking for GPU support...FOUND!
Use GPU or CPU for generation?: (Default GPU)

1 - GPU
2 - CPU

Mode> 1
Initializing Flask... OK!
Initializing transformers, please wait...
2021-05-18 19:56:30.807734: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2021-05-18 19:56:30.807837: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
OK! gpt2 pipeline created!
You may now connect with a browser at http://127.0.0.1:5000/

From the error it looks like it's looking for CUDA 11 instead of CUDA 10? Should I just install CUDA 11.1 and then the appropriate version of PyTorch for that, or...?
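One way to narrow this down: the two log lines come from TensorFlow (their paths start with `tensorflow/stream_executor`), and `cudart64_110.dll` is the CUDA 11.0 runtime that TensorFlow build is looking for. As far as I know, the `+cu102` PyTorch wheels bundle their own CUDA runtime, so this warning by itself does not necessarily say anything about PyTorch. The sketch below asks PyTorch directly what it sees. The helper name `torch_cuda_report` is made up for illustration, and it assumes KoboldAI generates through PyTorch for this model.

```python
import importlib.util


def torch_cuda_report():
    """Return what PyTorch itself reports about CUDA, or note that it is missing."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}
    import torch

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        # Name of the first visible GPU.
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


print(torch_cuda_report())
```

If `cuda_available` comes back `True`, PyTorch can reach the GPU regardless of the TensorFlow message.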

AI Dungeon selected adventure wrapping after the 9th entry

Linux Mint, Firefox, first time listener and caller.

Just hooked everything up and set it with IK to start testing, and selected a story (the 26th in the list) which properly read in terminal as
Data recieved:{'cmd': 'importselect', 'data': 'import25'}

However, it loaded the sixth adventure in the list instead. Suspecting the behavior happening, I tried the 19th adventure, and got the 9th. The 11th adventure got me the first, import10 becoming import0 it appears. The terminal log showing data received matches the story I'm clicking, but the client in-browser is wrapping around after 9. I have no warnings or errors other than the statement of being a development server and GPU support not found, but I don't think either are at play here. The issue is happening all the way through to import68 being read (far as I can tell) as import8.

Not sure how to get debug logs but as far as I can tell my install was fine.
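The pattern reported above (import25 loading the sixth adventure, import10 loading the first, import68 behaving like import8) is what you would see if only the last character of the id were used as the index. Where this actually happens in KoboldAI is not confirmed here. The two functions below are hypothetical and only illustrate the difference between reading one character and reading the whole numeric suffix.

```python
import re


def index_from_last_char(element_id):
    # Hypothetical buggy parser: keeps only the final character of the id.
    return int(element_id[-1])


def index_from_suffix(element_id):
    # Reads the whole numeric suffix, so multi-digit ids stay intact.
    match = re.fullmatch(r"import(\d+)", element_id)
    if match is None:
        raise ValueError(f"unexpected id: {element_id!r}")
    return int(match.group(1))


for element_id in ("import10", "import25", "import68"):
    print(element_id, index_from_last_char(element_id), index_from_suffix(element_id))
# prints:
# import10 0 10
# import25 5 25
# import68 8 68
```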

Error: "LayerNormKernelImpl" not implemented for 'Half'

Not entirely sure what this means. I've been practicing running the various models and any time I try to use the neo-horni model, I get this error.

Here's logs:

Welcome to the KoboldAI Client!
Select an AI model to continue:

    #   Model                           V/RAM
    =========================================
    1  - GPT Neo 1.3B                   8GB
    2  - GPT Neo 2.7B                   16GB
    3  - GPT-2                          1.2GB
    4  - GPT-2 Med                      2GB
    5  - GPT-2 Large                    16GB
    6  - GPT-2 XL                       16GB
    7  - InferKit API (requires API key)
    8  - Custom Neo   (eg Neo-horni)
    9  - Custom GPT-2 (eg CloverEdition)
    10 - Google Colab
    11 - OpenAI API (requires API key)
    12 - Read Only (No AI)

Model #> 8
Please choose the folder where pytorch_model.bin is located:

Looking for GPU support...NOT FOUND!
Initializing Flask... OK!
Initializing transformers, please wait...
2021-07-05 09:33:21.416244: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
OK! NeoCustom pipeline created!
You may now connect with a browser at http://127.0.0.1:5000/
 * Serving Flask app 'aiserver' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
The WebSocket transport is not available, you must install a WebSocket server that is compatible with your async mode to enable it. See the documentation for details. (further occurrences of this error will be logged with level INFO)
Client connected!
Data recieved:{'cmd': 'submit', 'data': 'Testing'}
Min:2, Max:61, Txt:Testing
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
"LayerNormKernelImpl" not implemented for 'Half'

Memory requirements

Something I've noticed is that the memory requirements for the same AI model seem higher for KoboldAI than for CloverEdition. My system has 16 GB system memory, and 8 GB onboard video memory (with an additional 8 GB shared memory available).

So, for example, using GPT-Neo 2.7b.
System memory usage for Clover Edition loading GPT-Neo 2.7b climbs gradually up to 11.5 GB memory, before dropping back down to about 1.2 GB. Meanwhile, at the same time, GPU memory usage goes up to an additional 6.5 GB used, before finally dropping down to about 5.5 GB overhead once it finishes loading.
When trying to load GPT-Neo 2.7b in KoboldAI, the system memory usage climbs up fairly rapidly to over 12 GB, while the GPU memory doesn't budge. My computer then hangs, going almost completely unresponsive, even the clock not updating at all, though every once in a while (every 30 seconds or so, maybe?) the mouse cursor will move a tick if I am moving the mouse. I end up being unable to even kill the process due to the unresponsiveness and have to power cycle my computer. Presumably this is due to my system memory just getting overloaded.

When I load GPT-2 in KoboldAI, as per my previous Issue, it does noticeably start loading into GPU memory almost immediately, so I'm not sure why that seems to be different from GPT-Neo 2.7b loading in KoboldAI.

From the text on the Clover Edition readme page, they mention using 16-bit instead of 32-bit; is that maybe something that would help in KoboldAI?
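A rough back-of-the-envelope calculation shows why 16-bit versus 32-bit matters here. This is only a sketch: it counts the weights alone, assumes a round 2.7 billion parameters taken from the model name, and ignores activations, framework overhead, and any temporary copies made while loading.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


NEO_2_7B_PARAMS = 2.7e9  # assumed round figure taken from the model name

fp32_gb = weight_memory_gb(NEO_2_7B_PARAMS, 4)  # 32-bit floats, 4 bytes each
fp16_gb = weight_memory_gb(NEO_2_7B_PARAMS, 2)  # 16-bit floats, 2 bytes each
print(f"fp32: {fp32_gb:.1f} GB, fp16: {fp16_gb:.1f} GB")
# prints: fp32: 10.8 GB, fp16: 5.4 GB
```

These estimates are in the same range as the figures observed above (system memory peaking around 11.5 to 12 GB while loading, and roughly 5.5 GB resident on the GPU in Clover Edition), which would be consistent with a full-precision load being the difference. That is an inference from the numbers, not something confirmed from either code base.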

I'll add that I really like the browser interface and additional features of KoboldAI, and really hate the commandline interface of Clover Edition. I'm just hoping that I can end up running GPT-Neo 2.7b on KoboldAI like I can in Clover Edition.

Autofill / templating?

How about extending the model with Jinja-style autofill? Inserting {{templates}} in the text would make it generate output at those positions as well, not just at the end of the input.
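To make the request concrete, here is a minimal sketch of what such autofill could look like. Everything in it is hypothetical: `autofill`, the `generate` callback, and the stand-in `fake_generate` are illustration only, and a real version would call the model with the text before each placeholder as context.

```python
import re

# Matches {{name}} with optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def autofill(text, generate):
    """Replace each {{name}} with generate(name, text_before_placeholder)."""
    parts = []
    cursor = 0
    for match in PLACEHOLDER.finditer(text):
        parts.append(text[cursor:match.start()])
        # Context is everything assembled so far, including earlier fills.
        context = "".join(parts)
        parts.append(generate(match.group(1), context))
        cursor = match.end()
    parts.append(text[cursor:])
    return "".join(parts)


def fake_generate(name, context):
    # Stand-in for a real model call; just echoes the placeholder name.
    return f"<{name}>"


print(autofill("The {{adjective}} knight drew a {{weapon}}.", fake_generate))
# prints: The <adjective> knight drew a <weapon>.
```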

Loadsettings Fails on InferKit

The loadsettings function currently throws a key failure when starting InferKit with no client.settings file present. For the moment, unzip the attached client settings file into your KoboldAI directory until I can get the bug squashed:
clientsettings.zip
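For illustration, a loader can avoid this kind of key failure by starting from defaults and overlaying whatever the file provides. This is a sketch under assumptions, not the project's actual fix: it assumes the settings file is JSON, and the setting names in `DEFAULT_SETTINGS` are hypothetical.

```python
import json
from pathlib import Path

# Hypothetical defaults; the real client's setting names may differ.
DEFAULT_SETTINGS = {"apikey": "", "temperature": 0.5}


def load_settings(path):
    """Return defaults overlaid with whatever the settings file provides."""
    settings = dict(DEFAULT_SETTINGS)
    file = Path(path)
    # A missing file simply leaves the defaults in place.
    if file.is_file():
        with file.open(encoding="utf-8") as handle:
            settings.update(json.load(handle))
    return settings
```

With this shape, a missing file or a file lacking some keys still yields a complete dictionary, so later lookups do not raise `KeyError`.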
