
The simplest way to run LLaMA on your local machine

Home Page: https://cocktailpeanut.github.io/dalai


dalai's Introduction

Dalai

Run LLaMA and Alpaca on your computer.

GitHub Twitter Discord


JUST RUN THIS

TO GET

Both alpaca and llama working on your computer!

alpaca.gif


  1. Powered by llama.cpp, llama-dl CDN, and alpaca.cpp
  2. Hackable web app included
  3. Ships with JavaScript API
  4. Ships with Socket.io API

Intro

1. Cross platform

Dalai runs on all of the following operating systems:

  1. Linux
  2. Mac
  3. Windows

2. Memory Requirements

Runs on most modern computers. Unless your computer is very very old, it should work.

According to a llama.cpp discussion thread, here are the memory requirements:

  • 7B => ~4 GB
  • 13B => ~8 GB
  • 30B => ~16 GB
  • 65B => ~32 GB

3. Disk Space Requirements

Alpaca

Currently 7B and 13B models are available via alpaca.cpp

7B

Alpaca comes fully quantized (compressed), and the only space you need for the 7B model is 4.21GB:

alpaca_7b.png

13B

Alpaca comes fully quantized (compressed), and the only space you need for the 13B model is 8.14GB:

alpaca_13b.png

LLaMA

You need a lot of space for storing the models. The model name must be one of: 7B, 13B, 30B, or 65B.

You do NOT have to install all models; you can install them one by one. Let's take a look at how much space each model takes up:

NOTE

The following numbers assume that you DO NOT touch the original model files and keep BOTH the original model files AND the quantized versions.

You can optimize this if you delete the original models (which are much larger) after installation and keep only the quantized versions.

7B

  • Full: The model takes up 31.17GB
  • Quantized: 4.21GB

7b.png

13B

  • Full: The model takes up 60.21GB
  • Quantized: 4.07GB * 2 = 8.14GB

13b.png

30B

  • Full: The model takes up 150.48GB
  • Quantized: 5.09GB * 4 = 20.36GB

30b.png

65B

  • Full: The model takes up 432.64GB
  • Quantized: 5.11GB * 8 = 40.88GB

65b.png


Quickstart

Docker compose

Requires that you have docker installed and running.

docker compose build
docker compose run dalai npx dalai alpaca install 7B # or a different model
docker compose up -d

This will save the models in the ./models folder.

View the site at http://127.0.0.1:3000/

Mac

Step 1. Install node.js >= 18

If your Mac doesn't have Node.js installed yet, make sure to install Node.js >= 18

Install Node.js

Step 2.1. Install models

Currently supported engines are llama and alpaca.

Add alpaca models

To download alpaca models, you can run:

npx dalai alpaca install 7B

Add llama models

To download llama models, you can run:

npx dalai llama install 7B

or to download multiple models:

npx dalai llama install 7B 13B

Now go to step 3.

Step 2.2. Troubleshoot

Normally you don't need this step, but if the commands above don't do anything and exit immediately, it means something went wrong because some of the required modules are not installed on your system.

In that case, try the following steps:

1. Install homebrew

In case homebrew is not installed on your computer, install it by running:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Or you can find the same instructions on the Homebrew homepage: https://brew.sh/

2. Install dependencies

Once homebrew is installed, install these dependencies:

brew install cmake
brew install pkg-config

3. Update NPM

Just to make sure all the bases are covered, let's update npm as well:

npm install -g npm@latest

Now go back to step 2.1 and try running the npx dalai commands again.

Step 3. Run Web UI

After everything has been installed, run the following command to launch the web UI server:

npx dalai serve

and open http://localhost:3000 in your browser. Have fun!


Windows

Step 1. Install Visual Studio

On Windows, you need to install Visual Studio before installing Dalai.

Press the button below to visit the Visual Studio downloads page and download:

Download Microsoft Visual Studio

IMPORTANT!!!

When installing Visual Studio, make sure to check the 3 options as highlighted below:

  1. Python development
  2. Node.js development
  3. Desktop development with C++

vs.png


Step 2.1. Install models

IMPORTANT

On Windows, make sure to run all commands in cmd.

DO NOT run them in PowerShell. PowerShell has unnecessarily strict permissions and makes the script fail silently.

Currently supported engines are llama and alpaca.

Add alpaca models

To download alpaca models, open your cmd application and enter:

npx dalai alpaca install 7B

Add llama models

To download llama models, open your cmd application and enter:

npx dalai llama install 7B

or to download multiple models:

npx dalai llama install 7B 13B

Step 2.2. Troubleshoot (optional)

In case the above steps fail, try installing Node.js and Python separately.

Install Python:

Download Python

Install Node.js >= 18:

Download Node.js

After both have been installed, open PowerShell and type python to check that it is available, then type node to check the same.

Once you've checked that they both exist, try again.

Step 3. Run Web UI

After everything has been installed, run the following command to launch the web UI server (Make sure to run in cmd and not powershell!):

npx dalai serve

and open http://localhost:3000 in your browser. Have fun!


Linux

Step 1. Install Dependencies

You need to make sure you have the correct version of Python and Node.js installed.

Step 1.1. Python <= 3.10

Download Python

Make sure the version is 3.10 or lower (not 3.11); PyTorch and some other libraries do not yet support the latest Python version.

Step 1.2. Node.js >= 18

Download node.js

Make sure the version is 18 or higher


Step 2.1. Install models

Currently supported engines are llama and alpaca.

Add alpaca models

To download alpaca models, you can run:

npx dalai alpaca install 7B

Add llama models

To download llama models, you can run:

npx dalai llama install 7B

or to download multiple models:

npx dalai llama install 7B 13B

Step 2.2. Troubleshoot

In case the model install silently fails or hangs forever, run the command below for your distribution, then try the npx command again:

On ubuntu/debian/etc.:

sudo apt-get install build-essential python3-venv -y

On fedora/etc.:

sudo dnf install make automake gcc gcc-c++ kernel-devel python3-virtualenv -y

Step 3. Run Web UI

After everything has been installed, run the following command to launch the web UI server:

npx dalai serve

and open http://localhost:3000 in your browser. Have fun!


API

Dalai is also an NPM package, which you can use to:

  1. programmatically install models
  2. make requests to the model locally
  3. run a dalai server (powered by socket.io)
  4. programmatically make requests to a remote dalai server (via socket.io)

Install it with:

npm install dalai
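Before diving into each method, here is a minimal end-to-end sketch using only the constructor, install(), and request() calls documented below (the model name alpaca.7B and the prompt are purely illustrative):

const Dalai = require('dalai')

async function main() {
  // Defaults to storing llama.cpp under ~/llama.cpp (see constructor() below)
  const dalai = new Dalai()

  // Download the alpaca 7B model if you haven't already
  await dalai.install("alpaca", "7B")

  // Stream tokens from the model to stdout
  dalai.request({
    model: "alpaca.7B",
    prompt: "Why is the sky blue?",
  }, (token) => {
    process.stdout.write(token)
  })
}

main()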

1. constructor()

Syntax

const dalai = new Dalai(home)
  • home: (optional) manually specify the llama.cpp folder

By default, Dalai automatically stores the entire llama.cpp repository under ~/llama.cpp.

However, often you may already have a llama.cpp repository somewhere else on your machine and want to just use that folder. In this case you can pass in the home attribute.

Examples

Basic

Creates a workspace at ~/llama.cpp

const dalai = new Dalai()

Custom path

Manually set the llama.cpp path:

const dalai = new Dalai("/Documents/llama.cpp")

2. request()

Syntax

dalai.request(req, callback)
  • req: a request object, made up of the following attributes:
    • prompt: (required) The prompt string
    • model: (required) The model type + model name to query. Takes the following form: <model_type>.<model_name>
      • Example: alpaca.7B, llama.13B, ...
    • url: only needed if connecting to a remote dalai server
      • if unspecified, it uses the node.js API to directly run dalai locally
      • if specified (for example ws://localhost:3000) it looks for a socket.io endpoint at the URL and connects to it.
    • threads: The number of threads to use (The default is 8 if unspecified)
    • n_predict: The number of tokens to return (The default is 128 if unspecified)
    • seed: The seed. The default is -1 (none)
    • top_k
    • top_p
    • repeat_last_n
    • repeat_penalty
    • temp: temperature
    • batch_size: batch size
    • skip_end: by default, every session ends with \n\n<end>, which can be used as a marker to know when the full response has returned. However sometimes you may not want this suffix. Set skip_end: true and the response will no longer end with \n\n<end>
  • callback: the streaming callback function that gets called every time the client gets any token response back from the model

Examples

1. Node.js

Using node.js, you just need to initialize a Dalai object with new Dalai() and then use it.

const Dalai = require('dalai')
new Dalai().request({
  model: "7B",
  prompt: "The following is a conversation between a boy and a girl:",
}, (token) => {
  process.stdout.write(token)
})
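The example above only sets the required attributes. For reference, here is a sketch that exercises more of the optional attributes listed earlier; the values are illustrative, not recommended defaults:

const Dalai = require('dalai')

new Dalai().request({
  model: "alpaca.7B",          // <model_type>.<model_name>
  prompt: "Write a haiku about the ocean:",
  threads: 4,                  // default is 8
  n_predict: 64,               // number of tokens to return (default 128)
  seed: -1,                    // -1 means no fixed seed
  top_k: 40,
  top_p: 0.9,
  repeat_last_n: 64,
  repeat_penalty: 1.3,
  temp: 0.8,                   // temperature
  skip_end: true               // omit the trailing "\n\n<end>" marker
}, (token) => {
  process.stdout.write(token)
})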

2. Non node.js (socket.io)

To make use of this in a browser or any other language, you can use the socket.io API.

Step 1. start a server

First you need to run a Dalai socket server:

// server.js
const Dalai = require('dalai')
new Dalai().serve(3000)     // port 3000

Step 2. connect to the server

Then once the server is running, simply make requests to it by passing the ws://localhost:3000 socket url when initializing the Dalai object:

const Dalai = require("dalai")
new Dalai().request({
  url: "ws://localhost:3000",
  model: "7B",
  prompt: "The following is a conversation between a boy and a girl:",
}, (token) => {
  console.log("token", token)
})

3. serve()

Syntax

Starts a socket.io server on the specified port:

dalai.serve(port)

Examples

const Dalai = require("dalai")
new Dalai().serve(3000)

4. http()

Syntax

Connect dalai to an existing http server instance (Node's built-in http module):

dalai.http(http)
  • http: The http object

Examples

This is useful when you're trying to plug dalai into an existing Node.js web app.

const Dalai = require('dalai');
const app = require('express')();
const http = require('http').Server(app);

const dalai = new Dalai();
dalai.http(http)             // attach dalai's handlers to the existing http server

http.listen(3000, () => {
  console.log("server started")
})

5. install()

Syntax

await dalai.install(model_type, model_name1, model_name2, ...)
  • model_type: the type of model to install. Currently supports:
    • "alpaca"
    • "llama"
  • model_name1, model_name2, ...: the model names to install ("7B", "13B", "30B", "65B", etc.)

Examples

Install Llama "7B" and "13B" models:

const Dalai = require("dalai");
const dalai = new Dalai()
await dalai.install("llama", "7B", "13B")

Install alpaca 7B model:

const Dalai = require("dalai");
const dalai = new Dalai()
await dalai.install("alpaca", "7B")

6. installed()

Returns the array of installed models.

Syntax

const models = await dalai.installed()

Examples

const Dalai = require("dalai");
const dalai = new Dalai()
const models = await dalai.installed()
console.log(models)     // prints ["7B", "13B"]
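Since installed() and install() share the same model naming, the two combine naturally. For example, a small sketch that installs the alpaca 7B model only if it is not already present (the exact strings returned by installed() may vary, so the check below is deliberately loose):

const Dalai = require("dalai")

async function ensureAlpaca7B() {
  const dalai = new Dalai()
  const models = await dalai.installed()           // e.g. ["7B", "13B"]
  if (!models.some((m) => m.includes("7B"))) {     // loose check on the returned names
    await dalai.install("alpaca", "7B")
  }
  return dalai
}

ensureAlpaca7B().then(() => console.log("alpaca 7B ready"))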

FAQ

Using a different home folder

By default Dalai uses your home directory to store everything it needs (~/dalai). However, sometimes you may want to store these files elsewhere.

In this case you can call all CLI methods using the --home flag:

1. Installing models to a custom path

npx dalai llama install 7B --home ~/test_dir

2. Serving from the custom path

npx dalai serve --home ~/test_dir

Updating to the latest

To make sure you update to the latest, first find the latest version at https://www.npmjs.com/package/dalai

Let's say the latest version is 0.3.0. You can then pin that version when invoking dalai through npx (for example, npx dalai@0.3.0 serve); npx will fetch and run the specified release instead of an older cached one.

Staying up to date

Have questions or feedback? Follow the project through the following outlets:

GitHub Twitter Discord


dalai's People

Contributors

ab-smith, anluoridge, arjanaswal, caetano-dev, cdancette, cocktailpeanut, dev-jelly, doocter, eliasvincent, eltociear, francip, itspi3141, jaseunda, keldenl, kou029w, marcuswestin, matbee-eth, metrafonic, mommotexx, rizenfrmtheashes


dalai's Issues

Requests use 7B model when only 65B model is installed

Installed Dalai via "npx dalai llama 65B" to only download the 65B parameter model. This successfully downloaded and quantized.
Once this completed I ran "npx dalai serve" to run the web page.

The web page successfully runs, and from the "models" drop down only 65B is available.

When I enter a request and hit autocomplete nothing happens. Looking at the server output for the query, it lists "model: '7B' " for every request I make. At the very bottom there is another line that states "models: ['65B']" but every request still appears to be attempting to use the 7B model. I believe this may be why they're failing for me

How to serve another model than 7B

Hello, thanks for the awesome work!
Is it possible to run something like npx dalai serve 3000 13B in order to serve the 13B model (or similar) on port 3000?
It seems that there is currently no easy way to do it, or am I missing it?

Any feedback would be much appreciated!

Errors ignored when download fails

downloading checklist.chk 100%[============================================================================>] done
downloading params.json 100%[==============================================================================>] done
Error: abortednsolidated.00.pth   5%[==>                                                                    ] in 107min
    at connResetException (node:internal/errors:717:14)
    at TLSSocket.socketCloseListener (node:_http_client:456:19)
    at TLSSocket.emit (node:events:524:35)
    at node:net:316:12
    at TCP.done (node:_tls_wrap:588:7) {
  code: 'ECONNRESET'
}
downloading consolidated.00.pth 100%[======================================================================>] done
Downloading consolidated.01.pth   1%[                                                                       ] in 2hours

If a download fails, the install continues regardless. This can be hard to notice, so it would be good if it didn't ignore the error.

No Targets found in make

System

Intel-based Mac running Ventura

Error Message

exec: make in /Users/jeremy.bolster/llama.cpp
make
exit
bash-5.2$ make
make: *** No targets specified and no makefile found.  Stop.
bash-5.2$ exit
exit
/Users/me/.npm/_npx/3c737cbb02d79cc9/node_modules/dalai/index.js:153
      throw new Error("running 'make' failed")
            ^

Error: running 'make' failed
    at Dalai.install (/Users/me/.npm/_npx/3c737cbb02d79cc9/node_modules/dalai/index.js:153:13)

Workaround

  1. Run make manually after error message
  2. Remove the make command from the npm cache
  3. Run npx dalai llama to continue installing

Installation - SyntaxError: Unexpected token '?'

$ npx dalai llama
Need to install the following packages:
  dalai
Ok to proceed? (y) y
npm WARN EBADENGINE Unsupported engine {
npm WARN EBADENGINE   package: '[email protected]',
npm WARN EBADENGINE   required: { node: '>=16.13.0' },
npm WARN EBADENGINE   current: { node: 'v12.22.9', npm: '8.5.1' }
npm WARN EBADENGINE }
npm WARN EBADENGINE Unsupported engine {
npm WARN EBADENGINE   package: '[email protected]',
npm WARN EBADENGINE   required: { node: '>=16.13.0' },
npm WARN EBADENGINE   current: { node: 'v12.22.9', npm: '8.5.1' }
npm WARN EBADENGINE }
npm WARN EBADENGINE Unsupported engine {
npm WARN EBADENGINE   package: '[email protected]',
npm WARN EBADENGINE   required: { node: '>=14.15.0' },
npm WARN EBADENGINE   current: { node: 'v12.22.9', npm: '8.5.1' }
npm WARN EBADENGINE }
/home/gbajson/.npm/_npx/3c737cbb02d79cc9/node_modules/string-kit/lib/unicode.js:237
                return emojiWidthLookup.get( code ) ?? 2 ;
                                                     ^

SyntaxError: Unexpected token '?'
    at wrapSafe (internal/modules/cjs/loader.js:915:16)
    at Module._compile (internal/modules/cjs/loader.js:963:27)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1027:10)
    at Module.load (internal/modules/cjs/loader.js:863:32)
    at Function.Module._load (internal/modules/cjs/loader.js:708:14)
    at Module.require (internal/modules/cjs/loader.js:887:19)
    at require (internal/modules/cjs/helpers.js:74:18)
    at Object.<anonymous> (/home/gbajson/.npm/_npx/3c737cbb02d79cc9/node_modules/string-kit/lib/string.js:54:13)
    at Module._compile (internal/modules/cjs/loader.js:999:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1027:10)
(⎈|concierge-dev-02:dev-grzeg) (main) gbajson@zauek:~$ npx dalai llama 7B 13B 30B 65B
/home/gbajson/.npm/_npx/3c737cbb02d79cc9/node_modules/string-kit/lib/unicode.js:237
                return emojiWidthLookup.get( code ) ?? 2 ;
                                                     ^

SyntaxError: Unexpected token '?'
    at wrapSafe (internal/modules/cjs/loader.js:915:16)
    at Module._compile (internal/modules/cjs/loader.js:963:27)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1027:10)
    at Module.load (internal/modules/cjs/loader.js:863:32)
    at Function.Module._load (internal/modules/cjs/loader.js:708:14)
    at Module.require (internal/modules/cjs/loader.js:887:19)
    at require (internal/modules/cjs/helpers.js:74:18)
    at Object.<anonymous> (/home/gbajson/.npm/_npx/3c737cbb02d79cc9/node_modules/string-kit/lib/string.js:54:13)
    at Module._compile (internal/modules/cjs/loader.js:999:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1027:10)

"Failed to quantize model" with every model but the 7B one

With the previous version I successfully downloaded the 65B model and quantized it. With the latest version I get this error with every model but the 7B one:
bash-3.2$ ./quantize ./models/13B/ggml-model-f16.bin.1 ./models/13B/ggml-model-q4_0.bin.1 2
llama_model_quantize: loading model from './models/13B/ggml-model-f16.bin.1'
llama_model_quantize: failed to open './models/13B/ggml-model-f16.bin.1' for reading
main: failed to quantize model from './models/13B/ggml-model-f16.bin.1'

In the models folder the only file with this kind of name present is ggml-model-f16.bin, without the numbers in the end, and this one is quantized without errors, so it seems that the issue is that the script does not create these successive ggml-model-f16.bin.1, .2, .3 files and so on.

I am using a MacBook Pro with M1 Pro and 16GB of RAM, macOS v13.2.1

npx dalai serve not working, no error output (Ubuntu 18.04, [email protected])

Hello,

Thank you for making this!

I was just trying it out on a Ubuntu 18.04 server, and the 2nd command npx dalai serve didn't seem to work on my server (I've run npx dalai llama).

Some command history and outputs:

(base) ➜  ~ npx dalai serve
Need to install the following packages:
  [email protected]
Ok to proceed? (y) 
(base) ➜  ~ lsof -i :3000
(base) ➜  ~ sudo netstat -plnt | grep :3000

(base) ➜  ~ node -v # node is installed via snap
v18.15.0
(base) ➜  ~ npm -v
9.5.0
(base) ➜  ~ 

Screenshot

CleanShot-2023-03-13T10-54-32@2x

Document System Requirements

Firstly, this looks super helpful, thank you :)

Given my limited compute (only a relatively dated CPU), could you document system requirements.

  • How much RAM is necessary? How much is recommended?
  • Is there a minimum CPU requirement (or architecture/GPU...)? (I assume this could work on x86, what about aarch64?)
  • At the minimum specs, what sort of speed token generation is to be expected?

Even a simple list of tokens per second on various metal configurations would be helpful. Thanks!

npx dalai llama: FileNotFoundError: [Errno 2] No such file or directory: 'models/7B//consolidated.00.pth'

When running npx dalai llama the following error is found:

  • it downloads the files correctly (consolidated.00.pth)
  • tries to execute python3 convert-pth-to-ggml.py models/7B/ 1
  • cannot find file FileNotFoundError: [Errno 2] No such file or directory: 'models/7B//consolidated.00.pth'
downloading consolidated.00.pth 100%[======================================================================>] done      
downloading tokenizer_checklist.chk 100%[==================================================================>] done      
downloading tokenizer.model 100%[==========================================================================>] done      
python3 convert-pth-to-ggml.py models/7B/ 1
exit

The default interactive shell is now zsh.
To update your account to use zsh, please run `chsh -s /bin/zsh`.
For more details, please visit https://support.apple.com/kb/HT208050.
bash-3.2$ python3 convert-pth-to-ggml.py models/7B/ 1
{'dim': 4096, 'multiple_of': 256, 'n_heads': 32, 'n_layers': 32, 'norm_eps': 1e-06, 'vocab_size': 32000}
n_parts =  1
Processing part  0
Traceback (most recent call last):
  File "/Users/miguel_lemos/dalai/convert-pth-to-ggml.py", line 89, in <module>
    model = torch.load(fname_model, map_location="cpu")
  File "/Users/miguel_lemos/Library/Python/3.9/lib/python/site-packages/torch/serialization.py", line 771, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/Users/miguel_lemos/Library/Python/3.9/lib/python/site-packages/torch/serialization.py", line 270, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/Users/miguel_lemos/Library/Python/3.9/lib/python/site-packages/torch/serialization.py", line 251, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'models/7B//consolidated.00.pth'

Hardware Overview:

Model Name: MacBook Pro
Model Identifier: Mac14,6
Model Number: Z176000J2KS/A
Chip: Apple M2 Max
Total Number of Cores: 12 (8 performance and 4 efficiency)
Memory: 64 GB
System Firmware Version: 8419.80.7
OS Loader Version: 8419.80.7
Serial Number (system): VC41C9DXRY
Hardware UUID: 5E24912F-6D1E-5098-81AE-841CE11FB23F
Provisioning UDID: 00006021-0014199E0187401E
Activation Lock Status: Enabled

Preserve newlines and other formatting

When using dalai, it strips newlines among other things. I believe this is so that it works in shell (not that you can't pass arguments with newlines, just needs quoting).

I propose the following:

  • store the prompt in a file (or use pipes for Unix) and use llama.cpp's -f instead of -p

This has the advantage of not needing to worry about escaping/sanitizing user input and will fix other issues I've observed, like:

  • it not being able to parse prompts with `
  • prompts with $, etc.

Ubuntu 20.04 error: Unexpected token '?'

On Ubuntu 20.04 I get the following warnings and error when trying to run npx dalai llama:

(base) riccardo@riccardo-Aspire-A317-51G:~$ npx dalai llama
../src/unix/pty.cc: In function ‘void pty_after_waitpid(uv_async_t*)’:
../src/unix/pty.cc:512:43: warning: ‘void* memset(void*, int, size_t)’ writing to an object of type ‘class Nan::Persistent<v8::Function>’ with no trivial copy-assignment [-Wclass-memaccess]
  512 |   memset(&baton->cb, -1, sizeof(baton->cb));
      |                                           ^
In file included from ../../nan/nan.h:409,
                 from ../src/unix/pty.cc:20:
../../nan/nan_persistent_12_inl.h:12:40: note: ‘class Nan::Persistent<v8::Function>’ declared here
   12 | template<typename T, typename M> class Persistent :
      |                                        ^~~~~~~~~~
In file included from ../../nan/nan.h:60,
                 from ../src/unix/pty.cc:20:
../src/unix/pty.cc: At global scope:
/home/riccardo/.cache/node-gyp/12.22.12/include/node/node.h:736:43: warning: cast between incompatible function types from ‘void (*)(Nan::ADDON_REGISTER_FUNCTION_ARGS_TYPE)’ {aka ‘void (*)(v8::Local<v8::Object>)’} to ‘node::addon_register_func’ {aka ‘void (*)(v8::Local<v8::Object>, v8::Local<v8::Value>, void*)’} [-Wcast-function-type]
  736 |       (node::addon_register_func) (regfunc),                          \
      |                                           ^
/home/riccardo/.cache/node-gyp/12.22.12/include/node/node.h:770:3: note: in expansion of macro ‘NODE_MODULE_X’
  770 |   NODE_MODULE_X(modname, regfunc, NULL, 0)  // NOLINT (readability/null_usage)
      |   ^~~~~~~~~~~~~
../src/unix/pty.cc:734:1: note: in expansion of macro ‘NODE_MODULE’
  734 | NODE_MODULE(pty, init)
      | ^~~~~~~~~~~
Unexpected token '?'

Node and npx versions:

(base) riccardo@riccardo-Aspire-A317-51G:~$ node -v
v12.22.12
(base) riccardo@riccardo-Aspire-A317-51G:~$ npx -v
6.14.16

main: failed to quantize model from './models/7B/ggml-model-f16.bin'

It looks like this assumes numpy is installed but does not install it if it is not. I don't notice a mention of needing numpy to be installed before running in the README. Maybe a doc update and check / exit if not installed would be sufficient.

This does try to install numpy and torch but if that fails for some reason the build continues and then fails at the end.

The server also starts without error in this case... but has no response to any prompts (error or output).

1st Error - This should probably stop the build when it fails

pip install torch torchvision torchaudio sentencepiece numpy
exit

The default interactive shell is now zsh.
To update your account to use zsh, please run `chsh -s /bin/zsh`.
For more details, please visit https://support.apple.com/kb/HT208050.
bash-3.2$ pip install torch torchvision torchaudio sentencepiece numpy
ERROR: Could not find a version that satisfies the requirement torch (from versions: none)
ERROR: No matching distribution found for torch

2nd Error

bash-3.2$ python3 convert-pth-to-ggml.py models/7B/ 1
Traceback (most recent call last):
  File "/Users/huntharo/dalai/convert-pth-to-ggml.py", line 23, in <module>
    import numpy as np
ModuleNotFoundError: No module named 'numpy'

Clean Up?

Thanks for making this!

I'm done messing around for now, and want to clean up. I found ~/llama.cpp; are there any other files I should remove from my system?

GPU usage

Hi there.
Thanks for sharing this project.

Does this use the GPU? or just the CPU?

mention size in readme?

idk how much disk space is required.

would love to know to see if i can actually try it or not?

i don't understand 4b, 7b, etc... so not sure of the size either.

'Make" not recognized

Getting the error below, continues running after displaying error, but when it finishes and I open local and enter prompt, getting no response

make : The term 'make' is not recognized as the name of a cmdlet, function,
script file, or operable program.

[Error?] No installation started, no downloads started after installing Node.js and running npx dalai llama 65B

image
It seems to start the install as I see some IdealTree output but then it just exits. No further explanation, no errors, nothing installed, nothing downloaded.
image

I am on Windows 10 and just installed Node.js 18.15.0 and npm recommended upgrading to 9.6.1 but aside from that I've seen this do nothing.

It's worth noting that the same thing happens when running only npx dalai llama instead of specifying a model

Workaround for `pip3 install torch` fails on MacOS 12 on Intel CPU with Python 3.11

Issue: pip3 install torch fails on MacOS 12 on Intel CPU with Python 3.11

Environment

  • System Version: macOS 12.6.3
  • Processor Name: Dual-Core Intel Core i5
  • Python version: 3.11.0

Error

dalai/index.js

Line 80 in 07412c4

await this.exec("pip3 install torch torchvision torchaudio sentencepiece numpy")

When attempting to install torch, torchvision, torchaudio, sentencepiece, and numpy using pip3 install command, the following error occurs:

ERROR: Could not find a version that satisfies the requirement torch (from versions: none)
ERROR: No matching distribution found for torch

pip3 install falls back to pip install, which may point to a Python 2 version of pip, leading to further issues.

Solution

The issue is related to torch, Python 3.11 is not supported for macOS 12 on Intel CPU.

An easy workaround is to use Python 3.10 and invoke the npx dalai llama command within a virtualenv.

To do so, follow these steps:

  1. Create a virtualenv using Python 3.10:

mkdir dalai-venv-py310
cd dalai-venv-py310
pip3 install virtualenv
virtualenv venv --python /usr/local/Cellar/python@3.10/3.10.10_1/bin/python3.10
source venv/bin/activate
pip3 --version

  2. Run npx dalai llama inside the virtualenv:

npx dalai llama

torch will be installed successfully, and the installation process will continue.

Error: abortedecklist.chk

make: Nothing to be done for 'default'.
root@ubuntu-s-4vcpu-8gb-intel-sgp1-01:~/llama.cpp# exit
exit
Download model 7B

^[[48;1Ring checklist.chk   0% /                                                                                 downloading checklist.chk 100%[======================================================================>] done     Error: abortedecklist.chk 100%[======================================================================>] done         at connResetException (node:internal/errors:704:14)
    at TLSSocket.socketCloseListener (node:_http_client:441:19)
    at TLSSocket.emit (node:events:525:35)
    at node:net:758:14
    at TCP.done (node:_tls_wrap:583:7) {
  code: 'ECONNRESET'
}



exec: /root/llama.cpp/venv/bin/python convert-pth-to-ggml.py models/7B/ 1 in /root/llama.cpp
/root/llama.cpp/venv/bin/python convert-pth-to-ggml.py models/7B/ 1
exit
root@ubuntu-s-4vcpu-8gb-intel-sgp1-01:~/llama.cpp# /root/llama.cpp/venv/bin/python convert-pth-to-ggml.py models/7B/ 1
{'dim': 4096, 'multiple_of': 256, 'n_heads': 32, 'n_layers': 32, 'norm_eps': 1e-06, 'vocab_size': 32000}
n_parts =  1
Processing part  0
Traceback (most recent call last):
  File "/root/llama.cpp/convert-pth-to-ggml.py", line 88, in <module>
    model = torch.load(fname_model, map_location="cpu")
  File "/root/llama.cpp/venv/lib/python3.10/site-packages/torch/serialization.py", line 791, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/root/llama.cpp/venv/lib/python3.10/site-packages/torch/serialization.py", line 271, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/root/llama.cpp/venv/lib/python3.10/site-packages/torch/serialization.py", line 252, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'models/7B//consolidated.00.pth'

File does not exist when

Hi, I tried running

npx dalai llama
npx dalai serve

And I get this when I try to write anything in the textbox. Is there a setup step missing? There is indeed no ggml-model-q4_0.bin file in the 7B folder.

image

Improvement: Dependency Check

I wasn't able to install dalai because I was using npm 9.4. After updating to latest (9.6.1) it works fine.

Adding a check or error-message would be great as it wasn't clear why the installation was stuck.

AssertionError when running npx dalai llama

Running npx dalai llama results in the following output:

ERROR: Exception:
Traceback (most recent call last):
  File "/home/null/llama.cpp/venv/lib/python3.10/site-packages/pip/_internal/cli/base_command.py", line 165, in exc_logging_wrapper
    status = run_func(*args)
  File "/home/null/llama.cpp/venv/lib/python3.10/site-packages/pip/_internal/cli/req_command.py", line 205, in wrapper
    return func(self, options, args)
  File "/home/null/llama.cpp/venv/lib/python3.10/site-packages/pip/_internal/commands/install.py", line 389, in run
    to_install = resolver.get_installation_order(requirement_set)
  File "/home/null/llama.cpp/venv/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 188, in get_installation_order
    weights = get_topological_weights(
  File "/home/null/llama.cpp/venv/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 276, in get_topological_weights
    assert len(weights) == expected_node_count
AssertionError

Any way to fix this? This is perhaps not the intended behavior.

No output after submitting prompt

No errors whatsoever occurred while installing and setting up dalai
But there's no output after submitting the prompt on the Web UI.

7B Model:
Screenshot 2023-03-13 at 10 10 48 PM

Terminal:
Screenshot 2023-03-13 at 9 56 52 PM

Browser:
There seems to be a warning message associated with socket.io
Screenshot 2023-03-13 at 9 54 16 PM

(Running on M1 MacBook Air)

make ggml.c fails with implicit declaration of function

Getting this when running npx dalai llama:

jamesyu:~/llama.cpp(master) $ make
I llama.cpp build info: 
I UNAME_S:  Darwin
I UNAME_P:  arm
I UNAME_M:  arm64
I CFLAGS:   -I.              -O3 -DNDEBUG -std=c11   -fPIC -pthread -DGGML_USE_ACCELERATE
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread
I LDFLAGS:   -framework Accelerate
I CC:       Apple clang version 12.0.0 (clang-1200.0.32.29)
I CXX:      Apple clang version 12.0.0 (clang-1200.0.32.29)

cc  -I.              -O3 -DNDEBUG -std=c11   -fPIC -pthread -DGGML_USE_ACCELERATE   -c ggml.c -o ggml.o
ggml.c:1364:25: error: implicit declaration of function 'vdotq_s32' is invalid
      in C99 [-Werror,-Wimplicit-function-declaration]
        int32x4_t p_0 = vdotq_s32(vdupq_n_s32(0), v0_0ls, v1_0ls);
                        ^
ggml.c:1364:19: error: initializing 'int32x4_t' (vector of 4 'int32_t' values)
      with an expression of incompatible type 'int'
        int32x4_t p_0 = vdotq_s32(vdupq_n_s32(0), v0_0ls, v1_0ls);
                  ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml.c:1365:19: error: initializing 'int32x4_t' (vector of 4 'int32_t' values)
      with an expression of incompatible type 'int'
        int32x4_t p_1 = vdotq_s32(vdupq_n_s32(0), v0_1ls, v1_1ls);
                  ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3 errors generated.
make: *** [ggml.o] Error 1
jamesyu:~/llama.cpp(master) $ exit
exit
/Users/jamesyu/.npm/_npx/3c737cbb02d79cc9/node_modules/dalai/index.js:153
      throw new Error("running 'make' failed")
            ^

Error: running 'make' failed
    at Dalai.install (/Users/jamesyu/.npm/_npx/3c737cbb02d79cc9/node_modules/dalai/index.js:153:13)

Node.js v18.14.1

Any ideas?

Memory used all up when downloading model

When downloading the model it seems like it wants to load the whole file into RAM before writing it to the disk. Could the download method be changed, so that it gradually downloads and saves the file, without putting it completely to memory.

This is how it looks like on my machine:

grafik

while downloading the model:
grafik

The memory consumption grew gradually during the download till it stops working because the memory is full.

Error: Cannot find module 'node-pty'

Running npx dalai llama just spits this error at me.

node:internal/modules/cjs/loader:1078
  throw err;
  ^

Error: Cannot find module 'node-pty'
Require stack:
- C:\Llama\dalai\index.js
- C:\Llama\dalai\bin\cli.js
    at Module._resolveFilename (node:internal/modules/cjs/loader:1075:15)
    at Module._load (node:internal/modules/cjs/loader:920:27)
    at Module.require (node:internal/modules/cjs/loader:1141:19)
    at require (node:internal/modules/cjs/helpers:110:18)
    at Object.<anonymous> (C:\Llama\dalai\index.js:2:13)
    at Module._compile (node:internal/modules/cjs/loader:1254:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1308:10)
    at Module.load (node:internal/modules/cjs/loader:1117:32)
    at Module._load (node:internal/modules/cjs/loader:958:12)
    at Module.require (node:internal/modules/cjs/loader:1141:19) {
  code: 'MODULE_NOT_FOUND',
  requireStack: [ 'C:\\Llama\\dalai\\index.js', 'C:\\Llama\\dalai\\bin\\cli.js' ]
}

Node.js v18.14.2

Unable to install LLaMa on Windows 11 using Node.js 18.15.0 and npm 9.6.1

Description

I'm trying to install LLaMa on my Windows 11 machine using Node.js 18.15.0 and npm 9.6.1 installed through Volta. However, when I run the command npx dalai llama, I receive the following error:

Need to install the following packages:
  [email protected]
Ok to proceed? (y)
npm WARN cleanup Failed to remove some directories [
npm WARN cleanup   [
npm WARN cleanup     'C:\\Users\\sky10\\AppData\\Local\\npm-cache\\_npx\\3c737cbb02d79cc9\\node_modules',
npm WARN cleanup     [Error: EPERM: operation not permitted, rmdir 'C:\Users\sky10\AppData\Local\npm-cache\_npx\3c737cbb02d79cc9\node_modules\engine.io-client\node_modules'] {
npm WARN cleanup       errno: -4048,
npm WARN cleanup       code: 'EPERM',
npm WARN cleanup       syscall: 'rmdir',
npm WARN cleanup       path: 'C:\\Users\\sky10\\AppData\\Local\\npm-cache\\_npx\\3c737cbb02d79cc9\\node_modules\\engine.io-client\\node_modules'
npm WARN cleanup     }
npm WARN cleanup   ]
npm WARN cleanup ]

I have tried running the command with administrator privileges, as well as manually deleting the directory mentioned in the error message, but the issue persists. I have also tried deleting the node_modules folder in my project directory and reinstalling, but that did not solve the problem either.

Can someone please help me resolve this issue so I can successfully install and use LLaMa?

Environment:

  • Operating system: Windows 11
  • Node.js version: 18.15.0 (installed through Volta)
  • npm version: 9.6.1

Thanks in advance for your help!

`npx dalai install 65B` hangs at checkpoint download

It reaches

downloading checklist.chk 100%[============================================================================>] done

I don't think it actually downloads the model / any checkpoint; it just hung immediately at 100%.

This is on Ubuntu 22.10, npm 8.18, node 18.7

Unexpected token ?

Error when installing any of the versions in ubuntu 20.04 LTS, it doesn't matter whether is the old version or the new one:
../src/unix/pty.cc: In function ‘void pty_after_waitpid(uv_async_t*)’: ../src/unix/pty.cc:512:43: warning: ‘void* memset(void*, int, size_t)’ writing to an object of type ‘class Nan::Persistent<v8::Function>’ with no trivial copy-assignment [-Wclass-memaccess] 512 | memset(&baton->cb, -1, sizeof(baton->cb)); | ^ In file included from ../../nan/nan.h:409, from ../src/unix/pty.cc:20: ../../nan/nan_persistent_12_inl.h:12:40: note: ‘class Nan::Persistent<v8::Function>’ declared here 12 | template<typename T, typename M> class Persistent : | ^~~~~~~~~~ In file included from ../../nan/nan.h:60, from ../src/unix/pty.cc:20: ../src/unix/pty.cc: At global scope: /usr/include/nodejs/src/node.h:573:43: warning: cast between incompatible function types from ‘void (*)(Nan::ADDON_REGISTER_FUNCTION_ARGS_TYPE)’ {aka ‘void (*)(v8::Local<v8::Object>)’} to ‘node::addon_register_func’ {aka ‘void (*)(v8::Local<v8::Object>, v8::Local<v8::Value>, void*)’} [-Wcast-function-type] 573 | (node::addon_register_func) (regfunc), \ | ^ /usr/include/nodejs/src/node.h:607:3: note: in expansion of macro ‘NODE_MODULE_X’ 607 | NODE_MODULE_X(modname, regfunc, NULL, 0) // NOLINT (readability/null_usage) | ^~~~~~~~~~~~~ ../src/unix/pty.cc:734:1: note: in expansion of macro ‘NODE_MODULE’ 734 | NODE_MODULE(pty, init) | ^~~~~~~~~~~ In file included from /usr/include/nodejs/src/node.h:63, from ../../nan/nan.h:60, from ../src/unix/pty.cc:20: /usr/include/nodejs/deps/v8/include/v8.h: In instantiation of ‘void v8::PersistentBase<T>::SetWeak(P*, typename v8::WeakCallbackInfo<P>::Callback, v8::WeakCallbackType) [with P = node::ObjectWrap; T = v8::Object; typename v8::WeakCallbackInfo<P>::Callback = void (*)(const v8::WeakCallbackInfo<node::ObjectWrap>&)]’: /usr/include/nodejs/src/node_object_wrap.h:84:78: required from here /usr/include/nodejs/deps/v8/include/v8.h:9502:16: warning: cast between incompatible function types from ‘v8::WeakCallbackInfo<node::ObjectWrap>::Callback’ {aka ‘void (*)(const v8::WeakCallbackInfo<node::ObjectWrap>&)’} to ‘Callback’ {aka ‘void (*)(const v8::WeakCallbackInfo<void>&)’} [-Wcast-function-type] 9502 | reinterpret_cast<Callback>(callback), type); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /usr/include/nodejs/deps/v8/include/v8.h: In instantiation of ‘void v8::PersistentBase<T>::SetWeak(P*, typename v8::WeakCallbackInfo<P>::Callback, v8::WeakCallbackType) [with P = Nan::ObjectWrap; T = v8::Object; typename v8::WeakCallbackInfo<P>::Callback = void (*)(const v8::WeakCallbackInfo<Nan::ObjectWrap>&)]’: ../../nan/nan_object_wrap.h:65:61: required from here /usr/include/nodejs/deps/v8/include/v8.h:9502:16: warning: cast between incompatible function types from ‘v8::WeakCallbackInfo<Nan::ObjectWrap>::Callback’ {aka ‘void (*)(const v8::WeakCallbackInfo<Nan::ObjectWrap>&)’} to ‘Callback’ {aka ‘void (*)(const v8::WeakCallbackInfo<void>&)’} [-Wcast-function-type] Unexpected token ?

Error with current Python

I have a Python 3.11 installed by default. It seems that pytorch does not work, this is the output of npx dalai llama:

pip3 install torch torchvision torchaudio sentencepiece numpy
ERROR: Could not find a version that satisfies the requirement torch (from versions: none)
ERROR: No matching distribution found for torch

I think the reason is this:

$ pip3 --version
pip 23.0.1 from /opt/homebrew/lib/python3.11/site-packages/pip (python 3.11)
$ pip3.10 --version
pip 23.0.1 from /opt/homebrew/lib/python3.10/site-packages/pip (python 3.10)

I think the install package should check for Python version and run 3.10, because 3.11 is now installed by default with homebrew.

Doubts about system requirements

Hi, what kind of computational requirements are needed to run this thing effectively? My system has integrated Intel Iris Xe graphics, so no NVIDIA GPU.

Error: Cannot find module '../build/Debug/pty.node' or '../build/Release/pty.node'

I'm encountering an issue when trying to use the dalai package on my M2 MacBook Air with Node v18.15.0 (installed using nvm).

When running npx dalai llama, I receive the following error:

innerError Error: Cannot find module '../build/Debug/pty.node'

Error: Cannot find module '../build/Release/pty.node'

I have tried the following steps to resolve the issue:

  1. Installed the latest version of Node (v19.7.0), but the issue persisted.
  2. Switched back to Node v18.15.0 (LTS) using nvm.
  3. Reinstalled the dalai package.

Unfortunately, none of these steps resolved the issue. I am still encountering the same error when trying to use dalai.

It seems there might be a compatibility issue with the node-pty module, or some other dependency/environment setting that I am missing. Any guidance on resolving this issue would be greatly appreciated.

Environment details:

  • OS: macOS 13.3 (M2 MacBook)
  • Node: v18.15.0 (LTS)
  • XCode installed

Thank you for your assistance.

Autocomplete not working due to Socket.io issues

Description

Running the server with npx dalai serve and clicking "autocomplete" doesn't work

Expected Behavior

Clicking autocomplete after typing in text leads to nothing happening.

Actual Behavior

2023-03-12 17 18 44

Nothing renders on page. There's a suspicious error in console about socket.io
Console error:
Failed to load resource: the server responded with a status of 404 (Not Found) http://localhost:3000/socket.io.min.js.map

Possible Solution


Steps to Reproduce

  1. npx dalai llama
  2. npx dalai serve
  3. Try to use autocomplete (and fail).

Additional Information

Environment

Running on an M1 MBP. Tried Safari and Chrome on localhost:3000

  • OS: MacOS Ventura
  • Browser: Safari/Chrome
  • Version:

Screenshots

image

Related Issues

Labels

  • bug
