
Comments (7)

snexus commented on August 23, 2024

It is the absolute number of layers and depends on the model architecture. When the model is loaded, in this case using llamacpp, you can see it in the log (see the screenshot attached).

So in the example below, the model consists of 43 layers, and 15 were offloaded to the GPU. You can then check VRAM usage and adjust n_gpu_layers accordingly. You may need more memory than currently stated, depending on the context length and the embedding model used (which also requires GPU memory in most cases).

(screenshot: llamacpp load log showing 43 layers, 15 offloaded to GPU)
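The layer counts can also be pulled out of a captured load log with a small script. This is a sketch; the exact log wording shown here is an assumption, as it varies between llamacpp versions:

```python
import re

def parse_offload(log_text):
    """Extract (offloaded, total) GPU layer counts from a llamacpp load log.

    Assumes a line like 'llm_load_tensors: offloaded 15/43 layers to GPU';
    the exact wording differs between llamacpp versions.
    """
    m = re.search(r"offloaded (\d+)/(\d+) layers to GPU", log_text)
    if not m:
        return None  # no offload line found - likely a CPU-only build
    return int(m.group(1)), int(m.group(2))

log = "llm_load_tensors: offloaded 15/43 layers to GPU"
print(parse_offload(log))  # (15, 43)
```

A `None` result is a quick hint that the build has no GPU support at all, which matches the symptom discussed later in this thread.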

from llm-search.

snexus commented on August 23, 2024

I've created a demo notebook showing how to run it on Google Colab (free tier): https://github.com/snexus/llm-search/blob/main/notebooks/llmsearch_google_colab_demo.ipynb


ziptron commented on August 23, 2024

Wow thanks so much! I tried this out this morning and it works well! I may not have been setting the variables (below) correctly, or at all to be honest.

%env CMAKE_ARGS="-DLLAMA_CUBLAS=on"

%env FORCE_CMAKE=1
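One possible reason these lines had no effect (an assumption about this setup): with IPython's `%env VAR="value"`, the quotes can end up inside the stored value itself. Setting the variables from Python sidesteps that:

```python
import os

# Set the build flags from Python instead of %env, so no stray quotes
# become part of the value.
os.environ["CMAKE_ARGS"] = "-DLLAMA_CUBLAS=on"
os.environ["FORCE_CMAKE"] = "1"

# Then rebuild from source in a later cell so the flags take effect:
# !pip install llama-cpp-python --force-reinstall --no-cache-dir
print(os.environ["CMAKE_ARGS"])  # -DLLAMA_CUBLAS=on
```

The `--force-reinstall --no-cache-dir` flags matter because a cached CPU-only wheel would otherwise be reused without rebuilding.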

Thanks for making this project and for your help.


snexus commented on August 23, 2024

Hi,

I've never tried to run it on Google Colab. 15GB should be enough for this model - I can run it locally on a 10GB VRAM card (with half of the layers offloaded to the CPU).
If you are still stuck, do you mind posting the model section of your config.yaml so I can try to reproduce it?


ziptron commented on August 23, 2024

Thanks for responding. I do think this may be a Colab issue, so I'll keep trying today and post results later.

By the way, stupid question, but how do you know how many "layers" there are? I've been fiddling with the n_gpu_layers parameter, but I can't quite understand what it means. Does 50 mean 50% (half), or is it an absolute number of layers? If you could point me towards some info on that, I'd much appreciate it.

Thanks!


ziptron commented on August 23, 2024

This screenshot made me realize that I am not offloading anything to the GPU. See mine below.

I had some errors while installing (see below). Do you think I should try to resolve these errors? Or is there a different way to diagnose why I'm not offloading to the GPU?

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires requests==2.27.1, but you have requests 2.29.0 which is incompatible.
tensorflow 2.12.0 requires protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.20.3, but you have protobuf 3.20.2 which is incompatible.
tensorflow-metadata 1.13.1 requires protobuf<5,>=3.20.3, but you have protobuf 3.20.2 which is incompatible.
torchaudio 2.0.2+cu118 requires torch==2.0.1, but you have torch 2.0.0 which is incompatible.
torchdata 0.6.1 requires torch==2.0.1, but you have torch 2.0.0 which is incompatible.
torchtext 0.15.2 requires torch==2.0.1, but you have torch 2.0.0 which is incompatible.
Successfully installed InstructorEmbedding-1.0.1 XlsxWriter-3.1.2 accelerate-0.19.0 argilla-1.13.3 auto-gptq-0.3.0 backoff-2.2.1 bitsandbytes-0.41.0 chromadb-0.3.26 clickhouse-connect-0.6.8 coloredlogs-15.0.1 cryptography-41.0.2 dataclasses-json-0.5.14 datasets-2.14.2 deprecated-1.2.14 dill-0.3.7 diskcache-5.6.1 einops-0.6.1 fastapi-0.95.1 filetype-1.2.0 gitdb-4.0.10 gitpython-3.1.32 h11-0.14.0 hnswlib-0.7.0 httpcore-0.16.3 httptools-0.6.0 httpx-0.23.3 huggingface-hub-0.16.4 humanfriendly-10.0 langchain-0.0.219 langchainplus-sdk-0.0.20 llama-cpp-python-0.1.77 llama-index-0.6.9 llmsearch-0.1.dev74+g7207a16.d20230801 loguru-0.7.0 lz4-4.3.2 marshmallow-3.20.1 monotonic-1.6 msg-parser-1.2.0 multiprocess-0.70.15 mypy-extensions-1.0.0 nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-cupti-cu11-11.7.101 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 nvidia-cufft-cu11-10.9.0.58 nvidia-curand-cu11-10.2.10.91 nvidia-cusolver-cu11-11.4.0.1 nvidia-cusparse-cu11-11.7.4.91 nvidia-nccl-cu11-2.14.3 nvidia-nvtx-cu11-11.7.91 olefile-0.46 onnxruntime-1.15.1 openai-0.27.8 openapi-schema-pydantic-1.2.4 overrides-7.3.1 pdf2image-1.16.3 pdfminer.six-20221105 peft-0.4.0 posthog-3.0.1 protobuf-3.20.2 pulsar-client-3.2.0 pydeck-0.8.1b0 pympler-1.0.1 pymupdf-1.22.5 pypandoc-1.11 pypdf2-3.0.1 python-docx-0.8.11 python-dotenv-1.0.0 python-magic-0.4.27 python-pptx-0.6.21 pytz-deprecation-shim-0.1.0.post0 requests-2.29.0 rfc3986-1.5.0 rouge-1.0.1 safetensors-0.3.1 sentence-transformers-2.2.2 sentencepiece-0.1.99 smmap-5.0.0 sqlalchemy-1.4.48 starlette-0.26.1 streamlit-1.24.1 threadpoolctl-3.1.0 tiktoken-0.3.3 tokenizers-0.13.3 torch-2.0.0 torchvision-0.15.1 transformers-4.29.2 typer-0.7.0 typing-inspect-0.9.0 tzdata-2023.3 tzlocal-4.3.1 unstructured-0.7.8 uvicorn-0.23.2 uvloop-0.17.0 validators-0.20.0 watchdog-3.0.0 watchfiles-0.19.0 websockets-11.0.3 xxhash-3.3.0 zstandard-0.21.0

WARNING: The following packages were previously imported in this runtime:
  [google]
You must restart the runtime in order to use newly installed versions.

(screenshot: llamacpp load log with no layers offloaded to GPU)


snexus commented on August 23, 2024

Sorry that you are facing problems.

It looks like llamacpp was built without GPU support during the installation, which is why you don't see it in the output. I'll need to investigate how to enable it in the Colab environment.

On a local GPU-enabled computer, assuming all the prerequisites are installed, llamacpp needs the flags described in https://github.com/ggerganov/llama.cpp#cublas in order to build with GPU support.

In this repository, these flags are set using setvars.sh before the installation (this is also described in the README).
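For a local install, the same flags can be exported in the shell before building. This is a sketch based on the llama.cpp cuBLAS instructions linked above; it assumes CUDA and a compiler toolchain are already installed:

```shell
# Flags the llama-cpp-python build reads to enable cuBLAS GPU support.
export CMAKE_ARGS="-DLLAMA_CUBLAS=on"
export FORCE_CMAKE=1

# Force a rebuild from source so the flags are actually used:
# pip install llama-cpp-python --force-reinstall --no-cache-dir
echo "$CMAKE_ARGS $FORCE_CMAKE"
```

After reinstalling, the model load log should show layers being offloaded to the GPU; if it still doesn't, the build most likely picked up a cached CPU-only wheel.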

