Comments (13)
Hi @serhatgktp! Yes, at the end of May we made it technically possible to connect multiple types of LLMs. However, to get them to work properly, the prompts also need to be tweaked. We've been working on a set of mechanisms for that; they will be pushed to the repo next week and released to PyPI at the end of next week as 0.3.0. We did manage to use `huggingface_pipeline` successfully. I will look into your particular config and get back to you (there seems to be a different type of issue there). Thanks!
from nemo-guardrails.
@serhatgktp I'm getting the same response when using `max_new_tokens` too. Have you tried models other than Dolly? And did you find any success?
@QUANGLEA I've tried several other models, mainly promising ones from the Hugging Face Hub such as Falcon. Unfortunately, I haven't been able to get any of them to work yet.
Thank you for the follow-up @drazvan. I've been trying to choose LLMs that are seemingly powerful, accurate, and not too large. The following two are the ones I'm most interested in at the moment:
- `tiiuae/falcon-7b-instruct` (available on the Hugging Face Hub)
- `gpt4all-j-v1.3-groovy` (available through GPT4All)
1) Issue with Hugging Face Hub Models
a) Using the Built-in Feature for Hugging Face Hub
The built-in support for the Hugging Face Hub currently appears to have a bug, as it gives me the following error:
Configuration:
models:
- type: main
engine: huggingface_hub
model: tiiuae/falcon-7b-instruct
Error:
Error argument of type 'NoneType' is not iterable while execution generate_user_intent
Traceback (most recent call last):
File "/Users/efkan/.asdf/installs/python/3.10.5/lib/python3.10/site-packages/nemoguardrails/actions/action_dispatcher.py", line 125, in execute_action
result = await fn(**params)
File "/Users/efkan/.asdf/installs/python/3.10.5/lib/python3.10/site-packages/nemoguardrails/actions/llm/generation.py", line 257, in generate_user_intent
with llm_params(self.llm, temperature=self.config.lowest_temperature):
File "/Users/efkan/.asdf/installs/python/3.10.5/lib/python3.10/site-packages/nemoguardrails/llm/params.py", line 44, in __enter__
elif hasattr(self.llm, "model_kwargs") and param in getattr(
TypeError: argument of type 'NoneType' is not iterable
It seems that the code below is failing:
elif hasattr(self.llm, "model_kwargs") and param in getattr(
self.llm, "model_kwargs", {}
):
self.original_params[param] = self.llm.model_kwargs[param]
self.llm.model_kwargs[param] = value
because the following line is returning None:
getattr(self.llm, "model_kwargs", {})
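For what it's worth, the failure is easy to reproduce in isolation: `getattr(obj, "model_kwargs", {})` only falls back to `{}` when the attribute is *missing*, not when it is set to `None`. A minimal sketch of the problem and a guarded rewrite (the class below is a stand-in, not the actual nemoguardrails code):

```python
class FakeLLM:
    # Stand-in for a LangChain wrapper whose model_kwargs defaults to None
    model_kwargs = None

llm = FakeLLM()

# getattr's default is used only when the attribute is absent, so this
# evaluates `"temperature" in None` and raises the TypeError seen above.
try:
    "temperature" in getattr(llm, "model_kwargs", {})
except TypeError as err:
    print(err)  # argument of type 'NoneType' is not iterable

# Treating None like an empty dict avoids the crash:
kwargs = getattr(llm, "model_kwargs", None) or {}
print("temperature" in kwargs)  # False
```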
b) Importing Hugging Face Hub Models Externally
We can also use custom wrappers to fetch models externally, similar to how it's done here.
However, when we do so with Hugging Face Hub models, we get the following error:
Traceback (most recent call last):
File "/Users/efkan/anaconda3/lib/python3.10/site-packages/nemoguardrails/actions/action_dispatcher.py", line 125, in execute_action
result = await fn(**params)
File "/Users/efkan/anaconda3/lib/python3.10/site-packages/nemoguardrails/actions/llm/generation.py", line 258, in generate_user_intent
result = await llm_call(self.llm, prompt)
File "/Users/efkan/anaconda3/lib/python3.10/site-packages/nemoguardrails/actions/llm/utils.py", line 31, in llm_call
result = await llm.agenerate_prompt(
File "/Users/efkan/anaconda3/lib/python3.10/site-packages/langchain/llms/base.py", line 136, in agenerate_prompt
return await self.agenerate(prompt_strings, stop=stop, callbacks=callbacks)
File "/Users/efkan/anaconda3/lib/python3.10/site-packages/langchain/llms/base.py", line 250, in agenerate
raise e
File "/Users/efkan/anaconda3/lib/python3.10/site-packages/langchain/llms/base.py", line 244, in agenerate
await self._agenerate(prompts, stop=stop, run_manager=run_manager)
File "/Users/efkan/anaconda3/lib/python3.10/site-packages/langchain/llms/base.py", line 400, in _agenerate
else await self._acall(prompt, stop=stop)
File "/Users/efkan/anaconda3/lib/python3.10/site-packages/nemoguardrails/llm/providers.py", line 44, in _acall
return self._call(*args, **kwargs)
File "/Users/efkan/anaconda3/lib/python3.10/site-packages/langchain/llms/huggingface_hub.py", line 111, in _call
raise ValueError(f"Error raised by inference API: {response['error']}")
ValueError: Error raised by inference API: Input validation error: `temperature` must be strictly positive
I believe the code attempts to use the LLM with `temperature=0` to keep the output as close to the expected format as possible. However, if I'm not mistaken, this cannot be done with Hugging Face Hub models, as the inference API expects a strictly positive temperature.
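Until that's handled upstream, one possible workaround (a sketch, not the library's actual fix) is to clamp the temperature to a small positive epsilon before the parameters reach the inference API:

```python
def clamp_temperature(params, epsilon=1e-3):
    """Map non-positive temperatures to a small positive value, since the
    Hugging Face Inference API requires temperature > 0.
    Hypothetical helper, not part of nemoguardrails."""
    temp = params.get("temperature")
    if temp is not None and temp <= 0:
        params = {**params, "temperature": epsilon}
    return params

print(clamp_temperature({"temperature": 0, "max_length": 1024}))
# {'temperature': 0.001, 'max_length': 1024}
```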
2) GPT4All
a) Built-in
It looks like the issue here is related to how the model path is being passed to LangChain: the `values` dictionary doesn't have a `model` key.
Configuration:
models:
- type: main
engine: gpt4all
model: gpt4all-j-v1.3-groovy
# (I also tried by using the path to the model)
# model: ./models/ggml-gpt4all-l13b-snoozy.bin
Error:
Traceback (most recent call last):
File "/Users/efkan/Desktop/repos/guardrails-demo/demo.py", line 13, in <module>
demo()
File "/Users/efkan/Desktop/repos/guardrails-demo/demo.py", line 5, in demo
rails = LLMRails(config)
File "/Users/efkan/.asdf/installs/python/3.10.5/lib/python3.10/site-packages/nemoguardrails/rails/llm/llmrails.py", line 79, in __init__
self._init_llm()
File "/Users/efkan/.asdf/installs/python/3.10.5/lib/python3.10/site-packages/nemoguardrails/rails/llm/llmrails.py", line 132, in _init_llm
self.llm = provider_cls(**kwargs)
File "pydantic/main.py", line 339, in pydantic.main.BaseModel.__init__
File "pydantic/main.py", line 1102, in pydantic.main.validate_model
File "/Users/efkan/.asdf/installs/python/3.10.5/lib/python3.10/site-packages/langchain/llms/gpt4all.py", line 170, in validate_environment
model_path=values["model"],
KeyError: 'model'
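Judging from the traceback, LangChain's validator does a bare `values["model"]` lookup, so the failure reproduces whenever the kwargs built from the guardrails config lack a `model` key. A stripped-down reproduction (the function below is a stand-in mirroring the line in the traceback, not the real validator):

```python
def validate_environment(values):
    # Mirrors langchain/llms/gpt4all.py: an unguarded lookup of "model"
    return {"model_path": values["model"]}

# kwargs without a "model" key reproduce the KeyError:
try:
    validate_environment({"temperature": 0.0})
except KeyError as err:
    print("missing key:", err)  # missing key: 'model'

# Supplying the model path explicitly avoids it:
ok = validate_environment({"model": "./models/ggml-gpt4all-l13b-snoozy.bin"})
print(ok["model_path"])
```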
b) External
The error is below. I'm not too sure what this one is about.
Error:
Traceback (most recent call last):
File "/Users/efkan/anaconda3/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/Users/efkan/anaconda3/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/Users/efkan/.vscode/extensions/ms-python.python-2023.10.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 39, in <module>
cli.main()
File "/Users/efkan/.vscode/extensions/ms-python.python-2023.10.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 430, in main
run()
File "/Users/efkan/.vscode/extensions/ms-python.python-2023.10.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 284, in run_file
runpy.run_path(target, run_name="__main__")
File "/Users/efkan/.vscode/extensions/ms-python.python-2023.10.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 321, in run_path
return _run_module_code(code, init_globals, run_name,
File "/Users/efkan/.vscode/extensions/ms-python.python-2023.10.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 135, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/Users/efkan/.vscode/extensions/ms-python.python-2023.10.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 124, in _run_code
exec(code, run_globals)
File "/Users/efkan/Desktop/repos/ibm-repos-private/demo_guardrails.py", line 17, in <module>
demo()
File "/Users/efkan/Desktop/repos/ibm-repos-private/demo_guardrails.py", line 6, in demo
rails = LLMRails(config)
File "/Users/efkan/anaconda3/lib/python3.10/site-packages/nemoguardrails/rails/llm/llmrails.py", line 79, in __init__
self._init_llm()
File "/Users/efkan/anaconda3/lib/python3.10/site-packages/nemoguardrails/rails/llm/llmrails.py", line 143, in _init_llm
self.llm = provider_cls(**kwargs)
File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for GPT4All
__root__
Model.__init__() got an unexpected keyword argument 'n_parts' (type=type_error)
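This one looks like a version mismatch: the LangChain wrapper forwards an `n_parts` argument that newer `gpt4all` bindings apparently no longer accept. If pinning versions isn't an option, a hedged shim could filter the kwargs down to what the constructor actually takes (the `Model` class below is a stand-in, not the real gpt4all binding):

```python
import inspect

class Model:
    # Stand-in for the newer gpt4all Model, which dropped `n_parts`
    def __init__(self, model_path, n_threads=4):
        self.model_path = model_path
        self.n_threads = n_threads

def construct_filtered(cls, **kwargs):
    """Drop keyword arguments the constructor no longer accepts
    (hypothetical shim, not part of either library)."""
    accepted = set(inspect.signature(cls.__init__).parameters) - {"self"}
    return cls(**{k: v for k, v in kwargs.items() if k in accepted})

m = construct_filtered(Model, model_path="x.bin", n_parts=-1, n_threads=8)
print(m.n_threads)  # 8
```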
To conclude, I think all the problems mentioned here are worth having their own issues, but that might not be necessary if you are already working on them and a fix is close to completion.
Thanks!
@serhatgktp: you can check out this example: https://github.com/NVIDIA/NeMo-Guardrails/tree/main/examples/llm/hf_pipeline_dolly for how to use HuggingFacePipeline to run a local model. I did not get a chance to look into your specific configuration just yet. Let me know if this helps.
Thanks @drazvan, this example seems to work! However, I had to modify the configuration, as I was getting the following error regarding the `device` parameter:
/Users/efkan/anaconda3/lib/python3.10/site-packages/langchain/llms/huggingface_pipeline.py:106 in from_model_id

    cuda_device_count = torch.cuda.device_count()
    if device < -1 or (device >= cuda_device_count):
        raise ValueError(
            f"Got device=={device}, "
            f"device is required to be within [-1, {cuda_device_count})"
        )

ValueError: Got device==0, device is required to be within [-1, 0)
I modified `config.py` by excluding `device` from the initialization of `llm`, so that it uses the default value instead. Seen below:
@lru_cache
def get_dolly_v2_3b_llm():
repo_id = "databricks/dolly-v2-3b"
params = {"temperature": 0, "max_length": 1024}
llm = HuggingFacePipeline.from_model_id(
model_id=repo_id, task="text-generation", model_kwargs=params
)
return llm
It seems that my computer does not have any CUDA-enabled GPUs, which causes an issue when we try to select the "first" CUDA device.
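Rather than dropping `device` entirely, the selection could also be made dynamic. A sketch of the decision as pure logic, mirroring the `[-1, cuda_device_count)` range LangChain validates (`-1` means CPU in Hugging Face pipelines):

```python
def pick_device(cuda_device_count):
    """Return the first GPU index when CUDA devices exist, else -1 (CPU),
    staying inside the [-1, cuda_device_count) range checked above."""
    return 0 if cuda_device_count > 0 else -1

print(pick_device(0))  # -1 on a CPU-only machine like the one above
print(pick_device(2))  # 0 when at least one CUDA device is present
```

In practice one would feed in `torch.cuda.device_count()` and pass the result as `device=` to `from_model_id`.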
Thanks again!
@serhatgktp I was wondering if you got this warning when you successfully ran the example.
UserWarning: Using "max_length"'s default (1024) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using "max_new_tokens" to control the maximum length of the generation. warnings.warn(
@QUANGLEA Yes, I'm getting that warning as well. However, it seems that using `max_new_tokens` causes an unexpected keyword error:
TypeError: GPTNeoXForCausalLM.__init__() got an unexpected keyword argument 'max_new_tokens'
I haven't had much time to look into it so I've chosen to ignore it for now.
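One unverified guess about that `TypeError`: `model_kwargs` passed to `from_model_id` seem to be forwarded to the model constructor, while `max_new_tokens` is a generation-time parameter, so it lands where it isn't accepted. The mechanism in miniature (the class is a stand-in, not the real `GPTNeoXForCausalLM`):

```python
class FakeCausalLM:
    """Stand-in for a transformers model whose constructor accepts
    architecture kwargs only, not generation kwargs."""
    def __init__(self, hidden_size=512):
        self.hidden_size = hidden_size

# Routing a generation parameter through the constructor reproduces it:
try:
    FakeCausalLM(max_new_tokens=100)
except TypeError as err:
    print(err)  # ... unexpected keyword argument 'max_new_tokens'

# Kept aside for generation time instead, construction succeeds:
generation_kwargs = {"max_new_tokens": 100}
model = FakeCausalLM()
```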
@serhatgktp What LLMs are you interested in? Were there any specific errors? I'm asking because we will be testing a few more LLM providers on our end in the next couple of weeks. So, maybe we can align our efforts.
Thanks!
Any progress?
I also need the same thing: running Guardrails with a local LLM.
ERROR:nemoguardrails.actions.action_dispatcher:Error argument of type 'NoneType' is not iterable while execution generate_user_intent
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/nemoguardrails/actions/action_dispatcher.py", line 125, in execute_action
result = await fn(**params)
File "/usr/local/lib/python3.10/dist-packages/nemoguardrails/actions/llm/generation.py", line 269, in generate_user_intent
with llm_params(llm, temperature=self.config.lowest_temperature):
File "/usr/local/lib/python3.10/dist-packages/nemoguardrails/llm/params.py", line 44, in __enter__
elif hasattr(self.llm, "model_kwargs") and param in getattr(
TypeError: argument of type 'NoneType' is not iterable
I'm sorry, an internal error has occurred.
Same here... Any progress?
A similar issue has been solved over in /issues/155. You can go through that; it helped me.