langchain_csv's Introduction

👨‍💻Ngonidzashe Nzenze

Software Craftsman

Software developer with a passion for all things Python

Connect with me

ngonidzashe nnzenze ngonidzashe-nzenze-895b8a1b9


🧰 Languages and Tools

Git, Linux, HTML, CSS, JavaScript, React, Python, GitHub, Bash, FastAPI, Flask, Kotlin, Bootstrap, GitLab, jQuery, Nginx


langchain_csv's People

Contributors

lyqht, ngonie-x


langchain_csv's Issues

Add Long-Term Memory

Hi! Really awesome project! I have a question about a potential use case. Assume my dataframe has a few columns of textual data, and I want to ask a question that requires domain-specific knowledge for the LLM to answer. That knowledge comes from the text columns, so the system should find the most relevant cells and generate an answer from them (say, my question plus the text from the top 10 most relevant cells of the dataframe). For that, the text columns, as well as my question, need to be embedded and put into a vector search database. It would be nice if the project had such functionality; this is essentially how AI assistants with knowledge databases work. I was wondering if it would be possible to add this to the project.
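The retrieval step described above could be sketched as follows. This is a minimal stand-in: a real implementation would use an embedding model and a vector database (e.g. FAISS or Chroma) instead of the bag-of-words similarity used here, and the function names are illustrative, not part of this project.

```python
# Sketch: rank a dataframe's text cells by relevance to a question,
# then build a prompt from the top-k cells. Bag-of-words cosine
# similarity stands in for a real embedding model + vector store.
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: word-count vectors.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k_cells(question: str, cells: list[str], k: int = 10) -> list[str]:
    # Return the k cells most similar to the question.
    q = embed(question)
    ranked = sorted(cells, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, cells: list[str]) -> str:
    # Concatenate the retrieved cells as context for the LLM.
    context = "\n".join(top_k_cells(question, cells))
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"
```

With a real vector store the `top_k_cells` lookup becomes a single similarity query, but the overall flow (embed cells once, embed the question, retrieve, prompt) stays the same.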

JSONDecodeError: Extra data: line 1 column 61 (char 60)

Hi. I am getting the following error:

JSONDecodeError: Extra data: line 1 column 61 (char 60)
Traceback:
File "C:\Users\AppData\Local\anaconda3\envs\py311_test\Lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
File "C:\Users\Desktop\GenAI\app\app2_test\interface.py", line 72, in <module>
    decoded_response = decode_response(response)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Desktop\GenAI\app\app2_test\interface.py", line 17, in decode_response
    return json.loads(response)
           ^^^^^^^^^^^^^^^^^^^^
File "C:\Users\AppData\Local\anaconda3\envs\py311_test\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\\AppData\Local\anaconda3\envs\py311_test\Lib\json\decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)

It looks like something is wrong with the decode_response function.

I changed it to:

def decode_response(response: str) -> dict:
    lines = response.splitlines()
    json_line = lines[0]  
    return json.loads(json_line)

and it started working for simple questions, but it still fails for most questions (e.g. asking it to plot something).
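A more tolerant variant might help here. This is a hedged sketch, not the project's actual code: it extracts the first JSON object from the reply (handling the "Extra data" case, where the model appends prose after the JSON) and falls back to wrapping plain-text replies (the "Expecting value" case) instead of raising.

```python
import json

def decode_response(response: str) -> dict:
    """Pull the first JSON object out of the agent's reply, if any.

    raw_decode tolerates trailing text after the JSON object, and the
    fallback wraps non-JSON replies so the caller always gets a dict.
    """
    response = response.strip()
    start = response.find("{")
    if start != -1:
        try:
            obj, _ = json.JSONDecoder().raw_decode(response[start:])
            return obj
        except json.JSONDecodeError:
            pass
    # Plain-text reply (e.g. a sentence describing a plot).
    return {"answer": response}
```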

Tokens exceeded for small dataset

Thoughts?

InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 4255 tokens (3999 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.

Also, some preprocessing of CSVs to ensure other character errors don't arise would be ideal. :) I'm trying to do some of this, too! Thank you so much for this great work!
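The preprocessing idea could look something like this. It is a sketch under assumptions (the helper names are illustrative): decode forgivingly, normalize Unicode, and drop control characters before parsing, so malformed bytes don't surface later as decode errors.

```python
# Sketch: defensively clean raw CSV bytes before parsing.
import csv
import io
import unicodedata

def clean_csv_text(raw: bytes) -> str:
    # Decode forgivingly, normalize Unicode, drop control characters.
    text = raw.decode("utf-8", errors="replace")
    text = unicodedata.normalize("NFKC", text)
    return "".join(ch for ch in text if ch == "\n" or ch.isprintable())

def load_rows(raw: bytes) -> list[list[str]]:
    # Parse the cleaned text and strip stray whitespace from each cell.
    cleaned = clean_csv_text(raw)
    return [
        [cell.strip() for cell in row]
        for row in csv.reader(io.StringIO(cleaned))
        if row
    ]
```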

using gpt4all for same code

I replaced the OpenAI LLM with the GPT4All LLM, but I am getting the following error:

AttributeError: 'LLModel' object has no attribute 'model_type'
Traceback:
File "C:\Users\vishnu.a2\AppData\Roaming\Python\Python311\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
    exec(code, module.__dict__)
File "D:\langchain_csv-main\interface.py", line 66, in <module>
    agent = create_agent(data)
            ^^^^^^^^^^^^^^^^^^
File "D:\langchain_csv-main\agent.py", line 20, in create_agent
    llm = GPT4All(model=local_path, n_ctx=1000, backend='gptj', n_batch=8, callbacks=callbacks, verbose=False)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "pydantic\main.py", line 339, in pydantic.main.BaseModel.__init__
File "pydantic\main.py", line 1102, in pydantic.main.validate_model
File "C:\Users\vishnu.a2\AppData\Roaming\Python\Python311\site-packages\langchain\llms\gpt4all.py", line 156, in validate_environment
    values["backend"] = values["client"].model.model_type
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

maximum context length is 4097 tokens

openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 4229 tokens (3973 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.

This happens even on a very short file:

2023-05-22 08:40:33.801 Uncaught app exception
Traceback (most recent call last):
  File "/home/svc/.local/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "/home/svc/dalet/webnews/research/ai/langchain_csv/interface.py", line 72, in <module>
    decoded_response = decode_response(response)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/dalet/webnews/research/ai/langchain_csv/interface.py", line 17, in decode_response
    return json.loads(response)
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 24 (char 23)
2023-05-22 08:41:37.574 Uncaught app exception
Traceback (most recent call last):
  File "/home/svc/.local/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "/home/svc/dalet/webnews/research/ai/langchain_csv/interface.py", line 72, in <module>
    decoded_response = decode_response(response)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/dalet/webnews/research/ai/langchain_csv/interface.py", line 17, in decode_response
    return json.loads(response)
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
2023-05-22 08:41:52.434 Uncaught app exception
Traceback (most recent call last):
  File "/home/svc/.local/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "/home/svc/dalet/webnews/research/ai/langchain_csv/interface.py", line 69, in <module>
    response = query_agent(agent=agent, query=query)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/dalet/webnews/research/ai/langchain_csv/agent.py", line 83, in query_agent
    response = agent.run(prompt)
               ^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 236, in run
    return self(args[0], callbacks=callbacks)[self.output_keys[0]]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 140, in __call__
    raise e
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 134, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/agents/agent.py", line 947, in _call
    next_step_output = self._take_next_step(
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/agents/agent.py", line 762, in _take_next_step
    output = self.agent.plan(
             ^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/agents/agent.py", line 443, in plan
    full_output = self.llm_chain.predict(callbacks=callbacks, **full_inputs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/chains/llm.py", line 213, in predict
    return self(kwargs, callbacks=callbacks)[self.output_key]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 140, in __call__
    raise e
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/chains/base.py", line 134, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/chains/llm.py", line 69, in _call
    response = self.generate([inputs], run_manager=run_manager)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/chains/llm.py", line 79, in generate
    return self.llm.generate_prompt(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/llms/base.py", line 134, in generate_prompt
    return self.generate(prompt_strings, stop=stop, callbacks=callbacks)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/llms/base.py", line 191, in generate
    raise e
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/llms/base.py", line 185, in generate
    self._generate(prompts, stop=stop, run_manager=run_manager)
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/llms/openai.py", line 314, in _generate
    response = completion_with_retry(self, prompt=_prompts, **params)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/llms/openai.py", line 106, in completion_with_retry
    return _completion_with_retry(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 314, in iter
    return fut.result()
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/svc/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/langchain/llms/openai.py", line 104, in _completion_with_retry
    return llm.client.create(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/openai/api_resources/completion.py", line 25, in create
    return super().create(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
                           ^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/openai/api_requestor.py", line 230, in request
    resp, got_stream = self._interpret_response(result, stream)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/svc/.local/lib/python3.11/site-packages/openai/api_requestor.py", line 624, in _interpret_response
    self._interpret_response_line(
  File "/home/svc/.local/lib/python3.11/site-packages/openai/api_requestor.py", line 687, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens, however you requested 4229 tokens (3973 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.

JSONDecodeError: Expecting value: line 1 column 1 (char 0) in my CSV file (converted from .xlsx)

JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Traceback:
File "C:\Projects\langchain_csv_1\venv\Lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 552, in _run_script
    exec(code, module.__dict__)
File "C:\Projects\langchain_csv_1\interface.py", line 72, in <module>
    decoded_response = decode_response(response)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Projects\langchain_csv_1\interface.py", line 17, in decode_response
    return json.loads(response)
           ^^^^^^^^^^^^^^^^^^^^
File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\User\AppData\Local\Programs\Python\Python311\Lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None

Token limit error

I am not a professional programmer. I am trying to run this program and I am getting the error below:

This model's maximum context length is 4097 tokens, however you requested 11215 tokens (10959 in your prompt; 256 for the completion). Please reduce your prompt; or completion length.
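The context-length errors in the issues above suggest that too much of the CSV ends up in the prompt. One hedged mitigation, not part of this project's code, is to send the model only the header plus a small sample of rows (the function and limits below are illustrative):

```python
# Sketch: shrink a table to its header plus a few sample rows so the
# prompt stays well under the model's context limit.
def shrink_table(rows: list[list[str]], max_rows: int = 5) -> str:
    header, body = rows[0], rows[1:]
    sample = body[:max_rows]
    lines = [",".join(header)] + [",".join(r) for r in sample]
    if len(body) > max_rows:
        # Tell the model the sample is partial, so it doesn't assume
        # the visible rows are the whole dataset.
        lines.append(f"... ({len(body) - max_rows} more rows omitted)")
    return "\n".join(lines)
```

For aggregate questions ("what is the average of column X?") a sample is not enough; those need the agent to run code over the full dataframe rather than read it from the prompt, which is what the pandas dataframe agent is for.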
