llm_transformer.convert_to_graph_documents TypeError: list indices must be integers or slices, not str (hwchase17/langchain)

Comments (17)

VJJJJJJ1 commented on August 24, 2024 1

I ran the following code and got no error:

from langchain_huggingface import HuggingFacePipeline
from langchain_text_splitters.character import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import TextLoader
from langchain_community.graphs import Neo4jGraph
from langchain_experimental.graph_transformers.llm import LLMGraphTransformer

llm = HuggingFacePipeline.from_model_id(model_id='baichuan-inc/Baichuan2-7B-Chat',task="text-generation")

loader = TextLoader('doc.txt')
documents = loader.load() # + docx_documents
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=800, chunk_overlap=200)
texts = text_splitter.split_documents(documents)
graph = Neo4jGraph(url = 'bolt://localhost:7687',database='neo4j',username='neo4j',password='')
llm_transformer = LLMGraphTransformer(llm=llm)
graph_documents = llm_transformer.convert_to_graph_documents(texts)

The contents of doc.txt were as follows:

Marie Curie, born in 1867, was a Polish and naturalised-French physicist and chemist who conducted pioneering research on radioactivity.
She was the first woman to win a Nobel Prize, the first person to win a Nobel Prize twice, and the only person to win a Nobel Prize in two scientific fields.
Her husband, Pierre Curie, was a co-winner of her first Nobel Prize, making them the first-ever married couple to win the Nobel Prize and launching the Curie family legacy of five Nobel Prizes.
She was, in 1906, the first woman to become a professor at the University of Paris.

Can you double-check your input file, or possibly share a sample? I suspect the input might be causing the problem.

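This is part of my sample document (translated here from the original Chinese):

Volume 1: Establishment --> Chapter 1: Historical Evolution --> Section 1: Administrative Affiliation --> 3. Affiliation after the founding of the People's Republic of China

3. Affiliation after the founding of the People's Republic of China
On September 9, 1949, the Jinjiang County People's Government was established under the Fifth Administrative Inspectorate Commissioner's Office of the Fujian Provincial People's Government (seated in the Quanzhou urban area). In April 1950 it came under the Quanzhou Administrative Inspectorate Commissioner's Office. In September 1950 it came under the Jinjiang District Administrative Inspectorate Commissioner's Office. In April 1955 it came under the Jinjiang Prefecture Commissioner's Office. In June 1967 it was administered by the Jinjiang Prefecture Military Control Commission. In September 1968 it came under the Jinjiang Prefecture Revolutionary Committee. In July 1971 it came under the Jinjiang Region Revolutionary Committee. In January 1980 it came under the Jinjiang Regional Administrative Office. From January 1986 to December 1988 it belonged to Quanzhou City.

Volume 1: Establishment --> Chapter 1: Historical Evolution --> Section 2: Changes in Territory

Section 2: Changes in Territory
Jinjiang County is located on the southeast coast of Fujian Province, on the lower reaches of the Jinjiang River, between 118°24′ and 118°46′ east longitude and 24°30′ and 24°54′ north latitude. It borders the ** Strait to the east and Nan'an County to the west, faces Kinmen across the sea to the south, and adjoins Licheng District to the north. It measures 42 km from north to south and 37 km from east to west, with a total area of 809.24 square kilometers. When the county was founded, its territory included present-day Hui'an County and Licheng District. In the Song dynasty it also administered the Penghu Islands. After several later changes, the present territory of Jinjiang County took shape.

Volume 1: Establishment --> Chapter 1: Historical Evolution --> Section 2: Changes in Territory --> 1. Founding of the County in the Kaiyuan Era

1. Founding of the County in the Kaiyuan Era
According to the Daoguang-era (Qing dynasty) Jinjiang County Gazetteer, the name Jinjiang arose because "when the Jin crossed south, many of the gentry who took refuge here settled along the river, hence the name." At the end of the Western Jin, northern gentry families migrated south in large numbers to escape the ravages of war, and some settled on both banks of today's Jinjiang River, where they lived and worked. The name Jinjiang dates from this time, and when the county was founded it was named after the river. In the early Tang, the Jinjiang area belonged to Nan'an County. In the second year of Jingyun (711), Wurong Prefecture was renamed Quanzhou (present-day Quanzhou), under the Minzhou Area Command. The prefectural seat had no county of its own, and on this ground Prefect Feng Renzhi petitioned to establish one. Thus, in the sixth year of Kaiyuan of the Tang (718), the southeastern part of Nan'an County was split off to form a new county, which was the founding of Jinjiang County. The county seat was within present-day Licheng District, with prefecture and county governed from the same city. [Figure captions: Schematic map of Jinjiang County's territory at its Tang-era founding; Schematic map of the contemporary changes in Jinjiang County's territory]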

Running this code also gives me the same error:

from langchain_core.documents import Document

doc = Document(page_content="Elon Musk is suing OpenAI")
graph_documents = llm_transformer.convert_to_graph_documents([doc])

Maybe my package versions are incorrect?
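
To follow up on the version question, it can help to post the exact installed versions alongside the traceback. The following is a minimal sketch using only the standard library. The distribution names in PACKAGES are assumptions inferred from the imports in the snippet above, not a list confirmed anywhere in this thread.

```python
from importlib import metadata

# Assumed distribution names, inferred from the imports used in this thread.
PACKAGES = [
    "langchain",
    "langchain-community",
    "langchain-experimental",
    "langchain-huggingface",
    "langchain-text-splitters",
    "transformers",
]


def installed_versions(names):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output into the issue lets others compare their environment against one where the code runs without error.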

from langchain.

VJJJJJJ1 commented on August 24, 2024

This code produces the same error:
doc = Document(page_content="Elon Musk is suing OpenAI")
graph_documents = llm_transformer.convert_to_graph_documents([doc])


keenborder786 commented on August 24, 2024

What LLM are you using?


VJJJJJJ1 commented on August 24, 2024

What LLM are you using?

Baichuan2-7B-Chat. Thank you for your reply.


VJJJJJJ1 commented on August 24, 2024

What LLM are you using?

My code for loading the LLM is as follows:

model_path = '../../Models/Baichuan2-7B-Chat'
# load_type and quantization_config are defined elsewhere (not shown in this comment)
model = HuggingFacePipeline.from_model_id(
    model_id=model_path,
    task="text-generation",
    model_kwargs={
        "torch_dtype": load_type,
        "low_cpu_mem_usage": True,
        "temperature": 0.2,
        "max_length": 1000,
        "device_map": "auto",
        "repetition_penalty": 1.1,
        "trust_remote_code": True,
        "quantization_config": quantization_config,
    },
)


keenborder786 commented on August 24, 2024

Okay, I am checking it. I will get back to you.


keenborder786 commented on August 24, 2024

I ran the following code and got no error:

from langchain_huggingface import HuggingFacePipeline
from langchain_text_splitters.character import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import TextLoader
from langchain_community.graphs import Neo4jGraph
from langchain_experimental.graph_transformers.llm import LLMGraphTransformer

llm = HuggingFacePipeline.from_model_id(model_id='baichuan-inc/Baichuan2-7B-Chat',task="text-generation")

loader = TextLoader('doc.txt')
documents = loader.load() # + docx_documents
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=800, chunk_overlap=200)
texts = text_splitter.split_documents(documents)
graph = Neo4jGraph(url = 'bolt://localhost:7687',database='neo4j',username='neo4j',password='')
llm_transformer = LLMGraphTransformer(llm=llm)
graph_documents = llm_transformer.convert_to_graph_documents(texts)

The contents of doc.txt were as follows:

Marie Curie, born in 1867, was a Polish and naturalised-French physicist and chemist who conducted pioneering research on radioactivity.
She was the first woman to win a Nobel Prize, the first person to win a Nobel Prize twice, and the only person to win a Nobel Prize in two scientific fields.
Her husband, Pierre Curie, was a co-winner of her first Nobel Prize, making them the first-ever married couple to win the Nobel Prize and launching the Curie family legacy of five Nobel Prizes.
She was, in 1906, the first woman to become a professor at the University of Paris.

Can you double-check your input file, or possibly share a sample? I suspect the input might be causing the problem.


drahmad89 commented on August 24, 2024

I have the same issue.


li-hhhh commented on August 24, 2024

I ran into the same problem too. How did you solve it in the end? Thanks.


dumanting commented on August 24, 2024

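This is an automatic vacation reply from QQ Mail. Hello, I am currently on vacation and cannot reply to your email personally. I will reply as soon as possible after my vacation ends.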


SatSadhu commented on August 24, 2024

Could someone solve the error?


VJJJJJJ1 commented on August 24, 2024

I have not solved this error


VJJJJJJ1 commented on August 24, 2024

@keenborder786 Could my sample be causing the problem? This has confused me for a long time.


SatSadhu commented on August 24, 2024

I was able to put together a workaround for the problem.
I edited 'llm.py'; in your case (@VJJJJJJ1) the path is /root/miniconda3/envs/rag/lib/python3.10/site-packages/langchain_experimental/graph_transformers/llm.py

I went to line 714, renamed the function 'process_response' to 'process_response_old', and then created a new function called 'process_response', as follows:

def process_response(self, document: Document) -> GraphDocument: 
   """
   Processes a single document, transforming it into a graph document using
   an LLM based on the model's schema and constraints.
   """
   text = document.page_content
   raw_schema = self.chain.invoke({"input": text})
   if self._function_call:
       raw_schema = cast(Dict[Any, Any], raw_schema)
       nodes, relationships = _convert_to_graph_document(raw_schema)
   else:
       nodes_set = set()
       relationships = []
       if not isinstance(raw_schema, str):
           raw_schema = raw_schema.content
       parsed_json = self.json_repair.loads(raw_schema)
       print(parsed_json)
       
       properties = parsed_json.get('properties', {})

       # Handle different types of values: strings and lists
       def get_first_element(value):
           return value[0] if isinstance(value, list) else value

       head = get_first_element(properties.get('head', 'Unknown'))
       head_type = get_first_element(properties.get('head_type', 'Unknown'))
       tail = get_first_element(properties.get('tail', 'Unknown'))
       tail_type = get_first_element(properties.get('tail_type', 'Unknown'))
       relation = get_first_element(properties.get('relation', 'Unknown'))

       # Nodes need to be deduplicated using a set
       nodes_set.add((head, head_type))
       nodes_set.add((tail, tail_type))

       source_node = Node(id=head, type=head_type)
       target_node = Node(id=tail, type=tail_type)
       relationships.append(
           Relationship(
               source=source_node, target=target_node, type=relation
           )
       )
       
       # Create nodes list
       nodes = [Node(id=el[0], type=el[1]) for el in list(nodes_set)]

   # Strict mode filtering
   if self.strict_mode and (self.allowed_nodes or self.allowed_relationships):
       if self.allowed_nodes:
           lower_allowed_nodes = [el.lower() for el in self.allowed_nodes]
           nodes = [
               node for node in nodes if node.type.lower() in lower_allowed_nodes
           ]
           relationships = [
               rel
               for rel in relationships
               if rel.source.type.lower() in lower_allowed_nodes
               and rel.target.type.lower() in lower_allowed_nodes
           ]
       if self.allowed_relationships:
           relationships = [
               rel
               for rel in relationships
               if rel.type.lower()
               in [el.lower() for el in self.allowed_relationships]
           ]

   return GraphDocument(nodes=nodes, relationships=relationships, source=document)

I hope this is helpful! It worked for me.
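
The workaround above reads the fields from a 'properties' dict and takes the first element when a value is a list. As a rough illustration of the same idea in isolation, here is a plain-Python sketch that coerces whatever the JSON parser returned into a list of complete relationship dicts and skips anything malformed instead of raising. The function names and the REQUIRED_KEYS tuple are hypothetical, and the sketch does not claim to mirror what langchain_experimental does internally.

```python
from typing import Any, Dict, List

# Field names taken from the workaround above.
REQUIRED_KEYS = ("head", "head_type", "relation", "tail", "tail_type")


def first_element(value: Any) -> Any:
    """Return the first item of a non-empty list, otherwise the value itself."""
    if isinstance(value, list):
        return value[0] if value else None
    return value


def normalize_relationships(parsed: Any) -> List[Dict[str, str]]:
    """Coerce parsed model output into a list of complete relationship dicts.

    Items that are not dicts, or that lack a required field, are skipped
    rather than raising.
    """
    if isinstance(parsed, dict):
        # Some outputs wrap the fields in a "properties" object.
        parsed = [parsed.get("properties", parsed)]
    if not isinstance(parsed, list):
        return []

    relationships = []
    for item in parsed:
        if not isinstance(item, dict):
            continue  # e.g. a nested list or a bare string
        values = {key: first_element(item.get(key)) for key in REQUIRED_KEYS}
        if all(isinstance(v, str) and v for v in values.values()):
            relationships.append(values)
    return relationships
```

Unlike editing a file under site-packages, a helper like this can live in your own code and be applied to the parsed output before any nodes are built, so it survives package upgrades.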


ingmars1709 commented on August 24, 2024

I had the same error when using llama2:

nodes_set.add((rel["head"], rel["head_type"]))
TypeError: list indices must be integers or slices, not str

When I changed the model string to 'llama3', I got correct results with the original implementation of process_response, using the following setup:

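# Imports were not shown in the original comment; these module paths are one known option.
from langchain_community.chat_models import ChatOllama
from langchain_community.document_loaders import TextLoader
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = TextLoader('doc.txt')
documents = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=200)
texts = text_splitter.split_documents(documents)

llm = ChatOllama(model="llama3")  # make sure the llama3 model is running

llm_transformer = LLMGraphTransformer(llm=llm)
graph_documents = llm_transformer.convert_to_graph_documents(texts)

print(f"Nodes:{graph_documents[0].nodes}")
print(f"Relationships:{graph_documents[0].relationships}")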
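
The failing line in the traceback indexes each parsed relationship with a string key, which only works when the item is a dict. The sketch below reproduces the exception in plain Python. It assumes, without confirmation from this thread, that the weaker model returned nested lists where objects were expected; both JSON strings are hypothetical examples rather than real model output.

```python
import json

# Hypothetical model outputs; neither string is taken from the thread.
well_formed = (
    '[{"head": "Elon Musk", "head_type": "Person", "relation": "SUING", '
    '"tail": "OpenAI", "tail_type": "Organization"}]'
)
malformed = '[["Elon Musk", "Person", "SUING", "OpenAI", "Organization"]]'


def collect_nodes(raw_json):
    """Mimic the failing line: index every parsed item with string keys."""
    nodes = set()
    for rel in json.loads(raw_json):
        nodes.add((rel["head"], rel["head_type"]))
        nodes.add((rel["tail"], rel["tail_type"]))
    return nodes


print(collect_nodes(well_formed))

try:
    collect_nodes(malformed)
except TypeError as exc:
    error_message = str(exc)
    print(f"TypeError: {error_message}")
```

This would be consistent with the observation that switching to a model that follows the output format more reliably makes the error disappear without any code change.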


Otnielush commented on August 24, 2024

It happens when the information is extracted from LLM output that also contains the input prompt (for me this was the default behavior).
It can be fixed by:

from transformers import pipeline
from langchain_huggingface import HuggingFacePipeline

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=2000, return_full_text=True)
llm = HuggingFacePipeline(pipeline=pipe)
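
To illustrate the failure mode described here: if the text handed to the JSON parser still contains the prompt, any example JSON inside the prompt may be parsed instead of the model's answer. Whether the prompt is echoed depends on the pipeline settings and the library version, so the following is only a sketch of the idea with a hypothetical helper name, not a description of what LangChain or transformers do internally.

```python
def strip_echoed_prompt(generated_text: str, prompt: str) -> str:
    """Drop the prompt from the start of the output if the model echoed it."""
    if prompt and generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    return generated_text.strip()


# Hypothetical prompt and completion, for illustration only.
prompt = 'Extract relations as JSON. Example: [{"head": "A"}]\nText: Elon Musk is suing OpenAI\n'
completion = '[{"head": "Elon Musk"}]'

print(strip_echoed_prompt(prompt + completion, prompt))
```

Printing the raw string the model returns before it is parsed is a quick way to see whether the prompt is included in your setup.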
