
dreaminvoker / gain


Source code for EMNLP 2020 paper: Double Graph Based Reasoning for Document-level Relation Extraction

License: MIT License

Python 94.13% Shell 5.87%
dgl document-level-relation-extraction graph-neural-networks natural-language-processing relation-extraction

gain's Introduction

Hi there 👋

DreamInvoker

This is Shuang Zeng [google scholar].

Currently, I am an applied researcher on the Data-Douyin-Comment team at ByteDance.

I received my Master's degree from Peking University under the supervision of Prof. Baobao Chang [google scholar].

My current research interests include Large Vision-Language Models, Retrieval-Augmented Generation, Text2SQL, and Question Answering.


gain's People

Contributors

dreaminvoker


gain's Issues

TypeError: expected Tensor as element 0 in argument 0, but got str

Hi! I'm trying to train the neural network using the default values provided in the script run_GAIN_BERT.sh (I only changed the bert_path from ../PLM/bert-base-uncased to bert-base-uncased), but the training script throws an error shortly after it begins. The error stack trace is as follows:

Traceback (most recent call last):
  File "train.py", line 233, in <module>
    train(opt)
  File "train.py", line 140, in train
    ht_pair_distance=d['ht_pair_distance']
  File "/home/saptakathaa/nre/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/saptakathaa/GAIN/code/models/GAIN.py", line 306, in forward
    encoder_outputs = torch.cat([encoder_outputs, self.entity_type_emb(params['entity_type'])], dim=-1)
TypeError: expected Tensor as element 0 in argument 0, but got str

Could you please help me fix it?

P.S.: The line numbers may be off by 2-3 lines, as I added some print statements while debugging this myself.
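
For what it's worth, a common trigger for this exact message with newer library versions (not verified against this repo): transformers v4+ returns a ModelOutput object by default, and tuple-style unpacking of that object yields its string keys rather than tensors, so encoder_outputs itself ends up being the str 'last_hidden_state'. A minimal demonstration of the footgun, assuming bert-base-uncased is downloadable:

    import torch
    from transformers import BertModel, BertTokenizer

    model = BertModel.from_pretrained('bert-base-uncased')
    tok = BertTokenizer.from_pretrained('bert-base-uncased')
    out = model(**tok("a test", return_tensors='pt'))

    hidden = out[0]        # integer indexing still returns a Tensor
    first, *_ = out        # unpacking iterates over keys: 'last_hidden_state', a str!
    print(type(hidden), first)

Pinning the transformers version from the repo's README, or indexing out.last_hidden_state explicitly, sidesteps this.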

Pretrained models?

Are you able to post or release the pretrained models from the paper?

It would be helpful for those of us who don't have big enough GPUs to train 😆

Question about head/tail entity indices

In path_table, the head and tail entities are stored with their index already shifted by +1; in h_t_pairs, the head and tail entities are likewise stored with index+1. In principle the two should match, both being index+1, but in the model's forward() the head/tail entities from h_t_pairs are incremented by 1 again before the intermediate nodes are fetched from path_table. Have I misunderstood something here, or is there a conversion somewhere in the code that I missed?

Which random seed are you using?

Hi, I'm trying to reproduce the results on BERT-base, but I can only get F1 / ignF1 = 0.5985 / 0.5752 for the full model, and F1 / ignF1 = 0.6010 / 0.5796 for the nomention ablation.

I am using the given .sh, so I suppose the reason is not the hyper-parameters. Could you provide the random seed you used to produce the F1 = 0.6122 result? Thanks!
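
A typical seeding helper for reproducibility experiments like this one (the helper and the seed value 42 are illustrative, not the paper's setup):

    import random
    import numpy as np
    import torch

    def set_seed(seed: int) -> None:
        # Seed every RNG a training run touches.
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Pin down cuDNN as well, at some cost in speed.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False

    set_seed(42)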

Some questions about the BERT results

Hello, I'd like to ask you a few questions about BERT.
In the paper you write that BERT's initial learning rate is 1e-5,
but in the code's BERT .sh script it is 1e-3.
Also, in the run that produced the paper's BERT results, was BERT kept frozen, or was it updated as well?
Thanks!

Training cannot be done on GPU

Hi! I'm trying to train the neural network using the default values provided in the script run_GAIN_GLOVE.sh. The code runs fine, but I found that training happens on the CPU. I tried changing the devices in the code to cuda so it would train on the GPU, but that didn't work and started throwing "GPU device not available" errors, although my system has GPUs.

Could you please help me run the code on a GPU?
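
A quick environment probe helps narrow this down before editing device strings in the code (standard PyTorch calls, nothing repo-specific):

    import torch

    print(torch.__version__)           # a '+cpu' suffix means a CPU-only build
    print(torch.version.cuda)          # CUDA toolkit the wheel was built against
    print(torch.cuda.is_available())   # False => CPU-only build or driver mismatch
    print(torch.cuda.device_count())

    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    # Note: model.to(device) alone is not enough; every input tensor must be
    # moved with .to(device) as well.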

About the dataset

Hi, I'd like to ask: if I switch to a different dataset, how should I convert it into the format of your dataset? How did you process it? Looking forward to your reply, many thanks!
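
For orientation, a minimal DocRED-style record looks roughly like this (reconstructed from the public DocRED release; verify the exact fields against the JSON files shipped with this repo):

    # One document in DocRED format: tokenized sentences, one mention list
    # per entity, and relation facts indexing into the entity list.
    example = {
        "title": "Example document",
        "sents": [["Alice", "was", "born", "in", "Paris", "."]],
        "vertexSet": [
            [{"name": "Alice", "sent_id": 0, "pos": [0, 1], "type": "PER"}],
            [{"name": "Paris", "sent_id": 0, "pos": [4, 5], "type": "LOC"}],
        ],
        "labels": [
            {"h": 0, "t": 1, "r": "P19", "evidence": [0]}  # head, tail, relation id
        ],
    }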

Could you explain in detail how the Infer-F1 metric is computed?

I'm quite interested in the Infer-F1 metric, but the paper does not discuss it much. My guess is that it only scores triples that can form a "triangle" (e.g. (A, r1, B), (B, r2, C), (A, r3, C)). Is that understanding correct? Could you open-source the code that computes this metric?
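
A sketch of the reading proposed above, for concreteness (the asker's guess, not the authors' released code): keep only facts whose (head, tail) pair closes a two-hop chain in the gold data, then compute F1 over that subset.

    def inferable_pairs(gold):
        # gold: a set of (head, relation, tail) facts.
        tails = {}
        for h, r, t in gold:
            tails.setdefault(h, set()).add(t)
        # (h, t) is kept if some bridge b gives gold chains h -> b -> t.
        return {(h, t) for h, r, t in gold
                if any(b != t and t in tails.get(b, ()) for b in tails.get(h, ()))}

    def infer_f1(pred, gold):
        keep = inferable_pairs(gold)
        p = {f for f in pred if (f[0], f[2]) in keep}
        g = {f for f in gold if (f[0], f[2]) in keep}
        tp = len(p & g)
        prec = tp / len(p) if p else 0.0
        rec = tp / len(g) if g else 0.0
        return 2 * prec * rec / (prec + rec) if prec + rec else 0.0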

Error in Training

Hi! I'm trying to train the neural network using the default values provided in the script run_GAIN_BERT.sh (I just changed the bert_path from ../PLM/bert-base-uncased to bert-base-uncased), but the training script throws an error shortly after it begins. It seems to be an error with DGL. The error is the following:

Traceback (most recent call last):
  File "train.py", line 231, in <module>
    train(opt)
  File "train.py", line 138, in train
    ht_pair_distance=d['ht_pair_distance']
  File "/Users/carlos.jimenez/PycharmProjects/GAIN/doc_processor/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/carlos.jimenez/PycharmProjects/GAIN/code/models/GAIN.py", line 348, in forward
    graphs = dgl.unbatch_hetero(graph_big)
  File "/Users/carlos.jimenez/PycharmProjects/GAIN/doc_processor/lib/python3.7/site-packages/dgl/batch.py", line 418, in unbatch_hetero
    return batch(*args, **kwargs)
  File "/Users/carlos.jimenez/PycharmProjects/GAIN/doc_processor/lib/python3.7/site-packages/dgl/batch.py", line 167, in batch
    if any(g.is_block for g in graphs):
  File "/Users/carlos.jimenez/PycharmProjects/GAIN/doc_processor/lib/python3.7/site-packages/dgl/batch.py", line 167, in <genexpr>
    if any(g.is_block for g in graphs):
  File "/Users/carlos.jimenez/PycharmProjects/GAIN/doc_processor/lib/python3.7/site-packages/dgl/heterograph.py", line 1968, in __getitem__
    raise DGLError('Invalid key "{}". Must be one of the edge types.'.format(orig_key))
dgl._ffi.base.DGLError: Invalid key "0". Must be one of the edge types.

Could you please help me find what is wrong?
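
The dgl.unbatch_hetero call in GAIN.py matches the pre-0.5 DGL API; from DGL 0.5 onward, heterograph batching was folded into plain dgl.batch / dgl.unbatch, and the deprecated alias no longer behaves as the code expects. Two hedged options:

    import dgl

    print(dgl.__version__)   # the code was written against the 0.4.x API line

    # Option 1: install the DGL 0.4.x build named in the repo's README.
    # Option 2 (untested sketch): on dgl >= 0.5, replace the call in GAIN.py:
    # graphs = dgl.unbatch(graph_big)   # instead of dgl.unbatch_hetero(graph_big)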

Error when training on multiple GPUs

Hello, I tried to convert the model to multi-GPU training with DataParallel, and it throws an error at runtime:

Traceback (most recent call last):
  File "/disks/disk1/remote_src/DocRED/GAIN-master/code/train.py", line 237, in <module>
    train(opt)
  File "/disks/disk1/remote_src/DocRED/GAIN-master/code/train.py", line 130, in train
    predictions = model(words=d['context_idxs'],
  File "/disks/disk1/envs/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/disks/disk1/envs/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/disks/disk1/envs/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/disks/disk1/envs/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/disks/disk1/envs/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/disks/disk1/envs/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/disks/disk1/envs/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/disks/disk1/remote_src/DocRED/GAIN-master/code/models/GAIN.py", line 332, in forward
    encoder_output = encoder_outputs[i]  # [slen, bert_hid]
IndexError: index 1 is out of bounds for dimension 0 with size 1
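
A plausible mechanism, shown with a minimal probe (assumes a machine with at least two GPUs): DataParallel slices Tensor arguments along dim 0 across replicas but replicates non-tensor arguments, such as a batched DGLGraph, whole to every replica. Each replica then holds encoder outputs for only part of the batch while graph-driven loops still run over the full batch, which would produce exactly this IndexError.

    import torch
    import torch.nn as nn

    class Probe(nn.Module):
        def forward(self, x, meta):
            # x arrives sliced along dim 0; meta arrives whole on every replica.
            print('tensor slice:', tuple(x.shape), '| non-tensor length:', len(meta))
            return x

    model = nn.DataParallel(Probe().cuda())
    model(torch.zeros(4, 3).cuda(), ['doc'] * 4)
    # With 2 GPUs, each replica prints a (2, 3) slice but a length-4 meta list.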

Head/tail entity index question

In path_table, the head and tail entities are stored with their indices already incremented by 1; in h_t_pairs they are likewise stored with indices+1, so in principle the two should match. Yet in the model's forward(), the head/tail entities from h_t_pairs are incremented by 1 again before being matched against path_table. Is this a bug in the code?

CUDA version

Hello, when I run on the GPU I get the error OSError: libcublas.so.10: cannot open shared object file: No such file or directory.
My CUDA version is 11; is version 10.2 strictly required?
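
A quick check of which CUDA toolkit the installed torch wheel was built against (the missing libcublas.so.10 usually means a CUDA 10.x wheel running on a CUDA 11 system):

    import torch
    print(torch.version.cuda)   # e.g. '10.2' => the wheel expects CUDA 10.x libraries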

Spatial features?

I was wondering whether it would be possible to apply the GAIN framework to document images.

Could we perhaps include spatial features, such as the (x, y) position of each entity, alongside the features already used in the model, like the text embeddings and entity type embeddings? (A speculative sketch follows.)
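
A speculative sketch of that idea, outside anything in GAIN itself (all names below are made up): project normalized (x, y) coordinates and concatenate them with each mention's text-derived features before graph aggregation.

    import torch
    import torch.nn as nn

    class SpatialFusion(nn.Module):
        """Fuse 2-D layout coordinates into mention node features."""
        def __init__(self, text_dim: int, out_dim: int, coord_dim: int = 32):
            super().__init__()
            self.coord_proj = nn.Linear(2, coord_dim)
            self.fuse = nn.Linear(text_dim + coord_dim, out_dim)

        def forward(self, text_emb: torch.Tensor, xy: torch.Tensor) -> torch.Tensor:
            # text_emb: [num_mentions, text_dim]; xy: [num_mentions, 2] in [0, 1]
            coords = torch.relu(self.coord_proj(xy))
            return self.fuse(torch.cat([text_emb, coords], dim=-1))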

Training with custom data

Hello,

I have put my dataset in the DocRED format and created the corresponding files train_annotated.json, dev.json, test.json, ner2id.json, and rel2id.json to train a BERT-type architecture. However, in my dataset the number of relations and entities differs from DocRED. I would like to know which files/parameters I would need to modify in order to train with custom data. (A sketch follows below.)
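
One hedged starting point, mirroring DocRED's file layout (names follow the files listed above; whether further preprocessing must be re-run depends on the repo's data pipeline): regenerate the id maps so their sizes agree with the counts set in config.py.

    import json

    # "Na" is DocRED's no-relation label at id 0; adjust both lists to your schema.
    relations = ["Na", "my_relation"]               # 2 classes => relation_nums = 2
    entity_types = ["None", "PER", "LOC", "ORG"]

    with open("rel2id.json", "w") as f:
        json.dump({r: i for i, r in enumerate(relations)}, f)
    with open("ner2id.json", "w") as f:
        json.dump({t: i for i, t in enumerate(entity_types)}, f)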

Best regards

Size mismatch in training

When I run the code through run_GAIN_BERT.sh, I get a size-mismatch problem in GAIN.py when extracting features through edge_layer.
[screenshot]

About the maximum length of the DocRED data

Hi, when using BERT's tokenizer I found that some DocRED documents yield more than 512 subtokens. When the length exceeds 512, data.py handles it as follows; what is the intent of this code?

    if entity2mention[idx] == []:
        # If entity idx ended up with no mentions (e.g. all of them were
        # truncated away), give it one synthetic mention: find the first
        # token slot not yet assigned to any mention and mark it as a
        # mention of entity idx, with the entity's position and NER type.
        entity2mention[idx].append(mention_idx)
        while mention_id[replace_i] != 0:
            replace_i += 1
        mention_id[replace_i] = mention_idx
        pos_id[replace_i] = idx
        ner_id[replace_i] = ner2id[vertex[0]['type']]
        mention_idx += 1

Question about non-entity words.

Section 3.1 (Encoding Module) of the paper says: "We introduce None entity type and id for those words not belonging to any entity".

However, the proposed model does not seem to use any non-entity words.

In the Mention-level Graph Aggregation Module, the graph contains only entity mentions, not non-entity words.

So my question is: are non-entity words simply dropped from the graph's input, or did I overlook some detail of the model?

Thanks!

Question about running ./eval_GAIN_BERT.sh

Hello!
1. When I run ./eval_GAIN_BERT.sh 0 0.7972 with your parameters, the following happens:
[screenshot]
The input_theta of 0.7972 comes from the best epoch of running ./run_GAIN_BERT.sh 1. Why does this happen?
2. Is the error above also the reason that I cannot obtain the result.json file?

About adapting to other datasets

I converted my own dataset into the required format and changed relation_nums in config.py to 2, but training still fails.
The error report is as follows:

Traceback (most recent call last):
  File "train.py", line 194, in <module>
    train(opt)
  File "train.py", line 106, in train
    loss = torch.sum(BCE(predictions, relation_multi_label) * relation_mask.unsqueeze(2))/(opt.relation_nums * torch.sum(relation_mask))
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/loss.py", line 717, in forward
    reduction=self.reduction)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 2824, in binary_cross_entropy_with_logits
    raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
ValueError: Target size (torch.Size([1, 21, 97])) must be the same as input size (torch.Size([1, 21, 2]))
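
The target tensor above still has 97 classes (DocRED's 96 relations plus the Na class) while the model outputs 2, so some part of preprocessing apparently still uses the original relation inventory. A quick consistency check (the rel2id.json path is illustrative):

    import json

    with open("rel2id.json") as f:
        rel2id = json.load(f)
    print(len(rel2id))   # should equal relation_nums in config.py; stale
                         # cached/preprocessed data may also need regenerating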

Test

Just for test.

Error in training

Hi! I'm trying to train the neural network using the default values provided in the script run_GAIN_BERT.sh, but the training script throws an error shortly after it begins. It seems to be an error with DGL. The error is the following:
[screenshot]

Question about text length and the use of other models

First of all, I would like to thank you for your great work and paper!

I have been experimenting with the training scripts you have proposed and they have worked well.

So I would like to ask two questions.

Are there any limits on the length of the sentences/documents that the BERT-encoder variant of the model can process, given that BERT is limited to 512 sub-word units?

And if I would like to experiment with other languages and therefore use other encoders, say bert-base-multilingual-cased or xlm-roberta-base, is it enough to create a folder for these models and download/place the files pytorch_model.bin, vocab.txt, etc. accordingly?

I also imagine it would be necessary to create a GAIN_BERT_MUL training script that points to the folder of the new model and modifies the parameters as required. (A sketch follows.)
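
A hedged sketch of the model-fetching step, assuming the repo loads its encoder through the transformers library (the folder name is illustrative; if the code hard-codes a BertModel class, xlm-roberta-base would also require switching the model class, not just the files):

    from transformers import AutoModel, AutoTokenizer

    name = "bert-base-multilingual-cased"
    # Snapshot the tokenizer and weights locally so bert_path can point here.
    AutoTokenizer.from_pretrained(name).save_pretrained("../PLM/" + name)
    AutoModel.from_pretrained(name).save_pretrained("../PLM/" + name)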

Best regards

Compatibility with CUDA 11

Hello, I am trying to train on a CUDA 11.0 GPU, and I noticed that CUDA 11 is only compatible with dgl>=0.6.
Is there any workaround to train with dgl>=0.6.0? I would also love to hear any insights on how to adapt the code myself to train with the newest dgl version.

dgl_cu111-0.7.1: using this DGL version, I get training errors:

Traceback (most recent call last):
  File "train.py", line 231, in <module>
    train(opt)
  File "train.py", line 125, in train
    predictions = model(words=d['context_idxs'],
  File "/home/anaconda3/envs/qusiyu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/qusiyu/GAIN-master/code/models/GAIN.py", line 304, in forward
    encoder_outputs = torch.cat([encoder_outputs, self.entity_type_emb(params['entity_type'])], dim=-1)
TypeError: expected Tensor as element 0 in argument 0, but got str

Why does running on GPU fail while CPU runs fine?

After setting up the environment and dependencies required by the README and setting gpu_id to 0 (this machine has a single GPU with id 0), running run_GAIN_BERT.sh puts the following in the log: "RuntimeError: CUDA out of memory. Tried to allocate 14.00 MiB (GPU 0; 6.00 GiB total capacity; 4.30 GiB already allocated; 10.44 MiB free; 4.38 GiB reserved in total by PyTorch)", and training cannot continue.
Checking GPU usage with nvidia-smi shows only about 10% utilization, no other programs are running in the background, and GPU memory should be plentiful. My GPU is an RTX 2060 with 6 GB of memory.
I am using CUDA 10.2 and the GPU build of torch 1.6.0, with all other dependency versions matching the README, and the code runs fine on CPU. Where is the problem?
[screenshot]
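
For what it's worth, nvidia-smi polls too late to catch the peak allocation that triggers the OOM; PyTorch's own counters, printed just before the failing step, are more telling. On a 6 GB card, lowering the batch size in run_GAIN_BERT.sh is the usual first remedy.

    import torch

    # Read the allocator state right before the step that dies.
    print(torch.cuda.memory_allocated() / 2**20, "MiB allocated")
    print(torch.cuda.memory_reserved() / 2**20, "MiB reserved by the caching allocator")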

Some problems with the code, and a summary

First, my environment:
dgl 0.6.1, torch 1.8.0
This code seems to have quite a few problems!
(1) In the DGLREDataloader class, isn't the for-loop that calls zero_() on mapping a bit redundant? Wouldn't it be enough to move the tensor declaration from outside the loop into the loop body? (A sketch of the trade-off follows.)
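
The trade-off behind that pattern, in miniature (toy shapes, not the dataloader's real ones):

    import torch

    # What the dataloader does: one pre-allocated buffer, reset in place each
    # iteration, avoiding a fresh allocation per batch.
    buf = torch.zeros(8, 16)
    for _ in range(3):
        buf.zero_()
        # ... fill buf for this batch ...

    # The proposed alternative: allocate a fresh (already-zeroed) tensor per
    # iteration. Simpler, at the cost of a new allocation each time.
    for _ in range(3):
        buf = torch.zeros(8, 16)
        # ... fill buf for this batch ...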
