xuyanfu / tensorflow_rlre


154.0 154.0 49.0 133.74 MB

Reinforcement Learning for Relation Classification from Noisy Data (TensorFlow)

Python 100.00%

reinforcement-learning relation-classification relation-extraction


tensorflow_rlre's Issues

why do you update parameters manually?

How to predict relations for new sentences?

where does all_sentence_ebd used by cnnrlmodel.py come from?

Thanks~

What does fb_mid_e1 in the training file mean?

The training file format is (fb_mid_e1, fb_mid_e2, e1_name, e2_name, relation, sentence).
I see values such as m.04t_bj and m.01l443l for fb_mid_e1 and fb_mid_e2 in the files. What do they mean?
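Values such as m.04t_bj have the shape of Freebase machine IDs (MIDs), which would match the fb_mid prefix in the field names. The six-field format above can be read with a small parser like the sketch below. The tab separator, the `TrainExample` type, and the `parse_train_line` helper are assumptions made for illustration; they are not taken from initial.py.

```python
from typing import NamedTuple


class TrainExample(NamedTuple):
    """One training record, in the field order given in the issue."""
    fb_mid_e1: str
    fb_mid_e2: str
    e1_name: str
    e2_name: str
    relation: str
    sentence: str


def parse_train_line(line: str, sep: str = "\t") -> TrainExample:
    """Split one line into six fields (separator is an assumption)."""
    # maxsplit=5 keeps any separator characters inside the sentence intact.
    fields = line.rstrip("\n").split(sep, 5)
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line[:60]!r}")
    return TrainExample(*fields)
```

Raising a descriptive error on a short line makes a malformed or placeholder input file easier to spot than a bare IndexError.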

get_action and deicde_action

Hello,
Could you explain the difference and the relationship between get_action and deicde_action in rlmodel.py, and what each of these two functions is for? Thanks!
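One common way to split these two roles in policy-gradient code is a stochastic action for exploration during training and a deterministic, thresholded action for final selection. The sketch below illustrates that pattern only. It is an assumption about what the two names might stand for, not a copy of rlmodel.py, and `sample_action`, `greedy_action`, and the 0.5 threshold are hypothetical.

```python
import numpy as np


def sample_action(prob_select: float, rng: np.random.Generator) -> int:
    """Stochastic choice: return 1 (keep the sentence) with probability prob_select."""
    return int(rng.random() < prob_select)


def greedy_action(prob_select: float, threshold: float = 0.5) -> int:
    """Deterministic choice: keep the sentence when the probability reaches the threshold."""
    return int(prob_select >= threshold)
```

Sampling gives the policy gradient a signal from both actions, while the thresholded version gives a reproducible selection once the policy is trained.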

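The precision and recall I get after training are both very low (on the order of 0.01). I followed the steps and do not know what went wrong. Has anyone else run into this?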

test.py bug

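When computing num_total you count all entity pairs that have a positive relation, but when computing recall, negative ones may also be counted, which can push the recall above 1.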
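The point raised here can be made concrete with a minimal sketch in which both the numerator and the denominator of recall are restricted to positive (non-NA) entity pairs, so recall cannot exceed 1. The dictionary inputs, the `precision_recall` name, and the "NA" label for negatives are assumptions for illustration; this is not the code in test.py.

```python
def precision_recall(predictions, gold, na_label="NA"):
    """predictions, gold: dicts mapping an entity pair to a relation label."""
    # Denominator of recall: gold pairs with a positive relation.
    num_total = sum(1 for rel in gold.values() if rel != na_label)
    # Only positive predictions count toward precision.
    predicted_pos = [(pair, rel) for pair, rel in predictions.items() if rel != na_label]
    # A hit must be a positive prediction that matches the gold label,
    # so every hit is also one of the num_total positive gold pairs.
    correct = sum(1 for pair, rel in predicted_pos if gold.get(pair, na_label) == rel)
    precision = correct / len(predicted_pos) if predicted_pos else 0.0
    recall = correct / num_total if num_total else 0.0
    return precision, recall
```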

Meaning of sampletimes = 3

Hello, the RL model part sets sample_times=3. What does this mean? Algorithm 2 in the paper seems to sample only once.

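Large gap between my own test results and the results of the provided model

Hello,
I followed your code and the workflow exactly and trained the models myself. The joint model scores [0.52, 0.53, 0.44333333333333336] in testing, far below the [0.8, 0.735, 0.7066666666666667] of the best_CNN_model you provide. The original CNN model also performs worse than the CNN results listed in your files. Did you use any additional techniques to obtain the best model?
Sorry to bother you.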

missing npy file in ./data

Where are all_sentence_ebd.npy, all_reward.npy, and average_reward.npy? They cannot be generated with initial.py.

selected_cnn_model.ckpt does not seem to be used after it is saved

Hi, the last line of rlmodel.py (line 448) saves selected_cnn_model.ckpt, but the later code does not seem to use this checkpoint. Line 171 of cnnrlmodel.py reads interact = cnnmodel.interaction(sess1,save_path='model/origin_cnn_model.ckpt'). Should origin_cnn_model.ckpt here be replaced with selected_cnn_model.ckpt?

Epoch setting during pre-training

Hello, during the pre-training stage (pre-training the CNN and the RL model), do both networks need to be trained to full convergence?

Error when running test

Hello, running test.py gives:
FileNotFoundError: [Errno 2] No such file or directory: 'data/testall_word.npy'
There is no such npy file in the data folder. Is this file the same as the one in cnndata?

About the sentence size

chosen sentence size: 364691
total_reward: -0.150552
best_reward -0.150552
Excuse me, the total sentence size is 280579, so why is the chosen size 364691, which is larger than the total sentence size?

About the chosen sentence

chosen sentence size: 224149
total_reward: -0.24561
best_reward -0.24561
Excuse me, how can I get the data set of the 224149 (280579) chosen sentences?

Training the RL model

Hello, when I run rlmodel.py, it always selects every sentence in the training set and no sentence filtering takes place. What is going on?
chosen sentence size: 235962
total_reward: -1.0407107
best_reward -1.0406367
chosen sentence size: 235962
total_reward: -1.0407107
best_reward -1.0406367
chosen sentence size: 235962
total_reward: -1.0407109
best_reward -1.0406367
chosen sentence size: 235962
total_reward: -1.040711
best_reward -1.0406367
chosen sentence size: 235962
total_reward: -1.0407109
best_reward -1.0406367

PCNN relation extractor

Hello, this code seems to include only a CNN. Did you not run the experiments with PCNN?

error: failed to fetch some objects from 'https://github.com/unreliableXu/TensorFlow_RLRE.git/info/lfs'

batch response: This repository is over its data quota. Purchase more data packs to restore access.
error: failed to fetch some objects from 'https://github.com/unreliableXu/TensorFlow_RLRE.git/info/lfs'

Any ideas?

I can't find best_model after retraining.

initial.py with error

ub16c9@ub16c9-gpu:/ub16_prj/TensorFlow_RLRE$ python3.6 initial.py
reading train data...
Traceback (most recent call last):
  File "initial.py", line 482, in <module>
    init_entityebd()
  File "initial.py", line 393, in init_entityebd
    en1 = content[2]
IndexError: list index out of range
ub16c9@ub16c9-gpu:/ub16_prj/TensorFlow_RLRE$

Question about parameters updating

Hi,
Thanks for your code. I have a question about parameters updating.
In Algorithm 2 of the paper, after computing the delayed reward for one bag, the original policy network is updated (see: "Update the parameter Theta of instance selector...").
But in rlmodel.py, my understanding is that both the original policy network and the target policy network are updated only after the decision process has run over all bags (line 308 for the original one, line 317 for the target one), and the gradient is the sum over all bags (the += on line 279).
So, am I misunderstanding your code, or does the implementation differ from Algorithm 2?
Thanks.
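The two update schedules contrasted in this question can be sketched on a toy problem. The quadratic loss, the `grad` function, and both helper names below are assumptions made for illustration and have nothing to do with the actual policy network. The sketch only shows that updating after every bag and summing gradients over all bags before one update generally land on different parameters.

```python
import numpy as np


def grad(theta, bag):
    """Toy gradient of 0.5 * ||theta - bag||^2 with respect to theta."""
    return theta - bag


def per_bag_updates(theta, bags, lr):
    """Update the parameters right after each bag."""
    theta = theta.copy()
    for bag in bags:
        theta -= lr * grad(theta, bag)
    return theta


def accumulated_update(theta, bags, lr):
    """Sum the gradients of all bags at fixed parameters, then update once."""
    grad_buffer = np.zeros_like(theta)
    for bag in bags:
        grad_buffer += grad(theta, bag)
    return theta - lr * grad_buffer
```

In the per-bag version each later gradient is evaluated at already-updated parameters, which is the practical difference between the two schedules.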

What are the three variables tvars_best, tvars_old, and gradBuffer used for?

Hello, lines 168-175 of rlmodel.py define three variables in turn: tvars_best, tvars_old, and gradBuffer. What is the role of each of them? Thanks.

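A question about testing

The paper says this is a sentence-level relation extraction model that treats each sentence as a bag for training. Your code seems to test at the bag (entity pair) level. Is this testing approach valid? Thanks.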

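Question about sampletimes = 3

rlmodel defines sampletimes = 3, but the network structure does not change. Does it only recompute prob repeatedly? list_state and list_action should have the same values in all three samples.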

for j in range(sampletimes):
    # reset environment
    state = env.reset(batch_en1, batch_en2, batch_sentence_ebd, batch_reward)
    list_action = []
    list_state = []
    old_prob = []

    # get action
    # start = time.time()
    for i in range(batch_len):
        state_in = np.append(state[0], state[1])
        feed_dict = {}
        feed_dict[myAgent.entity1] = [state[2]]
        feed_dict[myAgent.entity2] = [state[3]]
        feed_dict[myAgent.state_in] = [state_in]
        prob = sess2.run(myAgent.prob, feed_dict=feed_dict)

        old_prob.append(prob[0])
        action = get_action(prob)
        # add produce data for training cnn model
        list_action.append(action)
        list_state.append(state)
        state = env.step(action)
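Whether the three passes produce identical lists depends on how get_action picks an action, and its body is not shown here. If it draws the action at random from prob (an assumption), repeated passes differ even when the network and its probabilities are unchanged. The sketch below uses fixed, made-up probabilities to show this; `probs` and `samples` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed keep-probabilities for 50 sentences (made-up values for illustration).
probs = np.full(50, 0.5)

# Three independent sampling passes over the same probabilities.
samples = [(rng.random(probs.shape) < probs).astype(int) for _ in range(3)]

# Number of sentences kept in each pass.
kept_per_pass = [int(s.sum()) for s in samples]
```

If instead the action were chosen deterministically from prob, the three passes would indeed be identical.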

How to get train data?

In the origin_data/ directory, train.txt contains no data, only "version https://git-lfs.github.com/spec/v1,oid sha256:b2d8f5818c946b2c236f4b64628aa5931e5d61ed2cfc772ce415980105780f86,size 160588287". So, can you tell me how to get the training data?
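The quoted text is the content of a Git LFS pointer file: the real data is stored in LFS and is normally fetched with `git lfs pull`, which depends on the repository's LFS storage being accessible. A small check like the sketch below can tell a pointer file apart from real data before any parsing starts. The `is_lfs_pointer` helper name is hypothetical.

```python
LFS_SIGNATURE = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: str) -> bool:
    """Return True if the file starts with the Git LFS pointer signature."""
    with open(path, "rb") as f:
        return f.read(len(LFS_SIGNATURE)) == LFS_SIGNATURE
```

Calling this on origin_data/train.txt before running the preprocessing would turn a confusing parsing error into a clear message that the data has not been downloaded yet.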

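A question about this work:

While studying your paper, I ran into a question:
your model/method breaks the original distribution of the training set and testing set.

Other RL work adapts the model parameters to fit the data, so it does not change the training data or testing data. This preserves the original distribution of the training set and testing set.

The core of this paper, however, is to use RL to remove noisy bags from the original training data, changing the input data through the label Y. This is fine during training, and it does reduce the interference of noisy data with the classification model. But can it still be done at test time? The test set has no labels, so how is a reward fed back to the policy module to remove bags from the test set? How does the method work in the test phase?

I read the code and found that, in the test phase, the CNN is indeed applied directly to the test set for relation classification.

Thanks.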

best_cnn_P@100,200,300: [0.04, 0.03, 0.02]

I get quite similar scores even if I train the CNN model and then run tests.
Could it possibly be an issue with the test code?
I'm currently using TensorFlow version 1.6.0 on Python 3.5.
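For anyone checking scores like these by hand, precision at N (P@N) is usually computed by ranking predictions by confidence and measuring the fraction of correct ones among the top N. The sketch below is a minimal version under that assumption. The `precision_at_n` helper and its input format are hypothetical and are not taken from test.py.

```python
def precision_at_n(scored, n):
    """scored: list of (confidence, is_correct) pairs, one per prediction."""
    # Rank by confidence, highest first, and keep the top n.
    top = sorted(scored, key=lambda item: item[0], reverse=True)[:n]
    if not top:
        return 0.0
    # Divides by the number of predictions actually available (at most n).
    return sum(1 for _, is_correct in top if is_correct) / len(top)
```

Values near zero at N = 100, 200, 300 would mean almost none of the most confident predictions are correct, which makes the inputs to the ranking worth inspecting first.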