lingluodlut / Att-ChemdNER
License: Apache License 2.0
Successfully installed theano-0.9.0
You are using pip version 9.0.3, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
mldl@mldlUB1604:/ub16_prj/Att-ChemdNER/src$ python train.py --train trainfile --dev devfile --test testfile --pre_emb word_embedding.model
Using TensorFlow backend.
Traceback (most recent call last):
  File "train.py", line 12, in <module>
    from utils import create_input
  File "/home/mldl/ub16_prj/Att-ChemdNER/src/utils.py", line 187, in <module>
    import initializations;
  File "/home/mldl/ub16_prj/Att-ChemdNER/src/initializations.py", line 3, in <module>
    import backend as K
  File "/home/mldl/ub16_prj/Att-ChemdNER/src/backend/__init__.py", line 67, in <module>
    from .tensorflow_backend import *
ImportError: No module named tensorflow_backend
mldl@mldlUB1604:
Hi
The BiLSTM-CRF training code is provided and it generates the model, but how do we update the word2vec model to cover new words?
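One common workaround when fully retraining the word2vec model is impractical is to extend the embedding table with rows for unseen words, initialized near the mean of the existing vectors (gensim also supports true incremental training via `build_vocab(..., update=True)`). The sketch below is illustrative; the vocabulary and dimensions are made up:

```python
import numpy as np

# Hypothetical existing embedding table: vocab of 3 words, dim 4.
vocab = {"aspirin": 0, "induced": 1, "toxicity": 2}
emb = np.random.RandomState(0).randn(3, 4).astype("float32")

def add_new_words(vocab, emb, new_words):
    """Extend the embedding matrix with rows for unseen words.

    New rows are initialized to the mean of existing vectors plus small
    noise -- a common fallback when retraining word2vec is not an option."""
    rng = np.random.RandomState(1)
    rows = [emb]
    for w in new_words:
        if w in vocab:
            continue
        vocab[w] = len(vocab)
        rows.append(emb.mean(axis=0, keepdims=True)
                    + 0.01 * rng.randn(1, emb.shape[1]))
    return vocab, np.concatenate(rows, axis=0)

vocab, emb = add_new_words(vocab, emb, ["ibuprofen"])
```

This only gives new words a usable starting vector; their embeddings are then refined by the BiLSTM-CRF training itself if the embedding layer is not frozen.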
Hi
In the pre-trained model, if we pass the chunk and lemma, it is failing to interpret, was it trained without chunk and lemma? Kindly let know
Hello, thanks for sharing the code. I want to apply your model to my own dataset, so I need to preprocess my documents to fit your model's input format. Can you tell me how to preprocess them, or share the preprocessing code? Thank you a lot.
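A minimal sketch of the usual preprocessing, assuming the model expects CoNLL-style input: one token per line, whitespace-separated columns ending in a BIO tag, with a blank line between sentences. The exact column layout used by Att-ChemdNER should be checked against its sample data; the function and entity format below are illustrative:

```python
def to_bio(text, entities):
    """Convert raw text plus character-offset annotations to BIO lines.

    entities: list of (start, end, label) character offsets."""
    tokens, out = text.split(), []
    pos = 0
    for tok in tokens:
        start = text.index(tok, pos)   # character span of this token
        end = start + len(tok)
        pos = end
        tag = "O"
        for (s, e, label) in entities:
            if start >= s and end <= e:  # token inside the entity span
                tag = ("B-" if start == s else "I-") + label
                break
        out.append(f"{tok}\t{tag}")
    return "\n".join(out)

print(to_bio("Aspirin induced toxicity", [(0, 7, "CHEM")]))
# Aspirin	B-CHEM
# induced	O
# toxicity	O
```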
FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
Traceback (most recent call last):
  File "train.py", line 176, in <module>
    assert os.path.isfile(opts.train)
AssertionError
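This AssertionError simply means the path passed as `--train` does not exist from the current working directory. A quick pre-flight check (the paths below are placeholders for the actual arguments) gives a clearer message than train.py's bare assert:

```python
import os

def missing_inputs(paths):
    """Return the CLI option names whose files do not exist."""
    return [name for name, p in paths.items() if not os.path.isfile(p)]

# Placeholders for the actual --train/--dev/--test arguments:
bad = missing_inputs({"train": "trainfile", "dev": "devfile", "test": "testfile"})
for name in bad:
    print(f"--{name} path does not exist; try passing an absolute path")
```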
Hello, I'd like to ask: after tokenizing the raw chemical corpus with the GENIA tagger, how do you obtain the corresponding BIO labels? The tokenized corpus no longer matches the entity positions given in the annotation file, and some words are even split into pieces, so it is hard to line them up with the original labels. How did you handle this? Thanks.
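A hedged sketch of one common way to recover BIO tags after a tokenizer such as the GENIA tagger re-splits the text: align each sub-token back to its character span in the raw text, then tag it by overlap with the gold entity offsets. The function name and overlap rule are illustrative, not the authors' actual procedure:

```python
def align_bio(text, tokens, entities):
    """Project gold (start, end, label) character spans onto sub-tokens."""
    tags, pos = [], 0
    for tok in tokens:
        start = text.index(tok, pos)   # locate sub-token in raw text
        end = start + len(tok)
        pos = end
        tag = "O"
        for (s, e, label) in entities:
            if start < e and end > s:  # any character overlap counts
                tag = ("B-" if start <= s else "I-") + label
                break
        tags.append(tag)
    return tags

tags = align_bio("2-acetylaminofluorene up-regulates X",
                 ["2-acetylaminofluorene", "up", "-", "regulates", "X"],
                 [(0, 21, "CHEM")])
```

Because alignment is done by character offsets rather than by matching token strings, it survives tokens being split or merged by the tagger.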
Thank you for your work.
My question is: does the code's tagging of new documents only support the CHEMDNER dataset?
I trained a model on the CDR dataset and used it for prediction, and it raises an out-of-range error, mainly at line 226 of loader.py:
ner=[dic_to_id[w[4]] for w in s];
It reports a list index out of range here; after debugging, it seems to be related to the dataset.
Could the author offer any help? Many thanks.
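The crash at `ner=[dic_to_id[w[4]] for w in s]` is consistent with the CDR files having fewer columns per token than the CHEMDNER files: `w[4]` assumes at least five whitespace-separated fields per line. A quick check over the loaded corpus can confirm this before training; the column index and data below are illustrative:

```python
def check_columns(sentences, col=4):
    """Return every token row with too few columns to index `col`."""
    return [w for s in sentences for w in s if len(w) <= col]

# A sentence whose rows carry only four columns would trigger the error:
sentences = [[["aspirin", "NN", "B-NP", "O"]]]
bad = check_columns(sentences)
```

If `bad` is non-empty, either the CDR files need to be reformatted to the CHEMDNER column layout, or the hard-coded index in loader.py needs to point at the actual tag column.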
The error is as above; the problem turns out to be in the link function in nn.py, and I am not sure what kind of input it expects.
Taking Figure 1 and my error as an example:
word_input = word_layer.link(word_ids)
Here the argument word_ids is a theano.tensor.ivector, i.e. a one-dimensional vector of ints.
The error is raised as soon as it enters the link function (the input parameter shows up as type 'TensorVariable').
----------------------------------------- Installed packages
absl-py 1.2.0 pypi_0 pypi
astor 0.8.1 pypi_0 pypi
backend 0.2.4.1 pypi_0 pypi
blas 1.0 mkl https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
ca-certificates 2022.07.19 haa95532_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
certifi 2022.9.14 py37haa95532_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
cython 0.29.28 pypi_0 pypi
gast 0.2.2 pypi_0 pypi
gensim 4.2.0 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
grpcio 1.48.1 pypi_0 pypi
h5py 3.7.0 pypi_0 pypi
importlib-metadata 4.12.0 pypi_0 pypi
intel-openmp 2021.4.0 haa95532_3556 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
keras 2.2.5 pypi_0 pypi
keras-applications 1.0.8 pypi_0 pypi
keras-preprocessing 1.1.2 pypi_0 pypi
markdown 3.4.1 pypi_0 pypi
markupsafe 2.1.1 pypi_0 pypi
mkl 2021.4.0 haa95532_640 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
mkl-service 2.4.0 py37h2bbff1b_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
numpy 1.21.6 pypi_0 pypi
openssl 1.1.1q h2bbff1b_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
opt-einsum 3.3.0 pypi_0 pypi
pip 22.1.2 py37haa95532_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
protobuf 3.20.1 pypi_0 pypi
python 3.7.13 h6244533_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
pyyaml 6.0 pypi_0 pypi
scipy 1.7.3 pypi_0 pypi
setuptools 65.3.0 pypi_0 pypi
six 1.16.0 pyhd3eb1b0_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
smart-open 6.2.0 pypi_0 pypi
sqlite 3.39.2 h2bbff1b_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
tensorboard 1.15.0 pypi_0 pypi
tensorflow 1.15.0 pypi_0 pypi
tensorflow-estimator 1.15.1 pypi_0 pypi
termcolor 2.0.1 pypi_0 pypi
theano 1.0.5 pypi_0 pypi
typing-extensions 4.3.0 pypi_0 pypi
vc 14.2 h21ff451_1 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
vs2015_runtime 14.27.29016 h5e58377_2 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
werkzeug 2.2.2 pypi_0 pypi
wheel 0.37.1 pyhd3eb1b0_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
wincertstore 0.2 py37haa95532_2 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
wrapt 1.14.1 pypi_0 pypi
zipp 3.8.1 pypi_0 pypi
def step(self, state, attended, source):
    # Attention step: score the current state against the attended
    # sequence, normalize with softmax, and take the weighted sum
    # ("glimpse") of `source`.
    _energy = self.scoreFun(attended, state, self.W_A)
    energy = T.nnet.softmax(_energy)
    glimpsed = (energy.T * source).sum(axis=0)
    return energy.flatten(), glimpsed

def link(self, attended, state, source):
    step_function = self.step
    # Project the attended sequence before scoring.
    attended_ = T.tanh(T.dot(attended, self.W_A_X)) + self.b_A_X
    # scan iterates over attended_ row by row (the `state` argument of
    # step), while the full attended_ and source arrive as non_sequences.
    [energy, glimpsed], _ = theano.scan(fn=step_function,
                                        sequences=[attended_],
                                        outputs_info=None,
                                        non_sequences=[attended_, source])
    self.energy = energy
    # Combine the glimpse with the source representation.
    combine = T.concatenate([glimpsed, source], axis=-1)
    combined = T.tanh(T.dot(combine, self.W_A_combine)) + self.b_A_combine
    return combined
In model.py this link function is called with the arguments (final_output, final_c, final_output). Inside link there is a scan that passes attended_, attended_, source into step_function, i.e. the same attended_ twice. So wouldn't _energy=self.scoreFun(attended,state,self.W_A) come out as zero? If attended and state are equal, isn't a distance formula between them zero? A beginner's question here; an explanation would be greatly appreciated!
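Two things are worth noting about the concern above. First, scan feeds step one row of attended_ at a time (as state) while the full attended_ matrix arrives via non_sequences, so the two arguments are not identical. Second, a bilinear score of the form used with W_A is not a distance, so it does not vanish even for identical inputs. A small numpy check of the second point (the shapes and W_A values below are arbitrary):

```python
import numpy as np

rng = np.random.RandomState(0)
x = rng.randn(4)       # plays the role of both attended and state
W_A = rng.randn(4, 4)  # arbitrary bilinear weight matrix

# A bilinear score x^T W_A x is generally non-zero even when the two
# sides are the same vector, unlike a distance d(x, x) = 0.
score = x @ W_A @ x
```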
When your model reached its best performance, what was sentencesLevelLoss set to (True or False)? And what exactly is the difference between computing the loss at the sentence level versus at the document level? Many thanks.
Hi
It is mentioned that the Jochem dictionary is used as an additional feature with a 5-D input, but in the code I couldn't find where that lookup and matching is done. Could you please point it out?
Could you please also check the code for all the paths? I seem to be missing something.
Thanks
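For reference, a sketch of how such a 5-D dictionary feature is typically built for NER. The five dimensions here follow a common BIESO match scheme against dictionary entries, longest match first; this is an assumption about the feature's layout, not the repo's actual lookup code, and the function name and data are made up:

```python
def dict_feature(tokens, dictionary):
    """One 5-D indicator per token: [B, I, E, S, O] dictionary match."""
    feats = [[0, 0, 0, 0, 1] for _ in tokens]      # default: O (no match)
    n = len(tokens)
    for i in range(n):
        for j in range(n, i, -1):                  # longest match first
            if " ".join(tokens[i:j]).lower() in dictionary:
                if j - i == 1:
                    feats[i] = [0, 0, 0, 1, 0]     # S: single-token match
                else:
                    feats[i] = [1, 0, 0, 0, 0]     # B: match begins
                    for k in range(i + 1, j - 1):
                        feats[k] = [0, 1, 0, 0, 0] # I: inside the match
                    feats[j - 1] = [0, 0, 1, 0, 0] # E: match ends
                break
    return feats

feats = dict_feature(["acetyl", "salicylic", "acid", "binds"],
                     {"acetyl salicylic acid"})
```

These vectors would then be concatenated to the word embeddings as extra input dimensions.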
Hello, could you provide a new PyTorch version of this code?
I can't build it in my environment. Thank you.
Hi there,
I am having an issue running Att-ChemdNER.
I have installed the packages listed above.
After running the command line in the readme:
python train.py --train trainfile --dev devfile --test testfile --pre_emb word_embedding.model
I am getting this error:
Using TensorFlow backend.
Traceback (most recent call last):
File "train.py", line 12, in <module>
from utils import create_input
File "/home/fdaha/virt_env_att/Att-ChemdNER-master/src/utils.py", line 187, in <module>
import initializations;
File "/home/fdaha/virt_env_att/Att-ChemdNER-master/src/initializations.py", line 3, in <module>
import backend as K
File "/home/fdaha/virt_env_att/Att-ChemdNER-master/src/backend/__init__.py", line 67, in <module>
from .tensorflow_backend import *
ImportError: No module named tensorflow_backend