
macanv / bert-bilstm-crf-ner

TensorFlow solution for the NER task, using a BiLSTM-CRF model with Google BERT fine-tuning, plus a private server service

Home Page: https://github.com/macanv/BERT-BiLSMT-CRF-NER

Python 96.96% Shell 0.01% Perl 3.03%
bert ner named-entity-recognition blstm crf bert-bilstm-crf

bert-bilstm-crf-ner's Introduction

BERT-BiLSTM-CRF-NER

TensorFlow solution for the NER task, using a BiLSTM-CRF model with Google BERT fine-tuning

TensorFlow code for Chinese named entity recognition, using Google's pre-trained BERT model together with a BLSTM-CRF model.

Chinese documentation: https://blog.csdn.net/macanv/article/details/85684284. If this project helps you, please give it a star, thanks!

Welcome to star this repository!

The Chinese training data ($PATH/NERdata/) come from: https://github.com/zjy-ucas/ChineseNER

The CoNLL-2003 data ($PATH/NERdata/ori/) come from: https://github.com/kyzhouhzau/BERT-NER

The evaluation code comes from: https://github.com/guillaumegenthial/tf_metrics/blob/master/tf_metrics/__init__.py

This project implements NER based on Google's BERT code and a BiLSTM-CRF network. It is geared toward Chinese data, but other languages only require modifying a small amount of code.

THIS PROJECT ONLY SUPPORTS Python 3.

Download project and install

You can install this project by:

pip install bert-base==0.0.9 -i https://pypi.python.org/simple

OR

git clone https://github.com/macanv/BERT-BiLSTM-CRF-NER
cd BERT-BiLSTM-CRF-NER/
python3 setup.py install

If you do not want to install the package, just clone this project and refer to run.py to train the model or start the service.

UPDATE:

  • 2020.2.6: Added simple Flask NER service code
  • 2019.2.25: Fixed some bugs in the NER service
  • 2019.2.19: Added the text classification service
  • Fixed the "Missing loss" error
  • Added a label_list parameter to the training process, so you can use -label_list xxx to specify the labels used during training.

Train model:

You can use -help to view the relevant parameters for training the named entity recognition model; data_dir, bert_config_file, output_dir, init_checkpoint, and vocab_file must be specified.

bert-base-ner-train -help

The train/dev/test datasets look like this:

海 O
钓 O
比 O
赛 O
地 O
点 O
在 O
厦 B-LOC
门 I-LOC
与 O
金 B-LOC
门 I-LOC
之 O
间 O
的 O
海 O
域 O
。 O

The first item on each line is a token and the second is the token's label; sentences are separated by a blank line. The maximum sentence length is controlled by the max_seq_length parameter.
You can get training data from the two repositories listed above. A minimal parsing sketch is shown below.
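For reference, here is a minimal sketch (not project code; the helper name read_bio_file is made up for illustration) of how a file in this format can be parsed into sentences of (token, label) pairs:

# Minimal illustrative sketch (not part of this project): parse a BIO-formatted
# file in the format shown above into (tokens, labels) sentence pairs.
def read_bio_file(path):
    sentences, tokens, labels = [], [], []
    with open(path, encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if not line:                      # a blank line ends the current sentence
                if tokens:
                    sentences.append((tokens, labels))
                    tokens, labels = [], []
                continue
            parts = line.split()
            tokens.append(parts[0])           # first column: token
            labels.append(parts[-1])          # last column: label
    if tokens:                                # flush the last sentence
        sentences.append((tokens, labels))
    return sentences

# e.g. sentences = read_bio_file('NERdata/train.txt')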
You can train the NER model by running the command below:

bert-base-ner-train \
    -data_dir {your dataset dir}\
    -output_dir {training output dir}\
    -init_checkpoint {Google BERT model dir}\
    -bert_config_file {bert_config.json under the Google BERT model dir} \
    -vocab_file {vocab.txt under the Google BERT model dir}

For example, my init_checkpoint:

init_checkpoint = F:\chinese_L-12_H-768_A-12\bert_model.ckpt

You can specify the labels with the -label_list parameter; otherwise, the project collects the labels from the training data.

# labels separated by commas
-labels 'B-LOC, I-LOC ...'
# or save the labels in a file such as labels.txt, one label per line
-labels labels.txt
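For example, a labels.txt file (contents purely illustrative) simply lists one label per line:

B-LOC
I-LOC
B-PER
I-PER
B-ORG
I-ORG
O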

After training, the NER model will be saved in the {output_dir} you specified on the command line above.

My training environment: Tesla P40, 24 GB memory

As Service

Much of the server and client code comes from hanxiao's excellent open-source project bert-as-service. If my code violates any license agreement, please let me know and I will correct it immediately. The NER server/client code can be applied to other tasks, such as text categorization, with simple modifications, which I will provide later. This project provides private Named Entity Recognition and Text Classification server services. Feel free to submit requests or to share your own model if you would like it shared on GitHub or in my work.

You can use -help to view the relevant parameters of the NER service; model_dir and bert_model_dir are required.

bert-base-serving-start -help

Then you can start the NER service with the command below:

bert-base-serving-start \
    -model_dir C:\workspace\python\BERT_Base\output\ner2 \
    -bert_model_dir F:\chinese_L-12_H-768_A-12 \
    -model_pb_dir C:\workspace\python\BERT_Base\model_pb_dir \
    -mode NER

or the text classification service:

bert-base-serving-start \
    -model_dir C:\workspace\python\BERT_Base\output\ner2 \
    -bert_model_dir F:\chinese_L-12_H-768_A-12 \
    -model_pb_dir C:\workspace\python\BERT_Base\model_pb_dir \
    -mode CLASS \
    -max_seq_len 202

As you can see:
mode: if mode is NER/CLASS, the Named Entity Recognition / Text Classification service is started; if it is BERT, the service behaves the same as the bert-as-service project.
bert_model_dir: the Google BERT model directory, which you can download from https://github.com/google-research/bert
ner_model_dir: your NER model checkpoint directory
model_pb_dir: the directory where the frozen model is saved; after the optimize function runs, it will contain a binary file such as ner_model.pb

You can download my NER model from https://pan.baidu.com/s/1m9VcueQ5gF-TJc00sFD88w (extraction code: guqq), or the text classification model from https://pan.baidu.com/s/1oFPsOUh1n5AM2HjDIo2XCw (extraction code: bbu8).
Put ner_model.pb/classification_model.pb in model_pb_dir and the other files in model_dir. Different models need to be stored separately: for example, put the NER model's label_list.pkl and label2id.pkl in model_dir/ner/ and the text classification files in model_dir/text_classification, as illustrated below. The text classification model can classify 12 categories of Chinese data: '游戏', '娱乐', '财经', '时政', '股票', '教育', '社会', '体育', '家居', '时尚', '房产', '彩票'.
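An illustrative directory layout (names chosen only for this example, not prescribed by the project) might look like this:

model_pb_dir/
    ner_model.pb
    classification_model.pb
model_dir/
    ner/
        label_list.pkl
        label2id.pkl
        (NER checkpoint files ...)
    text_classification/
        (classification model files ...)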

You will then see the service startup information.

You can test the service with the client code below:

1. NER Client

import time
from bert_base.client import BertClient

with BertClient(show_server_config=False, check_version=False, check_length=False, mode='NER') as bc:
    start_t = time.perf_counter()
    str = '1月24日,新华社对外发布了**对雄安新区的指导意见,洋洋洒洒1.2万多字,17次提到北京,4次提到天津,信息量很大,其实也回答了人们关心的很多问题。'
    rst = bc.encode([str, str])
    print('rst:', rst)
    print(time.perf_counter() - start_t)

You will see the recognition results after running the code above. If you want to customize the word segmentation method, you only need to make the following simple change in the client-side code:

rst = bc.encode([list(str), list(str)], is_tokenized=True)
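For instance, word-level segmentation can be passed in the same way. The sketch below assumes the jieba package is installed; jieba is not a dependency of this project and is only one possible tokenizer:

import jieba  # hypothetical choice of tokenizer, not required by this project

texts = [str, str]                          # `str` is the sentence defined above
tokenized = [jieba.lcut(t) for t in texts]  # word-level tokens instead of characters
rst = bc.encode(tokenized, is_tokenized=True)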

2. Text Classification Client

with BertClient(show_server_config=False, check_version=False, check_length=False, mode='CLASS') as bc:
    start_t = time.perf_counter()
    str1 = '北京时间2月17日凌晨,第69届柏林国际电影节公布主竞赛单元获奖名单,王景春、咏梅凭借王小帅执导的**影片《地久天长》连夺最佳男女演员双银熊大奖,这是**演员首次包揽柏林电影节最佳男女演员奖,为华语影片刷新纪录。与此同时,由青年导演王丽娜执导的影片《第一次的别离》也荣获了本届柏林电影节新生代单元国际评审团最佳影片,可以说,在经历数个获奖小年之后,**电影在柏林影展再次迎来了高光时刻。'
    str2 = '受粤港澳大湾区规划纲要提振,港股周二高开,恒指开盘上涨近百点,涨幅0.33%,报28440.49点,相关概念股亦集体上涨,电子元件、新能源车、保险、基建概念多数上涨。粤泰股份、珠江实业、深天地A等10余股涨停;中兴通讯、丘钛科技、舜宇光学分别高开1.4%、4.3%、1.6%。比亚迪电子、比亚迪股份、光宇国际分别高开1.7%、1.2%、1%。越秀交通基建涨近2%,粤海投资、碧桂园等多股涨超1%。其他方面,日本软银集团股价上涨超0.4%,推动日经225和东证指数齐齐高开,但随后均回吐涨幅转跌东证指数跌0.2%,日经225指数跌0.11%,报21258.4点。受芯片制造商SK海力士股价下跌1.34%拖累,韩国综指下跌0.34%至2203.9点。澳大利亚ASX 200指数早盘上涨0.39%至6089.8点,大多数行业板块均现涨势。在保健品品牌澳佳宝下调下半财年的销售预期后,其股价暴跌超过23%。澳佳宝CEO亨弗里(Richard Henfrey)认为,公司下半年的利润可能会低于上半年,主要是受到销售额疲弱的影响。同时,亚市早盘澳洲联储公布了2月会议纪要,政策委员将继续谨慎评估经济增长前景,因前景充满不确定性的影响,稳定当前的利率水平比贸然调整利率更为合适,而且当前利率水平将有利于趋向通胀目标及改善就业,当前劳动力市场数据表现强势于其他经济数据。另一方面,经济增长前景亦令消费者消费意愿下滑,如果房价出现下滑,消费可能会进一步疲弱。在澳洲联储公布会议纪要后,澳元兑美元下跌近30点,报0.7120 。美元指数在昨日触及96.65附近的低点之后反弹至96.904。日元兑美元报110.56,接近上一交易日的低点。'
    str3 = '新京报快讯 据国家市场监管总局消息,针对媒体报道水饺等猪肉制品检出非洲猪瘟病毒核酸阳性问题,市场监管总局、农业农村部已要求企业立即追溯猪肉原料来源并对猪肉制品进行了处置。两部门已派出联合督查组调查核实相关情况,要求猪肉制品生产企业进一步加强对猪肉原料的管控,落实检验检疫票证查验规定,完善非洲猪瘟检测和复核制度,防止染疫猪肉原料进入食品加工环节。市场监管总局、农业农村部等部门要求各地全面落实防控责任,强化防控措施,规范信息报告和发布,对不按要求履行防控责任的企业,一旦发现将严厉查处。专家认为,非洲猪瘟不是人畜共患病,虽然对猪有致命危险,但对人没有任何危害,属于只传猪不传人型病毒,不会影响食品安全。开展猪肉制品病毒核酸检测,可为防控溯源工作提供线索。'
    rst = bc.encode([str1, str2, str3])
    print('rst:', rst)
    print('time used:{}'.format(time.perf_counter() - start_t))

You will see the classification results after running the code above.

Note that the NER service and the text classification service cannot be started together in one process, but you can run the command line twice to start the NER service and the text classification service on different ports, as sketched below.
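For example, assuming the server exposes the -port / -port_out options inherited from bert-as-service (an assumption; check bert-base-serving-start -help to confirm), the two services could run side by side roughly like this:

bert-base-serving-start -mode NER   -model_dir {ner model dir}   -bert_model_dir {BERT dir} -model_pb_dir {ner pb dir}   -port 5555 -port_out 5556
bert-base-serving-start -mode CLASS -model_dir {class model dir} -bert_model_dir {BERT dir} -model_pb_dir {class pb dir} -port 5575 -port_out 5576 -max_seq_len 202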

Flask server service

Sometimes a multi-threaded deep learning model service does not need the C/S (client/server) setup; you can use a simple HTTP service instead, for example with Flask. See bert_base/server/simple_flask_http_service.py to build your own simple HTTP service.
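The snippet below is only an illustrative sketch of the idea (it is not the project's simple_flask_http_service.py and may differ from it). It assumes the ZeroMQ-based NER service above is already running and simply wraps BertClient in a Flask endpoint; the route name and port are arbitrary:

# Illustrative sketch only -- see bert_base/server/simple_flask_http_service.py
# for the project's actual implementation.
from flask import Flask, jsonify, request
from bert_base.client import BertClient

app = Flask(__name__)
# Assumes a NER service started with bert-base-serving-start -mode NER is reachable
# with the client's default connection settings.
bc = BertClient(show_server_config=False, check_version=False, check_length=False, mode='NER')

@app.route('/ner', methods=['POST'])
def ner():
    # Expects a JSON body like {"texts": ["sentence 1", "sentence 2"]}
    texts = request.get_json().get('texts', [])
    rst = bc.encode(texts)
    return jsonify({'result': rst})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8000)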

License

MIT.

The following tutorial describes an old version and will be removed in the future.

How to train

1. Download the BERT Chinese model:

wget https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip  

2. Create the output directory

Create an output directory in the project path:

mkdir output

3. Train model

First method:

  python3 bert_lstm_ner.py \
      --task_name="NER" \
      --do_train=True \
      --do_eval=True \
      --do_predict=True \
      --data_dir=NERdata \
      --vocab_file=checkpoint/vocab.txt \
      --bert_config_file=checkpoint/bert_config.json \
      --init_checkpoint=checkpoint/bert_model.ckpt \
      --max_seq_length=128 \
      --train_batch_size=32 \
      --learning_rate=2e-5 \
      --num_train_epochs=3.0 \
      --output_dir=./output/result_dir/
Or replace the BERT path and project path in bert_lstm_ner.py:
if os.name == 'nt': #windows path config
   bert_path = '{your BERT model path}'
   root_path = '{project path}'
else: # linux path config
   bert_path = '{your BERT model path}'
   root_path = '{project path}'

Then run:

python3 bert_lstm_ner.py

USING BLSTM-CRF OR CRF-ONLY FOR DECODING

Just change the add_blstm_crf_layer call around line 450 of bert_lstm_ner.py, setting its crf_only parameter to True or False.

CRF-only output layer:

    blstm_crf = BLSTM_CRF(embedded_chars=embedding, hidden_unit=FLAGS.lstm_size, cell_type=FLAGS.cell, num_layers=FLAGS.num_layers,
                          dropout_rate=FLAGS.droupout_rate, initializers=initializers, num_labels=num_labels,
                          seq_length=max_seq_length, labels=labels, lengths=lengths, is_training=is_training)
    rst = blstm_crf.add_blstm_crf_layer(crf_only=True)

BiLSTM with CRF output layer

    blstm_crf = BLSTM_CRF(embedded_chars=embedding, hidden_unit=FLAGS.lstm_size, cell_type=FLAGS.cell, num_layers=FLAGS.num_layers,
                          dropout_rate=FLAGS.droupout_rate, initializers=initializers, num_labels=num_labels,
                          seq_length=max_seq_length, labels=labels, lengths=lengths, is_training=is_training)
    rst = blstm_crf.add_blstm_crf_layer(crf_only=False)

Result:

All parameters use their default values.

On the dev set:

On the test set:

Entity-level result:

The last two results are label-level results; the entity-level result is computed at lines 796-798 of the code and is printed during prediction. My entity-level result is shown below:

My model can be downloaded from Baidu Cloud:
Link: https://pan.baidu.com/s/1GfDFleCcTv5393ufBYdgqQ  Extraction code: 4cus
NOTE: my model was trained with the crf_only parameter.

ONLINE PREDICT

Once the model has finished training, just run:

python3 terminal_predict.py

Using NER as Service

Service

Using NER as a service is simple; you just need to run the Python script below in the project root path:

python3 runs.py \
    -mode NER \
    -bert_model_dir /home/macan/ml/data/chinese_L-12_H-768_A-12 \
    -ner_model_dir /home/macan/ml/data/bert_ner \
    -model_pd_dir /home/macan/ml/workspace/BERT_Base/output/predict_optimizer \
    -num_worker 8

You can download my NER model from https://pan.baidu.com/s/1m9VcueQ5gF-TJc00sFD88w (extraction code: guqq).
Put ner_model.pb in model_pd_dir, put the other files in ner_model_dir, and then run the command above.

Client

For client usage, refer to the client_test.py script:

import time
from client.client import BertClient

ner_model_dir = r'C:\workspace\python\BERT_Base\output\predict_ner'  # raw string so backslashes are not treated as escapes
with BertClient( ner_model_dir=ner_model_dir, show_server_config=False, check_version=False, check_length=False, mode='NER') as bc:
    start_t = time.perf_counter()
    str = '1月24日,新华社对外发布了**对雄安新区的指导意见,洋洋洒洒1.2万多字,17次提到北京,4次提到天津,信息量很大,其实也回答了人们关心的很多问题。'
    rst = bc.encode([str])
    print('rst:', rst)
    print(time.perf_counter() - start_t)

NOTE: for the input format, you can also refer to the bert-as-service project.
Client code in other languages, such as Java, is welcome.

Using your own data for training

If you want to use your own data to train the NER model, just modify the get_labels function.

def get_labels(self):
       return ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "X", "[CLS]", "[SEP]"]

NOTE: "X", “[CLS]”, “[SEP]” These three are necessary, you just replace your data label to this return list.
Or you can use the code below to let the program automatically collect the labels from the training data:

def get_labels(self):
        # Getting the labels by reading the train file carries some risk.
        if os.path.exists(os.path.join(FLAGS.output_dir, 'label_list.pkl')):
            with codecs.open(os.path.join(FLAGS.output_dir, 'label_list.pkl'), 'rb') as rf:
                self.labels = pickle.load(rf)
        else:
            if len(self.labels) > 0:
                self.labels = self.labels.union(set(["X", "[CLS]", "[SEP]"]))
                with codecs.open(os.path.join(FLAGS.output_dir, 'label_list.pkl'), 'wb') as rf:
                    pickle.dump(self.labels, rf)
            else:
                self.labels = ["O", 'B-TIM', 'I-TIM', "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "X", "[CLS]", "[SEP]"]
        return self.labels

NEW UPDATE

2019.1.30: Support pip install and command-line control

2019.1.30: Added service/client code for the NER process

2019.1.9: Added code to strip the Adam-related parameters from the model, reducing the model file size from 1.3 GB to 400 MB.

2019.1.3: Added online prediction code


For any problem, please open an issue or email me ([email protected]).

bert-bilstm-crf-ner's People

Contributors

lonelyhentxi, macanv, oasis-0927, scievan, wenlisong


bert-bilstm-crf-ner's Issues

Runtime error: checkpoint/bert_config.json file not found

c_api.TF_GetCode(self.status.status))

tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: checkpoint/bert_config.json : The system cannot find the file specified.
; No such file or directory

When computing the real sequence lengths, input_mask should be used instead of input_ids

When calling the BLSTM + CRF layer, the real sequence lengths must be passed in, but the code computes them from input_ids:
used = tf.sign(tf.abs(input_ids))
lengths = tf.reduce_sum(used, reduction_indices=1)
Shouldn't this be changed to:
used = tf.sign(tf.abs(input_mask))
lengths = tf.reduce_sum(used, reduction_indices=1)
In input_mask, real tokens are marked with 1 and padding with 0, so reduce_sum only yields the true sequence lengths when applied to the mask.

Got it running!

After reducing train_batch_size to 16, I got it running on an 11 GB GTX 1080Ti, with 99% accuracy. Thanks!

The handling of spaces when reading data seems problematic

If the original text in the data contains a space, the read_data function raises an error while reading the data.
[screenshot: the original data]
[screenshot: the predicted result]

The cause should be that the read_data and convert_single_example functions split strings on spaces; spaces need to be replaced in the corpus beforehand.

Questions about training my own model

Suppose I want to do NER for plant and animal nouns; roughly how large a corpus do I need as a training set?

I annotated plant names (B-PLANT, I-PLANT) and animal names (B-AN, I-AN) in a dozen or so articles, appended them to NERdata/train.txt, and then trained a model with bert_lstm_ner.py.
When I used the model to extract plant and animal names, it could recognize ORG, LOC, and PER, but not PLANT or AN.
I'm not sure what the problem is. Is the training set too small?

Evaluation metrics

@macanv Hello. I'm a bit confused about the evaluation metrics in your project. The code seems to compute F1 per tag (e.g. B-LOC, I-LOC, B-ORG, and I-ORG are computed separately), but NER is usually evaluated per entity type (e.g. LOC, ORG). The correct evaluation script conlleval bundled with the project does not seem to be used.

How good are the results you got?

The NER results I have tried are not very good; for the same Chinese text, the results differ on every run. Has anyone else who has used this seen better results?

where is bert_lstm_ner.py?

Where is bert_lstm_ner.py? I want to train on my own data, but I cannot find bert_lstm_ner.py. Should I run the bert_lstm_ner.py from the Python 3.6 site-packages?

Has anyone else been unable to load the dev/test data?

I tried running the main program, but every time the dev/test data fails to load, while train has no such problem. I checked the code and found nothing wrong. Has anyone run into the same problem? Any help would be appreciated; I may have missed something. Thanks!
[screenshots: 2018-12-11 11:09 pm]

Error when training NER from the command line: AttributeError: module 'tensorflow.data' has no attribute 'experimental' (TensorFlow 1.10.0; 1.11.0 and 1.12.0 were also tried and fail with the same error)

Hello, I ran into the problem below; could you please help? Thanks!
1. I tried the command-line approach but keep getting the error below. My TensorFlow version is 1.10.0, and I also tried 1.11.0 and 1.12.0; none of them work. I used the original training corpus train.txt that you provided. The error message is:
Traceback (most recent call last):
File "/home/software/anaconda3/bin/bert-base-ner-train", line 11, in
sys.exit(train_ner())
File "/home/software/anaconda3/lib/python3.6/site-packages/bert_base/runs/init.py", line 37, in train_ner
train(args=args)
File "/home/software/anaconda3/lib/python3.6/site-packages/bert_base/train/bert_lstm_ner.py", line 616, in train
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
File "/home/software/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 451, in train_and_evaluate
return executor.run()
File "/home/software/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 590, in run
return self.run_local()
File "/home/software/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 691, in run_local
saving_listeners=saving_listeners)
File "/home/software/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 376, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/home/software/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1145, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/home/software/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1167, in _train_model_default
input_fn, model_fn_lib.ModeKeys.TRAIN))
File "/home/software/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1011, in _get_features_and_labels_from_input_fn
result = self._call_input_fn(input_fn, mode)
File "/home/software/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1100, in _call_input_fn
return input_fn(**kwargs)
File "/home/software/anaconda3/lib/python3.6/site-packages/bert_base/train/bert_lstm_ner.py", line 329, in input_fn
d = d.apply(tf.data.experimental.map_and_batch(lambda record: _decode_record(record, name_to_features),
AttributeError: module 'tensorflow.data' has no attribute 'experimental'

Model convergence

How many epochs does this model need before it starts to converge, and how large is the training loss at that point? I have recently been doing NER on tens of millions of examples and the model never converges, even after a full pass over all the data. Any pointers?

Model training problem

Hello, when training the model with the new version I hit the following error and cannot find the cause. Help appreciated:
2019-02-11 02:01:02.791404: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-02-11 02:01:03.806589: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-02-11 02:01:03.806652: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2019-02-11 02:01:03.806674: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2019-02-11 02:01:03.806956: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2019-02-11 02:01:03.807062: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10754 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
2019-02-11 02:02:25.470659: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
shape of input_ids (?, 128)
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/metrics_impl.py:259: to_int64 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/bert_base/train/tf_metrics.py:141: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Traceback (most recent call last):
File "/usr/local/bin/bert-base-ner-train", line 10, in
sys.exit(train_ner())
File "/usr/local/lib/python3.6/dist-packages/bert_base/runs/init.py", line 37, in train_ner
train(args=args)
File "/usr/local/lib/python3.6/dist-packages/bert_base/train/bert_lstm_ner.py", line 616, in train
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/training.py", line 471, in train_and_evaluate
return executor.run()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/training.py", line 610, in run
return self.run_local()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/training.py", line 711, in run_local
saving_listeners=saving_listeners)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 354, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1183, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1217, in _train_model_default
saving_listeners)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1411, in _train_with_estimator_spec
_, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 788, in exit
self._close_internal(exception_type)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 821, in _close_internal
h.end(self._coordinated_creator.tf_sess)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/basic_session_run_hooks.py", line 588, in end
self._save(session, last_step)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/basic_session_run_hooks.py", line 607, in _save
if l.after_save(session, step):
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/training.py", line 517, in after_save
self._evaluate(global_step_value) # updates self.eval_result
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/training.py", line 537, in _evaluate
self._evaluator.evaluate_and_export())
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/training.py", line 912, in evaluate_and_export
hooks=self._eval_spec.hooks)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 474, in evaluate
return _evaluate()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 460, in _evaluate
self._evaluate_build_graph(input_fn, hooks, checkpoint_path))
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1424, in _evaluate_build_graph
self._call_model_fn_eval(input_fn, self.config))
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1460, in _call_model_fn_eval
features, labels, model_fn_lib.ModeKeys.EVAL, config)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1171, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/bert_base/train/bert_lstm_ner.py", line 421, in model_fn
eval_metric_ops=eval_metrics
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/model_fn.py", line 194, in new
raise ValueError('Missing loss.')
ValueError: Missing loss.

Is tf.sequence_mask being used incorrectly?

Hello. In the metric_fn function of bert_lstm_ner.py, is weight=tf.sequence_mask(FLAGS.max_seq_length) a mistake?
The official API is tf.sequence_mask(lengths, maxlen=None, dtype=tf.bool, name=None); it should receive a lengths argument, i.e. a list of length batch_size recording the true length of each sequence in the batch.
Looking forward to a reply!

A small suggestion about entity labels

First of all, thanks to the author for sharing. Entity labels are currently obtained by reading the train file; please consider exposing a configuration option or interface so that users can set them as needed. This mainly matters when training a custom model with custom labels.

droupout typo

This issue is tiny, just FYI: dropout is misspelled as droupout.

UnboundLocalError = RCV1 attempt

I'm looking at ways to fine tune with RCV1 as opposed to the supplied dev.txt, train.txt and test.txt.

I also use cased_L-24_H-1024_A-16 as opposed to the Chinese provided one.

Whether I rename the three new files (RCV1, as found and compiled here in the second reply) as dev.txt, train.txt and test.txt or I change them from dev.txt, train.txt and test.txt to eng.testa, eng.testb and eng.train in bert_lstm_ner.py I am faced with this following error:

UnboundLocalError: local variable 'word' referenced before assignment

Any ideas on how to resolve this?

If I simply follow the readme then everything works fine, only once I make a change as mentioned above do I face the problem.

Question about the BERT optimizer

@macanv Hello. When I call bert's optimization.create_optimizer(), it raises an error from inside that function. Did you run into this during this project?
[screenshot: bert_optimizer_bug]

Please help me, how can I solve this error?

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[8192,3072] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node bert/encoder/layer_11/intermediate/dense/mul_1}} = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](bert/encoder/layer_11/intermediate/dense/BiasAdd, bert/encoder/layer_11/intermediate/dense/mul)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[{{node crf_loss/Mean/_4197}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2965_crf_loss/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

this is my computer:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.48 Driver Version: 390.48 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 108... Off | 00000000:05:00.0 On | N/A |
| 27% 46C P8 16W / 250W | 460MiB / 11175MiB | 1% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX 108... Off | 00000000:06:00.0 Off | N/A |
| 25% 44C P8 10W / 250W | 2MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 GeForce GTX 108... Off | 00000000:09:00.0 Off | N/A |
| 25% 44C P8 20W / 250W | 2MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 GeForce GTX 108... Off | 00000000:0A:00.0 Off | N/A |
| 21% 32C P8 17W / 250W | 2MiB / 11178MiB | 0% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1117 G /usr/lib/xorg/Xorg 263MiB |
| 0 2047 G compiz 135MiB |
| 0 19793 G ...-token=92AF324520652561B673766DB2684322 58MiB |
+-----------------------------------------------------------------------------+

I run this program in the terminal .......

Can max_seq_length be adaptive?

Some sequences in the training set are quite long, but I don't know the maximum length. Could there be a parameter, e.g. max_seq_length = -1, that lets the program find the maximum length automatically?

Problems when switching from NER to classification mode

I trained a binary classification model with the original BERT code for intent recognition, but online prediction on a CPU takes over 4 seconds per query, so I wanted to try this project.
Starting the service succeeds.
[screenshot]
But prediction just hangs here.
[screenshot]
The inputs are all printed; it just never reaches the next step.
[screenshot]

What to do when max_seq_length is not enough?

Some sentences are quite long and a length of 128 is nowhere near enough; setting it to 500 (the corpus has even longer sentences) causes OOM because the machine runs out of memory. How should I handle max_seq_length being too small, and what are the downsides of not covering the whole sentence?

Missing label_list.pkl

[Errno 2] No such file or directory: 'F:\BERT-BiLSTM-CRF-NER-master\output\label_list.pkl'
This happens when running bert_lstm_ner.py.

Performance won't improve

Experimenting on the CoNLL dataset with the number of epochs set to 3, 10, and 20, performance stays stuck around 73%. The dataset has already been converted to the required format and everything else follows the tutorial, but the performance won't improve.

do_train=True ?

python3 bert_lstm_ner.py
--task_name="NER" \
--do_train=True
--do_eval=True
--do_predict=True
--data_dir=NERdata
--vocab_file=checkpoint/vocab.txt \
--bert_config_file=checkpoint/bert_config.json \
--init_checkpoint=checkpoint/bert_model.ckpt
--max_seq_length=128
--train_batch_size=32
--learning_rate=2e-5
--num_train_epochs=3.0
--output_dir=./output/result_dir/

Training for only a few epochs

When the number of epochs is small, e.g. only 3 epochs, has anyone seen tokens go missing?
For example, test data:
word1,tag1
word2,tag2
word3,tag3

At prediction time this becomes:
word1,tag1'
word2,tag2'
..
word3 is gone.

Error while running

Hello, I got the error below. Is data.conf a file you forgot to upload, or do I need to create it myself, and if so what should it contain?
Traceback (most recent call last):
File "bert_lstm_ner.py", line 815, in
tf.app.run()
File "/usr/local/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "bert_lstm_ner.py", line 724, in main
with codecs.open(FLAGS.data_config_path, 'a', encoding='utf-8') as fd:
File "/usr/local/anaconda3/lib/python3.6/codecs.py", line 897, in open
file = builtins.open(filename, mode, buffering)
FileNotFoundError: [Errno 2] No such file or directory: '/home/wangyingshuai/BERT-BiLSMT-CRF-NER/data.conf'

Test and evaluation results are nan

INFO:tensorflow:***** Eval results *****
INFO:tensorflow: eval_f = nan
INFO:tensorflow: eval_precision = nan
INFO:tensorflow: eval_recall = nan
INFO:tensorflow: global_step = 327
INFO:tensorflow: loss = 2.0464203

INFO:tensorflow:***** Predict results *****
INFO:tensorflow: eval_f = nan
INFO:tensorflow: eval_precision = nan
INFO:tensorflow: eval_recall = nan
INFO:tensorflow: global_step = 327
INFO:tensorflow: loss = 2.4771109

entity level
processed 214543 tokens with 7450 phrases; found: 7740 phrases; correct: 6676.
accuracy: 98.83%; precision: 86.25%; recall: 89.61%; FB1: 87.90
LOC: precision: 87.66%; recall: 89.81%; FB1: 88.72 3549
ORG: precision: 77.08%; recall: 85.69%; FB1: 81.15 2408
PER: precision: 95.85%; recall: 93.90%; FB1: 94.87 1783

About the ORI data

What were the evaluation metrics on this data before BERT was added?
