
Bert-ChineseNER

Introduction

This project fine-tunes Google's open-source pre-trained BERT model on a Chinese NER task.

Datasets & Model

The labeled data used to train this model comes mainly from zjy-ucas's ChineseNER project. This project adds a BERT model as the embedding feature-extraction layer in front of the original BiLSTM+CRF architecture; the pre-trained Chinese BERT model and code come from Google Research's bert repository.
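Concretely, the data flow through this stack can be sketched shape by shape. The dimensions below follow the training logs further down this page (lstm_dim 200, num_tags 9, max_seq_len 128) and BERT-base's hidden size of 768; this is an illustrative numpy sketch, not the project's model.py:

```python
import numpy as np

# Shapes follow the logged hyperparameters; batch size reduced for brevity.
batch, seq_len, bert_dim, lstm_dim, num_tags = 2, 128, 768, 200, 9

bert_out = np.zeros((batch, seq_len, bert_dim))      # BERT embedding layer output
lstm_out = np.zeros((batch, seq_len, 2 * lstm_dim))  # BiLSTM: forward/backward concat
W = np.zeros((2 * lstm_dim, num_tags))               # projection to per-token tag scores
logits = lstm_out @ W                                # (batch, seq_len, num_tags)
assert logits.shape == (batch, seq_len, num_tags)    # these scores feed the CRF decoder
```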

Results

After introducing BERT, the F-1 on the dev set reaches 94.87 after only 16 epochs of training, and 93.68 on the test set, an improvement of more than two percentage points of F-1 on this dataset.
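These F-1 values are entity-level scores as computed by the bundled conlleval script. As a rough illustration of what span-level F-1 means under IOB tagging, here is a minimal, hypothetical scorer (not the project's conlleval.py):

```python
def iob_spans(tags):
    """Extract (type, start, end) entity spans from an IOB tag sequence."""
    spans = []
    start = etype = None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel flushes the last span
        if start is not None and (tag == "O" or tag.startswith("B-") or tag[2:] != etype):
            spans.append((etype, start, i))
            start = etype = None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return spans

def entity_f1(gold_seqs, pred_seqs):
    """Micro-averaged span-level F-1 over parallel gold/predicted tag sequences."""
    gold, pred = set(), set()
    for i, (g, p) in enumerate(zip(gold_seqs, pred_seqs)):
        gold.update((i,) + s for s in iob_spans(g))
        pred.update((i,) + s for s in iob_spans(p))
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    return 2 * precision * recall / (precision + recall) if correct else 0.0

print(iob_spans(["B-PER", "I-PER", "O", "B-LOC"]))  # [('PER', 0, 2), ('LOC', 3, 4)]
```

conlleval additionally reports per-type breakdowns and token-level accuracy, as seen in the training logs quoted in the issues below.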

Train

  1. Download the bert model code and place it in the project root directory.
  2. Download the pre-trained Chinese BERT model and unzip it into the project root directory.
  3. Set up the environment: Python 3 + TensorFlow 1.12.
  4. Run python3 train.py to train the model.
  5. Run python3 predict.py to test single sentences.

After setup, the project directory should look like this:

 ├── BERT fine-tune实践.md
 ├── README.md
 ├── bert
 ├── chinese_L-12_H-768_A-12
 ├── conlleval
 ├── conlleval.py
 ├── data
 ├── data_utils.py
 ├── loader.py
 ├── model.py
 ├── pictures
 ├── predict.py
 ├── rnncell.py
 ├── train.py
 └── utils.py

Conclusion

As shown above, using BERT improves the model's accuracy by more than two percentage points. Moreover, subsequent testing found that the NER model trained with BERT generalizes much better: for example, company names never seen in the training set are still recognized well. In contrast, a model trained only on the training set provided in ChineseNER, with the plain BiLSTM+CRF architecture, essentially cannot handle the OOV problem.

Fine-tune

The current code performs feature-based transfer; switching to fine-tuning can gain roughly one more point of F-1. To fine-tune, modify the code to include the BERT parameters in model training and lower the learning rate to the 1e-5 range. In addition, whether a BiLSTM is included makes little difference to the results, and the BERT output can be decoded directly; keeping a CRF layer is still recommended, to enforce transition constraints between tags.
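The difference between the two transfer modes amounts to a variable-selection choice. A hedged, TF-free sketch of that choice (the scope names and helper below are illustrative only; in the actual TF 1.x code you would filter `tf.trainable_variables()` and pass the result to `tf.gradients`):

```python
def select_train_vars(var_names, fine_tune=False):
    """Feature-based transfer: freeze everything under the BERT scope.
    Fine-tuning: train all variables (and drop lr to the 1e-5 range)."""
    if fine_tune:
        return list(var_names)
    return [v for v in var_names if not v.startswith("bert/")]

# Illustrative variable names, not the project's actual scopes.
names = [
    "bert/encoder/layer_0/attention/self/query/kernel",
    "BiLSTM/forward/lstm_cell/kernel",
    "crf/transitions",
]
assert select_train_vars(names) == names[1:]          # feature-based transfer
assert select_train_vars(names, fine_tune=True) == names
```

With the BERT variables included, the 1e-3 learning rate used for feature-based training is generally far too large; the 1e-5 range mentioned above avoids destroying the pre-trained weights.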

Reference

(1) https://github.com/zjy-ucas/ChineseNER

(2) https://github.com/google-research/bert

(3) Neural Architectures for Named Entity Recognition

bertner's People

Contributors

yumath


bertner's Issues

Problem at step 4 (training) when reproducing the code

Hello, when reproducing your code I reached the training step (step 4) and the console printed the error below. Could you help me figure out what I overlooked and how to fix it?

D:\developer_tools\Anaconda\envs\bertNER\python.exe E:/开题/NER_code/bertNER/train.py
Found 9 unique named entity tags
20864 / 0 / 4636 sentences in train / dev / test.
2021-02-15 13:03:09,605 - log\train.log - INFO - num_tags       :	9
2021-02-15 13:03:09,605 - log\train.log - INFO - lstm_dim       :	200
2021-02-15 13:03:09,606 - log\train.log - INFO - batch_size     :	128
2021-02-15 13:03:09,606 - log\train.log - INFO - max_seq_len    :	128
2021-02-15 13:03:09,606 - log\train.log - INFO - clip           :	5
2021-02-15 13:03:09,606 - log\train.log - INFO - dropout_keep   :	0.5
2021-02-15 13:03:09,606 - log\train.log - INFO - optimizer      :	adam
2021-02-15 13:03:09,606 - log\train.log - INFO - lr             :	0.001
2021-02-15 13:03:09,606 - log\train.log - INFO - tag_schema     :	iob
2021-02-15 13:03:09,606 - log\train.log - INFO - zeros          :	False
2021-02-15 13:03:09,606 - log\train.log - INFO - lower          :	True
2021-02-15 13:03:09.607092: W c:\l\tensorflow_1501918863922\work\tensorflow-1.2.1\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2021-02-15 13:03:09.607092: W c:\l\tensorflow_1501918863922\work\tensorflow-1.2.1\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2021-02-15 13:03:09.607092: W c:\l\tensorflow_1501918863922\work\tensorflow-1.2.1\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2021-02-15 13:03:09.608092: W c:\l\tensorflow_1501918863922\work\tensorflow-1.2.1\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2021-02-15 13:03:09.608092: W c:\l\tensorflow_1501918863922\work\tensorflow-1.2.1\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2021-02-15 13:03:09.608092: W c:\l\tensorflow_1501918863922\work\tensorflow-1.2.1\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2021-02-15 13:03:09.609092: W c:\l\tensorflow_1501918863922\work\tensorflow-1.2.1\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2021-02-15 13:03:09.610092: W c:\l\tensorflow_1501918863922\work\tensorflow-1.2.1\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Traceback (most recent call last):
  File "E:/开题/NER_code/bertNER/train.py", line 179, in <module>
    tf.app.run(main)
  File "D:\developer_tools\Anaconda\envs\bertNER\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "E:/开题/NER_code/bertNER/train.py", line 175, in main
    train()
  File "E:/开题/NER_code/bertNER/train.py", line 150, in train
    model = create_model(sess, Model, FLAGS.ckpt_path, config, logger)
  File "E:\开题\NER_code\bertNER\utils.py", line 177, in create_model
    model = Model_class(config)
  File "E:\开题\NER_code\bertNER\model.py", line 40, in __init__
    embedding = self.bert_embedding()
  File "E:\开题\NER_code\bertNER\model.py", line 101, in bert_embedding
    use_one_hot_embeddings=False)
  File "E:\开题\NER_code\bertNER\bert\modeling.py", line 194, in __init__
    dropout_prob=config.hidden_dropout_prob)
  File "E:\开题\NER_code\bertNER\bert\modeling.py", line 520, in embedding_postprocessor
    output = layer_norm_and_dropout(output, dropout_prob)
  File "E:\开题\NER_code\bertNER\bert\modeling.py", line 370, in layer_norm_and_dropout
    output_tensor = layer_norm(input_tensor, name)
  File "E:\开题\NER_code\bertNER\bert\modeling.py", line 365, in layer_norm
    inputs=input_tensor, begin_norm_axis=-1, begin_params_axis=-1, scope=name)
  File "D:\developer_tools\Anaconda\envs\bertNER\lib\site-packages\tensorflow\contrib\framework\python\ops\arg_scope.py", line 181, in func_with_args
    return func(*args, **current_args)
TypeError: layer_norm() got an unexpected keyword argument 'begin_norm_axis'
Process finished with exit code 1

config_file

Hello, when running python predict.py I get the error: no such file 'config_file'. Where should this file be placed, and what kind of file is it?

Can the model be used to recognize other entity types?

The data is still in IOB format, only instead of the three entity types Per, Loc, and Org it contains other entities, for example from the medical domain. Given that I already have annotated data, can I train a model with your method? Which parts would need to change?

Why is wiki_100.utf8 still needed?

Hello, why does the code still require the word2vec-trained wiki_100.utf8 file? My understanding is that BERT replaces word2vec for feature extraction; isn't the chinese_L-12_H-768_A-12 pre-trained Chinese model supposed to replace wiki_100.utf8? Thanks!

Prediction code issue

In bert_embedding, is_training=True (model.py line 97) remains True at prediction time as well. Doesn't that mean the model is still in training mode during prediction, so each prediction may give a different result? That is my understanding.

Testing issues

Excuse me, how can I test the training results after the training is over?

The GPU is not being used

Hello, the code keeps running on the CPU. Is there any way to run it on the GPU to speed things up? Thanks.

What is the blank item?

During training one of the items is always blank, and I don't know what it is. Could you point me in the right direction?
(screenshot omitted)
Also, what is the first accuracy value? I roughly understand the precision, recall, and F1 that follow, but what metric does this accuracy measure?

How do I avoid freezing the BERT parameters?

I changed train_vars in grads = tf.gradients(self.loss, train_vars) to tvars, but the results all come out as zero.

Why does nothing happen after running python train.py?

Why does nothing happen after I run python train.py? It prints the warning
"bert\tokenization.py:125: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead."
and then exits without any other error, which is very strange.

No logs

Hello, why are no logs printed after running python main.py?
(screenshot omitted)
It just ends like this.

space interference

(screenshot omitted)
This space seems to affect my index calculation. Do you know how to remove this interference?

Questions about the training results

Does this mean training has finished? The accuracy has not reached the expected level; do I need to adjust some parameters? My log is below. Could you help me understand it? Thanks!

2021-02-22 00:59:21,511 - log\train.log - INFO - num_tags : 9
2021-02-22 00:59:21,521 - log\train.log - INFO - lstm_dim : 200
2021-02-22 00:59:21,521 - log\train.log - INFO - batch_size : 128
2021-02-22 00:59:21,521 - log\train.log - INFO - max_seq_len : 128
2021-02-22 00:59:21,521 - log\train.log - INFO - clip : 5.0
2021-02-22 00:59:21,521 - log\train.log - INFO - dropout_keep : 0.5
2021-02-22 00:59:21,521 - log\train.log - INFO - optimizer : adam
2021-02-22 00:59:21,521 - log\train.log - INFO - lr : 0.001
2021-02-22 00:59:21,521 - log\train.log - INFO - tag_schema : iob
2021-02-22 00:59:21,521 - log\train.log - INFO - zeros : False
2021-02-22 00:59:21,521 - log\train.log - INFO - lower : True
2021-02-22 00:59:29,081 - log\train.log - INFO - Created model with fresh parameters.
2021-02-22 00:59:30,261 - log\train.log - INFO - start training
2021-02-22 03:56:54,944 - log\train.log - INFO - iteration:1 step:100/163, NER loss: 8.407235
2021-02-22 05:47:21,221 - log\train.log - INFO - evaluate:dev
2021-02-22 06:18:44,810 - log\train.log - INFO - processed 106932 tokens with 3661 phrases; found: 3697 phrases; correct: 3092.

2021-02-22 06:18:44,810 - log\train.log - INFO - accuracy: 98.58%; precision: 83.64%; recall: 84.46%; FB1: 84.04

2021-02-22 06:18:44,810 - log\train.log - INFO - : precision: 0.00%; recall: 0.00%; FB1: 0.00 3

2021-02-22 06:18:44,810 - log\train.log - INFO - LOC: precision: 83.94%; recall: 84.55%; FB1: 84.24 1812

2021-02-22 06:18:44,810 - log\train.log - INFO - ORG: precision: 74.55%; recall: 75.61%; FB1: 75.08 990

2021-02-22 06:18:44,810 - log\train.log - INFO - PER: precision: 93.39%; recall: 94.44%; FB1: 93.91 892

2021-02-22 06:18:44,978 - log\train.log - INFO - new best dev f1 score:84.040
2021-02-22 06:18:56,372 - log\train.log - INFO - model saved
2021-02-22 06:18:56,372 - log\train.log - INFO - evaluate:test
2021-02-22 07:21:41,547 - log\train.log - INFO - processed 214621 tokens with 7456 phrases; found: 7619 phrases; correct: 6261.

2021-02-22 07:21:41,547 - log\train.log - INFO - accuracy: 98.43%; precision: 82.18%; recall: 83.97%; FB1: 83.06

2021-02-22 07:21:41,547 - log\train.log - INFO - : precision: 0.00%; recall: 0.00%; FB1: 0.00 3

2021-02-22 07:21:41,547 - log\train.log - INFO - LOC: precision: 82.66%; recall: 84.35%; FB1: 83.50 3535

2021-02-22 07:21:41,547 - log\train.log - INFO - ORG: precision: 73.30%; recall: 76.04%; FB1: 74.64 2247

2021-02-22 07:21:41,547 - log\train.log - INFO - PER: precision: 92.26%; recall: 92.97%; FB1: 92.61 1834

2021-02-22 07:21:41,737 - log\train.log - INFO - new best test f1 score:83.060
2021-02-22 08:26:20,947 - log\train.log - INFO - iteration:2 step:37/163, NER loss: 2.939173

Originally posted by @BingtaoLiang in #23 (comment)

Time

How long did your program take to run on your dataset?

Problem with the ner_predict.utf8 predictions

When evaluating on the test set, in the output ner_predict.utf8 file the last token of every sentence has [SEP] as both the gold label and the prediction. That seems wrong: the last token could also be part of an entity. How should this be fixed, or is it actually correct? In most cases the last character of a sentence is punctuation, so the impact is small, but there are also sentences whose last character is not punctuation, and for those the prediction must be wrong, right? Or did I make a mistake somewhere?
(screenshots omitted)

Question about iteration, step, and epoch

Does iteration in the code mean that 100 iterations traverse the whole training set, or is one iteration equivalent to one epoch, so that the training set was run 100 times? I would appreciate an explanation.

Citation format for this project

Hello! I used this project for part of the experiments in my research. Could you provide a citation format for it? If citing a paper is not required, I will cite the project as an electronic document; would that be acceptable?

Why are precision and recall zero?

2022-11-26 10:08:38,383 - log\train.log - INFO - num_tags : 9
2022-11-26 10:08:38,384 - log\train.log - INFO - lstm_dim : 200
2022-11-26 10:08:38,385 - log\train.log - INFO - batch_size : 16
2022-11-26 10:08:38,386 - log\train.log - INFO - max_seq_len : 128
2022-11-26 10:08:38,386 - log\train.log - INFO - clip : 5.0
2022-11-26 10:08:38,387 - log\train.log - INFO - dropout_keep : 0.7
2022-11-26 10:08:38,388 - log\train.log - INFO - optimizer : adam
2022-11-26 10:08:38,388 - log\train.log - INFO - lr : 1e-05
2022-11-26 10:08:38,388 - log\train.log - INFO - tag_schema : iob
2022-11-26 10:08:38,389 - log\train.log - INFO - zeros : False
2022-11-26 10:08:38,390 - log\train.log - INFO - lower : True
2022-11-26 10:08:42,989 - log\train.log - INFO - Created model with fresh parameters.
2022-11-26 10:08:43,767 - log\train.log - INFO - start training
2022-11-26 10:08:51,170 - log\train.log - INFO - iteration:1 step:0/1304, NER loss:63.690010
2022-11-26 10:12:04,559 - log\train.log - INFO - iteration:1 step:50/1304, NER loss:136.277084
2022-11-26 10:15:17,775 - log\train.log - INFO - iteration:1 step:100/1304, NER loss:109.048058
2022-11-26 10:18:35,602 - log\train.log - INFO - iteration:1 step:150/1304, NER loss:113.007179
2022-11-26 10:21:48,247 - log\train.log - INFO - iteration:1 step:200/1304, NER loss:112.711403
2022-11-26 10:25:00,814 - log\train.log - INFO - iteration:1 step:250/1304, NER loss:106.165657
2022-11-26 10:28:12,863 - log\train.log - INFO - iteration:1 step:300/1304, NER loss:98.900383
2022-11-26 10:31:26,013 - log\train.log - INFO - iteration:1 step:350/1304, NER loss:100.851982
2022-11-26 10:34:37,698 - log\train.log - INFO - iteration:1 step:400/1304, NER loss:80.513985
2022-11-26 10:37:50,628 - log\train.log - INFO - iteration:1 step:450/1304, NER loss:90.724358
2022-11-26 10:41:02,878 - log\train.log - INFO - iteration:1 step:500/1304, NER loss:85.533752
2022-11-26 10:44:15,463 - log\train.log - INFO - iteration:1 step:550/1304, NER loss:83.445625
2022-11-26 10:47:27,357 - log\train.log - INFO - iteration:1 step:600/1304, NER loss:76.617126
2022-11-26 10:50:39,485 - log\train.log - INFO - iteration:1 step:650/1304, NER loss:75.493614
2022-11-26 10:53:51,453 - log\train.log - INFO - iteration:1 step:700/1304, NER loss:68.558174
2022-11-26 10:57:03,932 - log\train.log - INFO - iteration:1 step:750/1304, NER loss:68.461647
2022-11-26 11:00:17,545 - log\train.log - INFO - iteration:1 step:800/1304, NER loss:72.436432
2022-11-26 11:03:30,673 - log\train.log - INFO - iteration:1 step:850/1304, NER loss:58.224152
2022-11-26 11:06:43,195 - log\train.log - INFO - iteration:1 step:900/1304, NER loss:53.891891
2022-11-26 11:09:55,809 - log\train.log - INFO - iteration:1 step:950/1304, NER loss:50.741997
2022-11-26 11:13:08,250 - log\train.log - INFO - iteration:1 step:1000/1304, NER loss:47.740196
2022-11-26 11:16:20,369 - log\train.log - INFO - iteration:1 step:1050/1304, NER loss:43.548584
2022-11-26 11:19:33,584 - log\train.log - INFO - iteration:1 step:1100/1304, NER loss:47.549740
2022-11-26 11:22:46,523 - log\train.log - INFO - iteration:1 step:1150/1304, NER loss:42.699841
2022-11-26 11:25:58,604 - log\train.log - INFO - iteration:1 step:1200/1304, NER loss:42.679581
2022-11-26 11:29:11,478 - log\train.log - INFO - iteration:1 step:1250/1304, NER loss:42.947208
2022-11-26 11:32:23,519 - log\train.log - INFO - iteration:1 step:1300/1304, NER loss:39.232140
2022-11-26 11:32:35,170 - log\train.log - INFO - evaluate:dev
2022-11-26 11:32:56,737 - log\train.log - INFO - processed 106932 tokens with 3661 phrases; found: 0 phrases; correct: 0.

2022-11-26 11:32:56,738 - log\train.log - INFO - accuracy: 89.06%; precision: 0.00%; recall: 0.00%; FB1: 0.00

2022-11-26 11:32:56,739 - log\train.log - INFO - : precision: 0.00%; recall: 0.00%; FB1: 0.00 0

2022-11-26 11:32:56,740 - log\train.log - INFO - LOC: precision: 0.00%; recall: 0.00%; FB1: 0.00 0

2022-11-26 11:32:56,741 - log\train.log - INFO - ORG: precision: 0.00%; recall: 0.00%; FB1: 0.00 0

2022-11-26 11:32:56,742 - log\train.log - INFO - PER: precision: 0.00%; recall: 0.00%; FB1: 0.00 0

2022-11-26 11:32:56,822 - log\train.log - INFO - evaluate:test
2022-11-26 11:33:38,469 - log\train.log - INFO - processed 214621 tokens with 7456 phrases; found: 0 phrases; correct: 0.

2022-11-26 11:33:38,470 - log\train.log - INFO - accuracy: 88.61%; precision: 0.00%; recall: 0.00%; FB1: 0.00

2022-11-26 11:33:38,471 - log\train.log - INFO - : precision: 0.00%; recall: 0.00%; FB1: 0.00 0

2022-11-26 11:33:38,471 - log\train.log - INFO - LOC: precision: 0.00%; recall: 0.00%; FB1: 0.00 0

2022-11-26 11:33:38,472 - log\train.log - INFO - ORG: precision: 0.00%; recall: 0.00%; FB1: 0.00 0

2022-11-26 11:33:38,473 - log\train.log - INFO - PER: precision: 0.00%; recall: 0.00%; FB1: 0.00 0

2022-11-26 11:36:39,016 - log\train.log - INFO - iteration:2 step:46/1304, NER loss:41.068260
2022-11-26 11:39:50,682 - log\train.log - INFO - iteration:2 step:96/1304, NER loss:37.320129
2022-11-26 11:43:03,009 - log\train.log - INFO - iteration:2 step:146/1304, NER loss:39.942009
2022-11-26 11:46:14,960 - log\train.log - INFO - iteration:2 step:196/1304, NER loss:38.224483
2022-11-26 11:49:27,003 - log\train.log - INFO - iteration:2 step:246/1304, NER loss:40.297264
2022-11-26 11:52:39,573 - log\train.log - INFO - iteration:2 step:296/1304, NER loss:39.178818
2022-11-26 11:55:52,423 - log\train.log - INFO - iteration:2 step:346/1304, NER loss:40.981510

I have tried many learning rates and varied the other parameters as well, but precision is always zero. What could be the cause?
