xu-song / bert-as-language-model
BERT as language model, fork from https://github.com/google-research/bert
License: Apache License 2.0
If there is no trained checkpoint, does inference then run with the initial weights? That would affect the final probabilities.
Have you tried exporting the checkpoint to the pb format that can be used for serving?
Hello,
Is there a paper for the model that could be cited?
Hi, thanks for providing code that computes perplexity with BERT.
I currently have two small questions I hope you can answer:
1. To compute the probability of a token, do you first mask that token, feed the masked sentence through the model, and apply softmax to the hidden vector at the masked position to get that token's probability? And by repeating this for every token in the sentence, you get the probabilities of all tokens?
2. Do you compute the probabilities of [CLS] and [SEP]? When computing perplexity, are the probabilities of [CLS] and [SEP] included, or only those of the tokens in the original sentence?
Thanks!
Neither uncased_L-12_H-768_A-12 nor chinese_L-12_H-768_A-12 gives me reasonable results.
If BERT is used as a language model to rescore sentences, is that essentially a token-classification task? Do we need to add a linear layer (e.g. linear[hidden_size, vocab_size]) on top of BertModel and then fine-tune it?
Hi, I downloaded the pretrained BERT model from the official link, but my perplexity results for the following three sentences differ a lot from yours.
Moreover, the third sentence gets the lowest perplexity, which does not seem right.
Do you know what could cause this? Thanks a lot!
My TensorFlow version is 2.6.2.
[
{
"tokens": [
{
"token": "there",
"prob": 0.002376210642978549
},
{
"token": "is",
"prob": 0.00032396349706687033
},
{
"token": "a",
"prob": 0.00016864163626451045
},
{
"token": "book",
"prob": 8.497028466081247e-05
},
{
"token": "on",
"prob": 0.000501244910992682
},
{
"token": "the",
"prob": 0.00038025222602300346
},
{
"token": "desk",
"prob": 6.700590802211082e-06
}
],
"ppl": 4931.9851396876575
},
{
"tokens": [
{
"token": "there",
"prob": 0.002963493810966611
},
{
"token": "is",
"prob": 0.0003500459424685687
},
{
"token": "a",
"prob": 0.00018642270879354328
},
{
"token": "plane",
"prob": 1.383832932333462e-05
},
{
"token": "on",
"prob": 0.0005545589956454933
},
{
"token": "the",
"prob": 0.00038116113864816725
},
{
"token": "desk",
"prob": 7.67214714869624e-06
}
],
"ppl": 5835.439745980134
},
{
"tokens": [
{
"token": "there",
"prob": 0.002954021329060197
},
{
"token": "is",
"prob": 0.00039738742634654045
},
{
"token": "a",
"prob": 0.00024926112382672727
},
{
"token": "book",
"prob": 0.00010113466123584658
},
{
"token": "in",
"prob": 0.00033981725573539734
},
{
"token": "the",
"prob": 0.00039128249045461416
},
{
"token": "desk",
"prob": 6.479389867308782e-06
}
],
"ppl": 4531.281922045702
}
]
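The ppl values above are consistent with pseudo-perplexity, i.e. the exponential of the negative mean log-probability of the tokens. A minimal check in plain Python, using the probabilities from the first sentence:

```python
import math

# Token probabilities of "there is a book on the desk",
# copied from the JSON output above.
probs = [
    0.002376210642978549,
    0.00032396349706687033,
    0.00016864163626451045,
    8.497028466081247e-05,
    0.000501244910992682,
    0.00038025222602300346,
    6.700590802211082e-06,
]

# Pseudo-perplexity: exp of the negative average log-probability.
ppl = math.exp(-sum(math.log(p) for p in probs) / len(probs))
print(ppl)  # ≈ 4932, matching the "ppl" field above
```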
Each token is masked in turn, and the probability of that token is predicted.
The probability of the whole sentence is then simply approximated according to the formula below:
Originally posted by @xu-song in #15 (comment)
So the final result is just the product of the predicted probabilities? Is that what it means?
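The loop described above can be sketched as follows. Note `mlm_prob` here is a stand-in for a real masked-LM forward pass (a dummy function for illustration, since the actual model call depends on the framework):

```python
import math

MASK = "[MASK]"

def sentence_pseudo_ppl(tokens, mlm_prob):
    """Mask each token in turn, ask the masked LM for the probability of
    the original token at the masked position, and combine the results
    into a pseudo-perplexity."""
    log_prob_sum = 0.0
    for i, token in enumerate(tokens):
        masked = tokens[:i] + [MASK] + tokens[i + 1:]
        p = mlm_prob(masked, position=i, target=token)
        log_prob_sum += math.log(p)
    # The product of the per-token probabilities approximates the sentence
    # probability; pseudo-perplexity is the inverse of its geometric mean.
    return math.exp(-log_prob_sum / len(tokens))

# Dummy model: a fixed probability for every token, just to show the mechanics.
uniform = lambda masked, position, target: 0.25
print(sentence_pseudo_ppl(["there", "is", "a", "book"], uniform))  # ≈ 4.0
```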
INFO:tensorflow:Could not find trained model in model_dir: ./bert_output, running initialization to predict.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Running infer on CPU
INFO:tensorflow:Initialize variable bert/embeddings/LayerNorm/beta:0 from checkpoint
The output_dir parameter is specified as the export path, but the export comes out empty. When I wrap this in a function and call the prediction function estimator.predict, every call reloads the Google Chinese model from scratch and hits "Could not find trained model in model_dir: ./bert_output, running initialization to predict." again.
As the title says.
I see tokenizer convert tokens to ids using the vocabulary file. What if the input sentence contains words not in the vocabulary file? Do we need to use our own vocabulary file?
Hi, after looking at the predictions for several Chinese phrases, I found that the probability of the last word is always very small compared to the other words in the same phrase. This also happens in all the examples shown in your readme.md, so the perplexities of the phrases also become very high.
What do you think about this phenomenon? Thanks for your attention.
I read the code but could not understand it.
Thanks.
Also, I have seen comments saying that BERT cannot be used as a language model to score sentence plausibility; I recall this came from an official reply by the BERT team. What is your take on that?
Where does most of the time in predict go? I want to turn this into a service, but it is too slow. How can I speed it up?
Hi, I'm trying to export the un-fine-tuned BERT model as an online service.
I followed the official SavedModel instructions and successfully exported a fine-tuned model. But when I tried to export the original BERT model, it failed. The error messages are as follows:
Traceback (most recent call last):
File "export_lm_predictor.py", line 136, in
'./exported_model', serving_input_receiver_fn(max_seq_len, 20))
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 734, in export_saved_model
strip_default_attrs=True)
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 663, in export_savedmodel
mode=model_fn_lib.ModeKeys.PREDICT)
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 789, in _export_saved_model_for_mode
strip_default_attrs=strip_default_attrs)
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 878, in _export_all_saved_models
raise ValueError("Couldn't find trained model at %s." % self._model_dir)
ValueError: Couldn't find trained model at ../bert_models/chinese_L-12_H-768_A-12.
I guess this is because there are no graph.pbtxt or checkpoint files in the original model dir. Does anyone have any ideas? Thanks!
[edit]
I specified the checkpoint_path parameter of the export_saved_model function (I create the estimator with tf.estimator.Estimator). Then I got a new error:
ValueError: Couldn't find 'checkpoint' file or checkpoints in given directory ../bert_models/chinese_L-12_H-768_A-12
So a 'checkpoint' file must be present in the original BERT model directory?
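For what it's worth, TF1 estimators locate weights through a plain-text index file named `checkpoint` next to the `bert_model.ckpt.*` files, and the released BERT archives do not ship one. One workaround (an assumption based on how `tf.train` checkpoint discovery works, not something this repo documents) is to create that file by hand in the model directory:

```
model_checkpoint_path: "bert_model.ckpt"
all_model_checkpoint_paths: "bert_model.ckpt"
```

With this file in place, checkpoint discovery in the model directory should find `bert_model.ckpt` instead of raising the "Couldn't find 'checkpoint' file" error.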
Following the bert-as-language-model approach, p(a,b) = p(a|b)*p(b|a), which is clearly wrong; what this yields is not a probability at all.
To get a real probability, use an AR model: p(a,b) = p(a)*p(b|a).
The model here is not fine-tuned; how can bert-as-language-model be deployed as an online service?
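The point above can be checked numerically: for a toy joint distribution over two binary tokens, summing the pseudo-likelihood p(a|b)·p(b|a) over all (a,b) does not give 1, whereas the autoregressive factorization p(a)·p(b|a) does. A small sketch, with an arbitrary made-up joint table:

```python
# Toy joint distribution P[a][b] over two binary "tokens" a and b.
P = [[0.4, 0.1],
     [0.2, 0.3]]

p_a = [sum(row) for row in P]                # marginal p(a)
p_b = [P[0][b] + P[1][b] for b in range(2)]  # marginal p(b)

def p_a_given_b(a, b):
    return P[a][b] / p_b[b]

def p_b_given_a(a, b):
    return P[a][b] / p_a[a]

# BERT-style pseudo-likelihood: p(a|b) * p(b|a).
pseudo_total = sum(p_a_given_b(a, b) * p_b_given_a(a, b)
                   for a in range(2) for b in range(2))

# Autoregressive factorization: p(a) * p(b|a) = P[a][b] by the chain rule.
ar_total = sum(p_a[a] * p_b_given_a(a, b)
               for a in range(2) for b in range(2))

print(pseudo_total)  # ≈ 1.167 — does not sum to 1, so not a distribution
print(ar_total)      # ≈ 1.0
```

This is why the pseudo-likelihood is best treated as a scoring heuristic for rescoring, not as a normalized sentence probability.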