
bert-as-language-model's People

Contributors

0xflotus, abhishekraok, aijunbai, ammarasmro, bitmindlab, bogdandidenko, cbockman, craigcitro, eric-haibin-lin, georgefeng, jacobdevlin-google, jasonjpu, pengli09, qwfy, rodgzilla, stefan-it, xu-song, zhaoyongke


bert-as-language-model's Issues

Question about serving

Have you tried exporting the checkpoint to a pb format that can be used for serving?

paper to cite

Hello,
Is there a paper for the model that could be cited?

How token probs and sentence perplexity are computed

Hi, thanks for providing code that computes perplexity with BERT.
I currently have two small questions:
1. When computing the prob of a token, do you first mask that token, feed in the masked sentence, and apply softmax to the hidden vector at the masked position to obtain that token's prob, then repeat this over every token in the sentence to get all token probs?
2. Do you compute probs for [CLS] and [SEP]? When computing perplexity, do you include the probs of [CLS] and [SEP], or only the probs of the tokens in the original sentence?
Thanks!
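For reference, the pseudo-perplexity implied by question 1 can be sketched as follows. This is a minimal sketch, assuming the per-token probabilities have already been obtained by masking each token in turn and reading the softmax at the masked position, and assuming [CLS]/[SEP] are excluded from the average:

```python
import math

def pseudo_perplexity(token_probs):
    """Pseudo-perplexity from per-token masked-LM probabilities:
    exp of the negative mean log-probability over the original
    tokens ([CLS]/[SEP] assumed excluded)."""
    avg_log_prob = sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(-avg_log_prob)

# Toy example: two tokens with probabilities 0.5 and 0.25.
ppl = pseudo_perplexity([0.5, 0.25])  # sqrt(1 / (0.5 * 0.25)) = sqrt(8)
```

Note this is a pseudo-likelihood score, not a true probability, since each token is conditioned on all the others.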

Does scoring sentences with BERT as a language model require finetuning?

If BERT is used as a language model to rescore sentences, that is essentially a token classification task, right? Do we need to add a linear layer (e.g. linear[hidden_size, vocab_size]) on top of BertModel and then finetune it?
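For context, the BERT pretraining checkpoint already ships with a masked-LM output head whose projection is tied to the input embedding matrix, so scoring can reuse it without finetuning a new linear layer. A minimal sketch of that tied projection, with hypothetical toy dimensions (vocab of 4, hidden size 3):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical toy weights: 4 vocab entries, hidden size 3.
embedding_matrix = [
    [0.2, 0.1, 0.0],  # token 0
    [0.0, 0.3, 0.1],  # token 1
    [0.5, 0.0, 0.2],  # token 2
    [0.1, 0.1, 0.1],  # token 3
]
output_bias = [0.0, 0.0, 0.0, 0.0]

def mlm_logits(hidden):
    # The pretrained MLM head projects the hidden state back onto the
    # (tied) input embedding matrix -- no new finetuned layer is added.
    return [sum(h * e for h, e in zip(hidden, row)) + b
            for row, b in zip(embedding_matrix, output_bias)]

hidden_at_masked_position = [0.4, 0.2, 0.1]
probs = softmax(mlm_logits(hidden_at_masked_position))
```

The real head also applies a dense transform and layer norm before this projection; the sketch only shows the tied-weights idea.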

Abnormal inference results

Hello, I downloaded the pretrained BERT model from the link in the BERT repo, but my perplexity results for the three sentences below differ considerably from yours.
Moreover, the third sentence gets the lowest perplexity, which seems wrong.
Do you know what might cause this? Thanks a lot!
I am using TensorFlow 2.6.2.
[
  {
    "tokens": [
      {"token": "there", "prob": 0.002376210642978549},
      {"token": "is", "prob": 0.00032396349706687033},
      {"token": "a", "prob": 0.00016864163626451045},
      {"token": "book", "prob": 8.497028466081247e-05},
      {"token": "on", "prob": 0.000501244910992682},
      {"token": "the", "prob": 0.00038025222602300346},
      {"token": "desk", "prob": 6.700590802211082e-06}
    ],
    "ppl": 4931.9851396876575
  },
  {
    "tokens": [
      {"token": "there", "prob": 0.002963493810966611},
      {"token": "is", "prob": 0.0003500459424685687},
      {"token": "a", "prob": 0.00018642270879354328},
      {"token": "plane", "prob": 1.383832932333462e-05},
      {"token": "on", "prob": 0.0005545589956454933},
      {"token": "the", "prob": 0.00038116113864816725},
      {"token": "desk", "prob": 7.67214714869624e-06}
    ],
    "ppl": 5835.439745980134
  },
  {
    "tokens": [
      {"token": "there", "prob": 0.002954021329060197},
      {"token": "is", "prob": 0.00039738742634654045},
      {"token": "a", "prob": 0.00024926112382672727},
      {"token": "book", "prob": 0.00010113466123584658},
      {"token": "in", "prob": 0.00033981725573539734},
      {"token": "the", "prob": 0.00039128249045461416},
      {"token": "desk", "prob": 6.479389867308782e-06}
    ],
    "ppl": 4531.281922045702
  }
]

"Could not find trained model" on every run

INFO:tensorflow:Could not find trained model in model_dir: ./bert_output, running initialization to predict.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Running infer on CPU
INFO:tensorflow:Initialize variable bert/embeddings/LayerNorm/beta:0 from checkpoint

I passed output_dir as the export path for the model, but the exported directory is empty. After wrapping prediction in a function, every call to estimator.predict reloads Google's Chinese model from scratch and hits "Could not find trained model in model_dir: ./bert_output, running initialization to predict." again.
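The reload happens because each estimator.predict call rebuilds the graph and re-runs initialization. A common workaround in the TF 1.x Estimator API is to build a long-lived predictor once (e.g. with tf.contrib.predictor.from_estimator) and reuse it across calls. The create-once/reuse pattern itself can be sketched generically; the predictor below is a stand-in, not the real BERT model:

```python
from functools import lru_cache

LOAD_COUNT = 0  # only to demonstrate that the model is built once

@lru_cache(maxsize=1)
def get_predictor(model_dir):
    """Build the (hypothetical) predictor exactly once and cache it.
    In TensorFlow 1.x this is where you would call
    tf.contrib.predictor.from_estimator(estimator, serving_input_fn)."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return lambda sentence: {"ppl": float(len(sentence))}  # stand-in

def score(sentence, model_dir="./bert_output"):
    return get_predictor(model_dir)(sentence)

score("hello")
score("world")
# Both calls reuse the cached predictor, so LOAD_COUNT stays at 1.
```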

Words outside of vocab

I see tokenizer convert tokens to ids using the vocabulary file. What if the input sentence contains words not in the vocabulary file? Do we need to use our own vocabulary file?
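WordPiece tokenization handles out-of-vocabulary words by splitting them into subword units that are in the vocabulary, so a custom vocabulary file is usually unnecessary. A rough sketch of the greedy longest-match-first algorithm, with a hypothetical toy vocabulary:

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first subword split, as in BERT's tokenizer."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        cur = None
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces carry a ## prefix
            if piece in vocab:
                cur = piece
                break
            end -= 1
        if cur is None:
            return [unk]  # no piece matched: the whole word becomes [UNK]
        pieces.append(cur)
        start = end
    return pieces

# Hypothetical toy vocabulary.
vocab = {"play", "##ing", "##ed", "un", "##play"}
tokens = wordpiece_tokenize("playing", vocab)    # ['play', '##ing']
tokens2 = wordpiece_tokenize("unplayed", vocab)  # ['un', '##play', '##ed']
```

Words that cannot be decomposed into known pieces map to [UNK], but with BERT's full WordPiece vocabulary this is rare for ordinary text.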

Probability of the last word is always too small

Hi, after looking at the results of predicting several Chinese phrases, I found that the probability of the last word is always much smaller than that of the other words in the same phrase. This also happens in all the examples shown in your readme.md. As a result, the perplexity of these phrases also ends up very high.

What do you think about this phenomenon? Thanks for your attention.

Failed to export bert-as-language-model as an online service

Hi, I'm trying to export the unfine-tuned BERT model as an online service.
I followed the official SavedModel instructions and successfully exported a fine-tuned model. But when I try to export the original BERT model, it fails. The error messages are as follows:

Traceback (most recent call last):
File "export_lm_predictor.py", line 136, in
'./exported_model', serving_input_receiver_fn(max_seq_len, 20))
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 734, in export_saved_model
strip_default_attrs=True)
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 663, in export_savedmodel
mode=model_fn_lib.ModeKeys.PREDICT)
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 789, in _export_saved_model_for_mode
strip_default_attrs=strip_default_attrs)
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 878, in _export_all_saved_models
raise ValueError("Couldn't find trained model at %s." % self._model_dir)
ValueError: Couldn't find trained model at ../bert_models/chinese_L-12_H-768_A-12.

My guess is that there are no graph.pbtxt or checkpoint files in the original model dir. Does anyone have any ideas? Thanks!

[edit]
I specified the checkpoint_path parameter in the export_saved_model function. By the way, I
create the estimator using tf.estimator.Estimator. Then I got a new error:
ValueError: Couldn't find 'checkpoint' file or checkpoints in given directory ../bert_models/chinese_L-12_H-768_A-12
So must the original BERT model directory contain a 'checkpoint' file?

TODO:

  • Spell checking, word scoring
  • decode: word guessing (cloze-style prediction)

Use an AR model for probabilities

Following the bert-as-language-model approach, p(a,b) = p(a|b)*p(b|a), which is clearly wrong: what this yields is not a probability at all.
For probabilities, use an AR model instead: p(a,b) = p(a)*p(b|a)
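A toy numerical check of this claim, using a made-up joint distribution over two binary tokens: the pseudo-likelihood p(a|b)*p(b|a) does not recover the joint p(a,b), while the autoregressive chain-rule factorization p(a)*p(b|a) does:

```python
# Toy joint distribution over two binary tokens a, b (purely illustrative).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

p_a0 = joint[(0, 0)] + joint[(0, 1)]  # p(a=0) = 0.5
p_b0 = joint[(0, 0)] + joint[(1, 0)]  # p(b=0) = 0.6
p_b0_given_a0 = joint[(0, 0)] / p_a0  # p(b=0|a=0) = 0.8
p_a0_given_b0 = joint[(0, 0)] / p_b0  # p(a=0|b=0) = 2/3

chain = p_a0 * p_b0_given_a0            # = 0.4, matches joint[(0, 0)]
pseudo = p_a0_given_b0 * p_b0_given_a0  # = 8/15 ~ 0.533, does not match
```

The pseudo-likelihood is still a useful fluency score for rescoring; it just is not a normalized probability.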

Question about online service

This model has not been fine-tuned; how can bert-as-language-model be deployed as an online service?
