Comments (9)
import numpy as np
from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import SpTokenizer
config_path = '/root/kg/bert/albert_base_en_tfhub/albert_config.json'
checkpoint_path = '/root/kg/bert/albert_base_en_tfhub/variables/variables'
spm_path = '/root/kg/bert/albert_base_en_tfhub/assets/30k-clean.model'
tokenizer = SpTokenizer(spm_path)
model = build_transformer_model(config_path, checkpoint_path, model='albert')
token_ids, segment_ids = tokenizer.encode('language model')
print(model.predict([np.array([token_ids]), np.array([segment_ids])]))
Note that albert_config.json has to be saved separately by yourself.
from bert4keras.
from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import Tokenizer
import numpy as np
config_path = './albert_large/albert_config.json'
checkpoint_path = './albert_large/model.ckpt-best'
dict_path = './albert_large/30k-clean.vocab'
tokenizer = Tokenizer(dict_path, do_lower_case=True) # build the tokenizer
model = build_transformer_model(config_path, checkpoint_path, model='albert') # build the model and load the weights
# Encoding test
token_ids, segment_ids = tokenizer.encode(u'are you ok')
print('\n ===== predicting =====\n')
print(model.predict([np.array([token_ids]), np.array([segment_ids])]))
This raises the following error:
AttributeError: 'Tokenizer' object has no attribute '_token_unk_id'
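One plausible reading of this error, offered as an assumption rather than a confirmed diagnosis: `Tokenizer` expects a BERT-style WordPiece vocabulary containing `[UNK]`, whereas `30k-clean.vocab` is a SentencePiece vocabulary that uses `<unk>`. The first comment loads `30k-clean.model` through `SpTokenizer` instead. The sketch below uses only the standard library to guess which kind of vocabulary file you have. The helper name `guess_vocab_kind` and the demo files are hypothetical and not part of bert4keras.

```python
import os
import tempfile


def guess_vocab_kind(vocab_path):
    """Guess whether a vocab file is WordPiece-style or SentencePiece-style.

    Hypothetical helper, not part of bert4keras. It only inspects the
    first column of each line for the unknown-token marker.
    """
    tokens = set()
    with open(vocab_path, encoding='utf-8') as f:
        for line in f:
            parts = line.split()
            if parts:
                tokens.add(parts[0])
    if '[UNK]' in tokens:
        return 'wordpiece'
    if '<unk>' in tokens:
        return 'sentencepiece'
    return 'unknown'


# Demonstration with two made-up vocab files (not the real ALBERT files).
demo_dir = tempfile.mkdtemp()
wp_path = os.path.join(demo_dir, 'vocab.txt')
sp_path = os.path.join(demo_dir, 'demo.vocab')
with open(wp_path, 'w', encoding='utf-8') as f:
    f.write('[PAD]\n[UNK]\n[CLS]\n[SEP]\nhello\n')
with open(sp_path, 'w', encoding='utf-8') as f:
    f.write('<pad>\t0\n<unk>\t0\n[CLS]\t0\n[SEP]\t0\n\u2581hello\t-3.5\n')

print(guess_vocab_kind(wp_path))  # wordpiece
print(guess_vocab_kind(sp_path))  # sentencepiece
```

If the file turns out to be SentencePiece-style, the loading code from the first comment (`SpTokenizer` with the `.model` file) is probably the better starting point.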
Thanks.
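Sorry, even after reading this example I still don't understand how to convert the pb file from https://tfhub.dev/google/albert_base/2 into a ckpt... Or perhaps that is not what this example is about?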
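@SchenbergZY No conversion is needed. The files under the variables directory are the ckpt files, and they can be loaded directly in the way shown above.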
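A checkpoint path such as `.../variables/variables` is a prefix rather than a file: TensorFlow stores it as `<prefix>.index` plus one or more `<prefix>.data-XXXXX-of-YYYYY` shards. The following is a minimal sketch for checking that a prefix is complete before passing it as `checkpoint_path`. The helper name `checkpoint_prefix_exists` and the demo directory are hypothetical.

```python
import glob
import os
import tempfile


def checkpoint_prefix_exists(prefix):
    """Return True if `prefix` looks like a TensorFlow checkpoint prefix.

    Hypothetical helper. It checks for '<prefix>.index' and at least
    one '<prefix>.data-*' shard next to it.
    """
    has_index = os.path.isfile(prefix + '.index')
    has_data = bool(glob.glob(glob.escape(prefix) + '.data-*'))
    return has_index and has_data


# Demonstration on a made-up directory that mimics the layout.
demo_dir = tempfile.mkdtemp()
var_dir = os.path.join(demo_dir, 'variables')
os.makedirs(var_dir)
for name in ('variables.index', 'variables.data-00000-of-00001'):
    open(os.path.join(var_dir, name), 'w').close()

good_prefix = os.path.join(var_dir, 'variables')
bad_prefix = os.path.join(var_dir, 'model.ckpt')
print(checkpoint_prefix_exists(good_prefix))  # True
print(checkpoint_prefix_exists(bad_prefix))   # False
```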
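Here is the situation: from https://tfhub.dev/google/albert_base/2 I can only download a file named 2.tar, and extracting it yields a single extensionless file called "2" (perhaps a pb?), with no ckpt-type files at all.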
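@SchenbergZY The download is 2.tar.gz, and extracting it gives a folder named 2 with many files inside. If that is not what you get, please download it again and learn how to extract a tar.gz. I am sure Google does not serve a unique download just for you.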
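For anyone unsure whether the archive was fully extracted, here is a minimal sketch using Python's standard `tarfile` module to unpack a `.tar.gz` and list every file inside it. The helper name `extract_and_list` is hypothetical, and the demo builds a tiny stand-in archive rather than using the real download.

```python
import os
import tarfile
import tempfile


def extract_and_list(archive_path, target_dir):
    """Extract a .tar.gz archive and return the relative paths of its files."""
    os.makedirs(target_dir, exist_ok=True)
    with tarfile.open(archive_path, 'r:gz') as tar:
        tar.extractall(target_dir)
    found = []
    for root, _dirs, files in os.walk(target_dir):
        for name in files:
            full = os.path.join(root, name)
            found.append(os.path.relpath(full, target_dir))
    return sorted(found)


# Build a small stand-in archive (not the real TF Hub download).
work = tempfile.mkdtemp()
src = os.path.join(work, 'src', 'variables')
os.makedirs(src)
open(os.path.join(src, 'variables.index'), 'w').close()
archive = os.path.join(work, 'demo.tar.gz')
with tarfile.open(archive, 'w:gz') as tar:
    tar.add(os.path.join(work, 'src'), arcname='2')

listing = extract_and_list(archive, os.path.join(work, 'out'))
print(listing)
```

For the real download, replace `archive` with the path to your own 2.tar.gz. If the listing shows files under a `variables` folder, the extraction worked.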
Thanks. By reading bert-for-tf2 I found out how to download 2.tar.gz.
from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import Tokenizer
import numpy as np
config_path = './albert_large/albert_config.json'
checkpoint_path = './albert_large/model.ckpt-best'
dict_path = './albert_large/30k-clean.vocab'
tokenizer = Tokenizer(dict_path, do_lower_case=True) # build the tokenizer
model = build_transformer_model(config_path, checkpoint_path, model='albert') # build the model and load the weights
# Encoding test
token_ids, segment_ids = tokenizer.encode(u'are you ok')
print('\n ===== predicting =====\n')
print(model.predict([np.array([token_ids]), np.array([segment_ids])]))
This raises the following error:
AttributeError: 'Tokenizer' object has no attribute '_token_unk_id'
I ran into the same problem. I switched to SpTokenizer, but it has no .match function, so the bert+ner code raises an error.