The examples I have seen so far only cover ckpt files. Is there a method or example for using https://tfhub.dev/google/albert_base/2 directly?

How do I use Google's original ALBERT when it comes as a pb file? (bert4keras issue, closed)

bojone commented on June 9, 2024

How do I use Google's original ALBERT when it comes as a pb file?

Comments (9)

bojone commented on June 9, 2024

import numpy as np
from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import SpTokenizer


# Paths inside the extracted TF Hub archive
config_path = '/root/kg/bert/albert_base_en_tfhub/albert_config.json'
# Checkpoint prefix: the ckpt files live in the variables directory
checkpoint_path = '/root/kg/bert/albert_base_en_tfhub/variables/variables'
spm_path = '/root/kg/bert/albert_base_en_tfhub/assets/30k-clean.model'


# Load the SentencePiece model shipped in the assets directory
tokenizer = SpTokenizer(spm_path)
model = build_transformer_model(config_path, checkpoint_path, model='albert')

# Encode one sentence and print the model output
token_ids, segment_ids = tokenizer.encode('language model')
print(model.predict([np.array([token_ids]), np.array([segment_ids])]))

Note that albert_config.json is a file you have to save yourself.
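
One way to create that file is to write it from Python. This is only a minimal sketch: the hyperparameter values below are an assumption based on the publicly released ALBERT base configuration and are not confirmed anywhere in this thread, so check them against the official albert_config.json before relying on them. The helper name `save_albert_config` is hypothetical.

```python
import json
from pathlib import Path

# ASSUMPTION: hyperparameters for ALBERT base; verify against the
# official albert_config.json before using them with real weights.
ALBERT_BASE_CONFIG = {
    "attention_probs_dropout_prob": 0.0,
    "hidden_act": "gelu",
    "hidden_dropout_prob": 0.0,
    "embedding_size": 128,
    "hidden_size": 768,
    "initializer_range": 0.02,
    "intermediate_size": 3072,
    "max_position_embeddings": 512,
    "num_attention_heads": 12,
    "num_hidden_layers": 12,
    "num_hidden_groups": 1,
    "inner_group_num": 1,
    "type_vocab_size": 2,
    "vocab_size": 30000,
}


def save_albert_config(model_dir):
    """Write albert_config.json into model_dir and return its path."""
    path = Path(model_dir) / "albert_config.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(ALBERT_BASE_CONFIG, indent=2), encoding="utf-8")
    return path
```

Calling `save_albert_config('/root/kg/bert/albert_base_en_tfhub')` would place the file where `config_path` in the example above expects it.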

koryako commented on June 9, 2024

from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import Tokenizer
import numpy as np

config_path = './albert_large/albert_config.json'
checkpoint_path = './albert_large/model.ckpt-best'
dict_path = './albert_large/30k-clean.vocab'

tokenizer = Tokenizer(dict_path, do_lower_case=True)  # build the tokenizer
model = build_transformer_model(config_path, checkpoint_path, model='albert')  # build the model and load the weights

# Encoding test

token_ids,segment_ids = tokenizer.encode(u'are you ok')

print('\n ===== predicting =====\n')
print(model.predict([np.array([token_ids]), np.array([segment_ids])]))

This raises the following error:

AttributeError: 'Tokenizer' object has no attribute '_token_unk_id'
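
One plausible cause, offered as an assumption rather than a confirmed diagnosis: `Tokenizer` expects a BERT-style WordPiece vocabulary that contains `[UNK]`, whereas `30k-clean.vocab` looks like a SentencePiece vocabulary (one tab-separated piece and score per line, with `<unk>` as the unknown token). The sketch below guesses which kind of file you have. `vocab_kind` is a hypothetical helper, not part of bert4keras.

```python
def vocab_kind(vocab_path):
    """Guess whether a vocab file is WordPiece or SentencePiece style."""
    tokens = []
    tab_separated = 0
    with open(vocab_path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            if "\t" in line:
                tab_separated += 1
            # Keep only the token, dropping any tab-separated score.
            tokens.append(line.split("\t")[0])
    if not tokens:
        return "unknown"
    # WordPiece vocab files list one bare token per line, including [UNK].
    if "[UNK]" in tokens and tab_separated == 0:
        return "wordpiece"
    # SentencePiece .vocab files pair every piece with a score.
    if "<unk>" in tokens or tab_separated == len(tokens):
        return "sentencepiece"
    return "unknown"
```

If this returns `"sentencepiece"`, loading the accompanying `30k-clean.model` with `SpTokenizer`, as in bojone's example above, is probably the way to go.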

PteroMaplePT commented on June 9, 2024

Thanks.

SchenbergZY commented on June 9, 2024

Sorry, I have read this example but still don't understand how to convert the pb file from https://tfhub.dev/google/albert_base/2 into a ckpt... Or is that not what the example is about?

bojone commented on June 9, 2024

@SchenbergZY No conversion is needed. The variables directory already contains the ckpt files, so they can be loaded directly in the way shown above.
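
To make this concrete: a TensorFlow checkpoint is addressed by a prefix rather than by a single file, which is why `checkpoint_path` in the example above ends in `variables/variables`. The sketch below locates that prefix in an extracted directory. `find_checkpoint_prefix` is a hypothetical helper, and the `variables.index` / `variables.data-*` file names are the usual SavedModel layout, assumed here rather than stated in this thread.

```python
from pathlib import Path


def find_checkpoint_prefix(model_dir):
    """Return the checkpoint prefix inside an extracted SavedModel directory."""
    var_dir = Path(model_dir) / "variables"
    index_file = var_dir / "variables.index"
    shards = sorted(var_dir.glob("variables.data-*"))
    if not index_file.is_file() or not shards:
        raise FileNotFoundError(f"no checkpoint files found under {var_dir}")
    # The prefix is the shared path without the .index / .data-* suffix.
    return str(var_dir / "variables")
```

The returned string can be passed as `checkpoint_path` to `build_transformer_model`.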

SchenbergZY commented on June 9, 2024

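Here is my situation: from https://tfhub.dev/google/albert_base/2 I can only download a file named 2.tar, and extracting it gives a single file named "2" with no extension (maybe a pb?). There are no ckpt files at all.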

bojone commented on June 9, 2024

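@SchenbergZY The download is 2.tar.gz, and extracting it gives a folder named 2 with many files inside. If that is not what you get, please download it again and learn how to extract a tar.gz. I am sure Google does not serve a unique download just for you.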
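
For anyone stuck at the same step, the archive can also be unpacked from Python. This is a minimal sketch using the standard-library `tarfile` module. The function name and paths are hypothetical, and downloading the archive is left out.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar or .tar.gz archive and return the sorted member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Mode "r:*" lets tarfile detect gzip compression automatically.
    with tarfile.open(archive_path, "r:*") as tar:
        names = sorted(tar.getnames())
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe paths inside the archive.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return names
```

Printing the returned names is a quick way to confirm that the `variables` and `assets` entries are present after extraction.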

SchenbergZY commented on June 9, 2024

Thanks. By reading through bert-for-tf2 I found out how to download 2.tar.gz.

Teddy-SC commented on June 9, 2024

from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import Tokenizer
import numpy as np

config_path = './albert_large/albert_config.json'
checkpoint_path = './albert_large/model.ckpt-best'
dict_path = './albert_large/30k-clean.vocab'

tokenizer = Tokenizer(dict_path, do_lower_case=True)  # build the tokenizer
model = build_transformer_model(config_path, checkpoint_path, model='albert')  # build the model and load the weights

# Encoding test

token_ids,segment_ids = tokenizer.encode(u'are you ok')

print('\n ===== predicting =====\n')
print(model.predict([np.array([token_ids]), np.array([segment_ids])]))

This raises the following error:

AttributeError: 'Tokenizer' object has no attribute '_token_unk_id'

I ran into the same problem. I switched to SpTokenizer, but it has no .match function, so the BERT + NER code raises an error.
