Comments (9)

bojone avatar bojone commented on June 9, 2024 2
import numpy as np
from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import SpTokenizer


config_path = '/root/kg/bert/albert_base_en_tfhub/albert_config.json'
checkpoint_path = '/root/kg/bert/albert_base_en_tfhub/variables/variables'
spm_path = '/root/kg/bert/albert_base_en_tfhub/assets/30k-clean.model'


tokenizer = SpTokenizer(spm_path)
model = build_transformer_model(config_path, checkpoint_path, model='albert')

token_ids, segment_ids = tokenizer.encode('language model')
print(model.predict([np.array([token_ids]), np.array([segment_ids])]))

Note that you need to save albert_config.json yourself.

from bert4keras.

koryako avatar koryako commented on June 9, 2024 1

from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import Tokenizer
import numpy as np

config_path = './albert_large/albert_config.json'
checkpoint_path = './albert_large/model.ckpt-best'
dict_path = './albert_large/30k-clean.vocab'

tokenizer = Tokenizer(dict_path, do_lower_case=True)  # build the tokenizer
model = build_transformer_model(config_path, checkpoint_path, model='albert')  # build the model and load the weights

# Encoding test
token_ids, segment_ids = tokenizer.encode(u'are you ok')

print('\n ===== predicting =====\n')
print(model.predict([np.array([token_ids]), np.array([segment_ids])]))

This raises the following error:

AttributeError: 'Tokenizer' object has no attribute '_token_unk_id'


PteroMaplePT avatar PteroMaplePT commented on June 9, 2024

Thanks.


SchenbergZY avatar SchenbergZY commented on June 9, 2024

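Sorry, I have read this example but still don't understand how to convert the pb file from https://tfhub.dev/google/albert_base/2 into a ckpt... Or is that not what this example is meant to show?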


bojone avatar bojone commented on June 9, 2024

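@SchenbergZY There is no need to convert anything. The files in the variables directory are already the ckpt files, and they can be loaded directly in the way shown above.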
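
As a rough illustration of why `checkpoint_path` ends in `variables/variables`: a TensorFlow checkpoint is stored as a `<prefix>.index` file plus one or more `<prefix>.data-*` shards, and loaders are given the prefix rather than a file name. The helper below is a hypothetical sketch (`find_checkpoint_prefixes` is not part of bert4keras) that lists such prefixes in a directory, assuming that standard file layout.

```python
from pathlib import Path


def find_checkpoint_prefixes(variables_dir):
    """Return checkpoint prefixes found directly inside variables_dir.

    Hypothetical helper, not a bert4keras API. It assumes the usual
    TensorFlow layout: `<prefix>.index` plus `<prefix>.data-*` shards.
    """
    directory = Path(variables_dir)
    prefixes = []
    for index_file in sorted(directory.glob('*.index')):
        prefix = index_file.with_suffix('')  # drop the trailing ".index"
        # Keep only prefixes that also have at least one data shard.
        if any(directory.glob(prefix.name + '.data-*')):
            prefixes.append(str(prefix))
    return prefixes
```

For the directory layout in the first comment, calling it on `/root/kg/bert/albert_base_en_tfhub/variables` would be expected to yield the same prefix that is passed there as `checkpoint_path`.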


SchenbergZY avatar SchenbergZY commented on June 9, 2024

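Here is the situation: from https://tfhub.dev/google/albert_base/2 I can only download a file named 2.tar. After extracting it I get a file named "2" with no extension (maybe a pb?), and there are no ckpt files at all.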


bojone avatar bojone commented on June 9, 2024

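@SchenbergZY What you download is 2.tar.gz, and extracting it gives a folder named 2 with many files inside. If that is not what you get, please download it again and learn how to extract a tar.gz. I am sure Google does not serve a unique download result just for you.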
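
For anyone unsure about the extraction step, here is a minimal sketch using Python's standard `tarfile` module. The function name `extract_archive` and the paths in the usage comment are placeholders for illustration, not something prescribed in this thread.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar or .tar.gz archive and return the member names, sorted."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Mode 'r:*' lets tarfile detect the compression (gzip or none) itself.
    with tarfile.open(archive_path, mode='r:*') as archive:
        names = archive.getnames()
        if hasattr(tarfile, 'data_filter'):
            # Newer Python versions: use the safer 'data' extraction filter.
            archive.extractall(dest, filter='data')
        else:
            archive.extractall(dest)
    return sorted(names)


# Example (placeholder paths):
# for name in extract_archive('2.tar.gz', 'albert_base_en_tfhub'):
#     print(name)
```

Printing the returned names is a quick way to confirm that the archive really contains a folder with a variables directory and an assets directory, as in the first comment's paths.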


SchenbergZY avatar SchenbergZY commented on June 9, 2024

Thanks. By looking at bert-for-tf2 I found out how to download 2.tar.gz.


Teddy-SC avatar Teddy-SC commented on June 9, 2024

from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import Tokenizer
import numpy as np

config_path = './albert_large/albert_config.json'
checkpoint_path = './albert_large/model.ckpt-best'
dict_path = './albert_large/30k-clean.vocab'

tokenizer = Tokenizer(dict_path, do_lower_case=True)  # build the tokenizer
model = build_transformer_model(config_path, checkpoint_path, model='albert')  # build the model and load the weights

# Encoding test
token_ids, segment_ids = tokenizer.encode(u'are you ok')

print('\n ===== predicting =====\n')
print(model.predict([np.array([token_ids]), np.array([segment_ids])]))

This raises the following error:

AttributeError: 'Tokenizer' object has no attribute '_token_unk_id'

I ran into the same problem. I switched to SpTokenizer, but it has no .match function, so my BERT + NER code raises an error.

