bojone / bert_in_keras
在Keras下微调Bert的一些例子; some examples of fine-tuning BERT in Keras
Hi Su, I'd like to ask you something. I've been studying BERT for a while, but this is my first real application, and I have a few questions about your nl2sql baseline. First, concatenating the question with the headers confused me at first, although my initial intuition was also to encode the question and the table headers together. After thinking it over, using multiple [CLS] tokens seems fine, since [CLS] only acts as a classification token during fine-tuning. But the input embedding consists of token, segment, and position embeddings. I assume the position embedding is added to the input internally by keras-bert during tokenization. What I don't understand is the all-zero segment embedding: since the question and the headers are concatenated, the segment ids cannot follow the two-sentence pattern [0,0,...,0,1,1,...,1], but if they are all zero, the segment embedding should carry no information at all. Or should I understand this as you simply discarding the segment embedding?
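For what it's worth, a tiny numpy sketch (hypothetical shapes, not the actual keras-bert internals) shows why all-zero segment ids are effectively the same as dropping the segment embedding: every position receives the same embedding row, i.e. a constant offset that later layers can absorb.

```python
import numpy as np

# Hypothetical segment-embedding table: 2 segment types, hidden size 4.
seg_embedding_table = np.random.rand(2, 4)

# All-zero segment ids for a 5-token input.
seg_ids = np.zeros(5, dtype=int)

# The lookup adds the SAME row to every position: a constant shift.
seg_vectors = seg_embedding_table[seg_ids]   # shape (5, 4)
```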
TypeError Traceback (most recent call last)
in ()
60 pcop_loss = K.sparse_categorical_crossentropy(cop_in, pcop)
61 pcop_loss = K.sum(pcop_loss * xm) / K.sum(xm)
---> 62 pcsel_loss = K.sparse_categorical_crossentropy(csel_in, pcsel)
63 pcsel_loss = K.sum(pcsel_loss * xm * cm) / K.sum(xm * cm)
64 loss = psel_loss + pconn_loss + pcop_loss + pcsel_loss
/usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py in sparse_categorical_crossentropy(target, output, from_logits, axis)
3343 output_shape = output.get_shape()
3344 targets = cast(flatten(target), 'int64')
-> 3345 logits = tf.reshape(output, [-1, int(output_shape[-1])])
3346 res = tf.nn.sparse_softmax_cross_entropy_with_logits(
3347 labels=targets,
TypeError: int returned non-int (type NoneType)
Traceback (most recent call last):
File "E:\PYTHON36\lib\contextlib.py", line 99, in exit
self.gen.throw(type, value, traceback)
File "E:\PYTHON36\lib\site-packages\tensorflow\python\framework\ops.py", line 5652, in get_controller
yield g
File "E:\PYTHON36\lib\site-packages\keras\engine\base_layer.py", line 491, in call
arguments=user_kwargs)
File "E:\PYTHON36\lib\site-packages\keras\engine\base_layer.py", line 559, in _add_inbound_node
output_tensors[i]._keras_shape = output_shapes[i]
IndexError: list index out of range
When I run nl2sql_baseline.py, I get this error:
Traceback (most recent call last):
File "subject_extract.py", line 47, in
D = D[D[2] != u'其他']
File "/home/work/.local/lib/python2.7/site-packages/pandas/core/ops.py", line 1728, in wrapper
(is_extension_array_dtype(other) and not is_scalar(other))):
File "/home/work/.local/lib/python2.7/site-packages/pandas/core/dtypes/common.py", line 1749, in is_extension_array_dtype
registry.find(dtype) is not None)
File "/home/work/.local/lib/python2.7/site-packages/pandas/core/dtypes/dtypes.py", line 89, in find
return dtype_type.construct_from_string(dtype)
File "/home/work/.local/lib/python2.7/site-packages/pandas/core/dtypes/dtypes.py", line 699, in construct_from_string
raise TypeError(msg.format(string))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
@bojone
np_utils.to_categorical(np.squeeze(y, axis=-1), NB_CLASSES)
model.compile(
loss='categorical_crossentropy',
optimizer=Adam(1e-5),  # use a sufficiently small learning rate
metrics=['accuracy']
)
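As a sanity check on the shapes, here is a numpy-only sketch (made-up labels) of what np_utils.to_categorical does to the squeezed integer labels before they are fed to categorical_crossentropy:

```python
import numpy as np

NB_CLASSES = 3
y = np.array([[0], [2], [1]])        # integer labels with a trailing axis
y_sq = np.squeeze(y, axis=-1)        # -> shape (3,)

# numpy equivalent of np_utils.to_categorical(y_sq, NB_CLASSES)
one_hot = np.eye(NB_CLASSES)[y_sq]   # -> shape (3, 3)
```

Alternatively, Keras can take the integer labels directly with loss='sparse_categorical_crossentropy', skipping the one-hot step.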
Su, in lines 287 to 304 of your code: when the predicted value appears at the very end of the question, the value is not added to conds at that point.
Hello, I'm new to NLP. I don't quite understand the following lines: in the extract_entity function, why is _ps1 decremented by 10? Thanks!
if len(_t) == 1 and re.findall(u'[^\u4e00-\u9fa5a-zA-Z0-9\*]', _t) and _t not in additional_chars:
_ps1[i] -= 10
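If I read the condition right, the -= 10 pushes down the scores of single characters that are neither CJK nor ASCII alphanumerics (punctuation and the like), so they are effectively never chosen as entity boundaries. A small self-contained sketch of that check (function name and penalty value are illustrative):

```python
import re

additional_chars = set()  # extra characters allowed through, if any

def penalize(tokens, scores, penalty=10):
    """Subtract `penalty` from the score of any single char that is not
    CJK, not an ASCII letter/digit, not '*', and not whitelisted."""
    for i, t in enumerate(tokens):
        if len(t) == 1 and re.findall(u'[^\u4e00-\u9fa5a-zA-Z0-9\*]', t) \
                and t not in additional_chars:
            scores[i] -= penalty
    return scores

scores = penalize(['中', ',', 'a'], [1.0, 1.0, 1.0])
```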
After training with sentiment.py and saving the model, loading it complains about a custom layer being present. How should this be handled?
Thanks for the examples, Su. I've already retrained the model on my own dataset and the results improved a lot. Now I'm trying to use the new model for prediction, but I seem to be stuck. Hope you can take a look:
text = '这辆车真是太差了!'
tokenizer = Tokenizer(token_dict)

def predict_sentiment(text):
    tokens = tokenizer.tokenize(text)
    indices = np.array([[token_dict[token] for token in tokens]])
    segments = np.zeros(len(tokens))
    result = model_in.predict([indices, segments], verbose=True)[0].argmax()
    return result

predict_sentiment(text)
At the moment, no matter what text I feed in, the result is always 0. Where did I go wrong?
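One thing worth checking (an assumption, since the rest of the pipeline isn't shown): indices has a batch axis but segments does not, so the two model inputs do not line up. With keras-bert both inputs are normally shaped (batch, seq_len), and tokenizer.encode(text), which returns (indices, segments) directly, may be the safer route. A shape-only sketch:

```python
import numpy as np

n_tokens = 7
indices = np.zeros((1, n_tokens))          # (batch, seq_len) -- as in the post
segments_posted = np.zeros(n_tokens)       # (seq_len,) -- missing batch axis
segments_fixed = np.zeros((1, n_tokens))   # matches indices
```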
Hi Su, in the relation_extraction example there is a line that uses random.choice to pick a subject: https://github.com/bojone/bert_in_keras/blob/master/relation_extract.py#L171
Once k1 is sampled, the correct k2 for that k1 is already determined, yet you also sample k2 randomly, so k2 may or may not be correct. When it is wrong, the loop `for j in items.get((k1, k2), []):` below iterates over [], so o1 and o2 are all zeros: the wrong subject has no corresponding object or predicate. Is this done to introduce negative training samples? Is there a reference that does it this way? I'm not familiar with this part and would appreciate your advice.
After slightly modifying the sentiment task to train my own 4-class model, prediction takes about 3 s per sample. What could be the cause?
train_data = json.load(open('../datasets/train_data_me.json'))
dev_data = json.load(open('../datasets/dev_data_me.json'))
id2predicate, predicate2id = json.load(open('../datasets/all_50_schemas_me.json'))
Where is this dataset?
Hello! After studying your code, I have a follow-up question.
After fine-tuning BERT, I saved the model with model.save().
But loading the model raises an error:
ValueError: Unknown layer: TokenEmbedding
Loading code: model = load_model(self.fmodel, custom_objects={"tf":tf, "backend":backend})
How can this be solved?
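The error means the deserializer cannot map the name 'TokenEmbedding' back to a class: load_model only knows the built-in layers plus whatever is passed in custom_objects, and the dict in the post contains tf and backend but no layer classes. keras-bert ships a helper for exactly this (if memory serves, keras_bert.get_custom_objects()) whose result can be merged into that dict. A toy stand-in for the lookup logic (hypothetical registry, no Keras required):

```python
# Toy model of how load_model resolves layer names during deserialization.
BUILTIN_LAYERS = {'Dense': 'DenseClass', 'Dropout': 'DropoutClass'}

def resolve_layer(name, custom_objects=None):
    registry = dict(BUILTIN_LAYERS)
    registry.update(custom_objects or {})
    if name not in registry:
        raise ValueError('Unknown layer: ' + name)
    return registry[name]

class TokenEmbedding:  # stand-in for keras_bert's custom layer
    pass

# Without registering the class, the lookup fails...
try:
    resolve_layer('TokenEmbedding')
    failed = False
except ValueError:
    failed = True

# ...and succeeds once it is passed via custom_objects.
resolved = resolve_layer('TokenEmbedding', {'TokenEmbedding': TokenEmbedding})
```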
Hello, I don't quite understand the purpose of the repair function in the relation extraction example. Why do albums, songs, and so on need special handling? Sorry to trouble you, and thank you.
(Because the competition has ended, the data can no longer be obtained.)
Hi Su:
I'd like to ask a question. Building on your code, I'm trying to reproduce X-SQL....
I've run into a problem: X-SQL has a Context Reinforcing Layer, which uses the global Context Embedding vector to apply attention and a residual computation to each vector in the list of header vectors [...]. Your code takes the vector at each column-name [CLS] position, concatenates them, and classifies; what I want instead is to run attention between the global vector hctx and each header's variable-length vector output. So the question is: how should batch_gather be designed?
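For the batch_gather part, here is a numpy sketch (made-up shapes) of gathering one vector per header [CLS] position from the encoder output; tf.batch_gather / tf.gather_nd express the same indexing in TensorFlow:

```python
import numpy as np

batch, seq_len, dim, n_headers = 2, 10, 4, 3
seq_output = np.random.rand(batch, seq_len, dim)   # encoder output
cls_pos = np.array([[0, 4, 7],                     # [CLS] positions of the
                    [0, 3, 8]])                    # headers, per sample

# Advanced indexing: result[b, h] = seq_output[b, cls_pos[b, h]]
header_vecs = seq_output[np.arange(batch)[:, None], cls_pos]  # (2, 3, 4)
```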
Su, could you give a data example? Just one would do. For relation extraction (relation_extract.py), I can understand the first two parts that read multi-line JSON into arrays,
but I don't understand this line:
id2predicate, predicate2id = json.load(open('../datasets/all_50_schemas_me.json'))
The original data is:
{"object_type": "地点", "predicate": "祖籍", "subject_type": "人物"}
{"object_type": "人物", "predicate": "父亲", "subject_type": "人物"}
How should it be processed? Thanks.
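Since json.load there unpacks into two values, all_50_schemas_me.json is presumably a two-element list [id2predicate, predicate2id] built from the per-line schema file. A hedged sketch of that preprocessing (file names and the sort order are assumptions from the post, not the author's exact script):

```python
import json

# The two example schema lines from the original all_50_schemas file.
lines = [
    '{"object_type": "地点", "predicate": "祖籍", "subject_type": "人物"}',
    '{"object_type": "人物", "predicate": "父亲", "subject_type": "人物"}',
]

predicates = sorted({json.loads(l)['predicate'] for l in lines})
id2predicate = {i: p for i, p in enumerate(predicates)}
predicate2id = {p: i for i, p in enumerate(predicates)}

# json.dump([id2predicate, predicate2id],
#           open('../datasets/all_50_schemas_me.json', 'w'))
# Note: JSON turns the integer keys of id2predicate into strings, so the
# training script may need to convert them back with int() after loading.
```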
When I run nl2sql_baseline.py, I get this error:
Traceback (most recent call last):
File 'nl2sql_baseline.py', line 254, in
pcsel_loss = K.sparse_categorical_crossentropy(csel_in, pcsel)
File '/home/work/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py', line 3345, in sparse_categorical_crossentropy
logits = tf.reshape(output, [-1, int(output_shape[-1])])
TypeError: int returned non-int (type NoneType)
So I'd like to know: which versions of TensorFlow, Keras, and Python did you use?
@bojone
Traceback (most recent call last):
File "F:/pycharm/bert_in_keras/relation_extract.py", line 240, in
k1v = Lambda(seq_gather)([t, k1])
File "D:\software\anoconda\envs\keras_bert\lib\site-packages\keras\engine\base_layer.py", line 451, in call
output = self.call(inputs, **kwargs)
File "D:\software\anoconda\envs\keras_bert\lib\site-packages\keras\layers\core.py", line 716, in call
return self.function(inputs, **arguments)
File "F:/pycharm/bert_in_keras/relation_extract.py", line 212, in seq_gather
return K.tf.gather_nd(seq, idxs)
AttributeError: module 'keras.backend' has no attribute 'tf'
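This usually means the installed Keras is too new: newer Keras versions no longer re-export TensorFlow as K.tf, so the fix is either to pin an older Keras or to import TensorFlow directly inside seq_gather and call tf.gather_nd. A numpy stand-in (so the indexing can be checked without TensorFlow installed) of what that gather does:

```python
import numpy as np

def seq_gather(seq, idxs):
    """Pick seq[i, idxs[i, 0]] for each sample i.
    In the original code this is K.tf.gather_nd(seq, idxs); with a recent
    Keras, `import tensorflow as tf` and call tf.gather_nd instead."""
    batch = np.arange(seq.shape[0])
    return seq[batch, idxs[:, 0].astype(int)]

seq = np.arange(24).reshape(2, 3, 4)   # (batch, seq_len, dim)
idxs = np.array([[1], [2]])            # one position per sample
picked = seq_gather(seq, idxs)         # (2, 4)
```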
neg = pd.read_excel('neg.xls', header=None)
pos = pd.read_excel('pos.xls', header=None)
Could you share these two files? Thanks!
May I ask what score you got when you trained it?