bert-utils's People

Contributors

terrifyzhao

bert-utils's Issues

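Error when starting two Bert objects at the same time

I need to start two Bert objects with different max_seq_len values (one is 100, the other is 300), and I get the following error: "ValueError: Requested return tensor 'final_encodes:0' not found in graph def". What is causing this? I don't have this problem with bert-as-service.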

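Ran text classification following the instructions; at the end it says the BertSim object has no test method

Training and evaluation both completed.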
INFO:tensorflow:***** Eval results *****
INFO:tensorflow: eval_accuracy = 0.5
INFO:tensorflow: eval_auc = 0.5
INFO:tensorflow: eval_loss = 0.6931482
INFO:tensorflow: global_step = 7812
INFO:tensorflow: loss = 0.6931507
Finally, this error appears:
Traceback (most recent call last):
File "fenlei.py", line 15, in
bs.test()
AttributeError: 'BertSim' object has no attribute 'test'
I looked at the similarity.py source, and this method indeed does not exist. Could the author please answer or fix this when time permits? Thanks.

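How to print only the sentence vector

Hello, I learned a lot from reading your code. Thank you for your hard work and for sharing it!

May I ask a question? I want to print the generated sentence vector, like this:

    with tf.gfile.GFile(tmp_file, 'wb') as f:
        f.write(tmp_g.SerializeToString())
        print(tmp_g.SerializeToString())

But the output looks very large. Why is that? Each sentence can be converted to a fixed-length vector, right? How long is that vector? And how can I print only the sentence vector?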

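Running extract_feature.py

Sometimes it raises: ValueError: generator yielded an element of shape (0,) where an element of shape (?, 128) was expected.
Also, when I input a sentence, every character gets its own vector. Do all of those vectors have to be added together at the end to form the vector for the sentence?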
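
On the second question, one common way to turn per-token vectors into a single sentence vector is to average them while skipping padding positions. The sketch below illustrates that idea in plain Python. It is not taken from extract_feature.py, and the names masked_mean_pool, token_vectors and input_mask are chosen here for illustration only.

```python
def masked_mean_pool(token_vectors, input_mask):
    """Average the token vectors of one sentence, ignoring padded positions.

    token_vectors: list of equal-length float lists, one per token position.
    input_mask:    list of 0/1 flags, 1 for a real token and 0 for padding.
    """
    kept = [vec for vec, keep in zip(token_vectors, input_mask) if keep]
    if not kept:
        raise ValueError("input_mask selects no tokens")
    hidden_size = len(kept[0])
    # Sum each dimension over the kept tokens, then divide by their count.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(hidden_size)]


# Toy data (hypothetical): three positions, the last one is padding.
token_vectors = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
input_mask = [1, 1, 0]

sentence_vector = masked_mean_pool(token_vectors, input_mask)
print(sentence_vector)  # [2.0, 3.0]
```

The result has the same length as a single token vector, so it does not grow with the sentence length.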

No module named "bert"

When I use "from bert.extrac_feature import BertVector", the following error message is shown:

No module named "bert"

How to solve it?
Thank you very much.

Running eval after training the model

tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [predictions must be in [0, 1]] [Condition x <= y did not hold element-wise:x (ArgMax:0) = ] [119 119 119...] [y (auc/Cast_1:0) = ] [1]

I ran into this problem. Is something configured incorrectly?

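Where can the sentence vector length be changed?

I generated sentence vectors following your instructions, but each time only 3 characters fit between [CLS] and [SEP], as shown below:
INFO:tensorflow:tokens: [CLS] 话 说 今 [SEP]

How can I change the length?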

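Why don't the sentence vectors need fine-tuning?

1. Why not use sentence vectors from the fine-tuned model? For similarity computation, could fine-tuned sentence vectors be combined with an algorithm such as annoy to first retrieve a few candidates?
2. Why does the code set layer_indexes = [-2] by default?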

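Problem when running in a GPU environment

Hello, I only need the sentence vectors, but something goes wrong in a GPU environment. Could you help? No error is raised, but the program stops making progress. It works fine on CPU. Thanks.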

2019-06-03 11:20:11.971647: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/nvptx_backend_lib.cc:134] Unknown compute capability (7, 5) .Defaulting to telling LLVM that we're compiling for sm_30
2019-06-03 11:20:13.429726: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/nvptx_backend_lib.cc:105] Unknown compute capability (7, 5) .Defaulting to libdevice for compute_20
2019-06-03 11:20:13.448273: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at xla_ops.cc:429 : Not found: ./libdevice.compute_20.10.bc not found

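Training loss suddenly increases. What is going on?

I am doing text similarity analysis with 1.6 million samples in total, half positive and half negative, with batch_size=16, learning_rate=0.00005, max_seq_len=64. After 1,000 steps the training loss was around 0.00001, but at about 90,000 steps it suddenly jumped to roughly 0.7 and has hovered around 0.7 ever since. Has anyone run into this? Thanks.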
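
One observation on the number itself: 0.7 is close to ln 2 (about 0.693), which is the cross-entropy loss of a two-class classifier that assigns probability 0.5 to every example. That would be consistent with the model having collapsed to chance-level output, although this is only an inference from the reported value and not a diagnosis. The arithmetic can be checked with a few lines (the function name is chosen here for illustration):

```python
import math


def binary_cross_entropy(p_true_class):
    """Cross-entropy loss for one example, given the probability
    the model assigns to the correct class."""
    return -math.log(p_true_class)


# A classifier that always outputs 0.5 for both classes.
chance_loss = binary_cross_entropy(0.5)
print(round(chance_loss, 7))  # 0.6931472
```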

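Similarity prediction method

For predicting similarity directly, is the process as follows: first train on the two CSV files in data by running sim.train() and sim.eval(), then comment out the sim.train() and sim.eval() steps and only do sim = BertSim() and sim.set_mode(tf.estimator.ModeKeys.PREDICT), after which sim.predict(sentence1, sentence2) can be used for prediction? Thanks.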

jupyter notebook

Hello! I just ran this file in Jupyter Notebook, but it seems that the zip file added to Jupyter Notebook cannot be encoded as UTF-8. How can I solve this problem? Thanks!

Multithreading

How does the multithreaded code keep inputs and outputs matched up? I don't quite understand it...

KeyError '0' occurs

I modified the code. Training ran without problems, but during evaluation and prediction a KeyError: '0' appears. I traced it to the line label_id = label_map[example.label]. What does this error mean?
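
A KeyError on that line means the label read from the data ('0') is not a key of label_map. The sketch below assumes label_map is built from the processor's label list by enumeration, as in the upstream BERT classifier scripts; the label values are hypothetical and the helper name is chosen here for illustration.

```python
def build_label_map(label_list):
    """Map each label string to an integer id (assumed construction)."""
    return {label: index for index, label in enumerate(label_list)}


# Hypothetical: the label list was changed to custom class names,
# but the eval/predict data (or a placeholder label) still uses '0'.
label_map = build_label_map(["stock", "sports"])

example_label = "0"
try:
    label_id = label_map[example_label]
except KeyError as err:
    missing_label = err.args[0]
    print("label not in label_map:", missing_label, "known:", sorted(label_map))

# With a label list that contains '0', the same lookup succeeds.
ok_map = build_label_map(["0", "1"])
print(ok_map["0"])  # 0
```

If the training data goes through the same lookup without errors, it may be worth comparing the labels in the eval and predict inputs against the label list.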

Sentence vector question

Is there any functional difference between your modified extract_feature.py and the officially provided source code?

Similarity problem

I fine-tuned on the default corpus, but the results are very poor. What might be the reason?

Error in the train.csv file

"我的手机号码换了,我的蚂蚁花贝蚂蚁借呗怎么转过来",蚂蚁借呗借的钱,转到挂失卡里了怎么办,0

This row contains an extra ASCII comma, which causes reading to fail.
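
Rows like this can be located with Python's csv module by counting fields. The sketch below assumes each row should have three columns (sentence 1, sentence 2, label), which is inferred from the row above and not confirmed by the repository. The second sample row is made up for contrast, and find_bad_rows is a name chosen here.

```python
import csv
import io


def find_bad_rows(csv_text, expected_fields=3):
    """Return (line_number, fields) for every row whose field count is wrong."""
    reader = csv.reader(io.StringIO(csv_text))
    return [
        (line_no, row)
        for line_no, row in enumerate(reader, start=1)
        if len(row) != expected_fields
    ]


sample = (
    # Row quoted in the issue: the unquoted second sentence contains an ASCII comma.
    '"我的手机号码换了,我的蚂蚁花贝蚂蚁借呗怎么转过来",蚂蚁借呗借的钱,转到挂失卡里了怎么办,0\n'
    # Hypothetical well-formed row.
    '花呗怎么还款,花呗如何还款,1\n'
)

bad_rows = find_bad_rows(sample)
for line_no, fields in bad_rows:
    print(line_no, len(fields))  # 1 4
```

Wrapping the second sentence in double quotes, as is already done for the first, would keep the comma inside a single field.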

Problem when running eval with this code

tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [predictions must be in [0, 1]] [Condition x <= y did not hold element-wise:x (ArgMax:0) = ] [119 119 119...] [y (auc/Cast_1:0) = ] [1]

Hello, when I run your code, this problem occurs during eval. What is going on?

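Hello, I want to do plain classification and ran into a problem.

Hello, I want to do sentence classification, for example: 上半年证金公司***** (sentence), 股票 (label), with one class label per sentence. Training has completed, but during eval I hit this error: tensorflow.python.framework.errors_impl.InvalidArgumentError:assertion failed predictions must be in [0,1] [condition x <=y did not hold element-wise:x (ArgMax:0)= ] [5 5 5...] [y (auc/Cast_1:0) = ] [1]
What else in the code needs to be changed?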
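
The assertion text shows that the values being checked come from an ArgMax op (5 here, 119 in the similar reports above) and must lie in [0, 1]. tf.metrics.auc requires its predictions to be in that range, so class indices from a task with more than two labels fail the check, while indices from a two-label task happen to pass. The plain-Python sketch below only illustrates that condition. It is not the repository's metric code, and the scores are made up.

```python
def argmax(scores):
    """Index of the largest score in a list."""
    return max(range(len(scores)), key=lambda i: scores[i])


def valid_auc_predictions(predictions):
    """Mimic the range check: every prediction must be within [0, 1]."""
    return all(0 <= p <= 1 for p in predictions)


# Two-label task: argmax can only be 0 or 1.
binary_scores = [[0.2, 0.8], [0.9, 0.1]]
binary_preds = [argmax(row) for row in binary_scores]

# Six-label task (hypothetical scores): argmax can be as large as 5.
multiclass_scores = [[0.1, 0.1, 0.1, 0.1, 0.1, 0.5],
                     [0.0, 0.2, 0.1, 0.1, 0.1, 0.9]]
multiclass_preds = [argmax(row) for row in multiclass_scores]

print(binary_preds, valid_auc_predictions(binary_preds))          # [1, 0] True
print(multiclass_preds, valid_auc_predictions(multiclass_preds))  # [5, 5] False
```

For a multi-class task, one option is to drop the AUC metric from the eval metrics and report accuracy only. Whether that suits the use case is a judgment call.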

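About prediction accuracy

After training on your open-sourced data, the loss looks good and accuracy on the validation set is close to 80%. In my own tests, however, two sentences with high semantic similarity are often not recognized and frequently get a similarity of only 1%. The pairs that are recognized mostly have high character-level similarity to begin with. The model handles those easily, and I don't see any clear strength from BERT. Is this due to the dataset, or is similarity itself harder to handle than classification? Would BERT perform better on classification tasks?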

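extract_feature.py sentence vector demo: build graph shows "Could not find trained model in model_dir"

Hello, when I run your sentence vector generation code, I get the message quoted below.
My environment:
OS: Ubuntu 18.04
User: root
python: 3.6
tensorflow: 1.11
INFO:tensorflow:Could not find trained model in model_dir: /tmp/tmpkpgegq1e, running initialization to predict. My understanding is that the model could not be found, so BERT randomly initialized its weights, meaning the weights have not been trained at all. Is this normal? Does it affect the generated sentence vectors?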

Error when running BertSim

Hello, I have a question about an error I get when running text classification.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:*** Features ***
INFO:tensorflow: name = input_ids, shape = (128, 32)
INFO:tensorflow: name = input_mask, shape = (128, 32)
INFO:tensorflow: name = label_ids, shape = (128,)
INFO:tensorflow: name = segment_ids, shape = (128, 32)
Traceback (most recent call last):
File "bert_train.py", line 10, in
bs.train()
File "/data0/home/jinguo3/workspace/jinguo3/bert/bert_demo/bert-utils/similarity.py", line 627, in train
estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
File "/data0/home/jinguo3/anaconda2/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 354, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/data0/home/jinguo3/anaconda2/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1207, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/data0/home/jinguo3/anaconda2/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1237, in _train_model_default
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "/data0/home/jinguo3/anaconda2/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 1195, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/data0/home/jinguo3/workspace/jinguo3/bert/bert_demo/bert-utils/similarity.py", line 203, in model_fn
num_labels, use_one_hot_embeddings)
TypeError: unbound method create_model() must be called with BertSim instance as first argument (got BertConfig instance instead)
Thanks

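How to tell which of two sentences is more similar

The similarity values obtained this way are either extremely close to 0 or extremely close to 1. My project needs to decide whether the similarity between sentence A and sentence B is higher than the similarity between sentence A and sentence C. In testing, the results were poor, as shown below: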
sentenceA:世界上世界上拥有摩天大楼最多的国家 sentenceB:世界上世界上拥有摩天大楼最多的国家 score: 0.9999398
sentenceA:世界上世界上拥有摩天大楼最多的国家 sentenceC:世界上世界摩天大楼最多的城市 score: 0.99997306
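
A classifier probability tends to saturate near 0 or 1, which makes it a coarse signal for ranking. One alternative that is sometimes used for "is B closer to A than C is" comparisons is the cosine similarity between sentence vectors. The sketch below assumes such vectors are already available (for example from a sentence encoder); the toy vectors and function names are hypothetical, and nothing here is claimed to match how BertSim scores pairs.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_vec, candidates):
    """Sort candidate names by cosine similarity to the query, best first."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy 3-dimensional vectors standing in for real sentence vectors.
vec_a = [1.0, 0.0, 1.0]
candidates = {"B": [1.0, 0.0, 1.0], "C": [1.0, 1.0, 0.0]}

ranking = rank_by_similarity(vec_a, candidates)
print([name for name, _ in ranking])  # ['B', 'C']
```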

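Why is encode implemented with a queue?

As the title says: why does encode use a queue to fetch sentence vectors asynchronously? I also see that the queue length is set to 1. Under concurrent calls, could that cause data to be lost?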
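
Regarding data loss: assuming the code uses Python's standard queue.Queue, a full queue does not silently drop items. A blocking put waits for space, and put_nowait raises queue.Full. Matching each input to its output under concurrent callers is a separate concern. The sketch below shows one way to keep them paired by serializing calls with a lock. It is a stand-alone illustration with a dummy worker, not the code from extract_feature.py, and all names are chosen here.

```python
import queue
import threading

input_queue = queue.Queue(maxsize=1)
output_queue = queue.Queue(maxsize=1)
encode_lock = threading.Lock()


def worker():
    """Background thread: consume one input, produce one output."""
    while True:
        item = input_queue.get()
        if item is None:  # sentinel: stop the worker
            break
        output_queue.put(item.upper())  # stand-in for running the model


def encode(text):
    """Only one caller at a time may use the queue pair, so the next
    output always belongs to the input this caller just submitted."""
    with encode_lock:
        input_queue.put(text)
        return output_queue.get()


worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()

results = {}


def call(text):
    results[text] = encode(text)


callers = [threading.Thread(target=call, args=(t,)) for t in ["a", "b", "c"]]
for caller in callers:
    caller.start()
for caller in callers:
    caller.join()

input_queue.put(None)
worker_thread.join()
print(results == {"a": "A", "b": "B", "c": "C"})  # True

# A full queue raises instead of dropping the new item.
full_queue = queue.Queue(maxsize=1)
full_queue.put("first")
try:
    full_queue.put_nowait("second")
    dropped_silently = True
except queue.Full:
    dropped_silently = False
```

Without the lock, two callers could each submit an input and then receive each other's output, so the pairing depends on callers not interleaving.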

Error after changing max_seq_len in args

As the title says, I get an error after changing the max_seq_len parameter. Do I need to change anything else?

ValueError: Dimensions must be equal, but are 5 and 128 for 'import/mul' (op: 'Mul') with input shapes: [?,5,768], [?,128,1].
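
The shapes in the message show two tensors that disagree on sequence length: one has 5 positions and the other 128. Element-wise multiplication follows broadcasting rules, under which two dimensions are compatible only if they are equal or one of them is 1. That suggests some part of the graph was still built with the old length, though the message alone does not say which part. A small checker for the rule, with None standing for the unknown batch dimension (the function name is chosen here):

```python
def broadcast_compatible(shape_a, shape_b):
    """Check two shapes against the broadcasting rule, comparing
    dimensions from the last one backwards. None means 'unknown'."""
    for dim_a, dim_b in zip(reversed(shape_a), reversed(shape_b)):
        if dim_a is None or dim_b is None:
            continue
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    return True


# Shapes from the error message: sequence lengths 5 and 128 clash.
print(broadcast_compatible((None, 5, 768), (None, 128, 1)))    # False
# Same sequence length on both sides: the multiply is valid.
print(broadcast_compatible((None, 128, 768), (None, 128, 1)))  # True
```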

Error when running BertVector

Exception in thread Thread-4:
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/bert-utils/extract_feature.py", line 83, in predict_from_queue
for i in prediction:
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 577, in predict
features, None, model_fn_lib.ModeKeys.PREDICT, self.config)
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1195, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/home/ubuntu/bert-utils/extract_feature.py", line 60, in model_fn
graph_def.ParseFromString(f.read())
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 125, in read
self._preread_check()
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 85, in _preread_check
compat.as_bytes(self.__name), 1024 * 512, status)
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/util/compat.py", line 61, in as_bytes
(bytes_or_text,))
TypeError: Expected binary or unicode string, got None

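Sina news classification results look random

When I fine-tuned on the Sina news classification dataset cnews, the resulting accuracy was only 0.1, which looks like pure guessing. I don't know what went wrong.
Also, I have 4 GPUs, but during training only the first one seems to be used. I have not configured anything for the GPUs. What should I change to use all of them? Thanks.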

| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     13410      C   python                                     10649MiB |
|    1     13410      C   python                                       215MiB |
|    2     13410      C   python                                       215MiB |
|    3     13410      C   python                                       215MiB |

Out of GPU memory

On a single 1080Ti, running the code below immediately runs out of GPU memory. Other desktop programs are using 400M.

from extract_feature import BertVector
bv = BertVector()
print(bv.encode(['今天天气不错']))
2019-06-11 19:40:21.473032: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
2019-06-11 19:40:24.593479: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:216] failed to load CUBIN: Internal: failed to load in-memory CUBIN: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-06-11 19:40:24.593505: F tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:881] Check failed: module != nullptr 

I also changed the configuration in extract_feature.py:

config.gpu_options.allow_growth = False
config.gpu_options.per_process_gpu_memory_fraction = 0.6

How to run on GPU

Hello, my laptop has a GTX1070. When I run the code I get ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[4096,768] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
It is probably running out of memory. What should I do?

_truncate_seq_pair method does not seem to be reasonable

For a sequence pair to work well under this method, the underlying assumption is that the two sentences are equally informative. In my practice, however, the shorter sentence may carry much less information, especially when it differs from the longer, more informative sentence. I hope assumptions like this one can be highlighted in the README so that other newcomers do not struggle with them in their own projects.
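
For readers unfamiliar with the method, the heuristic in the upstream BERT classifier script removes one token at a time from whichever sequence is currently longer until the pair fits. The sketch below restates that heuristic and assumes bert-utils keeps it unchanged. The function is renamed without the leading underscore, and the toy token lists are made up.

```python
def truncate_seq_pair(tokens_a, tokens_b, max_length):
    """Truncate a pair of token lists in place to a combined maximum length,
    always dropping the last token of the currently longer list."""
    while len(tokens_a) + len(tokens_b) > max_length:
        if len(tokens_a) > len(tokens_b):
            tokens_a.pop()
        else:
            tokens_b.pop()


# Toy example: a 10-token sentence paired with a 3-token sentence.
long_sentence = list("abcdefghij")
short_sentence = list("xyz")
truncate_seq_pair(long_sentence, short_sentence, max_length=8)

print("".join(long_sentence), "".join(short_sentence))  # abcde xyz
```

In this example only the longer sentence loses tokens, and always from its end, so any information near the end of a long sentence is what gets cut first.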
