acl2020spellgcn / SpellGCN
Home Page: https://arxiv.org/abs/2004.14166
Without modifying the code, and using the same settings as run.sh, my sentence-level results on SIGHAN15 are about 3 points lower than those reported in the paper.
How can I reproduce the paper's results?
Hi, I noticed the code also uses the SIGHAN test data as the validation set. Is this standard practice for the CSC task?
Hi, I checked the corresponding datasets: the paper says the SIGHAN 2013 test set has 1,000 sentences, but the repo has only 998; the paper says the training set has 281,379 sentences, but the repo has exactly two more. Why is that?
During training I hit an OOM error, apparently when the GCN layer builds the graph over the confusion set. How has everyone solved this?
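An aside, not from the thread: if the OOM really comes from materializing dense adjacency matrices over the confusion set in the GCN layer (one plausible culprit), keeping the per-relation adjacency sparse and normalizing it in sparse form avoids any dense n x n intermediate. A rough numpy/scipy sketch; the confusion-set size of 4,755 is taken from the MatMul shape reported in another issue here, and the random edge list is a stand-in for the pairs in spellGraphs.txt:

```python
import numpy as np
from scipy import sparse

# Confusion-set size from the reported shape [2, 4755, 768]; edges are
# a random stand-in for the character pairs in spellGraphs.txt.
n = 4755
rng = np.random.default_rng(0)
edges = rng.integers(0, n, size=(10000, 2))

# Sparse symmetric adjacency instead of a dense n x n matrix.
a = sparse.coo_matrix(
    (np.ones(len(edges)), (edges[:, 0], edges[:, 1])), shape=(n, n)
).tocsr()
a = a + a.T

# Symmetric normalization D^{-1/2} A D^{-1/2}, kept sparse throughout.
deg = np.asarray(a.sum(axis=1)).ravel()
d_inv_sqrt = sparse.diags(1.0 / np.sqrt(np.maximum(deg, 1.0)))
a_norm = d_inv_sqrt @ a @ d_inv_sqrt

# One GCN propagation step: sparse x dense, no dense n x n intermediate.
h = rng.standard_normal((n, 768)).astype(np.float32)
h_next = a_norm @ h  # shape (4755, 768)
```

In TF 1.x the same idea would use `tf.SparseTensor` with `tf.sparse.sparse_dense_matmul`.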
data/gcn_graph.ty_xj/relation_vocab.txt contains only the following three relations:
同音同调 (same pronunciation, same tone) 0
同音异调 (same pronunciation, different tone) 0
形近 (similar shape) 1
But data/gcn_graph.ty_xj/spellGraphs.txt contains three more besides these: 近音同调 (similar pronunciation, same tone), 近音异调 (similar pronunciation, different tone), and 同部首同笔画 (same radical, same stroke count). Why are these three ignored?
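Not part of the original issue, but a toy sketch of how a relation vocab like this could collapse edge labels into graph ids, with unlisted relations simply dropped. The file names and relation labels come from the issue above; the merging logic and the character pairs are my guesses:

```python
# Toy sketch (my guess at the merging logic; the character pairs are
# illustrative only, not taken from spellGraphs.txt).
relation_vocab = {
    "同音同调": 0,  # same pronunciation, same tone
    "同音异调": 0,  # same pronunciation, different tone
    "形近": 1,      # similar shape
}

edges = [
    ("传", "船", "同音同调"),
    ("买", "卖", "同音异调"),
    ("未", "末", "形近"),
    ("山", "三", "近音同调"),  # not in relation_vocab -> dropped
]

# Edges whose relation is absent from relation_vocab never reach a graph.
graphs = {0: [], 1: []}
for a, b, rel in edges:
    gid = relation_vocab.get(rel)
    if gid is not None:
        graphs[gid].append((a, b))

print(len(graphs[0]), len(graphs[1]))  # 2 1
```

Under this reading, the two pronunciation relations are merged into graph 0, shape similarity forms graph 1, and the other three relations never enter any graph.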
Serving via estimator.predict reloads the model on every call, which makes efficient deployment hard to achieve.
I followed https://blog.csdn.net/hezhefly/article/details/98877796?utm_term=Estimatortensorflow%E5%8A%A0%E8%BD%BD%E6%A8%A1%E5%9E%8B&utm_medium=distribute.pc_aggpage_search_result.none-task-blog-2~all~sobaiduweb~default-1-98877796&spm=3001.4430
but still could not get it working.
I also tried loading the trained ckpt and passing inputs via feed_dict, but the graph has no input placeholders. tensor_name_list is as follows:
<class 'list'>: ['global_step/Initializer/zeros', 'global_step', 'global_step/Assign', 'global_step/read', 'global_step/IsVariableInitialized', 'global_step/cond/Switch', 'global_step/cond/switch_t', 'global_step/cond/switch_f', 'global_step/cond/pred_id', 'global_step/cond/read/Switch', 'global_step/cond/read', 'global_step/cond/Switch_1', 'global_step/cond/Merge', 'global_step/add/y', 'global_step/add', 'Const', 'flat_filenames/shape', 'flat_filenames', 'TensorSliceDataset', 'FlatMapDataset', 'count', 'RepeatDataset', 'buffer_size', 'seed', 'seed2', 'ShuffleDataset', 'batch_size', 'num_parallel_calls', 'drop_remainder', 'ExperimentalMapAndBatchDataset', 'optimizations', 'OptimizeDataset', 'ModelDataset', 'IteratorV2', 'MakeIterator', 'IteratorToStringHandle', 'IteratorGetNext', 'Shape', 'strided_slice/stack', 'strided_slice/stack_1', 'strided_slice/stack_2', 'strided_slice', 'bert/embeddings/ExpandDims/dim', 'bert/embeddings/ExpandDims', 'bert/embeddings/word_embeddings/Initializer/truncated_normal/shape', 'bert/embeddings/word_embeddings/Initializer/truncated_normal/mean', 'bert/embeddings/word_embeddings/Initializer/truncated_normal/stddev', 'bert/embeddings/word_embeddings/Initializer/truncated_normal/TruncatedNormal', 'bert/embeddings/word_embeddings/Initializer/truncated_normal/mul', 'bert/embeddings/word_embeddings/Initializer/truncated_normal', 'bert/embeddings/word_embeddings', 'bert/embeddings/word_embeddings/Assign', 'bert/embeddings/word_embeddings/read', 'bert/embeddings/Reshape/shape', 'bert/embeddings/Reshape', 'bert/embeddings/GatherV2/axis', 'bert/embeddings/GatherV2', 'bert/embeddings/Shape', 'bert/embeddings/strided_slice/stack', 'bert/embeddings/strided_slice/stack_1', 'bert/embeddings/strided_slice/stack_2', 'bert/embeddings/strided_slice', 'bert/embeddings/Reshape_1/shape/1', 'bert/embeddings/Reshape_1/shape/2', 'bert/embeddings/Reshape_1/shape', 'bert/embeddings/Reshape_1', 'bert/embeddings/Shape_1', 'bert/embeddings/strided_slice_1/stack', 
'bert/embeddings/strided_slice_1/stack_1', 'bert/embeddings/strided_slice_1/stack_2', 'bert/embeddings/strided_slice_1', 'bert/embeddings/token_type_embeddings/Initializer/truncated_normal/shape', 'bert/embeddings/token_type_embeddings/Initializer/truncated_normal/mean', 'bert/embeddings/token_type_embeddings/Initializer/truncated_normal/stddev', 'bert/embeddings/token_type_embeddings/Initializer/truncated_normal/TruncatedNormal', 'bert/embeddings/token_type_embeddings/Initializer/truncated_normal/mul', 'bert/embeddings/token_type_embeddings/Initializer/truncated_normal', 'bert/embeddings/token_type_embeddings', 'bert/embeddings/token_type_embeddings/Assign', 'bert/embeddings/token_type_embeddings/read', 'bert/embeddings/Reshape_2/shape', 'bert/embeddings/Reshape_2', 'bert/embeddings/one_hot/on_value', 'bert/embeddings/one_hot/off_value', 'bert/embeddings/one_hot/depth', 'bert/embeddings/one_hot', 'bert/embeddings/MatMul', 'bert/embeddings/Reshape_3/shape/1', 'bert/embeddings/Reshape_3/shape/2', 'bert/embeddings/Reshape_3/shape', 'bert/embeddings/Reshape_3', 'bert/embeddings/add', 'bert/embeddings/assert_less_equal/x', 'bert/embeddings/assert_less_equal/y', 'bert/embeddings/assert_less_equal/LessEqual', 'bert/embeddings/assert_less_equal/Const', 'bert/embeddings/assert_less_equal/All', 'bert/embeddings/assert_less_equal/Assert/Const', 'bert/embeddings/assert_less_equal/Assert/Const_1', 'bert/embeddings/assert_less_equal/Assert/Const_2'...
I ran into the following error while running the model:
ValueError: Shape must be rank 2 but is rank 3 for 'gcn/GCN-0/relation_prototype_scope/MatMul' (op: 'MatMul') with input shapes: [2,4755,768], [768,1].
Has anyone else hit this? Any help would be greatly appreciated!
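Not from the original thread, but the error message itself suggests the usual fix: the legacy MatMul op only accepts rank-2 inputs, so the rank-3 tensor needs its leading dimensions flattened before the multiply and restored afterwards. A numpy sketch of the shape manipulation, with the [2, 4755, 768] and [768, 1] shapes copied from the error:

```python
import numpy as np

# Shapes copied from the error message: [2, 4755, 768] and [768, 1].
x = np.zeros((2, 4755, 768))
w = np.zeros((768, 1))

# Flatten the two leading dimensions before multiplying, then restore
# them (in TF 1.x this would be tf.reshape around tf.matmul).
flat = x.reshape(-1, 768)              # (9510, 768)
out = (flat @ w).reshape(2, 4755, 1)   # back to rank 3
```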
Hi, thanks for open-sourcing the data and part of the code. May I still ask whether there are plans to release the rest of the code? Thanks, and happy Children's Day!
I read the SpellGCN paper you wrote, and the results are great. Would it be convenient to publish the training script now? Thank you.
First of all, thank you very much for open-sourcing the data and code. Are there plans to release the model code soon?
Traceback (most recent call last):
  File "../run_spellgcn.py", line 1310, in <module>
    tf.app.run()
  File "C:\Users\gys14\anaconda3\envs\spellgcn\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "../run_spellgcn.py", line 1012, in main
    bert_config = modeling.BertConfig.from_json_file(FLAGS.bert_config_file)
  File "D:\PycharmProjects\SpellGCN-master\modeling.py", line 93, in from_json_file
    text = reader.read()
  File "C:\Users\gys14\anaconda3\envs\spellgcn\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 125, in read
    self._preread_check()
  File "C:\Users\gys14\anaconda3\envs\spellgcn\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 85, in _preread_check
    compat.as_bytes(self.__name), 1024 * 512, status)
  File "C:\Users\gys14\anaconda3\envs\spellgcn\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: C:/Program Files/Git/home/fanyin.cxy/chinese_L-12_H-768_A-12/bert_config.json : 系统找不到指定的路径。(The system cannot find the path specified.); No such process
As the title says: could you upload your trained model?
While reading the source I noticed SpellGCN still uses masked_lm_positions. Does that mean it cannot check all characters of a sentence in one pass, and instead needs multiple passes to check a whole sentence?
My reproduced BERT and SpellGCN results are already close to the paper, but the two models' metrics fluctuate: sometimes BERT is higher, sometimes SpellGCN. I also found a large gap in inference speed: SpellGCN's inference time is almost 4x BERT's. Is that plausible? Looking at the weights, SpellGCN only adds a few GCN layers on top of BERT, with fewer than 2.4M parameters, which is negligible compared with BERT. Why is the speed gap so large?
My understanding is that FP counts characters that were originally correct in the text but were "corrected" into wrong ones,
whereas correct_FP in the code counts characters that were detected as wrong but were not corrected to the right character.
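The distinction can be pinned down with a toy character-level sketch. The function names and definitions below are my reading of the issue, not the repo's actual metric code, and the example sentence is illustrative:

```python
# Toy sketch of the two notions of FP discussed above (my reading of
# the issue, not the repo's actual metric code).
def detection_fp(src, pred, gold):
    """Characters that were already correct but were changed by the model."""
    return sum(1 for s, p, g in zip(src, pred, gold) if s == g and p != s)

def correct_fp(src, pred, gold):
    """Characters the model changed but failed to fix to the gold character."""
    return sum(1 for s, p, g in zip(src, pred, gold) if p != s and p != g)

src  = "不怕措折地奋斗"   # source: 措 should be 挫
gold = "不怕挫折地奋斗"   # reference correction
pred = "不怕错折地愤斗"   # model: wrong fix for 措, spurious change to 奋

print(detection_fp(src, pred, gold))  # 1 (the spurious 奋 -> 愤 change)
print(correct_fp(src, pred, gold))    # 2 (措 -> 错 and 奋 -> 愤)
```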
Thanks for opening this repository. I just followed the README instructions and got this error.
It seems the fine-tuned model files are missing.
When will you release the fine-tuned model file? Thanks for your reply.
As the title says: how can the code do multi-GPU training? A single 2080 isn't big enough. Should I use Horovod?
While the model runs, the GPU only uses 253 MB, yet CPU usage reaches 1400%.
If I increase batch_size the CPU immediately OOMs, while the GPU is completely unaffected.
Is the GPU just there for show?
Is the BERT baseline simply a linear transformation added on top of the last-layer output?
Or is it similar to the GCN approach, using a dot product with the embedding table?
Thanks for any answer.
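For what it's worth, the two options being asked about can be contrasted in a few lines of numpy. Toy shapes, my own illustration, not the repo's implementation:

```python
import numpy as np

# Toy shapes: vocab of 5 characters, hidden size 4, sequence length 3.
rng = np.random.default_rng(0)
h = rng.standard_normal((3, 4))          # last-layer hidden states
emb_table = rng.standard_normal((5, 4))  # input embedding table

# Option 1: an independent linear layer with its own weight matrix.
w = rng.standard_normal((4, 5))
logits_linear = h @ w                    # (3, 5)

# Option 2: tied output head -- dot product with the embedding table,
# the scheme a GCN-refined character embedding would build on.
logits_tied = h @ emb_table.T            # (3, 5)
```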
INFO:tensorflow:guid: (sighan13-id=0)
INFO:tensorflow:tokens: [CLS] 我 看 过 许 多 勇 敢 的 人 , 不 怕 措 折 地 奋 斗 , 这 种 精 神 值 得 我 们 学 习 。 [SEP]
Running run.sh prints the output above. Does this mean the model is being trained on the erroneous sentences?
One more question: does the number of epochs affect the results much? My GPU memory is limited, and a single run takes too long.
Hi, I added the Wang2018 pseudo-data to the training set as well, and my results on SIGHAN15 and SIGHAN14 are about 1 point lower than the paper's.
Sentence level, detection F1:
SIGHAN15: paper 77.7, reproduced 76.46; SIGHAN14: paper 67.2, reproduced 65.43.
I ran it several times; the results fluctuate but never reach the paper's numbers. The paper says its results are averaged over 5 runs, so in theory repeated runs should scatter around the paper's value.
Could you provide a setting that fully reproduces the paper's results, e.g., the random seed or other details?
Can this project handle extra or missing characters (insertions and deletions)?
The paper's training set has 281,379 sentences, but this project ships only 10,052. Where can I obtain the paper's training data?
The pinyin relations in spellGraphs.txt feel too sparse: characters with commonly confused finals such as an-ang and uan-uang, which appear often in practice, are missing. Please consider completing them.