Comments (10)
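I've run into the repetition problem too, and it seems hard to reproduce karpathy's results. Training with his own program also appears to produce repetition, so I suspect it is related to the parameter settings.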
from char-rnn.
Indeed. I went through many related projects on GitHub, and none of them gave results as good as karpathy's Python version (I'm not familiar with torch).
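But it's probably not a parameter problem. When I trained karpathy's program I tried all kinds of parameter settings, and the results were decent every time.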
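Really? I haven't tried his torch code. I did try the simple model he wrote in numpy, and its results didn't seem very good either. I guess the torch model uses a few small tricks. I'm not very familiar with torch either -_-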
https://github.com/hejunqing/tf-char-cnn-lstm
Take a look at this updated version. It comes very close to Yoon Kim's paper.
So, may I ask what causes the translation output to keep repeating? I've run into the same thing.
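@apeterswu There are actually two strategies at generation time: argmax and sample. This program uses argmax, and that strategy causes the repetition. The sample strategy does not, but its sentences are somewhat less coherent than argmax output (karpathy's program uses sample by default). Also, after recently rewriting this model in TensorFlow, I found that adding more training data and using a multi-layer RNN lets the sequence run longer before repetition sets in (under the argmax strategy).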
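To make the difference between the two strategies concrete, here is a minimal sketch. It does not use this repository's model: `VOCAB`, `LOGITS`, and `generate` are hypothetical stand-ins, and the "model" is just a fixed table in which the next character depends only on the previous one. Under that assumption, argmax falls into a loop as soon as it revisits a character, while sampling can leave the loop.

```python
import math
import random

# Hypothetical toy "model": fixed next-character logits per previous character.
VOCAB = ["a", "b", "c"]
LOGITS = {
    "a": [0.1, 2.0, 1.5],  # after "a", "b" is most likely
    "b": [2.0, 0.1, 1.5],  # after "b", "a" is most likely
    "c": [1.0, 1.0, 1.0],  # after "c", all characters are equally likely
}

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_argmax(probs):
    """Always take the single most probable index."""
    return max(range(len(probs)), key=lambda i: probs[i])

def pick_sample(probs, rng):
    """Draw an index at random in proportion to its probability."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate(start, length, strategy, rng=None):
    out = [start]
    for _ in range(length):
        probs = softmax(LOGITS[out[-1]])
        if strategy == "argmax":
            idx = pick_argmax(probs)
        else:
            idx = pick_sample(probs, rng)
        out.append(VOCAB[idx])
    return "".join(out)

greedy = generate("a", 8, "argmax")
sampled = generate("a", 8, "sample", random.Random(0))
print("argmax:", greedy)   # deterministic, cycles between "a" and "b"
print("sample:", sampled)  # depends on the seed
```

A real RNN carries hidden state, so its loops tend to appear later and be longer than in this table-driven toy, which is consistent with the observation above that more data and more layers delay the repetition rather than remove it.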
@hit-computer But decoding uses beam search, which is still a form of argmax, so wouldn't this problem still be there?
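@apeterswu Yes. At each step beam search keeps the top-N highest-scoring candidates, so it also runs into the repetition problem. My suggestion is still to add more training data, increase the number of hidden-layer parameters and training iterations, and then use the sample strategy.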
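As a rough illustration of why beam search does not help here, the sketch below runs a plain beam search over a hypothetical toy next-character table (`VOCAB`, `LOGITS`, and `beam_search` are made up for this example and are not part of this repository). Because the search is deterministic and ranks candidates by total log-probability, its best hypothesis is the same repeating sequence that greedy argmax would produce on this table.

```python
import math

# Hypothetical toy "model": fixed next-character logits per previous character.
VOCAB = ["a", "b", "c"]
LOGITS = {
    "a": [0.1, 2.0, 1.5],
    "b": [2.0, 0.1, 1.5],
    "c": [1.0, 1.0, 1.0],
}

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def beam_search(start, length, beam_width):
    """Keep the beam_width best partial sequences by summed log-probability."""
    beams = [(0.0, start)]
    for _ in range(length):
        candidates = []
        for score, seq in beams:
            probs = softmax(LOGITS[seq[-1]])
            for i, p in enumerate(probs):
                candidates.append((score + math.log(p), seq + VOCAB[i]))
        # Highest total log-probability first, then prune to the beam width.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][1]

best = beam_search("a", 8, beam_width=3)
print(best)  # the top hypothesis alternates between "a" and "b"
```

Widening the beam changes which alternatives are kept alongside the best hypothesis, but on this table it does not change the top result, since no randomness is involved.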
Thanks for sharing. Do you know the parameters used for the examples in karpathy's blog?