
lmthang / nmt.matlab


Code to train state-of-the-art Neural Machine Translation systems.

Home Page: http://nlp.stanford.edu/projects/nmt/

MATLAB 52.48% Shell 4.21% Python 17.17% Perl 25.50% Smalltalk 0.64%

nmt.matlab's People

Contributors: lmthang

nmt.matlab's Issues

model ensemble

Hi,

Is the model ensemble part available? If not, could you at least explain how to do model ensembling?

Here is my current understanding. Given the sequence x_1, x_2, ..., x_n and two models m_1, m_2:

  1. Run the encoders of m_1 and m_2 to get the states s_1 and s_2.
  2. Feed the states into the decoders of m_1 and m_2; the first decoder input is EOS.
  3. Each decoder produces a probability distribution over the next word, p_1 and p_2; average them to get (p_1 + p_2)/2.
  4. Pick the top-k words under the averaged distribution as candidates and feed them into the next step; the states s_1 and s_2 are both updated after each step.

Is this correct? I implemented it this way, but I don't see any improvement from using several models.
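For what it's worth, the averaging step described above can be sketched as follows (a minimal NumPy sketch with illustrative names, not the repo's actual ensembling code):

```python
import numpy as np

def ensemble_step(prob_dists, k):
    """Average the per-model next-word distributions and return the
    averaged distribution plus the top-k candidate word ids."""
    avg = np.mean(np.stack(prob_dists), axis=0)  # (p_1 + ... + p_M) / M
    top_k = np.argsort(avg)[::-1][:k]            # ids of the k largest probs
    return avg, top_k

# Two toy "model" distributions over a 4-word vocabulary.
p1 = np.array([0.1, 0.6, 0.2, 0.1])
p2 = np.array([0.3, 0.3, 0.3, 0.1])
avg, top2 = ensemble_step([p1, p2], k=2)
# avg = [0.2, 0.45, 0.25, 0.1]; the top-2 candidate ids are 1 and 2.
```

One common pitfall is score bookkeeping during beam search: each hypothesis must accumulate the log of the *averaged* probability, and every model's decoder state must be advanced with the same chosen word; mixing per-model scores with averaged picks can cancel out any ensemble gain.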

NaN values in "srcVecsAll"

Hi,

I am wondering why I am getting the error
"Subscript indices must either be real positive integers or logicals." in buildSrcVecs, line 58. When I checked srcVecsAll, it contains NaN values, although it should contain only integer indices. Could you help me understand why this happens and what causes these NaN values?
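As a debugging aid (not a fix): NaN is a float and can never be a valid subscript, so checking the index array before using it pinpoints the offending entries. A minimal NumPy analogue of the situation (variable names are illustrative):

```python
import numpy as np

# An index array containing NaN cannot be used as a subscript; the
# MATLAB analogue raises "Subscript indices must either be real
# positive integers or logicals."
src_vecs_all = np.array([3.0, 1.0, np.nan, 2.0])

bad = np.isnan(src_vecs_all)   # locate the offending entries
assert bad.any()               # confirms a NaN is really present

# Filtering (or asserting on) the NaNs before indexing narrows down
# which sentence produced them, e.g. an empty source line or a
# source/target length mismatch in the data files.
valid = src_vecs_all[~bad].astype(int)
```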

Thanks

the content about translations.txt

Hi,
when I use the command lines:
trainLSTM('../output/id.1000/train.10k', '../output/id.1000/valid.100', '../output/id.1000/test.100', 'de', 'en', '../output/id.1000/train.10k.de.vocab.1000', '../output/id.1000/train.10k.en.vocab.1000', '../output/basic', 'isResume', 0)
testLSTM('../output/advanced/modelRecent.mat', 2, 10, 1, '../output/advanced/translations.txt')
the resulting translations.txt contains only line numbers:
"1.
2.


...
...
...
100. "
Could you please tell me how to get the correct translations? translations.txt should contain the translations of the test set.

Error when running the code on toy data

Hi everyone,
I am new to neural networks. I am trying to run trainLSTM on the toy data, but I get this error:


trainLSTM('../output/id.1000/train.10k', '../output/id.1000/valid.100', '../output/id.1000/test.100', 'de', 'en', '../output/id.1000/train.10k.de.vocab.1000', '../output/id.1000/train.10k.en.vocab.1000', '../output/basic', 'isResume', 0)

Loading vocab from file ../output/id.1000/train.10k.en.vocab.1000 ...
Loading vocab from file ../output/id.1000/train.10k.en.vocab.1000 ...
Loading vocab from file ../output/id.1000/train.10k.de.vocab.1000 ...
Bilingual setting
Init LSTM parameters using dataType=double, initRange=0.1
Model size = 458700, individual sizes: W_src{1}=80000 W_tgt{1}=80000 W_emb_src=99500 W_emb_tgt=99600 W_soft=99600
assert = 0
attnFunc = 0
attnOpt = 0
batchSize = 128
dataType = double
debug = 0
decode = 1
dropout = 1
epochFraction = 1
feedInput = 0
finetuneEpoch = 5
finetuneRate = 0.5
gpuDevice = 0
initRange = 0.1
isBi = 1
isClip = 1
isGradCheck = 0
isProfile = 0
isResume = 0
isReverse = 0
learningRate = 1
loadModel =
logFreq = 10
lstmOpt = 0
lstmSize = 100
maxGradNorm = 5
maxLenRatio = 1.5
maxSentLen = 51
minLenRatio = 0.5
numEpoches = 10
numLayers = 1
onlyCPU = 0
outDir = ../output/basic
posWin = 10
saveHDF = 0
seed = 0
shuffle = 1
sortBatch = 1
srcLang = de
srcVocabFile = ../output/id.1000/train.10k.de.vocab.1000
testPrefix = ../output/id.1000/test.100
tgtLang = en
tgtVocabFile = ../output/id.1000/train.10k.en.vocab.1000
trainPrefix = ../output/id.1000/train.10k
validPrefix = ../output/id.1000/valid.100
chunkSize = 12800
baseIndex = 0
clipForward = 50
clipBackward = 1000
nonlinear_gate_f = sigmoid
nonlinear_gate_f_prime = sigmoidPrime
nonlinear_f = tanh
nonlinear_f_prime = tanhPrime
beamSize = 12
stackSize = 100
unkPenalty = 0
lengthReward = 0
isGPU = 0
logId = 6
srcSos = 994
srcEos = 995
srcVocabSize = 995
nullPosId = 0
tgtSos = 995
tgtEos = 996
tgtVocabSize = 996
modelFile = ../output/basic/model.mat
modelRecentFile = ../output/basic/modelRecent.mat
softmaxSize = 100
lr = 1
epoch = 1
bestCostValid = 100000
testPerplexity = 100000
curTestPerpWord = 100000
startIter = 0
iter = 0
epochBatchCount = 0
finetuneCount = 0
modelSize = 458700
Loading data src from file ../output/id.1000/valid.100.de
src 1:gegen , um der von
src end: eine die des von hat , eine for eine .
Loading data tgt from file ../output/id.1000/valid.100.en
tgt 1:than to the ##AT##-##AT## of
tgt end: one a in the of , however an .
numSents=100, numWords=1953
Loading data src from file ../output/id.1000/test.100.de
src 1: und sich gibt Verfügung
src end:Der aber ein des vor und sie am , dass die des stehen und per 3 , um , Ort es für auf , 2 er .
Loading data tgt from file ../output/id.1000/test.100.en
tgt 1: and minute land time
tgt end:The a of the beginning and that the 's double " and in " to mine the guests en 's .
numSents=100, numWords=2225
Load train data srcFile "../output/id.1000/train.10k.de" and tgtFile "../output/id.1000/train.10k.en"
src 1: Unsere aus der Ihrer .
src end:weiß Ein ich und auch eingesetzt sei , wie es Modul war : Express einen , New dieser als Wann 17 , der auch gibt from war , ihm von seiner zu seiner durch ihr gehen zu , für eine so , daß da die der am , und guten auf eine Apartments , daß sich in Dateien eine und .
tgt: from … do .
tgt end:cities shall We apartment , and restaurants has 24 , 't were it was that an a , of a mobile first de of , and one , bit , we was to the of a by his entire unto , Lord be a of to the of the some of the available download of the information , and in a to having in about a of the available … do and reviews .
src input 1: <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> <s_sos> Teil Entwicklung Nähe Sterne des Hotel in kommen Sie eine dir .
src mask: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
tgt input 1: <t_sos> than ##UNDERSCORE## Update is 00 to short your % at Hotel in . <t_eos>
tgt output 1: than ##UNDERSCORE## Update is 00 to short your % at Hotel in . <t_eos> <t_eos>
tgt mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0
Epoch 1, lr=1, 30-Oct-2015 19:47:17
varsDenseUpdate: W_src W_tgt W_soft
Index exceeds matrix dimensions.

Error in lstmCostGrad (line 110)
x_t = W_emb(:, input(:, tt));

Error in trainLSTM (line 269)
[costs, grad] = lstmCostGrad(model, trainData, params, 0);
Could anyone help me with it? Thanks
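That error at `W_emb(:, input(:, tt))` typically means some token id in `input` exceeds the number of columns of `W_emb`, i.e. the vocab files do not match the preprocessed data (note the log shows srcVocabSize = 995 and tgtVocabSize = 996). A quick range check on the data, sketched in NumPy with illustrative names, is:

```python
import numpy as np

def check_vocab_ids(input_ids, vocab_size, base_index=0):
    """Return True if every token id fits an embedding matrix with
    `vocab_size` columns. With baseIndex = 0 (as in the log above),
    valid ids in the data files run 0 .. vocab_size-1; they are
    shifted to 1-based indices inside MATLAB."""
    ids = np.asarray(input_ids)
    return bool(((ids >= base_index) & (ids < base_index + vocab_size)).all())

# tgtVocabSize = 996 in the log; an id of 996 or above overflows W_emb.
assert check_vocab_ids([0, 5, 995], 996) is True
assert check_vocab_ids([0, 5, 996], 996) is False
```

If the check fails, the usual cause is that the vocab files were built from a different corpus than the id files being loaded.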

buildSrcVecs.m

Hi, lmthang,
I am puzzled by the following code in buildSrcVecs.m:

% check those that are out of boundaries
excludeIds = find(indicesAll>numSrcHidVecs | indicesAll<(srcMaxLen-srcLens+1));

I do not understand this part:
indicesAll<(srcMaxLen-srcLens+1)

If srcMaxLen is 10 and one of the sentences has length 7, then according to the code, indices < (10-7+1) = 4 are out of bounds. Why? Can you explain it?
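For anyone else puzzled by this: source sentences in a batch are right-aligned, i.e. left-padded to srcMaxLen (see the `src mask: 0 0 ... 1 1` line in the training log above), so a sentence of length srcLens occupies positions srcMaxLen-srcLens+1 .. srcMaxLen and everything to its left is padding. The exclusion test can be sketched like this (1-based positions as in MATLAB; illustrative Python, not the repo's code):

```python
def out_of_bound(indices, num_src_hid_vecs, src_max_len, src_len):
    """Replicate the exclusion test: positions past the available
    hidden vectors, or inside the left padding, are out of bounds
    (positions are 1-based, as in MATLAB)."""
    first_real = src_max_len - src_len + 1  # first non-padding position
    return [i for i in indices
            if i > num_src_hid_vecs or i < first_real]

# srcMaxLen=10, sentence length 7: real words sit at positions 4..10,
# so indices 1..3 fall inside the padding and are excluded.
assert out_of_bound([1, 3, 4, 10], num_src_hid_vecs=10,
                    src_max_len=10, src_len=7) == [1, 3]
```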
