Comments (14)
I am also adding instructions on how to train and load the gnmt model from scratch.
Thanks, I will need to update nmt/standard_hparams/wmt16_en_de_gnmt.json.
I ran a standard attention / scaled_luong / uni system and got the expected results.
With the gnmt architecture / scaled_luong / enc_type gnmt, the results are completely off.
Is there something special to do for the GNMT attention architecture?
@vince62s Did you check with the standard_hparams for GNMT? There are also pre-trained models available for download on the README page.
Same problem here. After training, when doing inference I get:
KeyError: 'num_encoder_residual_layers'
It only works when I delete all these keys from the hparams file and set --hparams_path to the directory of the best_bleu checkpoint, but then after one run, for some reason, it rewrites the hparams file and adds these problematic key/values again... It's not clear how this mechanism works.
My guess is that when the code saves hparams, it simply writes key/values that it isn't supposed to.
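For what it's worth, here is a minimal sketch (not from the repo) of the workaround I mean: strip the problematic derived keys from the saved hparams JSON before running inference. It assumes the hparams file under out_dir is plain JSON; the path and key list below are just examples for my setup.
# Hedged workaround sketch: remove derived keys such as
# 'num_encoder_residual_layers' from a saved hparams JSON file.
import json

def strip_hparams_keys(hparams_path, keys_to_drop):
    with open(hparams_path) as f:
        hparams = json.load(f)
    for key in keys_to_drop:
        hparams.pop(key, None)  # drop the key only if it is present
    with open(hparams_path, "w") as f:
        json.dump(hparams, f, indent=2, sort_keys=True)

# Example call; adjust the hparams path and key list to your own out_dir.
strip_hparams_keys("/tmp/nmt_attention_model/hparams",
                   ["num_encoder_residual_layers"])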
@NadavB Can you share the command that produces the error? Were you using the standard_hparams file in the repo for inference?
There have been some updates to the hparams recently, so I think the standard_hparams may be out of date.
@oahziur I did not use the standard hparams. I used the params as shown in the tutorial.
So for training:
python nmt/nmt.py \
--attention=scaled_luong \
--src=vi --tgt=en \
--vocab_prefix=tmp/nmt_data/vocab \
--train_prefix=tmp/nmt_data/train \
--dev_prefix=tmp/nmt_data/tst2012 \
--test_prefix=tmp/nmt_data/tst2013 \
--out_dir=/tmp/nmt_attention_model \
--num_train_steps=5000 \
--steps_per_stats=20 \
--num_layers=2 \
--num_units=128 \
--dropout=0.2 \
--metrics=bleu
And for inference:
python nmt/nmt.py \
--out_dir=/tmp/nmt_attention_model \
--inference_input_file=/tmp/nmt_data/source_infer.vi \
--inference_output_file=/tmp/nmt_attention_model/output_infer
@NadavB Hello. i"m studying nmt.
i want to run test file. so i ran just. nmt,py but failed.
how to command your script?? please let me know basiclly
@LimWoohyun Look at https://github.com/tensorflow/nmt and search for "Hands-on – building an attention-based NMT model"; the command is written there.
@oahziur I get a key-not-found error using the standard_hparams (tf 1.6rc1; will try on my other machine with tf 1.5-cuda):
NotFoundError (see above for traceback): Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
using the command:
[bquast@UX370UA ~]$ cd nmt
[bquast@UX370UA nmt]$ python -m nmt.nmt \
> --src=de --tgt=en \
> --ckpt=deen_gnmt_model_4_layer/translate.ckpt \
> --hparams_path=nmt/standard_hparams/wmt16_gnmt_4_layer.json \
> --out_dir=/tmp/deen_gnmt \
> --vocab_prefix=/home/bquast/en_de_data/vocab.bpe.32000 \
> --inference_input_file=/home/bquast/en_de_data/newstest2014.tok.bpe.32000.de \
> --inference_output_file=/home/bquast/deen_gnmt_model_4_layer/output_infer \
full output here:
https://gist.github.com/bquast/30ba7630d2bf32b59dd8349889fc7638
EDIT: confirmed, same error on tf 1.5-cuda
https://gist.github.com/bquast/0ddbf8eda363d312dd57b51aebb11f5d
@bquast I recently got the error using the same configuration.
Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
I tried this with tf 1.4 too, but no luck. Are there any updates on this?
Thank you.
Hey, no news yet. Any progress on your side?
@bquast I think this is related to #264, and there is a PR that fixes it: #265. Maybe you can try patching in the PR and see if you still get the issue. Make sure you clear the model directory.
@bquast @tiberiu92 @oahziur
I got the same error using the same configuration. (tf-1.8, python-2.7)
python -m nmt.nmt \
--src=de --tgt=en \
--ckpt=/home/xiaohao/nmt/models/deen_gnmt_model_4_layer/translate.ckpt \
--hparams_path=nmt/standard_hparams/wmt16_gnmt_4_layer.json \
--out_dir=/home/xiaohao/data/deen_gnmt \
--vocab_prefix=/home/xiaohao/data/wmt16/vocab.bpe.32000 \
--inference_input_file=/home/xiaohao/data/wmt16/newstest2015.tok.bpe.32000.de \
--inference_output_file=/home/xiaohao/data/deen_gnmt/output_infer \
--inference_ref_file=/home/xiaohao/data/wmt16/newstest2015.tok.bpe.32000.en
NotFoundError (see above for traceback): Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint
I printed the keys of deen_gnmt_model_4_layer/translate.ckpt and could not find .../rnn/basic_lstm_cell/bias:
xiaohao@ubuntu:~/nmt$ python ckpt_print.py models/deen_gnmt_model_4_layer/translate.ckpt
('CHECKPOINT_FILE: ', 'models/deen_gnmt_model_4_layer/translate.ckpt')
('tensor_name: ', 'embeddings/encoder/embedding_encoder')
('tensor_name: ', 'dynamic_seq2seq/decoder/memory_layer/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_3/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_3/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/output_projection/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/query_layer/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_1/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_v')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_b')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_g')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_1/basic_lstm_cell/bias')
('tensor_name: ', 'Variable')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/basic_lstm_cell/bias')
('tensor_name: ', 'embeddings/decoder/embedding_decoder')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_2/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_1/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_2/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_2/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_2/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_1/basic_lstm_cell/kernel')
xiaohao@ubuntu:~/nmt$
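(ckpt_print.py is just a small helper of my own; a minimal sketch of that kind of script, assuming the TF 1.x checkpoint API and not the exact file I ran, looks like this:)
import sys
import tensorflow as tf

# Illustrative checkpoint-key dumper: lists the tensor names stored in a
# checkpoint via tf.train.NewCheckpointReader (TF 1.x API).
def print_checkpoint_keys(ckpt_path):
    print("CHECKPOINT_FILE:", ckpt_path)
    reader = tf.train.NewCheckpointReader(ckpt_path)
    for name in sorted(reader.get_variable_to_shape_map()):
        print("tensor_name:", name)

if __name__ == "__main__":
    # Usage: python ckpt_print.py models/deen_gnmt_model_4_layer/translate.ckpt
    print_checkpoint_keys(sys.argv[1])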
I tried the PR (#265) and ran rm -rf /home/xiaohao/data/deen_gnmt/*, and the problem is solved!
Thanks @oahziur