
Failed to run GNMT about nmt HOT 14 OPEN

tensorflow commented on May 19, 2024
Failed to run GNMT

from nmt.

Comments (14)

oahziur commented on May 19, 2024

I am also adding instructions on how to train and load the GNMT model from scratch.

oahziur commented on May 19, 2024

Thanks, I will need to update nmt/standard_hparams/wmt16_en_de_gnmt.json.

vince62s commented on May 19, 2024

I ran a standard attention / scaled_luong / uni system and got the expected results.
With the gnmt architecture / scaled_luong / enc_type gnmt, the results are completely off.
Is there something special to do for the GNMT attention architecture?

oahziur commented on May 19, 2024

@vince62s Did you check with the standard_hparams for GNMT? There are also pre-trained models available for download on the README page.

ndvbd commented on May 19, 2024

Same problem here. After training, when doing inference I get:

KeyError: 'num_encoder_residual_layers'

It only works when I delete all these keys from the hparams file and set --hparams_path to the directory of the best_bleu checkpoint, but then after one run, for some reason, it rewrites the hparams file and adds these problematic key/values again... It's not clear how this mechanism works.

My guess is that when the code saves hparams, it writes key/values that it isn't supposed to.
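The manual workaround described above (deleting the offending keys before re-running inference) can be scripted. A minimal sketch, assuming the saved hparams file is plain JSON and taking the key from the KeyError above (the file path and any additional keys to drop are assumptions, not confirmed by the thread):

```python
import json

def strip_hparams_keys(hparams_path, keys):
    """Remove problematic keys from a saved hparams JSON file in place."""
    with open(hparams_path) as f:
        hparams = json.load(f)
    for key in keys:
        hparams.pop(key, None)  # ignore keys that are not present
    with open(hparams_path, "w") as f:
        json.dump(hparams, f, indent=2)
    return hparams

# Example (hypothetical path): drop the key named in the KeyError above.
# strip_hparams_keys("/tmp/nmt_attention_model/hparams",
#                    ["num_encoder_residual_layers"])
```

Note that, per the comment above, nmt rewrites the hparams file after a run, so the script would need to be re-applied before each inference invocation.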

oahziur commented on May 19, 2024

@NadavB Can you share the command that produces the error? Were you using the standard_hparams file in the repo for inference?

There have been some updates to the hparams recently, so the standard_hparams may be out of date.

ndvbd commented on May 19, 2024

@oahziur I did not use the standard hparams. I used the params shown in the tutorial.

So for training:

python nmt/nmt.py \
    --attention=scaled_luong \
    --src=vi --tgt=en \
    --vocab_prefix=/tmp/nmt_data/vocab \
    --train_prefix=/tmp/nmt_data/train \
    --dev_prefix=/tmp/nmt_data/tst2012 \
    --test_prefix=/tmp/nmt_data/tst2013 \
    --out_dir=/tmp/nmt_attention_model \
    --num_train_steps=5000 \
    --steps_per_stats=20 \
    --num_layers=2 \
    --num_units=128 \
    --dropout=0.2 \
    --metrics=bleu

And for inference:

python nmt/nmt.py \
    --out_dir=/tmp/nmt_attention_model \
    --inference_input_file=/tmp/nmt_data/source_infer.vi \
    --inference_output_file=/tmp/nmt_attention_model/output_infer

LimWoohyun commented on May 19, 2024

@NadavB Hello, I'm studying nmt.
I want to run the test file, so I just ran nmt.py, but it failed.

How should I invoke your script? Please tell me the basics.

ndvbd commented on May 19, 2024

@LimWoohyun Look at https://github.com/tensorflow/nmt and search for "Hands-on – building an attention-based NMT model"; the command is written there.

bquast commented on May 19, 2024

@oahziur I get a Key error using the standard_hparams (tf 1.6rc1; will try on my other machine with tf 1.5-cuda).

NotFoundError (see above for traceback): Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint
	 [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

using the command:

[bquast@UX370UA ~]$ cd nmt
[bquast@UX370UA nmt]$ python -m nmt.nmt \
>     --src=de --tgt=en \
>     --ckpt=deen_gnmt_model_4_layer/translate.ckpt \
>     --hparams_path=nmt/standard_hparams/wmt16_gnmt_4_layer.json \
>     --out_dir=/tmp/deen_gnmt \
>     --vocab_prefix=/home/bquast/en_de_data/vocab.bpe.32000 \
>     --inference_input_file=/home/bquast/en_de_data/newstest2014.tok.bpe.32000.de \
>     --inference_output_file=/home/bquast/deen_gnmt_model_4_layer/output_infer

full output here:

https://gist.github.com/bquast/30ba7630d2bf32b59dd8349889fc7638

EDIT: confirmed, same error on tf 1.5-cuda

https://gist.github.com/bquast/0ddbf8eda363d312dd57b51aebb11f5d

tiberiu92 commented on May 19, 2024

@bquast I recently got the same error using the same configuration.

Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

I tried this with tf 1.4 too, but no luck. Are there any updates on this?

Thank you.

bquast commented on May 19, 2024

Hey, no news yet. Any progress on your side?

oahziur commented on May 19, 2024

@bquast I think this is related to #264, and there is a PR that fixes it: #265. Maybe you can try patching in the PR and see if you still get the issue. Make sure you clear the model directory.

xiaohaoliang commented on May 19, 2024

@bquast @tiberiu92 @oahziur
I got the same error using the same configuration. (tf-1.8, python-2.7)

python -m nmt.nmt \
    --src=de --tgt=en \
    --ckpt=/home/xiaohao/nmt/models/deen_gnmt_model_4_layer/translate.ckpt \
    --hparams_path=nmt/standard_hparams/wmt16_gnmt_4_layer.json \
    --out_dir=/home/xiaohao/data/deen_gnmt \
    --vocab_prefix=/home/xiaohao/data/wmt16/vocab.bpe.32000 \
    --inference_input_file=/home/xiaohao/data/wmt16/newstest2015.tok.bpe.32000.de \
    --inference_output_file=/home/xiaohao/data/deen_gnmt/output_infer \
    --inference_ref_file=/home/xiaohao/data/wmt16/newstest2015.tok.bpe.32000.en	
NotFoundError (see above for traceback): Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint

I printed the keys of deen_gnmt_model_4_layer/translate.ckpt and did not find .../rnn/basic_lstm_cell/bias:

xiaohao@ubuntu:~/nmt$ python ckpt_print.py models/deen_gnmt_model_4_layer/translate.ckpt
('CHECKPOINT_FILE: ', 'models/deen_gnmt_model_4_layer/translate.ckpt')
('tensor_name: ', 'embeddings/encoder/embedding_encoder')
('tensor_name: ', 'dynamic_seq2seq/decoder/memory_layer/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_3/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_3/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/output_projection/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/query_layer/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_1/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_v')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_b')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_g')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_1/basic_lstm_cell/bias')
('tensor_name: ', 'Variable')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/basic_lstm_cell/bias')
('tensor_name: ', 'embeddings/decoder/embedding_decoder')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_2/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_1/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_2/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_2/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_2/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_1/basic_lstm_cell/kernel')
xiaohao@ubuntu:~/nmt$
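
The ckpt_print.py helper used above is not shown in the thread; a minimal sketch that lists checkpoint keys the same way (assuming TF 1.x, where tf.train.NewCheckpointReader is available) might look like:

```python
import sys

import tensorflow as tf

def print_ckpt_keys(ckpt_path):
    """Print every tensor name stored in a TensorFlow checkpoint."""
    print("CHECKPOINT_FILE:", ckpt_path)
    reader = tf.train.NewCheckpointReader(ckpt_path)
    # get_variable_to_shape_map() maps tensor name -> shape; we only need names.
    for name in reader.get_variable_to_shape_map():
        print("tensor_name:", name)

if __name__ == "__main__":
    print_ckpt_keys(sys.argv[1])
```

Grepping this output for basic_lstm_cell/bias confirms whether the key named in the NotFoundError actually exists in the checkpoint.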

I tried the PR (#265) and ran rm -rf /home/xiaohao/data/deen_gnmt/*. The problem is solved!

Thanks! @oahziur
