
Failed to run GNMT about nmt HOT 14 OPEN

tensorflow commented on May 19, 2024
Failed to run GNMT

from nmt.

Comments (14)

oahziur commented on May 19, 2024

I am also adding instructions on how to train and load the GNMT model from scratch.

oahziur commented on May 19, 2024

Thanks, I will need to update nmt/standard_hparams/wmt16_en_de_gnmt.json.

vince62s commented on May 19, 2024

I ran a standard attention / scaled_luong / uni system and got the expected results.
With the gnmt architecture / scaled_luong / enc_type gnmt, the results are completely off.
Is there something special to do for the GNMT attention architecture?

oahziur commented on May 19, 2024

@vince62s Did you check with the standard_hparams for GNMT? There are also pre-trained models available for download on the README page.

ndvbd commented on May 19, 2024

Same problem here. After training, when doing inference I get:

KeyError: 'num_encoder_residual_layers'

It only works when I delete all these keys from the hparams file and set --hparams_path to the directory of the best_bleu checkpoint, but then after one run, for some reason, it rewrites the hparams file and adds these problematic key/values again... It's not clear how this mechanism works.

My guess is that when the code saves hparams, it writes key/values that it isn't supposed to.
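The manual workaround described above (deleting the offending keys before re-running inference) can be scripted. A minimal sketch, assuming the saved hparams file is plain JSON and taking the key from the KeyError above (the file path and any additional keys to drop are assumptions, not confirmed by the thread):

```python
import json

def strip_hparams_keys(hparams_path, keys):
    """Remove problematic keys from a saved hparams JSON file in place."""
    with open(hparams_path) as f:
        hparams = json.load(f)
    for key in keys:
        hparams.pop(key, None)  # ignore keys that are not present
    with open(hparams_path, "w") as f:
        json.dump(hparams, f, indent=2)
    return hparams

# Example (hypothetical path): drop the key named in the KeyError above.
# strip_hparams_keys("/tmp/nmt_attention_model/hparams",
#                    ["num_encoder_residual_layers"])
```

Note that, per the comment above, nmt rewrites the hparams file after a run, so the script would need to be re-applied before each inference invocation.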

oahziur commented on May 19, 2024

@NadavB Can you share the command that produces the error? Were you using the standard_hparams file in the repo for inference?

There have been some updates to the hparams recently, so the standard_hparams may be out of date.

ndvbd commented on May 19, 2024

@oahziur I did not use the standard hparams. I used the params shown in the tutorial.

So for training:

python nmt/nmt.py \
    --attention=scaled_luong \
    --src=vi --tgt=en \
    --vocab_prefix=/tmp/nmt_data/vocab \
    --train_prefix=/tmp/nmt_data/train \
    --dev_prefix=/tmp/nmt_data/tst2012 \
    --test_prefix=/tmp/nmt_data/tst2013 \
    --out_dir=/tmp/nmt_attention_model \
    --num_train_steps=5000 \
    --steps_per_stats=20 \
    --num_layers=2 \
    --num_units=128 \
    --dropout=0.2 \
    --metrics=bleu

And for inference:

python nmt/nmt.py \
    --out_dir=/tmp/nmt_attention_model \
    --inference_input_file=/tmp/nmt_data/source_infer.vi \
    --inference_output_file=/tmp/nmt_attention_model/output_infer

LimWoohyun commented on May 19, 2024

@NadavB Hello, I'm studying nmt.
I want to run the test file, so I just ran nmt.py, but it failed.

How should I invoke your script? Please tell me the basics.

ndvbd commented on May 19, 2024

@LimWoohyun Look at https://github.com/tensorflow/nmt and search for "Hands-on – building an attention-based NMT model"; the command is written there.

bquast commented on May 19, 2024

@oahziur I get a Key error using the standard_hparams (tf 1.6rc1; will try on my other machine with tf 1.5-cuda).

NotFoundError (see above for traceback): Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint
	 [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

using the command:

[bquast@UX370UA ~]$ cd nmt
[bquast@UX370UA nmt]$ python -m nmt.nmt \
>     --src=de --tgt=en \
>     --ckpt=deen_gnmt_model_4_layer/translate.ckpt \
>     --hparams_path=nmt/standard_hparams/wmt16_gnmt_4_layer.json \
>     --out_dir=/tmp/deen_gnmt \
>     --vocab_prefix=/home/bquast/en_de_data/vocab.bpe.32000 \
>     --inference_input_file=/home/bquast/en_de_data/newstest2014.tok.bpe.32000.de \
>     --inference_output_file=/home/bquast/deen_gnmt_model_4_layer/output_infer

full output here:

https://gist.github.com/bquast/30ba7630d2bf32b59dd8349889fc7638

EDIT: confirmed, same error on tf 1.5-cuda

https://gist.github.com/bquast/0ddbf8eda363d312dd57b51aebb11f5d

tiberiu92 commented on May 19, 2024

@bquast I recently got the same error using the same configuration.

Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

I tried this with tf 1.4 too, but no luck. Are there any updates on this?

Thank you.

bquast commented on May 19, 2024

Hey, no news yet. Any progress on your side?

oahziur commented on May 19, 2024

@bquast I think this is related to #264, and there is a PR that fixes it: #265. Maybe you can try patching in the PR and see if you still get the issue. Make sure you clear the model directory.

xiaohaoliang commented on May 19, 2024

@bquast @tiberiu92 @oahziur
I got the same error using the same configuration. (tf-1.8, python-2.7)

python -m nmt.nmt \
    --src=de --tgt=en \
    --ckpt=/home/xiaohao/nmt/models/deen_gnmt_model_4_layer/translate.ckpt \
    --hparams_path=nmt/standard_hparams/wmt16_gnmt_4_layer.json \
    --out_dir=/home/xiaohao/data/deen_gnmt \
    --vocab_prefix=/home/xiaohao/data/wmt16/vocab.bpe.32000 \
    --inference_input_file=/home/xiaohao/data/wmt16/newstest2015.tok.bpe.32000.de \
    --inference_output_file=/home/xiaohao/data/deen_gnmt/output_infer \
    --inference_ref_file=/home/xiaohao/data/wmt16/newstest2015.tok.bpe.32000.en	
NotFoundError (see above for traceback): Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint

I printed the keys of deen_gnmt_model_4_layer/translate.ckpt and did not find .../rnn/basic_lstm_cell/bias:

xiaohao@ubuntu:~/nmt$ python ckpt_print.py models/deen_gnmt_model_4_layer/translate.ckpt
('CHECKPOINT_FILE: ', 'models/deen_gnmt_model_4_layer/translate.ckpt')
('tensor_name: ', 'embeddings/encoder/embedding_encoder')
('tensor_name: ', 'dynamic_seq2seq/decoder/memory_layer/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_3/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_3/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/output_projection/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/query_layer/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_1/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_v')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_b')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_g')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_1/basic_lstm_cell/bias')
('tensor_name: ', 'Variable')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/basic_lstm_cell/bias')
('tensor_name: ', 'embeddings/decoder/embedding_decoder')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_2/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_1/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_2/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_2/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_2/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_1/basic_lstm_cell/kernel')
xiaohao@ubuntu:~/nmt$
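
The ckpt_print.py helper used above is not shown in the thread; a minimal sketch that lists checkpoint keys the same way (assuming TF 1.x, where tf.train.NewCheckpointReader is available) might look like:

```python
import sys

import tensorflow as tf

def print_ckpt_keys(ckpt_path):
    """Print every tensor name stored in a TensorFlow checkpoint."""
    print("CHECKPOINT_FILE:", ckpt_path)
    reader = tf.train.NewCheckpointReader(ckpt_path)
    # get_variable_to_shape_map() maps tensor name -> shape; we only need names.
    for name in reader.get_variable_to_shape_map():
        print("tensor_name:", name)

if __name__ == "__main__":
    print_ckpt_keys(sys.argv[1])
```

Grepping this output for basic_lstm_cell/bias confirms whether the key named in the NotFoundError actually exists in the checkpoint.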

I tried the PR (#265) and ran rm -rf /home/xiaohao/data/deen_gnmt/*. The problem is solved!

Thanks! @oahziur
