problem with interactive.py (sca) CLOSED

teslacool commented on August 21, 2024
problem with interactive.py

from sca.

Comments (12)

teslacool commented on August 21, 2024

At inference time we do not touch the LM. We first run the NMT encoder to produce the hidden states and then decode with the NMT decoder. Take a careful look at the fairseq decoding steps.

I have now tested a sentence and fixed a bug. You can use interactive.py just like the original fairseq script, without any extra command-line parameters.
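
The flow described above (encode once, then step the decoder) can be sketched roughly as follows. This is only an illustration: it uses greedy decoding instead of fairseq's beam search, and `greedy_translate`, `toy_encode`, and `toy_decode_step` are hypothetical stand-ins, not functions from sca or fairseq.

```python
from typing import Callable, List


def greedy_translate(
    encode: Callable[[List[int]], List[float]],
    decode_step: Callable[[List[float], List[int]], int],
    src_ids: List[int],
    bos: int,
    eos: int,
    max_len: int = 20,
) -> List[int]:
    """Encode the source once, then call the decoder one token at a time."""
    encoder_out = encode(src_ids)  # runs exactly once per sentence
    tgt = [bos]
    for _ in range(max_len):
        next_id = decode_step(encoder_out, tgt)  # no language model involved
        tgt.append(next_id)
        if next_id == eos:
            break
    return tgt[1:]  # drop the BOS token


# Toy stand-ins (hypothetical): the "decoder" just copies the source, then emits EOS.
calls = {"encode": 0}


def toy_encode(src_ids: List[int]) -> List[float]:
    calls["encode"] += 1
    return [float(i) for i in src_ids]


def toy_decode_step(encoder_out: List[float], prefix: List[int]) -> int:
    pos = len(prefix) - 1
    return int(encoder_out[pos]) if pos < len(encoder_out) else 2  # 2 = EOS here


out = greedy_translate(toy_encode, toy_decode_step, [5, 6, 7], bos=0, eos=2)
print(out, calls["encode"])
```

The point of the sketch is only the control flow: the encoder output is computed once and reused at every decoder step.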

nicolabertoldi commented on August 21, 2024

@teslacool
could you please give me an example of how to run the ./interactive.py command properly,

assuming that I have the following checkpoint models:
for srclm: ./lm_sl/checkpoint_best.pt
for tgtlm: ./lm_tl/checkpoint_best.pt
for transformer: ./engine/checkpoint_best.pt

I got these checkpoints by running the following preprocessing and training commands:

fairseq-preprocess --source-lang sl --target-lang tl --trainpref ./encoded_corpora//train --validpref ./encoded_corpora//dev --destdir ./data_generated --joined-dictionary

fairseq-train --task language_modeling --arch transformer_lm --lr-scheduler inverse_sqrt --lr-shrink 0.1 --warmup-updates 4000 --warmup-init-lr 1e-07 --min-lr 1e-09 --optimizer adam --lr 0.0001 --clip-norm 0.1 --criterion adaptive_loss --max-tokens 4096 --update-freq 8 --seed 1 --sample-break-mode none --skip-invalid-size-inputs-valid-test --ddp-backend=no_c10d --save-interval-updates 1000 --keep-interval-updates 10 --no-epoch-checkpoints --attention-dropout 0.1 --dropout 0.3 --criterion label_smoothed_cross_entropy --save-dir ./lm_sl/ ./data_generated_sl

fairseq-train --task language_modeling --arch transformer_lm --lr-scheduler inverse_sqrt --lr-shrink 0.1 --warmup-updates 4000 --warmup-init-lr 1e-07 --min-lr 1e-09 --optimizer adam --lr 0.0001 --clip-norm 0.1 --criterion adaptive_loss --max-tokens 4096 --update-freq 8 --seed 1 --sample-break-mode none --skip-invalid-size-inputs-valid-test --ddp-backend=no_c10d --save-interval-updates 1000 --keep-interval-updates 10 --no-epoch-checkpoints --attention-dropout 0.1 --dropout 0.3 --criterion label_smoothed_cross_entropy --save-dir ./lm_tl/ ./data_generated_tl

python3 ../train.py ./data_generated --task lm_translation --arch transformer_iwslt_de_en --share-decoder-input-output-embed --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 --lr-scheduler inverse_sqrt --warmup-init-lr 1e-07 --warmup-updates 4000 --lr 0.0009 --min-lr 1e-09 --dropout 0.3 --weight-decay 0.0 --criterion label_smoothed_cross_entropy --label-smoothing 0.1 --max-tokens 2084 --update-freq 8 --save-dir ./engine --save-interval-updates 1000 --seed 200 --tradeoff 0.15 --load-lm --load-srclm-file ./lm_sl/checkpoint_best.pt --load-tgtlm-file ./lm_tl/checkpoint_best.pt --lmdecoder-ffn-embed-dim 2048 --no-epoch-checkpoints --keep-interval-updates 10

Note that all these scripts succeeded.

teslacool commented on August 21, 2024

Run:

echo "Danke dir ." | python interactive.py data-bin/iwslt14.tokenized.de-en     --path  checkpoints/transformer/checkpoint_best.pt --buffer-size 1024     --batch-size 128 --beam 5 --remove-bpe  | grep ^H | cut -f3-

You will get

thank you .

if your model was trained on the de-en task.
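
For anyone post-processing the output in Python rather than in the shell, the `grep ^H | cut -f3-` step can be approximated as below. This is a sketch that assumes hypothesis lines are tab-separated with the text in the third field, as the shell pipeline implies; `extract_hypotheses` is a hypothetical helper, and the sample lines and scores are made up for illustration.

```python
from typing import Iterable, List


def extract_hypotheses(lines: Iterable[str]) -> List[str]:
    """Keep lines starting with 'H' and return their third tab field onward."""
    hyps = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.startswith("H"):  # same filter as `grep ^H`
            continue
        fields = line.split("\t")
        hyps.append("\t".join(fields[2:]))  # roughly `cut -f3-`
    return hyps


# Made-up sample in the assumed "tag<TAB>score<TAB>text" layout.
sample = [
    "S-0\tDanke dir .",
    "H-0\t-0.25\tthank you .",
    "P-0\t-0.1 -0.2 -0.3 -0.4",
]
hyps = extract_hypotheses(sample)
print(hyps)
```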

nicolabertoldi commented on August 21, 2024

@teslacool

after your fix, it works

echo "ciao ciao ciao" | python3 ../interactive.py --remove-bpe --raw-text --path ./engine/checkpoint_best.pt --task lm_translation --src-no-lm --tgt-no-lm --load-srclm-file ./lm_sl/checkpoint_best.pt  --load-tgtlm-file ./lm_tl/checkpoint_best.pt  ./data_generated 

thank you

nicolabertoldi commented on August 21, 2024

@teslacool

I have a few more questions:

  • Why do I need to specify the data-bin directory? Only for the dictionary, or is there another reason?

teslacool commented on August 21, 2024

Yes, you need the dictionary to map each token to its id, which indexes the embedding matrix.
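
A toy illustration of why the dictionary is needed at inference time: tokens must become integer ids before a row of the embedding matrix can be looked up. Everything here (`build_vocab`, `encode`, the tiny `embedding` table) is hypothetical and far simpler than fairseq's own dictionary handling.

```python
from typing import Dict, List


def build_vocab(tokens: List[str]) -> Dict[str, int]:
    """Assign consecutive ids to tokens; id 0 is reserved for unknown words."""
    vocab = {"<unk>": 0}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def encode(sentence: str, vocab: Dict[str, int]) -> List[int]:
    """Map whitespace-separated tokens to ids, falling back to <unk>."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in sentence.split()]


vocab = build_vocab(["ciao", "mondo"])

# One embedding row per dictionary entry (made-up 2-dimensional values).
embedding = [[float(i), float(i) * 0.5] for i in range(len(vocab))]

ids = encode("ciao mondo ciao", vocab)
vectors = [embedding[i] for i in ids]  # the id selects the embedding row
print(ids, vectors)
```

If the dictionary used at inference differs from the one used in training, the same token would select a different row, which is why the data directory has to be passed again.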

nicolabertoldi commented on August 21, 2024

  • I specify --remove-bpe, but my output still contains the _ symbols; why does this happen?

teslacool commented on August 21, 2024

--remove-bpe only removes the @@ markers.
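
A minimal sketch of the kind of string replacement this amounts to, assuming the usual "@@ " continuation marker; `remove_bpe` here is a hypothetical helper written for illustration, not the function from the repository. It also shows that text using a different marker, such as a trailing _, passes through unchanged.

```python
def remove_bpe(line: str, marker: str = "@@ ") -> str:
    """Join subword pieces by deleting the '@@ ' continuation marker."""
    # The temporary trailing space lets a marker at the very end match too.
    return (line + " ").replace(marker, "").rstrip()


joined = remove_bpe("I am a new@@ bie .")
untouched = remove_bpe("I_ am_ a_ new bie_ ._")
print(joined)
print(untouched)
```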

nicolabertoldi commented on August 21, 2024

so it is correct that my output looks like

I_ am_ a_ newbie_ ._

nicolabertoldi commented on August 21, 2024

actually I get

I_ am_ a_ new bie_ ._

teslacool commented on August 21, 2024

That does not look correct. I do not know why the words in your sentences are followed by _.

Since I and am are not pieces of one split word, the _ symbol was not added by the BPE operation.

nicolabertoldi commented on August 21, 2024

@teslacool

Sorry, my fault.

I will close the issue for now,
but I will probably have more questions soon.
