
Comments (4)

andi611 commented on July 24, 2024

Hi,

I haven't encountered this error before, and the error message doesn't tell us much either.
However, I suspect the problem may be caused by downsampling and upsampling.
To verify this:

Method 1 - try saving a randomly initialized AALBERT model with downsample_rate: 1:

  1. set downsample_rate: 1 and save_step: 1 in your .yaml
  2. run the training script for 1 step, then press Ctrl+C; this will save a randomly initialized model at training step 1
  3. try to load this new model and see if the error still occurs.
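
If you would rather not edit the .yaml by hand, step 1 of Method 1 could be sketched as below. This is only an illustration: `set_yaml_scalar` is a hypothetical helper (not part of s3prl), and the section names `model:` and `solver:` in the sample are assumptions, since your config may nest these keys differently. It rewrites the text line by line rather than parsing YAML, so it only suits simple `key: value` entries.

```python
import re


def set_yaml_scalar(text, key, value):
    """Replace the value on every `key: ...` line of a YAML text.

    Hypothetical helper for illustration only. Indentation and any
    trailing `# comment` on the line are kept unchanged.
    """
    pattern = re.compile(
        r"^([ \t]*" + re.escape(key) + r"[ \t]*:[ \t]*)"  # indent, key, colon
        r"([^#\n]*?)"                                      # the old value
        r"([ \t]*(?:#.*)?)$",                              # optional comment
        re.MULTILINE,
    )
    new_text, count = pattern.subn(
        lambda m: m.group(1) + str(value) + m.group(3), text
    )
    if count == 0:
        raise KeyError(f"{key!r} not found in config text")
    return new_text


# Assumed config layout; the real .yaml may look different.
sample = (
    "model:\n"
    "  downsample_rate: 3  # stack frames\n"
    "solver:\n"
    "  save_step: 10000\n"
)

patched = set_yaml_scalar(sample, "downsample_rate", 1)
patched = set_yaml_scalar(patched, "save_step", 1)
print(patched)
```

Write the patched text to a copy of your config so the original settings stay untouched.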

Method 2 - in your pytorch-kaldi/nn_transfromer.py, print and verify that the input feature length and output representation length are the same.

  1. add print(x.shape) at both the beginning and end of the forward function of your pytorch-kaldi/nn_transfromer.py.
  2. execute the pytorch-kaldi run_exp.py
  3. The input acoustic feature should have the shape (time steps, batch size=12, dim=40),
    and the output representation should have the shape (time steps, batch size=12, dim=768).
    The number of time steps should be the same in both.
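
Instead of comparing the printed shapes by eye, the check in Method 2 could be sketched as a small function. `check_time_steps` is a hypothetical helper, not something that exists in s3prl or pytorch-kaldi. It takes plain shape tuples, so inside the forward function you would pass something like `tuple(x.shape)` for the input and the output. The defaults of 40 and 768 are the dimensions mentioned above.

```python
def check_time_steps(in_shape, out_shape, in_dim=40, out_dim=768):
    """Compare two (time steps, batch size, dim) shapes.

    Hypothetical helper for illustration. Returns a list of problem
    descriptions; an empty list means the shapes are consistent.
    """
    t_in, b_in, d_in = in_shape
    t_out, b_out, d_out = out_shape
    problems = []
    if t_in != t_out:
        problems.append(f"time steps differ: input {t_in}, output {t_out}")
    if b_in != b_out:
        problems.append(f"batch size differs: input {b_in}, output {b_out}")
    if d_in != in_dim:
        problems.append(f"input dim is {d_in}, expected {in_dim}")
    if d_out != out_dim:
        problems.append(f"output dim is {d_out}, expected {out_dim}")
    return problems


# Matching shapes give no problems.
print(check_time_steps((100, 12, 40), (100, 12, 768)))
# One time step lost between input and output.
print(check_time_steps((100, 12, 40), (99, 12, 768)))
```

A non-empty result whose first entry reports differing time steps would support the downsampling/upsampling suspicion.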

Please let me know if this is the case!

FYI, we find that downsample_rate: 1 is more suitable for the pytorch-kaldi DNN/HMM framework.
Using a downsample rate > 1 always yields worse results.
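
To illustrate how downsampling followed by upsampling could change the sequence length, here is a minimal sketch under an assumed scheme: downsampling stacks `rate` consecutive frames and drops any leftover frames, and upsampling repeats each frame `rate` times. This is not taken from the s3prl code, and the function names are made up for the example; the actual implementation may pad or trim to compensate.

```python
def downsample(frames, rate):
    """Stack `rate` consecutive frames into one wider frame.

    Assumed behavior: frames left over at the end are dropped.
    """
    usable = len(frames) - len(frames) % rate
    return [
        [value for frame in frames[i:i + rate] for value in frame]
        for i in range(0, usable, rate)
    ]


def upsample(frames, rate):
    """Repeat every frame `rate` times along the time axis."""
    return [frame for frame in frames for _ in range(rate)]


def round_trip_length(num_frames, rate, dim=40):
    """Number of time steps after downsampling and then upsampling."""
    frames = [[0.0] * dim for _ in range(num_frames)]
    return len(upsample(downsample(frames, rate), rate))


for rate in (1, 3):
    print(rate, round_trip_length(100, rate))
```

Under this assumption, a rate of 1 always preserves the length, while a rate of 3 loses one frame for a 100-frame input because 100 is not divisible by 3.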

from s3prl.

ybNo1 commented on July 24, 2024

Thanks for replying, Method 1 works for me!
Could you tell me what ASR training experiments you have done? Specifically, which ASR structure did you train after pre-training with AALBERT/Mockingjay?
I mean, if AALBERT plays a role similar to a TDNN, should I train the ASR with simple FC layers on top of the pre-trained AALBERT structure, and could that ASR achieve the same accuracy as a Kaldi TDNN ASR, or better?


ybNo1 commented on July 24, 2024

I've got answers from another issue, thanks for replying! I'll close this issue.


andi611 commented on July 24, 2024

> thanks for replying, method One works for me!

I've tried setting downsample_rate to 3 to match your setting, but I did not encounter any error.
Upsampling and downsampling worked fine, so I'm not sure why there is an error in your case.

Also, I noticed that the AALBERT config file is not set according to the original paper.
I've made some adjustments and updates in this commit.
Note that the previous settings are for our new work TERA, and not AALBERT.
I suggest you use the updated default config for future AALBERT training.

Happy training!

