when running pytorch-kaldi for asr training using pretrained aalbert model, something

error training asr with pretrained aalbert about s3prl HOT 4 CLOSED

s3prl commented on July 24, 2024

from s3prl.

Comments (4)

andi611 commented on July 24, 2024

Hi,

I haven't encountered this error before, and the error message doesn't tell us much either.
However, I suspect the problem may be caused by downsampling and upsampling.
To verify this:

Method 1 - Save a randomly initialized aalbert model with downsample_rate: 1:

1. Set downsample_rate: 1 and save_step: 1 in your .yaml.
2. Run the training script for 1 step, then press ctrl+c. This saves a randomly initialized model at training step 1.
3. Try to load this new model and see if the error still occurs.
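
The two overrides from Method 1 can be sketched in Python as follows. This is only an illustration: the thread does not show where downsample_rate and save_step sit in the .yaml, so the nested layout, the section names "model" and "runner", and the helper name apply_debug_overrides are all assumptions, and reading or writing the file itself is left out.

```python
import copy


def apply_debug_overrides(config, overrides):
    """Return a copy of a nested config with every matching key overridden.

    A key is replaced wherever it appears, because the real nesting of the
    .yaml is not shown in the thread (assumption).
    """
    result = copy.deepcopy(config)  # leave the caller's config untouched

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key in overrides:
                    node[key] = overrides[key]
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(result)
    return result


# Hypothetical config layout, for illustration only.
example_config = {
    "model": {"downsample_rate": 3, "hidden_size": 768},
    "runner": {"save_step": 10000, "total_steps": 500000},
}

debug_config = apply_debug_overrides(
    example_config, {"downsample_rate": 1, "save_step": 1}
)
```

Working on a copy keeps the original training settings available once the one-step debug run is done.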

Method 2 - In your pytorch-kaldi/nn_transfromer.py, print and verify that the input feature length and the output representation length are the same:

1. Add print(x.shape) at both the beginning and the end of the forward function in your pytorch-kaldi/nn_transfromer.py.
2. Run the pytorch-kaldi run_exp.py.
3. The input acoustic feature should have the shape (time steps, batch size=12, dim=40), and the output representation should have the shape (time steps, batch size=12, dim=768). The number of time steps should be the same in both.
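
The comparison in Method 2 can also be done programmatically instead of by eye. The helper below is a hypothetical sketch, not part of pytorch-kaldi or s3prl: the name check_time_steps and its arguments are made up for illustration, and it only assumes the (time steps, batch size, dim) layout and the dims 40 and 768 quoted above.

```python
def check_time_steps(input_shape, output_shape, in_dim=40, out_dim=768):
    """Compare two (time steps, batch size, dim) shapes.

    Returns a list of human-readable problems; an empty list means the
    shapes are consistent with what Method 2 expects.
    """
    if len(input_shape) != 3 or len(output_shape) != 3:
        return ["expected 3-D shapes (time steps, batch size, dim)"]

    t_in, b_in, d_in = input_shape
    t_out, b_out, d_out = output_shape

    problems = []
    if t_in != t_out:
        problems.append(f"time steps differ: {t_in} vs {t_out}")
    if b_in != b_out:
        problems.append(f"batch size differs: {b_in} vs {b_out}")
    if d_in != in_dim:
        problems.append(f"input dim is {d_in}, expected {in_dim}")
    if d_out != out_dim:
        problems.append(f"output dim is {d_out}, expected {out_dim}")
    return problems


# Example with made-up shapes: the output is two time steps shorter.
issues = check_time_steps((500, 12, 40), (498, 12, 768))
```

Inside the forward function it could be called with tuple(x.shape) captured at the beginning and at the end, printing whatever problems come back.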

Please let me know if this is the case!

FYI, we find that downsample_rate: 1 is more suitable for the pytorch-kaldi DNN/HMM framework.
Using a downsample rate > 1 always yields worse results.
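
As a rough illustration of how downsampling followed by upsampling could change the sequence length, here is a small arithmetic sketch. It assumes that downsampling groups consecutive frames and drops any leftover frames, and that upsampling repeats each step; the thread does not confirm that this is what s3prl or pytorch-kaldi actually does, and the real code may pad to restore the original length.

```python
def downsampled_length(num_frames, rate):
    """Steps left after grouping `rate` consecutive frames.

    Assumption: frames that do not fill a complete group are dropped.
    """
    return num_frames // rate


def upsampled_length(num_steps, rate):
    """Frames obtained by repeating every step `rate` times (assumption)."""
    return num_steps * rate


def round_trip_length(num_frames, rate):
    """Length after downsampling and then upsampling again."""
    return upsampled_length(downsampled_length(num_frames, rate), rate)


def lost_frames(num_frames, rate):
    """Frames missing after the round trip, compared with the input."""
    return num_frames - round_trip_length(num_frames, rate)


# With rate 1 nothing can be lost; with rate 3 a 500-frame input
# would come back as 498 frames under the assumptions above.
same = round_trip_length(500, 1)
shorter = round_trip_length(500, 3)
```

Under these assumptions a mismatch only appears when the number of frames is not a multiple of the rate, which is one reason downsample_rate: 1 is the simplest setting to test first.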

ybNo1 commented on July 24, 2024

Thanks for replying, Method 1 works for me!
Could you tell me what ASR experiments you have done? Which ASR structure did you train on top of the pre-trained aalbert/mockingjay?
I mean, if aalbert plays a role similar to a TDNN, should I train the ASR with simple fc layers on top of the pre-trained aalbert structure? And could that ASR achieve the same accuracy as, or better than, a Kaldi TDNN ASR?

ybNo1 commented on July 24, 2024

I've got answers from another issue, thanks for replying! I'll close this issue.

andi611 commented on July 24, 2024

ybNo1 wrote: "thanks for replying, method One works for me!"

I've tried setting the downsample rate to 3 to match your setting, but I did not encounter any error.
Upsampling and downsampling worked fine, so I'm not sure why there is an error in your case.

Also, I noticed that the AALBERT config file is not set according to the original paper.
I've made some adjustments and updates in this commit.
Note that the previous settings are for our new work TERA, and not AALBERT.
I suggest you use the updated default config for future AALBERT training.

Happy training!
