Hello, I'm sorry for asking some stupid question. I have train

about the downstream evaluation (s3prl issue, CLOSED)

s3prl commented on June 26, 2024

about the downstream evaluation

from s3prl.

Comments (5)

andi611 commented on June 26, 2024

Hi,

For 1), so that I can help, please provide the arguments you used when you encountered this error.
For 2), what do you mean by speaker information? Do you mean the ground-truth speaker label? If so, it is provided in the LibriSpeech dataset (the name of each utterance file contains the speaker ID). And no, I haven't explored other languages.
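
As a minimal sketch of reading the speaker label from a file name: LibriSpeech utterance files are named `<speaker>-<chapter>-<utterance>`, so the speaker ID is the first hyphen-separated field. The helper name `speaker_id_from_path` and the example path are illustrative assumptions, not part of the s3prl code.

```python
from pathlib import Path


def speaker_id_from_path(path: str) -> str:
    """Return the speaker ID encoded in a LibriSpeech-style file name.

    Assumes the naming scheme <speaker>-<chapter>-<utterance>.<ext>;
    this helper is hypothetical and not taken from the repository.
    """
    # Path.stem drops the directory and the extension,
    # so this also works for extracted feature files.
    return Path(path).stem.split("-")[0]


# Illustrative path only; adjust to your own dataset layout.
print(speaker_id_from_path("LibriSpeech/some-split/1089/134686/1089-134686-0000.flac"))
```

A dataset that does not follow this naming scheme would need its own label mapping.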

Thanks


ArtemisZGL commented on June 26, 2024

@andi611 sorry for the late reply. I found that the error above is caused by the online feature extraction. After using the script to extract features, I hit another error during training. Here is my command and the log.

 python run_downstream_babel.py --run=speaker_utterance --upstream=transformer --ckpt=/data3/zgl/mock_babel_ckpt/states-500000.ckpt

File "/Self-Supervised-Speech-Pretraining-and-Representation-Learning/transformer/model.py", line 119, in forward
    input_representations = spec_transformed + pos_enc
RuntimeError: The size of tensor a (10909) must match the size of tensor b (5000) at non-singleton dimension 1

I think it may be caused by my own dataset, but all of its features were extracted by the same script. What could cause this error?


andi611 commented on June 26, 2024

run_downstream_babel.py

File "/Self-Supervised-Speech-Pretraining-and-Representation-Learning/transformer/model.py", line 119, in forward
    input_representations = spec_transformed + pos_enc
RuntimeError: The size of tensor a (10909) must match the size of tensor b (5000) at non-singleton dimension 1

It seems like you've modified our original code, including run_downstream.py and data_loader.py.

Hence I can only guess that 10909 and 5000 are the sequence lengths of spec_transformed and pos_enc (please verify this). If so, then you have to change this line here.
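
To illustrate the guess above, here is a rough sketch (in NumPy rather than PyTorch, and not the repository's actual code) of how a positional table capped at 5000 positions fails against a 10909-frame input, and one way to avoid it by building the table for the actual sequence length. The names `sinusoid_table` and `add_positions`, the cap `MAX_LEN`, and the feature dimension are all assumptions made for the example.

```python
import numpy as np

MAX_LEN = 5000  # assumed cap, matching the size reported for tensor b


def sinusoid_table(n_positions: int, dim: int) -> np.ndarray:
    """Sinusoidal positional encoding of shape (n_positions, dim)."""
    positions = np.arange(n_positions)[:, None]                # (n_positions, 1)
    div = np.power(10000.0, 2 * (np.arange(dim) // 2) / dim)   # (dim,)
    table = positions / div                                    # (n_positions, dim)
    table[:, 0::2] = np.sin(table[:, 0::2])  # even feature indices
    table[:, 1::2] = np.cos(table[:, 1::2])  # odd feature indices
    return table


def add_positions(spec: np.ndarray, table: np.ndarray) -> np.ndarray:
    """Add positions to spec of shape (batch, seq_len, dim).

    If the input is longer than the precomputed table, rebuild the
    table for the actual length instead of failing on the addition.
    """
    seq_len, dim = spec.shape[1], spec.shape[2]
    if seq_len > table.shape[0]:
        table = sinusoid_table(seq_len, dim)
    return spec + table[None, :seq_len, :]


spec = np.zeros((1, 10909, 8))       # one over-long utterance
table = sinusoid_table(MAX_LEN, 8)   # table capped at 5000 positions

try:
    spec + table[None]               # shapes (1, 10909, 8) vs (1, 5000, 8)
except ValueError as err:
    print("size mismatch:", err)

print(add_positions(spec, table).shape)
```

Whether a model pre-trained on shorter sequences behaves well on positions beyond its training length is a separate question; truncating or splitting the input may be the safer choice.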


ArtemisZGL commented on June 26, 2024

@andi611 thanks. I only changed the data loader from LibriSpeech to another dataset. It may be that just one sample in the whole dataset is too long and ends up in the test split; as dataloader.py shows, the test split may not drop overly long samples.


andi611 commented on June 26, 2024

Yes, the test split does not drop overly long sequences.
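
One possible workaround, sketched under the assumption that each utterance is a `(frames, feature_dim)` array and that 5000 frames is the limit: split an over-long test utterance into chunks that each fit. The function `split_long_utterance` is hypothetical and not part of the repository's data loader.

```python
import numpy as np


def split_long_utterance(features: np.ndarray, max_len: int = 5000) -> list:
    """Split a (frames, feature_dim) array into chunks of at most max_len frames.

    Hypothetical helper: the 5000-frame default is an assumption
    taken from the size reported in the error message.
    """
    return [features[start:start + max_len]
            for start in range(0, len(features), max_len)]


utterance = np.zeros((10909, 40))  # illustrative shape only
chunks = split_long_utterance(utterance)
print([len(chunk) for chunk in chunks])
```

For an utterance-level task, the per-chunk predictions would then need to be combined (for example by averaging), which is a design choice this sketch leaves open.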
