The size of tensor a (16) must match the size of tensor b (5) at non-singleton dimension 3 about speech-translate HOT 13 CLOSED

dadangdut33 commented on August 15, 2024

The size of tensor a (16) must match the size of tensor b (5) at non-singleton dimension 3

from speech-translate.

Comments (13)

kelvincht commented on August 15, 2024 2

OK, I did a hacky fix: a threading.Lock acquire/release around the model.transcribe calls at lines 5xx and 7xx in Record.py.

It works, but you may want to implement it more elegantly.

Dadangdut33 commented on August 15, 2024 2

@Dadangdut33 Unfortunately the issue is still there. RuntimeError: The size of tensor a (12) must match the size of tensor b (5) at non-singleton dimension 3

I have also set auto channels and auto sample rate in the settings. Nothing changed.

I am using the latest released version.

It's fixed in 1.3.0, which is not released yet. I will try to release it tomorrow or the day after.

Dadangdut33 commented on August 15, 2024 2

Fixed in the 1.3.0 release.

Dadangdut33 commented on August 15, 2024

Try enabling auto channels and auto sample rate in the settings.

sugarbobo-ch commented on August 15, 2024

Same issue here. And I have enabled the auto settings.

2023-04-21 09:13:17,328 - INFO - Console window hidden. If it is not hidden (only minimized), try changing your default windows terminal to windows cmd. (Main.py:51) [MainThread]
2023-04-21 09:13:17,328 - INFO - Booting up | Version: 1.2.3 (Main.py:1059) [MainThread]
2023-04-21 09:13:17,393 - DEBUG - Available Theme to use: ['vista', 'sv-light', 'sv-dark'] (Main.py:159) [MainThread]
2023-04-21 09:13:17,393 - DEBUG - Setting theme: sv-dark (Style.py:28) [MainThread]
2023-04-21 09:13:17,406 - DEBUG - Setting custom dark theme style (Style.py:49) [MainThread]
2023-04-21 09:13:17,724 - INFO - Checking for update on start (About.py:100) [MainThread]
2023-04-21 09:13:17,908 - INFO - Checking for update... (About.py:125) [MainThread]
2023-04-21 09:13:18,318 - INFO - No update available (About.py:145) [Thread-5 (req_update_check)]
2023-04-21 09:13:20,523 - INFO - Checking model name (Helper_Whisper.py:19) [MainThread]
2023-04-21 09:13:20,524 - DEBUG - modelKey: Large (v1) (1x speed), src_english: False (Helper_Whisper.py:20) [MainThread]
2023-04-21 09:13:20,524 - DEBUG - modelName: large-v1 (Helper_Whisper.py:25) [MainThread]
2023-04-21 09:13:22,415 - INFO - Checking model name (Helper_Whisper.py:19) [Thread-6 (rec_realTime)]
2023-04-21 09:13:22,415 - DEBUG - modelKey: Large (v1) (1x speed), src_english: False (Helper_Whisper.py:20) [Thread-6 (rec_realTime)]
2023-04-21 09:13:22,415 - DEBUG - modelName: large-v1 (Helper_Whisper.py:25) [Thread-6 (rec_realTime)]
2023-04-21 09:13:32,179 - INFO - -------------------------------------------------- (Record.py:344) [Thread-6 (rec_realTime)]
2023-04-21 09:13:32,179 - INFO - Task: transcribe (Record.py:345) [Thread-6 (rec_realTime)]
2023-04-21 09:13:32,179 - INFO - Modelname: large-v1 (Record.py:346) [Thread-6 (rec_realTime)]
2023-04-21 09:13:32,180 - INFO - Engine: Whisper (Record.py:347) [Thread-6 (rec_realTime)]
2023-04-21 09:13:32,180 - INFO - Auto mode: True (Record.py:348) [Thread-6 (rec_realTime)]
2023-04-21 09:13:32,180 - INFO - Source Lang: auto detect (Record.py:349) [Thread-6 (rec_realTime)]
2023-04-21 09:13:32,180 - INFO - Target Lang: english (Record.py:351) [Thread-6 (rec_realTime)]
2023-04-21 09:13:32,201 - DEBUG - Device: (1) Microphone (Superlux E205U) (Record.py:393) [Thread-6 (rec_realTime)]
2023-04-21 09:13:32,201 - DEBUG - {'index': 1, 'structVersion': 2, 'name': 'Microphone (Superlux E205U)', 'hostApi': 0, 'maxInputChannels': 2, 'maxOutputChannels': 0, 'defaultLowInputLatency': 0.09, 'defaultLowOutputLatency': 0.09, 'defaultHighInputLatency': 0.18, 'defaultHighOutputLatency': 0.18, 'defaultSampleRate': 44100.0, 'isLoopbackDevice': False} (Record.py:394) [Thread-6 (rec_realTime)]
2023-04-21 09:13:32,211 - DEBUG - Record Session Started (Record.py:401) [Thread-6 (rec_realTime)]
2023-04-21 09:13:44,484 - ERROR - Error in record session (Record.py:719) [Thread-6 (rec_realTime)]
2023-04-21 09:13:44,485 - ERROR - The size of tensor a (13) must match the size of tensor b (5) at non-singleton dimension 3 (Record.py:720) [Thread-6 (rec_realTime)]
Traceback (most recent call last):
  File "speech_translate\utils\Record.py", line 588, in rec_realTime
  File "whisper\transcribe.py", line 229, in transcribe
  File "whisper\transcribe.py", line 164, in decode_with_fallback
  File "torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "whisper\decoding.py", line 811, in decode
  File "torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "whisper\decoding.py", line 724, in run
  File "whisper\decoding.py", line 673, in _main_loop
  File "whisper\decoding.py", line 157, in logits
  File "torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "whisper\model.py", line 211, in forward
  File "torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "whisper\model.py", line 136, in forward
  File "torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "whisper\model.py", line 90, in forward
  File "whisper\model.py", line 104, in qkv_attention
RuntimeError: The size of tensor a (13) must match the size of tensor b (5) at non-singleton dimension 3
2023-04-21 09:13:48,013 - INFO - Recording Mic Stopped (Main.py:896) [Thread-6 (rec_realTime)]

Dadangdut33 commented on August 15, 2024

I honestly couldn't figure out what is wrong here. It might be related to the device, because I tried my mic and headphones and it works just fine. Have you tried another mic or device?

sugarbobo-ch commented on August 15, 2024

I have tested on other devices. I'm also using Windows 11.
When I close the app after encountering the error, the process does not seem to be killed and keeps holding GPU memory.

kelvincht commented on August 15, 2024

Got the same issue.

In my case, if Whisper translation is used, whether or not the transcript is kept, the same error appears within the first few seconds:
The size of tensor a (x) must match the size of tensor b (y)

If I only transcribe, without translation, there are no errors.

If I transcribe with Whisper and translate with Google Translate, there are no errors.

kelvincht commented on August 15, 2024

2023-04-21 09:13:44,485 - ERROR - The size of tensor a (13) must match the size of tensor b (5) at non-singleton dimension 3 (Record.py:720) [Thread-6 (rec_realTime)]
Traceback (most recent call last):
File "speech_translate\utils\Record.py", line 588, in rec_realTime

A suggestion from the Whisper community:

openai/whisper#951

Hi, it appears that you're calling the model from different threads. The model is not equipped for that, mainly because of the kv cache logic using the hooks. I'd suggest keep using the lock, if that's not too much of a slowdown.
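A minimal sketch of the lock approach suggested above, using only the standard library. `FakeModel`, `locked_transcribe`, and `run_demo` are hypothetical names for illustration; the real model object and the actual call sites in Record.py are not shown in this thread, so treat this as an assumption about the shape of the fix rather than the code that was merged.

```python
import threading
import time


class FakeModel:
    """Hypothetical stand-in for a loaded Whisper model (not the real API)."""

    def __init__(self):
        self._counter_guard = threading.Lock()
        self.active = 0        # calls currently inside transcribe()
        self.max_active = 0    # highest overlap ever observed

    def transcribe(self, audio, task="transcribe"):
        with self._counter_guard:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
        time.sleep(0.01)  # pretend to decode
        with self._counter_guard:
            self.active -= 1
        return {"text": f"{task}: {audio}"}


model = FakeModel()
model_lock = threading.Lock()  # one lock shared by every thread that uses `model`


def locked_transcribe(audio, task="transcribe"):
    """Serialize access so only one thread decodes at a time."""
    with model_lock:
        return model.transcribe(audio, task=task)


def run_demo():
    """Run a transcribe job and a translate job from two threads."""
    results = []
    results_guard = threading.Lock()

    def job(audio, task):
        out = locked_transcribe(audio, task)
        with results_guard:
            results.append(out["text"])

    threads = [
        threading.Thread(target=job, args=("chunk-1", "transcribe")),
        threading.Thread(target=job, args=("chunk-1", "translate")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results), model.max_active
```

Using `with model_lock:` rather than manual acquire/release means the lock is released even if the call raises, so one failed decode does not block the other thread forever. The cost is that the two tasks run one after the other instead of in parallel.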

kelvincht commented on August 15, 2024

I guess the issue is in Record.py: you can't call model.transcribe twice concurrently. It needs some kind of lock between the two calls, or both calls need to run in one thread.
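The "have them in one thread" alternative could be sketched as a single worker thread that owns the model and takes jobs from a queue. This is only an illustration: `start_worker`, `submit`, and the `transcribe_fn(audio, task)` callable are hypothetical names, not functions from speech-translate or Whisper.

```python
import queue
import threading


def start_worker(transcribe_fn):
    """Start one thread that is the only caller of transcribe_fn."""
    jobs = queue.Queue()

    def worker():
        while True:
            job = jobs.get()
            if job is None:  # sentinel: shut the worker down
                break
            audio, task, reply = job
            try:
                reply.put(("ok", transcribe_fn(audio, task)))
            except Exception as exc:  # hand the error back to the caller
                reply.put(("error", exc))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return jobs, thread


def submit(jobs, audio, task):
    """Queue a job from any thread and block until its result is ready."""
    reply = queue.Queue(maxsize=1)
    jobs.put((audio, task, reply))
    status, value = reply.get()
    if status == "error":
        raise value
    return value
```

Any thread can call `submit(jobs, audio, "translate")`, but the model is only ever touched from the worker thread, so no lock around the model call is needed. Putting `None` on the queue stops the worker.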

PawelGu commented on August 15, 2024

I'm having the same problem and wanted to edit Record.py.
It seems I'm too dumb or just can't find the file.
I'm on Windows 10 and there is neither a utils directory nor the Python file.
Can anybody help me? The online translation integration is nice, but it is much slower and even seems less accurate...

Edit: It occurred to me that I may have to install the module via pip? I guess the precompiled binary has the Python code built in.
Thumbs up if I'm right.

Dadangdut33 commented on August 15, 2024

OK, I did a hacky fix: a threading.Lock acquire/release around the model.transcribe calls at lines 5xx and 7xx in Record.py.

It works, but you may want to implement it more elegantly.

Thanks for the help @kelvincht <3 I have added it to the code.

EllyKher commented on August 15, 2024

@Dadangdut33 Unfortunately the issue is still there.
RuntimeError: The size of tensor a (12) must match the size of tensor b (5) at non-singleton dimension 3

I have also set auto channels and auto sample rate in the settings. Nothing changed.

I am using the latest released version.
