
Comments (11)

Purfview commented on May 13, 2024

Post a screenshot of what is written when it finishes.


yuchaoqian commented on May 13, 2024

(screenshot of the console output attached)

There is no log; the audio is 17 minutes long, and the task always stops in the middle of processing when using the medium model.


Purfview commented on May 13, 2024

And are you sure that whisper-faster is not in memory anymore?
Do you run on CUDA? How much VRAM does it have?


yuchaoqian commented on May 13, 2024

Yes, it runs on CUDA on a 3070, which has 8 GB.
During execution, I saw VRAM usage stay below 4 GB in total. Once the task exits, the resource usage drops immediately.

When I tried the large model, it worked just fine, with a success log:


Transcription speed: 9.89 audio seconds/s

Subtitles are written to 'D:\workspace\whisper\fast-whisper\Whisper-Faster' directory.


Operation finished in: 165 seconds


Purfview commented on May 13, 2024

Strange. Can you share that audio?


yuchaoqian commented on May 13, 2024

Nah, I can't share it.


Purfview commented on May 13, 2024

What OS do you run it on?
Maybe there would be more info with --verbose=true.
Try --compute_type=int8.
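
For reference, the standalone binary wraps the faster-whisper Python library, so the --compute_type=int8 suggestion corresponds roughly to the library-level sketch below (assuming the faster-whisper package is installed; the model name and audio path are placeholders, not taken from this thread):

# Minimal sketch of an int8 run with the faster-whisper library.
from faster_whisper import WhisperModel

# int8 needs noticeably less VRAM than float16, which can matter on an
# 8 GB card such as the RTX 3070 discussed above.
model = WhisperModel("medium", device="cuda", compute_type="int8")

segments, info = model.transcribe("audio.mp3", vad_filter=True)
for segment in segments:
    print(f"[{segment.start:7.2f} --> {segment.end:7.2f}] {segment.text}")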


yuchaoqian commented on May 13, 2024

It is on Windows 10 22H2.

[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info] CPU: AuthenticAMD (SSE4.1=true, AVX=true, AVX2=true, AVX512=false)
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info]  - Selected ISA: AVX2
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info]  - Use Intel MKL: false
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info]  - SGEMM backend: DNNL (packed: false)
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info]  - GEMM_S16 backend: none (packed: false)
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info]  - GEMM_S8 backend: DNNL (packed: false, u8s8 preferred: true)
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info] GPU #0: NVIDIA GeForce RTX 3070 (CC=8.6)
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info]  - Allow INT8: true
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info]  - Allow FP16: true (with Tensor Cores: true)
[2023-07-13 10:55:44.283] [ctranslate2] [thread 30220] [info] Using CUDA allocator: cuda_malloc_async
[2023-07-13 10:55:44.644] [ctranslate2] [thread 30220] [info] Loaded model D:\workspace\whisper\fast-whisper\Whisper-Faster\_models\faster-whisper-medium on device cuda:0
[2023-07-13 10:55:44.644] [ctranslate2] [thread 30220] [info]  - Binary version: 6
[2023-07-13 10:55:44.644] [ctranslate2] [thread 30220] [info]  - Model specification revision: 3
[2023-07-13 10:55:44.644] [ctranslate2] [thread 30220] [info]  - Selected compute type: float16

Standalone Faster-Whisper r134.6 running on: CUDA

Number of visible GPU devices: 1

Supported compute types by GPU: {'float16', 'int8', 'int8_float16', 'float32'}


Model loaded in: 1.41 seconds

Processing audio with duration 17:38.773

VAD filter removed 00:31.889 of audio
VAD filter kept the following audio segments: [00:00.000 -> 17:06.884]

Audio processing finished in: 5.02 seconds

Processing segment at 00:00.000

# ... the same "Processing segment" lines repeat here, so I do not paste them,
# followed by some compression ratio log lines:
Compression ratio threshold is not met with temperature 0.0 (8.540984 > 2.400000)
Compression ratio threshold is not met with temperature 0.2 (8.540984 > 2.400000)
Compression ratio threshold is not met with temperature 0.4 (8.540984 > 2.400000)
Compression ratio threshold is not met with temperature 0.6 (7.133333 > 2.400000)
Compression ratio threshold is not met with temperature 0.8 (7.444444 > 2.400000)

Processing segment at 16:42.470

And it exits without any error, just playing a short "beep" tone from the speaker.
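
For context on the "Compression ratio threshold is not met" lines above: faster-whisper treats a decoded segment whose text compresses too well (a sign of repetition) as a failed attempt and retries it at progressively higher temperatures, which is what the 0.0 -> 0.8 steps in the log show. A rough library-level sketch of the two parameters involved (the values shown are the library defaults, not options taken from this run):

# Sketch of the decoding-fallback parameters, assuming the faster-whisper
# package; the model name and audio path are placeholders.
from faster_whisper import WhisperModel

model = WhisperModel("medium", device="cuda", compute_type="float16")
segments, info = model.transcribe(
    "audio.mp3",
    vad_filter=True,
    # A segment whose gzip compression ratio exceeds this value is treated
    # as degenerate (repetitive) and is re-decoded.
    compression_ratio_threshold=2.4,
    # Temperatures tried in order until a decode passes the checks.
    temperature=[0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
)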


Purfview commented on May 13, 2024

But it shows that it's working -> "Processing segment at 16:42.470".
If you hear the beep, that indicates it finished successfully, and "Operation finished" should be written.


yuchaoqian commented on May 13, 2024

I think the output log is not correct.
No "Operation finished" line is written when using the medium-size model, even when it finishes successfully.
After the beep, I could find the text from 16:42.470 to the end in the generated srt file, but the last line of the log is "Processing segment at 16:42.470".

The situation is even worse when not using "--verbose":
when it beeps, the logged lines never reach the end, but the srt file is correct. The last logged line is:

[10:21.460 --> 10:22.460]


Purfview commented on May 13, 2024

A created .srt and the "beep" indicate that the task finished successfully.
It's just that, for some reason, the log output to the console doesn't show up for you.

Check if r136.7 corrects this issue.

