Comments (11)
Post a screenshot of what is written when it finishes.
from whisper-standalone-win.
There is no log. The audio is 17 minutes long, and the task always stops midway through processing when using the medium model.
And are you sure that whisper-faster is no longer in memory?
Do you run on CUDA? How much VRAM does it have?
Yes, it runs on CUDA on a 3070 with 8 GB.
During execution I saw total VRAM usage stay below 4 GB. Once the task exits, resource usage drops immediately.
When I tried the large model, it worked just fine, with a success log:
Transcription speed: 9.89 audio seconds/s
Subtitles are written to 'D:\workspace\whisper\fast-whisper\Whisper-Faster' directory.
Operation finished in: 165 seconds
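A rough sanity check of the figures above (a sketch; it assumes the quoted speed covers only the decoding phase, and that this run used the same 17:38.773 audio shown in the later medium-model log):

```python
# Rough sanity check of the reported timings. Assumptions: the 9.89 figure
# covers only decoding, and the audio is the 17:38.773 clip from the later log.
duration_s = 17 * 60 + 38.773   # audio length in seconds
speed = 9.89                    # reported audio seconds transcribed per second
decode_s = duration_s / speed   # roughly 107 s of pure decoding
overhead_s = 165 - decode_s     # remainder of the 165 s total: load, VAD, I/O
print(f"decoding ~ {decode_s:.0f} s, overhead ~ {overhead_s:.0f} s")
```

Under those assumptions, roughly a third of the 165 s total goes to model loading and audio preprocessing rather than decoding.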
Strange. Can you share that audio?
Nah, I can't share it.
On what OS do you run it?
Maybe there would be more info with --verbose=true.
Try --compute_type=int8
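The two suggestions can be combined in one invocation. A hypothetical command line (the executable name and --model flag are assumptions based on typical whisper-standalone-win usage; only --verbose=true and --compute_type=int8 come from this thread):

```shell
# Hypothetical invocation; adjust the exe name and paths to your install.
whisper-faster.exe audio.mp3 --model=medium --verbose=true --compute_type=int8
```

int8 roughly halves weight memory relative to float16, which can sidestep borderline VRAM issues at a small accuracy cost.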
It is on Windows 10 22H2:
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info] CPU: AuthenticAMD (SSE4.1=true, AVX=true, AVX2=true, AVX512=false)
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info] - Selected ISA: AVX2
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info] - Use Intel MKL: false
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info] - SGEMM backend: DNNL (packed: false)
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info] - GEMM_S16 backend: none (packed: false)
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info] - GEMM_S8 backend: DNNL (packed: false, u8s8 preferred: true)
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info] GPU #0: NVIDIA GeForce RTX 3070 (CC=8.6)
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info] - Allow INT8: true
[2023-07-13 10:55:43.305] [ctranslate2] [thread 30220] [info] - Allow FP16: true (with Tensor Cores: true)
[2023-07-13 10:55:44.283] [ctranslate2] [thread 30220] [info] Using CUDA allocator: cuda_malloc_async
[2023-07-13 10:55:44.644] [ctranslate2] [thread 30220] [info] Loaded model D:\workspace\whisper\fast-whisper\Whisper-Faster\_models\faster-whisper-medium on device cuda:0
[2023-07-13 10:55:44.644] [ctranslate2] [thread 30220] [info] - Binary version: 6
[2023-07-13 10:55:44.644] [ctranslate2] [thread 30220] [info] - Model specification revision: 3
[2023-07-13 10:55:44.644] [ctranslate2] [thread 30220] [info] - Selected compute type: float16
Standalone Faster-Whisper r134.6 running on: CUDA
Number of visible GPU devices: 1
Supported compute types by GPU: {'float16', 'int8', 'int8_float16', 'float32'}
Model loaded in: 1.41 seconds
Processing audio with duration 17:38.773
VAD filter removed 00:31.889 of audio
VAD filter kept the following audio segments: [00:00.000 -> 17:06.884]
Audio processing finished in: 5.02 seconds
Processing segment at 00:00.000
# ...the same "Processing segment" lines repeat here, so they are omitted
# then come some compression ratio log lines:
Compression ratio threshold is not met with temperature 0.0 (8.540984 > 2.400000)
Compression ratio threshold is not met with temperature 0.2 (8.540984 > 2.400000)
Compression ratio threshold is not met with temperature 0.4 (8.540984 > 2.400000)
Compression ratio threshold is not met with temperature 0.6 (7.133333 > 2.400000)
Compression ratio threshold is not met with temperature 0.8 (7.444444 > 2.400000)
Processing segment at 16:42.470
And it exits without any error; it just plays a "ding" sound from the speaker.
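The "compression ratio threshold" lines above come from Whisper's anti-hallucination check: the decoded text is zlib-compressed, and if raw length divided by compressed length exceeds 2.4, the segment is retried at a higher temperature, because highly repetitive (often hallucinated) output compresses extremely well. A minimal sketch of the metric:

```python
import zlib

def compression_ratio(text: str) -> float:
    # Whisper-style metric: raw UTF-8 byte length divided by zlib-compressed
    # length. Repetitive (often hallucinated) output compresses very well,
    # so its ratio shoots past the 2.4 threshold seen in the log above.
    data = text.encode("utf-8")
    return len(data) / len(zlib.compress(data))

normal = "The quick brown fox jumps over the lazy dog near the river bank."
looped = "la la la " * 40  # degenerate repeated decoding

print(compression_ratio(normal))  # modest, stays under 2.4
print(compression_ratio(looped))  # large, would trigger a higher-temperature retry
```

A run of failures at every temperature, like the 8.54 values in the log, means the decoder kept producing near-identical text for that window.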
But it shows that it's working -> "Processing segment at 16:42.470".
If you hear the beep, that indicates it finished successfully, and "Operation finished" should be written.
I think the output log is incorrect.
No "Operation finished" line is written when using the medium model, even when it finishes successfully.
After the beep I can find the text from 16:42.470 to the end in the generated srt file, but the last line of the log is "Processing segment at 16:42.470".
The situation is even worse without --verbose:
when it beeps, the logged lines never reach the end, but the srt file is correct. The last timestamp logged is:
[10:21.460 --> 10:22.460]
The created .srt and the beep indicate that the task finished successfully.
It's just that, for some reason, the log output to the console doesn't show up for you.
Check whether r136.7 corrected this issue.
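One plausible cause for the missing final lines (an assumption on my part, not something confirmed in this thread) is stdout buffering: if output is buffered and the process exits right after the last segment, lines still sitting in the buffer are lost even though the work, and the .srt, completed. Flushing after every log line avoids this:

```python
import io

def log_line(stream, msg: str) -> None:
    # Write and flush immediately so final lines (e.g. "Operation finished")
    # cannot be lost in a buffer if the process exits right afterwards.
    stream.write(msg + "\n")
    stream.flush()

# Demonstrated against an in-memory stream; in a real tool this would be
# sys.stdout (or equivalently print(..., flush=True)).
buf = io.StringIO()
log_line(buf, "Processing segment at 16:42.470")
log_line(buf, "Operation finished in: 165 seconds")
print(buf.getvalue())
```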