Comments (15)
This toxic behaviour is not getting you anything good. If you have few cores and that PR slowed you down, discuss it with the developers involved. Calm down and realize you share this planet with other people who are not attacking you at full throttle. You're free to switch to adult behaviour and discuss politely, or to open a fork.
> I see that the problem occurs if you only have very few CPU cores (0-3). If you have many more (16 or more), then it is indeed faster with the new commit. I tested with the file `tests/data/jfk.flac` from the repository. My suggestion would be to roll back to a previous release for the time being. One possible direction would be to modify `setup.py` with extras for the batched version.

This is really a shitty answer, and your PR really messed up faster-whisper. It is a perfectly valid scenario to run whisper model instances on 3-4 cores; your bullshit PR makes them slower instead of faster.
> I see that the problem occurs if you only have very few CPU cores (0-3). If you have many more (16 or more), then it is indeed faster with the new commit. I tested with the file `tests/data/jfk.flac` from the repository. My suggestion would be to roll back to a previous release for the time being. One possible direction would be to modify `setup.py` with extras for the batched version.

> This is really a shitty answer, and your PR really messed up faster-whisper. It is a perfectly valid scenario to run whisper model instances on 3-4 cores; your bullshit PR makes them slower instead of faster.
I never claimed it was invalid to have 3-4 cores. The PR went through several evaluations that confirmed a significant speed-up in general. Batching was a highly requested feature for the faster-whisper project. You are encouraged to open a PR with alternative solutions. Targeting specific contributors is disrespectful and against our community guidelines.
> This toxic behaviour is not getting you anything good. If you have few cores and that PR slowed you down, discuss it with the developers involved. Calm down and realize you share this planet with other people who are not attacking you at full throttle. You're free to switch to adult behaviour and discuss politely, or to open a fork.

> No, this is not toxic behaviour; it is a strong response meant to prevent the destruction of the project, because the issue is not only the performance of faster-whisper on low-core systems. Faster-whisper is not just a batch-oriented, offline project, and this PR and its developers are transforming it into one without knowing or respecting the project's origins and community usage.
Any PR, like any release, can have bugs. The authors of the PR are looking into it; meanwhile you can use 1.0.2 or whatever version works well for your system. Bugs and regressions happen every day, everywhere on GitHub, and the wise response is to be helpful to the authors, not to offend them. You want faster-whisper to work well on 1-4 cores? Provide feedback, test things, be helpful.
Cribbing jobus0's test script, I ran some tests.
| repository | compute_type | clip length | average elapsed time | relative % |
|---|---|---|---|---|
| without #856 | int8_float32 | 30s | 2.5980s | 100% |
| without #856 | int8 | 30s | 2.6320s | 101% |
| with #856 | int8_float32 | 30s | 3.5081s | 135% |
| with #856 | int8 | 30s | 3.5254s | 136% |
So not as great a slowdown as I first found, but still considerable. I only have int8 and float32 available on this server, but I notice the same sort of slowdown with other compute_types on a MacBook.
I wonder if this is CPU-related? Can anyone else try using CPU only and let me know?
test script:

```python
import faster_whisper
import time

model = faster_whisper.WhisperModel(
    "small",
    device="cpu",
    compute_type="int8_float32")

# warm up
segments, info = model.transcribe("audio.wav", beam_size=5)

total_start_time = time.time()
repeats = 10
for i in range(repeats):
    start_time = time.time()
    # note: transcribe() returns segments as a lazy generator, so decoding
    # only runs as segments are consumed (see the follow-up test below)
    segments, info = model.transcribe("audio.wav", beam_size=5)
    print(f"Elapsed time: {time.time() - start_time:.4f}")

print()
print(f"Total elapsed time: {time.time() - total_start_time:.4f}")
print(f"Average elapsed time: {(time.time() - total_start_time) / repeats:.4f}")
```
Expanding the test to also consume the segment generator (printing each segment), the slowdown is much greater.
| repository | compute_type | clip length | average elapsed time | relative % |
|---|---|---|---|---|
| without #856 | int8_float32 | 30s | 11.2693s | 100% |
| with #856 | int8_float32 | 30s | 70.4650s | 625% |
test script:

```python
import faster_whisper
import time

model = faster_whisper.WhisperModel(
    "small",
    device="cpu",
    compute_type="int8_float32")

# warm up
segments, info = model.transcribe("audio.wav", beam_size=5)

total_start_time = time.time()
repeats = 10
for i in range(repeats):
    start_time = time.time()
    segments, info = model.transcribe("audio.wav", beam_size=5)
    # consume the generator so the full transcription actually runs
    for segment in segments:
        print(segment.text)
    print(f"Elapsed time: {time.time() - start_time:.4f}")

print()
print(f"Total elapsed time: {time.time() - total_start_time:.4f}")
print(f"Average elapsed time: {(time.time() - total_start_time) / repeats:.4f}")
```
Have you tried with 10 minutes of audio and with the medium and large models?
The CPU slowdown may be caused by the switch to using torch instead of onnxruntime. I may be wrong in that assumption, but I know there's a related pyannote issue regarding slowdowns after switching to torch which might be worth looking into.
> The CPU slowdown may be caused by the switch to using torch instead of onnxruntime. I may be wrong in that assumption, but I know there's a related pyannote issue regarding slowdowns after switching to torch which might be worth looking into.

Mmm, I did wonder that; torch seems to be the main change for the non-batched transcribe() function, right? I remember having issues with pyannote and torch tensors running slowly on CPU, but I can't remember if we got to a solution there.
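A quick way to sanity-check the torch hypothesis is to time the same operation in numpy and torch on an affected machine. A minimal sketch (the array shape and repeat count are arbitrary choices, and this probes raw op speed only, not faster-whisper's actual feature extraction):

```python
import time

import numpy as np
import torch

# roughly mel-spectrogram-shaped input (80 mel bins x 3000 frames)
x_np = np.random.randn(80, 3000).astype(np.float32)
x_t = torch.from_numpy(x_np)

def bench(fn, repeats=100):
    fn()  # warm up
    start = time.time()
    for _ in range(repeats):
        fn()
    return (time.time() - start) / repeats

print(f"numpy matmul: {bench(lambda: x_np @ x_np.T):.6f}s")
print(f"torch matmul: {bench(lambda: x_t @ x_t.T):.6f}s")

# torch's default thread pool can oversubscribe small CPUs;
# capping it sometimes changes the picture considerably
torch.set_num_threads(2)
print(f"torch matmul (2 threads): {bench(lambda: x_t @ x_t.T):.6f}s")
```

If torch only falls behind at low thread counts, that would fit the pattern reported above.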
> Have you tried with 10 minutes of audio and with the medium and large models?

Good idea. I haven't run 10min versions of medium/large, as the new commit version looked likely to take hours. It's pretty consistently 625+% slower when the segments are consumed.
| repository | compute_type | model | clip length | average elapsed time | relative % |
|---|---|---|---|---|---|
| without #856 | int8_float32 | small | 10min | 194.8097s | 100% |
| with #856 | int8_float32 | small | 10min | 1218.948s | 626% |
| without #856 | int8_float32 | medium | 30s | 29.0115s | 100% |
| with #856 | int8_float32 | medium | 30s | 183.4392s | 632% |
| without #856 | int8_float32 | large | 30s | 47.8324s | 100% |
| with #856 | int8_float32 | large | 30s | 329.3989s | 688% |
I see that the problem occurs if you only have very few CPU cores (0-3). If you have many more (16 or more), then it is indeed faster with the new commit. I tested with the file `tests/data/jfk.flac` from the repository. My suggestion would be to roll back to a previous release for the time being. One possible direction would be to modify `setup.py` with extras for the batched version.
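For what it's worth, a minimal sketch of what such an extra could look like in `setup.py`; the dependency lists here are abbreviated and illustrative, and the contents of `batched` are my assumption about which packages only the batched path needs:

```python
# setup.py (sketch; dependency lists are illustrative, not the real ones)
from setuptools import setup, find_packages

setup(
    name="faster-whisper",
    packages=find_packages(),
    # core dependencies needed by the plain transcribe() path
    install_requires=[
        "ctranslate2",
        "onnxruntime",
    ],
    extras_require={
        # heavy batching dependencies, installed only on request via:
        #   pip install faster-whisper[batched]
        "batched": ["torch", "torchaudio"],
    },
)
```

Users on small CPU-only boxes would then get the lean base install by default, while batching users opt in explicitly.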
> I see that the problem occurs if you only have very few CPU cores (0-8). If you have more (16 or more), then it is indeed faster with the new commit.
Cores or threads?
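Either way, it's worth stating explicitly when reporting numbers. Something like this makes the distinction visible and pins the backend's thread count (assuming `psutil` is installed; `cpu_threads` is an existing `WhisperModel` argument):

```python
import os

import psutil
import faster_whisper

logical = os.cpu_count()
physical = psutil.cpu_count(logical=False) or logical

print(f"logical CPUs (threads): {logical}")
print(f"physical cores: {physical}")

# pin the CTranslate2 backend to the physical core count
model = faster_whisper.WhisperModel(
    "small",
    device="cpu",
    compute_type="int8_float32",
    cpu_threads=physical,
)
```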
> I see that the problem occurs if you only have very few CPU cores (0-3). If you have many more (16 or more), then it is indeed faster with the new commit. I tested with the file `tests/data/jfk.flac` from the repository. My suggestion would be to roll back to a previous release for the time being. One possible direction would be to modify `setup.py` with extras for the batched version.
Thanks for taking a look - this would make sense. Tested on a MacBook with 8 cores, an ARM server with 4 cores, and an Intel server with 4 cores; all are slower with the new commit.
I'm targeting CPU-only hardware with low resources, so maybe a less common setup.
Do you think it's torch replacing numpy for the feature extraction and audio processing?
While a bit clunky, I note that transformers' Whisper has torch optional and uses numpy in its feature extractor if torch is missing - could a similar thing work here?
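A rough sketch of that optional-dependency pattern, falling back to numpy when torch is absent; the helper below is illustrative, not faster-whisper's actual feature-extraction API:

```python
import numpy as np

try:
    import torch
    _HAS_TORCH = True
except ImportError:
    _HAS_TORCH = False

def hann_window(n: int) -> np.ndarray:
    """Illustrative helper: pick whichever backend is available."""
    if _HAS_TORCH:
        return torch.hann_window(n).numpy()
    # numpy equivalent of torch.hann_window(n) (periodic window)
    return np.hanning(n + 1)[:-1].astype(np.float32)
```

transformers does roughly this with an availability check at import time; the same guard could wrap the torch-based audio processing here.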
> I see that the problem occurs if you only have very few CPU cores (0-3). If you have many more (16 or more), then it is indeed faster with the new commit. I tested with the file `tests/data/jfk.flac` from the repository. My suggestion would be to roll back to a previous release for the time being. One possible direction would be to modify `setup.py` with extras for the batched version.

> This is really a shitty answer, and your PR really messed up faster-whisper. It is a perfectly valid scenario to run whisper model instances on 3-4 cores; your bullshit PR makes them slower instead of faster.

> I never claimed it was invalid to have 3-4 cores. The PR went through several evaluations that confirmed a significant speed-up in general. Batching was a highly requested feature for the faster-whisper project. You are encouraged to open a PR with alternative solutions. Targeting specific contributors is disrespectful and against our community guidelines.
How can you claim that this is the feature requested for faster-whisper? And is it viable to solve the batching problem like this? Batching can currently be handled by alternative solutions.
What are the general scenarios you mention that speed up faster-whisper? Are you sure you are covering all the scenarios in actual use? How can you speak from your limited context?
You need to revert the commit so that any alternative solution can be built on top of it. As for the PR, the things done there are bullshit.
Taking faster-whisper's flexibility and agility and making it something bloated is not what the faster-whisper community should demand.
Let's revert the code and make every effort to keep this PR separate and agile. Please note that making an effort does not make you the owner of the repo.
> This toxic behaviour is not getting you anything good. If you have few cores and that PR slowed you down, discuss it with the developers involved. Calm down and realize you share this planet with other people who are not attacking you at full throttle. You're free to switch to adult behaviour and discuss politely, or to open a fork.

No, this is not toxic behaviour; it is a strong response meant to prevent the destruction of the project, because the issue is not only the performance of faster-whisper on low-core systems. Faster-whisper is not just a batch-oriented, offline project, and this PR and its developers are transforming it into one without knowing or respecting the project's origins and community usage.
> This toxic behaviour is not getting you anything good. If you have few cores and that PR slowed you down, discuss it with the developers involved. Calm down and realize you share this planet with other people who are not attacking you at full throttle. You're free to switch to adult behaviour and discuss politely, or to open a fork.

> No, this is not toxic behaviour; it is a strong response meant to prevent the destruction of the project, because the issue is not only the performance of faster-whisper on low-core systems. Faster-whisper is not just a batch-oriented, offline project, and this PR and its developers are transforming it into one without knowing or respecting the project's origins and community usage.

> Any PR, like any release, can have bugs. The authors of the PR are looking into it; meanwhile you can use 1.0.2 or whatever version works well for your system. Bugs and regressions happen every day, everywhere on GitHub, and the wise response is to be helpful to the authors, not to offend them. You want faster-whisper to work well on 1-4 cores? Provide feedback, test things, be helpful.
You still do not get it, and you are thinking too simplistically in reducing this to testing and bug fixing. That is understandable, since you have not seen my overall analysis of the PR.
Given the design and implementation nature of the PR, no bug fix will correct the issues.