
Comments (8)

exPHAT commented on May 28, 2024

Yes, you can follow the same steps shown in issue #6.

from swiftwhisper.

eppee commented on May 28, 2024

> Yes, you can follow the same steps shown in this issue #6

Thank you, but I have already tried that. I should have been clearer.
It works great for recordings of continuous speech.
But it works really badly when there are pauses in the speech. Is there anything other than those parameters I can work with?


exPHAT commented on May 28, 2024

Can you expand on this more? How does it work "really bad" for pauses in speech? What changes?


eppee commented on May 28, 2024

> Can you expand on this more? How does it work "really bad" for pauses in speech? What changes?

If I make a recording where I speak continuously, it works great. If I make a recording where I say something like
"Okay here I'm speaking about", pause for about 10 seconds, and then continue, I expect no words to appear during that 10-second gap. Instead, words do appear, just at a slower pace. And if I stop speaking and let the recording run for a few seconds at the end, it adds words in that space too.
Continuous speaking: good.
Breaks: filled with words, and then the output accelerates back into the correct place once continuous speaking resumes.


exPHAT commented on May 28, 2024

I haven't tested it, but have you looked into using params.no_speech_thold? That's a value inside of whisper.cpp that is likely related to what you're looking for.

I don't think this is a bug inside of SwiftWhisper, so I'm going to keep this issue closed. Please re-open it if this is a bug unique to this wrapper.


eppee commented on May 28, 2024

> I haven't tested it, but have you looked into using params.no_speech_thold? That's a value inside of whisper.cpp that is likely related to what you're looking for.
>
> I don't think this is a bug inside of SwiftWhisper, so I'm going to keep this issue closed. Please re-open it if this is a bug unique to this wrapper.

Unfortunately it seems to make no difference.
It's as if it won't add any pauses at all.

In this example I make many long pauses, but as you can see, each word always starts exactly where the previous one ends. No gaps at all. Is there another parameter that I have forgotten?

0 - 120 :
120 - 400 : So
400 - 800 : if
800 - 1420 : I
1420 - 1790 : make
1790 - 1990 : a
1990 - 3000 : video
3000 - 4980 : and
4980 - 5580 : if
5580 - 6570 : I
6570 - 7040 : take
7040 - 7160 : a
7160 - 7360 : long
7360 - 7620 : pause
7620 - 7960 : like
7960 - 10070 : that
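
The timings in the listing above can be checked mechanically. The following is a minimal Swift sketch that copies those values verbatim, computes the gap between each word and the next, and flags words whose duration is unusually long. The `TimedWord` type, the helper names, and the 900 ms limit are hypothetical choices for illustration, not part of SwiftWhisper or whisper.cpp.

```swift
/// One timed word in milliseconds, as printed in the listing above.
struct TimedWord {
    let start: Int
    let end: Int
    let text: String
    var duration: Int { end - start }
}

// Values copied from the listing above (the first entry has no text).
let words: [TimedWord] = [
    TimedWord(start: 0, end: 120, text: ""),
    TimedWord(start: 120, end: 400, text: "So"),
    TimedWord(start: 400, end: 800, text: "if"),
    TimedWord(start: 800, end: 1420, text: "I"),
    TimedWord(start: 1420, end: 1790, text: "make"),
    TimedWord(start: 1790, end: 1990, text: "a"),
    TimedWord(start: 1990, end: 3000, text: "video"),
    TimedWord(start: 3000, end: 4980, text: "and"),
    TimedWord(start: 4980, end: 5580, text: "if"),
    TimedWord(start: 5580, end: 6570, text: "I"),
    TimedWord(start: 6570, end: 7040, text: "take"),
    TimedWord(start: 7040, end: 7160, text: "a"),
    TimedWord(start: 7160, end: 7360, text: "long"),
    TimedWord(start: 7360, end: 7620, text: "pause"),
    TimedWord(start: 7620, end: 7960, text: "like"),
    TimedWord(start: 7960, end: 10070, text: "that"),
]

/// Silence in ms between each word and the word that follows it.
func gaps(in words: [TimedWord]) -> [Int] {
    zip(words, words.dropFirst()).map { $1.start - $0.end }
}

/// Words lasting longer than `limit` ms. The limit is an arbitrary,
/// hypothetical cutoff for "longer than a short word plausibly takes".
func stretchedWords(in words: [TimedWord], longerThan limit: Int) -> [TimedWord] {
    words.filter { $0.duration > limit }
}

let allGaps = gaps(in: words)
print("largest gap:", allGaps.max() ?? 0, "ms")
for word in stretchedWords(in: words, longerThan: 900) {
    print("\(word.text): \(word.duration) ms")
}
```

For this listing every gap is 0 ms, and the flagged words are "video" (1010 ms), "and" (1980 ms), "I" (990 ms), and "that" (2110 ms). That is consistent with the pauses being absorbed into word durations rather than reported as gaps.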


exPHAT commented on May 28, 2024

@eppee I haven't tested, but I just released version 1.0.3, which uses the latest version of whisper.cpp. It might be worth checking if this is still a problem.


eppee commented on May 28, 2024

> @eppee I haven't tested, but just released version 1.0.3 which uses the latest version of whisper.cpp. Might be worth checking if this is still a problem.

Unfortunately there is no difference.
I've been reading a lot, and this seems to be a solution, but I don't know how to implement it in Swift...

openai/whisper#435
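
The thread does not say what the linked discussion proposes, so the sketch below is not that approach. It only illustrates one generic way to locate pauses independently of the model: a simple energy (RMS) check over short frames of the audio, assuming the recording is already available as 16 kHz mono `Float` samples. The `SilentSpan` type, the `silentSpans` function, and every threshold are hypothetical and would need tuning on real recordings.

```swift
import Foundation

/// A span of quiet audio, in milliseconds from the start of the recording.
struct SilentSpan: Equatable {
    let startMs: Int
    let endMs: Int
}

/// Finds stretches of low-energy audio in mono PCM samples.
/// All default values are illustrative assumptions, not whisper.cpp settings.
func silentSpans(
    in samples: [Float],
    sampleRate: Int = 16_000,
    frameMs: Int = 20,
    rmsThreshold: Float = 0.01,
    minSilenceMs: Int = 500
) -> [SilentSpan] {
    let frameLength = sampleRate * frameMs / 1000
    guard frameLength > 0 else { return [] }

    var spans: [SilentSpan] = []
    var silenceStart: Int? = nil  // ms at which the current quiet run began

    // Ends the current quiet run at `endMs`, keeping it only if long enough.
    func closeRun(at endMs: Int) {
        if let start = silenceStart, endMs - start >= minSilenceMs {
            spans.append(SilentSpan(startMs: start, endMs: endMs))
        }
        silenceStart = nil
    }

    var offset = 0
    while offset + frameLength <= samples.count {
        let frame = samples[offset..<(offset + frameLength)]
        let meanSquare = frame.reduce(Float(0)) { $0 + $1 * $1 } / Float(frameLength)
        let timeMs = offset * 1000 / sampleRate
        if meanSquare.squareRoot() < rmsThreshold {
            if silenceStart == nil { silenceStart = timeMs }
        } else {
            closeRun(at: timeMs)
        }
        offset += frameLength
    }
    closeRun(at: offset * 1000 / sampleRate)
    return spans
}

// Synthetic check: 1 s of a 440 Hz tone, 2 s of silence, 1 s of tone.
let tone: [Float] = (0..<16_000).map { i in
    let phase = 2.0 * Double.pi * 440.0 * Double(i) / 16_000.0
    return Float(sin(phase)) * 0.5
}
let quiet = [Float](repeating: 0, count: 32_000)
let spans = silentSpans(in: tone + quiet + tone)
print(spans)
```

On the synthetic signal this reports a single quiet span from 1000 ms to 3000 ms. Spans found this way could then be compared against the word timings to see which words were stretched across a pause. Real recordings with background noise would likely need a higher threshold or a proper voice activity detector.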

