Comments (20)
It does run on macOS. However, you have to use Python 3, so you have to install it via brew. Instead of pip, use pip3. And yes, you need to run it in the terminal.
from whisper-ctranslate2.
Thank you, I will try this and come back to you again if problems come up.
I couldn't figure out: what are the supported --compute_type values on an M1 Mac (ARM64)?
Just use auto, the same value I use on the x86-64 platform. It will automatically select the best option for the specific platform.
Does this only work for MP3? I have an M4A file recorded on my phone. Also, where do I need to execute the command whisper-ctranslate2 <file name> --model large?
When I run the command in the folder where the recording is located, it says the command whisper-ctranslate2 is not found.
If you installed Python 3 via brew install python3, it should work; otherwise I cannot help you, since you did something different from what I did on my machine.
You have to install brew from www.brew.sh, then install Python 3 with the command brew install python3, then install whisper-ctranslate2 with the command pip3 install -U whisper-ctranslate2. Then you have to exit the terminal and launch it again, and you'll have access to the whisper-ctranslate2 command. You can verify the version with whisper-ctranslate2 --version, which should print whisper-ctranslate2 0.3.4 (version 0.3.4 as of today; it should be 0.3.5 if you installed with pip3 install git+https://github.com/jordimas/whisper-ctranslate2.git).
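Collected in one place, the setup steps described above amount to the following (a sketch, assuming Homebrew is already installed from www.brew.sh):

```shell
# Install Python 3 via Homebrew, then whisper-ctranslate2 via pip3.
brew install python3
pip3 install -U whisper-ctranslate2

# Open a fresh terminal session, then verify the install:
whisper-ctranslate2 --version   # e.g. whisper-ctranslate2 0.3.4
```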
It doesn't matter which format the audio is in; whisper-ctranslate2 automatically converts it to 16 kHz mono audio internally for the inference engine to transcribe. You can also specify where the audio file is located by prepending the absolute path to the filename. The folder where you launch the whisper-ctranslate2 command (the current folder) will contain the transcription or the subtitles of the video, saved as <name of the video>.vtt.
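To illustrate the naming (a hypothetical example; the path is made up): wherever the input file lives, the subtitle lands in the current folder under the input's base name with a .vtt extension.

```shell
# Made-up input path; only the base name determines the output name.
f="/Users/me/Recordings/interview.m4a"
base="$(basename "$f")"     # interview.m4a
echo "${base%.*}.vtt"       # prints: interview.vtt
```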
If the instructions are too much for you, you could look at the macOS app called WhisperScript; internally it uses the code from faster-whisper, which whisper-ctranslate2 is also based on.
Thank you so much for the explanation. I followed your instructions, but I get this error when installing whisper-ctranslate2.
You'll have to run brew install pkg-config, because that's what is missing for building PyAV, per the error message. Are you on an M1 or M2 machine?
I am on an M2 Pro machine.
Do you think it installed properly?
Yes, it is installed correctly. However, you have outdated brew formulae, which means you have to run brew update && brew upgrade.
OK, thanks. I did that, but I still get this error when I run the command pip3 install -U whisper-ctranslate2.
Sorry, that error is coming from trying to build av-10.0.0. I have no idea how to help you with it, since I don't have access to ARM-based Macs.
From what I see across the projects that run the Whisper AI models for speech-to-text transcription, like whisper.cpp, faster-whisper, and whisper-ctranslate2, the code primarily runs on Intel-based Macs.
Your best chance of getting it running is to try the app called WhisperScript; it uses the same code as faster-whisper, which whisper-ctranslate2 is also based on. The link to the app is in one of my replies above. WhisperScript runs natively on ARM-based Macs.
OK, thank you. So WhisperScript is more or less the same thing as what I am trying to do here?
Yes, it's a GUI-based app. You no longer need to use the terminal.
@dazzng Weren't you running faster-whisper from my repo? This is practically the same thing; it won't work on your GPU.
Yeah, I was just trying alternatives.
I have an M1 and use whisper.cpp with the large and medium Core ML models. The first time large is run, it takes 15 1/2 hours before it starts, and the first time medium takes 2 to 3 hours, if I remember correctly. I didn't want to compile them with Xcode and get a developer account, so I just downloaded precompiled models, but it still takes that extra first-time run.
I get around 1.7x realtime speed on large and about 3.5x realtime speed with medium. I use -ng (no GPU) for large, since I only have 8 GB RAM; it goes a tad slower, but at least I can still use the laptop.
st=$SECONDS
for f in *.opus ; do
  # Convert to 16 kHz mono WAV and pipe into whisper.cpp
  ffmpeg -hide_banner -i "$f" -f wav -ar 16000 -ac 1 - \
    | nice ~/whisper/whisper.cpp-1.5.4/main -m ~/whisper/whisper.cpp-1.5.4/models/ggml-$setmodel.bin - \
        -ovtt -of "$f" -t 8 -l "$setlanguage" $translate -ml $maxlength -sow $setprintcolors $setng
  # Clean up the generated subtitles (use a separate loop variable so the
  # outer loop's $f is not clobbered, which the original one-liner did)
  for v in *.vtt ; do
    sed -r -i .bak -e "s|\ballah\b|Allah|g" -e 's|\[BLANK_AUDIO\]||g' "$v"
  done
  # Rename <name>.opus.vtt to <name>.vtt and move it into vttsubs/
  for i in *opus.vtt ; do
    mv -i -- "$i" "$(printf '%s' "$i" | sed '1s/.opus.vtt/.vtt/')"
    [ ! -d vttsubs ] && mkdir vttsubs/
    mv *.vtt vttsubs/
  done
  rm *.bak
done
secs=$((SECONDS-st))
printf '\nwhisper.cpp took %02dh:%02dm:%02ds\n' $(($secs/3600)) $(($secs%3600/60)) $(($secs%60))
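The elapsed-time formatting at the end of that script can be checked in isolation; for example, 5025 seconds (1 h 23 m 45 s) renders like this:

```shell
# Same arithmetic as the final printf above, with a fixed sample value.
secs=5025
printf '%02dh:%02dm:%02ds\n' $((secs/3600)) $((secs%3600/60)) $((secs%60))
# prints: 01h:23m:45s
```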