Uses subtitle timestamps to extract the dialogue audio from an .mkv file. The script is written in plain Bourne shell, and its only dependency is ffmpeg.
First, ffmpeg extracts the subtitle track from the .mkv file. The script then cuts and converts the audio track of the .mkv file according to the timestamps in the subtitle file. This requires subtitles in a text-based format (i.e. .ass, .ssa, or .srt). The audio cuts are then concatenated and written to the output file (an .mp3 file by default).
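The steps above can be sketched roughly as follows. This is a minimal illustration and not the script's actual code: the function names srt_cues and cut_and_join, the clip file names, and the choice of .srt as the intermediate subtitle format are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch of the extract / cut / concatenate pipeline.

# Read SRT text on stdin and print "start end" (in seconds) for every cue.
srt_cues() {
    awk '
        / --> / {
            gsub(/\r/, "")                 # tolerate CRLF line endings
            print to_sec($1), to_sec($3)
        }
        function to_sec(t,    p) {
            sub(/,/, ".", t)               # SRT puts a comma before the milliseconds
            split(t, p, ":")
            return sprintf("%.3f", p[1] * 3600 + p[2] * 60 + p[3])
        }
    '
}

# Read "start end" pairs on stdin, cut one clip per pair from the first
# audio track of $1, then join the clips into $2. $3 is a scratch directory.
cut_and_join() {
    src=$1 out=$2 dir=$3 n=0
    : > "$dir/list.txt"
    while read -r start end; do
        n=$((n + 1))
        name=$(printf 'clip%05d.mp3' "$n")
        # -nostdin keeps ffmpeg from consuming the loop input
        ffmpeg -nostdin -loglevel error -y -i "$src" -map 0:a:0 \
            -ss "$start" -to "$end" "$dir/$name"
        # paths in the list are resolved relative to the list file
        printf "file '%s'\n" "$name" >> "$dir/list.txt"
    done
    ffmpeg -nostdin -loglevel error -y -f concat -i "$dir/list.txt" -c copy "$out"
}

# Example use (not run here):
#   tmp=$(mktemp -d)
#   ffmpeg -nostdin -loglevel error -y -i input.mkv -map 0:s:0 "$tmp/subs.srt"
#   srt_cues < "$tmp/subs.srt" | cut_and_join input.mkv output.mp3 "$tmp"
```

Keeping the timestamp parsing separate from the ffmpeg calls makes the parsing step easy to check on its own with a small subtitle file.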
There are several options that you can specify:
-i Specify the video input
-a Specify the audio track number to use
-s Either specify the subtitle track number to use or specify the filename of an external subtitle file
-o Specify the output filename
-p Specify padding (in milliseconds) around subtitle timestamps; must be less than 1000
-k Keep intermediate files stored under /tmp; useful for debugging purposes
-h Display usage message
Only the -i option is required. If -a and -s are not specified, the script uses the first audio track and the first subtitle track. The default output name is the name of the video file with the extension changed to .mp3. The default padding is 100 milliseconds. As with ffmpeg, the extension of the output name determines the output format.
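To make the options and defaults concrete, here is one way they could be parsed with getopts. It is a sketch under assumptions, not the script's real argument handling: the function name parse_args and the variable names are invented for the example, and an empty audio or subs value simply stands for "use the first track".

```shell
#!/bin/sh
# Hypothetical option parsing that mirrors the documented flags and defaults.

usage() {
    echo "usage: $0 -i video [-a track] [-s track|file] [-o output] [-p ms] [-k] [-h]"
}

parse_args() {
    input= audio= subs= output= pad=100 keep=0   # empty audio/subs: first track
    OPTIND=1
    while getopts i:a:s:o:p:kh opt; do
        case $opt in
            i) input=$OPTARG ;;
            a) audio=$OPTARG ;;
            s) subs=$OPTARG ;;      # track number or external subtitle file
            o) output=$OPTARG ;;
            p) pad=$OPTARG ;;
            k) keep=1 ;;
            h) usage; return 0 ;;   # a real script would exit here
            *) usage >&2; return 2 ;;
        esac
    done

    if [ -z "$input" ]; then
        echo "error: -i is required" >&2
        return 2
    fi

    # Padding must be a whole number of milliseconds below 1000.
    case $pad in
        ''|*[!0-9]*) echo "error: -p must be a number" >&2; return 2 ;;
    esac
    if [ "$pad" -ge 1000 ]; then
        echo "error: -p must be less than 1000" >&2
        return 2
    fi

    # Default output: the video name with its extension changed to .mp3.
    : "${output:=${input%.*}.mp3}"
}
```

With this sketch, `parse_args -i movie.mkv` leaves output set to movie.mp3 and pad set to 100, while `parse_args -i movie.mkv -p 1000` is rejected.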
- Add command line options to specify the audio and subtitles track
- Add option to specify output name
- Automatically merge overlapping timestamps (especially when using padding)
- Add option to use an external subtitles file
- Add the option to pad the timestamps in the subtitles file
- Improve subtitle parsing
- Improve documentation and error-handling
- Parse bitmap-based subtitle files
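The padding and overlap-merging items in the list above interact: once padding is added, neighbouring cues can start to overlap, and cutting them separately would duplicate audio. A minimal sketch of one way to handle this is shown below. The function name pad_and_merge and the "start end" seconds format are assumptions for illustration, the input is assumed to be sorted by start time, and the padded end is not clamped to the length of the audio.

```shell
#!/bin/sh
# Hypothetical helper: pad each interval, then merge any that overlap.
# stdin: "start end" pairs in seconds, sorted by start time.
# $1:    padding in milliseconds.
pad_and_merge() {
    awk -v pad_ms="$1" '
        {
            s = $1 - pad_ms / 1000
            if (s < 0) s = 0                     # never start before 0
            e = $2 + pad_ms / 1000
            if (NR > 1 && s <= cur_e) {
                # overlaps or touches the pending interval: extend it
                if (e > cur_e) cur_e = e
            } else {
                if (NR > 1) printf "%.3f %.3f\n", cur_s, cur_e
                cur_s = s
                cur_e = e
            }
        }
        END { if (NR > 0) printf "%.3f %.3f\n", cur_s, cur_e }
    '
}
```

For example, with 100 ms of padding the cues 1.000 to 3.500 and 3.600 to 5.000 become a single interval from 0.900 to 5.100, while a cue at 10.000 to 12.000 stays separate.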