Coder Social home page Coder Social logo

mawkke / audiobook-split-ffmpeg Goto Github PK

View Code? Open in Web Editor NEW
15.0 1.0 0.0 1.53 MB

Split audiobook file into per-chapter files using chapter metadata and ffmpeg

License: Apache License 2.0

Python 94.53% Makefile 5.47%
audio audio-processing audiobook audiobook-convertor cli command-line ffmpeg python tool

audiobook-split-ffmpeg's Introduction

audiobook-split-ffmpeg

Split audiobook file into per-chapter files using chapter metadata and ffmpeg.

Useful in situations where your preferred audio player does not support chapter metadata.

NOTE: Works only if the input file actually has chapter metadata (see example below)

Example

Let's say, you have an audio file mybook.m4b, for which ffprobe -i mybook.m4b shows the following:

Chapter #0:0: start 0.000000, end 1079.000000
Metadata:
  title           : Chapter Zero
Chapter #0:1: start 1079.000000, end 2040.000000
Metadata:
  title           : Chapter One
Chapter #0:2: start 2040.000000, end 2878.000000
Metadata:
  title           : Chapter Two
Chapter #0:3: start 2878.000000, end 3506.000000

Then, running:

$ audiobook-split-ffmpeg --infile mybook.m4b --outdir /tmp/foo

..produces the following files:

  • /tmp/foo/001 - Chapter Zero.m4b
  • /tmp/foo/002 - Chapter One.m4b
  • /tmp/foo/003 - Chapter Two.m4b

You may then play these files with your preferred application.

Install

This package is not published in the PYPI, but you can install it directly from Github with

$ pip install --user git+https://github.com/MawKKe/audiobook-split-ffmpeg

However, I personally recommend using pipx which simplifies installing python applications into isolated virtualenvs:

$ pipx install git+https://github.com/MawKKe/audiobook-split-ffmpeg

afterwards, the audiobook-split-ffmpeg command should be available via your PATH.

Next, see Usage below.

Usage

See the help:

$ audiobook-split-ffmpeg -h

In the simplest case you can just call

$ audiobook-split-ffmpeg --infile /path/to/audio.m4b --outdir foo

Note that this script will never overwrite files in foo/, so you must delete conflicting files manually (or specify some other empty/nonexistent directory)

The chapter titles will be included in the filenames if they are available in the chapter metadata. You may prevent this behaviour with flag --no-use-title-as-filename, in which case the filenames will include the input file basename instead (this is useful is your metadata is crappy or malformed, for example).

You may specify how many parallel ffmpeg jobs you want with command line param --concurrency. The default concurrency is equal to the number of cores available. Note that at some point increasing the concurrency might not increase the throughput. (We specifically instruct ffmpeg to NOT perform re-encoding, so most of the processing work consists of copying the existing encoded audio data from the input file to the output file(s) - this kind of processing is more I/O bounded than CPU-bounded).

Dependencies

This application has no 3rd party library dependencies, as everything is implemented using the Python standard library. However, the script assumes the that the following system-executables are available somewhere in your $PATH:

  • ffmpeg
  • ffprobe

For Ubuntu, these can be installed with apt install ffmpeg.

Development and Testing

Clone the repo, and install the package in development mode:

$ git clone <url> audiobook-split-ffmpeg && cd audiobook-split-ffmpeg
$ pip install -e '.[dev]'

then run tests with:

$ pytest -vv

Features

  • The script does not transcode/re-encode the audio data. This speeds up the processing, but has the possibility of creating mangled audio in some rare cases (let me know if this happens).

  • This script will instruct ffmpeg to write metadata in the resulting chapter files, including:

    • track: the chapter number; in the format X/Y, where X = chapter number, Y = total num of chapters.
    • title: the chapter title, as-is (if available)
  • The chapter numbers are included in the output file names, padded with zeroes so that all numbers are of equal length. This makes the files much easier to sort by name.

  • The work is parallelized to speed up the processing.

License

Copyright 2018 Markus Holmström (MawKKe)

The works under this repository are licenced under Apache License 2.0. See file LICENSE for more information.

Contributing

This project is hosted at https://github.com/MawKKe/audiobook-split-ffmpeg

You are welcome to leave bug reports, fixes and feature requests. Thanks!

audiobook-split-ffmpeg's People

Contributors

mawkke avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.