
dumrauf / youtube_to_audiobook


Download any YouTube video as an audiobook

Home Page: https://www.how-hard-can-it.be/listen-to-youtube/?utm_source=GitHub&utm_medium=social&utm_campaign=About

License: MIT License

Shell 83.62% Dockerfile 16.38%
youtube audiobook bash youtube-dl mp3 docker docker-container youtube-link download-audio audiobooks

youtube_to_audiobook's Introduction

YouTube2Audiobook

This repository contains a Bash script that downloads the audio section of a given YouTube link and converts it into an audiobook. The audiobook can then be used like any other MP3 file.

See also "Continuously Listen to YouTube Audiobooks on Your Phone" on How Hard Can It Be?! for an end-to-end solution for downloading any YouTube video as an audiobook and listening to it on your phone, while keeping track of playback positions even across app restarts and system reboots.

There is also a Docker container, which is automatically built from this repository, readily available for use on Docker Hub. It provides the same functionality by packaging the Bash script and all necessary dependencies.

You Have

Before you can use the Bash script in this repository out of the box, you need:

  • youtube-dl which is a Python application that ships with a CLI and lets you download audio and video from YouTube
  • FFmpeg which is "a complete, cross-platform solution to record, convert and stream audio and video" (from https://ffmpeg.org/) and also ships with a CLI
  • jq which is a standalone, "lightweight and flexible command-line JSON processor" (from https://stedolan.github.io/jq/)
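
A quick way to confirm that these three tools are on your PATH before running the script is sketched below. The helper name missing_commands is hypothetical and not part of yt2ab.sh; it only relies on the shell built-in command -v.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of yt2ab.sh): print every command from the
# argument list that cannot be found on PATH, separated by spaces.
missing_commands() {
  local cmd missing=""
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      missing="${missing:+$missing }$cmd"
    fi
  done
  printf '%s\n' "$missing"
}

# Check the three dependencies listed above and report the result.
missing="$(missing_commands youtube-dl ffmpeg jq)"
if [ -n "$missing" ]; then
  echo "Missing dependencies: $missing" >&2
else
  echo "All dependencies found."
fi
```

An empty result means all listed tools were found; anything else names the tools that still need to be installed.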

The corresponding Docker container on Docker Hub only requires a recent Docker engine.

Most likely, you also have a YouTube link that you want to convert to an audiobook.

You Want

After running the Bash script or the Docker container in this repository, you get an audiobook version of the input YouTube link stored on your device.

Execution: Bash Script

The stand-alone Bash script yt2ab.sh is located in the root folder.

Getting Started

At minimum, the Bash script yt2ab.sh requires a YouTube URL to be passed in via the -u parameter.

A YouTube link <YouTube-URL> can be converted into an audiobook via

./yt2ab.sh -u <YouTube-URL>

Depending on the size of the audio section in the given YouTube link, your connection speed, and the audio conversion speed, it may take several minutes before the Bash script successfully completes.

Advanced Use

Further options are available via additional CLI parameters.

The full list of currently available parameters is

Usage  : yt2ab.sh -u <youtube-url>                -s [audio-speed] -q [audio-quality] -v(erbose)
Example: yt2ab.sh -u https://youtu.be/WRbalzuvms4
Example: yt2ab.sh -u https://youtu.be/WRbalzuvms4 -s 1.5
Example: yt2ab.sh -u https://youtu.be/WRbalzuvms4 -s 1.5           -q 3
Example: yt2ab.sh -u https://youtu.be/WRbalzuvms4 -s 1.5           -q 3               -v
Note: for audio quality see also column 'ffmpeg option' on <https://trac.ffmpeg.org/wiki/Encode/MP3>; defaults to 4

The audio_speed parameter lets you increase or decrease the speed of the resulting audiobook; it defaults to 1, which leaves the audio speed unchanged. This setting is particularly helpful when trying to normalise the original speed of the given YouTube link in the resulting audiobook. In that case, set audio_speed so that original speed x audio_speed = 1. For example, if the original speed of the YouTube link is 1.5, choosing a value of 2/3 for audio_speed normalises the speed of the resulting audiobook, as 1.5 x 2/3 = 1. However, the final audiobook may sound awkward, as audio speed adjustments can be a lossy process.
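
The normalising value is simply the reciprocal of the original speed, which can be computed on the command line. The sketch below uses a hypothetical helper named normalising_speed and assumes that -s accepts a decimal value such as 0.6667, as the 1.5 in the usage examples suggests.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of yt2ab.sh): print the audio speed that
# undoes a known original speed, i.e. 1 / original speed, to 4 decimals.
normalising_speed() {
  awk -v speed="$1" 'BEGIN { printf "%.4f\n", 1 / speed }'
}

normalising_speed 1.5   # prints 0.6667

# Possible use (assumes -s accepts a decimal value):
#   ./yt2ab.sh -u <YouTube-URL> -s "$(normalising_speed 1.5)"
```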

The audio_quality parameter sets the quality of the resulting audiobook; it defaults to 4. The values are taken from column "ffmpeg option" in table "LAME Bitrate Overview" on https://trac.ffmpeg.org/wiki/Encode/MP3.

The v(erbose) parameter increases the verbosity of the script and can be used for debugging purposes.

Execution: Docker Container

The Docker container for version 1.3.0 (see the release page for the latest one) can be pulled from Docker Hub via

docker pull dumrauf/yt2ab:v1.3.0

The image includes the yt2ab.sh Bash script and all required dependencies. As such, it accepts the same input parameters. The only new parameter is specific to Docker.

By default, the Docker container stores all downloaded audiobooks in /data.

Download and convert a <YouTube-URL> into an audiobook in the current directory via

docker run -v "$(pwd)":/data dumrauf/yt2ab:v1.3.0 -u <YouTube-URL>

For more details on how to build your own Docker container and advanced uses, see the "YouTube2Audiobook is on Docker Hub" announcement.

FAQs

Below is a list of frequently asked questions.

I Know How to Make This Better!

Excellent. Feel free to fork this repository, make the changes in your fork, and open a pull request so we can make things better for everyone. Thanks!

Disclaimer

By using this software, you agree to always respect the copyright laws in your country and all other jurisdictions that may apply. Moreover, you also agree to always respect the Terms of Service of the provider you are downloading audio or video content from. The authors of this software are in no way responsible for any potential damages, liabilities, or losses.


youtube_to_audiobook's Issues

Extract URL from Correct Key in JSON Document

The below example is taken from the "Introducing Playlist Support" blog post on How Hard Can It Be?!.

When closely inspecting the returned JSON documents for the "Genre: Science" YouTube playlist,

$ youtube-dl --dump-json --flat-playlist https://youtu.be/gmmgWcbAmjo?list=PLZ-bKJtH3G7BYYscYohAFZAq65RwCf-K-
{"url": "z5f2bpiAdLQ", "_type": "url", "ie_key": "Youtube", "id": "z5f2bpiAdLQ", "title": "The Fairyland of Science by Arabella B. BUCKLEY read by Various | Full Audio Book"}
{"url": "TGMGfCfpcqY", "_type": "url", "ie_key": "Youtube", "id": "TGMGfCfpcqY", "title": "The Extermination of the American Bison by William T. HORNADAY read by Various | Full Audio Book"}
{"url": "z5f2bpiAdLQ", "_type": "url", "ie_key": "Youtube", "id": "z5f2bpiAdLQ", "title": "The Fairyland of Science by Arabella B. BUCKLEY read by Various | Full Audio Book"}
...

the values of both the id and url keys seem to match for every individual JSON document; this can also be observed in numerous other playlists.

However, in order to remove the reliance on the assumption that the values of id and url keys match for every individual JSON document, we should move the extraction (and subsequent construction) of the YouTube URL from the id to the url key in the corresponding line of the main Bash script.
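
A minimal sketch of reading the url key with jq follows. The helper name playlist_urls is hypothetical, and prefixing the value with https://www.youtube.com/watch?v= is an assumption based on the url values above being bare video IDs; the actual line in yt2ab.sh may construct the URL differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not the actual yt2ab.sh code): read the JSON documents
# produced by `youtube-dl --dump-json --flat-playlist` on stdin and print one
# watch URL per document, built from the "url" key instead of the "id" key.
playlist_urls() {
  jq -r '"https://www.youtube.com/watch?v=" + .url'
}

# Possible use:
#   youtube-dl --dump-json --flat-playlist "<playlist-URL>" | playlist_urls
```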

Add Playlist Handling

At the moment, yt2ab can only process single video links. This makes downloading an entire playlist a tedious task. Right now, it requires manually looking up and providing every video link contained in the playlist. Boooring!

We should extend yt2ab to accept playlist links as input, especially since the underlying youtube-dl already provides most of the required functionality.

Add WebP Thumbnail Support

Thumbnails retrieved by youtube-dl seem to be named after the file provided by the server where the suffix seems to match the MIME type of the downloaded image; usually this is JPEG.

While not widespread, the thumbnails of some videos are retrieved as WebP files, which breaks the current logic based on JPEG.

The Bash script hence needs to be reliably extended (i.e., not introducing race conditions by relying on something like the 'most recent' file to pick up the correct thumbnail) to also work with WebP files.
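
One deterministic approach is to look for the thumbnail under a known base name and a fixed list of extensions, rather than guessing from file timestamps. The sketch below assumes the thumbnail shares its base name with the downloaded audio and differs only in its suffix; the helper name find_thumbnail and the extension list are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of yt2ab.sh): print the thumbnail file that
# belongs to the given base name, trying a fixed list of extensions in order.
# Returns non-zero if no matching file exists.
find_thumbnail() {
  local base="$1" ext
  for ext in jpg jpeg webp png; do
    if [ -f "${base}.${ext}" ]; then
      printf '%s\n' "${base}.${ext}"
      return 0
    fi
  done
  return 1
}

# Possible use:
#   thumbnail="$(find_thumbnail "./some-video-title")" || echo "no thumbnail found" >&2
```

Because the lookup is keyed on the base name, parallel or repeated runs in the same directory would not pick up each other's files.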

Document 'jq' dependency

The implementation of Issue #1 introduced a dependency on jq, which needs to be documented so that it is clearly visible to any user.

Provide Docker Container

While youtube2audiobook is only a very thin Bash script wrapper around the great youtube-dl open source project, it still relies on numerous dependencies to be present before being able to run.

Not everyone might be willing or able to install the necessary dependencies. But, this being (late) 2020, there's a readily available solution for this: Containers.

We should really provide a Docker container that follows best practices, is properly tagged, and readily available on Docker Hub. Let's fully embrace the future!

ERROR: gmmgWcbAmjo: YouTube said: Unable to extract video data

Hi, I have tried to follow your instructions on Linux Mint 20.3 and get the following output:

VirtualBox:~$ docker run -v "$(pwd)":/data -u $(id -u ${USER}):$(id -g ${USER}) dumrauf/yt2ab:v1.3.0 -u "https://youtu.be/gmmgWcbAmjo?list=PLZ-bKJtH3G7BYYscYohAFZAq65RwCf-K-"

[INFO tini (1)] Spawned child process 'yt2ab' with pid '7'
ERROR: gmmgWcbAmjo: YouTube said: Unable to extract video data
[INFO tini (1)] Main child exited normally (with status '0')

Any ideas?

Prevent Throttling

The introduction of playlist support in #1 has opened the floodgates to (sequentially) downloading multiple videos from YouTube. Understandably, YouTube doesn't take too well to its content being mass downloaded.

Rather than hitting YouTube with a large number of download requests in a short span of time, we should introduce a mitigating strategy which waits a random amount of time between successive downloads. This should somewhat simulate a "regular" YouTube user and hopefully delay being locked out to the point of it being no longer a problem.

Ideally, the wait time should be configurable.
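
The idea can be sketched as a random pause between successive downloads. The names MIN_WAIT, MAX_WAIT, random_wait_seconds, and download_with_pauses are hypothetical and only illustrate the proposal; they are not existing yt2ab options.

```shell
#!/usr/bin/env bash
# Hypothetical configuration: bounds of the pause in seconds, overridable
# through the environment.
MIN_WAIT="${MIN_WAIT:-5}"
MAX_WAIT="${MAX_WAIT:-30}"

# Print a whole number of seconds in the inclusive range [min, max].
random_wait_seconds() {
  local min="$1" max="$2"
  echo $(( min + RANDOM % (max - min + 1) ))
}

# Download each URL in turn, sleeping a random time between downloads
# (but not before the first or after the last one).
download_with_pauses() {
  local first=1 url
  for url in "$@"; do
    if [ "$first" -eq 0 ]; then
      sleep "$(random_wait_seconds "$MIN_WAIT" "$MAX_WAIT")"
    fi
    first=0
    ./yt2ab.sh -u "$url"
  done
}

# Possible use:
#   MIN_WAIT=10 MAX_WAIT=60 download_with_pauses <YouTube-URL-1> <YouTube-URL-2>
```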
