serene-arc / podcast-downloader

A simple command-line python tool to download podcasts

License: MIT License

Python 100.00%
rss-feed download-podcasts python podcast-downloader podcast podcast-fetcher archiver podcast-archival archive download

podcast-downloader's Introduction

podcast-downloader

This is a simple tool for downloading all the available episodes in an RSS feed to disk, where they can be listened to offline.

Python 3 must be installed first, followed by the requirements. These are listed in requirements.txt and can be installed with the command python3 -m pip install -r requirements.txt.

Arguments

Following are the arguments that can be supplied to the program:

  • destination is the directory that the folder structure will be created in and the podcasts downloaded to
  • -f, --feed is the URL for the RSS feed of the podcast
  • -o, --opml is the location of an OPML file with podcast data
  • --file is the location of a simple text file with an RSS feed URL on each line
  • -l, --limit is the maximum number of episodes to try to download from the feed; if omitted, all episodes are considered, but a small number is fastest for updating a feed
  • -m, --max-downloads will limit the number of episodes to be downloaded to the specified integer
  • -w, --write-list is the option to write an ordered list of the episodes in the podcast in several different formats, as specified:
    • none
    • text
    • audacious
    • m3u
  • -t, --threads is the number of threads to run concurrently; defaults to 10
  • --max-attempts will specify the number of reattempts for a failed or refused connection; see below for more details
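
For the --opml option, the following is a minimal sketch of how feed URLs could be read from an OPML file. It assumes the common convention that each feed is an outline element carrying an xmlUrl attribute; the function name feeds_from_opml and the sample document are illustrative and are not taken from the project.

```python
import xml.etree.ElementTree as ET


def feeds_from_opml(opml_text: str) -> list[str]:
    """Return every feed URL found in an OPML document, in document order.

    Hypothetical helper for illustration; not the project's own parser.
    """
    root = ET.fromstring(opml_text)
    # Feed entries conventionally carry the URL in an xmlUrl attribute;
    # outline elements without it are treated as folders and skipped.
    return [
        outline.attrib["xmlUrl"]
        for outline in root.iter("outline")
        if "xmlUrl" in outline.attrib
    ]


SAMPLE_OPML = """<opml version="2.0">
  <body>
    <outline text="Shows">
      <outline type="rss" text="Example A" xmlUrl="https://example.com/a.rss"/>
      <outline type="rss" text="Example B" xmlUrl="https://example.com/b.rss"/>
    </outline>
  </body>
</opml>"""

print(feeds_from_opml(SAMPLE_OPML))
```

Nested outlines are handled because iter() walks the whole tree, so folder structure in the OPML file does not matter.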

The following arguments change what the program does in a major way, for example by skipping the download of episode files:

  • --skip-download will do everything but download the files; useful for updating episode playlists without a lengthy download
  • --verify will scan existing files for ones whose file size falls outside a 2% tolerance and list them in results.txt
  • --update-tags will download episode information and write tags to all episodes already downloaded
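
The 2% check performed by --verify can be sketched as below. This assumes the reference value is the size advertised for the episode in the feed, which is not stated here; the function name and parameters are hypothetical.

```python
def size_outside_tolerance(
    actual_bytes: int, expected_bytes: int, tolerance: float = 0.02
) -> bool:
    """Return True when a file's size deviates from the expected size by more
    than the given fraction (2% by default).

    Assumption: expected_bytes is the size advertised by the feed.
    """
    if expected_bytes <= 0:
        # No usable reference size, so the file cannot be judged either way.
        return False
    return abs(actual_bytes - expected_bytes) > tolerance * expected_bytes


print(size_outside_tolerance(985_000, 1_000_000))  # within 2%
print(size_outside_tolerance(950_000, 1_000_000))  # 5% too small
```

A file that fails a check like this is most likely a truncated download, which is why listing such files for review is useful.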

The following arguments alter the verbosity and logging behaviour:

  • -s, --suppress-progress will disable all progress bars
  • -v, --verbose will increase the verbosity of the information output to the console
  • --log will log all messages at debug level (the equivalent of -v) to the specified file, appending if it already exists

The --feed, --file, and --opml flags can all be specified multiple times to aggregate feeds from multiple locations.

Of these, only the destination is required. However, at least one feed, feed file, or OPML file must be provided, or the program will complete instantly without doing anything.
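
How repeated flags aggregate can be illustrated with argparse. This is only a sketch under the assumption that each flag collects its values into a list; it is not the project's actual parser, and the example URLs are placeholders.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser; only the feed-source options are modelled."""
    parser = argparse.ArgumentParser(prog="podcastdownloader")
    parser.add_argument("destination")
    # action="append" collects every occurrence of a flag into one list.
    parser.add_argument("-f", "--feed", action="append", default=[])
    parser.add_argument("-o", "--opml", action="append", default=[])
    parser.add_argument("--file", action="append", default=[])
    return parser


args = build_parser().parse_args(
    [
        "media/podcasts",
        "-f", "https://example.com/a.rss",
        "-f", "https://example.com/b.rss",
        "-o", "podcasts.opml",
    ]
)
print(args.feed)
print(args.opml)
print(args.file)
```

With this layout, an invocation that supplies only the destination parses successfully but leaves all three lists empty, which corresponds to the "complete instantly" case.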

Maximum Reattempts

In some cases, particularly when downloading one or a few specific podcasts with a lot of episodes at once, the remote server will receive a number of simultaneous or consecutive requests. As this may appear to be atypical behaviour, the server may refuse or close incoming connections as a rate-limiting measure. This is normal when scraping servers that do not want to be scraped.

There are several countermeasures in the downloader for this behaviour, such as randomising the download list to avoid repeated calls to the same server in a short amount of time, but this may not work if there is only one or a few podcast feeds to download. As such, the method of last resort is a sleep function that waits until the server allows the download to continue. The sleep grows in increments of 30 seconds, with the maximum number of reattempts specified by the --max-attempts argument. For example, if left at the default of 10, the program will sleep for 30 seconds if the connection is refused. If it is refused again, it will sleep for 60 seconds before reattempting the download. This continues until the 10th attempt, where it will sleep for 300 seconds, or five minutes. If the connection is refused after this, an error will occur and the download thread will move on to the next podcast episode.
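
The sleep schedule described above can be sketched as follows. The function names, the injectable sleep parameter, and the choice of ConnectionError as the failure signal are assumptions made for illustration; the downloader's real implementation may differ.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(max_attempts: int = 10, increment: int = 30) -> list[int]:
    """Return the sleep time in seconds before each reattempt: 30, 60, 90, ..."""
    return [increment * attempt for attempt in range(1, max_attempts + 1)]


def download_with_backoff(
    download: Callable[[], T],
    max_attempts: int = 10,
    increment: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call download(), sleeping for a growing interval after each refusal.

    Hypothetical wrapper: `download` is any zero-argument callable that raises
    ConnectionError (or a subclass) when the server refuses the connection.
    """
    for delay in backoff_delays(max_attempts, increment):
        try:
            return download()
        except ConnectionError:
            sleep(delay)
    # Final try after the longest sleep; a failure here propagates to the
    # caller, which can then move on to the next episode.
    return download()


print(backoff_delays())       # schedule for the default of 10 reattempts
print(sum(backoff_delays()))  # worst-case total sleep in seconds
```

With the default settings the worst case is 1650 seconds of sleeping (27.5 minutes) for a single episode, which is the trade-off to weigh when raising or lowering --max-attempts.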

The maximum number of reattempts may need to be changed in several cases. If you wish to download the episode regardless of anything else, then you may want to increase the argument. This may result in longer wait times for the downloads to complete. However, a low argument will make the program skip downloads if they time out repeatedly, missing content but completing faster.

Warnings

The --write-list option should not be used with the --limit option. The limit will be applied to the episode list in whatever format is chosen, and the result will overwrite any past episode list files. For example, if a --limit of 5 is chosen with -w audacious, then the exported Audacious playlist will only be 5 items long. Thus the -w option should only be used when there is no limit.
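
The effect can be seen in a small sketch that renders a minimal M3U playlist. The helper name, the file paths, and the choice to keep the first entries are assumptions for illustration; which episodes the real --limit keeps is not specified here.

```python
from typing import Optional


def build_m3u(episode_paths: list[str], limit: Optional[int] = None) -> str:
    """Render a minimal M3U playlist, optionally truncated to `limit` entries.

    Hypothetical helper; the downloader's own playlist writer may differ.
    """
    selected = episode_paths if limit is None else episode_paths[:limit]
    return "\n".join(["#EXTM3U", *selected]) + "\n"


episodes = [f"Show/episode-{n:02d}.mp3" for n in range(1, 21)]
full = build_m3u(episodes)
limited = build_m3u(episodes, limit=5)

# Subtract one for the #EXTM3U header line.
print(len(full.splitlines()) - 1)     # 20
print(len(limited.splitlines()) - 1)  # 5
```

Writing the limited playlist over an existing full one would silently drop the other 15 entries, which is the situation the warning describes.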

Tags

The downloader has basic tag writing support. It will write ID3 tags to MP3 files and iTunes-compatible tags to m4a and MP4 files. The information written is as follows:

  • The episode title
  • The podcast title
  • The publishing date and time of the episode
  • The description accompanying the episode
  • The episode number (if available)
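
For MP3 files, the fields above could map onto ID3v2.4 frames roughly as in the sketch below. The frame choices (TIT2, TALB, TDRC, COMM, TRCK) are an assumption about a reasonable mapping, not a statement of which frames the downloader writes, and the function only builds a plain dictionary rather than touching any file.

```python
from datetime import datetime
from typing import Optional


def id3_frames(
    episode_title: str,
    podcast_title: str,
    published: datetime,
    description: str,
    episode_number: Optional[int] = None,
) -> dict[str, str]:
    """Map episode metadata to ID3v2.4 frame IDs (illustrative mapping only)."""
    frames = {
        "TIT2": episode_title,  # title of the track
        "TALB": podcast_title,  # album, used here for the podcast title
        "TDRC": published.isoformat(timespec="seconds"),  # recording time
        "COMM": description,  # free-text comment
    }
    if episode_number is not None:
        # Only written when the feed supplies an episode number.
        frames["TRCK"] = str(episode_number)
    return frames


print(id3_frames("Pilot", "Example Show", datetime(2022, 9, 30, 18, 0), "First episode", 1))
```

A tagging library such as mutagen would then be responsible for turning a mapping like this into actual frames in the file.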

Example Command

Following is an example command that downloads a single feed, together with the feeds listed in an OPML file, to a podcasts folder.

python3 -m podcastdownloader media/podcasts -f 'http://linustechtips.libsyn.com/wanshow' -o podcasts.opml

Podcast Feed Files

A feed file, for use with the --file option, is a simple text file with one RSS feed URL per line. The podcastdownloader will ignore empty lines and all lines beginning with a hash (#), which allows comments and a rudimentary structure if desired. Additionally, comments can be appended to the end of a line with a feed URL. As long as there is a space between the end of the URL and the hash, the comment will be removed when the file is parsed.
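
The parsing rules above can be sketched as follows. The function name and the sample URLs are hypothetical, and the real parser may treat edge cases differently.

```python
def parse_feed_file(text: str) -> list[str]:
    """Extract feed URLs from feed-file text, dropping comments and blank lines."""
    feeds = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            # Blank line or whole-line comment.
            continue
        # A trailing comment starts at the first hash preceded by a space,
        # so a hash inside the URL itself (a fragment) is left alone.
        url = line.split(" #", 1)[0].strip()
        feeds.append(url)
    return feeds


SAMPLE_FEED_FILE = """\
# News
https://example.com/news.rss

# Tech
https://example.com/tech.rss # weekly show
https://example.com/page#fragment
"""

print(parse_feed_file(SAMPLE_FEED_FILE))
```

Requiring the space before the hash is what keeps URLs containing a fragment intact.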

podcast-downloader's Issues

Mailing Updates

Idea for an enhancement: the tool mails a list of newly downloaded podcast episodes instead of leaving all of the output in the logging, which can be quite dense.

Switch to using Click and subcommands

Switch from using argparse and an increasingly complex series of options and arguments to Click, with dedicated subcommands for each of the functions that the downloader performs.

Directory permission error

Hi, I don't know if this project is still being maintained, but I am having some problems.

Firstly, the example you give in the readme does not seem to be correct; I have had to add the word download as a command.

I then get a permission error:

openhabian@openhabian:/ $ python -m podcastdownloader download / -l 1 -v -f 'https://anchor.fm/s/3cbbb3b8/podcast/rss'
[2022-10-06 18:06:06,180 - root - INFO] - 1 feeds found
[2022-10-06 18:06:06,184 - root - DEBUG] - Beginning retrieval for https://anchor.fm/s/3cbbb3b8/podcast/rss
[2022-10-06 18:06:14,293 - root - INFO] - Retrieved RSS for The WAN Show
[2022-10-06 18:06:14,294 - root - INFO] - All feeds filled
[2022-10-06 18:06:14,295 - root - INFO] - Limiting episodes per podcast to 1 entries
[2022-10-06 18:06:14,296 - root - INFO] - 1 episodes to download
[2022-10-06 18:06:14,297 - root - DEBUG] - Attempting download of episode USB Branding Changed Again... - WAN Show September 30, 2022 in The WAN Show
Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/openhabian/.local/lib/python3.9/site-packages/podcastdownloader/__main__.py", line 173, in <module>
    cli()
  File "/home/openhabian/.local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/openhabian/.local/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/openhabian/.local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/openhabian/.local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/openhabian/.local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/openhabian/.local/lib/python3.9/site-packages/podcastdownloader/__main__.py", line 118, in cli_download
    asyncio.run(download_episodes(all_feeds, destination, threads, write_playlist, limit))
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/openhabian/.local/lib/python3.9/site-packages/podcastdownloader/__main__.py", line 168, in download_episodes
    await asyncio.gather(*episode_downloaders)
  File "/home/openhabian/.local/lib/python3.9/site-packages/podcastdownloader/__main__.py", line 77, in download_individual_episode
    await episode.download(session)
  File "/home/openhabian/.local/lib/python3.9/site-packages/podcastdownloader/episode.py", line 89, in download
    self.file_path.parent.mkdir(exist_ok=True, parents=True)
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
PermissionError: [Errno 13] Permission denied: '/The WAN Show'

Add proper tag engine

Add a proper tag engine that will add tags based on the file format. It needs to use mutagen directly rather than the easy classes, as they don't seem to work as well as manually specifying the proper frames.
