
listparser's Introduction

listparser

Parse OPML subscription lists in Python.


If you're building a feed reader and you need to parse OPML subscription lists, you've come to the right place!

listparser makes it easy to parse and use subscription lists in multiple formats. It supports OPML, RDF+FOAF, and the iGoogle exported settings format, and runs on Python 3.8+ and on PyPy 3.8.

Usage

>>> import listparser
>>> result = listparser.parse(open("feeds.opml").read())

A dictionary will be returned with several keys:

  • meta: a dictionary of information about the subscription list
  • feeds: a list of feeds
  • lists: a list of subscription lists
  • version: a format identifier like "opml2"
  • bozo: True if there is a problem with the list, False otherwise
  • bozo_exception: (if bozo is True) a description of the problem

For convenience, the result dictionary supports attribute access for its keys.

Continuing the example:

>>> result.meta.title
'listparser project feeds'
>>> len(result.feeds)
2
>>> result.feeds[0].title, result.feeds[0].url
('listparser blog', 'https://kurtmckee.org/tag/listparser')
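The keys listed above can be combined into a small helper that reports a problem without discarding the feeds that were found. This is only a sketch: `summarize` is a hypothetical name, and it assumes each feed entry exposes `title` and `url` as dictionary keys in the same way it exposes them as attributes.

```python
def summarize(result):
    """Return (problem, feeds) for a parsed subscription list.

    `result` is assumed to be the dictionary returned by listparser.parse().
    `problem` is None when `bozo` is false; otherwise it is `bozo_exception`.
    `feeds` is a list of (title, url) pairs.
    """
    # bozo_exception is only expected to be present when bozo is set.
    problem = result["bozo_exception"] if result["bozo"] else None
    feeds = [(feed["title"], feed["url"]) for feed in result["feeds"]]
    return problem, feeds
```

Calling `summarize(result)` on the example above would give `None` as the problem when the list parsed cleanly, along with one pair per feed.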

More extensive documentation is available in the docs/ directory and online.

Bugs

There are going to be bugs. The best way to handle them will be to isolate the simplest possible document that susses out the bug, add that document as a test case, and then find and fix the problem.

...you can also just report the bug and leave it to someone else to fix the problem, but that won't be as much fun for you!

Bugs can be reported on GitHub.

Git workflow

listparser basically follows the git-flow methodology:

  • Features and changes are developed in branches off the main branch. They merge back into the main branch.
  • Feature releases branch off the main branch. The project metadata is updated (like the version and copyright years), and then the release branch merges into the releases branch. The releases branch is then tagged, and then it is merged back into main.
  • Hotfixes branch off the releases branch. As with feature releases, the project metadata is updated, the hotfix branch merges back into the releases branch, which is then tagged and merged back into main.

Development

To set up a development environment, follow these steps at a command line:

# Set up a virtual environment.
python -m venv .venv

# Activate the virtual environment in Linux:
. .venv/bin/activate

# ...or in Windows Powershell:
& .venv/Scripts/Activate.ps1

# Install dependencies.
python -m pip install -U pip setuptools wheel
python -m pip install poetry pre-commit tox scriv
poetry install --all-extras

# Enable pre-commit.
pre-commit install

# Run the unit tests.
tox

When submitting a PR, be sure to create and edit a changelog fragment.

scriv create

The changelog fragment will be created in the changelog.d/ directory. Edit the file to describe the changes you've made.

listparser's People

Contributors

dependabot[bot], kurtmckee, pre-commit-ci[bot], rongronggg9

listparser's Issues

Add option to return htmlUrls?

I've just found listparser, so feel free to ignore this feature request!

I hoped to use it to parse OPML and get the htmlUrl value from each item. Am I right in saying that if an item has an xmlUrl, that is always what is returned in feeds, and any htmlUrl is ignored entirely?
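Until such an option exists, one possible workaround is to read the htmlUrl attributes directly with the standard library. The sketch below does not use listparser at all; `collect_html_urls` is a hypothetical helper, and it assumes the input is well-formed OPML whose items are `outline` elements.

```python
import xml.etree.ElementTree as ET


def collect_html_urls(opml_text):
    """Return (xmlUrl, htmlUrl) pairs for every outline that has an htmlUrl."""
    root = ET.fromstring(opml_text)
    pairs = []
    # iter() walks nested outlines too, so feeds inside folders are included.
    for outline in root.iter("outline"):
        html_url = outline.get("htmlUrl")
        if html_url:
            # xmlUrl may be None if the outline only carries an htmlUrl.
            pairs.append((outline.get("xmlUrl"), html_url))
    return pairs
```

Unlike listparser, this does no recovery from malformed documents: `ET.fromstring` raises `ParseError` on invalid XML.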

Is there a timetable for a new release?

Hi there.

The release history of listparser has been radio-silent for 7 years.

I am the author of https://github.com/Rongronggg9/RSS-to-Telegram-Bot. My project, published to PyPI under the package name rsstt, depends on listparser. I really love the current unreleased version and would like to upgrade to it. However, packages published to PyPI are not allowed to depend on packages outside PyPI, and I would rather not publish my own listparser package or bundle it inside my package unless I have to.

This might be a somewhat rude question, but is there a timetable for a new release?

Expose both `feed.title` and `feed.text`

listparser only exposes feed.title. It will be the text attribute of an outline, or, if unavailable, the title attribute.

"In practice, however, some software doesn't use the text attribute at all. Therefore, feeds[i].title is filled from the text attribute or, if that isn't available, the title attribute."

However, OPML 2.0 marks the text attribute as mandatory.

"Every outline element must have at least a text attribute, which is what is displayed when an outliner opens the OPML file."

Required attributes: type, text, xmlUrl.

As a result, if an RSS reader adheres to OPML 2.0, the exported OPML may have both title and text even if there is no user-defined feed title.

I propose that listparser exposes feed.title (keep the current behavior), feed.title_orig, and feed.text, so that the above case can be identified by determining if feed.title_orig == feed.text.
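The proposal could be sketched as follows. This is not listparser's code: `outline_titles` is a hypothetical function, `title_orig` and `text` are the field names suggested in this issue, and the fallback order follows the current behavior described above (text first, then title).

```python
import xml.etree.ElementTree as ET


def outline_titles(outline):
    """Return the three proposed fields for a single <outline> element."""
    text = outline.get("text")
    title_orig = outline.get("title")
    return {
        # Current behavior: prefer text, fall back to title when text is
        # missing or empty.
        "title": text or title_orig,
        "title_orig": title_orig,
        "text": text,
    }


outline = ET.fromstring('<outline text="Example" title="Example" type="rss"/>')
fields = outline_titles(outline)
# Equal values suggest the exporter duplicated one string into both attributes.
same = fields["title_orig"] == fields["text"]
```

With these fields a caller could treat `title_orig == text` as a hint that no user-defined title exists, which is the case the proposal aims to detect.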
