
rss-to-activitypub's Introduction

RSS to ActivityPub Converter

This is a server that lets users convert any RSS feed to an ActivityPub actor that can be followed by users on ActivityPub-compliant social networks like Mastodon.

This is based on my Express ActivityPub Server, a simple Node/Express server that supports a subset of ActivityPub.

As of the v2.0.0 release of this project, only users who are authenticated with a particular OAuth server can create feeds. Any federated user can still read the feeds. I implemented this because running this service in the open invited thousands of spammers to create feeds and overwhelm the service. With this new model, you can run this as an added bonus for people in a community like a Mastodon server, and as the person running it you are taking on only the moderation burden of the users you are already responsible for on your federated server.

Requirements

This requires Node.js v10.10.0 or above.

You also need beanstalkd running. This is a simple and fast queueing system we use to manage polling RSS feeds. See the beanstalkd project's installation instructions. On a production server you'll want to install it as a background process.

You'll also need to control some kind of OAuth provider that you can register this application on. This application was designed to work with Mastodon as that OAuth provider (see more on setting that up below), but any OAuth 2.0 provider should work. Many federated software packages besides Mastodon can act as OAuth providers, and if you want something standalone, Keycloak and ORY Hydra are two open source providers you could try.

Installation

Clone the repository, then cd into its root directory. Install dependencies:

npm i

Then copy config.json.template to config.json:

cp config.json.template config.json

Update your new config.json file:

{
  "DOMAIN": "mydomain.com",
  "PORT_HTTP": "3000",
  "PORT_HTTPS": "8443",
  "PRIVKEY_PATH": "/path/to/your/ssl/privkey.pem",
  "CERT_PATH": "/path/to/your/ssl/cert.pem",
  "OAUTH": {
    "client_id": "abc123def456",
    "client_secret": "zyx987wvu654",
    "redirect_uri": "https://rss.example.social/convert",
    "domain": "example.social",
    "domain_human": "Example Online Community",
    "authorize_path": "/oauth/authorize",
    "token_path": "/oauth/token",
    "token_verification_path": "/some/path/to/verify/token"
  }
}
  • DOMAIN: your domain! this should be a discoverable domain of some kind like "example.com" or "rss.example.com"
  • PORT_HTTP: the http port that Express runs on
  • PORT_HTTPS: the https port that Express runs on
  • PRIVKEY_PATH: point this to your private key you got from Certbot or similar
  • CERT_PATH: point this to your cert you got from Certbot or similar
  • OAUTH: this object contains properties related to OAuth login. See the section below on "Running with OAuth" for more details.
    • client_id: also known as the "client key". A long series of characters. You generate this when you register this application with an OAuth provider.
    • client_secret: Another long series of characters that you generate when you register this application with an OAuth provider.
    • redirect_uri: This is the URI that people get redirected to after they authorize the application on the OAuth server. It must point to the server where THIS service is running, and must point to the /convert page. This URI has to match what you put in the application info on the OAuth provider.
    • domain: The domain of the OAuth provider. Not necessarily the same as this server (for example, you could host this at rss.mydomain.com and then handle all OAuth through some other server you control, like a Mastodon server).
    • domain_human: The human-readable name of the OAuth provider. This will appear in various messages, so if you say "Example Online Community" here then the user will see a message like "Click here to log in via Example Online Community".
    • authorize_path: This will generally be /oauth/authorize but you can change it here if your OAuth provider uses a nonstandard authorization path.
    • token_path: This will generally be /oauth/token but you can change it here if your OAuth provider uses a nonstandard token path.
    • token_verification_path: This should be the path to any URL at the OAuth server that responds with an HTTP status code 200 when you are correctly logged in (and with a non-200 value when you are not). This is the path relative to the domain you set, so if your domain is example.social and you set token_verification_path to /foo/bar then the full URL that this service will run a GET on to verify you are logged in is https://example.social/foo/bar.
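As a rough illustration of how the OAUTH fields fit together, the sketch below derives the URLs implied by the descriptions above. The helper name buildOAuthUrls and the query parameters (the standard OAuth 2.0 authorization-code parameters) are assumptions for illustration, not code taken from this project.

```javascript
// Hypothetical helper: derive OAuth URLs from the OAUTH block of config.json.
// The query parameters are the standard OAuth 2.0 authorization-code ones;
// the project's own code may build these URLs differently.
function buildOAuthUrls(oauth) {
  const base = `https://${oauth.domain}`;

  const authorize = new URL(oauth.authorize_path, base);
  authorize.searchParams.set('response_type', 'code');
  authorize.searchParams.set('client_id', oauth.client_id);
  authorize.searchParams.set('redirect_uri', oauth.redirect_uri);

  return {
    authorizeUrl: authorize.toString(),
    tokenUrl: new URL(oauth.token_path, base).toString(),
    verifyUrl: new URL(oauth.token_verification_path, base).toString(),
  };
}

// Values copied from the config.json example above.
const exampleOauth = {
  client_id: 'abc123def456',
  redirect_uri: 'https://rss.example.social/convert',
  domain: 'example.social',
  authorize_path: '/oauth/authorize',
  token_path: '/oauth/token',
  token_verification_path: '/some/path/to/verify/token',
};

const urls = buildOAuthUrls(exampleOauth);
console.log(urls.authorizeUrl);
console.log(urls.verifyUrl);
```

Resolving each path against the domain with URL keeps the result the same whether or not the path is written with a leading slash.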

Run the server!

node index.js

Go to http://whateveryourdomainis.com:3000/convert (or whatever port you selected for HTTP), and enter an RSS feed and a username. If all goes well it will create a new ActivityPub user with instructions on how to view the user.

Sending out updates to followers

There is also a file called queueFeeds.js that needs to be run on a cron job or similar scheduler. I like to run mine once a minute. It queries every RSS feed in the database to see if there has been a change to the feed. If there is a new post, it sends out the new post to everyone subscribed to its corresponding ActivityPub Actor.
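The change check described above can be sketched as a comparison between the previously stored copy of a feed and a freshly fetched one. This is a minimal illustration and not the logic in queueFeeds.js; the item shape (guid, link, title) and the function names are assumptions.

```javascript
// Hypothetical sketch: return the items present in the fresh copy of a feed
// but absent from the previously stored copy.
// Items are identified by guid, falling back to link (an assumption).
function itemKey(item) {
  return item.guid || item.link;
}

function findNewItems(previousItems, currentItems) {
  const seen = new Set(previousItems.map(itemKey));
  return currentItems.filter((item) => !seen.has(itemKey(item)));
}

const stored = [{ guid: 'a', title: 'First post' }];
const fetched = [
  { guid: 'b', title: 'Second post' },
  { guid: 'a', title: 'First post (edited)' },
];

const newItems = findNewItems(stored, fetched);
console.log(newItems.map((item) => item.title)); // [ 'Second post' ]
```

Keying on an identifier rather than on the item's text means an edited item is not treated as a new post, at the cost of missing feeds that reuse identifiers.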

Running with OAuth

OAuth is unfortunately a bit underspecified so there are a lot of funky implementations out there. Here I will include an example of using a Mastodon server as the OAuth provider. This is how I have my RSS service set up: I run friend.camp as my Mastodon server, and I use my admin powers on friend.camp to register rss.friend.camp as an application. The steps for this, for Mastodon, are:

  • log in as an admin user
  • go to Preferences
  • select Development
  • select New Application
  • type in an application name, and the URL where this service is running
  • type in the redirect URI, which will be whatever base domain this service is running at with the /convert path appended. So something like https://rss.example.social/convert
  • uncheck all scopes, and check read:accounts (this is the minimum required access, simply so this RSS converter can confirm someone is truly logged in)
  • once you're done, save
  • you will now have access to a "client key" and "client secret" for this app.
  • open config.json in an editor
  • fill in client_id with the client key, and client_secret with the client secret.
  • set the redirect_uri to be identical to the one you put in Mastodon. It should look like https://rss.example.social/convert (the /convert part is important, this software won't work if you point to a different path)
  • set domain to the domain of your Mastodon server, and domain_human to its human-friendly name
  • leave authorize_path and token_path on their defaults
  • set token_verification_path to /api/v1/accounts/verify_credentials
  • cross your fingers and start up this server
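To illustrate what the token_verification_path setting in these steps implies, here is a sketch of the request a service could make to Mastodon's verify_credentials endpoint. The function names are hypothetical, and sending the token as a Bearer header is an assumption based on how Mastodon's API is normally authenticated; the project's actual request code may differ.

```javascript
// Hypothetical sketch: describe the GET request used to check whether a
// user's access token is valid on the OAuth provider.
function buildVerificationRequest(oauth, accessToken) {
  return {
    url: new URL(oauth.token_verification_path, `https://${oauth.domain}`).toString(),
    options: {
      method: 'GET',
      headers: { Authorization: `Bearer ${accessToken}` },
    },
  };
}

// Per the config description, only HTTP 200 counts as "logged in".
function isLoggedInStatus(statusCode) {
  return statusCode === 200;
}

const request = buildVerificationRequest(
  {
    domain: 'example.social',
    token_verification_path: '/api/v1/accounts/verify_credentials',
  },
  'hypothetical-access-token'
);
console.log(request.url);
```

The returned object can be handed to any HTTP client, for example fetch(request.url, request.options) on Node.js versions that ship a global fetch, with the response status passed to isLoggedInStatus.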

Local testing

You can use a service like ngrok to test things out before you deploy on a real server. All you need to do is install ngrok and run ngrok http 3000 (or whatever port you're using if you changed it). Then go to your config.json and update the DOMAIN field to whatever abcdef.ngrok.io domain that ngrok gives you and restart your server.

Then make sure to manually run updateFeeds.js when the feed changes. I recommend having your own test RSS feed that you can update whenever you want.

Database

This server uses a SQLite database stored in the file bot-node.db to keep track of all the data. To connect directly to the database for debugging, from the root directory of the project, run:

sqlite3 bot-node.db

There are two tables in the database: accounts and feeds.

accounts

This table keeps track of all the data needed for the accounts. Columns:

  • name TEXT PRIMARY KEY: the account name, in the form username@domain
  • privkey TEXT: the RSA private key for the account
  • pubkey TEXT: the RSA public key for the account
  • webfinger TEXT: the entire contents of the webfinger JSON served for this account
  • actor TEXT: the entire contents of the actor JSON served for this account
  • apikey TEXT: the API key associated with this account
  • followers TEXT: a JSON-formatted array of the URL for the Actor JSON of all followers, in the form ["https://remote.server/users/somePerson", "https://another.remote.server/ourUsers/anotherPerson"]
  • messages TEXT: not yet used but will eventually store all messages so we can render them on a "profile" page
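Because the followers column holds a JSON-formatted array in a TEXT field, updating it means parsing, modifying and re-serializing the value. The sketch below shows one way to do that; addFollower is a hypothetical helper and the handling of an empty column is an assumption.

```javascript
// Hypothetical sketch: add a follower's actor URL to the JSON array stored
// in the followers column, avoiding duplicates.
function addFollower(followersJson, actorUrl) {
  // An empty or NULL column is treated as an empty list (an assumption).
  const followers = followersJson ? JSON.parse(followersJson) : [];
  if (!followers.includes(actorUrl)) {
    followers.push(actorUrl);
  }
  return JSON.stringify(followers);
}

const before = '["https://remote.server/users/somePerson"]';
const after = addFollower(
  before,
  'https://another.remote.server/ourUsers/anotherPerson'
);
console.log(after);
```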

feeds

This table keeps track of all the data needed for the feeds. Columns:

  • feed TEXT PRIMARY KEY: the URI of the RSS feed
  • username TEXT: the username associated with the RSS feed
  • content TEXT: the most recent copy fetched of the RSS feed's contents

License

Copyright (c) 2018 Darius Kazemi. Licensed under the MIT license.

rss-to-activitypub's People

Contributors

dariusk, phormanns, snan, umonaca


rss-to-activitypub's Issues

Error updating feeds

Hello,
I have the following error running "node updateFeeds.js" :

/app/updateFeeds.js:21
  console.log(count, feed.feed);
                          ^

TypeError: Cannot read property 'feed' of undefined
    at Timeout.doFeed [as _onTimeout] (/app/updateFeeds.js:21:27)
    at listOnTimeout (internal/timers.js:535:17)
    at processTimers (internal/timers.js:479:7)

When I comment out console.log(count, feed.feed); on line 21, it works properly. So I guess you just forgot to comment it out before pushing.

Have a nice day :-)

Wrong RSS feed is converted

Don't send a new toot when rss feed item is updated

Hello,
It seems that some websites have the bad habit of updating their existing articles in order to "bump" them in their RSS feed.
Would it be possible not to send a new toot when an article is "updated"?
Thank you :-)

Feed pages missing attribution link for server software

As it stands, the pages for individual feeds are self-explanatory, but lack context in the form of a link to the project's GitHub page, or even a link to the landing page of the site itself. I feel those would be appropriate in order to e.g. guide those who discover an account via a boost.

"account" icon

would love to be able to set an image to the feed so as to more easily recognize the bots. could even grab the favicon from the input url and attempt to make an icon out of that.

Hasn't updated in 4 days

Hello, the RSS feed seems to have stopped updating 4 days ago. Is there any solution to that, please? I really rely on this awesome tool!

Thank you!

Not updating

The RSS feeds stopped updating 3 days ago. May I ask what happened?

Suspending/deleting an account

Hi, I have created an account from a feed that turned out to be far too noisy. Since nobody else is following it, is there a way to stop it from sending more updates and/or delete it altogether?

EDIT: In case it's not obvious: I'm running rss-to-activity on my own server, not using the demo version.

Issue with feed update

I created an account for a Twitter feed after you fixed the issue, but it doesn't seem to create new items when something new appears on the feed. Is there some delay before items are created?
For example, this account has 4 new items in its feed, but they don't appear to have been created on the profile page:
https://bots.tinysubversions.com/u/larueourien1

Send to fediverse avatar modification

Hello,
I already follow an account on my rss-to-activitypub instance. Then I decided to change the avatar of the account (it was blank) by updating the "actor" field in "accounts". But it hasn't changed on Pleroma or Mastodon.
I think avatar modifications are never sent to the Fediverse, so my Pleroma instance can't know that the avatar has changed.
If it's not too much work, would it be possible to implement this?
Thank you @dariusk, I use everyday your amazing piece of software. It was something I wanted for a long time without being able to tell how to do it properly (nor be able to develop it) and then you did it !

Content:encoded Support?

It seems that there are two ways to keep the text content in RSS feeds. Most of them use the <description> tag, but some of them still use <content:encoded>.

Reference: https://developer.mozilla.org/en-US/docs/Archive/RSS/Article/Why_RSS_Content_Module_is_Popular_-_Including_HTML_Contents

It seems that this service handles the <description> tags only. Is it possible to support the <content:encoded> way as well? Maybe the template is like this:

item.link and item.title
item.description
------
item.content:encoded (if it exists)
------
attachments (images?) // audio support is on the way: 
https://github.com/tootsuite/mastodon/pull/9480

Well, I come to request this feature because one of my feeds, https://bots.tinysubversions.com/u/theinitium, is not showing the full text content in Mastodon, and here is a formatted sample: feeds.txt.

And by the way, thanks for your work! It's quite convenient for RSS users!
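The fallback requested in this issue could look roughly like the sketch below. It is not the project's code: pickItemBody is a hypothetical helper, and the property names assume a parser that exposes RSS elements under their tag names, which depends on the RSS library in use.

```javascript
// Hypothetical sketch: prefer the full content:encoded body when a feed item
// has one, and fall back to description otherwise.
function pickItemBody(item) {
  return item['content:encoded'] || item.description || '';
}

const shortItem = { title: 'A', description: 'Summary only' };
const fullItem = {
  title: 'B',
  description: 'Summary',
  'content:encoded': '<p>Full article text</p>',
};

console.log(pickItemBody(shortItem)); // Summary only
console.log(pickItemBody(fullItem)); // <p>Full article text</p>
```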

Add (temporary) federation support of status link

Motivation

Sometimes you boost an interesting RSS post and hope that your followers can see it...

I know that this feature will consume more resources on server side, so federation support of each link can be temporary. For example, you can set federation support of each link available for 12 hours.

BTW, I don't know how Mastodon fetches a new status from a new server. Maybe it just uses WebFinger, with the query string changed to the status link ...

Follow request from Mastodon hangs

While the account is created and the RSS feed does update, it seems that the follow request does hang.
In Mastodon I searched for the account, clicked on "follow" and at first it seems, that the follow request does succeed. But several minutes later, when I search the account again, it shows me the hourglass symbol. Although the RSS feed has been updated in the meantime, no messages are visible.
One example is https://bots.tinysubversions.com/u/ndaktuell/

Publish items as Articles

As I'm using an article-enabled Mastodon instance, it would be nice if RSS posts showed up as Articles for me, instead of Notes.

(I don't really know enough about the details of ActivityPub to specify this better.)

Crontab : queueFeeds.js or updateFeeds.js

Hello @dariusk
Previously, I used updateFeeds.js in my crontab. Then you added the queue system, and I see in the changelog that we have to create a cron job to run queueFeeds.js.
Nonetheless, when I run queueFeeds.js I just get "!!! 69" and no other output, and nothing seems to happen next.
When I run updateFeeds.js, I get the same output as before, and it works properly.
So, what's the right thing to do?
I think in both cases it queries beanstalkd, because I don't get "ECONNREFUSED" anymore.
Thank you

ReferenceError: URL is not defined

➜  rss-to-activitypub (master) ✔ node index.js
No SSL key and/or cert found, not enabling https server
Express server listening on port 3000

{ feed: 'https://varia.zone/feeds/all-nl.rss.xml',
  username: 'potato' }
VALIDATING
Varia
end!!!!
/home/decentral1se/varia/rss-to-activitypub/routes/api.js:70
    let favUrl = new URL(feed);
                 ^

ReferenceError: URL is not defined
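One possible cause, which is an assumption because the issue does not state the Node.js version, is an older runtime: URL only became a global in Node.js v10, which is also why the requirements above ask for v10.10.0 or later. On older versions the class can be taken from the built-in url module, as in this sketch. The faviconUrlFor helper is hypothetical and only stands in for the kind of code that failed.

```javascript
// On Node.js versions before v10, URL is not a global; it can be taken from
// the built-in url module instead. This also works on newer versions.
const url = require('url');

// Hypothetical helper: derive a favicon location from a feed URL.
function faviconUrlFor(feed) {
  const feedUrl = new url.URL(feed);
  return `${feedUrl.origin}/favicon.ico`;
}

console.log(faviconUrlFor('https://varia.zone/feeds/all-nl.rss.xml'));
```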

Add generate-rsa-keypair to dependencies.

generate-rsa-keypair is not installed by npm i

resulting in

node index.js

internal/modules/cjs/loader.js:657
throw err;
^

Error: Cannot find module './build/Release/generate-rsa-keypair.node'
at Function.Module._resolveFilename (internal/modules/cjs/loader.js:655:15)
at Function.Module._load (internal/modules/cjs/loader.js:580:25)
at Module.require (internal/modules/cjs/loader.js:711:19)
at require (internal/modules/cjs/helpers.js:14:16)
at Object.<anonymous> (/usr/home/RSS/rss-to-activitypub/node_modules/generate-rsa-keypair/index.js:1:18)
at Module._compile (internal/modules/cjs/loader.js:805:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:816:10)
at Module.load (internal/modules/cjs/loader.js:672:32)
at tryModuleLoad (internal/modules/cjs/loader.js:612:12)
at Function.Module._load (internal/modules/cjs/loader.js:604:3)

Docker image?

Seems like an ideal candidate for a micro-service run though Docker. Much appreciated if this could be included officially.

Error starting node

Hello, running "node index.js" I have the following error :

module.js:550
    throw err;
    ^

Error: Cannot find module './message'
    at Function.Module._resolveFilename (module.js:548:15)
    at Function.Module._load (module.js:475:25)
    at Module.require (module.js:597:17)
    at require (internal/module.js:11:18)
    at Object.<anonymous> (/app/routes/index.js:6:12)
    at Module._compile (module.js:653:30)
    at Object.Module._extensions..js (module.js:664:10)
    at Module.load (module.js:566:32)
    at tryModuleLoad (module.js:506:12)
    at Function.Module._load (module.js:498:3)

Thank you :-)

Small improvement to disable item title in content sent

Hello,
I just made a little tweak that stops the item title from being sent. I'm using "rss-bridge" to follow Twitter users through RSS feeds. However, the bridge uses part of the tweet content as the title.

Consequently, when using rss-to-activitypub you get something like this:
image

You have twice the same tweet.

So I changed a little bit the file updateFeeds.js, replacing :

 let message = `<p><a href="${item.link}">${item.title}</a></p><p>${item.content || ''}</p>`;

by

          let message;
          if(item.link.match('/twitter.com/')) {
             message = `${item.content}`;
          }
          else {
             message = `<p><a href="${item.link}">${item.title}</a></p><p>${item.content || ''}</p>`;
          }

So if the item links to twitter.com, only the content of item is sent to Mastodon. See how it is rendering :

image

I don't know if it would be useful in the code, but I thought somebody might be interested in doing the same thing.

I know this isn't a good place to share a tweak and I'm sorry; I didn't know where else to put it.

Have a nice day

Newlines not preserved in some circumstances (lists?)

I've observed some discrepancies in the server's formatting of new RSS entries vs. what appears in an RSS reader. Namely, the text preview lacks newlines where they exist in the original text. I think this may only occur with lists, but I don't have other feeds' outputs on hand to verify.

The view of a particular article in my RSS reader:

A screenshot of a Seattle Transit Blog article titled "News Roundup: Long Distance", featuring a picture of an Amtrak train, followed by a bulleted list of news headlines

The corresponding output from the server:

The same article entry as published by the server, with the bulleted list collapsed into one large, unreadable paragraph

Image Elements as Attachments?

Thanks for your work. But is it possible to send all images as attachments in a toot?

I also find that this tool converts all mp3 files to links. Is it possible to convert all other elements that Mastodon does not support to links (e.g. <video>, <audio>, <img>, <iframe>, <embed>)?

By the way, I hope this site can add search by username or RSS link, since posts are not shown in the public timeline, and if someone boosts a message, only their followers on the same instance can see it ...

Remove redundant line breaks

Motivations

Some of my feeds are converted by RSSHub from weibo.com. For a better experience in RSS readers, RSSHub appends 2 <br> tags to each image, which are converted to trailing line breaks when federated to Mastodon ...

Here is an example: https://bots.tinysubversions.com/m/f2e894bf8e3fc844fb3808bbc639fe0e .

How to implement it, for example

  1. Replace image elements with links.
  2. Remove all trailing spaces and <br> tags at the end.
  3. If more than one <br> tag appears in a row, remove the redundant <br> tags.
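The three steps above can be sketched with regular expressions. This is only an illustration of the proposal: tidyItemHtml is a hypothetical name, and regex-based HTML handling is fragile, so a real implementation would more likely use an HTML parser.

```javascript
// Hypothetical sketch of the three proposed steps.
function tidyItemHtml(html) {
  return html
    // Step 1: replace <img> elements with links to the image source.
    .replace(/<img\b[^>]*?\bsrc="([^"]*)"[^>]*>/gi, '<a href="$1">$1</a>')
    // Step 3: collapse runs of consecutive <br> tags into a single one.
    .replace(/(?:<br\s*\/?>\s*){2,}/gi, '<br>')
    // Step 2: remove trailing whitespace and <br> tags at the end.
    .replace(/(?:\s|<br\s*\/?>)+$/gi, '');
}

const sample = 'Hello<br><br><img src="https://example.com/a.jpg"><br><br>  ';
console.log(tidyItemHtml(sample));
```

The trailing cleanup runs last so that any line breaks left over after the image replacement are removed as well.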

Improve error message for incorrect URLs

When pasting in an incorrect URL to a feed (for example, with "view-source:" still prepended), the error message is "Bad Request." This isn't helpful, especially if the typo/etc is at the front of the URL, which may be invisible because of the width of the text box. A more helpful message for "I couldn't find a feed at that URL" would be useful. See #8.

Convert links of media attachments to https

Today I tried to convert some podcasts to ActivityPub, and it seems that feeds from Japanese podcasts tend to use http in the audio enclosure instead of https (although they do support https), which causes 422 errors in Mastodon's media proxy.

Here is the sample podcast: https://bots.tinysubversions.com/u/tfm1 .

Hope it can be fixed asap 😭


BTW, podcast feeds often use content:encoded elements, too. Wish #17 can be fixed accordingly.
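The rewrite requested here could be as small as the sketch below. upgradeToHttps is a hypothetical helper, and it assumes the host really serves the same file over HTTPS, which the issue says holds for these podcasts but is not guaranteed in general.

```javascript
// Hypothetical sketch: rewrite an http:// enclosure URL to https://.
// URLs with any other scheme are returned unchanged (after normalization).
function upgradeToHttps(enclosureUrl) {
  const parsed = new URL(enclosureUrl);
  if (parsed.protocol === 'http:') {
    parsed.protocol = 'https:';
  }
  return parsed.toString();
}

console.log(upgradeToHttps('http://example.com/episodes/1.mp3'));
```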

(Pre)Fetch old rss items?

Is it possible to load / prefetch (x) old rss items? Would be nice for example to view a not empty profile or also have old items searchable?

rss-to-activitypub running but unable to follow

Hi there. I have installed rss-to-activitypub on a Raspberry Pi at home. After some initial struggles (I had to downgrade nodejs to v10) it seems to be running. I can enter an RSS feed and it gets converted to AP. I can view the account and its posts from my Pleroma account. Yay! However, I seem to be unable to follow the feed. When I try to follow the feed from my Pleroma account I see the status "Request sent". It stays that way, even after waiting an hour. I do see ["https://server.domain.tld/users/username"] added to the accounts table, so apparently the follow request gets registered by rss-to-activitypub.

I can imagine two causes:

  1. Rss-to-activitypub runs on the same machine as my Pleroma install (it gets called under a different hostname). Since both Pleroma and rss-to-activitypub have to speak ActivityPub, I can imagine this not working?
  2. I have rss-to-activitypub proxied behind nginx. Perhaps this can lead to this problem?

I hope someone can comment on these guesses? Or can there be another cause here?

Hard-coded domain name in HTML

The URL https://bots.tinysubversions.com/u/… is hard coded in public/convert/index.html. Line 77 in the version I have, message guarded by if (myJson.title). Also, https even though I haven't given it a certificate so HTTPS is turned off.

Not a problem for me personally; I'm just playing with it to help get my head round ActivityPub. Thanks for the help with that.
