
pubsubhubbub's Introduction

PubSubHubbub

IMPORTANT NOTE: The PubSubHubbub protocol has now been adopted by the W3C and published as a Recommendation. It's also been renamed WebSub for clarity and concision. Please consider upgrading all older PubSubHubbub implementations to WebSub.

PubSubHubbub is an open protocol for distributed publish/subscribe communication on the Internet. It generalizes the concept of webhooks and allows data producers and data consumers to work in a decoupled way.

PubSubHubbub provides a way to subscribe, unsubscribe and receive updates from a resource, whether it's an RSS or Atom feed or any web accessible document (JSON...).

The current version of the spec is 0.4. Please read it here.

Open hubs are provided by:

Several other publishing platforms, like WordPress, include their own hubs.

If you're looking for tutorials on how to get started with PubSubHubbub, check the links below:

pubsubhubbub's Issues

Feature Request: YouTube push notifications for user activity on my channel

I'd like to request that the webhook API be expanded to include notifications for user activity on an authorized channel. This would include new subscriptions with the username/display name of the subscriber (and email if possible), and any other user interaction on the channel that I subscribe to through the webhook notification, such as comments, likes, and shares. Thanks.

Include Content-* header fields in authenticated distribution signature

When all payloads are implied to be Atom/RSS, signing the payload is sufficient for authenticating the message because the HTTP headers are completely ignored.

As we move to a mode of arbitrary content distribution, there can be some corner-cases where the meaning of a notification can be changed by altering the headers while retaining the payload. I'd concede that this will be rare, but I believe we should plug this obvious "hole" in the authenticated distribution protocol as part of introducing arbitrary content types.

My proposal is that we define a new signature base string that incorporates:

  • every header field whose name starts with Content-
  • the content of the entity-body, including any content-encoding that has been applied, but not including any transfer-encoding applied to the message-body. (i.e. the same bytestream that would be used to compute the Content-MD5 HTTP header)
  • a nonce

The format of the header that provides this information must then change to incorporate the nonce, which would also be a good opportunity to either rename it to a non-experimental name (no X- prefix) or recast it as an HTTP authentication scheme using the Authorization header field.
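
As a rough sketch of the proposal above, the base string could be assembled like this. The issue does not fix a serialization or a hash function, so the line layout, the lowercased header names, the `nonce:` line, and the use of HMAC-SHA256 are all assumptions made for illustration.

```python
import hashlib
import hmac


def signature_base(headers: dict, body: bytes, nonce: str) -> bytes:
    """Build a hypothetical base string: Content-* headers, nonce, then body."""
    # Keep only header fields whose name starts with "Content-" (case-insensitive)
    # and sort them so hub and subscriber serialize them identically.
    content_headers = sorted(
        (name.lower(), value.strip())
        for name, value in headers.items()
        if name.lower().startswith("content-")
    )
    lines = [f"{name}:{value}" for name, value in content_headers]
    lines.append(f"nonce:{nonce}")
    # body is the entity-body as transmitted (content-encoding applied,
    # transfer-encoding removed).
    return "\n".join(lines).encode("utf-8") + b"\n\n" + body


def sign(secret: bytes, headers: dict, body: bytes, nonce: str) -> str:
    """Return a hex HMAC over the base string (SHA-256 is an assumption)."""
    base = signature_base(headers, body, nonce)
    return hmac.new(secret, base, hashlib.sha256).hexdigest()
```

With this layout, altering `Content-Type` while keeping the payload changes the signature, whereas headers outside the `Content-*` family do not affect it.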

Specify how publishers notify hubs

Version 0.4 of the spec only defines how subscribers and hubs interact with each other, but not how publishers notify the hub about updates.

It is crucial that this is specified, too - otherwise it is not possible for publishers to switch hubs without modifications to the code.


Superfeedr uses a POST to the hub URL with hub.mode=publish and hub.url=$url_that_was_updated - see http://documentation.superfeedr.com/publishers.html

Google's hub does the same; https://pubsubhubbub.appspot.com/


Hubs can still implement other ways of notifications, but they all should support a standardized way to be notified.
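
A minimal sketch of the publish ping described above, assuming the hub accepts a form-encoded POST with `hub.mode=publish` and `hub.url`. The function name and the example URLs in the usage note are placeholders, and the request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_publish_ping(hub_url: str, topic_url: str) -> Request:
    """Build (but do not send) a POST telling the hub that topic_url changed."""
    body = urlencode({"hub.mode": "publish", "hub.url": topic_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen`, for example `urlopen(build_publish_ping("https://hub.example/", "https://example.com/feed.xml"))`, would send the ping. If every hub accepted this one form, switching hubs would only require changing `hub_url`.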

Fallback to 0.3 discovery

It would be nice if discovery could fall back to the 0.3 mechanism for certain data types, if the expected headers aren't there.

  • For Atom (application/atom+xml), look for <link rel="hub"> in the feed.
  • For RSS (application/rss+xml), look for <atom:link rel="hub"> in the feed.
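
The two fallback rules above could be covered by a single lookup, since both use a link element in the Atom namespace. This is a sketch only: the function name is made up, and it assumes the feed body has already been fetched and comes from a source you are willing to parse with the standard library XML parser.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def find_hub_links(feed_xml: str) -> list:
    """Return the href of every Atom-namespace link element with rel="hub"."""
    root = ET.fromstring(feed_xml)
    # root.iter() walks the whole tree, so this finds <link> directly under an
    # Atom <feed> as well as <atom:link> inside an RSS <channel>.
    return [
        link.get("href")
        for link in root.iter(f"{ATOM_NS}link")
        if link.get("rel") == "hub" and link.get("href")
    ]
```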

Silent Rate-Limiting by the Google PubSubHubbub Hub?

[This is not an issue with the spec but with Google's PubSubHubbub Hub. Please let me know if there is a more appropriate place for my question.]

I'm publishing multiple feeds and I've noticed that the Google PubSubHubbub Hub seems to silently ignore all but ~ 10 randomly chosen URLs per POST request. It always replies with status code 204 which makes this issue hard to debug. I've tried sending the POST requests in batches of 10 URLs but that does not help. Sleeping for 20 (!) seconds between batches seems to help but I'll have to run more tests.

Are those rate-limits documented somewhere? Or can they be lifted? Does Superfeedr have less strict rate-limiting?

Google Hub's Subscriber Diagnostics Seems Down

I'm not sure whether this is the right place to report this issue,
but the Google Hub's "Subscriber Diagnostics" seems to be down from here,
although other parts of the hub still work fine.
Diagnostics is still in heavy rotation during the client development stage,
so can anyone help fix this, or tell me where I can report this issue?
Thanks.

Retain hub.verify_token

I think it's a mistake to remove the hub.verify_token parameter.

This identifier lets a subscriber know that the caller who's verifying a subscription is the same party they asked for a subscription in the first place.

An attacker could test a subscription callback URL to see if there are outstanding subscription requests for a list of well-known topics, and if any of them are a "hit", it could start sending false or abusive updates to the callback.

Section 5.1.1

Suggestion: Change the text in Section 5.1.1 to say "The topic URL MUST be the one advertised by the publisher during the discovery phase."

Section 5.1.1 currently says "The topic URL MUST be the one advertised by the publisher in a Self Link Header during the discovery phase."

However, section 4 says that the topic URL can be specified in either the Link header or a tag in the document body, such as an HTML link tag. This change makes the sections consistent.

Logo

The group needs the logo as its avatar!

PubSubHubbub Core 0.4: Fat pings vs. normal pings

It currently only seems to be possible that the hub sends an update
notification to the subscriber that contains the (full) content of the
subscribed resource/topic (fat ping).

Sending only the POST request without any content (light ping) seems to
be a violation of the spec.

Is this intentional?
Shouldn't hubs be able to send light pings?

Recommend HEAD for discovery

For large resources, retrieving the entire resource just to get the headers is inefficient.

It may make sense to recommend using HEAD to check for HTTP headers.

If they're not there, and the Content-Type is one of the fallback 0.3 types, use a GET and check the body for hub links.
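
The HEAD-then-GET flow suggested here might look like the sketch below. It works on a plain dict of response headers so that it stays independent of any HTTP client. The `Link` parser is deliberately simplified (it is not a full RFC 8288 implementation), it assumes multiple `Link` headers were joined with commas, and the step names `done`, `get-body`, and `none` are invented for illustration.

```python
import re

FALLBACK_TYPES = {"application/atom+xml", "application/rss+xml"}
# Simplified: matches <url>; ...rel="value" and ignores other parameters.
_LINK_RE = re.compile(r'<([^>]*)>\s*;[^,]*?\brel="?([^",;]+)"?', re.IGNORECASE)


def parse_link_header(value: str) -> dict:
    """Map each rel value to the list of URLs that carry it."""
    links = {}
    for url, rels in _LINK_RE.findall(value):
        for rel in rels.split():
            links.setdefault(rel.lower(), []).append(url)
    return links


def next_step(headers: dict) -> str:
    """Decide what to do after a HEAD request: 'done', 'get-body', or 'none'."""
    lowered = {name.lower(): value for name, value in headers.items()}
    links = parse_link_header(lowered.get("link", ""))
    if "hub" in links and "self" in links:
        return "done"  # discovery succeeded from headers alone
    media_type = lowered.get("content-type", "").split(";")[0].strip().lower()
    # Only the 0.3 fallback types are worth a full GET to inspect the body.
    return "get-body" if media_type in FALLBACK_TYPES else "none"
```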

Looking for server code

Hi, I couldn't find a better place to post this. All the code.google.com repo links for setting up a PubSubHubbub reference hub server are no longer working, and I couldn't find mirrors anywhere; all links lead here.
While there are alternatives, I'm wondering whether any copy of the reference implementation still exists.

Do we need arbitrary data types?

I want to double-check that we're doing the right thing by supporting arbitrary data types.

Atom, RSS 1.0, RSS 2.0 and Activity Streams are all feed formats that represent changes. That's the whole point of these syndication formats.

Changes to other kinds of documents and resources -- images, HTML pages, video and audio -- may be better represented with these document types.

So, rather than using PuSH for a file like http://example.net/images/file.png, instead have a related Atom feed (or JSON feed, or RSS feed) that represents the history of that file.

PuSH 0.4 recommends old SHA1 signatures

Right now the spec says signatures for authed pings must be SHA1. http://pubsubhubbub.github.io/PubSubHubbub/pubsubhubbub-core-0.4.html#authednotify

Given that SHA1 is deprecated, it would seem a new solution is needed for the spec. I'm not sure of the best step forward, since simply updating it to use SHA256 will likely encounter the same problem in a few years. Maybe we could go the route that JWT took, where another property indicates the signature method, so the spec doesn't have to change to support new crypto functions? On the other hand, that would seem to lead to less interoperable solutions, since clients couldn't guarantee availability of a specific signature method.
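
To make the "property that indicates the signature method" idea concrete, here is a sketch in which the signature value itself names the method, as in `sha256=<hexdigest>`. The `method=hexdigest` layout and the allowlist contents are assumptions, not something the spec defines. The allowlist is also where the interoperability concern shows up, because a subscriber rejects any method it does not list.

```python
import hashlib
import hmac

# Methods this (hypothetical) subscriber is willing to accept.
ALLOWED = {"sha256": hashlib.sha256, "sha512": hashlib.sha512}


def make_signature(secret: bytes, body: bytes, method: str = "sha256") -> str:
    """Hub side: sign the body and prefix the digest with the method name."""
    digest = hmac.new(secret, body, ALLOWED[method]).hexdigest()
    return f"{method}={digest}"


def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Subscriber side: accept only allowlisted methods, compare in constant time."""
    method, sep, received = header_value.partition("=")
    digestmod = ALLOWED.get(method.strip().lower())
    if not sep or digestmod is None:
        return False
    expected = hmac.new(secret, body, digestmod).hexdigest()
    return hmac.compare_digest(expected.encode(), received.strip().lower().encode())
```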

Make PuSH a "living" spec

Let's make the PuSH spec a living spec that can take over after 0.4. This will allow future work to be done more rapidly, allowing it to keep up with other specs.

Subscription Response Details regarding Validation

I have been reading the spec and the issues around but I just didn't find what I was looking for. If it has already been covered, then I apologize.

Would it be fair to expect a 4xx response from the hub when subscription validation fails? Or will the hub respond with a 202 regardless? The spec does mention that the hub will respond with 4xx if errors are found, but in that case why wouldn't it send the reason in the response to the subscription request instead of using the callback? On the other hand, if hubs do send a 202 status code and might later send a subscription denial notice, what options do subscribers have to anticipate such an event? It feels like the subscriber will mark the subscription as valid once it gets the 202 code and possibly invalidate it if a hub.reason ever reaches the callback URL, which will trigger both events (subscription and unsubscription).

Either way, if validation (and possibly verification of intent) fails and hubs do respond with 4xx status codes, don't those failures deserve a specific response status code during subscription?

Create a publisher compliance test.

This test should accept any URL and perform the following tests:

  • It should include a self link header
  • It should include a hub link header
  • The designated hub should allow for subscriptions
  • The designated hub should allow for unsubscriptions

There are more tests but I'm not sure how we can handle them:

  • The hub should ping the subscriber when the resource updates
  • The hub should check with the publisher if the subscriber can subscribe

PubSubHubbub Core 0.4: Validation vs. verification of intent

I found it hard to wrap my head around that there is a difference
between "validation" and "verification of intent", since I thought that
verifying the intent by calling the callback is a kind of validation
already.

Maybe it would help if an example for a validation rule/process would be
added to the spec.

clean up the repo

I think it would be more helpful if the pubsubhubbub folder only included the different specs, because the code is a bit old and this could be a bit confusing. What about asking some people to move their libs to this repo to have a central code space (like the openid repo https://github.com/openid)?

Recommend secret callback as alternative to hub.secret

The main reason for signatures in content distribution is to prevent third parties from posting false data to the subscriber's callback URL.

For those subscribers that don't want to deal with HMAC signatures and all that jazz, it may be easier to simply provide a unique, secret callback per subscription.

So, instead of:

hub.callback=https://example.com/callback

...use:

hub.callback=https://example.com/<mybigsecret>/callback

...or:

hub.callback=https://example.com/callback?confirm=<mybigsecret>

Because this is vulnerable to MITM attacks (an observer who sees the subscription or verification calls could post false data to the callback), it's important to use HTTPS for subscription and verification.
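
A small sketch of the path-based variant, assuming one random token per subscription that the subscriber stores and checks on every incoming request. The function names and the 32-byte token length are illustrative choices, not part of the proposal.

```python
import hmac
import secrets


def new_callback_url(base: str = "https://example.com") -> tuple:
    """Create a per-subscription secret and the callback URL that embeds it."""
    token = secrets.token_urlsafe(32)  # unguessable, URL-safe
    return token, f"{base}/{token}/callback"


def is_expected_callback(request_path: str, token: str) -> bool:
    """Check an incoming request path against the stored token in constant time."""
    expected = f"/{token}/callback"
    return hmac.compare_digest(request_path.encode(), expected.encode())
```

The token is only as secret as the channel it travels over, which is why the HTTPS requirement above still applies.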

Extract Publisher -> Hub updates to its own document

PubSubHubbub 0.3 defined a mechanism for publishers to ping the hub. This is no longer defined in 0.4.

It may be nice to extract that and give it its own brief document, for those hubs that do implement it.

Use PUT vs PATCH to distinguish diff vs. full payload

Currently it's left up to out-of-band information to differentiate between a publishing request that is a full payload or a diff. I think a hub should be able to switch between a full payload and a diff on a per-notification basis for the following reasons:

  • Sometimes a resource changes to such a great extent that it is more efficient to send the full payload than to send a diff.
  • If the subscriber does not have the resource to which the diff should apply, the hub can fall back on a full payload to reset the resource state.

If we switch from using POST to PUT and PATCH then we can make use of existing HTTP semantics to produce completely self-contained notifications that don't rely on outside information to be understood.

To be concrete:

  • a full content notification would be a PUT on the notification URL with a Content-Type describing the format of the payload.
  • a diff notification would be a PATCH on the notification URL with a Content-Type describing the patch format.
  • For diff formats that are riskier to apply to an unknown resource state, ETag/If-Match can be used by the hub to help the subscriber recognize the mismatch.
  • Where there is a resource state conflict, a hub can fall back on a PUT to reset the resource state for subsequent PATCH requests. This requires an in-band distinction between full payload and diff so that the hub can switch modes on the fly as required.
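
On the subscriber side, the points above could be handled roughly as follows. This is a sketch under several assumptions: the resource is a JSON object held as a dict, the patch format is a shallow key merge (a stand-in for whatever patch format is agreed out-of-band), and the ETag is derived from a hash of the current state.

```python
import hashlib
import json
from typing import Optional


def etag_of(state: dict) -> str:
    """Derive a quoted ETag from the canonical JSON form of the state."""
    canonical = json.dumps(state, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(canonical).hexdigest()[:16] + '"'


def handle_notification(
    method: str, state: dict, payload: dict, if_match: Optional[str] = None
) -> tuple:
    """Return (HTTP status, new state) for one incoming notification."""
    if method == "PUT":
        # Full payload: replace whatever state we had.
        return 204, dict(payload)
    if method == "PATCH":
        # Diff: refuse it when the hub's idea of our state is stale.
        if if_match is not None and if_match != etag_of(state):
            return 412, state
        return 204, {**state, **payload}
    return 405, state
```

A 412 response tells the hub that the diff did not apply, at which point it can fall back on a PUT to reset the subscriber's state.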

As a backward compatibility shim for 0.3, I'd suggest that a notification using the POST method be interpreted as a PATCH whose payload is either an Atom or an RSS feed, but any other content types MUST be sent as either PUT or PATCH.

For best results we could try to define a patch format negotiation protocol, but initially I think this could be left out-of-band as there is likely to only be one or two appropriate patch formats for a given resource and subscribers are generally tailored to a particular resource type.

Backwards compatibility with PubSubHubbub 0.3

So, I'd like to call this out as a separate issue.

Compatibility between 0.3 and 0.4 clients is extremely important to me. We have tens of thousands of installations of StatusNet on the Web, and it's not easy for us to upgrade them all at once. I don't want our interconnection to end when sites supporting 0.4 go online.

I realize that the vast majority of feeds running PubSubHubbub are running on one of two hubs, but there are also clients that support 0.3 and lots of smaller hubs.

Allowing redirects for hubs

Hubs should be able to redirect subscribers and publishers to another hub:
It's likely that hubs will die (some have died already), and if things are really decentralized, so that the hub does not know about the publishers pointing to it (which is likely and happens a lot), then we need a way for the hub to tell subscribers it has moved.

PubSubHubbub Core 0.4: Acceptance of a subscription request

Does the client/subscriber have to see a subscription as being
"accepted" until he gets a "denied" state on his callback URL?

Why is there no "subscription accepted" call to the callback URL?

Also, why does the callback not get the signature passed for requests
with hub.mode=denied?
Bad people could fake unsubscription confirmations without it.

PubSubHubbub Core 0.4: Verifying during subscription request

Is it allowed to do the validation and verification of intent
process while the subscription request is running?

This would imply that the verification of intent callback and the
denied-callback are called while the subscription request is still
being answered.

That would make it possible to skip all the worker setup that's
necessary to do delayed validation and verification.

Process diagram for inline validation:

client                     server/hub
 +---[subscription request]-->-----+
 |                                 |
 |  +<--[verification of intent]---+
 |  |
 |  +--[verification response]-->--+
 |                                 |
 |  +<--[denied callback]----------+
 |  |
 |  +----[response]------------->--+
 |                                 |
 +--<--[subscription response]-----+

This would also make it possible to not call the callback at all but
tell the denied-state directly in the response of the subscription
request. But that does not seem to be planned at all, right?
