pubsubhubbub / pubsubhubbub
The PubSubHubbub protocol specification.
Home Page: http://pubsubhubbub.github.io/PubSubHubbub
So, I'd like to call this out as a separate issue.
Compatibility between 0.3 and 0.4 clients is extremely important to me. We have tens of thousands of installations of StatusNet on the Web, and it's not easy for us to upgrade them all at once. I don't want our interconnection to end when sites supporting 0.4 go online.
I realize that the vast majority of feeds running PubSubHubbub are running on one of two hubs, but there are also clients that support 0.3 and lots of smaller hubs.
If the "verification of intent" fails, does the hub have to inform the
subscriber (hub.mode=denied)?
Or more broadly, does a failed verification count as "denied"?
The main reason for signatures in content distribution is to prevent third parties from posting false data to the subscriber's callback URL.
For those subscribers that don't want to deal with HMAC signatures and all that jazz, it may be easier to simply provide a unique, secret callback per subscription.
So, instead of:
hub.callback=https://example.com/callback
...use:
hub.callback=https://example.com/<mybigsecret>/callback
...or:
hub.callback=https://example.com/callback?confirm=<mybigsecret>
Because this is vulnerable to MITM attacks (an observer who sees the subscription or verification calls could post false data to the callback), it's important to use HTTPS for subscription and verification.
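A minimal sketch of the secret-callback idea above, from the subscriber's side. The helper names and the choice of a path segment (rather than a query parameter) are assumptions for illustration only; the base URL is the one from the example.

```python
import hmac
import secrets

BASE_URL = "https://example.com"  # base URL taken from the example above


def new_callback_url(base_url=BASE_URL):
    """Return (secret, callback_url); the secret is an unguessable path segment."""
    # token_urlsafe only emits URL-safe characters, so it never contains "/".
    secret = secrets.token_urlsafe(32)
    return secret, f"{base_url}/{secret}/callback"


def is_expected_callback(request_path, expected_secret):
    """Check that an incoming request path carries the secret stored for the subscription."""
    parts = request_path.strip("/").split("/")
    if len(parts) != 2 or parts[1] != "callback":
        return False
    # Constant-time comparison, so response timing does not leak the secret.
    return hmac.compare_digest(parts[0].encode(), expected_secret.encode())
```

The subscriber would store the secret next to the subscription record and reject any POST whose path does not match. As noted above, this only helps if subscription and verification go over HTTPS.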
Version 0.4 of the spec only defines how subscribers and hubs interact with each other, but not how publishers notify the hub about updates.
It is crucial that this is specified, too - otherwise it is not possible for publishers to switch hubs without modifications to the code.
Superfeedr uses a POST to the hub URL with hub.mode=publish
and hub.url=$url_that_was_updated
- see http://documentation.superfeedr.com/publishers.html
Google's hub does the same; https://pubsubhubbub.appspot.com/
Hubs can still implement other ways of notifications, but they all should support a standardized way to be notified.
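For reference, the publish ping described above could be built roughly like this. It is only a sketch: the function name is made up, and the request is constructed but not sent, so it can be inspected without contacting any hub.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_publish_ping(hub_url, topic_url):
    """Build (but do not send) a POST with hub.mode=publish and hub.url=<updated topic>."""
    body = urlencode({"hub.mode": "publish", "hub.url": topic_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# To actually send it: urllib.request.urlopen(build_publish_ping(hub, topic))
```

If this form were standardized, switching hubs would only mean changing the hub_url argument.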
https://pubsubhubbub.appspot.com/
has stopped working for my Blogger blog: there is no automatic publishing anymore.
This started happening after Blogger's update to Blogger v2.
I found it hard to wrap my head around that there is a difference
between "validation" and "verification of intent", since I thought that
verifying the intent by calling the callback is a kind of validation
already.
Maybe it would help if an example of a validation rule/process were
added to the spec.
I'm not sure what the difference is, but we should probably consider using the already-standard Content-Location header rather than a self-URL.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#Content-Location
This test should accept any url and perform the following test:
There are more tests but I'm not sure how we can handle them:
For large resources, retrieving the entire resource just to get the headers is inefficient.
It may make sense to recommend using HEAD to check for HTTP headers.
If they're not there, and the Content-Type is one of the fallback 0.3 types, use a GET and check the body for hub links.
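A rough sketch of that HEAD-first flow. Everything named here is an assumption: `fetch` is a hypothetical callable supplied by the caller, the list of fallback 0.3 types is a guess (Atom and RSS), and the regular expressions are deliberately simplistic rather than a real Link-header or XML parser.

```python
import re

# Assumed set of "fallback 0.3" media types; the spec text may list others.
FALLBACK_TYPES = {"application/atom+xml", "application/rss+xml"}

HEADER_HUB_RE = re.compile(r'<([^>]+)>\s*;\s*rel="?hub"?', re.I)
BODY_HUB_RE = re.compile(r'<(?:atom:)?link\b[^>]*\brel=["\']hub["\'][^>]*>', re.I)
HREF_RE = re.compile(r'\bhref=["\']([^"\']+)["\']', re.I)


def discover_hub(url, fetch):
    """Return the hub URL or None.

    fetch(method, url) is a hypothetical callable returning
    (headers, body), where headers is a dict with lowercase keys.
    """
    # Step 1: HEAD only, so large resources are not downloaded.
    headers, _ = fetch("HEAD", url)
    match = HEADER_HUB_RE.search(headers.get("link", ""))
    if match:
        return match.group(1)

    # Step 2: fall back to GET only for the feed types that 0.3 covered.
    media_type = headers.get("content-type", "").split(";")[0].strip().lower()
    if media_type not in FALLBACK_TYPES:
        return None
    _, body = fetch("GET", url)
    tag = BODY_HUB_RE.search(body)
    if not tag:
        return None
    href = HREF_RE.search(tag.group(0))
    return href.group(1) if href else None
```

The point of the split is that the GET is only issued when the headers are missing and the Content-Type suggests a 0.3-style feed.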
Hi, couldn't find a better place to post this. All the code.google.com repo links to set up a pubsubhubbub reference hub server are no longer working, and I couldn't find mirrors anywhere, all links lead here.
While there are alternatives, wondering if there is any copy of the reference implementation still?
Let's make the PuSH spec a living spec that can take over after 0.4. This will allow future work to be done more rapidly, allowing it to keep up with other specs.
Why is "X-Hub-Signature" prefixed with "X-"?
Isn't that considered bad practice now?
RFC 6648: Deprecating the "X-" Prefix and Similar Constructs in
Application Protocols
https://tools.ietf.org/html/rfc6648
Is it allowed to do the validation and verification of intent
process while the subscription request is running?
This would imply that the verification of intent callback and the
denied-callback are called while the subscription request is still
being answered.
That would make it possible to skip all the worker setup that's
necessary to do delayed validation and verification.
Process diagram for inline validation:
client                   server/hub
+---[subscription request]-->-----+
|                                 |
|  +<--[verification of intent]---+
|  |                              |
|  +--[verification response]-->--+
|                                 |
|  +<--[denied callback]----------+
|  |                              |
|  +----[response]------------->--+
|                                 |
+--<--[subscription response]-----+
This would also make it possible to not call the callback at all but
tell the denied-state directly in the response of the subscription
request. But that does not seem to be planned at all, right?
My Blogger feed has stopped sending notifications automatically.
I have to send a notification manually from the pubsubhubbub hub.
I think it's a mistake to remove the hub.verify_token parameter.
This identifier lets a subscriber know that the caller who's verifying a subscription is the same party they asked for a subscription in the first place.
An attacker could test a subscription callback URL to see if there are outstanding subscription requests for a list of well-known topics, and if any of them are a "hit", it could start sending false or abusive updates to the callback.
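To make the concern concrete, here is a sketch of how a subscriber could use hub.verify_token the way 0.3 allowed. The in-memory `pending` store and both function names are hypothetical, and the 404 for a mismatch is just one reasonable choice of failure response.

```python
import hmac
import secrets

# Hypothetical in-memory store: topic URL -> token sent with the subscription request.
pending = {}


def start_subscription(topic):
    """Return the form parameters for a subscription request, including a fresh token."""
    token = secrets.token_urlsafe(16)
    pending[topic] = token
    return {"hub.mode": "subscribe", "hub.topic": topic, "hub.verify_token": token}


def handle_verification(params):
    """Return (status, body) for a verification request arriving at the callback."""
    expected = pending.get(params.get("hub.topic"))
    supplied = params.get("hub.verify_token", "")
    # Without a matching token, the caller is not the hub we subscribed through.
    if expected is None or not hmac.compare_digest(expected.encode(), supplied.encode()):
        return 404, ""
    return 200, params.get("hub.challenge", "")
```

A third party probing the callback with well-known topic URLs gets the same 404 whether or not a subscription is outstanding, because it cannot supply the token.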
PuSH 0.3 allowed a "sync" verification mechanism.
It's not clear to me why we pulled this out.
Hubs should be able to redirect subscribers and publishers to another hub:
It's likely that hubs will die (some have died already), and if things are really decentralized in the sense that the hub does not know about the publishers pointing to it (which is likely and happens a lot), then we need a way for the hub to tell subscribers it has moved.
It currently only seems to be possible that the hub sends an update
notification to the subscriber that contains the (full) content of the
subscribed resource/topic (fat ping).
Sending only the POST request without any content (light ping) seems to
be a violation of the spec.
Is this intentional?
Shouldn't hubs be able to send light pings?
When all payloads are implied to be Atom/RSS, signing the payload is sufficient for authenticating the message because the HTTP headers are completely ignored.
As we move to a mode of arbitrary content distribution, there can be some corner-cases where the meaning of a notification can be changed by altering the headers while retaining the payload. I'd concede that this will be rare, but I believe we should plug this obvious "hole" in the authenticated distribution protocol as part of introducing arbitrary content types.
My proposal is that we define a new signature base string that incorporates:
Content-
Content-MD5
HTTP header)
The format of the header that provides this information must then change to incorporate the nonce, which would also be a good opportunity to either rename it to a non-experimental name (no X- prefix) or recast it as an HTTP authentication scheme using the Authorization header field.
Right now the spec says signatures for authed pings must be SHA1. http://pubsubhubbub.github.io/PubSubHubbub/pubsubhubbub-core-0.4.html#authednotify
Given that SHA1 is deprecated, it would seem a new solution is needed for the spec. I'm not sure the best step forward, since simply updating it to use SHA256 will likely encounter the same problem in a few years. Maybe going the route that JWT took where there is another property that indicates the signature method, so the spec doesn't have to change to support new crypto functions? On the other hand that would seem to lead to less interoperable solutions since clients couldn't guarantee availability of a specific signature method.
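One possible shape for this, sketched under assumptions: the existing header value already has the form "sha1=<hex digest>", so the method name in front of the "=" could serve as the algorithm indicator. The allow-list below and the idea that a subscriber restricts it are illustrative choices, not something the 0.4 spec defines.

```python
import hashlib
import hmac

# Assumed allow-list; a subscriber could narrow it to drop weak methods.
ALLOWED_METHODS = {"sha1", "sha256", "sha384", "sha512"}


def sign(secret, body, method="sha256"):
    """Return a header value of the form '<method>=<hex HMAC of body>'."""
    digest = hmac.new(secret, body, getattr(hashlib, method)).hexdigest()
    return f"{method}={digest}"


def verify(secret, body, header_value, allowed=ALLOWED_METHODS):
    """Check a '<method>=<hex>' signature, rejecting methods outside the allow-list."""
    method, sep, supplied = header_value.partition("=")
    if not sep or method not in allowed:
        return False
    expected = hmac.new(secret, body, getattr(hashlib, method)).hexdigest()
    return hmac.compare_digest(expected.encode(), supplied.lower().encode())
```

This keeps the spec stable when a hash is retired, but it does not solve the interoperability worry raised above: hub and subscriber still need at least one method in common.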
I have been reading the spec and the issues around but I just didn't find what I was looking for. If it has already been covered, then I apologize.
Would it be fair to expect a 4xx answer from the hub when subscription validation fails? Or will the hub answer with a 202 regardless? The spec does mention that the hub will answer with 4xx when errors are found, but in that case why wouldn't it send the reason in the response to the subscription request instead of using the callback? On the other hand, if hubs do send a 202 status code and may later send a subscription denial notice, what options do subscribers have to foresee such an event? It feels like the subscriber will mark the subscription as valid once it gets the 202 and possibly invalidate it if a hub.reason ever reaches the callback URL, which will trigger both events (subscription and unsubscription).
Either way, if validation (and possibly verification of intent) fails and hubs do respond with 4xx status codes, don't they deserve a specific response status code during subscription?
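One way a subscriber might avoid the double event is to treat the 202 as "pending" rather than "valid". The sketch below is an assumption about subscriber-side bookkeeping, not something the spec prescribes; the state names and function names are invented.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"    # hub answered 202; nothing confirmed yet
    ACTIVE = "active"      # verification of intent completed
    DENIED = "denied"      # hub.mode=denied arrived at the callback
    REJECTED = "rejected"  # hub refused the request outright (4xx/5xx)


def after_subscribe_response(status_code):
    """Map the hub's immediate HTTP response to a subscription state."""
    if status_code == 202:
        return State.PENDING
    if 400 <= status_code < 600:
        return State.REJECTED
    raise ValueError(f"unexpected status code: {status_code}")


def after_callback(state, params):
    """Update the state when the hub calls the subscriber's callback URL."""
    mode = params.get("hub.mode")
    if mode == "denied":
        return State.DENIED
    if mode == "subscribe" and state is State.PENDING:
        return State.ACTIVE
    return state
```

With this bookkeeping a denial that follows a 202 moves the subscription from pending to denied without it ever having been reported as valid.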
The discovery phase currently requires that a document has two relation links:
rel=hub
rel=self
What is the reason for rel=self?
In my eyes, rel=hub should suffice since rel=self will be the URL itself. It should be made optional.
cc @aaronpk @tantek - http://indiewebcamp.com/irc/2015-03-18#t1426690743557
Currently it's left up to out-of-band information to differentiate between a publishing request that is a full payload or a diff. I think a hub should be able to switch between a full payload and a diff on a per-notification basis for the following reasons:
If we switch from using POST to PUT and PATCH then we can make use of existing HTTP semantics to produce completely self-contained notifications that don't rely on outside information to be understood.
To be concrete:
As a backward compatibility shim for 0.3, I'd suggest that a notification using the POST method be interpreted as a PATCH whose payload is either an Atom or an RSS feed, but any other content types MUST be sent as either PUT or PATCH.
For best results we could try to define a patch format negotiation protocol, but initially I think this could be left out-of-band as there is likely to only be one or two appropriate patch formats for a given resource and subscribers are generally tailored to a particular resource type.
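A sketch of the backward-compatibility shim as a subscriber might apply it. The set of feed media types and the function name are assumptions; the rule itself follows the proposal above (POST with an Atom or RSS payload is read as PATCH, everything else must be PUT or PATCH).

```python
# Assumed media types for "an Atom or an RSS feed".
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


def effective_method(method, content_type):
    """Return the semantics to apply to a notification: 'PUT' (full) or 'PATCH' (diff)."""
    method = method.upper()
    media_type = content_type.split(";")[0].strip().lower()
    if method in ("PUT", "PATCH"):
        return method
    if method == "POST" and media_type in FEED_TYPES:
        # 0.3-style notification: treat the feed payload as a diff.
        return "PATCH"
    raise ValueError(f"{method} with {media_type!r} is not a valid notification")
```

A subscriber would replace its stored copy on PUT and apply the payload as a change on PATCH.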
[This is not an issue with the spec but with Google's PubSubHubbub Hub. Please let me know if there is a more appropriate place for my question.]
I'm publishing multiple feeds and I've noticed that the Google PubSubHubbub Hub seems to silently ignore all but ~ 10 randomly chosen URLs per POST request. It always replies with status code 204 which makes this issue hard to debug. I've tried sending the POST requests in batches of 10 URLs but that does not help. Sleeping for 20 (!) seconds between batches seems to help but I'll have to run more tests.
Are those rate-limits documented somewhere? Or can they be lifted? Does Superfeedr have less strict rate-limiting?
Does the client/subscriber have to see a subscription as being
"accepted" until he gets a "denied" state on his callback URL?
Why is there no "subscription accepted" call to the callback URL?
Also, why does the callback not get the signature passed for requests
with hub.mode=denied?
Bad people could fake unsubscription confirmations without it.
Suggestion: Change the text in Section 5.1.1 to say "The topic URL MUST be the one advertised by the publisher during the discovery phase."
Section 5.1.1 currently says "The topic URL MUST be the one advertised by the publisher in a Self Link Header during the discovery phase."
However, section 4 says that the topic URL can be specified in either the Link header or a tag in the document body, such as an HTML link tag. This change makes the sections consistent.
This should be fairly easy. I'm thinking of PushPress by @josephscott and PubSubHubbub by @pfefferle at least. We could also ask @padraic if he wants to maybe deprecate this one in favor of the other ones.
It would be nice to retain the difference mechanism defined in 0.3 for Atom and RSS feeds, and provide full payloads for other types.
I want to double-check that we're doing the right thing by supporting arbitrary data types.
Atom, RSS 1.0, RSS 2.0 and Activity Streams are all feed formats that represent changes. That's the whole point of these syndication formats.
Changes to other kinds of documents and resources -- images, HTML pages, video and audio -- may be better represented with these document types.
So, rather than using PuSH for a file like http://example.net/images/file.png, have a related Atom feed (or JSON feed, or RSS feed) that represents the history of that file.
I'd like to request that the webhook API is expanded to include notifications for user activity on an authorized channel. Including new subscriptions and the username/display name of the subscriber (and email if possible), and any other user interaction on the channel if I subscribe to the webhook notification, such as comments, likes, shares, etc. Thanks
I think it would be more helpful if the pubsubhubbub folder only included the different specs, because the code is a bit old and this could be a bit confusing. What about asking some people to move their libs to this repo to have a central code space (like the openid repo https://github.com/openid)?
It would be nice if discovery could fall back to the 0.3 mechanism for certain data types, if the expected headers aren't there.
<link rel="hub"> in the feed.
<atom:link rel="hub"> in the feed.
PubSubHubbub 0.3 defined a mechanism for publishers to ping the hub. This is no longer defined in 0.4.
It may be nice to extract that and give it its own brief document, for those hubs that do implement it.
If verification of intent failed, shall the hub send a "denied" message to the subscriber's callback?
I submitted this idea at
https://code.google.com/p/pubsubhubbub/issues/detail?id=148
The answer seemed to have missed my point, so I'm continuing this discussion here, as suggested.
XML is not the point. Specific Resource Content-Types are the point.
Maybe it has already been decided that the generic x-www-form-encoded Content-Type is "good enough." If so, then I won't fight that. I just need to know.
I'm not quite sure if this is the right place to report this issue,
but the Google hub's "Subscriber Diagnostics" seems to be down from here,
although other parts of the hub still work fine.
Diagnostics is still in heavy rotation during the client development stage.
Can anyone help fix this, or tell me where I can report this issue?
Thanks.
It is probably a good idea to point to the W3C community group in the feedback appendix.
The Group needs the Logo as "Avatar"!
In discovery, we've added "X-Hub-Url" and "X-Hub-Self-Url".
Let's use the nice standard Link: header from RFC 5988 instead. See http://tools.ietf.org/html/rfc5988 and http://www.w3.org/wiki/LinkHeader .
So:
Link: <http://example.com/hub>; rel=hub
and:
Link: <http://example.com/avatar.jpg>; rel=self
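On the subscriber side, reading those two headers could look roughly like the sketch below. It is a simplified parser written for this example (the function name is made up) and it does not handle quoted parameter values that contain commas or semicolons, which a full RFC 5988 parser would.

```python
import re

# One link-value: <URL> followed by zero or more ";param" parts.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')


def parse_links(header_value):
    """Map each rel value to the first URL that carries it."""
    links = {}
    for url, params in LINK_RE.findall(header_value):
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.lower() != "rel":
                continue
            # A rel parameter may hold several space-separated relation types.
            for rel in value.strip('"').split():
                links.setdefault(rel.lower(), url)
    return links


# Several Link headers can be joined with ", " before parsing.
example = '<http://example.com/hub>; rel=hub, <http://example.com/avatar.jpg>; rel="self"'
```

Calling parse_links(example) gives one entry for "hub" and one for "self", which is all discovery needs.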