Comments (6)
The problem is canonicalization/feed aliasing. Most feeds can be accessed
under many URLs (HTTP vs HTTPS, multiple hostnames, infinite spaces of
ignored query parameters). The publisher can't/won't ping all of them when
there's an update to the feed. The self link is an explicit promise to ping
the self link topic when the feed changes, and this is the topic that
subscribers should use. If we drop the self link requirement, we can either
let subscribers that ended up on a feed via a URL that is not the canonical one
wait for updates in vain (bad) or make the hub's job much more difficult
because it needs to understand that a ping to http://example.com/feed.xml
might also affect subscribers to https://example.com/feed.xml?foo=bar. This
fits the overall "center complexity in the hub" design approach, but it
would probably lead to a worse user experience because it's hard to do this
kind of aliasing detection reliably.
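To illustrate the aliasing problem described above, here is a minimal sketch of a hub that matches topics by exact string comparison. The `ToyHub` class and the callback URLs are hypothetical and exist only for illustration; real hubs are more involved.

```python
from collections import defaultdict


class ToyHub:
    """Hypothetical in-memory hub that matches topics by exact string."""

    def __init__(self):
        # topic URL -> set of subscriber callback URLs
        self.subscribers = defaultdict(set)

    def subscribe(self, topic, callback):
        self.subscribers[topic].add(callback)

    def ping(self, topic):
        # Return the callbacks that would be notified for this ping.
        return sorted(self.subscribers.get(topic, set()))


hub = ToyHub()
# Two subscribers reached the same feed through different URLs.
hub.subscribe("http://example.com/feed.xml", "https://a.example/callback")
hub.subscribe("https://example.com/feed.xml?foo=bar", "https://b.example/callback")

# The publisher pings only one of the aliases.
notified = hub.ping("http://example.com/feed.xml")
print(notified)
```

Only the first subscriber is notified here; the second one silently waits. Fixing this inside the hub would require it to know that both URLs name the same feed, which is the aliasing detection that is hard to do reliably.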
I also expect the gains from this simplification to be small since adding
two links to a feed is basically the same amount of work as adding one link.
On Tue, May 26, 2015 at 2:00 PM, Christian Weiske [email protected]
wrote:
The discovery phase
http://pubsubhubbub.github.io/PubSubHubbub/pubsubhubbub-core-0.4.html#discovery
currently requires that a document has two relation links:
- rel=hub
- rel=self
What is the reason for rel=self?
In my eyes, rel=hub should suffice since rel=self will be the URL itself.
It should be made optional.
cc @aaronpk https://github.com/aaronpk @tantek https://github.com/tantek
from pubsubhubbub.
Actually, adding the hub link in Apache is a single configuration line only:
Header append Link '<http://phubb.cweiske.de/hub.php>; rel="hub"'
Adding the self URL is difficult because it's a dynamic URL. So it's not the same amount of work; quite the contrary.
I understand the issue about the same file being available under multiple URLs. But if there is no self link, the publisher would have to take care that the feed is only available under one URL.
from pubsubhubbub.
I agree with not requiring rel=self.
re: canonicalization - there is prior art here we should be re-using, that is, rel=canonical - which is already well deployed and in use.
Thus here is a specific proposal.
Change: Publishers MUST have a rel=self link at their URL ("the URL")
To: Publishers SHOULD have a rel=self link, but MAY instead:
- provide a rel=canonical link (which they might have already) OR
- assume rel=self same as the URL
Thus consuming code:
- looks for a rel=self link, if not found
- looks for a rel=canonical link, if not found
- uses the current URL
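The proposed lookup order for consuming code could be sketched as follows. This is a minimal sketch, not part of the spec: the `resolve_topic` function is hypothetical, and it assumes the subscriber has already parsed the document's links (from Link headers or from the feed body) into `(rel, href)` pairs.

```python
def resolve_topic(links, current_url):
    """Pick the topic URL to subscribe to.

    links: iterable of (rel, href) pairs already extracted from the document
           (assumed input format; parsing is out of scope here).
    current_url: the URL the document was fetched from.
    """
    links = list(links)
    # Prefer rel=self, then rel=canonical, in that order.
    for wanted in ("self", "canonical"):
        for rel, href in links:
            # A rel attribute may hold several space-separated values.
            if wanted in rel.lower().split():
                return href
    # Neither link is present: fall back to the URL that was fetched.
    return current_url


links = [
    ("hub", "http://phubb.cweiske.de/hub.php"),
    ("canonical", "https://example.com/feed.xml"),
]
topic = resolve_topic(links, "http://example.com/feed.xml?foo=bar")
print(topic)
```

With no rel=self present, the rel=canonical link is used; with neither, the subscriber would subscribe to the URL it happened to fetch.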
Regarding: "since adding two links to a feed is basically the same amount of work as adding one link." - absolutely not true in experience. Example 1: what @cweiske said. Example 2: watching numerous users try to add the TWO links required for OpenID and screwing one of them up (in contrast to people trivially adding one rel=me link required for IndieAuth).
Basically, requiring two links instead of one for the very common case unnecessarily increases publisher responsibility and fragility of the whole system.
from pubsubhubbub.
I'm very strongly against this because this would bring one more case of silent failure. There's http vs https, there's also case issues and a bunch of other examples. Feedburner is pretty famous for this: if you subscribed to http://feeds.feedburner.com/TechCrunch/ instead of http://feeds.feedburner.com/Techcrunch/, you'd never get pings.
The worst case is for redirects and in this specific case, the hub has no way of matching the ping-ed URL and the actual feed resource.
Again, this is a particularly bad idea because this will silently fail. A subscriber who subscribes to a URL different from the one that is actually pinged to the hub will never receive notifications, and never be able to tell why (because he cannot know which URL is being pinged). THAT makes the protocol fragile.
I'm sorry for anyone working with Apache in general, but I don't think it's a good idea to base a spec on the difficulty of implementing something with a specific web server. I believe most web frameworks will make it trivial to add one Link header vs. 2 (or 100).
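As a rough illustration of the framework case, application code that knows its own feed URL can emit both links in one header value. This is a sketch under stated assumptions: `build_link_header` is a hypothetical helper, the self URL is an example value, and the hub URL is the one quoted earlier in this thread.

```python
def build_link_header(hub_url, self_url):
    """Return a single Link header value carrying both rel=hub and rel=self.

    Both arguments are assumed to be absolute URLs chosen by the publisher;
    self_url should be the one topic URL the publisher actually pings.
    """
    return '<{}>; rel="hub", <{}>; rel="self"'.format(hub_url, self_url)


header_value = build_link_header(
    "http://phubb.cweiske.de/hub.php",
    "https://example.com/feed.xml",
)
print("Link: " + header_value)
```

The important choice is which URL goes into rel=self: it must be the topic the publisher pings, not whichever alias the request happened to arrive on.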
Now, if the whole debate is to say that "canonical" is better than "self", I'll let you fight around this. We can easily change the spec to tell to subscribers:
- Use self if there is one
- Use canonical if you can't find one
And to publishers:
- put either self or canonical.
from pubsubhubbub.
On Fri, May 29, 2015 at 9:34 AM, Julien Genestoux [email protected]
wrote:
Feedburner is pretty famous for this and if you subscribed to this URL
http://feeds.feedburner.com/TechCrunch/ instead of this one
http://feeds.feedburner.com/Techcrunch/, you'd never get pings.
Minor correction: subscribing to any of these will work:
- http://feeds.feedburner.com/*Techcrunch*
- http://feeds.feedburner.com/*TechCrunch*
- https://feeds.feedburner.com/Techcrunch
- http://feedproxy.google.com/Techcrunch
- etc.
This doesn't invalidate the point Julien is making. Topic aliasing is a
real problem. Correct self links are vital for ensuring that subscribers
are listening to the exact topics that the publisher is pinging.
Roman.
from pubsubhubbub.
I stand corrected, but that was a large pain point for a long time. I'm glad you guys fixed it :)
from pubsubhubbub.