Comments (6)

themel commented on April 28, 2024

The problem is canonicalization/feed aliasing. Most feeds can be accessed
under many URLs (HTTP vs HTTPS, multiple hostnames, infinite spaces of
ignored query parameters). The publisher can't/won't ping all of them when
there's an update to the feed. The self link is an explicit promise to ping
the self link topic when the feed changes, and this is the topic that
subscribers should use. If we drop the self link requirement, we can either
let subscribers that ended up on a feed via a URL that is not the canonical
one wait for updates in vain (bad) or make the hub's job much more difficult
because it needs to understand that a ping to http://example.com/feed.xml
might also affect subscribers to https://example.com/feed.xml?foo=bar. This
fits the overall "center complexity in the hub" design approach, but it
would probably lead to a worse user experience because it's hard to do this
kind of aliasing detection reliably.
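
To make the role of the self link concrete, here is a minimal sketch of subscriber-side discovery for an Atom feed. The sample feed, the hub URL `https://hub.example.com/`, and the `discover` helper are assumptions made for illustration; only the two example.com feed URLs come from the comment above.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def discover(feed_xml: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (hub_url, topic_url) read from the feed's top-level atom:link elements."""
    root = ET.fromstring(feed_xml)
    found = {}
    for link in root.findall(f"{ATOM_NS}link"):
        rel = link.get("rel")
        # Keep the first hub link and the first self link; ignore other relations.
        if rel in ("hub", "self") and rel not in found:
            found[rel] = link.get("href")
    return found.get("hub"), found.get("self")


# Hypothetical feed body, fetched through an alias of the canonical URL.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <link rel="hub" href="https://hub.example.com/"/>
  <link rel="self" href="http://example.com/feed.xml"/>
</feed>"""

fetched_from = "https://example.com/feed.xml?foo=bar"
hub, topic = discover(SAMPLE_FEED)

# The subscriber subscribes to `topic` at `hub`, not to `fetched_from`.
print(hub, topic)
```

Because the subscription is made against the self link rather than the URL that was fetched, the subscriber ends up on the topic the publisher has promised to ping.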

I also expect the gains from this simplification to be small since adding
two links to a feed is basically the same amount of work as adding one link.

On Tue, May 26, 2015 at 2:00 PM, Christian Weiske wrote:

The discovery phase
http://pubsubhubbub.github.io/PubSubHubbub/pubsubhubbub-core-0.4.html#discovery
currently requires that a document has two relation links:

  1. rel=hub
  2. rel=self

What is the reason for rel=self?

In my eyes, rel=hub should suffice since rel=self will be the URL itself.
It should be made optional.

cc @aaronpk https://github.com/aaronpk @tantek
https://github.com/tantek


Reply to this email directly or view it on GitHub
#36.

from pubsubhubbub.

cweiske commented on April 28, 2024

Actually, adding the hub link in Apache is a single configuration line only:

Header append Link '<http://phubb.cweiske.de/hub.php>; rel="hub"'

Adding the self URL is difficult because it's a dynamic URL. So it's not the same amount of work; quite the contrary.

I understand the issue about the same file being available under multiple URLs. But if there is no self link, the publisher would have to take care that the feed is only available under one URL.


tantek commented on April 28, 2024

I agree with not requiring rel=self.

re: canonicalization - there is prior art here we should be re-using, that is, rel=canonical - which is already well deployed and in use.

Thus here is a specific proposal.

Change: Publishers MUST have a rel=self link at their URL ("the URL")
To: Publishers SHOULD have a rel=self link, but MAY instead:

  • provide a rel=canonical link (which they might have already) OR
  • assume rel=self same as the URL

Thus consuming code:

  • looks for a rel=self link, if not found
  • looks for a rel=canonical link, if not found
  • uses the current URL
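
The lookup order proposed above can be sketched as follows. This is an illustration under assumptions, not spec text: the `resolve_topic` name is hypothetical, and the input is assumed to be a list of `(rel, href)` pairs already extracted from Link headers or feed links.

```python
from typing import Iterable, Tuple


def resolve_topic(links: Iterable[Tuple[str, str]], fetched_url: str) -> str:
    """Pick the topic URL: rel=self, else rel=canonical, else the fetched URL."""
    by_rel = {}
    for rel, href in links:
        # Relation types are compared case-insensitively; the first link per rel wins.
        by_rel.setdefault(rel.lower(), href)
    for rel in ("self", "canonical"):
        if rel in by_rel:
            return by_rel[rel]
    return fetched_url


hub_only = [("hub", "https://hub.example.com/")]
with_canonical = hub_only + [("canonical", "http://example.com/feed.xml")]
with_both = with_canonical + [("self", "http://example.com/feed.atom")]

print(resolve_topic(with_both, "https://example.com/feed.xml?foo=bar"))
print(resolve_topic(with_canonical, "https://example.com/feed.xml?foo=bar"))
print(resolve_topic(hub_only, "https://example.com/feed.xml?foo=bar"))
```

Note that the last fallback returns whatever URL the subscriber happened to fetch, which is the case the canonicalization concern in this thread is about.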

Regarding: "since adding two links to a feed is basically the same amount of work as adding one link." - absolutely not true in my experience. Example 1: what @cweiske said. Example 2: watching numerous users try to add the TWO links required for OpenID and screwing one of them up (in contrast to people trivially adding the one rel=me link required for IndieAuth).

Basically, requiring two links instead of one for the very common case unnecessarily increases publisher responsibility and fragility of the whole system.


julien51 commented on April 28, 2024

I'm very strongly against this because this would bring one more case of silent failure. There's http vs https, there are also case issues and a bunch of other examples. Feedburner is pretty famous for this: if you subscribed to http://feeds.feedburner.com/TechCrunch/ instead of http://feeds.feedburner.com/Techcrunch/, you'd never get pings.

The worst case is redirects: in this specific case, the hub has no way of matching the pinged URL to the actual feed resource.

Again, this is a particularly bad idea because this will silently fail. A subscriber who subscribes to a URL different from the one that is actually pinged to the hub will never receive notifications, and never be able to tell why (because he cannot know which URL is being pinged). THAT makes the protocol fragile.
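
The silent failure described above can be reproduced with a toy model. This is only a sketch: `ToyHub` is a hypothetical in-memory stand-in that matches topics by exact string, and the callback URLs are made up. Real hubs may do more, but the thread's point is that they cannot do this matching reliably.

```python
from collections import defaultdict


class ToyHub:
    """In-memory hub that matches topics by exact string comparison (an assumption)."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self.delivered = []  # (callback, topic) pairs, in delivery order

    def subscribe(self, topic: str, callback: str) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str) -> int:
        """Record a notification for every subscriber of exactly this topic."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            self.delivered.append((callback, topic))
        return len(callbacks)


hub = ToyHub()
# Subscriber A used the URL the publisher pings; subscriber B used an alias.
hub.subscribe("http://example.com/feed.xml", "https://a.example/callback")
hub.subscribe("https://example.com/feed.xml?foo=bar", "https://b.example/callback")

notified = hub.publish("http://example.com/feed.xml")
print(notified, hub.delivered)
```

Subscriber B's subscription was accepted without error, yet it receives nothing, and nothing in the exchange tells it which URL the publisher actually pings.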

I'm sorry for anyone working with Apache in general, but I don't think it's a good idea to base a spec on the difficulty of implementing something with a specific web server. I believe most web frameworks will make it trivial to add one Link header vs. 2 (or 100).

Now, if the whole debate is to say that "canonical" is better than "self", I'll let you fight over this. We can easily change the spec to tell subscribers:

  • Use self if there is one
  • Use canonical if you can't find one

And to tell publishers:

  • Put either self or canonical.


romkatv commented on April 28, 2024

On Fri, May 29, 2015 at 9:34 AM, Julien Genestoux wrote:

Feedburner is pretty famous for this and if you subscribed to this URL
http://feeds.feedburner.com/TechCrunch/ instead of this one
http://feeds.feedburner.com/Techcrunch/, you'd never get pings.

Minor correction: subscribing to any of these will work:

This doesn't invalidate the point Julien is making. Topic aliasing is a
real problem. Correct self links are vital for ensuring that subscribers
are listening to the exact topics that the publisher is pinging.

Roman.


julien51 commented on April 28, 2024

I stand corrected, but that was a large pain point for a long time. I'm glad you guys fixed it :)
