PubSubHubbub

What is all the Hubbub?

Making Web Feeds and Blogs Real-time

Real-time results are becoming more and more expected on the World Wide Web. Twitter was one of the first, and now many systems update their content as things happen.

Facebook updates your wall as your friends make changes, and the search engines now include live updates in their search results.

This has left your typical blogger behind the times.

Feed Readers

Bloggers use a web feed to inform people and systems about their new blog posts. The most common feed formats are RSS (Really Simple Syndication) and Atom. These are formatted files which contain information about the most recent posts, including their URL, date, title and contents.

A blogger places a Feed file on their website and provides links to it, like my RSS Feed. This lets visitors subscribe to the feed via their favourite Feed Reader, that is, receive updates when new posts are published.

These Feed Readers typically use a "Pull" or "Polling" type mechanism. Every so often they will re-request the Feed file, check it for new entries and inform their users if there are any.
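As a rough illustration of what a polling Reader does on each visit, here is a minimal sketch. The sample feed, the find_new_items function and the use of the item link as an identifier are all assumptions made for the example; a real Reader would fetch the file over HTTP and handle far more cases.

```python
import xml.etree.ElementTree as ET

# A tiny stand-in for a downloaded RSS file (hypothetical content).
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def find_new_items(feed_xml, seen_links):
    """Return (title, link) pairs for items whose link has not been seen yet."""
    root = ET.fromstring(feed_xml)
    new_items = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link not in seen_links:
            new_items.append((item.findtext("title"), link))
            seen_links.add(link)  # remember it for the next poll
    return new_items


# The Reader already knows about the first post.
seen = {"http://example.com/1"}
first_poll = find_new_items(SAMPLE_RSS, seen)   # finds the second post
second_poll = find_new_items(SAMPLE_RSS, seen)  # nothing new, but the whole file was still parsed
```

Note that the second poll does all the same work as the first and finds nothing, which is exactly the waste described next.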

This has two main drawbacks:

The first is efficiency. The Reader has to periodically request the whole Feed file and process it. Some readers do this every 30 minutes. This adds a strain to both the Feed provider and the Feed Reader.

The second is that it is slow. Some Readers may take several days to get round to re-checking a feed. That is nowhere near real-time.

Pinging

The next improvement was that Feed Readers would let the blog owner send them a "Ping" indicating that their RSS feed has been updated. This means the Reader can re-request the RSS file sooner and can cut down on polling the feed. There are even websites you can use to ping multiple Readers at the same time.
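To give a feel for what a ping looks like, here is a small sketch that builds one without sending it. It assumes the ping service accepts the XML-RPC weblogUpdates.ping call, a common convention that is not named above; the blog name, URL and the build_ping helper are made up for the example.

```python
import xmlrpc.client


def build_ping(blog_name, blog_url):
    """Build the XML-RPC request body for a weblogUpdates.ping call (assumed convention)."""
    return xmlrpc.client.dumps((blog_name, blog_url), methodname="weblogUpdates.ping")


payload = build_ping("Example Blog", "http://example.com/")

# Decode the payload again to show what the receiving service would see.
params, method = xmlrpc.client.loads(payload)
```

The payload only says "this blog changed". The Reader still has to come and fetch the feed itself.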

One of the weaknesses of this approach is that the blog owner has to know all the Feed Readers that are listening to their feed. That is not an easy task, so many Readers may be left out of the loop.

PubSubHubbub

Enter PubSubHubbub with its strange name. It aims to solve all these issues so that bloggers can get their real-time feeds.

The basic idea is that Feed Readers can subscribe to a feed via a "Hub", which will then inform them when there are changes to the feed.

A Feed file that supports PubSubHubbub contains data indicating where the Hub that manages subscriptions for the feed can be found. The Feed Reader can then go to that Hub and subscribe to automatically receive updates. After that, the Reader can just sit and wait. No more inefficient requests.
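Here is a minimal sketch of the two steps a Reader takes: finding the Hub in the feed, then preparing a subscription request. It assumes an Atom feed that advertises its Hub with a link element whose rel is "hub", and the hub.mode, hub.topic and hub.callback form fields as I understand them from the specification. The URLs and function names are invented, and a given protocol version may require further fields.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# A cut-down Atom feed that advertises a Hub (hypothetical URLs).
SAMPLE_ATOM = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Blog</title>
  <link rel="hub" href="http://hub.example.com/"/>
  <link rel="self" href="http://example.com/feed.atom"/>
</feed>"""


def discover_links(feed_xml):
    """Return (hub_url, topic_url) read from the feed's top-level link elements."""
    root = ET.fromstring(feed_xml)
    links = {}
    for link in root.findall(ATOM_NS + "link"):
        links[link.get("rel")] = link.get("href")
    return links.get("hub"), links.get("self")


def build_subscribe_body(topic_url, callback_url):
    """Form-encode a subscription request to be POSTed to the Hub."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,        # the feed we want updates for
        "hub.callback": callback_url,  # where the Hub should deliver them
    })


hub_url, topic_url = discover_links(SAMPLE_ATOM)
body = build_subscribe_body(topic_url, "http://reader.example.org/callback")
```

The Reader would POST body to hub_url once, and from then on it only needs to listen on its callback URL.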

So that explains the Subscriber and Hub parts of PubSubHubbub.

Publishing

Now, when a blogger updates their blog and feed, all they have to do is ping the Hub. The Hub will then go to each subscriber and inform them of the new entries. This is called Publishing and hence the Pub in the name.
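The ping to the Hub can be sketched as a simple form POST. This assumes the hub.mode=publish and hub.url fields as I understand them from the specification; the Hub and feed URLs and the build_publish_request helper are made up, and the request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_publish_request(hub_url, feed_url):
    """Build (but do not send) the POST that tells the Hub a feed has changed."""
    body = urlencode({"hub.mode": "publish", "hub.url": feed_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_publish_request("http://hub.example.com/", "http://example.com/feed.atom")
```

Passing the request to urllib.request.urlopen would actually send it. Unlike the older pinging approach, the blogger only needs to know one address, the Hub, rather than every Reader.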

There are several ways to publish. The simplest is similar to pinging, where the Reader still has to request the latest feed file once it's been told. However, there is a more advanced method where the Hub actually sends the new posts (Feed items) to the Subscribers, thus achieving efficient real-time updates.
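The more advanced method can be pictured with a toy, in-memory Hub. This is only a model of the idea, not how a real Hub is built: the ToyHub class is invented, subscribers are plain Python callables rather than callback URLs, and the feed entries are handed in directly instead of being fetched from the blog.

```python
class ToyHub:
    """An in-memory stand-in for a Hub that pushes new entries to subscribers."""

    def __init__(self):
        self.subscribers = {}  # topic URL -> list of callback functions
        self.known_ids = {}    # topic URL -> ids of entries already delivered

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, current_entries):
        """Work out which entries are new and send only those to each subscriber."""
        known = self.known_ids.setdefault(topic, set())
        new_entries = [e for e in current_entries if e["id"] not in known]
        known.update(e["id"] for e in new_entries)
        if new_entries:
            for callback in self.subscribers.get(topic, []):
                callback(new_entries)
        return new_entries


topic = "http://example.com/feed.atom"
inbox = []  # what one subscriber has received so far

hub = ToyHub()
hub.subscribe(topic, inbox.extend)

# First publish: one entry in the feed.
hub.publish(topic, [{"id": "1", "title": "First post"}])

# Second publish: the feed now holds both entries, but only the new one is sent.
hub.publish(topic, [{"id": "1", "title": "First post"},
                    {"id": "2", "title": "Second post"}])
```

After both publishes the subscriber's inbox holds each post exactly once, without ever having requested the feed.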

Open Protocol

PubSubHubbub is based on an Open Protocol. This has some great benefits.

It supports RSS and Atom formatted feeds making it easier for bloggers to integrate it into their existing feed system.

Many Feed Readers are signing up to the subscription system, including the big players.

As the Hub is not centrally controlled, you can create your own (I have), and I can imagine some websites will be appearing soon to provide Hub services for you.

The Protocol includes defined security measures to protect both the Hub and Subscribers from malicious attack.
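One such measure, as I understand the specification, is that a Subscriber can share a secret with the Hub and the Hub then signs each delivery with an HMAC, sent in an X-Hub-Signature header as "sha1=" followed by the hex digest. The sketch below assumes that scheme; the secret, the body and the function names are made up for the example.

```python
import hashlib
import hmac


def sign(secret, body):
    """Compute the header value a Hub would send for this body (assumed HMAC-SHA1 scheme)."""
    digest = hmac.new(secret, body, hashlib.sha1).hexdigest()
    return "sha1=" + digest


def is_authentic(secret, body, header_value):
    """Check a received body against its signature header in constant time."""
    return hmac.compare_digest(sign(secret, body), header_value)


secret = b"shared-secret"                 # hypothetical value agreed at subscription time
body = b"<feed>...new entries...</feed>"  # placeholder delivery content
header = sign(secret, body)

genuine = is_authentic(secret, body, header)
tampered = is_authentic(secret, body + b"extra", header)
```

A Subscriber that checks the signature this way can ignore deliveries that did not come from the Hub it subscribed with.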

Implementation

This is my first post using my own Hub. I hope it went well!

The specification, example code and even a Hub can be found Here.