Since Sam Ruby asked, I feel I must oblige. Several recent posts on Sam's blog point out that the RSS parser in Apple's iTunes accepts invalid RSS feeds, which in turn encourages content producers to publish invalid feeds that only work in iTunes.

In the post entitled Insensitive iTunes, Sam wrote:

Mark Pilgrim: it appears that iTunes uses a real, draconian, namespace-aware XML parser... except that namespaces are case-insensitive.

What’s worse, is that the high profile Disney The Gears Behind the Ears feed appears to be depending on this functionality, as well as on other non-standard element definitions.

Mark Pilgrim mentions a couple more issues with the iTunes parser in the comments to that post. Why this is an issue at all is spelled out by Mark in another response to Sam's post, where he wrote:

Am I the only one who doesn’t think this is such a big deal?

Apple is an 800-lb. gorilla in this space (at least until Microsoft releases an RSS-enabled IE in Longhorn).  iTunes is to podcasting as Internet Explorer is to HTML.  RSS interoperability, at least as far as podcasting goes, now means “works with iTunes.”  Thousands of people and companies will begin making podcasts that “work with iTunes,” but unintentionally rely on iTunes quirks (e.g. Disney’s incorrect namespace).  This in turn will affect every developer who wants to consume RSS feeds, and who will be required to emulate all the quirks of iTunes to remain competitive.

Apple has effectively redefined the entire structure of an RSS feed, added multiple core RSS elements, made all RSS elements case-insensitive, made XML namespaces case-insensitive, created a new date format, made several previously required attributes optional, and created a morass of undocumented and poorly-documented extensions... to what was already a pretty messy format to begin with.

Case in point: my Universal Feed Parser, which already has 2751 test cases and is so incredibly liberal that it can parse an ill-formed EBCDIC-encoded RDF feed with regular expressions, will require hundreds of new test cases to cover all the schlock that iTunes accepts.  And I’m one of the lucky ones.

The supreme irony of all this is that I remember Dave Hyatt (Apple Safari developer) bitching and moaning about all the work he had to do to make Safari emulate the buggy, undocumented behavior of Internet Explorer, and how the world would be so much better if only everything used XML and everyone implemented draconian error handling.  Never mind the fact that the vast majority of problems that iTunes creates have nothing to do with XML well-formedness; iTunes doesn’t even require well-formed XML in the first place.  Utopia, it seems, will have to wait another decade.

Just like the browser wars I suspect this is going to get a lot worse before it gets any better. Hopefully the folks working on RSS at Apple [and at Microsoft] are paying attention to this discussion and will do the right thing.

The main problem is that every RSS reader is "liberal" to some degree. As a result, aggregator developers end up being asked to be bug-compatible with whichever other RSS reader is popular. I get complaints all the time that RSS Bandit is stricter than RSS readers like SharpReader, but I often resist making changes to copy every quirk in other RSS readers. Once an RSS reader rises to dominance, the definition of a valid RSS feed is no longer what is in the spec but whatever that reader supports. This happens often in the software industry, from web browsers to C compilers. It's great to see Sam fighting to prevent this in the RSS space, and his Feed Validator has gone a long way toward that. I can only hope that the iTunes folks realize that it is best for everyone if they favor spec compliance over being liberal in what they accept.
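
To make the strict-versus-liberal distinction concrete, here is a minimal sketch in Python. It assumes the iTunes namespace URI is http://www.itunes.com/dtds/podcast-1.0.dtd. The sample feed, its mis-cased namespace, and both function names are made up for illustration. They are not taken from the Disney feed, from iTunes, or from any real aggregator.

```python
import xml.etree.ElementTree as ET

# Assumed iTunes namespace URI; namespace names are compared as exact strings.
ITUNES_NS = "http://www.itunes.com/dtds/podcast-1.0.dtd"

# Hypothetical feed that declares the namespace with the wrong case.
FEED = """<rss version="2.0"
     xmlns:itunes="http://www.itunes.com/DTDs/Podcast-1.0.dtd">
  <channel>
    <title>Example</title>
    <itunes:author>Example Author</itunes:author>
  </channel>
</rss>"""


def find_author_strict(xml_text):
    """Namespace-aware lookup: the URI must match exactly."""
    root = ET.fromstring(xml_text)
    el = root.find("channel/{%s}author" % ITUNES_NS)
    return el.text if el is not None else None


def find_author_liberal(xml_text):
    """Quirks-style lookup: ignore case in both namespace and element name."""
    root = ET.fromstring(xml_text)
    wanted = ("{%s}author" % ITUNES_NS).lower()
    for el in root.iter():
        if el.tag.lower() == wanted:
            return el.text
    return None


print(find_author_strict(FEED))
print(find_author_liberal(FEED))
```

The strict lookup finds nothing in this feed, while the liberal one returns the author. Once publishers ship feeds like this, every other reader is under pressure to copy the second function.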


 
