Seeing Jon Udell's post about having difficulty with the Google PR team with regards to discussing the Google GData API reminded me that I needed to write down some of my thoughts on extending RSS and Atom based on looking at GData. There are basically three approaches one can take when deciding to extend an XML syndication format such as RSS or Atom
Add extension elements in a different namespace: This is the traditional approach to extending RSS and it involves adding new elements as children of the item
or atom:entry
element which carry application/context specific data beyond that provided by the RSS/Atom elements. Microsoft's Simple Sharing Extensions, Apple's iTunes RSS extensions, Yahoo's Media RSS extensions and Google's GData common elements all follow this model.
Provide links to alternate documents/formats as payload: This approach involves providing links to additional data or metadata from an item in the feed. Podcasting is the canonical example of this technique. One argument for this approach is that instead of coming up with extension elements that replicate existing file formats, one should simply embed links to files in the appropriate formats. This argument has been used in various discussions on syndicating calendar information (i.e. iCalendar payloads) and contact lists (i.e. vCard payloads). See James Snell's post Notes: Atom and the Google Data API for more on this topic.
Embed microformats in [X]HTML content: A microformat is structured data embedded within another markup language (typically HTML/XHTML). This allows one to represent both human-readable data and machine-readable data in a single document. The Structured Blogging initiative is an example of this technique.
All three approaches have their pros and cons. Option #1 is problematic because it encourages a proliferation of duplicative extensions and may lead to fragmenting the embedded data into multiple unrelated elements instead of a single document/format. Option #2 requires RSS/Atom clients to either build parsers for non-syndication formats or rely on external libraries for consuming information in the feed. The problem with Option #3 above is that it introduces a dependency on an HTML/XHTML parser for extracting the embedded data from the content of the feed.
From my experience with RSS Bandit, I have a preference for Option #1 although there is a certain architectural purity with Option #2 which appeals to me. What do the XML syndication geeks in the audience think about this?