Over a year ago, I wrote a blog post entitled SGML on the Web: A Failed Dream?
in which I asked whether the original vision of XML had failed. Below are excerpts from that post:
The people who got together to produce the XML 1.0 recommendation were motivated
to do this because they saw a need for SGML on the Web.
Specifically, their discussions focused on two general areas:
- Classes of software applications for which HTML was an inadequate
information format
- Aspects of the SGML standard itself that impeded SGML's acceptance as a
widespread information technology
The first discussion established the need for SGML on the web. By
articulating worthwhile, even mission-critical work that could be done on the
web if there were a suitable information format, the SGML experts hoped to
justify SGML on the web with some compelling business cases.
The second discussion raised the thornier issue of how to "fix" SGML so that
it was suitable for the web.
And thus XML was born.
...
The W3C's attempts to get people to author XML directly on the Web have
mostly failed, as can be seen from the dismal adoption rate of XHTML. In fact, many
[including myself] have come to the conclusion that the benefits of
adopting XHTML compared to the costs are too low, if not non-existent.
There was once an expectation that content producers would be able to place
documents conformant to their own XML vocabularies on the Web and that display
would be handled entirely by stylesheets, but this has yet to become widespread.
In fact, at least one member of a W3C working group has called this a bad
practice since it means that User Agents that aren't sophisticated enough to
understand style sheets are left out in the cold.
Interestingly enough, although XML has not been as successful as its
originators initially expected as a markup language for authoring documents on
the Web, it has found significant success as the successor to the Comma Separated
Value (CSV) File Format. XML's primary usage on the Web and even within
internal networks is for exchanging machine-generated, structured data between
applications. I would speculate that the largest usage of XML on the Web today is RSS,
which conforms to this pattern.
These thoughts were rekindled when I read Tim Bray's recent post Don’t Invent XML Languages,
in which he argues that people should stop designing new XML
formats. For designing new data formats for the Web, he advocates
the use of Microformats instead of XML.
The vision behind microformats is completely different from the XML vision.
The original XML inventors started with the premise that HTML is not
expressive enough to describe every possible document type that would
be exchanged on the Web. Proponents of microformats argue that one can
embed additional semantics over HTML and thus HTML is expressive enough
to represent every possible document type that could be exchanged on
the Web. I've always considered it a gross hack to think that instead
of having an HTML web page for my blog and an Atom/RSS feed, I
should have a single HTML page with <div class="rss:item">
or <h3 class="atom:title"> embedded in it. However, given
that one of the inventors of XML (Tim Bray) is now advocating this
approach, I wonder if I'm simply clinging to old ways and have become
the kind of intellectual dinosaur I bemoan.
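
To make the idea concrete, here is a minimal Python sketch of what consuming such a page might look like: a reader that pulls feed-like data out of ordinary HTML purely by class name. The class names are the made-up ones from my example above, not real microformat vocabulary, and ClassTextExtractor and extract_by_class are hypothetical helpers written for illustration. It assumes well-formed markup in which every non-void element is explicitly closed.

```python
from html.parser import HTMLParser

# Void elements never get an end tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextExtractor(HTMLParser):
    """Collects the text of every element carrying a given class token.

    Hypothetical helper; assumes every non-void element is explicitly closed.
    """

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0      # nesting depth inside a matching element
        self.buffer = []    # text pieces of the element being collected
        self.results = []   # finished text, one entry per matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            # Already inside a match: just track nesting.
            self.depth += 1
        elif self.class_name in classes:
            self.depth = 1
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:
            self.results.append("".join(self.buffer).strip())

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


def extract_by_class(html, class_name):
    """Return the text content of each element whose class list contains class_name."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    parser.close()
    return parser.results


# A blog page that doubles as its own "feed", using the invented class names.
PAGE = """
<div class="rss:item">
  <h3 class="atom:title">SGML on the Web: A Failed Dream?</h3>
  <p>Has the original vision of XML failed?</p>
</div>
"""

titles = extract_by_class(PAGE, "atom:title")
print(titles)
```

The consumer here has to know by convention which class names carry meaning, whereas a dedicated XML vocabulary such as Atom spells that out in its element names. That trade-off is the heart of the disagreement.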