XML the Data Format,
Not
I read
Clemens's original post on Saturday and agreed
with a lot of it, disagreeing only with how he
presented his argument. The one thing I agreed
with most is that XML is not just a data format.
XML is a family of technologies that make working
with structured and semi-structured information
much easier than has ever been done before. This
harkens back to my
Why Use XML post from a few weeks back.
I'll go back to reasons #1 and #2 for why I believe
people use XML. Reason (1) is that Everyone Else
is Using It, which leads to easy transferability
of skill sets and interop at the syntax level.
It is important to note that the interop story is
all about sharing and understanding how to process
UnicodeWithAngleBrackets.
Reason (2), a Huge Selection of
Off-The-Shelf Tools, is a very compelling reason
why people use XML. This reason has little to do
with XML syntax itself and more to do with the fact
that hundreds of individuals and corporations have
built technologies and specifications around XML
while jumping on the hype bandwagon. These
off-the-shelf tools make XML very attractive to
people who want to process structured and
semi-structured data (RSS feeds, config files,
database exports, wire transfer formats, etc.) but
who may not be enamored with the syntax of XML or
all its esoteric rules. For these people there is
the godsend that is the XML
Infoset.
The XML infoset gives people leeway to use
alternate representations of XML by describing a
logical (as opposed to physical) model for an XML
document and opening the door to creating virtual
XML views. Now people can get access to XML
technologies like
- Model-based APIs for in-memory representation - CHECK [DOM]
- Stream-based APIs for fast processing - CHECK [Pull-based APIs, SAX]
- Grammar languages for specifying valid content - CHECK [DTD, W3C XML Schema]
- Query languages - CHECK [XQuery, XPath]
- Ability to perform regexes against the structure of the content - CHECK [XPath, XSLT]
- Ability to create fairly human-readable serialization - CHECK [XML 1.0 serialization, looks good in IE]
without sacrificing themselves on the altar of
angle brackets and Unicode text unless absolutely
necessary. However, this is not an interop or
data integration story unless everyone shares a
common serialization of the infoset syntax and
[some] semantics for exchanging data.
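To make the first two bullets concrete, here is a small Python sketch of the same question ("what are the item titles?") answered once with a model-based API and once with a stream-based one. The sample feed, the function names and the handler class are all made up for illustration; only the standard library `xml.dom.minidom` and `xml.sax` modules are assumed.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample feed, invented for this example.
FEED = """<rss><channel>
  <item><title>Why Use XML</title></item>
  <item><title>XML the Data Format, Not</title></item>
</channel></rss>"""


def titles_via_dom(text):
    """Model-based API: load the whole document into a tree, then walk it."""
    doc = minidom.parseString(text)
    return [node.firstChild.data for node in doc.getElementsByTagName("title")]


class TitleHandler(xml.sax.ContentHandler):
    """Stream-based API: the parser pushes events; no tree is ever built."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buf = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect it until the end tag.
        if self._in_title:
            self._buf.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buf))
            self._in_title = False


def titles_via_sax(text):
    handler = TitleHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.titles
```

Both functions return the same list of titles. The difference is only in how much of the document has to sit in memory at once, which is the trade-off the DOM and SAX bullets are pointing at.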
Now let's go back to Clemens Vasters' example with
BizTalk Server. BizTalk supports various binary
data transfer protocols as well as XML 1.0. However,
BizTalk needs to match on parts of the input
stream, transform it and/or specify structure using
some schema. All of these are technologies that
already exist in off-the-shelf XML tools that
are infoset compliant, so BizTalk can use XPath to
query a binary stream, XSLT to transform it or XML
Schema to specify structure for it while never
having to resort to converting either the input or
output stream to UnicodeWithAngleBrackets.
Basically, they get the best of both worlds.
Note I am not saying this is what BizTalk
does given that I don't work for or with them
except tangentially so I have no idea what they
actually do. This is especially true given that I've
never actually used BizTalk Server. Duh.
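As a rough illustration of the idea (and emphatically not of anything BizTalk does), here is a Python sketch that exposes a binary payload as an element tree and queries it without ever producing angle brackets. The record layout, the element names and the `infoset_view` helper are all hypothetical. Two caveats: it builds the tree eagerly rather than as a true lazy virtual view, and `xml.etree.ElementTree` only supports a limited subset of XPath.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical wire format: a 4-byte unsigned id, an 8-byte double amount
# and a 3-byte ASCII currency code, big-endian with no padding.
RECORD = struct.Struct(">Id3s")


def infoset_view(payload):
    """Build an in-memory element tree directly from the binary payload."""
    root = ET.Element("transfers")
    for rec_id, amount, currency in RECORD.iter_unpack(payload):
        item = ET.SubElement(
            root, "transfer", id=str(rec_id), currency=currency.decode("ascii")
        )
        ET.SubElement(item, "amount").text = repr(amount)
    return root


# Two made-up records standing in for the binary input stream.
payload = RECORD.pack(1, 250.0, b"USD") + RECORD.pack(2, 99.5, b"EUR")
view = infoset_view(payload)

# Query the binary data with an XPath-style expression (attribute predicate).
usd_ids = [t.get("id") for t in view.findall("./transfer[@currency='USD']")]
```

The XML 1.0 text form never appears anywhere in this pipeline. The query runs against the logical model only, which is the point of the infoset argument above.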
XML Everywhere. The cry of a new
generation.
Clarification
Andy Conrad didn't like the way I portrayed his
position in my
// Considered Dangerous post and wanted me to
clarify it. This slipped my mind but Andy has
beaten me to the punch and done a great job of
clarifying his position in his post
Serendipity and the Sith Lord of XPath.
Disclaimer:
The above comments do not
represent the thoughts, intentions, plans or
strategies of my employer. They are solely my
opinion.