Semantic Markup and HTML
One of the biggest gripes about HTML as the
primary authoring format for Web documents has been
that it focuses too much on presentation and not
enough on semantics and content. This lack of
semantics in HTML markup makes it hard to
mechanically process web documents. The common
example of applications that get the short end of
the stick is search engines. Using just HTML there
is no way to differentiate via markup whether the
words "Dare Obasanjo" refer to a name or
an action.
Over time a number of seemingly arbitrary tags not
strictly related to presentation eventually made it
into
the HTML tag set such as
code,
cite, and
acronym. However, the bigwigs in the W3C
realized that trying to define all the possible
semantic tags people would want to use on the Web
in HTML was the wrong approach and tried a
different, two-part approach. The first part was to create a
markup language for the Web that allowed users to
create their own semantic tags; this markup
language is XML.
The second part, which is still an area of ongoing
research that most call the Semantic Web,
is how to relate all these different pieces of
semantic markup. It takes little imagination to
realize that if markup-aware search engines
know how to "Find all documents authored by Dare
Obasanjo" by searching for <author>Dare
Obasanjo</author> in documents, they will
also need to be able to tell that documents
containing <creator>Dare
Obasanjo</creator> are also relevant. This is
rather difficult and has resulted in a number of
complicated Web ontology-related technologies such
as RDF, DAML+OIL,
OWL, and more.
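To make the <author> versus <creator> problem concrete, here is a minimal Python sketch of a markup-aware search. The sample documents, the `authored_by` helper and the hand-written `AUTHOR_TAGS` table are all made up for illustration; the table stands in, very crudely, for the kind of equivalence an ontology language is meant to express declaratively.

```python
import xml.etree.ElementTree as ET

# Hypothetical equivalence table: element names we choose to treat as "author".
# Without something like this, a search for <author> misses <creator>.
AUTHOR_TAGS = {"author", "creator"}


def authored_by(xml_text, person):
    """Return True if any author-like element contains exactly this name."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.tag in AUTHOR_TAGS and (element.text or "").strip() == person:
            return True
    return False


# Made-up documents that use different vocabularies for the same idea.
docs = {
    "a.xml": "<doc><author>Dare Obasanjo</author><title>One</title></doc>",
    "b.xml": "<doc><creator>Dare Obasanjo</creator><title>Two</title></doc>",
    "c.xml": "<doc><author>Someone Else</author><title>Three</title></doc>",
}

matches = sorted(filename for filename, text in docs.items()
                 if authored_by(text, "Dare Obasanjo"))
print(matches)  # ['a.xml', 'b.xml']
```

The hard part is not the search itself but maintaining that table across every vocabulary people invent, which is exactly the problem the ontology technologies above try to solve.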
Going back to
Mark Pilgrim's post where he describes
processing the <cite> tags on his site, he
states
Let's try pushing the
envelope of what HTML is actually designed to do,
before we get all hot and bothered trying to
replace it, mmmkay?
which seems to run counter to what his example
shows and in fact is akin to saying "Look ma, if I
put function pointers in a C struct I don't need an
object oriented programming language". Mark's
example hints at the kind of truly interesting
things people could do with Web documents if they
were actually marked up semantically and not just a
mass of <b>, <font> and <br>
tags. I for one would have preferred handling
semantic markup when I wrote
code to convert my K5 diary page to an RSS feed
or when I wrote the
K5
story parser for the
K5
user search engine. Mark's parting words
completely contradict the feature he has added to
his website, which is in fact an example of why HTML
bears replacing.
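As a rough illustration of why semantic tags are easier to process, the following Python sketch pulls the contents of every <cite> element out of a page using the standard library's `html.parser`. It is not Mark's actual code; the `CiteCollector` class and the sample fragment are assumptions made up for this example.

```python
from html.parser import HTMLParser


class CiteCollector(HTMLParser):
    """Collect the text inside every <cite> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many <cite> elements we are currently inside
        self.buffer = []  # text pieces of the <cite> being read
        self.cites = []   # finished citations, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "cite":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "cite" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.cites.append("".join(self.buffer).strip())
                self.buffer = []

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


# Made-up page fragment mixing semantic and presentational tags.
page = ("<p>As <cite>Mark Pilgrim</cite> wrote, and <b>not</b> "
        "as <cite>Doug</cite> did.</p>")

collector = CiteCollector()
collector.feed(page)
collector.close()
print(collector.cites)  # ['Mark Pilgrim', 'Doug']
```

The <cite> elements tell the program what the text is, while the <b> element only says how it should look, so there is nothing comparable to extract from it.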
Now for some clarification. The above comments do
not make me a semantic web advocate nor do they
indicate that
my previous thoughts about XHTML are changed. I
personally think the semantic web is a pipe dream
in much the same way "we will have real AI in the
next 20 years" was a pipe dream a few decades ago.
However, in the same way those AI researchers ended
up giving us Lisp, which brought us Emacs (M-x
all-hail-emacs), it is likely the semantic
web folks may end up producing really cool
technology without meaning to. Also,
Google
has been doing the Semantic Web thing without
needing people to alter their existing documents,
understand complex specifications or make semantic
web related decisions when authoring
documents.
#Mac Addicts
Doug
admits to being a Mac addict. His post reminded
me of the
Mac addiction article on Wired. I was amused by
the fact that the article states
What makes Mac users so loyal?
The answer, of course, depends on who is
asked:
...
But some common themes emerge: community, the
alternative to Microsoft, and the brand,
which connotes nonconformity, liberty and
creativity.
which makes me wonder about Doug. :)
He isn't the only one who I've seen bitten by the
lifestyle ad when it comes to geek passions. Most
of the kids running Linux when I was at school did
so because it was the "geek thing to do" and not
for any reasons they could argue coherently. I also
suspect this is the root of why I started using
Emacs although that has long been superseded by all
the cool shit I can actually do with it.
#Article Translations
The Web can surprise the heck out of you sometimes.
I still can't get over the fact that there is a
French translation of my C# vs. Java article
and a
Chinese translation of my interview with Miguel.
It is weirdly humbling to see my words translated
and spread to an entire audience I had not
anticipated or expected to serve when I originally
wrote the articles. Truly, a World Wide Web.
Get yourself a
News Aggregator and subscribe to my
RSS feed.
Disclaimer: The opinions in this diary
are my own and do not reflect the opinions,
thoughts, intentions or strategies of my
employer.