In his post Exploring Live Clipboard, Jon Udell shares a screencast he made about Live Clipboard. He writes:
I've been experimenting with microformats since before they
were called that, and I'm completely jazzed about Live
Clipboard. In this screencast I'll walk you through examples of Live
Clipboard in use, show how the hCalendar payload is wrapped, grab hCalendar data
from Upcoming and Eventful, convert it to iCalendar format for insertion into a
calendar program, inject it natively into Live Clipboard, and look at Upcoming
and Eventful APIs side-by-side.
All this leads up to a question: How can I copy an event from one of these
services and paste it into another? My conclusion is that adopting Live
Clipboard and microformats will be necessary but not sufficient. We'll also need
a way to agree that, for example, this
venue is the same as that venue. At the end,
I float an idea about how we might work toward such agreements.
The problem that Jon Udell describes is a classic one when
mapping data between different domains. I posted about this a
few months ago in my post Metadata Quality and Mapping Between Domain Languages, where I wrote:
The problem Stefano has pointed out is that just being able to say that two
items are semantically identical (i.e. an artist field in dataset A is the same
as the 'band name' field in dataset B) doesn't mean you won't have to do some
syntactic mapping as well (i.e. alter artist names of the form "ArtistName, The"
to "The ArtistName") if you want an accurate mapping.
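The two layers of mapping in that quote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the FIELD_MAP table, the normalize_artist and to_common_record helpers, and the sample records are all hypothetical, not taken from any real dataset or library.

```python
import re

# Hypothetical semantic mapping: dataset B's "band name" field
# means the same thing as dataset A's "artist" field.
FIELD_MAP = {"band name": "artist"}


def normalize_artist(name: str) -> str:
    """Syntactic mapping: rewrite "ArtistName, The" as "The ArtistName"."""
    match = re.fullmatch(r"\s*(.+?),\s*the\s*", name, flags=re.IGNORECASE)
    if match:
        return f"The {match.group(1).strip()}"
    return name.strip()


def to_common_record(record: dict) -> dict:
    """Rename fields to the shared vocabulary, then fix the value syntax."""
    common = {FIELD_MAP.get(key, key): value for key, value in record.items()}
    if "artist" in common:
        common["artist"] = normalize_artist(common["artist"])
    return common


record_a = to_common_record({"artist": "ArtistName, The"})
record_b = to_common_record({"band name": "The ArtistName"})
print(record_a == record_b)
```

Renaming the field is the easy half; every value format that differs between the two datasets needs its own rule like normalize_artist.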
This is the big problem with data mapping. In Jon's example, the location is called Colonial Theater in Upcoming and Colonial Theater (New Hampshire) in Eventful.
In Eventful it has a street address while in Upcoming only the street
name is provided. Little differences like these are what make data
mapping a hard problem. Jon's solution is for the community to come up
with global identifiers for venues as tags (e.g.
Colonial_Theater_NH_03431) instead of waiting for technologists to come
up with a solution. That's good advice because there really isn't a
good technological solution for this problem. Even RDF/Semantic Web
junkies like Danny Ayers in posts like Live clipboard and identifying things
start with assumptions like every venue has a unique identifier which
is its URI. Of course, this ignores the fact that coming up with a
global, unique identification scheme for the Web is the problem in the
first place. The problem with Jon's approach is the same one that is
pointed out in almost every critique of folksonomies: people won't use
the same tags for the same concept. Jon might use Colonial_Theater_NH_03431 while I use
Colonial_Theater_95_Maine_Street_NH_03431, which leaves us with the same
problem of inconsistent identifiers being used for the same venue.
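To make the mismatch concrete, here is a rough Python sketch using only the venue names and tags mentioned above. The venue_key and tag_tokens helpers are hypothetical heuristics I made up for illustration, not part of Live Clipboard, Upcoming, or Eventful.

```python
import re


def venue_key(name: str) -> str:
    """Lowercase the name, drop parenthetical qualifiers, collapse spaces."""
    without_parens = re.sub(r"\([^)]*\)", "", name)
    return " ".join(without_parens.lower().split())


def tag_tokens(tag: str) -> set:
    """Split an underscore-separated tag into a set of lowercase tokens."""
    return set(tag.lower().split("_"))


upcoming_name = "Colonial Theater"
eventful_name = "Colonial Theater (New Hampshire)"

# Exact comparison fails; the heuristic key happens to line up.
names_equal = upcoming_name == eventful_name
keys_equal = venue_key(upcoming_name) == venue_key(eventful_name)

jon_tag = "Colonial_Theater_NH_03431"
my_tag = "Colonial_Theater_95_Maine_Street_NH_03431"

# The tags differ as strings, but one is a token subset of the other.
tags_equal = jon_tag == my_tag
tags_overlap = tag_tokens(jon_tag) <= tag_tokens(my_tag)

print(names_equal, keys_equal, tags_equal, tags_overlap)
```

Heuristics like these work for this one pair and break easily: two different Colonial Theaters in different states would collapse to the same venue_key, and a tag that spells the venue name differently would defeat the subset check.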
I assume that for the near future we will continue to see custom code being
written to make data integration across domains work. Unfortunately, no
developments on the horizon look likely to make this problem go
away.
PS: Ray Ozzie covers some of the recent developments in the world of Live Clipboard in his post Wiring Progress. Check it out.