Dealing with the Data Access Impedance Mismatch
Thanks to Erik Meijer for pointing me to the article
"The Impedance Imperative Tuples + Objects +
Infosets = Too Much Stuff!" The team I work on deals
with data access technologies (relational, object,
XML aka ROX) so this impedance mismatch is something
that we have to rationalize all the time.
Up until quite recently the primary impedance
mismatch application developers had to deal with
was the Object<->Relational impedance mismatch.
Usually data was stored in a relational database
but primarily accessed, manipulated and transmitted
over the network as objects via some object
oriented programming language. Many felt (and still
feel) that this impedance mismatch is a significant
problem. Attempts to reduce this impedance mismatch
have led to technologies such as object oriented
databases and various object relational mapping
tools. These solutions take the point of view that
the problem of having developers deal with two
domains, or of having two sets of developers (DB
developers and application coders), is solved by
making everything look like a single domain:
objects. One could also argue that the flip side of
this is to push as much data manipulation as you can
into the database via technologies like stored
procedures, while mainly manipulating and
transmitting the data on the wire in objects that
closely model the relational database, such as the
.NET Framework's DataSet class.
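To make the Object<->Relational mismatch concrete, here is a minimal Python sketch using the standard library's sqlite3 module. The customers and orders tables, the Customer and Order classes and the load_customers helper are all hypothetical names made up for illustration; they are not taken from any particular mapping tool. The sketch shows the kind of glue a mapping layer has to provide: flat joined rows are folded back into nested objects.

```python
import sqlite3
from dataclasses import dataclass, field


# Hypothetical object model, invented for this example.
@dataclass
class Order:
    order_id: int
    total: float


@dataclass
class Customer:
    customer_id: int
    name: str
    orders: list[Order] = field(default_factory=list)


def load_customers(conn: sqlite3.Connection) -> list[Customer]:
    """Fold flat joined rows back into nested Customer/Order objects."""
    rows = conn.execute(
        "SELECT c.id, c.name, o.id, o.total "
        "FROM customers c LEFT JOIN orders o ON o.customer_id = c.id "
        "ORDER BY c.id, o.id"
    )
    by_id: dict[int, Customer] = {}
    for cust_id, name, order_id, total in rows:
        customer = by_id.setdefault(cust_id, Customer(cust_id, name))
        # The LEFT JOIN yields NULLs for customers that have no orders.
        if order_id is not None:
            customer.orders.append(Order(order_id, total))
    return list(by_id.values())


# Hypothetical relational schema; note the constraints declared here.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL CHECK (total >= 0)
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0);
    """
)
customers = load_customers(conn)
```

Note that the NOT NULL and CHECK constraints live only in the database schema. Nothing stops application code from constructing Order(12, -5.0), which is one small example of what gets lost when everything is made to look like objects.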
Recently a third player has appeared on the scene:
XML. It is becoming more common for data to be
stored in a relational database, mainly manipulated
as objects but transmitted on the wire as XML. Given
the previously stated impedance mismatch and the
fact that XML is mainly just a syntactic device, one
would then expect the XML being transmitted to be
serialized versions of objects, relational data or
some subset of both. However, what seems to be
happening is slightly more complicated. The software
world seems to be moving more towards using XML Web
Services built on standard technologies such as
HTTP, XML, SOAP and WSDL to transmit data between
applications. To quote the WSDL 1.1 W3C Note:
"WSDL recognizes the need for rich type systems for
describing message formats, and supports the XML
Schemas specification (XSD) [11] as its canonical
type system."
So this introduces a third type system into the
mix: W3C XML Schema structures and datatypes. W3C
XML Schema has a number of concepts that do not map
to concepts in either the object oriented or
relational models. To properly access and manipulate
XML typed using W3C XML Schema you need new data
access mechanisms such as XQuery. Now application
developers have to deal with three domains, or we
need three sets of developers.
The first instinct is to continue with the meme of
making everything look like objects, which is what a
number of XML Web Services toolkits do today,
including Microsoft's .NET Framework via its XML
Serialization technology. This tends to be
particularly lossy because object oriented systems
traditionally do not have the richness to describe
the constraints that can be expressed in a typical
relational database, let alone the even richer
constraints that are possible with W3C XML Schema.
Such object oriented systems must therefore evolve
to capture not only the semantics of the relational
model but those of the W3C XML Schema model as well.
Another approach could be to make everything look
like XML and use that as the primary data access
mechanism. Technologies already exist to make
relational databases look like XML and to make
objects look like XML. Unsurprisingly to those who
know me, this is the approach I favor.
The relational model can also be viewed as a
universal data access mechanism if one figures out
how to map the constraints of the W3C XML Schema
model onto it. The .NET Framework's DataSet already
does some translation of an XML structure defined in
a W3C XML Schema to a relational structure.
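As a rough illustration of making relational data look like XML, the following Python sketch renders the rows of a table as an element tree using only the standard library. The books table, the table_to_xml helper and the element-per-column layout are assumptions chosen for this example, not a description of how any particular product does it.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table, row_tag):
    """Render every row of `table` as an element with one child per column.

    Hypothetical helper: the table name is assumed to be trusted input.
    """
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    root = ET.Element(table)
    for row in cursor:
        row_el = ET.SubElement(root, row_tag)
        for column, value in zip(columns, row):
            if value is None:
                continue  # one possible convention: omit the element for SQL NULL
            ET.SubElement(row_el, column).text = str(value)
    return root


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT NOT NULL, price REAL);
    INSERT INTO books VALUES (1, 'XML in a Nutshell', 39.95), (2, 'Untitled', NULL);
    """
)
root = table_to_xml(conn, "books", "book")

# Once the data looks like XML it can be navigated with path expressions.
titles = [book.findtext("title") for book in root.findall("book")]
xml_text = ET.tostring(root, encoding="unicode")
```

Every value comes out as text in this sketch, so the column types and constraints are not carried along unless a schema is generated alongside the data.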
The problem with all three approaches I just
described is that they are somewhat lossy or
involve hacking one model into becoming the
uber-model. XML trees don't handle the graph
structures of objects well, objects can't handle
concepts like W3C XML Schema's derivation by
restriction, and so on. There is also a fourth
approach, which is endorsed by Erik Meijer in his
paper Unifying Tables, Objects, and Documents, where
one creates a new unified model which is a superset
of the pertinent features of the three existing
models. Of course, this involves introducing a
fourth model.
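The point about XML trees and object graphs can be sketched in a few lines of Python. The Employee class and both helper functions are hypothetical, and the id/manager attribute convention is just one possible workaround in the spirit of XML's ID/IDREF; none of it comes from a specific toolkit. Naive nesting fails as soon as two objects refer to each other, so the references have to be flattened into identifiers.

```python
import xml.etree.ElementTree as ET


class Employee:
    """Hypothetical class whose instances form a cyclic object graph."""

    def __init__(self, name):
        self.name = name
        self.manager = None  # reference to another Employee
        self.reports = []    # back-references from the manager


def to_nested_xml(emp, ancestors=()):
    """Naive mapping: embed every referenced object as a child element."""
    if any(a is emp for a in ancestors):
        raise ValueError(f"{emp.name} is its own ancestor; a tree cannot express this")
    el = ET.Element("employee", name=emp.name)
    for report in emp.reports:
        el.append(to_nested_xml(report, ancestors + (emp,)))
    if emp.manager is not None:
        manager_el = ET.SubElement(el, "manager")
        manager_el.append(to_nested_xml(emp.manager, ancestors + (emp,)))
    return el


def to_flat_xml(employees):
    """Workaround: a flat list where references become identifier attributes."""
    ids = {id(e): f"e{i}" for i, e in enumerate(employees, start=1)}
    root = ET.Element("employees")
    for e in employees:
        el = ET.SubElement(root, "employee", id=ids[id(e)], name=e.name)
        if e.manager is not None:
            el.set("manager", ids[id(e.manager)])
    return root


alice, bob = Employee("Alice"), Employee("Bob")
bob.manager = alice
alice.reports.append(bob)

try:
    to_nested_xml(alice)
    nesting_failed = False
except ValueError:
    nesting_failed = True

flat = to_flat_xml([alice, bob])
```

The flat form survives the round trip, but the references are now just strings that the consumer has to resolve, which is itself a small loss compared to the original object graph.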
If you are interested in which approach(es) we
decided to take on my team then you should be at
PDC.
[I had more to write but I'll be late for a meeting
if I keep this up]
Don't Get Too Excited
Fumiaki writes:
PDC is coming closer. We are all excited about
what will be shown there. But remember, PDC is
for future.
Anyone remember PDC 2000? The bits were still
young there. We used webserviceutil.exe and
DataSetCommand. VB.NET was not like the one we
use today. Knowledge we got from the PDC 2000 was
not useful in the real life 2000, and most of
2001, although today the knowledge is the
advantage for us. PDC 2003 will be the same.
...
So, I would like to ask speakers a favor. Please
tell us more of why you made it that way, than
what you made. We will eventually gather
information about the new bits from books, MSDN,
and so on. Attending PDC should be our advantage
because we will almost exclusively know why the
features are there, why smart people at Microsoft
decide its architecture that way. It is that kind
of knowledge that will be our real advantage.
That is why I am going, even it takes 10 hours to
L.A. from Japan.
I have to agree with him here. The bits you'll get
at PDC will most likely change before the final
versions ship. It will be anywhere from one to three
years before some of this stuff ships, which is a
long time in software development. There are already
a number of differences between the bits I own that
PDC folks will get and the code that is currently
checked in: no changes in logical functionality, but
class renamings, API refactorings and the like. As
time goes on I expect there to be more changes, so
the key thing of value folks should be trying to get
out of being at PDC is the main concepts and
functionality, rather than focusing on nitty gritty
issues about APIs (although we want your feedback if
something is broken) or specific details about
features. I was inspired to write the entry above by
Fumiaki's statement that the why is more important
than the what. I'll be helping with some PDC
presentations even though I won't be there, and will
make sure this at least permeates the stuff around
data access and XML.
Also, next month's issue of
XML Journal
should have an article by me which discusses some
of the thought process that went into the
improvements we made to some of the core XML APIs
in the .NET Framework.
Stock Options vs. Stock Grants
The guy over at
Corp Law
Blog has an entry entitled
Greatest IPO Ever where he links to a number of
posts he's made about
Microsoft's plan to replace options grants with
actual stock. Informative stuff if you are a
Microsoft employee or interested in the details of
stock options and the like.
--
Get yourself a News Aggregator and subscribe to my
RSS feed.
Disclaimer:
The above comments do not
represent the thoughts, intentions, plans or
strategies of my employer. They are solely my
opinion.