Tim Ewald has been blogging about ways to add versioning to web services that work around the various limitations of the W3C XML Schema Definition Language (XSD). One bit of insight I always like to share when talking about XSD is that there are two primary usage scenarios that have
developed around XML document validation and XML schemas. My article "XML Schema Design Patterns: Is Complex Type Derivation Unnecessary?" describes them as:
1. Describing and enforcing the contract between producers and
consumers of XML documents: An XML schema ordinarily serves as a
means for consumers and producers of XML to understand the structure of
the document being consumed or produced. Schemas are a fairly terse and
machine readable way to describe what constitutes a valid XML
document according to a particular XML vocabulary. Thus a schema can be
thought of as contract between the producer and consumer of an
XML document. Typically the consumer ensures that the XML document
being received from the producer conforms to the contract by validating
the received document against the schema.
2. Creating the basis for processing and storing typed data
represented as XML documents: XSD describes the
creation of a type annotated infoset as a consequence of document
validation against a schema. During validation against an XSD, an input
XML infoset is converted into a post schema validation infoset (PSVI),
which among other things contains type annotations. However practical
experience has shown that one does not need to perform full document
validation to create type annotated infosets; in general many
applications that use XML schemas to create strongly typed XML such as
XML<->object mapping technologies do not perform full document
validation, since a number of XSD features do not map to concepts in
the target domain.
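To make the second scenario concrete, here is a minimal Python sketch of an XML<->object mapper that produces typed values without validating the whole document. The Order class, the CONVERTERS table, and the sample document are all hypothetical stand-ins for what a real toolkit would derive from a schema; this is not how any particular vendor's toolkit is implemented.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class Order:
    # Hypothetical target type; the fields mirror what a schema
    # might declare as xs:int and xs:decimal.
    id: Optional[int] = None
    total: Optional[Decimal] = None


# Hypothetical element-name -> converter table standing in for the
# type declarations a toolkit would read from the schema.
CONVERTERS = {"id": int, "total": Decimal}


def map_order(xml_text: str) -> Order:
    """Build a typed Order from XML without checking element order,
    occurrence counts, or the presence of unknown elements."""
    root = ET.fromstring(xml_text)
    order = Order()
    for child in root:
        convert = CONVERTERS.get(child.tag)
        if convert is not None:  # unknown elements are skipped, not rejected
            setattr(order, child.tag, convert(child.text))
    return order


# Elements are out of order and <note> is not part of the type,
# yet the mapper still yields typed data.
order = map_order("<order><total>19.99</total><note>rush</note><id>7</id></order>")
```

A validator enforcing a strict element sequence would reject this input, while the mapper accepts it and still hands back an int and a Decimal.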
If you are building a SOAP-based XML Web service using the toolkits provided by major vendors like IBM, Microsoft or BEA, then it is most likely that your usage pattern aligns with scenario #2 above. This means that your Web service toolkit isn't completely enforcing that documents being consumed or generated by the service are actually 100% valid against the schema. This seems bad until you realize that XSD is so limited in the constraints it can describe that any XSD validation done would still need to be backed by a further business logic validation phase in your code. In his post "Making everything optional" Tim Ewald writes:
DJ commented on my post addressing the problem Raimond raised with my versioning strategy.
He wondered if he'd missed an earlier post where I argued that you not
use XSD to validate your data because if you make content optional, you
can't use it to check what has to be there. Since I haven't written
about that yet, I figured I'd start to address it now.
When people build a schema for a single service, they tend to make
it reflect the precise requirements of that system at that moment in
time. Then, when those requirements change, they revise the schema. The
result is a system that tends to be very brittle. If you take the same
approach when you design a schema for use by multiple systems,
describing a corporate level model for customer data for instance,
things are even worse. Some systems won't have all the required data.
They have to decide whether to (a) collect the data, (b) make up bogus
data, or (c) not adopt the common model. None of these are good
approaches.
To solve both these problems, I've started thinking about my schema not as the definition of what this system needs right now but as the definition of what the data should look like if it's present
instead. I move the actual checking for what has to be present inside
the system (either client or service) and implement it using either
code or a narrowed schema that is duplicate of the contract schema with
more constraints in place.
There are important lessons in Tim's posts which are unfortunately often learned the hard way. A document or message can have different required/optional fields depending on what part of the process you are in or even whether it is being used as input vs. output. It's hard to come up with one single schema definition for a common type across a system without resorting to "everything is optional" and then relying on code to do the specific business logic validation for whichever phase of the process you are in.
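As a rough illustration of that split, the Python sketch below keeps a shared type in which every field is optional and moves the "what has to be present" check into code that depends on the phase. The Customer type, the phase names, and the required-field sets are invented for this example and do not come from Tim's posts.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Customer:
    # Shared "contract" shape: it says what the data looks like
    # if present, and makes nothing mandatory.
    name: Optional[str] = None
    email: Optional[str] = None
    shipping_address: Optional[str] = None


# Hypothetical per-phase requirements, kept out of the shared type.
REQUIRED_BY_PHASE = {
    "signup": {"name", "email"},
    "checkout": {"name", "email", "shipping_address"},
}


def missing_fields(customer: Customer, phase: str) -> List[str]:
    """Return the fields this phase requires that the customer lacks."""
    required = REQUIRED_BY_PHASE[phase]
    return sorted(f for f in required if getattr(customer, f) is None)


customer = Customer(name="Ada", email="ada@example.com")
signup_gaps = missing_fields(customer, "signup")      # nothing missing
checkout_gaps = missing_fields(customer, "checkout")  # address missing
```

The same message passes at signup and fails at checkout, so the difference lives in the per-phase table rather than in the shared definition. The table could equally be replaced by a narrowed schema per phase, as Tim suggests.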
There's another great comment in Tim's follow-up post, "More on making everything optional":
I think it's important not to confuse your schema with your contract. A client and a service have to agree on all sorts of things, only some of which are captured in your WSDL/XSD(/Policy). My goal in proposing that almost everything in your XSD be optional is to find the sweet-spot between easy coding and flexibility for evolution.
Amen! Preach on brother.