Matevz Gacnik points out a "Serious bug in System.Xml.XmlValidatingReader". He writes:
The schema spec and especially RFC 2396 state that xs:anyURI instance can be empty, but System.Xml.XmlValidatingReader keeps failing on such an instance.
To reproduce the error use the following schema:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="AnyURI" type="xs:anyURI">
  </xs:element>
</xs:schema>
And this instance document:
<?xml version="1.0" encoding="UTF-8"?>
<AnyURI/>
There is currently no workaround for .NET FX 1.0/1.1. Actually Whidbey is the only patch that fixes this. :)
The schema validation engine in the .NET Framework uses the System.Uri class for parsing URIs. This class doesn't consider an empty string to be a valid URI, which is why our schema validation considers the above instance to be invalid according to its schema. However, the specs aren't clear cut on whether an empty string is a valid URI, at least not without a bunch of sleuthing. As Michael Kay (XSLT working group member) and C. M. Sperberg-McQueen (chairman of the XML Schema working group) wrote on XML-DEV:
To: Michael Kay <michael.h.kay@ntlworld.com>
Subject: RE: [xml-dev] Can anyURI be empty?
From: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>
Date: 07 Apr 2004 10:49:51 -0600
Cc: xml-dev@lists.xml.org

On Wed, 2004-04-07 at 03:47, Michael Kay wrote:
> > If it couldn't, it would be wrong. An empty string is a valid URI.
>
> On this, like so many other things, RFC 2396 is a total disaster. An empty
> string is not valid according to the BNF syntax, but the RFC gives detailed
> semantics for what it means (detailed semantics, though very imprecise
> semantics).
>
> And the schema REC doesn't help. It has the famous note saying that the
> definition places "only very modest obligations" on an implementation, and
> it doesn't say what those obligations are.

Yes. This is a direct result of our realization that
we have as much trouble understanding RFC 2396 as anyone
else. The anyURI type imposes the obligations of
RFC 2396, whatever those are. Any attempt to paraphrase
them on our part would lead, I fear, to an unsatisfactory
result: either we would make some mistake (like believing
that since the BNF does not accept the empty string,
it must not be legal) or we would make no mistakes. In
the one case, we'd be misleading our readers, and in
either case, we'd find ourselves mired in a never-ending
effort to prove that our paraphrase was, or was not,
correct.
RFC 2396 is one of the fundamental specifications of the World Wide Web, yet it is vague and contradictory in a number of key places. Those of us implementing standards often have to go on gut feel or try to track down the spec authors whenever we run into issues like this, but sometimes we miss them.
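To make the "detailed semantics" for the empty string a bit more concrete, here is a minimal sketch of how an empty URI reference is usually treated when it is resolved against a base URI: it refers back to the base document itself. The sketch uses Python's standard urllib.parse.urljoin purely as an illustration. It is not the System.Uri code path, the base URI and the resolve helper are hypothetical names made up for this example, and Python's resolver is only assumed to be a fair stand-in for the behaviour the RFC describes.

```python
from urllib.parse import urljoin

# Hypothetical base URI, invented for this illustration only.
BASE = "http://example.com/schemas/doc.xml"


def resolve(reference: str, base_uri: str = BASE) -> str:
    """Resolve a (possibly empty) URI reference against a base URI."""
    return urljoin(base_uri, reference)


# An empty reference resolves to the base document itself,
# so it carries a meaning even though it contains no characters.
print(resolve(""))

# A non-empty relative reference replaces the last path segment.
print(resolve("other.xml"))
```

Under this reading an empty xs:anyURI value is a meaningful reference rather than a malformed one, which is the interpretation Michael Kay and C. M. Sperberg-McQueen lean towards in the thread above.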
All I can do is apologize to people like Matevz Gacnik, who have to bear the brunt of the interoperability problems caused by vaguely written specifications as implemented on our platform, and for the fact that a fix for this problem won't be available until Whidbey.