This draft describes the SiXDML language and application programming interface. It is a work in progress and is available for review by all interested parties. The document is fairly stable; it may be updated as new ideas are discovered and old ones discarded, but it is unlikely that any of the changes will be drastic.
The need for a single mechanism that lets programmers and regular users alike manipulate documents and collections is highlighted by the fact that most native XML databases and XML-enabled databases currently provide multiple means to perform sets of interrelated tasks that can be done via a single mechanism in a relational or hierarchical database. It is not uncommon to see a native XML database that uses XPath for querying, an XML-based language or XML Document Object Model (DOM) interaction for updates, and a graphical or command line interface for managing documents and collections.
SiXDML was designed to create a common syntax and semantics for performing the tasks most often required of XML repositories. SiXDML consists of two parts: a data definition and manipulation language inspired by SQL, and an application programming interface based on the XML:DB Database API.
The SiXDML language is designed to be easily understood by programmers and non-programmers alike, to be internet-aware, and to combine a minimum of complexity with a maximum of functionality while being straightforward to implement. SiXDML aims to fill the gap that currently exists for a data manipulation language that allows one to query and update XML documents while also providing a means to perform database management activities. Numerous query languages exist for XML, including Lorel, Quilt, UnQL, XDuce, XML-QL, XPath, XQL, XQuery and YaTL, yet no popularly accepted XML query language possesses data manipulation semantics such as DELETE, REPLACE and INSERT that exist in a relational data manipulation language like SQL or a hierarchical data manipulation language like DL/I.
XQuery, which seems set to become the standard query language for XML repositories, does not have update semantics in the December 2001 draft, although some investigation into how to add updates to XQuery has been carried out by researchers like Patrick Lehti and vendors like Microsoft and Software AG. The XQuery working group has hinted that a version of XQuery that contains updates is a year or more away. However, proposals for a language that also allows one to manage indices, collections and schemas in an XML repository in the same manner one can manage indexes, triggers and relations in a relational database with SQL have not been forthcoming.
This section describes some concepts and terminology from the realm of XML databases that form part of the framework upon which SiXDML is built.
A document is considered to be a well-formed and optionally valid XML document that can be stored in an XML repository.
A collection contains a group of XML documents which may optionally conform to a single schema and can be queried and updated simultaneously by a single SiXDML statement. To SiXDML, a collection is akin to a directory on a filesystem that contains XML documents. A homogeneous collection whose documents conform to one or more schemata can be considered to be akin to a row in a relational table.
A schema is a document that is used to specify constraints on the XML documents in a particular collection. SiXDML gives no special consideration to any specific schema technology and instead relies upon implementations to specify which schema technologies are used or supported. Thus DTDs, W3C XML Schema, XDR files and RELAX NG documents can all be supported by an XML repository that uses SiXDML.
In many XML repositories, query performance can be improved by specifying certain elements, attributes or access patterns that should be indexed. Different repositories have different indexing capabilities; some like eXcelon's XIS allow one to index each individual word in an element (via a text index), the text value of each element (via a value index), or the expected node structure that will be returned by the query (via a structure index).
The design of SiXDML takes into consideration the fact that many different indexing techniques for XML data exist.
An XML repository is a storage system, preferably one that supports ACID transactions, that manages XML documents and controls access to them. Native XML databases, XML content management systems and XML-enabled databases are all examples of XML repositories.
The query processor accepts SiXDML syntax, parses it and executes the query against the XML repository. It is expected that the query processor is capable of performing optimizations to improve performance of queries before executing them.
SiXDML is designed with familiarity in mind and thus borrows heavily from the syntax of SQL and DL/I for its keywords. Similarly, SiXDML uses XPath for performing queries due to its widespread popularity and relative simplicity. Since one of the goals of SiXDML is to provide a language that is easily understood by programmers and non-programmers alike, programming constructs like variable declarations, bound variables, functions, and nested queries have been avoided. Because XPath is used for queries and the results of a SELECT statement can be piped to a filter such as an XSLT stylesheet or an XQuery expression, the lack of sophisticated programming constructs in SiXDML has minimal effect. An XML syntax for SiXDML, similar to that used for XUpdate, was considered and discarded because it would have made the language unnecessarily verbose and unfamiliar to its adopters.
CreateStatement | ::= | 'CREATE' CreateCollectionClause |
CreateCollectionClause | ::= | CollectionClause ( 'CONSTRAINED BY' URL )? |
The above command creates a collection with the specified path and optionally specifies the schema that will be used to constrain the documents in the collection. The schema can be located over the internet or on a local filesystem.
CREATE COLLECTION bookstore
CREATE COLLECTION bookstore/out_of_stock
CREATE COLLECTION bookstore CONSTRAINED BY http://www.25hoursaday.com/books.xsd
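The production above can be read as a keyword, a slash-separated collection path, and an optional schema URL. The following is a minimal sketch of how a client might recognise such a statement, assuming a simplified notion of Nmtoken (word characters, '.', '-' and ':'). The regular expression and the function name parse_create_collection are illustrative assumptions and not part of SiXDML.

```python
import re

# Hypothetical pattern for:
#   'CREATE' 'COLLECTION' CollectionPath ( 'CONSTRAINED BY' URL )?
# The character class is a simplification of the XML Nmtoken production.
CREATE_COLLECTION = re.compile(
    r"^CREATE\s+COLLECTION\s+(?P<path>[\w.\-:]+(?:/[\w.\-:]+)*)"
    r"(?:\s+CONSTRAINED\s+BY\s+(?P<schema>\S+))?\s*$"
)


def parse_create_collection(statement):
    """Return (collection_path, schema_url_or_None) or raise ValueError."""
    match = CREATE_COLLECTION.match(statement.strip())
    if match is None:
        raise ValueError("not a CREATE COLLECTION statement: %r" % statement)
    return match.group("path"), match.group("schema")
```

Applied to the three examples above, the sketch yields the collection path and, for the last one, the URL of the constraining schema.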
ShowStatement | ::= | 'SHOW' CollectionClause |
Lists the contents of the collection, which should typically be documents and/or subcollections.
The contents of the collection should either be displayed as XML that conforms to the W3C
XML schema below or as text that contains at least the same amount of information as would be
shown in the XML version.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns="xmlns://www.sixdml.org/2002/04/language/"
elementFormDefault="qualified"
targetNamespace="xmlns://www.sixdml.org/2002/04/language/">
<xs:element name="collection-contents" type="collectionContentsType"/>
<xs:complexType name="collectionContentsType">
<xs:choice maxOccurs="unbounded">
<xs:element name="document" type="namedItem"/>
<xs:element name="collection" type="namedItem"/>
</xs:choice>
</xs:complexType>
<xs:complexType name="namedItem">
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="name" type="xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:schema>
SHOW COLLECTION bookstore
SHOW COLLECTION /
SHOW COLLECTION records/april95
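As an illustration of the XML form of the output, the sketch below builds a collection-contents element from two lists of names using Python's standard xml.etree.ElementTree module. The function name and the choice to list subcollections before documents are assumptions; the schema above does not prescribe an order.

```python
import xml.etree.ElementTree as ET

# Target namespace taken from the schema above.
SIXDML_NS = "xmlns://www.sixdml.org/2002/04/language/"


def collection_contents(documents, subcollections):
    """Serialise a collection listing as a collection-contents element."""
    root = ET.Element("{%s}collection-contents" % SIXDML_NS)
    for name in subcollections:
        ET.SubElement(root, "{%s}collection" % SIXDML_NS, {"name": name})
    for name in documents:
        ET.SubElement(root, "{%s}document" % SIXDML_NS, {"name": name})
    # ElementTree picks its own namespace prefix when serialising.
    return ET.tostring(root, encoding="unicode")
```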
DropStatement | ::= | 'DROP' CollectionClause |
Deletes a collection and all its children from the repository.
DROP COLLECTION bookstore
InsertStatement | ::= | 'INSERT' ( InsertXmlClause | InsertDocumentClause) |
InsertDocumentClause | ::= | URL 'INTO' CollectionClause |
InsertXmlClause | ::= | InlineXml 'NAMED' DocumentName 'INTO' CollectionClause |
Adds a new document to the specified collection. If there is a schema constraining this collection, the document must be valid against the schema for this operation to succeed.
INSERT http://www.25hoursaday.com/books.xml INTO COLLECTION bookstore
INSERT{
<bk:bookstore xmlns:bk="urn:bookstore-schema">
<bk:book genre="Fantasy">
<bk:title>The Hobbit</bk:title>
<bk:author>
<bk:first-name>J.R.R.</bk:first-name>
<bk:last-name>Tolkien</bk:last-name>
</bk:author>
<bk:price>14.99</bk:price>
</bk:book>
</bk:bookstore>
}
NAMED fantasy-books.xml INTO COLLECTION bookstore;
DropStatement | ::= | 'DROP' DocumentPath 'FROM' CollectionClause |
Removes a document from the collection.
DROP penguin_books.xml FROM COLLECTION bookstore/out_of_stock
ConstrainStatement | ::= | 'CONSTRAIN' CollectionClause 'WITH' URL |
Specifies the schema that will be used to constrain a collection. Every document in the collection must validate against the schema or the operation fails. If a schema already constrains this collection, it must be replaced by the new schema, provided all documents validate against the new schema.
CONSTRAIN COLLECTION bookstore WITH http://www.25hoursaday.com/books.xsd
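The all-or-nothing behaviour described above can be sketched as follows. This is only an illustration: the dictionary layout of the collection and the is_valid callable are hypothetical stand-ins for whatever storage and schema validator an implementation provides.

```python
def constrain(collection, schema_url, is_valid):
    """Attach schema_url to the collection only if every document validates.

    collection: dict with "documents" (name -> XML text) and "schema".
    is_valid:   hypothetical callable (xml_text, schema_url) -> bool.
    """
    invalid = [
        name
        for name, text in sorted(collection["documents"].items())
        if not is_valid(text, schema_url)
    ]
    if invalid:
        # Nothing has been modified yet, so the old schema stays in place.
        raise ValueError(
            "documents not valid against %s: %s" % (schema_url, ", ".join(invalid))
        )
    collection["schema"] = schema_url
```

Checking every document before assigning the schema keeps the previous constraint intact when the operation fails.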
DropStatement | ::= | 'DROP CONSTRAINTS ON' CollectionClause |
Deletes the schema that constrains the contents of a particular collection.
DROP CONSTRAINTS ON COLLECTION bookstore
ShowStatement | ::= | 'SHOW CONSTRAINTS ON' CollectionClause |
Displays the schema for the specified collection, if any.
SHOW CONSTRAINTS ON COLLECTION bookstore
CreateStatement | ::= | 'CREATE' CreateIndexClause |
CreateIndexClause | ::= | IndexClause 'OF TYPE' Nmtoken 'WITH KEY =' XPathExpression (',' KeyValuePair)* 'ON ' CollectionClause |
Creates an index on a particular attribute or element in a document. The key is the value
that will be searched against, while the optional list of key-value pairs contains
parameters that are dependent on the type of index. The XPath expressions are subsets of
XPath 1.0.
NOTE: Index names should not have to be unique across collections. However, allowing this may be
an onerous burden on some implementations, so it is not mandatory.
CREATE INDEX author-lastname-idx OF TYPE STRUCTURE_INDEX WITH
KEY=/author/last-name, ELEMENT=/author ON COLLECTION bookstore ;
CREATE INDEX person-name-idx OF TYPE VALUE_INDEX WITH KEY=/document/person/@name
ON COLLECTION bookstore ;
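To make the idea of a value index concrete, here is a minimal in-memory sketch that maps each key value to the set of documents containing it. It is not how any particular repository implements indexing. The key is split into an element path and an attribute name because Python's ElementTree supports only a subset of XPath; the function name and parameters are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_value_index(documents, element_path, attribute):
    """Map each attribute value to the names of the documents that contain it.

    documents:    dict of document name -> XML text.
    element_path: ElementTree path relative to the document element.
    attribute:    name of the attribute whose value is the index key.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        root = ET.fromstring(text)
        for element in root.findall(element_path):
            value = element.get(attribute)
            if value is not None:
                index[value].add(name)
    return index
```

For a key such as /document/person/@name, the sketch would be called with the path "./person" and the attribute "name".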
DropStatement | ::= | 'DROP' IndexClause 'FROM' CollectionClause |
Removes the specified index.
DROP INDEX val-index FROM COLLECTION bookstore
ShowStatement | ::= | 'SHOW' IndexClause 'ON' CollectionClause |
Lists the name, type, and index fields of an index that applies to a particular
collection.
The index on the collection should either be displayed as XML that conforms to the W3C
XML schema below or as text that contains at least the same amount of information as would
be shown in the XML version.
<xs:schema targetNamespace="xmlns://www.sixdml.org/2002/04/language/"
xmlns:sixdml="xmlns://www.sixdml.org/2002/04/language/"
elementFormDefault="qualified"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="index">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string" />
<xs:element name="type" type="xs:string" />
<xs:element name="key" type="xs:string" />
<xs:any namespace="##any" processContents="lax" minOccurs="0"
maxOccurs="unbounded" />
</xs:sequence>
<xs:attribute name="collection" type="xs:string" />
</xs:complexType>
</xs:element>
</xs:schema>
SHOW INDEX book-title-idx ON COLLECTION bookstore
ShowStatement | ::= | 'SHOW INDEXES ON' CollectionClause |
Lists the name, type, and index fields of every index that applies to a particular
collection.
The indices on the collection should either be displayed as XML that conforms to the W3C
XML schema below or as text that contains at least the same amount of information as would
be shown in the XML version.
<xs:schema targetNamespace="xmlns://www.sixdml.org/2002/04/language/"
xmlns:sixdml="xmlns://www.sixdml.org/2002/04/language/"
elementFormDefault="qualified"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="indexes">
<xs:complexType>
<xs:sequence>
<xs:element name="index" minOccurs="0" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string" />
<xs:element name="type" type="xs:string" />
<xs:element name="key" type="xs:string" />
<xs:any namespace="##any" processContents="lax" minOccurs="0" maxOccurs="unbounded" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="collection" type="xs:string" />
</xs:complexType>
</xs:element>
</xs:schema>
SHOW INDEXES ON COLLECTION bookstore
NamespaceDecl | ::= | 'NAMESPACE' NCName '=' URI |
A namespace declaration defines a namespace prefix and associates it with a namespace URI. The namespace URI must be a valid URI, and must not be an empty string. The namespace declaration must remain in scope for the rest of the SiXDML query or until the prefix is paired with another namespace. For interactive sessions, implementations may want to extend the lifetime of a namespace declaration beyond a single query. The lifetime of namespace declarations outside of a query's scope is implementation-dependent.
NAMESPACE xsl = "http://www.w3.org/1999/XSL/Transform"
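The scoping rule can be sketched as a table of prefix bindings that later declarations overwrite. The example below assumes Python's xml.etree.ElementTree, whose findall method accepts such a prefix-to-URI mapping; the NamespaceScope class is a hypothetical helper and not part of SiXDML.

```python
import xml.etree.ElementTree as ET


class NamespaceScope:
    """Prefix bindings that stay in force until the prefix is redeclared."""

    def __init__(self):
        self.bindings = {}

    def declare(self, prefix, uri):
        if not uri:
            raise ValueError("the namespace URI must not be an empty string")
        self.bindings[prefix] = uri  # a later declaration replaces an earlier one


scope = NamespaceScope()
scope.declare("bk", "urn:bookstore-schema")

document = ET.fromstring(
    '<bk:bookstore xmlns:bk="urn:bookstore-schema">'
    "<bk:book><bk:title>The Hobbit</bk:title></bk:book>"
    "</bk:bookstore>"
)
# The prefix in the path is resolved through the declared bindings.
titles = [node.text for node in document.findall(".//bk:title", scope.bindings)]
```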
SelectStatement | ::= | 'SELECT' XPathExpression 'FROM' (CollectionClause (WhereClause)? | DocumentPath) ( 'AND TRANSFORM WITH' ('XSLT' | 'XQUERY' | 'XPATH' | Nmtoken) 'IN' (URL | InlineXml) )? ('USING ROOT =' Nmtoken)? |
Executes the XPath query on the document or collection and optionally sends the results to a query or stylesheet, declared inline or referenced by URL, for further processing. The results of executing the query (before any additional processing) are returned as XML that conforms to the following schema:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns="http://www.25hoursaday.com/sixdml/"
elementFormDefault="qualified"
targetNamespace="http://www.25hoursaday.com/sixdml/">
<xs:element name="query-results" type="queryResultsType"/>
<xs:complexType name="queryResultsType">
<xs:sequence maxOccurs="unbounded">
<xs:element name="query-result" type="queryResultType"/>
</xs:sequence>
<xs:attribute name="collection-name" type="xs:string"/>
<xs:attribute name="query" type="xs:string" use="required" />
<xs:attribute name="predicate" type="xs:string" use="optional" />
</xs:complexType>
<xs:complexType name="queryResultType" mixed="true">
<xs:sequence>
<xs:any minOccurs="0" maxOccurs="unbounded"
processContents="skip"/>
</xs:sequence>
<xs:attribute name="resource-name" type="xs:string"/>
</xs:complexType>
</xs:schema>
SELECT //tr FROM auctiondata/sort-bidder-.xsl
NAMESPACE bk = "urn:my-bookstore"
SELECT //bk:title FROM COLLECTION bookstore
NAMESPACE bk = "urn:my-bookstore"
SELECT count(//bk:book) FROM bookstore/my-books.xml USING ROOT=total-num-books
NAMESPACE bk = "urn:my-bookstore"
SELECT count(//bk:book) FROM COLLECTION bookstore WHERE count(//bk:book) > 0
SELECT //TITLE[text()='Sandstone'] FROM
auctiondata/auction1.xml AND TRANSFORM WITH XSLT
IN{
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<HTML>
<HEAD><TITLE><xsl:value-of select='/document/title'/></TITLE></HEAD>
<BODY><xsl:apply-templates/></BODY>
</HTML>
</xsl:template>
<xsl:template match="TITLE">
<H1><xsl:apply-templates/></H1>
</xsl:template>
</xsl:stylesheet>
}
SELECT //TITLE[text()='Sandstone'] FROM auctiondata/auction1.xml
AND TRANSFORM WITH XSLT IN http://www.25hoursaday.com/title.xsl
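The untransformed result format can be illustrated with a short sketch that runs a query over every document in a collection and wraps the matches of each document in a query-result element. It assumes Python's xml.etree.ElementTree, which implements only a subset of XPath 1.0, so the query must be written in that subset (for example ".//bk:title" rather than "//bk:title"). The select function and its parameters are hypothetical.

```python
import xml.etree.ElementTree as ET

# Target namespace of the query-results schema above.
RESULT_NS = "http://www.25hoursaday.com/sixdml/"


def select(collection_name, documents, query, namespaces=None):
    """Run query on each document and wrap the matches in query-results.

    documents: dict of document name -> XML text.
    """
    results = ET.Element(
        "{%s}query-results" % RESULT_NS,
        {"collection-name": collection_name, "query": query},
    )
    for name in sorted(documents):
        root = ET.fromstring(documents[name])
        matches = root.findall(query, namespaces)
        if not matches:
            continue  # documents without matches contribute no query-result
        result = ET.SubElement(
            results, "{%s}query-result" % RESULT_NS, {"resource-name": name}
        )
        result.extend(matches)
    return results
```

Each query-result carries the name of the document it came from, so results from a whole collection remain attributable to individual documents.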
InsertStatement | ::= | 'INSERT' InsertXmlClause |
InsertXmlClause | ::= | InlineXml 'BEFORE' XPathExpression 'IN' (CollectionClause (WhereClause)? | DocumentPath) |
Inserts an XML document fragment, processing instruction, comment, text, CDATA section or any combination thereof before all nodes that match the expression. The data to be inserted is declared inline in regular XML syntax. The XPath expression is an XPath 1.0 expression that must result in one or more comment, processing-instruction, text or element nodes to be valid. The operation must fail if the target node is the document root and an attempt is made to insert anything besides comments or processing instructions.
INSERT{
<!-- The following element is a book -->
} BEFORE //bk:book IN bookstore/books.xml
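The effect of the example above on a single document can be sketched with Python's xml.etree.ElementTree. The sketch handles only the narrow case of inserting one comment before every descendant element with a given tag; the function name is an assumption, and the document-root restriction described above is not modelled.

```python
import xml.etree.ElementTree as ET


def insert_comment_before(root, tag, text):
    """Insert a comment before every descendant element of root named tag."""
    inserted = 0
    # Snapshot the tree first so that newly inserted nodes are not revisited.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                # Look the position up again because earlier insertions
                # shift the indexes of the following siblings.
                parent.insert(list(parent).index(child), ET.Comment(text))
                inserted += 1
    return inserted
```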
InsertStatement | ::= | 'INSERT' InsertXmlClause |
InsertXmlClause | ::= | InlineXml 'INTO' XPathExpression 'IN' (CollectionClause (WhereClause)? | DocumentPath) |
Inserts an XML document fragment, processing instruction, comment, text, CDATA section or any combination thereof as a child of each node that matches the expression. The data to be inserted is declared inline in regular XML syntax. The XPath expression is an XPath 1.0 expression; the operation must fail if the results of the query are not one or more element nodes.
INSERT{
<bk:book xmlns:bk="urn:bookstore-schema" genre="Fantasy/Comedy">
<bk:title>Men At Arms</bk:title>
<bk:author>
<bk:first-name>Terry</bk:first-name>
<bk:last-name>Pratchett</bk:last-name>
</bk:author>
<bk:price>8.99</bk:price>
</bk:book>
} INTO //bk:bookstore IN COLLECTION bookstore
InsertStatement | ::= | 'INSERT' InsertXmlClause |
InsertXmlClause | ::= | InlineXml 'AFTER' XPathExpression 'IN' (CollectionClause (WhereClause)? | DocumentPath) |
Inserts an XML document fragment, processing instruction, comment, text, CDATA section or any combination thereof after all nodes that match the expression. The data to be inserted is declared inline in regular XML syntax. The XPath expression is an XPath 1.0 expression that must result in one or more comment, processing-instruction, text or element nodes to be valid. The operation must fail if the target node is the document root and an attempt is made to insert anything besides comments or processing instructions.
INSERT{
<bk:book xmlns:bk="urn:bookstore-schema" genre="Fantasy/Comedy">
<bk:title>Interesting Times</bk:title>
<bk:author>
<bk:first-name>Terry</bk:first-name>
<bk:last-name>Pratchett</bk:last-name>
</bk:author>
<bk:price>8.99</bk:price>
</bk:book>
} AFTER /bk:bookstore/bk:book[1] IN COLLECTION bookstore
InsertStatement | ::= | 'INSERT' InsertAttributeClause |
InsertAttributeClause | ::= | 'ATTRIBUTE WITH NAME =' QName ', VALUE=' AttValue 'INTO' XPathExpression 'IN' (CollectionClause (WhereClause)? | DocumentPath) |
Inserts an attribute at the specified position for all nodes that match the XPath expression. If a namespace URI is provided then it is expected that the name provided is an XML qualified name. If an attribute with the same name already exists at that position then its value is overwritten. The XPath expression is an XPath 1.0 expression that must result in one or more element nodes to be valid.
INSERT ATTRIBUTE WITH NAME="location", VALUE="Atlanta" INTO //book IN
bookstore/fantasy-books.xml
NAMESPACE bk = "urn:bookstore-schema"
NAMESPACE inv = "urn:inventory-schema"
INSERT ATTRIBUTE WITH NAME="inv:status", VALUE="available"
INTO //bk:book IN COLLECTION bookstore
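A sketch of the attribute insertion on a single parsed document follows, assuming Python's xml.etree.ElementTree and its "{uri}local" notation for qualified names. The insert_attribute function is hypothetical; the prefix of the attribute name is resolved through the same prefix-to-URI mapping that NAMESPACE declarations would build up.

```python
import xml.etree.ElementTree as ET


def insert_attribute(root, path, qname, value, namespaces):
    """Set the attribute qname on every element matched by path."""
    prefix, _, local = qname.rpartition(":")
    # ElementTree spells a qualified name as "{namespace-uri}local-name".
    name = "{%s}%s" % (namespaces[prefix], local) if prefix else local
    targets = root.findall(path, namespaces)
    if not targets:
        raise ValueError("the expression matched no element nodes")
    for element in targets:
        element.set(name, value)  # overwrites an existing attribute of that name
    return len(targets)
```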
DeleteStatement | ::= | 'DELETE ' XPathExpression 'FROM' (CollectionClause (WhereClause)? | DocumentPath) |
Removes all nodes that match the XPath expression.
DELETE //comment() FROM COLLECTION bookstore
DELETE /bk:bookstore/bk:book/bk:price FROM COLLECTION bookstore
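For element nodes, the deletion can be sketched as below. ElementTree elements do not record their parent, so the sketch first builds a child-to-parent map; the function name is an assumption, comment nodes are not handled, and an expression that selects the document element itself is not supported.

```python
import xml.etree.ElementTree as ET


def delete_elements(root, path, namespaces=None):
    """Remove every descendant element of root matched by path."""
    # Map each element to its parent, since ElementTree keeps no back pointers.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    targets = root.findall(path, namespaces)
    for element in targets:
        parent_of[element].remove(element)
    return len(targets)
```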
ReplaceStatement | ::= | 'REPLACE' XPathExpression 'WITH' InlineXml 'IN' (CollectionClause (WhereClause)? | DocumentPath) |
Replaces one or more element, processing-instruction, comment, or text nodes with one or more of the same. The data to be inserted is declared inline in regular XML syntax. The XPath expression is an XPath 1.0 expression that must result in one or more comment, processing-instruction, text or element nodes to be valid.
REPLACE //bk:price WITH{
<bk:price xmlns:bk="urn:bookstore-schema">Unavailable</bk:price>
<![CDATA[ Prices are being reduced by 20% or 25% if > $10 ]]>
<bk:rating xmlns:bk="urn:bookstore-schema" >9/10</bk:rating>
} IN bookstore/books.xml
RenameStatement | ::= | 'RENAME ' XPathExpression 'TO' QName 'IN' (CollectionClause (WhereClause)? | DocumentPath) |
This alters the name of the node(s) returned by the XPath expression to the one specified. The XPath expression is an XPath 1.0 expression that must result in one or more attribute or element nodes to be valid.
NAMESPACE bk = "urn:bookstore-schema"
RENAME //bk:bookstore TO library IN bookstore/books.xml
NAMESPACE bk = "urn:bookstore-schema"
RENAME //library TO bk:bookstore IN bookstore/books.xml
RENAME //@genre TO section IN COLLECTION bookstore
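Renaming elements can be sketched in a few lines with Python's xml.etree.ElementTree, where an element's name is simply its tag attribute. The rename_elements function is hypothetical and covers element nodes only; renaming attributes, as in the last example above, would need separate handling.

```python
import xml.etree.ElementTree as ET


def rename_elements(root, path, new_name, namespaces=None):
    """Give every element matched by path the name new_name.

    new_name uses ElementTree's "{namespace-uri}local-name" form when qualified.
    """
    targets = root.findall(path, namespaces)
    if not targets:
        raise ValueError("the expression matched no element nodes")
    for element in targets:
        element.tag = new_name
    return len(targets)
```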
KNOWN ISSUE 1: The ability to perform joins and combine the
results of one or more queries on different documents is a feature that some would like but there
is difficulty in determining what syntax or semantics to use for specifying joins in SiXDML. A
suggestion by Kimbro Staken involves performing queries like
SELECT //node FROM /some/collection AS $x, //other-node FROM
/some/other/collection as $y WHERE $x/value = $y/value.
The problem is the results end up being two trees with no clean way to merge them. Research into
alternate syntax and semantics for joins is ongoing and suggestions would be welcome.
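The difficulty can be seen in a small sketch of a value-equality join over two node lists. It assumes, purely for illustration, that each node has a child element named value and that the two lists come from separate queries; the function is hypothetical. The natural result is a list of node pairs rather than a single tree, which is the merging problem described above.

```python
import xml.etree.ElementTree as ET


def join_on_value(left_nodes, right_nodes):
    """Pair nodes from the two lists whose value children have equal text."""
    by_value = {}
    for node in right_nodes:
        key = node.findtext("value")
        if key is not None:
            by_value.setdefault(key, []).append(node)
    # The result is a list of (left, right) pairs, not one merged tree.
    return [
        (left, right)
        for left in left_nodes
        for right in by_value.get(left.findtext("value"), [])
    ]
```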
SixdmlQuery | ::= | (SixdmlStatement)+ |
SixdmlStatement | ::= | (NamespaceDecl)* ( CreateStatement | DropStatement | ConstrainStatement | InsertStatement | ShowStatement | SelectStatement | DeleteStatement | ReplaceStatement | RenameStatement) |
NamespaceDecl | ::= | 'NAMESPACE' NCName '=' URI |
CreateStatement | ::= | 'CREATE' ( CreateCollectionClause | CreateIndexClause ) |
CreateCollectionClause | ::= | CollectionClause ( 'CONSTRAINED BY' URL )? |
CreateIndexClause | ::= | IndexClause 'OF TYPE' Nmtoken 'WITH KEY =' XPathExpression (',' KeyValuePair)* 'ON ' CollectionClause |
KeyValuePair | ::= | Nmtoken '=' (Char)+ |
DropStatement | ::= | 'DROP' ( CollectionClause | ( ( (DocumentPath | IndexClause) 'FROM' ) | 'CONSTRAINTS ON' ) CollectionClause ) |
ConstrainStatement | ::= | 'CONSTRAIN' CollectionClause 'WITH' URL |
InsertStatement | ::= | 'INSERT' ( InsertXmlClause | InsertDocumentClause | InsertAttributeClause) |
InsertDocumentClause | ::= | URL 'INTO' CollectionClause |
InsertXmlClause | ::= | InlineXml ( ( 'NAMED' DocumentName 'INTO' CollectionClause ) | ( ( 'BEFORE' | 'INTO' | 'AFTER' ) XPathExpression 'IN' (CollectionClause (WhereClause)? | DocumentPath) ) ) |
InsertAttributeClause | ::= | 'ATTRIBUTE WITH NAME =' QName ', VALUE=' AttValue 'INTO' XPathExpression 'IN' (CollectionClause (WhereClause)? | DocumentPath) |
ShowStatement | ::= | 'SHOW' ( (IndexClause | 'CONSTRAINTS' | 'INDEXES') 'ON')? CollectionClause |
SelectStatement | ::= | 'SELECT' XPathExpression 'FROM' (CollectionClause (WhereClause)? | DocumentPath) ( 'AND TRANSFORM WITH' ('XSLT' | 'XQUERY' | 'XPATH' | Nmtoken) 'IN' (URL | InlineXml) )? ('USING ROOT =' Nmtoken)? |
RenameStatement | ::= | 'RENAME ' XPathExpression 'TO' QName ( 'WITH NSURI =' NSName )? 'IN' (CollectionClause (WhereClause)? | DocumentPath) |
DeleteStatement | ::= | 'DELETE ' XPathExpression 'FROM' (CollectionClause (WhereClause)? | DocumentPath) |
ReplaceStatement | ::= | 'REPLACE' XPathExpression 'WITH' InlineXml 'IN' (CollectionClause (WhereClause)? | DocumentPath) |
IndexClause | ::= | 'INDEX' Nmtoken |
CollectionClause | ::= | 'COLLECTION' CollectionPath |
WhereClause | ::= | 'WHERE' XPathExpression |
CollectionPath | ::= | Nmtoken ( '/' Nmtoken )* |
DocumentPath | ::= | (CollectionPath '/' )? DocumentName |
DocumentName | ::= | Nmtoken '.' Nmtoken |
InlineXml | ::= | '{' ( document | ( Comment | PI | CDSect | Element | CharData)+ ) '}' |
The following people contributed ideas that have gone into the design of SiXDML: Kimbro Staken, Tom Bradford, Mike Champion, Michael Rys, Sanjay Bhatia, Michael Brundage and Dr. Shamkant B. Navathe.