Introduction
XML documents can be validated against a schema (W3C XML Schema, Relax NG, DTD) to check whether their structure and syntax are correct. These schema languages are, however, strongly limited in their ability to express semantic properties. CLiXML (Constraint Language in XML) complements the syntax-oriented schema languages by allowing semantic properties to be described. CLiXML combines XML and XPath into a first-order language. OpenCLiXML is an open source Java implementation of the freely available CLiXML specification from Systemwire.
Why use CLiXML?
CLiXML is not the only language for defining semantic properties of XML documents. OASIS CAM and Schematron are the two most popular ones, but they have the following drawbacks, which CLiXML corrects:
- Missing hierarchical structure of tests. The whole complexity lies within a single test attribute of the assert element. CLiXML uses a deep element structure in which the attribute expressions are less complex and dispersed over a set of elements. This makes the schema less error-prone and easier for other applications to process.
- Lacking recursive expressions. CLiXML has Macros that can be used recursively.
- Lacking predicators. CLiXML has predicators over which the validator can iterate.
Examples
Example 1
Rule: Each child element must have a higher id attribute than its parent.
XML Schema can describe that elements have a unique id attribute, but it cannot describe constraints that relate the id attribute values of child and parent elements.
A valid document looks like this:
<element id="1">
<element id="2"/>
</element>
An invalid document:
<element id="2">
<element id="1"/>
</element>
The corresponding CLiXML rule:
<rule id="ids">
<forall var="element" in="//*">
<forall var="child" in="$element/*">
<greater op1="$child/@id" op2="$element/@id"/>
</forall>
</forall>
</rule>
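To make the nested forall quantifiers concrete, the check stated for Example 1 (every child id is higher than its parent's id) can be sketched in plain Python with the standard library. This is only an illustration of the logic, not how OpenCLiXML evaluates rules; the function name check_ids and the assumption that every element carries an integer id attribute are ours.

```python
import xml.etree.ElementTree as ET


def check_ids(xml_text):
    """Return (parent_id, child_id) pairs that violate 'child id > parent id'.

    Assumption: every element has an integer-valued id attribute.
    """
    root = ET.fromstring(xml_text)
    violations = []
    # Outer quantifier: every element in the document, including the root.
    for element in root.iter():
        # Inner quantifier: every direct child element of that element.
        for child in element:
            parent_id = int(element.get("id"))
            child_id = int(child.get("id"))
            if not child_id > parent_id:
                violations.append((parent_id, child_id))
    return violations


VALID = '<element id="1"><element id="2"/></element>'
INVALID = '<element id="2"><element id="1"/></element>'

print(check_ids(VALID))    # []
print(check_ids(INVALID))  # [(2, 1)]
```

An empty list means the document satisfies the constraint; each returned pair names one parent/child combination that breaks it.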
Example 2 - referential integrity
An XML representation of a graph (consisting of vertices and edges) may look like the following:
<graph>
<vertex id="0">
<edge to="5"/>
<edge to="6"/>
</vertex>
<vertex id="5"/>
<vertex id="6"/>
</graph>
An invalid document:
<graph>
<vertex id="0">
<edge to="5"/>
<edge to="6"/>
</vertex>
<vertex id="5"/>
</graph>
The CLiXML rule that checks this:
<rule id="valid_edges">
<report>edge leads to non existing vertex $edgeID</report>
<forall var="edge" in="/graph/vertex/edge">
<exists var="vertex" in="/graph/vertex">
<equal op1="$vertex/@id" op2="$edge/@to" />
</exists>
</forall>
</rule>
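The forall/exists combination used for referential integrity can likewise be sketched in plain Python. Again, this is an illustrative stand-in under our own assumptions (the helper name dangling_edges is hypothetical), not the OpenCLiXML implementation: for every edge, it looks for at least one vertex whose id equals the edge's to attribute.

```python
import xml.etree.ElementTree as ET


def dangling_edges(xml_text):
    """Return the 'to' values of edges that point at no existing vertex."""
    graph = ET.fromstring(xml_text)  # the root element is <graph>
    vertices = graph.findall("vertex")
    dangling = []
    # forall: every edge found under /graph/vertex/edge
    for edge in graph.findall("vertex/edge"):
        # exists: at least one vertex whose id equals the edge's target
        if not any(v.get("id") == edge.get("to") for v in vertices):
            dangling.append(edge.get("to"))
    return dangling


VALID = """<graph>
<vertex id="0"><edge to="5"/><edge to="6"/></vertex>
<vertex id="5"/>
<vertex id="6"/>
</graph>"""

INVALID = """<graph>
<vertex id="0"><edge to="5"/><edge to="6"/></vertex>
<vertex id="5"/>
</graph>"""

print(dangling_edges(VALID))    # []
print(dangling_edges(INVALID))  # ['6']
```

The ids are compared as strings, which matches how an XPath equality test compares attribute values.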
Download
See here or use the CVS version:
cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/clixml login
cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/clixml co -P OpenCLiXML
How to use clixml
Get a jar version of clixml and run:
java -jar clixml-0.3.0.jar inputfile.xml
java -jar clixml-0.3.0.jar clixschema.clx inputfile.xml
An input file that references its CLiXML schemas itself:
<graph>
<clx:schema location="./referantial.clx"/>
<clx:schema location="http://clixml.sf.net/loopfree.clx"/>
<vertex id="0">
<edge to="5"/>
<edge to="6"/>
</vertex>
<vertex id="5"/>
<vertex id="6"/>
</graph>