Open CLiX

an open source CLiXML Schema Validator.

Introduction

XML documents can be checked against a schema (W3C XML Schema, RELAX NG, DTD) to verify that their structure and syntax are correct. The expressiveness of these schema languages with respect to semantic properties is very limited. CLiXML (Constraint Language in XML) is a language that complements the syntax schema languages and allows semantic properties to be described. CLiXML combines XML and XPath into a first-order language. OpenCLiXML is an open source Java implementation of the freely available CLiXML specification from Systemwire.

Why use CLiXML?

CLiXML is not the only language for defining semantic properties of XML documents. OASIS CAM and Schematron are the two most popular ones, but they have the following drawbacks, which are corrected in CLiXML:

Examples

Example 1

rule: Each child element must have a higher id attribute than its parent.
XML Schema can express that elements have a unique id attribute, but it cannot express constraints between the id attribute values of child and parent elements.
A valid document looks like this:
<element id="1">
    <element id="2"/>
</element>
An invalid document:
<element id="2">
    <element id="1"/>
</element>
Since it is impossible to describe the semantics of this language in XML Schema, we have to use the following CLiXML expression:
<rule id="ids">
<forall var="element" in="//*">
<forall var="child" in="$element/*">
<greater op1="$child/@id" op2="$element/@id"/>
</forall>
</forall>
</rule>
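
To make the meaning of the rule concrete, the same check can be written as two nested loops in ordinary code. The following is only an illustrative sketch in Python, not part of OpenCLiXML: the function name child_id_violations is hypothetical, and it assumes that every element carries an integer id attribute.

```python
import xml.etree.ElementTree as ET


def child_id_violations(xml_text):
    """Return (parent_id, child_id) pairs that break the rule
    'each child element must have a higher id than its parent'.

    Assumption: every element has an integer id attribute.
    """
    root = ET.fromstring(xml_text)
    violations = []
    # Outer loop: every element in the document (the outer forall).
    for element in root.iter():
        # Inner loop: the direct child elements (the inner forall).
        for child in element:
            parent_id = int(element.get("id"))
            child_id = int(child.get("id"))
            if not child_id > parent_id:
                violations.append((parent_id, child_id))
    return violations


VALID = '<element id="1"><element id="2"/></element>'
INVALID = '<element id="2"><element id="1"/></element>'

print(child_id_violations(VALID))    # []
print(child_id_violations(INVALID))  # [(2, 1)]
```

An empty list means the document satisfies the rule; each returned pair identifies one parent/child combination that violates it.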

Example 2 - referential integrity

An XML representation of a graph (consisting of vertices and edges) may look like the following:
<graph>
  <vertex id="0">
    <edge to="5"/>
    <edge to="6"/>
  </vertex>
  <vertex id="5"/>
  <vertex id="6"/>
</graph>
rule: Each edge must lead to an existing vertex.
An invalid document:
<graph>
  <vertex id="0">
    <edge to="5"/>
    <edge to="6"/>
  </vertex>
  <vertex id="5"/>
</graph>
Here an edge leads to the nonexistent vertex 6. In XML Schema we would use keyrefs (or, for similar purposes, the unique keyword). Unfortunately, identifiers must be of type NCName. This is not the case in CLiXML, as can be seen in the following CLiXML expression, which ensures referential integrity:
<rule id="valid_edges">
 <report>edge leads to non existing vertex $edgeID</report>
 <forall var="edge" in="/graph/vertex/edge">
  <exists var="vertex" in="/graph/vertex">
    <equal op1="$vertex/@id" op2="$edge/@to" />
  </exists>
 </forall>    
</rule>
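
The forall/exists combination used for this rule corresponds to a membership test: for every edge, some vertex must have a matching id. The sketch below shows that reading in Python. It is an illustration only, not how OpenCLiXML evaluates rules; the function name dangling_edges is hypothetical, and ids are compared as plain strings, so no NCName restriction applies.

```python
import xml.etree.ElementTree as ET


def dangling_edges(xml_text):
    """Return the 'to' values of edges that point to no existing vertex."""
    graph = ET.fromstring(xml_text)  # the root element is <graph>
    # All vertex ids, compared as plain strings.
    vertex_ids = {vertex.get("id") for vertex in graph.findall("vertex")}
    # forall edge: exists vertex with vertex/@id = edge/@to
    return [
        edge.get("to")
        for edge in graph.findall("vertex/edge")
        if edge.get("to") not in vertex_ids
    ]


VALID_GRAPH = """<graph>
  <vertex id="0"><edge to="5"/><edge to="6"/></vertex>
  <vertex id="5"/>
  <vertex id="6"/>
</graph>"""

INVALID_GRAPH = """<graph>
  <vertex id="0"><edge to="5"/><edge to="6"/></vertex>
  <vertex id="5"/>
</graph>"""

print(dangling_edges(VALID_GRAPH))    # []
print(dangling_edges(INVALID_GRAPH))  # ['6']
```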

Download

See here or use the CVS Version:
cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/clixml login
cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/clixml co -P OpenCLiXML
Note that the Open CLiXML Schema Validator is still under heavy development. Use the CVS version if you want to have an up to date version.

How to use clixml

Get a jar version of clixml and run:
java -jar clixml-0.3.0.jar inputfile.xml
or
java -jar clixml-0.3.0.jar clixschema.clx inputfile.xml
A clixml schema file can optionally be passed to the validator as a command-line parameter. clixml schemas can also be bound to a document by adding schema elements with a location attribute directly under the root element of the document. The schema elements can be in any namespace. Example:
<graph>
  <clx:schema location="./referantial.clx"/>
  <clx:schema location="http://clixml.sf.net/loopfree.clx"/>

  <vertex id="0">
    <edge to="5"/>
    <edge to="6"/>
  </vertex>
  <vertex id="5"/>
  <vertex id="6"/>
</graph>
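
The binding convention described above (schema elements with a location attribute directly under the root, in any namespace) can be illustrated with a short lookup routine. This Python sketch only shows the idea and is not the validator's actual code. The function name schema_locations is hypothetical, and the namespace URI urn:example:clixml is a placeholder chosen so that the clx prefix is declared and the sample parses.

```python
import xml.etree.ElementTree as ET


def schema_locations(xml_text):
    """Collect the location attribute of every schema element that sits
    directly under the root element, regardless of its namespace."""
    root = ET.fromstring(xml_text)
    locations = []
    for child in root:
        # ElementTree writes namespaced tags as "{uri}local"; keep the local part.
        local_name = child.tag.rsplit("}", 1)[-1]
        if local_name == "schema" and child.get("location") is not None:
            locations.append(child.get("location"))
    return locations


# The namespace URI below is a placeholder, not an official CLiXML namespace.
DOC = """<graph xmlns:clx="urn:example:clixml">
  <clx:schema location="./referantial.clx"/>
  <clx:schema location="http://clixml.sf.net/loopfree.clx"/>
  <vertex id="0"/>
</graph>"""

print(schema_locations(DOC))
# ['./referantial.clx', 'http://clixml.sf.net/loopfree.clx']
```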



(c) 2005-2007 by Dominik Jungo