A Schematron sample: Count child nodes in batch mode for many files

Schematron is a powerful technology that complements the standard W3C Schema and DTD validators in XML ValidatorBuddy. However, configuring an XML editor to apply a Schematron schema to any number of files is often complicated.

In this post I want to show how easy it is to apply a Schematron schema to many files at once in XML ValidatorBuddy. I will use the editor's batch-task configuration dialog to set up and run the batch task; the same dialog can also be used to create standard (W3C/DTD) batch validation tasks.

The Schematron schema checks and reports how many child elements the root element has:

<?xml version="1.0" encoding="iso-8859-1"?>
<iso:schema xmlns="http://purl.oclc.org/dsdl/schematron" xmlns:iso="http://purl.oclc.org/dsdl/schematron" xmlns:sch="http://www.ascc.net/xml/schematron" queryBinding="xslt2" schemaVersion="ISO19757-3">
  <iso:title>ISO Schematron sample file.</iso:title>
  <iso:pattern id="root.elementcount">
    <iso:title>Count child elements of root</iso:title>
    <iso:rule context="/*">
      <iso:report test="count(*)">
        <iso:value-of select="count(*)"/> elements
      </iso:report>
      <iso:assert test="count(*) >= 1">A root element without any child elements</iso:assert>
    </iso:rule>
  </iso:pattern>
</iso:schema>
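
To make the logic of the rule concrete, here is a minimal Python sketch that mimics the report/assert pair with the standard library only. It is not how XML ValidatorBuddy evaluates Schematron; the function name check_root_children and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def check_root_children(xml_text):
    """Mimic the report/assert pair of the schema for one document.

    Returns the list of messages the rule would produce.
    """
    root = ET.fromstring(xml_text)
    # The default ElementTree parser keeps only element children,
    # so len(root) corresponds to count(*) in the context "/*".
    child_count = len(root)

    messages = []
    # <iso:report test="count(*)"> fires when the count is non-zero.
    if child_count:
        messages.append(f"{child_count} elements")
    # <iso:assert test="count(*) >= 1"> fires when the test is false.
    if not child_count >= 1:
        messages.append("A root element without any child elements")
    return messages


if __name__ == "__main__":
    # Hypothetical sample documents, not taken from the test files.
    print(check_root_children("<order><item/><item/><note/></order>"))
    print(check_root_children("<order/>"))
```

The sketch shows that every document produces exactly one message: a report carries the count when child elements exist, and the assert fails only when there are none.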

Use the “Configure and run a batch task…” command of the XML ValidatorBuddy editor to set up the batch. You can find this command in the context menu of the File Explorer tab or in the “File Explorer” menu.

Figure: Schematron batch settings in the batch-task configuration dialog

Set the starting folder for the batch task in the “Start folder for batch” group.

To define a Schematron batch, select “Schematron” in the “Parser to use:” field, check the “Use external schema” option and enter the path to the Schematron schema in the related edit field. If the batch will process a large number of files, it is also a good idea to select the “Use xinclude for efficient log file creation” option. In this case the log is created as two files: the main log file, and a second file that contains the <file> elements for all documents. The main log file pulls in the second file through an XInclude element. This makes it possible to write the complete XML log without keeping it in memory during the batch.

After you click the “Run batch” button, the batch task is executed in a separate process and an XML log file is created while the batch is running. For each document a file element is added to the log. If the root node has child elements, it contains the following information:

<file done="true" is_schema="false" path="C:\Users\Documents\xml\xmlschema2006-11-06\msData\additional\ipo_s1.xml" schema_dtd_ref="false" time="06/08/12 16:08:10" valid="false">
  <general>Checked against C:\Users\Documents\xml\Schematron\count_root_child_elements.sch</general>
  <results>
    <result type="error">
      <position col="0" line="0"/>
      <description><![CDATA[3 elements]]></description>
    </result>
  </results>
</file>

If the root element has no child elements, the entry looks like this:

<file done="true" is_schema="false" path="C:\Users\Documents\xml\xmlschema2006-11-06\msData\additional\isdefault001.xml" schema_dtd_ref="false" time="06/08/12 16:08:10" valid="false">
  <general>Checked against C:\Users\Documents\xml\Schematron\count_root_child_elements.sch</general>
  <results>
    <result type="error">
      <position col="0" line="0"/>
      <description><![CDATA[A root element without any child elements]]></description>
    </result>
  </results>
</file>

The log is written continuously while the batch is running. After the batch has completed, you can use the Large File Viewer in XML ValidatorBuddy to open even huge log files.
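
For very large logs it can also be handy to summarize the results with a small script. The following is a minimal Python sketch, assuming the log contains <file> elements shaped like the two samples above. The function name summarize_log and the <log> root element used in the demo are assumptions and are not taken from XML ValidatorBuddy. The file is streamed with iterparse so that it does not have to fit into memory.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_log(source):
    """Stream a batch log and count how often each message occurs.

    `source` is a file name or a binary file object. Only <file>
    elements shaped like the samples in this post are assumed.
    """
    counts = Counter()
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "file":
            for desc in elem.iter("description"):
                counts[(desc.text or "").strip()] += 1
            # Drop the processed children to keep memory use low.
            elem.clear()
    return counts


if __name__ == "__main__":
    # Hypothetical miniature log; the <log> root element is an assumption.
    sample = b"""<log>
      <file path="a.xml" valid="false">
        <results><result type="error">
          <position col="0" line="0"/>
          <description><![CDATA[3 elements]]></description>
        </result></results>
      </file>
      <file path="b.xml" valid="false">
        <results><result type="error">
          <position col="0" line="0"/>
          <description><![CDATA[A root element without any child elements]]></description>
        </result></results>
      </file>
    </log>"""
    for message, count in summarize_log(io.BytesIO(sample)).items():
        print(count, message)
```

Note that this sketch does not resolve XInclude references, so for a log written with the XInclude option it would need to be adapted to the actual two-file layout.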
