TOC 
Nine by NineG. Klyne
 Nine by Nine
 Feb 11, 2004

Semantic Web Inference Scripting with Haskell

Abstract

This memo describes the software package Swish, which is being developed to conduct experiments in using the programming language Haskell as a basis for scripting inference-based processing of Semantic Web data.

© 2003, 2004 G. Klyne; All rights reserved.

$Id: swish-0.2.1.html,v 1.1 2004/02/11 21:44:02 graham Exp $



 TOC 

Table of Contents




 TOC 

1. Introduction

This memo describes Swish, a software package that is intended to provide a framework for building programs in Haskell that perform inference on RDF [1] data.

Swish was primarily intended to be used as a starting point for developing new RDF applications in Haskell, but it also includes a stand-alone program that can be used to perform some simple scripted manipulation of RDF data. Currently, only the Notation 3 [6] serialization form is supported, but I'd like to add support for full RDF/XML in due course.

The software can be downloaded from the web by following links from http://www.ninebynine.org/RDFNotes/Swish/Intro.html. This software is made generally available under the GNU General Public Licence (GPL), version 2 (or later), a full copy of which is available at http://www.gnu.org/licenses/gpl.html, and also as file GPL.TXT in the Swish software distribution. Other licensing arrangements may be negotiated with the author: see LICENSING.TXT in the Swish software distribution.



 TOC 

2. Background

The Swish software package was developed as a result of experience using CWM [7], an off-the-shelf general-purpose RDF processing program, for evaluating simple inference rules on network access control information expressed in RDF [22]. Specifically, while the general inference capabilities of CWM were almost sufficient for the network access application, some capabilities were required that are unlikely to be provided by any completely general-purpose tool; e.g. analysis of IP network addresses by subnet and host address.

Additionally, the framework for datatyped literals currently proposed by the RDFcore working group [10] is quite open-ended, and it is not specified how generic applications may provide support for new or non-standard datatypes.

In light of these considerations, I sought ways of combining the full expressive capability of a general purpose programming language with the declarative style of inference rules and formal specifications. Using Haskell [13], a pure functional programming language, is the approach I have adopted.

To use Haskell as a basis for performing inference on RDF data, certain capabilities are neded:

Swish aims to provide these capabilities. Further, it provides capabilities to compare RDF graphs (insensitive to possible renaming of blank nodes), and to merge RDF graphs, renaming blank nodes as necessary to prevent unintended semantic consequences [3].

I anticipate that the main use for Swish will be as a support library for new utilities that apply predefined RDF inference rules. Where CWM is a general-purpose tool for manipulating RDF, I expect to use Swish as a toolkit for creating tools to deal with specific RDF processing requirements. In time, this may lead to identification of some useful capabilities that can guide the design of future general-purpose RDF processing tools. Swish also includes a self-contained program that can be used to perform simple operations on RDF data and to script simple inferences processing, thus providing a degree of CWM-like functionality in a ready-to-use program.

The programming language Haskell was chosen for a number of reasons:

More information about Haskell can be found at [13]. A useful paper discussing some particular characteristics of functional programming languages is [20].



 TOC 

3. Introduction to Swish

Swish is a framework for performing deductions in RDF data using a variety of techniques. Swish is conceived as a toolkit for experimenting with RDF inference, and for implementing stand-alone RDF file processors It explores Haskell as "a scripting language for the Semantic Web".

Swish is a work-in-progress, and currently incorporates:

3.1 Notation 3 input and output

Swish contains a Notation3 parser and formatter.

The parser does not recognize all of the very latest additions to Notation3, but seems to be able to deal with a good number of Notation3 files that appear in the wild.

The formatter is fairly primitive, focusing on round- tripping: generating Notation3 data that can be accepted by the parser.

3.2 RDF graph isomorphism testing and merging

Swish has logic for performing RDF graph isomorphism testing (graphs are considered equivalent if they can be made exactly equal by some reallocation of bnode identifiers). This is used mainly for software testing purposes.

Sometimes, for diagnostic purposes, it is useful to have some idea of the actual differences betwen a pair of non-identical graphs. Swish provides a capability to display such differences.

Swish also incorporates logic for merging graphs, renaming bnode identifiers as needed to avoid merging of bnodes from different graphs.

3.3 Inference framework based on logical inference rule concept

The Swish inference framework uses as its foundation a generic concept of an inference rule, having the following characteristics:

Not every rule is required to fully support both forward and backward chaining, but all are expected to support the proof-checking mode.

The underlying rule definition is not specific to RDF graphs, and is polymorphic in the type of expression to which the rule applies. Other modules in the Swish framework specialize the rule to apply to RDF graphs.

Axioms and rules are collected into rulesets. A collection of one or more rulesets can be used as a proof context

3.4 RDF Horn-style rule implementation

Swish provides a generic implementation of the rule framework that deduces a single specific conclusion from a simple conjunction of antecedent RDF graphs.

The antecedent and consequent graphs are specified as graph patterns that may contain variable nodes in place of any of the RDF nodes or properties. The deduction is considerd to be true for any substitution of the variale nodes through the whole rule (i.e. the variable nodes are universally quantified within the scope of the rule).

These rules can be used in both forward- and backward- chaining modes. Backward chaining may involve the introduction of new bnodes (existential variables). [[[Note: the code does not currently properly handle the scoping of such introduced existential variables -- to do this will require an extension to the RDF graph structure used.]]]

This kind of rule is implemented by a combination of simple graph query and back substitution. (Both forward and backward chaining operate similarly in this respect, but the details do vary.)

The basic query/substitution operation of simple rules can be modified through the use of variable binding modifiers and variable binding filters. These are used to manipulate the variable bindings obtained from the query phase, before performing the substitution phase. In a simple case, a variable binding filter (which is just a special case of a variable binding modifier) may cause some of the variable bindings to be discarded if they do not satisfy certain criteria; e.g. some rules are applicable only to certain forms of RDF graph node.

More complex variable binding modifier may actually add new variable bindings to those obtained from the query phase; e.g. the RDF "allocated to" rule that appears in the RDF formal semantics specification is handled by such means. The early design of Swish also anticipated that datatype-specific operations (e.g. integer addition) might be introduced through this mechanism.

3.5 Class restriction rule implementation

A second style of general purpose inference rule implementation provided by Swish is based on the notion of a class restriction, the basic idea of which is described by Pan and Horrocks [12] as a generalization of the Owl notion of a class restriction.

This form of inference rule has been implemented because it appears to be a more useful way of accessing datatype-related inferences. My analysis [11] is that this form allows a cleaner separation of datatype-specific inference functionality from application-specific use of the datatype properties.

Datatype inference capabilities built into Swish can, in principle, be accessed from RDF data without the need of any new parsing capability beyond standard RDF.

3.6 RDF formal semantics entailment rule implementation

Built-in to Swish is an implementation of almost all of the entailment rules described in the RDF Semantics specification. With the exception of subgraph entailment and simple entailment, these are all implemented as instances of the generic Horn-style rule with variable binding modifiers, described above. Subgraph entailment and simple entailment are implemented by special-purpose inference rules written in Haskell. (Some of the axiom and rule implementations are slightly restricted to avoid infinite graphs being generated by their application.)

The intent of these implementations is that they permit the Swish program to be used to check proofs based on the RDF core semantics.

The entailment rule whose implementation is not yet provided is equvalence of values with differing datatypes (rule 'rdfD3' in [3]).

3.7 Framework for datatypes

Swish implements a datatype model based on that defined for RDF (and XML schema datatypes). For each datatype, it embodies:

To these, Swish adds:

The last two provide overlapping functionality, in that they provide alternative ways to incorporate datatype computations into an inference process (e.g. arithmetic operations). One of the goals of this software is to allow experimentation with different styles of inference primitive. The advantage of the class restriction approach is that it provides access to datatype computations within the basic syntactic framework of RDF; the variable binding modifier approach requires the introduction of syntactic construct for formulae to define the antecedent and consequent graphs for Horn-style rules.

3.8 Swish main program

The Swish main program can be run as a stand-alone executable (when compiled using GHC, or by using the RunHugs utility), or from an invocation of the runSwish function from within a Hugs or GHCi interactive session.

Facilities available through command line options include:

A script file provides access to a range of Swish inference capabilities, shown in the example below, in Script file example below.



 TOC 

4. Description of Swish software

Swish comprises a number of modules that can be invoked by Haskell programs, and a stand-alone command-line utility that can be used to perform some basic processing of RDF data.

The Haskell source code for the stand-alone utility may also be used as a starting point for similar utilities that perform specific application processing of RDF data.

4.1 Swish software components

The following components are specific to Swish, and are found in a subdirectory called HaskellRDF:

Currently, the only supported RDF graph serialization format is Notation3, but future developments may add support for other formats. RDF/XML would clearly be most desirable. Meanwhile, utilities such as CWM [7] can be used convert RDF/XML to and from Notation 3 format.

4.2 Generic software components

The following generic components are not specific to Swish, and are found in a subdirectory called HaskellUtils:

There are also some work-in-progress modules in subdirectories of HaskellUtils that are intended to be offered as replacements for modules in the hierarchical library that ships with GHC and Hugs. Currently, Swish does not use these modules.

4.3 Swish command format

The Swish utility is a command-line utility that performs some simple RDF processing functions. The capabilities provided are with a view to testing the underlying RDF library software rather than performing any particular application purpose.

A Swish command contains a one or more command line options that are processed from left-to-right. The Swish program maintains an internal graph workspace, which is updated or referenced as the command options are processed.

Swish command options:

-?
Displays a summary of the command line options.
-n3
Indicates that Notation3 be used for subsequent input and output. (Currently, this is the only format option, and is selected by default.)
-i[=file]
read file into the graph workspace, replacing any existing graph. If the filename is omitted, the graph is read from standard input.
-m[=file]
read and merge file with the graph workspace. Blank nodes in the input file are renamed as necessary to avoid node identifiers already used by the existing graph. If the filename is omitted, the graph is read from standard input.
-c[=file]
read file and compare the resulting graph with the workspace. Graph comparison is done in a fashion that treats isomorphic graphs as equivalence, and is insensitive to renaming of blank nodes. This is intended to match the definition of graph equivalence in the RDF abstract syntax specification [4]. If the filename is omitted, the graph is read from standard input. If the graphs are unequal, the exit status code is 1.
-d[=file]
read a file and compare the resulting graph with the workspace, diusplaying a list of differences between the graphs. This is a diagnostic facility, intended to help developers or RDF authors to identify specifically where RDF graphs may differ.
-o[=file]
write the graph workspace to a file. If the filename is omitted, the graph is written to the standard output.
-s[=file]
executes a script from the named file. The script file format is described below.

The Swish program terminates with a status code that indicates the final status of the operation(s) performed. Haskell distinguishes between a success status code whose value is not specified, assumed to be system dependent, and a failure code which is associated with an integer value. The status code values returned by Swish are:

Success
Operation completed successfully; graphs compare equal.
1
Graphs compare different.
2
Input data file incorrect format.
3
File access problem.
4
Incorrect option in command line.
5
Error in script file.

Here are some example Swish command lines:

4.4 Function interfaces

[[[To be provided; until then see the source files, notably SwishCommands.hs, GraphClass.hs, RDFGraph.hs, and the various test modules.]]]



 TOC 

5. Swish script file format

The script syntax is loosely based on Notation3, but it is a quite different language, except that embedded graphs (enclosed in {...}) are presented using Notation3 syntax.

A script file provides access to a range of Swish inference capabilities. The following script file commands are currently provided:

@prefix
@prefix pref : <uri> .

Define a namespace prefix and URI.

The prefix thus defined is available for use in any subsequent script command, and also in any graphs contained within the script file. (So, prefix declarations do not need to be repeated for each graph contained within the script.) Graphs read from external files must contain their own prefix declarations.

Define named graph or list
name :- graph
name :- ( graph* )

where graph is a Notation 3 graph expression enclosed in braces, or a name. A name is a qname (prefix:local) or a URI enclosed in angle brackets.

Defines a named graph or list of graphs.

@read
@read name <uri>?

Read and name a graph from a given filename or standard input. (Input from an http: URI is not yet implemented.)

@write
@write name <uri>? ; comment

Write a named graph to a named file or to standard output. The comment text is included at the start of the output. (Output to an http: URI is not yet implemented.)

@merge
@merge ( name* ) => name

Create a new named graph that is the merge two or more graphs, renaming bnodes as required to avoid node-merging.

@compare
@compare name name

Compare two graphs for isomorphism, setting the Swish exit status to reflect the result.

@asserteq
@asserteq name name ; comment

Test two graphs or lists of graphs for isomorphism, reporting if they differ. The comment text is included with any report generated.

@assertin
@assertin name name ; comment

Test if a graph is isomorphic to a member of a list of graphs, reporting if no match is found. The comment text is included with any report generated.

@rule
@rule name :- ( name* ) => name
@rule name :- ( name* ) => name | ( (name var*)* )

Define a named Horn-style rule.

The list of names preceding => are the antecedent graphs (which may contain variable nodes of the form ?var.

The name following => is the consequent graph (which may also contain variable nodes of the form ?var).

The final part, if present, is a list of variable binding modifiers, each of which consists of a name and a list of variables ?var to which the modifier is applied. Variable binding modifiers are built in to Swish, and are used to incorporate datatype value inferences into a rule.

@ruleset
@ruleset name :- ( name* ) ; ( name* )

Define a named ruleset (a collection of axioms and rules). The first list of names are the axioms that are part of the ruleset, and the second list are the rules.

@constraints
@constraints name :- ( name* ) | ( name* )

Define a named ruleset containing class-restriction rules based on a datatype value constraint (see section 4.5 of [11]). The first list of names is a list of graphs that together comprise the class-restriction definitions (rule names are the names of the corresponding restriction classes). The second list of names is a list of datatypes whose datatype relations are referenced by the class restriction definitions.

@proof
@proof name ( name* )
@input name
@step name ( name* ) => name
:
@result name

Check a proof, reporting the step that fails, if any. See also: @input, @step, @result.

The @proof line names the proof and specifies a list rulesets (proof context) used. The remaining lines specify the input expression, proof steps and final result that is demonstrated by the proof.

@input
In a proof, indicates an input expression upon which the proof is based. Exactly one of these immediately follows the @proof command.

@step
In a proof, defines a step of the proof. Any number of these immediately follow the @input command.

It indicates the name of the rule applied for this step, a list of antecedent graphs, and a named graph that is deduced by this step.

(For convenience, the deduced graph may introduce a new named graph using an expression of the form: name :- { statements }.)

@result
In a proof, introduces the goal of the proof, and completes a proof definition. Exactly one of these immediately follows the @step commands.

(For convenience, the result statement may introduce a new named graph using an expression of the form: name :- { statements }.)

@fwdchain
@fwdchain name name ( name* ) => name

Define a new graph obtained by forward-chaining a rule. The first name is the ruleset to be used. The second name is the rule name. The list of names are the antecedent graphs to which the rule is applied. The name following the => names a new graph that is the result of formward chaining from the given antecedents using the indicated rule.

@bwdchain
@bwdchain name name name <= name

Define a new list of alternative graphs obtained by backward-chaining a rule. The first name is the ruleset to be used. The second name is the rule name. The third name (before <=) is the name of a goal graph from which to backward chain. The final name (after <=) names a new list of graphs, each of which is an alternative antecedent from which the given goal can be deduced using the indicated rule.

5.1 Script file example

This example of a script file demonstrates use of most of the features described above.

# -- Example Swish script --
#
# Comment lines start with a '#'
#
# The script syntax is loosely based on Notation3, but it is a quite 
# different language, except that embedded graphs (enclosed in {...})
# are encoded using Notation3 syntax.
#
# -- Prefix declarations --
#
# As well as being used for all labels defined and used by the script
# itself, these are applied to all graph expressions within the script 
# file, and to graphs created by scripted inferences, 
# but are not applied to any graphs read in from an external source.

@prefix ex:  <http://id.ninebynine.org/wip/2003/swishtest/> .
@prefix pv:  <http://id.ninebynine.org/wip/2003/swishtest/pv/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix xsd_integer: <http://id.ninebynine.org/2003/XMLSchema/integer#> .
@prefix rs_rdf:  <http://id.ninebynine.org/2003/Ruleset/rdf#> .
@prefix rs_rdfs: <http://id.ninebynine.org/2003/Ruleset/rdfs#> .
@prefix :   <http://id.ninebynine.org/default/> .

# Additionally, prefix declarations are provided automatically for:
#    @prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
#    @prefix rdfs:  <file:///E:/Download/www.w3.org/2000/01/rdf-schema#> .
#    @prefix rdfd:  <http://id.ninebynine.org/2003/rdfext/rdfd#> .
#    @prefix rdfo:  <http://id.ninebynine.org/2003/rdfext/rdfo#> .
#    @prefix owl:   <http://www.w3.org/2002/07/owl#> .


# -- Simple named graph declarations --

ex:Rule01Ant :- { ?p ex:son ?o . }

ex:Rule01Con :- { ?o a ex:Male ; ex:parent ?p . }

ex:TomSonDick :- { :Tom ex:son :Dick . }

ex:TomSonHarry :- { :Tom ex:son :Harry . }


# -- Named rule definition --

@rule ex:Rule01 :- ( ex:Rule01Ant ) => ex:Rule01Con


# -- Named ruleset definition --
#
# A 'ruleset' is a collection of axioms and rules.
#
# Currently, the ruleset is identified using the namespace alone;
# i.e. the 'rules' in 'ex:rules' below is not used.  
# This is under review.

@ruleset ex:rules :- (ex:TomSonDick ex:TomSonHarry) ; (ex:Rule01)

# -- Forward application of rule --
#
# The rule is identified here by ruleset and a name within the ruleset.

@fwdchain ex:rules ex:Rule01 { :Tom ex:son :Charles . } => ex:Rule01fwd

# -- Compare graphs --
#
# Compare result of inference with expected result.
# This is a graph isomorphism test rather than strict equality, 
# to allow for bnode renaming.
# If the graphs are not equal, a message is generated
# The comment (';' to end of line) is included in any message 
# generated

ex:ExpectedRule01fwd :- { :Charles a ex:Male ; ex:parent :Tom . }  

@asserteq ex:Rule01fwd ex:ExpectedRule01fwd
   ; Infer that Charles is male and has parent Tom

# -- Display graph --
#
# Write graph as Notation3 to standard output.
# The comment is included in the output.

@write ex:Rule01fwd ; Charles is male and has parent Tom

# -- Write graph to file --
#
# The comment is included at the head of the file.
# (TODO: support for output to Web using HTTP.)

@write ex:Rule01fwd <Example1.n3> ; Charles is male and has parent Tom

# -- Read graph from file --
#
# Creates a new named graph in the Swish environment.
# (TODO: support for input from Web using HTTP.)

@read ex:Rule01inp <Example1.n3>

# -- Proof check --
#
# This proof uses the built-in RDF and RDFS rulesets, 
# which are the RDF- and RDFS- entailment rules described in the RDF
# formal semantics document.
#
# To prove:
#     ex:foo ex:prop "a" .
# RDFS-entails
#     ex:foo ex:prop _:x .
#     _:x rdf:type rdfs:Resource .
#
# If the proof is not valid according to the axioms and rules of the 
# ruleset(s) used and antecedents given, then an error is reported 
# indicating the failed proof step.

ex:Input01 :- { ex:foo ex:prop "a" . }

ex:Result :- { ex:foo ex:prop _:a . _:a rdf:type rdfs:Resource . }

@proof ex:Proof01 ( rs_rdf:rules rs_rdfs:rules )
  @input  ex:Input01
  @step   rs_rdfs:r3 ( rs_rdfs:a10 rs_rdfs:a39 )
          => ex:Step01a :- { rdfs:Literal rdf:type rdfs:Class . }
  @step   rs_rdfs:r8 ( ex:Step01a )
          => ex:Step01b :- { rdfs:Literal rdfs:subClassOf rdfs:Resource . }
  @step   rs_rdfs:r1 ( ex:Input01 )
          => ex:Step01c :- { ex:foo ex:prop _:a . _:a rdf:type rdfs:Literal . }
  @step   rs_rdfs:r9 ( ex:Step01b ex:Step01c )
          => ex:Step01d :- { _:a rdf:type rdfs:Resource . }
  @step   rs_rdf:se  ( ex:Step01c ex:Step01d )   => ex:Result
  @result ex:Result

# -- Restriction based datatype inferencing --
#
# Datatype inferencing based on a general class restriction and
# a predefined relation (per idea noted by Pan and Horrocks).

ex:VehicleRule :-
  { :PassengerVehicle a rdfd:GeneralRestriction ;
      rdfd:onProperties (:totalCapacity :seatedCapacity :standingCapacity) ;
      rdfd:constraint xsd_integer:sum ;
      rdfd:maxCardinality "1"^^xsd:nonNegativeInteger . }

# Define a new ruleset based on a declaration of a constraint class
# and reference to built-in datatype.
# The datatype constraint xsd_integer:sum is part of the definition 
# of datatype xsd:integer that is cited in the constraint ruleset
# declaration.  It relates named properties of a class instance.

@constraints pv:rules :- ( ex:VehicleRule ) | xsd:integer

# Input data for test cases:

ex:Test01Inp :-
  { _:a1 a :PassengerVehicle ;
      :seatedCapacity "30"^^xsd:integer ;
      :standingCapacity "20"^^xsd:integer . }

# Forward chaining test case:

ex:Test01Fwd :- { _:a1 :totalCapacity "50"^^xsd:integer . }

@fwdchain pv:rules :PassengerVehicle ex:Test01Inp => :t1f
@asserteq :t1f ex:Test01Fwd  ; Forward chain test

# Backward chaining test case:
#
# Note that the result of backward chaining is a list of alternatives,
# any one of which is sufficient to derive the given conclusion.

ex:Test01Bwd0 :-
  { _:a1 a :PassengerVehicle .
    _:a1 :totalCapacity "50"^^xsd:integer .
    _:a1 :seatedCapacity "30"^^xsd:integer . }

ex:Test01Bwd1 :-
  { _:a1 a :PassengerVehicle .
    _:a1 :totalCapacity "50"^^xsd:integer .
    _:a1 :standingCapacity "20"^^xsd:integer . }

# Declare list of graphs:

ex:Test01Bwd :- ( ex:Test01Bwd0 ex:Test01Bwd1 )

@bwdchain pv:rules :PassengerVehicle ex:Test01Inp <= :t1b

@asserteq :t1b ex:Test01Bwd  ; Backward chain test

# Can test for graph membership in a list

@assertin ex:Test01Bwd0 :t1b ; Backward chain component test (0)
@assertin ex:Test01Bwd1 :t1b ; Backward chain component test (1)

# -- Merge graphs --
#
# Merging renames bnodes to avoid collisions.

@merge ( ex:Test01Bwd0 ex:Test01Bwd1 ) => ex:Merged

# This form of comparison sets the Swish exit status based on the result.

ex:ExpectedMerged :-
  { _:a1 a :PassengerVehicle .
    _:a1 :totalCapacity "50"^^xsd:integer .
    _:a1 :seatedCapacity "30"^^xsd:integer .
    _:a2 a :PassengerVehicle .
    _:a2 :totalCapacity "50"^^xsd:integer .
    _:a2 :standingCapacity "20"^^xsd:integer . }

@compare ex:Merged ex:ExpectedMerged

# End of example script



 TOC 

6. Software installation

The Swish software is distributed as a single ZIP archive. Start installation by creating an empty directory for the software, and extracting the content of the ZIP archive into that directory. Select a ZIP option that uses directory information from the archive so that the sub-directory structure is preserved.

The ZIP archive should unpack into two main subdirectories of the installation directory: HaskellRDF contains Swish, RDF-specific software modules and test data. HaskellUtils contains generic software modules that may be used by other programs. When compiling or running Haskell programs, this directory is assumed to be on the list of directories that are searched for library sounce code.

The following sections deal with how get get the software running in different Haskell environments. The instructions relate to MS Windows operating systems, but it should be fairly obvious how to adapt the procedures for Unix/Linux systems.

6.1 System requirements

Swish is written entirely in Haskell, and should work with any Haskell system that supports Haskell 98 together with the extensions noted below. The software has been tested using Hugs [14] (version November 2003), Glasgow Haskell Compiler (GHC) [15] (version 6.2) and the interactive version of GHC (GHCi).

The required extensions to standard Haskell-98 are:

Some freely available additional Haskell libraries are used, as described later. For convenience, these are included with the Swish software distribution, but are not themselves part of the Swish software for licensing purpose. More details are given later.

My development has been performed mostly using Hugs on a 1.3GHz PC with 256Mb of memory (more recently, 768Mb). For most purposes, this has been more than adequate. Some of the larger test cases, and the more perverse graph comparisons, may take several minutes to run on this platform (SwishTest takes about 20 minutes). In practice, I think the applications are likely to be more demanding than basic requirements of Swish itself.

The consolidated test program cannot be compiled using Hugs as distributed, but the individual test cases can all be run. To run the consolidated test program, Hugs must be re-build (from source) with an enlarged code storage area. Such a version, compiled for Win32 platforms, is available at: http://www.ninebynine.org/Software/FatHugs.html

6.2 Installation files

The Swish software distribution includes the following files, where subdirectories are indicated relative to the installation directory:

HaskellRDF subdirectory
Swish-0.2.1.html, Swish-0.2.1.xml: this documentation file, and XML source code.
*.hs: Haskell source files (see software overview above).
*Test.hs: unit test Haskell source files.
*.ss: Swish scripting test file(s).
*.bat: MS-Windows command files for building and testing the software using GHC.
*.txt: additional information, including licensing details.
HaskellUtils subdirectory
*.hs: Haskell source files (see software overview above).
*Test.hs: unit test Haskell source files.
HaskellRDF/Data subdirectory
Contains Notation3 data files used by the SwishTest program.
HaskellRDF/Parsec subdirectory
Contains the Parsec library used by Swish.
HaskellRDF/HUnit subdirectory
Contains the HUnit library used by Swish test modules.
HaskellRDF/Sort subdirectory
Contains the Quicksort library used by Swish. (References to this module can be removed, and the standard Haskell function List.sort used in place of QuickSort.)
HaskellRDF/Dfa subdirectory
Contains the Deterministic Finite-state Automoton library used by Swish. (Currently, this is used only for checking the syntax of an integer literal.)

6.3 System dependent details

6.3.1 Installation using Hugs

Running the Swish software under Hugs is straightforward. The Hugs option -98 must be specified.

Special steps that might help include:

The full settings reported by my Hugs installation are:

Current settings: +fewuiRWXN -stgGl.qQkoOIHT -h5000000 -p"%s> " -r$$ -c40
Search path     : -P.;{Hugs}/libraries;{Hugs}/oldlib;
                      {Hugs}/libraries/win32;
                      {Hugs}/libraries/Parsec;
                      {Hugs}/libraries/Dfa;
                      {Hugs}/libraries/Sort;
                      {Hugs}/libraries/Hunit;
                      ../HaskellUtils
Project Path    :
Source suffixes : -S.hs;.lhs
Editor setting  : -E"C:\\Program Files\\TextPad 4\\TextPad.exe"
Preprocessor    : -F
Compatibility   : Hugs Extensions (-98)
               

6.3.2 Installation using GHCi

Running the Swish software under GHCi is almost as easy as using Hugs. GHCi command line options used include '-fglasgow-exts' and '-i..\HaskellUtils; F:\Haskell\Lib\HUnit; F:\Haskell\Lib\Parsec; F:\Haskell\Lib\Sort; F:\Haskell\Lib\Dfa' (adjusted according to the directories containing the library files). Working under MS-Windows, I find it convenient to create a desktop shortcut to run GHCi, specifying the Swish source directory and other options as properties of the shortcut.

To run a program in the GHCi command interpreter, follow the same procedure that is described for running a program under Hugs. The GHCi and Hugs command shells are very similar.

There is a GHCi initialization file '.ghci' included with the Swish software distribution, which, if placed in the appropriate startup directory, is read automatically by GHCi and defines some convenient commands for running the non-interactive GHC compiler from within the GHCi shell. This file will need editing to reflect the actual directories used for the Swish software.

6.3.3 Installation using GHC

MS-Windows command scripts have been prepared to compile and run the Swish software in an MS-Windows command window. It should be straightforward to create Unix equivalents using information from these. The relevant files are:

The file ghcc.bat assumes a standard GHC installation, with the GHC compiler on the current search path. It will probably need to be edited to reflect the actual locations of the support libraries used.

Once the programs have been compiled and linked, they can be run in the usual way by using entering the program name at a command prompt. The test programs do not expect any command line options and run to completion. The program Swish.exe takes command line options as descriped above.

6.4 Installation testing

A Swish installation under GHC can be tested by running the program TestSwishAll. [[[The stand-alone test programs used by the command script TestSwish.bat cannot be built with the current release of GHC]]], and ensuring that all tests complete with zero errors. On a 1.7GHz PC running Windows 2000, the tests take a few minutes to complete.

To test the installation from an interactive shell, the test programs need to be loaded and executed individually. Alternatively, the consolidated test program, SwishTestAll, can be loaded and run under GHCi, or under Hugs with an enlarged code storage area, as noted above. To confirm a successful installation, it is probably sufficient to load SwishMain, and evaluate the following expression, which should complete quite quickly (about 30 seconds under Hugs on a 1.3GHz PC):

runSwish "-s=SwishScript.ss"

The output should indicate that two proofs were checked OK:

Proof satisfied: ex:Proof01
Proof satisfied: ex:Proof02

For a more complete test, run SwishTest which takes about 20 minutes on the same system.

(For comparison, when compiled with GHC to a stand-alone application, SwishTest runs in about 1-2 minutes. This suggests that for serious inference work, Swish should be compiled with GHC.)

6.5 Additional libraries used

Swish uses some additional libraries that are not part of the Swish software, but which are included with the Swish software distribution for the convenience of users.

Please note that these support libraries are distributed under their own licensing terms and conditions, which I have reproduced below where available. Please contact the respective authors for further information.

6.5.1 Parsec

Parsec [16] is a monadic parser combinator library for Haskell. I found it to be excellently documented and generally easy to use. It also serves as a useful introduction, to using monads in Haskell.

6.5.1.1 Parsec licence

Copyright 1999-2000, Daan Leijen. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

This software is provided by the copyright holders "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall the copyright holders be liable for any direct, indirect, incidental, special, exemplary, or consequential damages ( including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.

6.5.2 Quicksort

Quicksort is part of a collection of sorting functions in haskell, published by Ralf Hinze [18].

At the time of writing, I can find no claim for copyright or distribution licensing terms.

6.5.3 HUnit

HUnit [17] is a unit testing framework for Haskell, loosely modelled on the JUnit framework that is popular with Java programmers.

Swish application code does not use HUnit, but the test programs do make extensive use of it.

6.5.3.1 HUnit licence

HUnit is Copyright (c) Dean Herington, 2002, all rights reserved, and is distributed as free software under the following license.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT ( INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

6.5.4 Dfa

DFA is a deterministic finite state automaton based pattern matching engine, written by Andrew Bromage and available from Sourceforge [19].

6.5.4.1 Dfa licence

DFA is copyright of Andrew Bromage, and is distributed as free software under the following license:

Copyright © 2003 Andrew J. Bromage

This program is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

A copy of version 2.1 of this licence is included in the Swish distribution in file lgpl.txt.



 TOC 

7. Extending Swish

Swish is designed to be readily extensible in the following ways. For the most part, these extensions involve the addition of new Haskell code, and very small changes to existing code.

7.1 Adding datatypes

The primary means for extending Swish capabilities is anticipated to be the addition of new datatypes.

A new datatype can be added as a single module, written in Haskell, following the pattern of an existing datatype module.

If only one of the available datatype inference styles is required to be supported, then the other can be left undefined. If an application attempts to use a datatype inference style that is not defined, a program run-time error will occur.

7.2 Adding rulesets, axioms and inference rules

A ruleset is a named collection of axioms and inference rules. A ruleset might be built into a Swish-derived program to express some body of domain knowledge. All built-in rulesets are accessible in the Swish script language (this is how the standard RDF and RDFS entailment rules are accessed).

7.3 Adding new forms of inference rule

Swish contains two built-in styles of generic inference rule:

It also contains a small number of specific hand-coded inference rules to describe some RDF core semantics:

It would be entirely possible to write a complete new style of inference rule in Haskell, and expose it to the script processor via a new ruleset. There are, however, some limitations to doing generic inference rule patterns in this way, as the script processor would have to be enhanced to provide a way to instantiate rules based on such a generic pattern.

7.4 Adding new script language commands

The script processor has a fairly simple modular structure, with a parser that compiles the script to a higher order monadic function (incorporating the IO monad), and then evaluates that function. Adding new functions would be reasonably easy for an experienced Haskell programmer, but it does require an understanding of the way that Haskell uses monads for managing state and I/O.

A "compiled" script, returned by the script parser, is an instance of the 'SwishStateIO' monad. This is a function that can be executed simply by applying it to an initial state value.

7.5 New program using the existing libraries

For specific applications, it may be most effective to ditch the script processor and simply write a new Haskell main program to perform the required processing. This could be done based on the existing SwishStateIO monad, or by using that as an example to guide the definition of an application-specific monad, and using the existing code as a starting point for the new program. Using the existing code in this way means that the details of command line access and setting up the Haskell I/O environment are mostly taken care of.

The parsing, formatting and inference functions are all defined as pure functions, so there should be no difficulty in taking these and incorporating them into some other Haskell program.

The script parser itself is also a pure function, so it could be used to compile an application-specific script that is embedded within a program, and then to execute that script with an appropriate initial state.



 TOC 

8. Conclusions and future work

Swish is very much a work-in-progress, and the present release is another step along a path with many possible options for future developments.

This step in the development of Swish adds some RDF inference tools, including support for datatype-aware deductions. My intent is to revisit my earlier work [22], and learn how that work may be served by the inference framework that has been introduced. Version 0.2.1 introduces a graph partitioning module, which provides the basis of a facility in Swish to isolate and display differences between graphs. It is intended that future versions will use this as the basis of smarter Notation 3 and XML output formatting.

The Swish code itself is far from perfect, and there is much additional functionality and improvement that can be made. But it does pass an extensive array of tests, and I believe it is quite stable and functional.

The following are some features that I would like to see added to Swish:

The software distribution contains a file named TODO.TXT, which lists in greater detail a number of specific possible enhancements that have been identified to date.



 TOC 

9. Acknowledgements

I would like to thank the following, whose previous work has been most helpful to me (though, of course, they bear no responsibility for the shortcomings of my work):

This document has been authored in XML using the format described in RFC 2629 [5], and converted to HTML using the XML2RFC utility developed by Marshall Rose (http://xml.resource.org/).



 TOC 

References

[1] Beckett, D., "RDF/XML Syntax Specification (Revised)", W3C Recommendation rdf-syntax-grammar, February 2004 (HTML).
[2] Brickley, D. and R. Guha, "RDF Vocabulary Description Language 1.0: RDF Schema", W3C Recommendation rdf-schema, February 2004 (HTML).
[3] Hayes, P., "RDF Semantics", W3C Recommendation rdf-mt, December 2003 (HTML).
[4] Klyne, G. and J. Carroll, "Resource Description Framework (RDF): Concepts and Abstract Syntax", W3C Recommendation rdf-concepts, December 2003 (HTML).
[5] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629, June 1999.
[6] Berners-Lee, T., "Notation3: Logic and Rules on RDF", Design Issues Ideas about Web Architecture - yet another notation, 1998.
[7] Berners-Lee, T., "Cwm (closed world machine)", September 2002.
[8] Carroll, J., "Matching RDF Graphs", July 2001.
[9] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", May 2003.
[10] "World Wide Web Consortium: RDFcore working group".
[11] Klyne, G., "Using datatype-aware inferences with RDF", November 2003.
[12] Horrocks, I. and J. Pan, "Web Ontology Reasoning with Datatype Groups", 2003.
[13] "Haskell community web site".
[14] "Hugs online: Hugs98 web site".
[15] "The Glasgow Haskell Compiler (GHC) web site".
[16] Leijen, D., "Daan online: Parsec", Oct 2001.
[17] Herington, D., "HUnit - Haskell Unit Testing", Feb 2002.
[18] Ralf, R., "A library of sorting routines", Apr 2002.
[19] Andrew, A., "DFA: Deterministic Finite-state Automaton", Oct 2003.
[20] Hughes, J., "Why Functional Programming Matters", 1984.
[21] Thompson, S., "Haskell: The Craft of Functional Programming, Second Edition", Addison-Wesley ISBN 0-201-34275-8, 1999.
[22] Klyne, G., "Using RDF for Home Network Configuration", December 2002.
[23] Brickley, D. and K. Sharp, "European Semantic Web Advanced Development", 2002.
[24] Miller, L., "SWAD-Europe: Developer Workshop Report 2 - Semantic Web calendaring", 2002.
[25] Rose, M., "xml2rfc at xml.resource.org".


 TOC 

Author's Address

  Graham Klyne
  Nine by Nine
EMail:  GK-swish@ninebynine.org
URI:  http://www.ninebynine.net/


 TOC 

Appendix A. Revision history

2003-05-30:
  • Document initially created.

2003-12-19:
  • Document updated for Swish 0.2.0.

2004-02-11:
  • Document updated for Swish 0.2.1:
  • Implemented consolidated test program (needs recompiled Hugs). All modules compile with GHC 6.2 and Hugs Nov-2003 releases.
  • Reorganized source code into HaskellRDF and Haskell Utils directories.
  • Added graph-difference capability.
  • Updated RDF references to W3C Recommendation of 2004-02-10.



 TOC 

Appendix B. Unresolved issues



 TOC 

Appendix C. CVS revision log

$Log: swish-0.2.1.html,v $
Revision 1.1  2004/02/11 21:44:02  graham
Swish release docs to CVS

Revision 1.4  2004/02/11 17:54:25  graham
Fix some typos.

Revision 1.3  2004/02/11 17:47:07  graham
Refer to TestSwishAll rather then TestSwish.bat.

Revision 1.2  2004/02/11 17:42:04  graham
Re-arrange some artwork text.

Revision 1.1  2004/02/11 17:37:48  graham
Update for Swish 0.2.1

Revision 1.5  2003/12/20 13:24:53  graham
Fix up files to compile and test with GHCi 5.04.3

Revision 1.4  2003/12/20 12:00:14  graham
Introduced new TraceHelpers module for Hugs-2003 compatibility.

Revision 1.3  2003/12/19 21:29:33  graham
Minor edits

Revision 1.2  2003/12/19 16:34:26  graham
Fix up document details

Revision 1.1  2003/12/19 15:50:20  graham
Updated documentation for Swish 0.2

Revision 1.6  2003/06/03 16:17:44  graham
Fix another typo

Revision 1.5  2003/06/03 11:31:13  graham
Fix typos in documentation

Revision 1.4  2003/05/31 00:11:21  graham
Fix various typos and omissions

Revision 1.3  2003/05/30 19:12:57  graham
Fixed some document typos and added RDF semantics reference

Revision 1.2  2003/05/30 18:37:25  graham
First formatted version of Swish documentation

Revision 1.1  2003/05/30 16:41:22  graham
Swish documentation, initial version.