$Id: TODO.txt,v 1.56 2004/07/13 17:33:51 graham Exp $

HaskellRDF TODO
===============

(As of Swish 0.2.1)

General
-------

Flesh out datatype subtyping.
  Need to provide access to subtyping via RDFLabel values (for BuiltInMap).
  Add subtype discovery (DatatypeRel access) to BuiltInMap.
  Add reflexivity case to datatype definition.
  Finish definition of rdfdr3 and sameDatatypedValue function.

Think about: how to add rule that infers datatyped literals from plain
literals for a given property, based on the availability of appropriate
schema information. Needs "level-breaker" to pick lexical part from any
literal, and to construct a new literal from its parts. Do this, write up
and float to RDF-IG.

Add variable scoping mechanism (list variables bound in graph?)
  Currently, antecedents for forward chaining are assumed to share
  bnodes -- this isn't generally a safe assumption (RDFRuleset.hs).
  Use variable scoping when merging -- new merge function?

Add inference-discovery and proof-generation functions

Extend N3 reader logic to allow input using HTTP GET.

Extend N3 writer logic to allow output using HTTP PUT.

Haddock-ize all comments and generate API documentation

HARP (Haskell RDF Parser)
-------------------------

Preparation:
/ Separate RDFLabel type from RDFGraph module
/ Sketch parser API (simple!):
    XML document -> [(RDFLabel,RDFLabel,RDFLabel)]
  (i.e. keep decoupled from Graph, so that parser can be used separately.)
/ API should support named graphs ala TRiX/Notation3; for RDF/XML
  just use XML base URI.
- Review RDF syntax specification
- Plan design (expect something like a monadic traversal of the XML
  source document)
/ Organize RDFHaskell into a hierarchy, ala:
    RDF
    RDF.Label
    RDF.Graph
    RDF.Harp
    RDF.Swish

Test suite:
/ Create testing framework
/ Create some simple test cases
/ Obtain copy of RDF test cases.
- Make sure each (parser) test case type is handled by the test suite

Parser:
/ Create stub module and build test suite
/ Run should show multiple failures!
- Implement parser to pass all tests

Graph:
- Provide means to load output of Parser

Notation3:
- Rework parser to use RDF/XML parser output interface
  (i.e. decouple from Graph)
- Improve efficiency of Parser/Graph structure

Bugs or clarification of required behaviour
-------------------------------------------

Forward chaining and proof-check of RDF graphs currently assume that
bnodes are shared between graphs.

Think about: change RDF forward chaining to return a separate graph for
each match of the antecedent: mainly, this will require a trivial change
to RDFRuleset.graphClosureFwdApply, and may need some more thinking about
handling of lists of graphs in the script processor.
[Later] this would also require an explicit merge rule, so that graphs
can be recombined; but the merge rule would have to take account of bnode
renaming. So maybe it's better to recombine the result graphs while we
know that the union, not the merge, is the appropriate result. This
position might change when proper bnode scoping is introduced.

Think about handling of comparison functions (i.e. with no computed
result) in constraint rules. I don't think it makes sense to infer a
contradiction, so maybe the cardinality constraints should be dropped?

Multiple variable binding modifiers in RDF closure rule don't have
provision for using different sequencing to ensure compatibility with the
bound variables supplied. This consideration may be overkill, but if
needed, see modules SwishScript and RDFRuleset. It may be necessary to
extend the graph closure to allow a list of variable binding modifiers,
then compose them dynamically as needed.

Decide how to handle statements with literal as subject
(e.g. ex:node is ex:prop of "someLit" .)
[Hint: if I do nothing for now, in the fullness of time we may find
literal subjects are permitted, so all that remains is to adjust the
parser.]
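
Sketch for the union-versus-merge point in the forward chaining note
above. This is only an illustration under stated assumptions: Node,
Triple, Graph, graphUnion and graphMerge are hypothetical stand-ins, not
the Swish RDFGraph types or functions, and blank node numbers are assumed
to be non-negative.

```haskell
import Data.List (nub)

-- Hypothetical minimal graph model (NOT the Swish RDFGraph API).
data Node = URI String | Blank Int
  deriving (Eq, Show)

type Triple = (Node, Node, Node)
type Graph  = [Triple]

-- Union: blank nodes carrying the same number in both graphs are
-- treated as the same node, i.e. bnodes are shared between graphs.
graphUnion :: Graph -> Graph -> Graph
graphUnion g1 g2 = nub (g1 ++ g2)

-- Merge: blank nodes of the second graph are renamed apart from those
-- of the first before combining, so no bnode is shared by accident.
-- Assumes all blank node numbers are non-negative.
graphMerge :: Graph -> Graph -> Graph
graphMerge g1 g2 = nub (g1 ++ map renameTriple g2)
  where
    -- One more than the largest blank node number used in g1.
    offset = 1 + maximum (0 : [n | (s, p, o) <- g1, Blank n <- [s, p, o]])
    renameTriple (s, p, o) = (rename s, rename p, rename o)
    rename (Blank n) = Blank (n + offset)
    rename node      = node
```

If two result graphs each use Blank 0, graphUnion keeps a single shared
blank node, while graphMerge renames the second graph's node to Blank 1.
The union is the right choice only while the bnodes are known to come
from the same scope, which is the case the note describes.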

RDFGraph.hs(188): -- The list of quoting options here is incomplete
    -- This has been moved to MiscHelpers

RDFGraph, elsewhere: use of fmap, fmapM can result in a graph that
contains duplicate statements, which in turn may affect some entailment
tests. It seems that the type signature for map, fmap doesn't allow 'nub'
tests to be used there. Maybe GraphEq should nub its arguments? The cost
of nub at this point may be very acceptable. Or just using a better graph
data structure.

N3Parser.hs: Language tags are parsed as names, rather than using the
more restricted syntax of RFC3066.

VarBinding.hs
Decide whether it is acceptable to have multiple results for a given
variable usage pattern. If rules are defined consistently, this should
happen only when the different results give the same answer.

Improved functions:
-------------------

Think about: adding rule composition function, so that an arbitrary pair
of rules can be cascaded and used as a single rule. This could be useful,
for example, when combining RDF entailment rules with subgraph
entailment. Also, have an alternative-composition rule that applies all
rules to the same input, and merges the results (and backward chaining
yields alternatives based on the alternative rules?).
(Note that these structures seem to be remarkably like Monad and
MonadPlus. Develop idea of rules as monads?)

Default base URI for N3 parser: supply document URI as argument, and use
that. (Awaiting final decisions of RDFcore working group.)

Rework the LookupMap class to use a hash table, or one of the available
search structure libraries in place of a simple linear list. This should
radically improve overall performance (according to profiling and my
understanding of the code structure).

Rework the Graph class to be more efficient, particularly with respect to
the primitive access methods used by RDFQuery.

Document function interfaces: rework code to use Haddock conventions, and
generate function reference from that.
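
Sketch for the rule composition note above, showing why the structure
looks like Monad and MonadPlus. Everything here is an assumption made for
illustration: Graph, Rule, cascade, alternatives, addB and addC are
hypothetical names, and a rule is modelled simply as a function from a
graph to a list of result graphs, which is not how Swish rules are
defined. The alternatives function collects the results side by side
rather than merging them into one graph.

```haskell
import Control.Monad ((>=>))

-- Hypothetical stand-ins (NOT Swish types): a graph is a list of
-- statements, and a forward rule maps a graph to its possible results.
type Graph = [String]
type Rule  = Graph -> [Graph]

-- Cascade two rules: every result of r1 is fed to r2.
-- This is Kleisli composition in the list monad.
cascade :: Rule -> Rule -> Rule
cascade r1 r2 = r1 >=> r2

-- Alternative composition: apply both rules to the same input and
-- collect all results (for lists, mplus is just (++)).
alternatives :: Rule -> Rule -> Rule
alternatives r1 r2 g = r1 g ++ r2 g

-- Toy rules, for illustration only.
addB :: Rule
addB g = [g ++ ["b"] | "a" `elem` g]   -- fires only if "a" is present

addC :: Rule
addC g = [g ++ ["c"] | "b" `elem` g]   -- fires only if "b" is present
```

Here cascade addB addC applied to ["a"] gives the single result
["a","b","c"], while cascade addC addB applied to the same input gives no
result at all, because addC does not fire. A rule that produces no
results behaves like mzero, which is what suggests the MonadPlus reading.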

New functions:
--------------

Provide for construction of CWM-style rules by prescanning given rules
for special properties and adding variable binding modifiers accordingly.

Think about approach to backward chaining that evaluates new antecedents
as soon as they are generated, and instantiates rule variables
accordingly. (Does Haskell lazy evaluation mean that something like this
already happens?)

NTFormatter (N-triples output)

N3Formatter improved output (initial implementation is focused mainly on
round-tripping)

Code improvements:
------------------

Combine check and explain proof functions in Proof.hs.
(checkProof should use explainProof, and discard the explanation.)

Sort out Formula naming mess in RDFGraph.hs.

Rework DatatypeMod to use DatatypeRel logic as much as possible.
Re-implement getDTMod based on this. (Module Datatype.hs)

RDFGraph merge function: create new function that operates on a list of
graphs, and optimize the common (one graph) case. Remove the existing
merge function, if not used.

Refactor RDFQuery module to use RDFQuerySimple for all graph access.

Separate GraphClosure from module RDFRuleset.

Note that listProduct is a specialization of 'sequence' for lists.
Eliminate listProduct.

Rework GraphMem.hs to properly separate the graph container structure
from the graph label type. Now that I have a better understanding of
multiparameter classes and kinds in type expressions, this should be
doable.

Tidy up N3Parser use of Parsec library, particularly with respect to
token parsing.

GraphMatch.hs(222): -- replace Equivalence class pair by (index,[lb],[lb]) ?
GraphMatch.hs(223): -- possible optimization use of graphMapEq test
GraphMatch.hs(297): -- replace test with is isJust try
GraphTest.hs(687): -- test hash value collision on non-variable label
N3Formatter.hs(243): -- use pattern for subject/property/object loops?

N3Formatter:
-- (a) Initial prefix list to include nested formulae;
--     then don't need to update prefix list for these.
-- (b) blank nodes used just once, can be expanded inline using
--     [...] syntax.
-- (c) generate multi-line literals when appropriate
-- (d) more flexible terminator generation for formatted formulae
--     (for inline blank nodes.)

Done:
-----

Eliminate un-needed references to QName, getQName (from ScopedName).
Swish now uses module Namespace for all scoped name handling.

N3Parser.hs(804): / rework the URI parser to use the Parsec library

URI parsing:
(a) URI parser in N3 parser: doesn't permit base scheme name.
(b) Rework URI parsing to use Parsec library.

Create separate UnitTestHelper module for common test patterns.
/ Have created module TestHelpers, but individual test modules
  still define their own.

----------------------------------------------------------------------
$Log: TODO.txt,v $
Revision 1.56  2004/07/13 17:33:51  graham
RDF/XML parser passes all test cases.

Revision 1.55  2004/07/01 15:05:31  graham
Moved Label and Vocabulary modules to separate directory

Revision 1.54  2004/06/30 16:48:35  graham
Remove references to module QName from Swish, which now uses module
Namespace for all its scoped name handling.

Revision 1.53  2004/06/30 11:34:16  graham
Update Swish code to use hierarchical libraries for Parsec and Network.

Revision 1.52  2004/02/09 22:22:44  graham
Graph matching updates: change return value to give some indication
of the extent match achieved in the case of no match.
Added new module GraphPartition and test cases.
Add VehicleCapacity demonstration script.

Revision 1.51  2004/01/07 19:57:24  graham
Reorganized RDFLabel details to eliminate separate language field,
and to use ScopedName rather than QName.
Removed some duplicated functions from module Namespace.

Revision 1.50  2004/01/07 19:49:13  graham
Reorganized RDFLabel details to eliminate separate language field,
and to use ScopedName rather than QName.
Removed some duplicated functions from module Namespace.

Revision 1.49  2003/12/30 12:18:05  graham
Update TODO notes

Revision 1.48  2003/12/18 20:46:24  graham
Added xsd:string module to capture equivalence of xsd:string
and plain literals without a language tag

Revision 1.47  2003/12/18 18:27:47  graham
Datatyped literal inferences all working
(except equivalent literals with different datatypes)

Revision 1.46  2003/12/16 07:05:37  graham
Working on updated RDFProofContext

Revision 1.45  2003/12/11 19:11:07  graham
Script processor passes all initial tests.

Revision 1.44  2003/12/10 03:48:58  graham
SwishScript nearly complete: BwdChain and ProofCheck to do.

Revision 1.43  2003/12/08 23:55:36  graham
Various enhancements to variable bindings and proof structure.
New module BuiltInMap coded and tested.
Script processor is yet to be completed.

Revision 1.42  2003/12/04 02:53:28  graham
More changes to LookupMap functions.
SwishScript logic part complete, type-checks OK.

Revision 1.41  2003/11/28 20:26:05  graham
Updated. Mention SwishScript.txt.

Revision 1.40  2003/11/25 23:02:17  graham
Reworked datatype variable modifier logic.
Limited range of test cases so far all pass.

Revision 1.39  2003/11/24 17:27:35  graham
Separate module Vocabulary from module Namespace.

Revision 1.38  2003/11/24 15:46:03  graham
Rationalize N3Parser and N3Formatter to use revised vocabulary
terms defined in Namespace.hs

Revision 1.37  2003/11/19 22:13:03  graham
Some backward chaining tests passed

Revision 1.36  2003/11/14 21:48:35  graham
First cut cardinality-checked datatype-constraint rules to pass test
cases. Backward chaining is still to do.

Revision 1.35  2003/11/12 20:44:24  graham
Added some vocabulary to Namespace.
Enhanced ScopedName to allow null namespace prefixes, following
N3 display conventions.

Revision 1.34  2003/11/11 21:02:55  graham
Working on datatype class-constraint inference rule. Incomplete.

Revision 1.33  2003/11/07 21:45:47  graham
Started rework of datatype to use new DatatypeRel structure.

Revision 1.32  2003/10/24 21:05:08  graham
Working on datatype inference. Most of the variable binding logic
is done, but the rule structure still needs to be worked out to support
forward and backward chaining through the same rule.

Revision 1.31  2003/10/22 15:47:46  graham
Working on datatype inference support.

Revision 1.30  2003/10/16 16:01:49  graham
Reworked RDFProof and RDFProofContext to use new query binding
framework. Also fixed a bug in the variable binding filter code that
caused failures when a variable used was not bound.

Revision 1.29  2003/10/14 20:31:07  graham
Add separate module for generic variable binding functions.

Revision 1.28  2003/10/09 17:16:13  graham
Added test cases to exercise features of rules used to capture RDF
semantics. Also added proof test case using XML literal.

Revision 1.27  2003/10/09 13:58:59  graham
Sync with CVS. Preparing to eliminate QueryBindingFilter in favour
of using just QueryBindingModifier.

Revision 1.26  2003/09/24 18:50:52  graham
Revised module format to be Haddock compatible.

Revision 1.25  2003/09/24 13:50:46  graham
QName handling separated from RDFGraph module, and
QName splitting moved from URI module to QName module.