$Id: TODO.txt,v 1.48 2003/12/18 20:46:24 graham Exp $

HaskellRDF TODO
===============

... possible release point?

Add variable scoping mechanism (list variables bound in graph?)
  Currently, antecedents for forward chaining are assumed to share
  bnodes -- this isn't generally a safe assumption (RDFRuleset.hs).
  Use variable scoping when merging -- new merge function?

Flesh out datatype subtyping.
  Need to provide access to subtyping via RDFLabel values (for BuiltInMap).
  Add subtype discovery (DatatypeRel access) to BuiltInMap.
  Add reflexivity case to datatype definition.
  Finish definition of rdfdr3 and sameDatatypedValue function.

Add inference-discovery and proof-generation functions

Extend N3 reader logic to allow input using HTTP GET.

Extend N3 writer logic to allow output using HTTP PUT.

Haddock-ize all comments and generate API documentation


Bugs or clarification of required behaviour
-------------------------------------------

Forward chaining and proof-check of RDF graphs currently assume that bnodes
are shared between graphs.

Think about: change RDF forward chaining to return a separate graph for each
match of the antecedent: mainly, this will require a trivial change to
RDFRuleset.graphClosureFwdApply, and may need some more thinking about
handling of lists of graphs in the script processor.
[Later] this would also require an explicit merge rule, so that graphs can
be recombined; but the merge rule would have to take account of bnode
renaming. So maybe it's better to recombine the result graphs while we know
that the union, not the merge, is the appropriate result. This position
might change when proper bnode scoping is introduced.

Think about handling of comparison functions (i.e. with no computed result)
in constraint rules. I don't think it makes sense to infer a contradiction,
so maybe the cardinality constraints should be dropped?

Multiple variable binding modifiers in an RDF closure rule don't have
provision for using different sequencing to ensure compatibility with the
bound variables supplied. This consideration may be overkill, but if needed,
see modules SwishScript and RDFRuleset. It may be necessary to extend the
graph closure to allow a list of variable binding modifiers, then compose
them dynamically as needed.

Decide how to handle statements with literal as subject
(e.g. ex:node is ex:prop of "someLit" .)
[Hint: if I do nothing for now, in the fullness of time we may find literal
subjects are permitted, so all that remains is to adjust the parser.]

URI parser in N3 parser: doesn't permit base scheme name.

RDFGraph.hs(188): -- The list of quoting options here is incomplete

RDFGraph, elsewhere: use of fmap, fmapM can result in a graph that contains
duplicate statements, which in turn may affect some entailment tests. It
seems that the type signature for map, fmap doesn't allow 'nub' tests to be
used there. Maybe GraphEq should nub its arguments? The cost of nub at this
point may be very acceptable. Or just use a better graph data structure.

N3Parser.hs: Language tags are parsed as names, rather than using the more
restricted syntax of RFC3066.

VarBinding.hs: Decide whether it is acceptable to have multiple results for
a given variable usage pattern. If rules are defined consistently, this
should happen only when the different results give the same answer.


Improved functions:
-------------------

Think about: adding rule composition function, so that an arbitrary pair of
rules can be cascaded and used as a single rule. This could be useful, for
example, when combining RDF entailment rules with subgraph entailment.
Also, have an alternative-composition rule that applies all rules to the
same input, and merges the results (and backward chaining yields
alternatives based on the alternative rules?).
(Note that these structures seem to be remarkably like Monad and MonadPlus.
Develop idea of rules as monads?)

Default base URI for N3 parser: supply document URI as argument, and use
that. (Awaiting final decisions of RDFcore working group.)

Rework the LookupMap class to use a hash table, or one of the available
search structure libraries in place of a simple linear list. This should
radically improve overall performance (according to profiling and my
understanding of the code structure).

Rework the Graph class to be more efficient, particularly with respect to
the primitive access methods used by RDFQuery.

Document function interfaces: rework code to use Haddock conventions, and
generate function reference from that.


New functions:
--------------

Provide for construction of CWM-style rules by prescanning given rules for
special properties and adding variable binding modifiers accordingly.

Think about approach to backward chaining that evaluates new antecedents as
soon as they are generated, and instantiates rule variables accordingly.
Does Haskell lazy evaluation mean that something like this already happens?

NTFormatter (N-triples output)

N3Formatter improved output
(initial implementation is focused mainly on round-tripping)

RDFParser (from XML)

Framework for handling datatyped literals.
(Awaiting final decisions of RDFcore working group.)

Revise label structure when RDFCore confirms new approach to literal
language tags.
  RDFGraphTest.hs(154)
  RDFGraph.hs(129)


Code improvements:
------------------

Combine check and explain proof functions in Proof.hs.
(checkProof should use explainProof, and discard the explanation.)

Revise RDFGraph to use ScopedName rather than QName.

Sort out Formula naming mess in RDFGraph.hs.

Rework DatatypeMod to use DatatypeRel logic as much as possible.
Re-implement getDTMod based on this. (Module Datatype.hs)

RDFGraph merge function: create new function that operates on a list of
graphs, and optimize the common (one graph) case. Remove the existing merge
function, if not used.

Refactor RDFQuery module to use RDFQuerySimple for all graph access.

Separate GraphClosure from module RDFRuleset.

Note that listProduct is a specialization of 'sequence' for lists.
Eliminate listProduct.

Rework GraphMem.hs to properly separate the graph container structure from
the graph label type. Now that I have a better understanding of
multiparameter classes and kinds in type expressions, this should be doable.

Rework URI parsing to use Parsec library.

Tidy up N3Parser use of Parsec library, particularly with respect to token
parsing.

Create separate UnitTestHelper module for common test patterns.
See VarBindingTest for starters.
See also RDFProofTest.
See also RDFDatatypeXsdIntegerTest. (testElem)

RDFGraph.hs: eliminate type Language, and use (Maybe String) in its place.

GraphMatch.hs(222): -- replace Equivalence class pair by (index,[lb],[lb]) ?
GraphMatch.hs(223): -- possible optimization use of graphMapEq test
GraphMatch.hs(297): -- replace test with is isJust try
GraphTest.hs(687): -- test hash value collision on non-variable label
N3Formatter.hs(243): -- use pattern for subject/property/object loops?
N3Formatter:
  -- (a) Initial prefix list to include nested formulae;
  --     then don't need to update prefix list for these.
  -- (b) blank nodes used just once, can be expanded inline using
  --     [...] syntax.
  -- (c) generate multi-line literals when appropriate
  -- (d) more flexible terminator generation for formatted formulae
  --     (for inline blank nodes.)
N3Parser.hs(804): -- rework the URI parser to use the Parsec library
ParseURI.hs(355): -- factor higher order function


$Log: TODO.txt,v $
Revision 1.48 2003/12/18 20:46:24 graham
Added xsd:string module to capture equivalence of xsd:string
and plain literals without a language tag

Revision 1.47 2003/12/18 18:27:47 graham
Datatyped literal inferences all working
(except equivalent literals with different datatypes)

Revision 1.46 2003/12/16 07:05:37 graham
Working on updated RDFProofContext

Revision 1.45 2003/12/11 19:11:07 graham
Script processor passes all initial tests.

Revision 1.44 2003/12/10 03:48:58 graham
SwishScript nearly complete: BwdChain and ProofCheck to do.

Revision 1.43 2003/12/08 23:55:36 graham
Various enhancements to variable bindings and proof structure.
New module BuiltInMap coded and tested.
Script processor is yet to be completed.

Revision 1.42 2003/12/04 02:53:28 graham
More changes to LookupMap functions.
SwishScript logic part complete, type-checks OK.

Revision 1.41 2003/11/28 20:26:05 graham
Updated. Mention SwishScript.txt.

Revision 1.40 2003/11/25 23:02:17 graham
Reworked datatype variable modifier logic.
Limited range of test cases so far all pass.

Revision 1.39 2003/11/24 17:27:35 graham
Separate module Vocabulary from module Namespace.

Revision 1.38 2003/11/24 15:46:03 graham
Rationalize N3Parser and N3Formatter to use revised vocabulary
terms defined in Namespace.hs

Revision 1.37 2003/11/19 22:13:03 graham
Some backward chaining tests passed

Revision 1.36 2003/11/14 21:48:35 graham
First cut cardinality-checked datatype-constraint rules to pass test cases.
Backward chaining is still to do.

Revision 1.35 2003/11/12 20:44:24 graham
Added some vocabulary to Namespace.
Enhanced ScopedName to allow null namespace prefixes, following
N3 display conventions.

Revision 1.34 2003/11/11 21:02:55 graham
Working on datatype class-constraint inference rule. Incomplete.

Revision 1.33 2003/11/07 21:45:47 graham
Started rework of datatype to use new DatatypeRel structure.

Revision 1.32 2003/10/24 21:05:08 graham
Working on datatype inference. Most of the variable binding logic
is done, but the rule structure still needs to be worked out to support
forward and backward chaining through the same rule.

Revision 1.31 2003/10/22 15:47:46 graham
Working on datatype inference support.

Revision 1.30 2003/10/16 16:01:49 graham
Reworked RDFProof and RDFProofContext to use new query binding
framework. Also fixed a bug in the variable binding filter code that
caused failures when a variable used was not bound.

Revision 1.29 2003/10/14 20:31:07 graham
Add separate module for generic variable binding functions.

Revision 1.28 2003/10/09 17:16:13 graham
Added test cases to exercise features of rules used to capture RDF semantics.
Also added proof test case using XML literal.

Revision 1.27 2003/10/09 13:58:59 graham
Sync with CVS. Preparing to eliminate QueryBindingFilter in favour
of using just QueryBindingModifier.

Revision 1.26 2003/09/24 18:50:52 graham
Revised module format to be Haddock compatible.

Revision 1.25 2003/09/24 13:50:46 graham
QName handling separated from RDFGraph module, and
QName splitting moved from URI module to QName module.
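

Note on eliminating listProduct (see Code improvements):
The claim that listProduct is a specialization of 'sequence' for lists can
be checked with a small sketch like the one below. It assumes listProduct
computes the Cartesian product of a list of lists, which this file does not
actually state. The names listProductSketch, viaSequence and agreesOn are
invented here for illustration and are not taken from the source tree.

```haskell
-- Hypothetical stand-in for listProduct. The real definition is not
-- shown in this file, so Cartesian-product behaviour is an assumption.
listProductSketch :: [[a]] -> [[a]]
listProductSketch []       = [[]]
listProductSketch (xs:xss) = [ x : ys | x <- xs, ys <- listProductSketch xss ]

-- 'sequence' in the list monad picks one element from each inner list
-- in every possible way, which yields the same Cartesian product.
viaSequence :: [[a]] -> [[a]]
viaSequence = sequence

-- Compare the two definitions on a given input.
agreesOn :: Eq a => [[a]] -> Bool
agreesOn xss = listProductSketch xss == viaSequence xss
```

Loaded into GHCi, agreesOn [[1,2],[3,4]] evaluates to True. If the real
listProduct orders its results differently, callers that depend on result
order would need checking before the swap.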