Network working group                               G. Klyne, Clearswift
Internet draft                                              9 April 2002
Expires: October 2002

             An XML format for mail and other messages

Status of this memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC 2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress".

To view the entire list of current Internet-Drafts, please check the "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net (Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au (Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu (US West Coast).

Copyright Notice

Copyright (C) The Internet Society 2001. All Rights Reserved.

Abstract

This document describes a coding of email and other messages in XML. This coding is intended for use by XML applications that exchange information about such messages.

Discussion of this document

Send comments to . To subscribe to this list, send a message with the body 'subscribe' to .

Klyne Internet draft [Page 1] XML coding of RFC822 messages 9 April 2002

Table of contents

   1. Introduction
      1.1 Structure of this document
      1.2 Document terminology and conventions
      1.3 About MIME and XML
   2. Message structures
      2.1 Message header overview
      2.2 Multipart/related message structure
      2.3 Inline XML message structure
      2.4 Content type Message/Email+XML
   3. Message header
      3.1 The element
      3.2 Content of element
      3.3 Use of XML namespaces
      3.4 The element
      3.5 General form of header field elements
      3.6 RFC822-derived header elements
      3.7 Header fields containing addresses
         3.7.1 Header fields containing address groups
      3.8 Header elements containing human readable text
      3.9 MIME header fields
      3.10 Other header fields
         3.10.1 Mandatory extensions
   4. Summary of RFC822-derived header elements
   5. IANA considerations
   6. Internationalization considerations
      6.1 International URIs in XML
   7. Security considerations
   8. Acknowledgements
   9. References
   10. Author's address
   Appendix A: Message/Email+XML content-type registration
   Appendix B: DTD for Email+XML message format
   Appendix C: XML schema for Email+XML message format
   Appendix D: RDF representation of Email+XML message
   Appendix E: RDF schema for Email+XML message format
   Appendix F: Amendment history
   Appendix G: Outstanding issues
   Full copyright statement

1. Introduction

This document describes a coding of email and similar messages (such as RFC822 [1]) using XML [2], described here as the Email+XML message format.

The present document is presented as a design that can be used by XML applications that deal with email and similar messages.

The XML coding is designed to address the following goals:

   o  to fully capture the semantics of Internet email messages, per RFC822 [1]. However it is not intended to provide a loss-less coding of RFC822 syntax.

   o  to extend the scope of address information that can be conveyed to arbitrary URIs [3].

   o  to take account of 8-bit clean transfer environments.

   o  to fully support, where applicable, international character sets and languages within the message header and content [4,5].

   o  to be usable in MIME [6] and pure XML [2] transfer environments.

   o  to be fully compliant with the XML [2] and XML namespace [9] specifications.

   o  to allow header information to be compatible with RDF format [10], for use by generalized metadata processing applications.

1.1 Structure of this document

Section 2 describes the overall message structure, showing how the message header and message content can be conveyed in MIME and XML transfer environments.
Section 3 describes the message header in greater detail, with particular reference to differences in the value of individual fields compared with their RFC822 counterparts.

Section 4 discusses issues that may arise when converting between traditional RFC822 and the Email+XML message format described here.

Appendix A contains a MIME content-type registration for Message/Email+XML.

Appendix B contains a DTD for the Email+XML message format.

Appendix C contains an XML schema for the Email+XML message format. (XML schemas are set to replace DTDs as the preferred way to describe XML document content.)

Appendix D briefly discusses the RDF representation [10] and its applicability to the Email+XML message format.

Appendix E contains an RDF schema [23] description for the Email+XML message format.

1.2 Document terminology and conventions

Message
   an assemblage of information that constitutes a communication of information from a sender to one or more recipients. Consists of a message header and message content.

Message header
   contains information about the message that is conveyed between message user agents, and not used by the message transfer mechanisms. This may include who the message is from, who it is addressed to, other parties to whom it has been copied, subject of the message, date the message was composed, etc.

Message content
   some arbitrary data carried in a message.

Email+XML
   is the message format defined by this document. (This name uses the XML content type labelling convention [11].)

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [19].

NOTE: Comments like this provide additional nonessential information about the rationale behind this document. Such information is not needed for building a conformant implementation, but may help those who wish to understand the design in greater depth.

[[[Editorial comments and questions about outstanding issues are provided in triple brackets like this. These working comments should be resolved and removed prior to final publication.]]]

1.3 About MIME and XML

There has been much discussion about the relative merits of MIME and XML. The position of this document is that they serve different purposes, and are complementary rather than alternatives.

MIME is a framework primarily for encapsulating and composing arbitrary data entities, and offers the following capabilities:

   o  Content type labelling.

   o  Transfer encoding for handling arbitrary data on restricted channels.

   o  Assembly of different kinds of data into composite entities.

   o  End of data detection without need to parse or understand the data content.

XML is a framework primarily for describing data structures, including semi-structured document data, and offers the following capabilities:

   o  Construction of arbitrary data structures based on an annotated tree model.

   o  Fine-grained labelling of structure components and data attributes.

   o  Cross-linking between data structure components.

   o  A standard format for interchange of structured information between diverse systems.

There is, of course, some overlap in capabilities, and reasonable people may disagree about the appropriateness of using MIME and/or XML in particular circumstances.

This document is predicated on the idea that XML is a useful mechanism (in addition to existing facilities) for structuring message header information. It aims to be agnostic with regard to using MIME or some other framework for composing and encapsulating messages.

2. Message structures

A message consists of a message header and message content:

   o  The message header contains information about the message: who it was sent by, who it is addressed to, its subject, date it was sent, and many other related pieces of information.

   o  The message content is any data that is carried by the message: e.g. a text message, fax image, voice message or arbitrary application data. In principle, any data that can be transferred as a MIME object can be message content, though specific applications may limit the kinds of data that can be transferred.

The Email+XML message format uses a URI-reference [3] in the message header to reference the message content. Thus, the message content may be completely separate from the message header; the message header is the root information of a message, from which message content may be discovered.

Two specific message structure scenarios are contemplated here:

   o  Multipart/related, and

   o  An XML element within the message header.

These are described below. Other message structures are possible (e.g. multiple resources on a web server, multiple channels in a multiplexed protocol), but are not described here.

2.1 Message header overview

The message header is an XML document whose root element is . This contains a number of elements; an initial set of such elements is defined based on RFC822 message headers [1].

The message content is indicated by an attribute of the element whose value is a URI-reference for the content.

The message header is discussed in greater detail in section 3 below.
2.2 Multipart/related message structure

A message whose content is formatted as a MIME object [6] may be sent as a Multipart/related object [15]:

   Content-type: multipart/related; boundary="boundary";
                 start="<1@100Aker.org>"; type="message/Email+XML"

   --boundary
   Content-type: Message/Email+XML
   Content-ID: <1@100Aker.org>

   mailto:Pooh@PoohCorner.100Aker.org
   Winnie the Pooh
   mailto:Piglet@BeechTree.100Aker.org
   MR SANDERS
   Woozle Hunting

   --boundary
   Content-Type: text/plain;charset=UTF-8
   Content-ID: <2@100Aker.org>

   I have Been Foolish and Deluded
   I am a Bear of No Brain at All
   --boundary--

In this case, the Multipart/related contains two MIME parts:

   o  the message header, and

   o  the message content.

The Multipart/related content-type header indicates the root of the message by its Content-ID value [6]. In turn, the message header refers to the message content with a element 'content=' attribute whose value is a 'cid:' URI [16].

2.3 Inline XML message structure

When the message content can be expressed as simple text or XML, it may be included within the message header using a element containing the message content instead of a 'content=' attribute.

   Content-type: Message/Email+XML

   mailto:Christopher.Robin@GreenDoor.org
   Christopher Robin
   mailto:Pooh@PoohCorner.100Aker.org
   Winnie the Pooh
   Re: Woozle hunting
   You're the Best Bear in All the World

This example shows the message contained within a single Message/Email+XML MIME object.

The element indicates the message content. When present, this element MUST be the last element contained in a element.

2.4 Content type Message/Email+XML

This specification defines a new MIME content-type called Message/Email+XML.

A Message/Email+XML entity contains an XML document conforming to the DTD known by the SYSTEM identifier 'urn:ietf:params:xml:dtd:email-xml:', per [24]. The document may contain and declarations, but these are not required. The body of the document is a element, as described below.

The character set encoding used in a Message/Email+XML entity is UTF-8.

A Content-type registration template for Message/Email+XML is contained in Appendix A of this document.

3. Message header

The Email+XML message header contains header fields based on RFC822, and coded in XML.

The message header contains information about the message that is conveyed between message user agents, and not used by the message transfer mechanisms. This may include who the message is from, who it is addressed to, other parties to whom it has been copied, subject of the message, date the message was composed, etc.

The message header also contains a reference to the message content, as described in the previous section.

3.1 The element

The element contains the message header, and references the message content. Possible attributes are:

   o  'xmlns=' or 'xmlns:tag=' is used to indicate a default XML namespace or XML namespace tag [9] that applies to the entire element.

   o  'content=' specifies a URI-reference [3] that references the message content, if such content is not contained inline in a '' element. Typically, the value is a 'cid:' URI as described in the previous section. Other message content URI values are possible, but such use is beyond the scope of this specification.

   o  'xml:lang=' [2] may be used, in which case it specifies the language of any text in the message header, except where overridden by an 'xml:lang=' attribute of an enclosed element.
3.2 Content of element

The content of a element is:

   o  a sequence of zero or more header field elements, and

   o  an optional element.

Header field elements may appear in any order. When present, the element MUST be the last one in the .

The element MUST contain either a 'content=' attribute or a single element. It must not contain both.

3.3 Use of XML namespaces

The element, and related element names, the element and element names are all associated with a namespace called 'URN:ietf:params:email-xml:'.

RFC822 header element names are associated with a namespace called 'URN:IANA:namespace:rfc822:'.

(These namespace identifiers are based on "A URN Sub-namespace for Registered Protocol Parameters" [20].)

The namespaces must be declared, either as a default namespace or using a namespace prefix (which is an arbitrary local name). The namespace declaration may appear as an attribute of the element, or in the surrounding XML context.

The message examples in section 2 use namespace prefixes 'emx:' and 'rfc822:', but any prefix could be used here. Here is a different message example using a default namespace rather than a namespace prefix for the non-RFC822-derived names:

   Content-type: Message/Email+XML

   im:Eeyore@ThistlyCorner.100Aker.org
   Eeyore
   Anyone
   Why?
   Wherefore? Inasmuch as which?
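
The statement that a namespace prefix is an arbitrary local name can be illustrated with a short Python sketch. It parses the same header twice, once using an 'emx:' prefix and once using a default namespace, and shows that both forms resolve to identical namespace-qualified names. The two namespace identifiers are those quoted above. The element names 'message' and 'content' are assumptions made only for this sketch, and Python's xml.etree.ElementTree is used simply as a convenient namespace-aware parser.

```python
import xml.etree.ElementTree as ET

# Namespace identifiers quoted from section 3.3.
EMX_NS = "URN:ietf:params:email-xml:"
RFC822_NS = "URN:IANA:namespace:rfc822:"

# Assumption: 'message' and 'content' are illustrative element names only.
PREFIXED = (
    f'<emx:message xmlns:emx="{EMX_NS}" xmlns:rfc822="{RFC822_NS}">'
    "<rfc822:subject>Why?</rfc822:subject>"
    "<emx:content>Wherefore? Inasmuch as which?</emx:content>"
    "</emx:message>"
)
DEFAULTED = (
    f'<message xmlns="{EMX_NS}" xmlns:rfc822="{RFC822_NS}">'
    "<rfc822:subject>Why?</rfc822:subject>"
    "<content>Wherefore? Inasmuch as which?</content>"
    "</message>"
)


def qualified_names(xml_text: str) -> list[str]:
    """Return the {namespace}local-name of every element, in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


if __name__ == "__main__":
    # Both documents yield the same list of qualified names.
    for name in qualified_names(DEFAULTED):
        print(name)
```

A receiving application that matches on the qualified name rather than on the literal prefix treats both forms alike, which is the behaviour the namespace rules above rely on.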
3.4 The element

The element is used to include the message content as text or XML data in the message header. It is present when the element does not have a 'content=' attribute. Possible attributes are:

   o  'type=' is optional, and indicates the MIME content-type of the message content. If not specified, a content type of "text/xml" is assumed. (Whatever MIME content-type may be declared, the message content must be well-formed XML or character data. In practice, this means the content must be some character-based data representation.)

   o  'xml:lang=' [2] may be used, in which case it specifies the language of the message content.

The character encoding for the message content is the same as that used for the surrounding XML. (This is typically UTF-8, from the character set encoding of the MIME content-type Message/Email+XML.)

The message content may be any well-formed XML, which includes simple character data. Characters '<' and '&' that are not part of XML markup MUST be represented as '&lt;' and '&amp;' respectively. The character '>' appearing in the sequence ']]>', other than at the end of a CDATA section, MUST be represented as '&gt;'.

3.5 General form of header field elements

Each header field is represented by an XML element that identifies the field. The element content is the header field value.

For RFC822 and MIME header fields, the field value is character data in which the characters '<', '&' and '>' are represented as for character data in (see above).

3.6 RFC822-derived header elements

For representing information about email messages, this specification introduces message header elements with names and semantics based on RFC822 header fields [1].
The intent is that the semantics of any RFC822 header field is easily represented in an Email+XML header element; it is not a goal to capture the detailed syntax of any particular RFC822 message, or to construct a corresponding RFC822 message from any Email+XML message.

RFC822-derived header elements have names based on RFC822 header names, using all lower-case characters (noting that XML element names are case sensitive).

RFC822-derived header elements are associated with an XML namespace, as noted above at section 3.3, and may need to be combined with a namespace prefix if it is not the default namespace. (See examples in sections 2.2 and 2.3.)

RFC822-derived header element contents have the same syntax and meaning as corresponding RFC822 header field values, except that:

   o  Characters are not limited to US-ASCII. UTF-8 character set encoding is typically used.

   o  Encoded words ('=?...?=') are not needed, and no special processing is defined for sequences of this form.

   o  Special considerations apply to fields containing address values (from, to, etc.) -- see section 3.7 below.

   o  Special considerations apply to fields containing human-readable text values (subject, comments, etc.) -- see section 3.8 below.

3.7 Header fields containing addresses

Parts of an RFC822 address value are separated out into separate elements, all contained within an element. The element types defined here are and .

A major change from RFC822 is that all addresses are presented as URIs, rather than as RFC822 'addr-spec' values. Email addresses (the only kind that appear in RFC822 headers) are expressed as 'mailto:' URLs [21]. Address URIs are enclosed in an element. This change anticipates that XML-based message headers may be used with a variety of different protocols with different addressing schemes.

Finally, only one address per message header element is allowed (or an address group: see below). Where permitted, multiple values are represented by repeating the header element for each value.

Note that characters in URIs are drawn from a limited repertoire; the URI '%' escape sequence may be used to represent other characters that are legal for the URI scheme used [14].

The RFC822 address structures using 'phrase' are supported. The 'phrase' is a "formal name", and is enclosed in a element.

The RFC822 structures using source-route values (i.e. 'route' in 'route-addr') are not supported.

RFC822 'comment' values within addresses are not supported.

Thus, RFC822 e-mail addresses that might be expressed as:

   Piglet@TrespassersW.100Aker.org (MR SANDERS)

which is generally equivalent to:

   MR SANDERS <Piglet@TrespassersW.100Aker.org>

must be presented in the form:

   mailto:Piglet@TrespassersW.100Aker.org
   MR SANDERS

Any '<', '&' and certain '>' characters appearing in a formal name ( element) MUST be represented using '&lt;', '&amp;' or '&gt;' as noted previously in section 3.4.

3.7.1 Header fields containing address groups

Some RFC822 headers can have address group values as well as just address values. The RFC822 'group' structure associates a collection of addresses with a name for that collection. The individual addresses in a group may be omitted.

An address group is expressed using a element containing the name of the group and zero, one or more elements each containing an :

   Christopher-Robins-friends
   mailto:Pooh@PoohCorner.100Aker.org
   Winnie the Pooh
   mailto:Piglet@TrespassersW.100Aker.org
   MR SANDERS
   im:Eeyore@ThistlyCorner.100Aker.org
   Eeyore

Omitting the individual member addresses, this would be:

   Christopher-Robins-friends

3.8 Header elements containing human readable text

Header fields that contain human readable text MAY have an 'xml:lang=' attribute of the header element to indicate a language for the contained text. In the absence of such an attribute, any language applicable to the surrounding XML is to be assumed.

3.9 MIME header fields

MIME content header fields MAY be part of the message header, using the same general format and XML namespace as RFC822-derived header fields (i.e. element name based on the MIME header field name, and associated with the same XML namespace).

But note that most MIME header fields are not appropriate for use with the Email+XML message format. When the message content is supplied as a separate MIME entity then MIME content header fields SHOULD be applied to that entity.

It is expected that MIME header fields may be useful in the following circumstances:

   o  When the message content is included as inline XML, to convey information about it that cannot be conveyed using native XML mechanisms; e.g. the Content-features header [22].

   o  MIME headers, not having an obvious XML counterpart, that express information that might be taken as metadata applying to the message as a whole, in isolation from the specific message content; e.g. the Content-description header field.

3.10 Other header fields

A message header MAY contain header fields that are not derived from RFC822 or MIME. Any such header field names used MUST be associated with a different namespace.

This specification does not define any such additional header fields.

3.10.1 Mandatory extensions

In general, a message handler should ignore any header fields that it does not understand.
But sometimes it is desirable to introduce new header fields that must be understood for proper processing of the message to take place.

This specification defines an XML attribute 'mustUnderstand=', which indicates whether or not the element to which it applies must be understood by a message processor:

mustUnderstand='false'
   is the default case, and indicates that the corresponding element MAY safely be ignored.

mustUnderstand='true'
   indicates that the element to which it applies MUST be processed, OR processing of the entire message (or message header) MUST be abandoned.

In XML namespace terms [9], the 'mustUnderstand=' attribute belongs to a "per-element-type namespace partition". Interpretation of the attribute is a property of the element to which it applies. In any case, the DTD or XML schema must declare that the attribute is allowed on any particular XML element type.

It is strongly recommended that any header elements used within an Email+XML message header allow this attribute with the interpretation described here.

Non-validating XML processors used to handle Email+XML message headers MAY interpret the 'mustUnderstand=' attribute appearing on any header field element as described here.

Notwithstanding the presence or absence of a 'mustUnderstand=' attribute, individual applications may require that certain header elements are present or absent from any header that they interpret.

4. Summary of RFC822-derived header elements

RFC822 fields containing a simple address:
   return-path from sender resent-from resent-sender

RFC822 fields containing an address or group:
   to cc bcc reply-to resent-to resent-cc resent-bcc resent-reply-to

RFC822 fields containing human-readable text:
   keywords subject comments

Other RFC822 fields:
   received date resent-date message-id resent-message-id in-reply-to references encrypted

5. IANA considerations

This specification calls for the registration of the new MIME content-type Message/Email+XML. The registration template is at appendix A.

[[[XML document identifier -- URN from IANA space?]]]

[[[XML namespace identifier -- URN from IANA space?]]]

[[[Waiting on [20]...]]]

6. Internationalization considerations

This specification attempts to relax the restriction of international data imposed by RFC822.

RFC822 limits characters in address local parts to US-ASCII. This specification uses URIs and an XML-based address format, relaxing that constraint so that foreign language personal names can be represented.

Character restrictions apply to URIs, and the %-escape mechanism defined by RFC2396 must be followed for representing non-URI characters. The character encoding used is dependent on the URI scheme, but UTF-8 is the strongly recommended choice.

[[[todo: cite IRI work, and charmod?]]]

Similarly, the characters that can be used in domain names are currently severely constrained. Work is under way to define international forms for domain names.

Message content is tagged using standard MIME capabilities (charset parameter for text data [13], and Content-language header for language tagging [22]).

Mandating handling of international data formats is a matter for particular applications; it is recommended that applications using the Email+XML message format be required to process UTF-8 coded character data. That does not necessarily mean that all characters received can be displayed.

For content included in an XML element, language tagging can be achieved by including an 'xml:lang=' attribute [16] in the element (subject to appropriate DTD or XML schema permission to use that attribute).
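
The %-escaping of non-URI characters mentioned earlier in this section can be sketched in Python as follows. This is a minimal illustration rather than a complete URI encoder: it assumes UTF-8 as the character encoding (the recommended choice above) and escapes only non-ASCII characters, so any ASCII characters that a particular URI scheme reserves or excludes would need separate handling. The function name and the example address are hypothetical.

```python
def percent_escape_non_ascii(uri: str) -> str:
    """Replace each non-ASCII character with the %HH escapes of its UTF-8 bytes."""
    escaped = []
    for char in uri:
        if ord(char) < 128:
            # ASCII characters are passed through unchanged in this sketch.
            escaped.append(char)
        else:
            # One %HH triplet per byte of the character's UTF-8 encoding.
            escaped.extend(f"%{byte:02X}" for byte in char.encode("utf-8"))
    return "".join(escaped)


if __name__ == "__main__":
    # Hypothetical address with a non-ASCII local part.
    print(percent_escape_non_ascii("mailto:Ren\u00e9@example.org"))
```

An address URI that is already pure ASCII passes through unchanged, so applying the function to every address value before writing it into a header element is harmless.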
6.1 International URIs in XML

This sub-section is commentary, not part of this specification:

In a message to the W3C URI mailing list (http://lists.w3.org/Archives/Public/uri/2000Oct/0008.html), Martin Duerst wrote:

   The original XML spec says (http://www.w3.org/TR/1998/REC-xml-19980210#sec-external-ent):

      An XML processor should handle a non-ASCII character in a URI by representing the character in UTF-8 as one or more bytes, and then escaping these bytes with the URI escaping mechanism (i.e., by converting each byte to %HH, where HH is the hexadecimal notation of the byte value).

   This says that the XML processor should do this for you, and therefore it should be okay for you to put in the original characters. But there are three problems here:

   o  It says 'should', not must.

   o  It's not clear whether it applies to all URIs, or just to the URIs used in System Identifiers, and in the former case, it's not clear how an XML processor would find all URIs in a document (without e.g. Schema information).

   o  The text in the second edition of XML (http://www.w3.org/TR/REC-xml#sec-external-ent) is much clearer about how the conversion has to take place; unfortunately, it doesn't make clear who should do this conversion (the original document producer or the XML processor).

   The idea was not to change this for the second edition, but somehow it got lost. I'm following up on this.

7. Security considerations

This document for the most part describes an alternative coding of an existing message structure, and is not believed to introduce any new security exposure not already inherent in existing systems.

MIME based messages may be protected using existing MIME security frameworks, such as S/MIME [12], OpenPGP [13], etc.

Using a non-MIME, pure XML message format means that alternative security frameworks may be applicable, such as XML digital signatures [14].

Note that this framework is not designed to allow the conversion of message formats (e.g. between RFC822 and XML) while preserving signatures or other security information. If a signature is applied in a MIME body part, and that body part is moved to a message with a different header format, then the signature may be expected to remain intact.

8. Acknowledgements

The author thanks the following for their comments and/or contributions: Harald Alvestrand, Dave Crocker, Simon Josefsson, [[[...]]].

9. References

[1] Crocker, D., "Standard for the format of ARPA Internet text messages", RFC 822, STD 11, August 1982.

[2] Tim Bray, Jean Paoli, and C. M. Sperberg-McQueen, "Extensible Markup Language (XML) 1.0", W3C recommendation, 10 February 1998.

[3] Berners-Lee, T., Fielding, R.T. and L. Masinter, "Uniform Resource Identifiers (URI): Generic Syntax", RFC 2396, August 1998.

[4] Weider, C., Preston, C., Simonsen, K., Alvestrand, H., Atkinson, R., Crispin, M., Svanberg, P., "Report from the IAB Character Set Workshop", RFC 2130, April 1997.

    Alvestrand, H, "IETF Policy on Character Sets and Languages", RFC 2277, BCP 18, January 1998.

    Freed, N., and J. Postel, "IANA Charset Registration Procedures", BCP 19, RFC 2278, January 1998.

    [[[Is there a more definitive reference?]]]

[5] Alvestrand, H., "Tags for the Identification of Languages", RFC 1766, March 1995. (Defines Content-language header.)

[6] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996.

[7] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", RFC 2046, November 1996.
[8] Freed, N., Klensin, J., and J. Postel, "Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures", RFC 2048, BCP 13, November 1996. [9] Tim Bray, Dave Hollander, and Andrew Layman "Namespaces in XML", W3C recommendation: , 14 January 1999. [10] Lassila, O. and R. Swick, "Resource Description Framework (RDF) Model and Syntax Specification", W3C recommendation: , 22 February 1999. [11] Kohn, D., Murata, M. and S. St.Laurent, "XML Media Types", draft-murata-xml-09.txt, September 2000. (Introduces '+XML' content-type naming convention.) Klyne Internet draft [Page 21] XML coding of RFC822 messages 9 April 2002 [12] Ramsdell, B., "S/MIME Version 3 Message Specification", RFC 2633, June 1999. [13] Callas, J., Donnerhacke, L., Finney, H. and R. Thayer, "OpenPGP Message Format", RFC 2440, November 1998. [14] Eastlake, D., Reagle, J. and D. Solo, "XML-Signature Syntax and Processing", Work in progress: , August 2000. [15] Levinson, E., "The MIME Multipart/Related Content-type", RFC 2387, August 1998. [16] Levinson, E., "Content-ID and Message-ID Uniform Resource Locators", RFC 2392, August 1998. [17] Daniel, R., DeRose, S. and E. Maler "XML Pointer Language (XPointer) Version 1.0", W3C Candidate Recommendation: 7 June 2000. [18] Fallside, D., "XML Schema Part 0: Primer", W3C Working Draft: , 22 September 2000. Thompson, H., Beech, D., Maloney, M., and N. Mendelsohn "XML Schema Part 1: Structures", W3C Working Draft: 22 September 2000. Biron, P. and A. Malhotra, "XML Schema Part 2: Datatypes", W3C Working Draft: 22 September 2000. Klyne Internet draft [Page 22] XML coding of RFC822 messages 9 April 2002 [19] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", RFC 2119, March 1997. [20] Mealling, M., Masinter, L., Hardie, T., and G. Klyne, "A URN Sub-namespace for Registered Protocol Patameters", draft-mealling-iana-urn-01.txt (work in progress), August 2001. [21] Hoffman, P., Masinter, L., and J. 
Zawinski, "The mailto URL scheme", RFC 2368, July 1998.

[22] Klyne, G., "Indicating Media Features for MIME Content", RFC 2912, September 2000.

[23] Brickley, D. and R. V. Guha, "Resource Description Framework (RDF) Schema Specification", W3C recommendation, 27 March 2000.

[24] Mealling, M., "The IETF XML Registry", draft-mealling-iana-xmlns-registry-02.txt (work in progress), June 2001.

10. Author's address

Graham Klyne
MIMEsweeper Group
Clearswift Corporation
1310 Waterside
Arlington Business Park
Theale
Reading, RG7 4SA
United Kingdom
Telephone: +44 11 8903 8903
E-mail: Graham.Klyne@MIMEsweeper.com

Appendix A: Message/Email+XML content-type registration

[[[TBD]]]

Appendix B: DTD for Email+XML message format

[[[TBD]]]

Appendix C: XML schema for Email+XML message format

[[[TBD]]]

Appendix D: RDF representation of Email+XML message

The message header format described here is designed to be compatible with RDF [10]. To prepare a message header for presentation to an RDF processor, it should be enclosed in an element having an appropriate RDF namespace declaration.

In RDF terms, the message header is a resource, having a property arc for each header element and also one for the message content. Here is an informal representation of the RDF graph corresponding to the message example from section 2.3:

   []
   |
   +--rfc822:from--> []
   |                 |
   |       -----------
   |       |
   |       +--adrs-->"im:Eeyore@ThistlyCorner.100Aker.org"
   |       +--name-->"Eeyore"
   |
   +--rfc822:to-------> []
   |                    |
   |                    +--name--> "Anyone"
   |
   +--rfc822:subject--> "Why?"
   |
   +--content--> "Wherefore? Inasmuch as which?"

There is a subtle difference in the RDF form of a message with inline content and one that references a separate content object. Both have a 'content' property, but the kind of value differs: if the content is defined externally, the value of the 'content' property is an RDF resource containing the content; when the content is inline, the property value is an RDF literal.

If inline message content contains XML markup, to ensure complete RDF compatibility the 'content' element should have an attribute 'parseType="Literal"', to prevent the RDF processor from trying to interpret the content as RDF.

Appendix E: RDF schema for Email+XML message format

[[[TBD]]]

Appendix F: Amendment history

00a 13-Oct-2000  Memo initially created.

00b 16-Oct-2000  Add reference to XML spec note about non-ASCII text in a URI.

00c 18-Oct-2000  Change RFC822|XML to RFC822+XML (per later XML-MIME spec).

01a 04-Jan-2001  Change draft title and message format name. Indicate that this is not an exact coding of RFC822 messages, but an attempt to capture their essential semantics. Change syntax of address elements to be RDF compliant.

01b 10-Jan-2001  Add RFC822 group structure to address format. Distinguish between headers that allow group values and those that allow simple addresses. Use separate namespaces for message structure and headers derived from RFC822. Add brief discussion of RDF compatibility.

01c 12-Jan-2001  Add discussion list details.

02a 19-Jan-2001  Add clarification to security considerations that message signatures are not generally expected to survive any message format conversion.

02b 10-Sep-2001  Update contact details. Update proposed namespace names in line with [20].
Update proposed DTD name, per [24] (new reference).

03a 09-Apr-2002  Update contact details. Change name of 'seeNoEvil' attribute to 'mustUnderstand'.

Appendix G: Outstanding issues

o Review namespace URIs.

o Review MIME type name. (Message/XML? Application/Message+XML?)

o Allow more flexible use of RDF syntax to reduce verbosity (but increase number of different ways of expressing some constructs in XML; e.g. adrs and name attributes for
)?

o Clarify effect of namespaces (or not) on element attribute names. XML attributes do not follow the same default namespace rules as elements.

o Define DTD, XML schema and RDF schema.

o Finalize IANA considerations.

Full copyright statement

Copyright (C) The Internet Society 2001. All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.