This specification defines a form of URN to be used for language tags defined or registered according to RFC 3066. The URN namespace used is 'ietf', defined by RFC 2648 and extended by RFC????.


1. Introduction

This specification defines a form of URN to be used for language tags defined or registered according to RFC 3066 [4]. The URN namespace used is 'ietf', defined by RFC 2648 [3] and extended by RFC???? [7] to include names for IETF protocol parameters.

1.1 Background

RFC 3066 [4] defines a construction and registry for tags used to identify human languages. These tags can be used to describe the language used by human-readable text, or other data intended for human presentation.

There are situations in which it is desired to express a language tag in the form of a URI, but no definitive URI form has been defined.

For example, the W3C has defined RDF [5], a generic XML-based metadata format that uses URIs to identify objects and the relationships between them. There is a requirement to use language tags with RDF. RFC 3066 language tags have the desired semantics, but it is not clear how they should be represented in the abstract graph model of RDF. The normal form of identification used by RDF (and the Web in general) is a URI or URI-reference as defined by RFC 2396 [2].

This specification defines a way to embed RFC 3066 language tags in a urn: form of URI, which can be used to identify a language in contexts where a URI is preferred to a text string or token.


2. Registration template

The URN sub-namespace for language tags is defined as follows.

Registry name:
   RFC 3066 [4].
   See also non-registered language tags defined by RFC 3066 [4].

Index value:
   The language tag name is the registry index value. RFC 3066 allows this tag name to contain uppercase letters, lowercase letters, digits, and dash ("-"). Language tags are case-insensitive.

   Some allowed unregistered language tag values are defined by reference to ISO standard 639 [6].

URN formation:
   The URN for a language tag is formed as: "urn:ietf:params:language:<tag-name>", where <tag-name> is the language registry index value, expressed using lower case letters.

   RFC 2141 [1] defines the format of URNs. Allowable characters include all of those noted above.

   URNs are defined by RFC 2141 [1] as lexically equivalent if they are identical following case normalization of the urn scheme name, the namespace name and any %-escaping used. Language tags are defined such that upper- and lower-case ASCII characters are not distinguished. In forming a URN, all ASCII characters in the language tag must be expressed in lower case.
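The URN formation rule above can be illustrated with a short Python sketch. This is a non-normative illustration only: the function name language_tag_to_urn and the simplified character check are assumptions made for the example, and the check does not implement the full RFC 3066 tag grammar (for instance, it places no limit on subtag length).

```python
import re

# Prefix taken from the URN formation rule in this specification.
URN_PREFIX = "urn:ietf:params:language:"

# Simplified check (an assumption for this sketch): one or more runs of
# ASCII letters and digits separated by single dashes. The full RFC 3066
# grammar is stricter than this.
_TAG_PATTERN = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def language_tag_to_urn(tag: str) -> str:
    """Form the URN for a language tag, lower-casing the tag name."""
    if not _TAG_PATTERN.fullmatch(tag):
        raise ValueError(f"not a plausible language tag: {tag!r}")
    # The tag contains only ASCII characters at this point, so lower()
    # performs exactly the ASCII case folding the rule calls for.
    return URN_PREFIX + tag.lower()


if __name__ == "__main__":
    for example in ("MN", "en-US", "zh-yue", "sgn-GB"):
        print(example, "->", language_tag_to_urn(example))
```

Because the tag name is always lower-cased before the URN is formed, two tags that differ only in case yield identical URNs, so a plain string comparison of the results is sufficient.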


3. Examples

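This table lists some language tags and the corresponding urn: URIs.

   Language tag                      URN
   MN (Mongolian)                    urn:ietf:params:language:mn
   en-US (American English)          urn:ietf:params:language:en-us
   zh-yue (Cantonese)                urn:ietf:params:language:zh-yue
   sgn-GB (British sign language)    urn:ietf:params:language:sgn-gb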
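For the reverse direction, a consumer might recover the language tag from a URN of this form. The following Python sketch is one possible approach, not a normative procedure: the function name urn_to_language_tag is hypothetical, and the choice to reject upper-case characters in the tag portion is an assumption based on the requirement that the tag be expressed in lower case. The scheme name and namespace name are compared case-insensitively, following the RFC 2141 [1] equivalence rules.

```python
# Namespace-specific string prefix for language tags in this specification.
NSS_PREFIX = "params:language:"


def urn_to_language_tag(urn: str) -> str:
    """Extract the language tag from a language-tag URN (illustrative)."""
    parts = urn.split(":", 2)
    if len(parts) != 3:
        raise ValueError(f"not a URN: {urn!r}")
    scheme, nid, nss = parts
    # RFC 2141 treats the scheme name and namespace name as
    # case-insensitive for the purpose of lexical equivalence.
    if scheme.lower() != "urn" or nid.lower() != "ietf":
        raise ValueError(f"not an 'ietf' URN: {urn!r}")
    if not nss.startswith(NSS_PREFIX):
        raise ValueError(f"not a language tag URN: {urn!r}")
    tag = nss[len(NSS_PREFIX):]
    # Assumption for this sketch: a tag that is empty or contains
    # upper-case characters is treated as malformed.
    if not tag or tag != tag.lower():
        raise ValueError(f"tag must be non-empty and lower case: {urn!r}")
    return tag


if __name__ == "__main__":
    print(urn_to_language_tag("urn:ietf:params:language:sgn-gb"))
```

A more lenient consumer could lower-case the tag portion instead of rejecting it; which behaviour is appropriate depends on the application.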


4. IANA considerations

This document calls for the creation of a new IETF sub-namespace per RFC???? [7]. Registration details are in the preceding section.


5. Security considerations

No security considerations are introduced by this specification beyond those already inherent in the use of language tags [4].


6. Acknowledgements

The author gratefully acknowledges the contributions of: Martyn Horner, [[[...]]]



Full Copyright Statement