In many environments, DNS will not be available. The lowest common denominator is to specify exactly how to resolve a URN based on information in a text configuration file. Because URN references vary widely in scope, this configuration file must be very flexible. Many of the cues for its design have been taken directly from the work on the DNS-based NAPTR [3, 4] record.
The file consists of groupings of resolution services that are delimited by namespace identifiers. Each logical group starts with the namespace ID (NID) and is followed by the resources needed to resolve that namespace. The declaration of a new NID, or the end of the file, terminates the current NID's services.
The first step in the resolution service is to extract some meaningful information from the raw URN. In the DNS-based system, there is a set of rewrite rules that are used to extract the next-level domain to query for further resolution. For the file-based resolver, everything is resolved within the context of the one file, so there is no need to consult any further services. Instead, we use the same regular expression syntax to extract the name of a group that may be able to provide the next level of resolution.
At the group level, we are committed to resolving the URN to a particular resource. Within this group is the list of resources that we may resolve the URN to. Because the URN must be rewritten into the resource, we also need another regular expression to extract the required information and add it to the raw resource string.
All resources are specified as URLs because we must give an exact location of where to locate the resource. URLs are the most compact expression available to us in a text file.
At the top level, we know exactly what URN NIDs may be handled. Anything not defined in this file cannot be resolved. Resolvable namespaces are indicated using the "NID:" keyword:
NID: vrml
The namespace identifier may be treated as case-insensitive when comparing it during a resolution request.
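As a small illustration of the case-insensitive comparison, the sketch below pulls the NID out of a URN and folds it to lower case before looking it up. The function name nid_of and the set known_nids are hypothetical and not part of the file format.

```python
def nid_of(urn: str) -> str:
    """Return the lower-cased namespace identifier of a 'urn:<NID>:<NSS>' string."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn":
        raise ValueError(f"not a URN: {urn!r}")
    return parts[1].lower()


# NIDs declared in the configuration file, stored lower-cased (hypothetical).
known_nids = {"vrml", "cid"}

print(nid_of("URN:VRML:umel:texture/wood.gif") in known_nids)
```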
Next we need to specify a group of resources that we think are appropriate for resolving the namespace. To do this, we specify a regular expression that extracts a group name from the URN. For simplicity and commonality, we use exactly the same regular expression format as may be specified in the replacement field of a NAPTR record. Applying this regexp to the URN should give us the name of the group to look at.
The regular expression is specified in the file using the "REGEXP:" keyword and shall immediately follow the namespace identifier that it is being used on. For example, using the following definition:
NID: vrml
REGEXP: /urn:vrml:([^\/:]+)/\1/i
on the urn
urn:vrml:umel:texture/wood.gif
would result in the group name umel being generated.
Groups are specified in the file using the "GRP:" keyword. Under each group is a list of the resources that may be used to complete the resolution and potential access of the named object. These are listed in order of preference, from the highest preference first to the lowest preference last. The preference list is terminated when a new group or namespace is started.
A resource is specified with the "RES:" keyword. Following this is a single, fully specified URL that is quoted. Whitespace then delimits the start of the regular expression. The regular expression is used to extract information from the full, original URN and append it to the URL specified in the resource. For example, given the following specification:
GRP: umel
RES: "file:///c:/urn/media/" /urn:vrml:umel:([^\/]+)\/(.*)/\1\/\2/i
RES: "http://urn.vrml.org/umel/" /urn:vrml:umel:([^\/]+)\/(.*)/\1\/\2/i
applying it to our previously defined URN would produce the URLs:
file:///c:/urn/media/texture/wood.gif
http://urn.vrml.org/umel/texture/wood.gif
However, changing the production rule for the second resource to:
RES: "http://urn.vrml.org/umel/fetch_resource.pl" /urn:vrml:umel:([^\/]+)\/(.*)/?category=\1+object=\2/i
should result in the full URL:
http://urn.vrml.org/umel/fetch_resource.pl?category=texture+object=wood.gif
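The two resolution steps described above can be traced by hand. The following is a minimal sketch using Python's re module, which is not a POSIX ERE engine but behaves the same way for these patterns. The rules are translated manually rather than parsed from the file, the tuple layout of resources is hypothetical, and {0} and {1} stand in for the \1 and \2 backrefs. The first capture must match one or more characters for the URLs stated above to come out.

```python
import re

urn = "urn:vrml:umel:texture/wood.gif"

# Step 1: the REGEXP rule selects the group name from the URN.
group = re.search(r"urn:vrml:([^/:]+)", urn, re.IGNORECASE).group(1)

# Step 2: the RES entries of that group, highest preference first.
# Each entry is (quoted URL, replacement template).
resources = [
    ("file:///c:/urn/media/", "{0}/{1}"),
    ("http://urn.vrml.org/umel/fetch_resource.pl", "?category={0}+object={1}"),
]
match = re.search(r"urn:vrml:umel:([^/]+)/(.*)", urn, re.IGNORECASE)
urls = [base + template.format(*match.groups()) for base, template in resources]

print(group)
for url in urls:
    print(url)
```

A resolver would try the resulting URLs in list order and stop at the first one that can be fetched.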
file = namespace_ids
namespace_ids = namespace_id | namespace_id namespace_ids
namespace_id = "NID:" namespace_str grp_regexp groups
namespace_str = any valid namespace identifier string (see [6])
grp_regexp = "REGEXP:" grp_exp
grp_exp = (see [1], NAPTR RR Format, Replacement)
groups = group | group groups
group = "GRP:" grp_str resources
grp_str = 1*GRP_CHAR
GRP_CHAR = "-" | "." | "a" | ... | "z" | "A" | ... | "Z" | "0" | ... | "9"
resources = resource | resource resources
resource = "RES:" <">url<"> res_regexp
url = Any valid fully qualified URL (see [7])
res_regexp = delim_char ere delim_char repl delim_char *flags
delim_char = "/" | "!" ... (Any non-digit or non-flag character other than
backslash '\'. All occurrences of a delim_char in a res_regexp
must be the same character.)
ere = POSIX Extended Regular Expression (see [5],
section 2.8.4)
repl = repl_str | backref | repl repl_str | repl backref
repl_str = 1*REPL_CHAR
backref = "\" 1POS_NUMBER
flags = "i"
REPL_CHAR = "-" | "?" | "+" | "%" | "." | ":" | "#" |
"a" | ... | "z" | "A" | ... | "Z" | "0" | ... | "9"
(see [7] for full list)
POS_NUMBER = "1" | "2" | ... | "9" | "10" | ... ; 0 is not an
allowed backref value.
The following notes apply to the res_regexp regular expression substitution.
The hash '#' character shall be treated as the start of a comment. Anything following the character shall be ignored. The comment is terminated by an end of line character.
Backref expressions in the repl portion of the substitution expression are replaced by the (possibly empty) string of characters enclosed by '(' and ')' in the ERE portion of the substitution expression. N may be any positive integer and specifies the N'th backref expression. The N'th backref expression is the one that begins with the N'th '(' and continues to the matching ')'. For example, the ERE:
(A(B(C)DE)(F)G)
has backref expressions:
\1 = ABCDEFG
\2 = BCDE
\3 = C
\4 = F
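The numbering rule can be checked with any regular expression engine that numbers groups by their opening parenthesis. A minimal sketch using Python's re module, which is not a POSIX ERE implementation but numbers groups the same way:

```python
import re

# Groups are numbered by the position of their opening parenthesis.
match = re.fullmatch(r"(A(B(C)DE)(F)G)", "ABCDEFG")
for n, text in enumerate(match.groups(), start=1):
    print(f"\\{n} = {text}")
```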
The first character in the substitution expression shall be used as the character that delimits the components of the substitution expression. There must be exactly three non-escaped occurrences of the delimiter character in a substitution expression. Since escaped occurrences of the delimiter character will be interpreted as occurrences of that character, digits shall not be used as delimiters; backrefs would be confused with literal digits if this were allowed. Similarly, if flags are specified in the substitution expression, the delimiter character must not also be a flag character.
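The delimiter and backref rules can be sketched as a small splitter and expander. This is an illustration under stated assumptions, not a reference implementation: the names split_subst and apply_subst are hypothetical, Python's re module stands in for a POSIX ERE engine, the delimiter is assumed to be punctuation, and the result is taken to be the expanded replacement alone, which is what the example outputs in this document imply.

```python
import re

FLAG_CHARS = "i"  # the only flag the grammar defines


def split_subst(expr: str):
    """Split '<d>ere<d>repl<d>flags' on unescaped occurrences of <d>."""
    if not expr:
        raise ValueError("empty substitution expression")
    delim = expr[0]
    if delim.isdigit() or delim == "\\" or delim in FLAG_CHARS:
        raise ValueError(f"illegal delimiter {delim!r}")
    fields, current, i = [], "", 1
    while i < len(expr):
        if expr[i] == "\\" and i + 1 < len(expr):
            current += expr[i:i + 2]  # keep an escaped character intact
            i += 2
        elif expr[i] == delim:
            fields.append(current)
            current = ""
            i += 1
        else:
            current += expr[i]
            i += 1
    fields.append(current)
    if len(fields) != 3:
        raise ValueError("expected exactly three unescaped delimiters")
    return delim, fields[0], fields[1], fields[2]


def apply_subst(expr: str, urn: str):
    """Return the expanded replacement, or None if the ERE does not match."""
    _, ere, repl, flags = split_subst(expr)
    # Python's re treats a backslash before punctuation as a literal, so an
    # escaped delimiter inside the ERE can be passed through unchanged.
    match = re.search(ere, urn, re.IGNORECASE if "i" in flags else 0)
    if match is None:
        return None
    out, i = "", 0
    while i < len(repl):
        if repl[i] == "\\" and i + 1 < len(repl):
            j = i + 1
            while j < len(repl) and repl[j].isdigit():
                j += 1
            if j > i + 1:  # \N: insert the N'th group
                n = int(repl[i + 1:j])
                if n == 0:
                    raise ValueError("0 is not an allowed backref value")
                out += match.group(n) or ""
            else:  # escaped literal, for example an escaped delimiter
                out += repl[j]
                j += 1
            i = j
        else:
            out += repl[i]
            i += 1
    return out


print(apply_subst(r"/urn:vrml:([^\/:]+)/\1/i", "urn:vrml:umel:texture/wood.gif"))
```

Choosing a delimiter that does not occur in the pattern, such as "!", avoids the need to escape slashes at all.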
The URL of the resource shall always be quoted. This avoids confusion about the boundary between the URL and the beginning of the regular expression. Under some operating systems, file: URLs may include spaces in directory names. Quoting the string makes it easier to delimit the extent of the URL.
# vrml name spaces:
# urn:vrml:umel:/some/dir/file.ext
NID: vrml
REGEXP: /urn:vrml:([^\/:]+)/\1/i
GRP: umel
RES: "file:///c:/urn/media/" /urn:vrml:umel:([^\/]+)\/(.*)/\1\/\2/i
RES: "http://urn.vrml.org/umel/" /urn:vrml:umel:([^\/]+)\/(.*)/\1\/\2/i
GRP: eai
RES: "http://urn.vrml.org/eai/" /urn:vrml:eai:([^\/]+)\/(.*)/\1\/\2/i
# Experimental CID namespace (from draft NAPTR spec)
# urn:cid:199606121851.1@mordred.gatech.edu
NID: cid
REGEXP: /urn:cid:.+@([^\.]+\.)(.*)$/\2/i
GRP: gatech.edu
RES: "http://www.gatech.edu/cgi-bin/resources.pl" /urn:cid:.+@([^\.]+\.)(.*)$/\?uid=\1/i
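Reading such a file into memory might look like the sketch below. It is a simplification under stated assumptions: the function name parse_config and the nested dictionary layout are hypothetical, keywords are matched exactly as written, and only whole-line comments are skipped, because '#' is also a legal character in URLs and replacement strings. The substitution expressions are stored as raw strings and are not validated here.

```python
import re

# A RES value is a quoted URL, whitespace, then the substitution expression.
RES_VALUE = re.compile(r'^"([^"]*)"\s+(\S.*)$')


def parse_config(text: str) -> dict:
    """Return {nid: {"regexp": str, "groups": {name: [(url, expr), ...]}}}."""
    config, nid, grp = {}, None, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or whole-line comment
        keyword, _, value = line.partition(":")
        keyword, value = keyword.strip(), value.strip()
        if keyword == "NID":
            nid, grp = value.lower(), None  # NIDs compare case-insensitively
            config[nid] = {"regexp": None, "groups": {}}
        elif keyword == "REGEXP" and nid is not None:
            config[nid]["regexp"] = value
        elif keyword == "GRP" and nid is not None:
            grp = value
            config[nid]["groups"][grp] = []
        elif keyword == "RES" and grp is not None:
            found = RES_VALUE.match(value)
            if found is None:
                raise ValueError(f"malformed RES line: {raw!r}")
            # Resources stay in file order, which is the preference order.
            config[nid]["groups"][grp].append((found.group(1), found.group(2)))
        else:
            raise ValueError(f"unexpected line: {raw!r}")
    return config


SAMPLE = r'''
# vrml name spaces
NID: vrml
REGEXP: /urn:vrml:([^\/:]+)/\1/i
GRP: umel
RES: "file:///c:/urn/media/" /urn:vrml:umel:([^\/]+)\/(.*)/\1\/\2/i
GRP: eai
RES: "http://urn.vrml.org/eai/" /urn:vrml:eai:([^\/]+)\/(.*)/\1\/\2/i
'''

config = parse_config(SAMPLE)
print(config["vrml"]["regexp"])
print(list(config["vrml"]["groups"]))
```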
[2] Gulbrandsen, A. & Vixie, P. "A DNS Resource Record for Specifying the Location of Services", RFC 2052, October 1996.
[3] Daniel, R. & Mealling, M. "Resolution of Uniform Resource Identifiers using the Domain Name System", RFC 2168, June 1997.
[4] Mealling, M. & Daniel, R. "URI Resolution Services Necessary for URN Resolution", IETF Internet Draft 7 (draft-ietf-urn-resolution-services-07.txt), November 1998.
[5] IEEE Standard for Information Technology - Portable Operating System Interface (POSIX) - Part 2: Shell and Utilities (Vol 1), IEEE Std 1003.2-1992, The Institute of Electrical and Electronics Engineers, New York, 1993. ISBN: 1-55937-255-9.
[6] Moats, R. "URN Syntax", RFC 2141, May 1997.
[7] Berners-Lee, T., Masinter, L., and M. McCahill, Editors, "Uniform Resource Locators (URL)", RFC 1738, December 1994.