From: Kevin M. Rosenberg

URI support in Allegro CL

This document contains the following sections:

1.0 Introduction
2.0 The URI API definition
3.0 Parsing, escape decoding/encoding and the path
4.0 Interning URIs
5.0 Allegro CL implementation notes
6.0 Examples

1.0 Introduction

This version of the Allegro CL URI support documentation is for distribution with the Open Source version of the URI code. Links to Allegro CL documentation other than URI-specific files have been suppressed. For Allegro CL documentation, see http://www.franz.com/support/documentation/, which is the Allegro CL documentation page of the Franz Inc. website. Links to Allegro CL documentation can be found on that page.

URI stands for Uniform Resource Identifier. For a description of URIs, see RFC2396, which can be found in several places, including the IETF web site (http://www.ietf.org/rfc/rfc2396.txt) and the UCI/ICS web site (http://www.ics.uci.edu/pub/ietf/uri/rfc2396.txt). We prefer the UCI/ICS one as it has more examples.

URIs are a superset, in functionality and syntax, of URLs (Uniform Resource Locators) and URNs (Uniform Resource Names). That is, RFC2396 updates and merges RFC1738 and RFC1808 into a single syntax, called the URI. It does exclude some portions of RFC1738 that define specific syntax of individual URL schemes.

In URL slang, the scheme is usually called the `protocol', but it is called scheme in RFC1738. A URL `host' corresponds to the URI `authority.' The URL slang `bookmark' or `anchor' is `fragment' in URI lingo.

The URI facility was available as a patch to Allegro CL 5.0.1 and is included with release 6.0. The URI facility might not be in an Allegro CL image. Evaluate (require :uri) to ensure the facility is loaded (that form returns nil if the URI module is already loaded).

Broadly, the URI facility creates a Lisp object that represents a URI, and provides setters and accessors to fields in the URI object. The URI object can also be interned, much like symbols in CL are. This document describes the facility and the related operators.

Aside from the obvious slots which are called out in the RFC, URIs also have a property list. With interning, this is another similarity between URIs and CL symbols.

Symbols naming objects (functions, variables, etc.) in the uri module are exported from the net.uri package.

2.0 The URI API definition

URIs are represented by CLOS objects. Their slots are:

    scheme host port path query fragment plist

The host and port slots together correspond to the authority (see RFC2396). There is an accessor-like function, uri-authority, that can be used to extract the authority from a URI. See the RFC2396 specifications pointed to at the beginning of the 1.0 Introduction for details of all the slots except plist. The plist slot contains a standard Common Lisp property list.

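As a rough illustration of reading these slots, the sketch below parses a string and collects the slot values into a list. It assumes an Allegro CL image in which (require :uri) succeeds, and it assumes the slot readers are named uri-scheme, uri-port, uri-query, uri-fragment and uri-plist, following the pattern of uri-host and uri-path used elsewhere in this document. The helper name uri-slot-values is hypothetical.

```lisp
;; Sketch only: assumes Allegro CL with the URI module available.
(require :uri)

(defun uri-slot-values (string)
  "Parse STRING and return a property list of the URI's slot values.
URI-SLOT-VALUES is a hypothetical helper, not part of net.uri."
  (let ((u (net.uri:parse-uri string)))
    (list :scheme    (net.uri:uri-scheme u)
          :host      (net.uri:uri-host u)
          :port      (net.uri:uri-port u)
          :path      (net.uri:uri-path u)
          :query     (net.uri:uri-query u)
          :fragment  (net.uri:uri-fragment u)
          ;; The authority is derived from the host and port slots.
          :authority (net.uri:uri-authority u)
          :plist     (net.uri:uri-plist u))))

;; The plist slot holds an ordinary property list, so GETF applies to it.
(let ((u (net.uri:parse-uri "http://www.franz.com")))
  (setf (getf (net.uri:uri-plist u) :visited) t)
  (getf (net.uri:uri-plist u) :visited))
```

The property list is a convenient place for application data attached to a URI, in the same way symbol plists are used.
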
All symbols are external in the net.uri package, unless otherwise noted. Brief descriptions are given in this document, with complete descriptions in the individual pages.

uri
    The class of URI objects.

urn
    The class of URN objects.

uri-p
    Arguments: object
    Returns true if object is an instance of class uri.

copy-uri
    Arguments: uri &key place scheme host port path query fragment plist
    Copies the specified URI object. See the description page for information on the keyword arguments.

uri-scheme, uri-host, uri-port, uri-path, uri-query, uri-fragment, uri-plist
    Arguments: uri-object
    These accessors return the value of the associated slots of the uri-object.

uri-authority
    Arguments: uri-object
    Returns the authority of uri-object. The authority combines the host and port.

render-uri
    Arguments: uri stream
    Print to stream the printed representation of uri.

parse-uri
    Arguments: string &key (class 'uri)
    Parse string into a URI object.

merge-uris
    Arguments: uri base-uri &optional place
    Return an absolute URI, based on uri, which can be relative, and base-uri, which must be absolute.

enough-uri
    Arguments: uri base
    Converts uri into a relative URI using base as the base URI.

uri-parsed-path
    Arguments: uri
    Return the parsed representation of the path.

uri
    Arguments: object
    Defined methods: if the argument is a uri object, return it; otherwise create a uri object from it if possible and return it, or signal an error if not possible.

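A short sketch of how the parsing, merging and rendering operators might be combined follows. It assumes an Allegro CL image with the URI module available; the function name resolve-reference is hypothetical and only wraps merge-uris for strings.

```lisp
;; Sketch only: assumes Allegro CL with the URI module available.
(require :uri)

(defun resolve-reference (reference base)
  "Resolve the (possibly relative) URI string REFERENCE against the
absolute URI string BASE. RESOLVE-REFERENCE is a hypothetical helper."
  (net.uri:merge-uris (net.uri:parse-uri reference)
                      (net.uri:parse-uri base)))

(let* ((base "http://www.franz.com/foo/bar/")
       (absolute (resolve-reference "foo.htm" base)))
  ;; Print the absolute form to standard output.
  (net.uri:render-uri absolute *standard-output*)
  (terpri)
  ;; Convert it back into a URI relative to the same base and print that.
  (net.uri:render-uri (net.uri:enough-uri absolute (net.uri:parse-uri base))
                      *standard-output*)
  (terpri))
```

Keeping the strings at the edges and working with URI objects in between avoids re-parsing and lets uri= be used for comparisons.
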
3.0 Parsing, escape decoding/encoding and the path

The method uri-path returns the path portion of the URI, in string form. The method uri-parsed-path returns the path portion of the URI, in list form. This list form is discussed below, after a discussion of decoding/encoding.

RFC2396 lays out a method for inserting reserved characters into URIs. You do this by escaping the character. An escaped character is defined like this:

    escaped = "%" hex hex

    hex = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b" | "c" | "d" | "e" | "f"

In addition, the RFC defines excluded characters:

    "<" | ">" | "#" | "%" | <"> | "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"

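The set of reserved characters is:

    ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","

with the following exceptions:

- within the authority component, the characters ";", ":", "@", "?", and "/" are reserved;
- within a path segment, the characters "/", ";", "=", and "?" are reserved;
- within a query component, the characters ";", "/", "?", ":", "@", "&", "=", "+", ",", and "$" are reserved.
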
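From the RFC, there are two important rules about escaping and unescaping (encoding and decoding):

- decoding should only happen when the URI is parsed into its component parts;
- encoding should only happen when a URI is made from its component parts (that is, rendered for printing).
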
The implication of this is that to decode the URI, it must be in a parsed state. That is, you can't convert %2f (the escaped form of "/") until the path has been parsed into its component parts. Another important desire is for the application viewing the component parts to see the decoded values of the components. For example, consider:

    http://www.franz.com/calculator/3%2f2

This might be the implementation of a calculator, and how someone would execute 3/2. Clearly, the application that implements this would want to see path components of "calculator" and "3/2". "3%2f2" would not be useful to the calculator application.

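The ordering constraint can be sketched in portable Common Lisp. This is an illustration of the rule only, not the net.uri implementation; split-on, percent-decode and decoded-path-components are hypothetical helper names, and malformed escapes are not handled.

```lisp
;; Illustrative sketch in portable Common Lisp; not the net.uri code.

(defun split-on (char string)
  "Split STRING at each occurrence of CHAR; return a list of substrings."
  (loop with start = 0
        for pos = (position char string :start start)
        collect (subseq string start pos)
        while pos
        do (setf start (1+ pos))))

(defun percent-decode (string)
  "Replace each %XX escape in STRING with the character whose code is hex XX."
  (with-output-to-string (out)
    (loop with i = 0
          while (< i (length string))
          do (let ((ch (char string i)))
               (cond ((and (char= ch #\%) (<= (+ i 3) (length string)))
                      (write-char (code-char (parse-integer string
                                                            :start (1+ i)
                                                            :end (+ i 3)
                                                            :radix 16))
                                  out)
                      (incf i 3))
                     (t
                      (write-char ch out)
                      (incf i)))))))

(defun decoded-path-components (path)
  "Split PATH on / first, then decode each component separately."
  (mapcar #'percent-decode
          (remove "" (split-on #\/ path) :test #'string=)))

;; Splitting first keeps the escaped slash inside its component:
(decoded-path-components "/calculator/3%2f2")   ; => ("calculator" "3/2")

;; Decoding first turns %2f into a path separator and loses the component:
(split-on #\/ (percent-decode "/calculator/3%2f2"))
;; => ("" "calculator" "3" "2")
```
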
For the reasons given above, a parsed version of the path is available and has the following form:

    ([:absolute | :relative] component1 [component2...])

where components are:

    element | (element param1 [param2 ...])

and element is a path element, and the params are path element parameters. For example, the result of

    (uri-parsed-path (parse-uri "foo;10/bar:x;y;z/baz.htm"))

is

    (:relative ("foo" "10") ("bar:x" "y" "z") "baz.htm")

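To make the list form concrete, here is a minimal sketch in portable Common Lisp that builds the same structure from a path string. It is not the uri-parsed-path implementation: it performs no escape decoding or canonicalization, and parsed-path-sketch and split-on are hypothetical names.

```lisp
;; Illustrative sketch in portable Common Lisp; not the net.uri code.

(defun split-on (char string)
  "Split STRING at each occurrence of CHAR; return a list of substrings."
  (loop with start = 0
        for pos = (position char string :start start)
        collect (subseq string start pos)
        while pos
        do (setf start (1+ pos))))

(defun parsed-path-sketch (path)
  "Return ([:absolute | :relative] component...) for the string PATH.
A component is a string, or a list (element param...) when the path
element carries ;-separated parameters."
  (let* ((absolute (and (plusp (length path))
                        (char= (char path 0) #\/)))
         (body (if absolute (subseq path 1) path)))
    (cons (if absolute :absolute :relative)
          (mapcar (lambda (segment)
                    (let ((parts (split-on #\; segment)))
                      ;; Keep a bare string when there are no parameters.
                      (if (rest parts) parts (first parts))))
                  (split-on #\/ body)))))

(parsed-path-sketch "foo;10/bar:x;y;z/baz.htm")
;; => (:relative ("foo" "10") ("bar:x" "y" "z") "baz.htm")
```
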
There is a certain amount of canonicalization that occurs when parsing:

- A path of (:absolute) or (:absolute "") is equivalent to a nil path. That is, http://a/ is parsed with a nil path and printed as http://a.
- Escaped characters that are not reserved are decoded when parsing and are not escaped again on printing. For example, "foob%61r" is parsed into "foobar" and appears as "foobar" when the URI is printed.

4.0 Interning URIs

This section describes how to intern URIs. Interning is not mandatory. URIs can be used perfectly well without interning them.

Interned URIs in Allegro are like symbols. That is, a string representing a URI, when parsed and interned, will always yield an eq object. For example:

    (eq (intern-uri "http://www.franz.com")
        (intern-uri "http://www.franz.com"))

is always true. (Note that two strings with identical contents may or may not be eq in Common Lisp.)

The functions associated with interning are:

make-uri-space
    Arguments: &key size
    Make a new hash-table object to contain interned URIs.

uri-space
    Arguments: (none)
    Return the object into which URIs are currently being interned.

uri=
    Arguments: uri1 uri2
    Returns true if uri1 and uri2 are equivalent.

intern-uri
    Arguments: uri-name &optional uri-space
    Intern the uri object specified in the uri-space specified. Methods exist for strings and uri objects.

unintern-uri
    Arguments: uri &optional uri-space
    Unintern the uri object specified, or all uri objects (in uri-space if specified) if uri is t.

do-all-uris
    Arguments: (var &optional uri-space result) &body body
    Bind var to each currently defined uri (in uri-space if specified) in turn and evaluate body.

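The interning operators might be used together as in the sketch below. It assumes an Allegro CL image with the URI module available, and it assumes that the object returned by make-uri-space can be passed as the optional uri-space argument, as the argument lists above suggest. The helper names same-interned-uri-p and list-interned-uris are hypothetical.

```lisp
;; Sketch only: assumes Allegro CL with the URI module available.
(require :uri)

(defun same-interned-uri-p (string1 string2 &optional (space (net.uri:uri-space)))
  "True if STRING1 and STRING2 intern to the very same (eq) URI object
in SPACE, which defaults to the current URI space."
  (eq (net.uri:intern-uri string1 space)
      (net.uri:intern-uri string2 space)))

(defun list-interned-uris (space)
  "Collect every URI currently interned in SPACE into a list."
  (let ((uris '()))
    (net.uri:do-all-uris (u space)
      (push u uris))
    uris))

;; Use a private URI space so the default space is left untouched.
(let ((space (net.uri:make-uri-space)))
  (same-interned-uri-p "http://www.franz.com" "http://www.franz.com" space)
  (length (list-interned-uris space)))
```

A private URI space is useful when interned URIs should be discarded together, for example at the end of a crawl, without affecting URIs interned elsewhere in the image.
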
5.0 Allegro CL implementation notes

- The following forms all return true:

      (uri= (parse-uri "http://www.franz.com/")
            (parse-uri "http://www.franz.com"))

      (eq (intern-uri "http://www.franz.com/")
          (intern-uri "http://www.franz.com"))

      (eq (intern-uri "http://www.franz.com:80/foo/bar.htm")
          (intern-uri "http://www.franz.com/foo/bar.htm"))

- #u"..." is shorthand for (parse-uri "..."), but if an existing #u dispatch macro definition exists, it will not be overridden.

- Changing a slot of a URI object is reflected the next time the URI is printed:

      user(10): (setq u #u"http://foo.bar.com/foo/bar")
      #<uri http://foo.bar.com/foo/bar>
      user(11): (setf (net.uri:uri-host u) "foo.com")
      "foo.com"
      user(12): u
      #<uri http://foo.com/foo/bar>
      user(13):

  This allows URI behavior to follow the principle of least surprise.

6.0 Examples

    uri(10): (use-package :net.uri)
    t
    uri(11): (parse-uri "foo")
    #<uri foo>
    uri(12): #u"foo"
    #<uri foo>
    uri(13): (setq base (intern-uri "http://www.franz.com/foo/bar/"))
    #<uri http://www.franz.com/foo/bar/>
    uri(14): (merge-uris (parse-uri "foo.htm") base)
    #<uri http://www.franz.com/foo/bar/foo.htm>
    uri(15): (merge-uris (parse-uri "?foo") base)
    #<uri http://www.franz.com/foo/bar/?foo>
    uri(16): (setq base (intern-uri "http://www.franz.com/foo/bar/baz.htm"))
    #<uri http://www.franz.com/foo/bar/baz.htm>
    uri(17): (merge-uris (parse-uri "foo.htm") base)
    #<uri http://www.franz.com/foo/bar/foo.htm>
    uri(18): (merge-uris #u"?foo" base)
    #<uri http://www.franz.com/foo/bar/?foo>
    uri(19): (describe #u"http://www.franz.com")
    #<uri http://www.franz.com> is an instance of #<standard-class net.uri:uri>:
     The following slots have :instance allocation:
      scheme        :http
      host          "www.franz.com"
      port          nil
      path          nil
      query         nil
      fragment      nil
      plist         nil
      escaped       nil
      string        "http://www.franz.com"
      parsed-path   nil
      hashcode      nil
    uri(20): (describe #u"http://www.franz.com/")
    #<uri http://www.franz.com> is an instance of #<standard-class net.uri:uri>:
     The following slots have :instance allocation:
      scheme        :http
      host          "www.franz.com"
      port          nil
      path          nil
      query         nil
      fragment      nil
      plist         nil
      escaped       nil
      string        "http://www.franz.com"
      parsed-path   nil
      hashcode      nil
    uri(21): #u"foobar#baz%23xxx"
    #<uri foobar#baz#xxx>

Copyright (c) 1998-2001, Franz Inc. Berkeley, CA., USA. All rights reserved.
Created 2001.8.16.