[RFC] 2396 [http://www.ietf.org/rfc/rfc2396.txt] glosses URI as "'''U'''niform '''R'''esource '''I'''dentifier", and rather famously ordains it to be "a compact string of characters for identifying an abstract or physical resource."  The most common type of URI is the [URL].  URIs are exactly what the [XML] world calls "system IDs".

[tcllib] supplies a package called "uri" to parse RFC-compliant values.  Its documentation appears at http://tcllib.sourceforge.net/doc/uri.html .

[[examples]]

----

uri::split is the command to break a URI up into its component parts.  The parts identified depend on the scheme.  For instance, for ftp, the pieces identified are

   * host
   * path
   * port
   * passwd
   * scheme
   * type
   * user

----

uri doesn't yet support the data: (or file:?) schemes.  'Twould be fun to add that (those) in.

[[This'd be a good place for examples.]]

[LV] What does "doesn't yet support" mean?

[AK] That the code does not know how to handle such URLs (split/join).  There are no regexp patterns either.

[LV] But I see this behavior with tcl8.4 and tcllib 1.0:

 % package require uri
 1.0
 % set name "file://home/lwv26/myfile.txt"
 file://home/lwv26/myfile.txt
 % uri::split $name
 path /lwv26/myfile.txt scheme file host home

So it LOOKS like file is supported.

----

Here is a urn: scheme handler.  ([AK]: This handler was added to [tcllib] after the 1.1.0 release)  [DMA]: so this outdated code could be deleted, I'd suggest...

 # urn-scheme.tcl - Copyright (C) 2001 Pat Thoyts
 #
 # extend the uri package to deal with URN (RFC 2141)
 # see http://www.normos.org/ietf/rfc/rfc2141.txt
 #
 # Released under the tcllib license.
 #
 # $Id: 850,v 1.17 2004-04-28 06:00:09 jcw Exp $
 # -------------------------------------------------------------------------

 package require uri 1.0
 package provide uri::urn 1.0

 namespace eval uri {
     namespace eval urn {
         variable NIDpart {[a-zA-Z0-9][a-zA-Z0-9-]{0,31}}
         variable esc {%[0-9a-fA-F]{2}}
         # '-' goes last so that it is a literal character; written as
         # ':-=' it would form a range that also admits '<'.
         variable trans {a-zA-Z0-9$_.+!*'(,):=@;-}
         variable NSSpart "($esc|\[$trans\])+"
         variable URNpart "($NIDpart):($NSSpart)"
         variable url "urn:$NIDpart:$NSSpart"
         lappend [namespace parent]::schemes urn URN
     }
 }

 # -------------------------------------------------------------------------
 # Description:
 #   Called by uri::split with a url to split into its parts.
 #
 proc uri::SplitUrn {uri} {
     #@c Split the given uri into the URN component parts
     #@a uri: the URI to split without its scheme part.
     #@r List of the component parts suitable for 'array set'
     upvar \#0 [namespace current]::urn::URNpart pattern
     array set parts {nid {} nss {}}
     if {[regexp ^$pattern $uri -> parts(nid) parts(nss)]} {
         return [array get parts]
     } else {
         return {nid {} nss {}}
     }
 }

 # -------------------------------------------------------------------------

 proc uri::JoinUrn args {
     #@c Join the parts of a URN scheme URI
     #@a list of nid value nss value
     #@r a valid string representation for your URI
     array set parts [list nid {} nss {}]
     array set parts $args
     set url [urn::quote "urn:$parts(nid):$parts(nss)"]
     return $url
 }

 # -------------------------------------------------------------------------

 # Quote the disallowed characters according to the RFC for URN scheme.
 # ref: RFC2141 sec2.2
 proc uri::urn::quote {url} {
     variable trans
     set ndx 0
     while {[regexp -start $ndx -indices "\[^$trans\]" $url r]} {
         set ndx [lindex $r 0]
         scan [string index $url $ndx] %c chr
         set rep %[format %.2X $chr]
         set url [string replace $url $ndx $ndx $rep]
         # skip over the three-character escape just inserted
         incr ndx 3
     }
     return $url
 }

 # -------------------------------------------------------------------------

 # Perform the reverse of urn::quote.
 proc uri::urn::unquote {url} {
     set ndx 0
     # only hexadecimal digits form a valid escape
     while {[regexp -start $ndx -indices {%([0-9a-fA-F]{2})} $url r]} {
         set first [lindex $r 0]
         set last [lindex $r 1]
         set str [string replace [string range $url $first $last] 0 0 0x]
         set c [format %c $str]
         set url [string replace $url $first $last $c]
         # the three-character escape is now a single character, so
         # resume right after it (using $last + 1 would skip input)
         set ndx [expr {$first + 1}]
     }
     return $url
 }

 # -------------------------------------------------------------------------
 # Local Variables:
 #   indent-tabs-mode: nil
 # End:

----

The new Tcl 8.4a4 VFS layer by Vince Darley simplifies this work.  See the "tclvfs" extension on SourceForge [http://sf.net/projects/tclvfs] for example code which opens http, ftp, zip, and more - using the "blah:..." notation.

----

Failed to use uri to simulate a smart address bar

[HaO] 2013-05-03 I tried to use 'uri::canonicalize' to implement a smart URL completer, e.g. to transform "test.de" into "http://test.de/".

======
% package require uri
1.2.2
% uri::canonicalize test.de
http:///test.de
======

So there are three slashes instead of two, and no trailing slash.  This is not what I intended.  The reason is that 'uri::canonicalize' internally uses 'uri::split' and 'uri::join', and the former interprets the given data as a path and not as a host:

======
% ::uri::split test.de
fragment {} port {} path test.de scheme http host {} query {}
======

----

!!!!!!
%| [Category Package], subset [Tcllib] |%
!!!!!!
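
One possible way around the address-bar problem [HaO] describes is to give the typed text a scheme before the uri package ever sees it, so that the first component is read as a host rather than as a path.  What follows is only a minimal sketch: the helper name completeUrl and its defaultScheme argument are made up for illustration and are not part of [tcllib], and it assumes that anything without a "scheme://" prefix is meant to start with a host name.

```tcl
# completeUrl is a hypothetical helper, not a tcllib command.
# It uses plain string matching only, so it needs no packages.
proc completeUrl {text {defaultScheme http}} {
    set text [string trim $text]

    # No "scheme://" prefix: assume the text starts with a host name
    # and prepend the default scheme.
    if {![regexp {^[A-Za-z][A-Za-z0-9+.-]*://} $text]} {
        set text "$defaultScheme://$text"
    }

    # Only scheme and authority given (no path, query or fragment):
    # add the trailing slash an address bar would show.
    if {[regexp {^[A-Za-z][A-Za-z0-9+.-]*://[^/?#]+$} $text]} {
        append text /
    }
    return $text
}

puts [completeUrl test.de]
puts [completeUrl http://test.de/index.html]
```

The result can then be handed to 'uri::split' or 'uri::canonicalize'; with the scheme in place, test.de should be reported as the host instead of the path.  Deciding by the presence of "://" is a deliberate simplification: input such as "localhost:8080" is treated as host and port, while scheme-only forms such as "mailto:..." would need separate handling.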