Version 5 of uri

Updated 2001-10-27 10:33:34

Documentation can be found at http://tcllib.sourceforge.net/doc/uri.html


uri::split breaks a URI into its component parts and returns them as a key/value list.

The parts identified depend on the URI's scheme. For ftp, for instance, the pieces identified are

  • host
  • path
  • port
  • passwd
  • scheme
  • type
  • user

uri doesn't yet support the data: (or file:?) protocols. 'Twould be fun to add that (those) in.

[This'd be a good place for examples.]
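A minimal sketch of splitting and rejoining an ftp URL, assuming tcllib's uri package is installed. The URL is made up for illustration, and the exact set of keys returned may differ between tcllib versions, so the script prints whatever it gets rather than assuming particular names.

```tcl
package require uri   ;# from tcllib

# Hypothetical FTP URL, used only for illustration.
set url "ftp://ftp.example.com/pub/README"

# uri::split returns a flat key/value list, so it loads straight into an array.
array set parts [uri::split $url]

# Print every part that was identified, sorted by key name.
foreach key [lsort [array names parts]] {
    puts [format "%-8s %s" $key $parts($key)]
}

# uri::join accepts the same key/value pairs and rebuilds a URL string.
# ({*} needs Tcl 8.5 or later; on 8.4 use: eval uri::join [array get parts])
set rebuilt [uri::join {*}[array get parts]]
puts $rebuilt
```

Loading the result with array set keeps the calling code independent of the order in which the keys come back.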

LV What does "doesn't yet support" mean?

AK That the code does not know how to handle such URLs (split/join). There are no regexp patterns for them either.

LV But I see this behavior with Tcl 8.4 and tcllib 1.0:

 % package require uri
 1.0
 % set name "file://home/lwv26/myfile.txt"
 file://home/lwv26/myfile.txt
 % uri::split $name
 path /lwv26/myfile.txt scheme file host home

So it LOOKS like file is supported.


Here is a urn: scheme handler. (AK: This handler was added to tcllib after the 1.1.0 release)

 # urn-scheme.tcl - Copyright (C) 2001 Pat Thoyts
 #
 # extend the uri package to deal with URN (RFC 2141)
 # see http://www.normos.org/ietf/rfc/rfc2141.txt
 #
 # Released under the tcllib license.
 #
 # $Id: 850,v 1.6 2002-06-21 02:29:38 jcw Exp $
 # -------------------------------------------------------------------------

 package require uri 1.0
 package provide uri::urn 1.0

 namespace eval uri {
     namespace eval urn {
         variable NIDpart {[a-zA-Z0-9][a-zA-Z0-9-]{0,31}}
         variable esc {%[0-9a-fA-F]{2}}
         # The hyphen goes last so it is a literal; ":-=" would be read as a range.
         variable trans {a-zA-Z0-9$_.+!*'(,):=@;-}
         variable NSSpart "($esc|\[$trans\])+"
         variable URNpart "($NIDpart):($NSSpart)"
         variable url "urn:$NIDpart:$NSSpart"

         lappend [namespace parent]::schemes urn URN
     }
 }

 # -------------------------------------------------------------------------

 # Description:
 #   Called by uri::split with a url to split into its parts.
 #
 proc uri::SplitUrn {uri} {
     #@c Split the given uri into its URN component parts
     #@a uri: the URI to split without its scheme part.
     #@r List of the component parts suitable for 'array set'

     upvar \#0 [namespace current]::urn::URNpart pattern
     array set parts {nid {} nss {}}
     if {[regexp ^$pattern $uri -> parts(nid) parts(nss)]} {
         return [array get parts]
     } else {
         return {nid {} nss {}}
     }
 }


 # -------------------------------------------------------------------------

 proc uri::JoinUrn args {
     #@c Join the parts of a URN scheme URI
     #@a list of nid value nss value
     #@r a valid string representation for your URI

     array set parts [list nid {} nss {}]
     array set parts $args
     set url [urn::quote "urn:$parts(nid):$parts(nss)"]
     return $url
 }

 # -------------------------------------------------------------------------

 # Quote the disallowed characters according to the RFC for URN scheme.
 # ref: RFC2141 sec2.2
 proc uri::urn::quote {url} {
     variable trans

     set ndx 0
     while {[regexp -start $ndx -indices "\[^$trans\]" $url r]} {
         set ndx [lindex $r 0]
         scan [string index $url $ndx] %c chr
         set rep %[format %.2X $chr]        
         set url [string replace $url $ndx $ndx $rep]
         incr ndx 3
     }
     return $url
 }

 # -------------------------------------------------------------------------

 # Perform the reverse of urn::quote.
 proc uri::urn::unquote {url} {
     set ndx 0
     # Match only valid hex escapes, as in the esc pattern above.
     while {[regexp -start $ndx -indices {%([0-9a-fA-F]{2})} $url r]} {
         set first [lindex $r 0]
         set last [lindex $r 1]
         set str [string replace [string range $url $first $last] 0 0 0x]
         set c [format %c $str]
         set url [string replace $url $first $last $c]
         # The 3-character escape is now 1 character, so resume right after it.
         set ndx [expr {$first + 1}]
     }
     return $url
 }

 # -------------------------------------------------------------------------
 # Local Variables:
 #   indent-tabs-mode: nil
 # End:
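
The trans variable above ends up inside a regexp bracket expression, where a hyphen between two characters denotes a range and a hyphen at the end is a literal. The following sketch, which uses two small made-up character sets rather than the handler's own, shows how much the position of the hyphen matters:

```tcl
# Two illustrative character sets that differ only in where the hyphen sits.
set ranged  {a-z:-=}   ;# ":-=" is the range ':' through '=' (covers ';' and '<' too)
set literal {a-z:=-}   ;# a trailing hyphen is a literal '-'

# Return 1 if every character of sample belongs to the bracket set chars.
proc allInSet {chars sample} {
    return [regexp "^\[$chars\]+\$" $sample]
}

foreach chars [list $ranged $literal] {
    foreach sample {a-b a<b} {
        puts [format "%-8s %-4s %d" $chars $sample [allInSet $chars $sample]]
    }
}
```

With the hyphen in the middle, "a<b" is accepted and "a-b" is rejected; with the hyphen last, the results are reversed.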

The new VFS layer in Tcl 8.4a4, by Vince Darley, simplifies this work. See the "tclvfs" extension on SourceForge for example code that opens http, ftp, zip, and more, using the "blah:..." notation.


Category Package, subset Tcllib