'''[Split]ting strings with embedded strings'''

[Richard Suchenwirth] 2001-05-31: Robin Lauren wrote in [comp.lang.tcl]:

I want to split an argument which contains spaces within quotes into proper name=value pairs. But I can't :) Consider this example:

======
set tag {body type="text/plain" title="This is my body"}
set element [lindex $tag 0]
set attributes [lrange $tag 1 end] ;# *BZZT!* Wrong answer!
======

My attributes becomes the list {type="text/plain"} {title="This} {is} {my} {body"} (perhaps even with the optional backslash before the quotes), which isn't really what I had in mind.

** Answer **

If there are always exactly two attribute definitions following an element name, one simple solution is to `[scan]` the string, and then enclose the name/value pairs in sublists:

======
% set parts [scan $tag {%s %[^=]="%[^"]" %[^=]="%[^"]"}]
# -> body type text/plain title {This is my body}
% set result [list]
% foreach {name value} [lrange $parts 1 end] {
    lappend result [list $name $value]
}
% set result
# -> {type text/plain} {title {This is my body}}
======

For a more general solution, where there can be fewer or more than two definitions, a `[regexp]` match might be useful:

======
% set matches [regexp -inline -all {(\S+?)="(.*?)"} $tag]
# -> type=\"text/plain\" type text/plain {title="This is my body"} title {This is my body}
% set result [list]
% foreach {- name value} $matches {
    lappend result [list $name $value]
}
% set result
# -> {type text/plain} {title {This is my body}}
======

(Note that here, the `[foreach]` command extracts ''three'' values from the list during each iteration: the first value, the full match, is just discarded.)

The problem can also be solved using list/string manipulation commands, but then we need to make sure that we see the data in the same way as Tcl does. To a human, ''$tag'' intuitively looks like a list of three items, but according to Tcl list syntax, it has six items, and the second item, for example, contains two literal quotes:
======
% llength $tag
# -> 6
% lmap item $tag { format "{%s}" $item }
# -> {{body}} {{type="text/plain"}} {{title="This}} {{is}} {{my}} {{body"}}
======

One simple solution is to rewrite the tag string into something that is convenient for list manipulation (careful with the quoting in the `[string map]` here!). My initial invocation, `string map {=\" " \{" \" \}} $tag`, confused the syntax highlighting in the wiki renderer; the one below renders better but obfuscates the code somewhat: `\x22` is a double quote, `\x7b` is a left brace, and `\x7d` is a right brace.

======
% set taglist [string map [list =\x22 " \x7b" \x22 \x7d] $tag]
# -> body type {text/plain} title {This is my body}
% set result [list]
% foreach {name value} [lrange $taglist 1 end] {
    lappend result [list $name $value]
}
% set result
# -> {type text/plain} {title {This is my body}}
======

Another solution `[split]`s the tag string into a list, not by white space but by double quotes (again, `\x22` is just a wiki-friendly way to insert a double quote character: a `\"` or `{"}` will work in the Tcl interpreter):

======
% set taglist2 [split $tag \x22]
# -> {body type=} text/plain { title=} {This is my body} {}
======

Obviously, the result needs a little more processing:

   1. the element name is joined up with the first attribute name,
   1. the equal sign stays attached to the attribute name,
   1. the second (and third, etc.) attribute name is preceded by leftover white space, and
   1. there is an empty element which resulted from splitting at the last double quote before the end of the string.
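The four points can be checked by listing each piece together with its index. This is only a small sketch: the variable names `pieces` and `report` are made up for illustration, and the angle brackets are there just to make leading white space and the empty last piece visible.

======
set tag {body type="text/plain" title="This is my body"}

# Split on double quotes, exactly as above.
set pieces [split $tag \x22]

# Build one "index: <piece>" line per piece.
set report [list]
set i 0
foreach piece $pieces {
    lappend report [format {%d: <%s>} $i $piece]
    incr i
}
puts [join $report \n]
# prints:
#   0: <body type=>
#   1: <text/plain>
#   2: < title=>
#   3: <This is my body>
#   4: <>
======

Pieces at even indices hold the (still untidy) attribute names and pieces at odd indices hold the values, which is why a `[foreach]` loop over name/value pairs suits this list.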
The first three problems are easily dealt with. `[split]`ting a name item on white space always leaves the attribute name as the second element: `{body type=}` becomes the two-element list `body type=`, and `{ title=}` (a space followed by non-space characters) becomes a list with an empty first element and `title=` as the second. `[string trimright]` then removes the equal sign:

======
% set attrnames [list]
% foreach item {{body type=} { title=}} {
    lappend attrnames [string trimright [lindex [split $item] 1] =]
}
% set attrnames
# -> type title
======

and the fourth problem can be solved by `[break]`ing out of the loop if any attribute name is the empty string:

======
# start from an empty list, so that results from the earlier examples are not carried over
set result [list]
foreach {name value} $taglist2 {
    if {$name eq {}} {
        break
    }
    lappend result [list [string trimright [lindex [split $name] 1] =] $value]
}
======
======none
% set result
# -> {type text/plain} {title {This is my body}}
======

All of the above solutions except the `[scan]` one also handle empty attribute value strings (the `%[^"]` conversion needs at least one character to match), but none of them handles attribute values that are not surrounded by double quotes.

----
[AM] Also see: [Splitting a string on arbitrary substrings]

<<categories>> Parsing | String Processing | Arts and crafts of Tcl-Tk programming
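----
The `[regexp]` approach can be stretched to accept unquoted values as well. What follows is only a sketch, resting on an assumption that is not part of the original discussion: an unquoted value is taken to run up to the next white space. The procedure name `parseTag` is made up for illustration.

======
proc parseTag {tag} {
    # The element name is the first run of non-space characters.
    if {![regexp {^\s*(\S+)} $tag -> element]} {
        return [list {} {}]
    }
    set attributes [list]
    # Matches name="quoted value" or name=barevalue (assumed to end at white space).
    # Only greedy quantifiers are used here, because mixing greedy and non-greedy
    # quantifiers in one Tcl regular expression can give surprising matches.
    set pattern {([^\s=]+)=(?:"([^"]*)"|([^\s"]+))}
    foreach {- name quoted bare} [regexp -inline -all $pattern $tag] {
        # At most one of $quoted and $bare is non-empty.
        lappend attributes [list $name $quoted$bare]
    }
    return [list $element $attributes]
}

parseTag {body type="text/plain" title="This is my body" width=100 alt=""}
# -> body {{type text/plain} {title {This is my body}} {width 100} {alt {}}}
======

The result is a two-element list: the element name, then a list of name/value pairs in the same shape as ''result'' in the examples above. Values containing escaped double quotes are still not handled.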