list stripping

ulis, 2002-07-06

Stripping a list is not easy.

If you know the depth of the list you want to strip (or internal sublists do not matter), see the end of this page.

For a more complex and perhaps more complete solution, see below.


If some syntax analysis requires you to strip the outer braces of a list, the discussion below may help.

Given "{{{this is} a {complex list}}}", the goal is to get "{this is} a {complex list}".


At first glance, stripping the list {{{a list}}} to get {a list} is simple: while there is only one element in the list, take this element and repeat.

  set l {{{a list}}}
  while {[llength $l] == 1} { set l [lindex $l 0] }
  puts $l
  -> a list

There is a flaw: the innermost list can have only one (or zero) element, so that its string form does not contain any space. With a single element, as in {{{single}}}, the loop never ends: [lindex single 0] is single again, and llength stays at 1.

For this kind of list, list commands can't help us (is this true?), so we need to try string commands.
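One possible answer to the question above, offered only as a sketch: the loop can be guarded by stopping as soon as unwrapping no longer changes the value. The name strip_by_lindex is made up for this illustration. The inputs are written in double quotes so that every brace shown reaches the proc. The sketch assumes its argument is a well-formed list, since llength raises an error otherwise.

```tcl
# Hypothetical variant of the while loop, guarded against the endless loop.
proc strip_by_lindex {l} {
    while {[llength $l] == 1} {
        set inner [lindex $l 0]
        # Unwrapping changed nothing: a bare word is reached, so stop.
        if {$inner eq $l} { break }
        set l $inner
    }
    return $l
}

puts [strip_by_lindex "{{{a list}}}"]
puts [strip_by_lindex "{{{single}}}"]
puts [strip_by_lindex "{{{a} complex {list}}}"]
```

This should print a list, then single, then {a} complex {list}. Because lindex also interprets quotes and backslashes, the result is not always a plain substring of the input, which may or may not matter for syntax analysis.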

On a string, stripping is trimming:

  set l1 {{{a list}}}
  set l2 {{{single}}}
  proc trimming {l} \
  { return [string trim $l \{\}] }
  puts [trimming $l1]
  -> a list
  puts [trimming $l2]
  -> single

Again, there is a flaw: generally, nested lists contain sublists.

  set l3 {{{a} complex {list}}}
  puts [trimming $l3]
  -> a} complex {list

Here, the nested lists were half-stripped. Not so pretty.


Finding the depth of a list.

 {{{a list}}}
   /------\ <- this list can be stripped three times
  /        \
 /          \
 123       210
   ^      ^
   |---------- max (3)
          |--- min (3)

 {{{this is} a {complex list}}}
   /-------\   /------------\
  /--------------------------\ <- this list can be stripped two times
 /                            \
 123        2  3             210 <- level reached
   ^          ^^
   |-----------|--------------- max (3)
              |---------------- min (2)

 {{{and {this is}} a} {more {complex list}}}
        /-------\
   /----         \          /------------\
  /               --\ /-----              \ 
 /-----------------------------------------\ <- this list can be stripped only one time
 123    4        32  12     3             210 <- level reached
        ^            ^
        |----------------------------------- max (4)
                     |---------------------- min (1)

  max is the highest level reached.
  min is the lowest level from which a new sublist opens, that is, the lowest level at which an opening brace follows anything other than an opening brace.
  The stripping level is min if such a transition occurred, else max.
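The row of level digits drawn under each diagram can be reproduced with a short trace. This is only a sketch, and brace_levels is a name invented here, not part of the implementation below. The input is given in double quotes so that every brace written reaches the proc (with set l {{{a list}}} the Tcl parser itself consumes the outermost pair).

```tcl
# Hypothetical helper: the level reached after each brace character.
proc brace_levels {s} {
    set level 0
    set levels {}
    foreach ch [split $s ""] {
        switch -exact -- $ch {
            \{ { incr level;    lappend levels $level }
            \} { incr level -1; lappend levels $level }
        }
    }
    return $levels
}

puts [brace_levels "{{{a list}}}"]
puts [brace_levels "{{{this is} a {complex list}}}"]
```

The first call should print 1 2 3 2 1 0 and the second 1 2 3 2 3 2 1 0, matching the digits in the first two diagrams.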

The implementation:

  # ---------------
  # the lists
  # ---------------
  set l1 {{{a list}}}
  set l2 {{{single}}}
  set l3 {{{a} complex {list}}}
  set l4 {{{this is} a {complex list}}}
  set l5 {{{and {this is}} a} {more {complex list}}}
  # ---------------
  # the proc
  # ---------------
  proc stripping {l} \
  {
    set len [string length $l]
    set level 0
    set max 0
    set last \{
    for {set p 0} {$p < $len} {incr p} \
    {
      set ch [string index $l $p]
      switch -exact -- $ch \
      {
        \{  { 
              # compute min
              if {$last != $ch} \
              {
                if {![info exists min]} { set min $level } \
                elseif {$min > $level} { set min $level }
              }
              # set level
              incr level
              # compute max
              if {$max < $level} { set max $level }
            }
        \}  { incr level -1 }
      }
      set last $ch
    }
    set n [expr {[info exists min] ? $min : $max}]
    return [string range $l $n end-$n]
  }
  # ---------------
  # the result
  # ---------------
  foreach l {l1 l2 l3 l4 l5} \
  { puts [stripping [set $l]] }
  -> a list
  -> single
  -> {a} complex {list}
  -> {this is} a {complex list}
  -> {{and {this is}} a} {more {complex list}}

Now, where is the next flaw?


AEC, 2002-05-28

I use the built-in command join to do the same thing.

 % set text "{{{this is} a {nested list}}}"
 {{{this is} a {nested list}}}
 % set textt [join $text]
 {{this is} a {nested list}}
 % set textt [join [join $text]]
 {this is} a {nested list}
 % set textt [join [join [join $text]]]
 this is a nested list

This could easily be encapsulated into a proc to simplify use:

 proc striplist {args} { 
     # Usage:
     #
     # striplist ?-level num? list
     #
     # Level defaults to 0 if omitted.  This means all levels of list nesting
     #   are removed by the proc.  For each level requested, a level of list nesting
     #   is removed.
     #
     # determine level
     set idx [lsearch $args -level]
     if {$idx == 0} {
         set level [lindex $args [incr idx]]
         set args [lreplace $args [expr {$idx - 1}] $idx]
     } else {
         set level 0
     }
     # while text seems braced and level is not exhausted
     while {1} { 
         # strip outer braces and expose inners
         incr level -1 
         set newargs [join $args]
         if {$newargs == $args} {
             break
         } else {
             set args $newargs
         }
         if {$level == 0} {
             break
         }
     }
     return $args
 }

Example usage:

 % set a {this {is a} test}
 % striplist -level 1 $a
 this {is a} test
 % striplist -level 2 $a
 this is a test
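As the example shows, -level 1 returns the list unchanged: the first join only removes the wrapping that args adds around the single list argument. When the number of levels is known, a plainer variant can take the list as an ordinary parameter, so that each level removes one real layer. This is a sketch under that assumption, and join_levels is a name invented here.

```tcl
# Hypothetical variant: apply join exactly n times (n defaults to 1).
proc join_levels {lst {n 1}} {
    for {set i 0} {$i < $n} {incr i} {
        set lst [join $lst]
    }
    return $lst
}

puts [join_levels "{{{this is} a {nested list}}}" 2]
puts [join_levels {this {is a} test} 1]
```

The first call should print {this is} a {nested list} and the second this is a test. Unlike striplist, this variant has no "strip everything" mode, and join flattens sublists at the current level as well as the outer wrapping.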

Zia - 2010-01-06 08:37:40

The stripping procedure has a minor flaw where the following type of list gets truncated undesirably:

  set str "{{this is } a list}"
  puts [stripping $str]
  -> this is } a lis

I have modified the procedure to handle this case. I am not sure whether other corner cases remain.

  proc stripping {l} {
      set len [string length $l]
      set level 0
      set max 0
      set last \{
      for {set p 0} {$p < $len} {incr p} {
          set ch [string index $l $p]
          switch -exact -- $ch {
              \{  {
                  # compute min
                  if {$last != $ch} {
                      if {![info exists min]} { set min $level } \
                          elseif {$min > $level} { set min $level }
                  }
                  # set level
                  incr level
                  # compute max
                  if {$max < $level} { set max $level }
              }
              \}  { incr level -1 }
              default {
                  #### Change: this default branch is new. The markers sit
                  #### inside the branch body because a comment placed
                  #### directly in the switch body is parsed as a pattern.
                  # If the last character was a closing brace, update min.
                  if {$last == "\}"} {
                      if {![info exists min]} {
                          set min $level
                      } elseif {$min > $level} {
                          set min $level
                      }
                  }
                  #### Change ends here
              }
          }
          set last $ch
      }
      set n [expr {[info exists min] ? $min : $max}]
      return [string range $l $n end-$n]
  }

  set l1 {{{a list}}}
  set l2 {{{single}}}
  set l3 {{{a} complex {list}}}
  set l4 {{{this is} a {complex list}}}
  set l5 {{{and {this is}} a} {more {complex list}}}
  set l6 {{{this is } a list}}
  foreach l {l1 l2 l3 l4 l5 l6} \
  { puts [stripping [set $l]] }
  -> a list
  -> single
  -> {a} complex {list}
  -> {this is} a {complex list}
  -> {{and {this is}} a} {more {complex list}}
  -> {this is } a list


neb 2010-08-22: If we're still flattening arbitrary-depth lists, does this help?

  proc stripping l {
          set old ""
          while {$l != $old} {
                  set old $l
                  set l [concat {*}$l]
          }
          return $l
  }

Put into the previous example, it returns this:

  >tclsh85 striplist.tcl
  a list
  single
  a complex list
  this is a complex list
  and this is a more complex list
  this is a list