Transposing a matrix

Richard Suchenwirth 2001-12-10 - Let a matrix (a rectangular arrangement of data, typically numerical) be represented as a list of lists, where the sublists represent the rows of the matrix. One frequent requirement is to transpose a matrix, i.e. swap rows and columns. The following implementation exploits some unique features of Tcl to do this in compact code. Transposition example:

 a b c d        a e i
 e f g h   ->   b f j
 i j k l        c g k
                d h l

Iterating over a list of lists goes most naturally first over the outer layer (the rows), then for each element of the current row. We collect the elements by lappending to variables, one for each column. But as we don't know how many columns the matrix will have, we have to create a set of n row variables, where n is the row length, or number of columns. This is most easily done with an index vector generator, iota:

 proc iota n {
    # index vector generator, e.g. iota 5 => 0 1 2 3 4
    set res {}
    for {set i 0} {$i<$n} {incr i} {
        lappend res $i
    set res

An amazing feature of Tcl is that a variable name can be any string, including of course digits - most other languages restrict the choice of names, in order to distinguish variables and constants. Tcl does that, as we all know, explicitly with dollar sign substitutions (or a single-argument set). So the result of iota is a list of perfectly valid variable names, over which we will iterate (I picked the fancy variable name *col to indicate this "points to" the name) in parallel with the elements of each row:

 proc transposeMatrix m {
    set cols [iota [llength [lindex $m 0]]]
    foreach row $m {
        foreach element $row   *col $cols {
            lappend ${*col} $element
    eval list $[join $cols " $"]

The last line may need some explanation. We want to return the list of columns, the command for which e.g. for 5 columns would look like this:

  list $0 $1 $2 $3 $4

Joining the elements of cols with a space and a dollar sign brings us most of the way:

 0 $1 $2 $3 $4

Only the first element lacks its dollar sign, which is prepended to the result of join. Note that both dollar signs need not be escaped with backslashes, as "only a single round of substitutions takes place", so the result of the bracketed join command is not again evaluated in context with the dollar sign - not until we order it: we also prepend the list command name and then eval the whole thing that this tiny code generator produced, the result (the transposed matrix) being the last result of this procedure, and hence implicitly taken as return value.

The treatment of malformed matrices, with rows of uneven length, depends on the length of the first row: rows that are shorter will be padded with empty strings, while elements running beyond the length of the first row will be collected in a variable ${} (this too is a legal name), and ultimately discarded. The values of matrix elements can be any string, so one might use this for "transposing" spreadsheets as well - or strimjes (see strimj - string image routines where I have used the same technique to rotate by 90 degrees; or A matrix gadget where the above proc was in a tighter framework which has not won poularity since then).

It is always fascinating to see how seemingly routine mathematical tasks can lead to more insight into the features and possibilities of Tcl...

A (even simpler) generalization of the above is function mapping with list.

This code from Playing sudoku is probably simpler, hence better, than the above:

 proc transpose matrix {
    set cmd list
    set i -1
    foreach col [lindex $matrix 0] {append cmd " \$[incr i]"}
    foreach row $matrix {
        set i -1
        foreach col $row {lappend [incr i] $col}
    eval $cmd

For the more conservatively minded, here is a transpose procedure that avoids dynamically building commands:

 proc transpose {matrix} {
    set res {}
    for {set j 0} {$j < [llength [lindex $matrix 0]]} {incr j} {
       set newrow {}
       foreach oldrow $matrix {
          lappend newrow [lindex $oldrow $j]
       lappend res $newrow
    return $res

MJ - While exploring the depths of lsearch the following option emerged (8.5+):

 proc transpose` {matrix} {
   for {set index 0} {$index < [llength [lindex $matrix 0]]} {incr index} {
       lappend res [lsearch -all -inline -subindices  -index $index $matrix *]
   return $res

For large lists this method is quite a bit faster. Results of:

 set m [lrepeat 1000 [lrepeat 1000 a]]
 package require struct::matrix
 struct::matrix m deserialize [list 1000 1000 $m]
 puts [time {m transpose} 10]
 puts [time {transpose $m} 10] ; # last transpose proc on this page
 puts [time {transpose` $m} 10]


 1260621.0 microseconds per iteration
 330535.9 microseconds per iteration
 81381.5 microseconds per iteration

Lars H: I see the lsearch variant being faster (probably due to evaluating fewer Tcl commands) already for 7x7 matrices on such matrices, but that comparison is probably unfair for numerical matrices. The reason is that lsearch probably cannot avoid generating string representations for the matrix elements, whereas the other transposes never looks inside them. Transposing a matrix like this m where all elements are the same Tcl_Obj hides that source of possible overhead, but it is difficult to set up a time that would capture it.