Shuffle a file

Based on the code from shuffle a list, here's a first draft of a program that shuffles a file: read the file in, split it into a list of lines, shuffle the list, then output the lines in random order.

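Note that, judging from my initial test, the code is not right yet. When the file ends with a newline, split produces an empty last element, which then gets shuffled in as a spurious blank line. The code needs to trim off that possible last empty element, perhaps setting a flag indicating whether or not there was a trailing newline.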

Why bother with this? Sometimes having a set of data come into a program in a random order is useful for testing.

if { $::argc == 0 } {
    puts stderr "USAGE: $::argv0 filename"
    exit 1
}

set fd [open [lindex $::argv 0] "r"]
set str [read $fd]
close $fd
set lst [split $str "\n"]

proc shuffle10a list {
    set len [llength $list]
    while {$len} {
        set n [expr {int($len*rand())}]
        set tmp [lindex $list $n]
        lset list $n [lindex $list [incr len -1]]
        lset list $len $tmp
    }
    return $list
}

set str [join [shuffle10a $lst] "\n"]
puts $str
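
One way the trailing-newline problem noted above might be handled is sketched below. This is only a sketch, not a tested replacement for the draft: the names splitLines and joinLines are made up for illustration, and lassign assumes Tcl 8.5 or later. The idea is to strip one final newline before splitting, remember that it was there, and put it back after joining.

```tcl
# Hypothetical helper: split file contents into a list of lines and a flag
# telling whether the data ended with a newline.
proc splitLines {str} {
    set trailing [expr {[string index $str end] eq "\n"}]
    if {$trailing} {
        # Drop the final newline so split does not yield an empty last element
        set str [string range $str 0 end-1]
    }
    return [list [split $str "\n"] $trailing]
}

# Hypothetical helper: the inverse of splitLines
proc joinLines {lines trailing} {
    set str [join $lines "\n"]
    if {$trailing} {
        append str "\n"
    }
    return $str
}

lassign [splitLines "a\nb\nc\n"] lines trailing
# -nonewline, because joinLines restores the final newline itself
puts -nonewline [joinLines $lines $trailing]
```

In the draft above, the last two lines would presumably become lassign [splitLines $str] lst trailing followed by puts -nonewline [joinLines [shuffle10a $lst] $trailing].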

ferrieux A slight variation on this uses offset indexing to allow for very large files: first build a list containing the byte offsets of all beginnings-of-line in the file, then shuffle that list, and finally read the lines back with seek. Notice that tell is not even used, out of sheer superstition (no perf measurements, sorry). The script below assumes that shuffle10a from above is already defined.

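puts stderr "(indexing...)"
set fd [open [lindex $::argv 0] r]
# Binary translation: one character per byte, so string length counts bytes
fconfigure $fd -translation binary
set ll {}
set off 0
while {[gets $fd line]>=0} {
    # Record where this line starts, then step past it and its "\n"
    lappend ll $off
    incr off [expr {[string length $line]+1}]
}

puts stderr "(now shuffling !)"
foreach off [shuffle10a $ll] {
    # Jump to the start of a randomly chosen line and print it
    seek $fd $off
    gets $fd line
    puts $line
}
close $fd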
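
For comparison, here is a sketch of how the indexing pass might look if tell were used instead of adding up line lengths. It is unmeasured, like the version above, so no claim is made about which is faster. The proc name lineOffsets is made up for illustration, and the channel is assumed to be seekable and already configured with -translation binary.

```tcl
# Hypothetical helper: return the list of byte offsets at which each line
# of the open channel fd starts, asking tell for the position before each gets.
proc lineOffsets {fd} {
    set offsets {}
    set off [tell $fd]
    while {[gets $fd line] >= 0} {
        lappend offsets $off
        # Position of the next unread byte, i.e. the start of the next line
        set off [tell $fd]
    }
    return $offsets
}
```

The result would be used in place of ll above, as in foreach off [shuffle10a [lineOffsets $fd]] { ... }.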

If you only want one line at random from a file, see random line from file.