This page was started by [HZe], comments are welcome. An expanded version of this package [tclSylk], by [CMcC] is available in the half-bakery: [http://wiki.tcl.tk/_repo/tclSylk/] To write out simple, table oriented data to interface with other systems, sometimes [csv] formatted files are used. But if you have to deal with different [localization], soon you get screwed up by different configurations of decimal points (dot or comma) and list separators (comma, semicolon). Even worse, on [Windows] each user may reconfigure the locale settings. So I tried to use a different format. SYLK is the file format of old [Microsoft] Multiplan and is still supported by almost all modern spreadsheet programs. The benefits are * ASCII format * decimal point is defined as dot (.) * some basic formatting is possible * cells are addressed by x/y values, so no separator unfortunately, there is also a drawback * format is not documented well (see e.g. [http://netghost.narod.ru/gff/graphics/summary/micsylk.htm] which is almost everything you can find) [GWM] also see [http://en.wikipedia.org/wiki/SYmbolic_LinK_%28SYLK%29]. The package below is able to read in SYLK files and store it into an array. While doing this, some parts are parsed (like the content, the format and the formulas), some parts are just stored (like the header information). This enables the user to read a given SYLK file (e.g. exported by Excel) and access the values with $SYLK(,,value). Also, modifications can be made and the whole structure can be saved back into a SYLK file. Since there is no access layer (yet), you will have to access the array directly. So, some caution is necessary. Here are some remarks when using the package: * if you overwrite a value with a formula, make sure the formula is deleted (unset SYLK(,,formula)) * format and values are exactly stored as defined in SYLK * the code makes heavily use of regexp, so performance can be improved * SYLK(header) is used to store all header lines as a TCL list Although the code is perhaps not complete, it already may be helpful for some others... 2007 Dec 31 - [GWM] added parsing for further format types, cell names and references. Used in [TekSel] project. ---- ====== ##################################################################### # Access procedures for SYLK files reading/writing # # (c) Copyright 2003 LMS Deutschland GmbH # Holger Zeinert ##################################################################### # feel free to use without any warrenty ##################################################################### # # internal data structure of a SYLK file is stored in an array: # # SYLK(id) # identification of the SYLK file # SYLK(,,value) # value of cell # SYLK(,,formula) # formula to calculate value # (a cell needs a value; if a formula is given, but no value, the cell # will not be stored into the SYLK file) # SYLK(,,format) # formatstring of cell # SYLK(xMax), SYLK(yMax) # Boundary X and Y, i.e. maximum column and row index # SYLK(header) # part of the header, which is not handled; this # is used to write it unmodified (uninterpreted) # back to the file # ##################################################################### package provide Sylk 0.2 #-------------------------------------------------------------------- # parse X and Y values from a SYLK record # # Both values are optional. If a value is missing, the corresponding # value is not modified #-------------------------------------------------------------------- proc parsePosition {record xName yName} { upvar $xName x upvar $yName y if {[regexp {;Y([0-9]+)(;|$)} $record dummy value]} { set y $value } if {[regexp {;X([0-9]+)(;|$)} $record dummy value]} { set x $value } } #-------------------------------------------------------------------- # read a SYLK file into the internal structure #-------------------------------------------------------------------- proc readSYLK {filename arrayName} { upvar $arrayName SYLK if {[catch {set fp [open $filename r]}]} { return 0 } # store table boundaries set xMax 0 set yMax 0 # hold current postition set xAct 0 set yAct 0 while {![eof $fp]} { gets $fp line switch -regexp $line { ^ID { # e.g. ID;PWXL;N;E regexp {^ID;(.*)$} $line dummy value set SYLK(id) $value } #^P - ^O { # ignore some default formatting information, # which I did not find documented } ^B { # e.g. B;Y107;X105;D0 0 106 104 regexp {^B;Y([0-9]+);X([0-9]+)} $line dummy yMax xMax } ^F { parsePosition $line xAct yAct if {$xAct == 0 && $yAct == 0} { # still header formats lappend SYLK(header) $line } else { # cell formats # Dec 2007 extend to handle ;f format ;p excel index if {[regexp {;S([^;]*)(;|$)} $line dummy style]} { set SYLK($xAct,$yAct,format) $style } if {[regexp {;P([^;]*)} $line dummy pfmt]} { set SYLK($xAct,$yAct,pformindex) $pfmt } if {[regexp {;F([^;]*)} $line dummy fmt]} { set SYLK($xAct,$yAct,numformat) $fmt } } } ^C { # e.g. C;Y1;X1;K"Name" parsePosition $line xAct yAct if {[regexp {;K([^;]*)(;E|$)} $line dummy value]} { regsub {"(.*)"} $value {\1} value set SYLK($xAct,$yAct,value) $value } # with formula? E.g. C;K-1188;E(R[+1]C-0.5)*(R3C2-R2C2)/R4C2+R2C2 if {[regexp {;E(.*)(;|$)} $line dummy value]} { # [GWM] to convert relative positions in formulae set value [substrelatives $value $xAct $yAct] ;# gwm 03.12.07 set SYLK($xAct,$yAct,formula) $value } } ^NN { ;# gwm 23.12.07 name labels # eg NN;Naltitud;ER3C4 regsub "NN;N(.*);E.*" $line {\1} name regsub ".*;E(.*)" $line {\1} cell set SYLK(name,$name) $cell } ^E$ { # SYLK file ends here break } default { # add to header (uninterpreted) lappend SYLK(header) $line } } } close $fp set SYLK(xMax) $xMax set SYLK(yMax) $yMax return 1 } #-------------------------------------------------------------------- # write a SYLK file from the internal structure #-------------------------------------------------------------------- proc writeSYLK {filename arrayName} { upvar $arrayName SYLK set fp [open $filename w] # write if {[info exists SYLK(id)]} { puts $fp "ID;$SYLK(id)" } else { puts $fp "ID;PWXL;N;E" } # boundary? if {[info exists SYLK(yMax)] && [info exists SYLK(yMax)]} { puts $fp "B;Y$SYLK(yMax);X$SYLK(xMax)" } # first, write header if existing if {[info exists SYLK(header)]} { foreach line $SYLK(header) { puts $fp $line } } # write data foreach idx [lsort -dictionary [array names SYLK *,*,value]] { regexp {([0-9]+),([0-9]+),} $idx dummy x y if {[info exists SYLK($x,$y,format)]} { set line "F;S$SYLK($x,$y,format);Y$y;X$x" puts $fp $line } if {[string is double $SYLK($x,$y,value)]} { set value $SYLK($x,$y,value) } else { set value "\"$SYLK($x,$y,value)\"" } set line "C;Y$y;X$x;K$value" if {[info exists SYLK($x,$y,formula)]} { append line ";E$SYLK($x,$y,formula)" } puts $fp $line } # end-of-file marker puts $fp "E" close $fp } # GWM dec 2007 substitute relative references in formula # in Excel (at least) saved slk files use references such as R[-2]C[+2] # to refer to cells 2 above and 2 to right of current cell. proc substrelatives {formula xAct yAct} { # [GWM] converts relative positions in formulae # eg R[-1]C[+1] refers to cell row above, col to right. # eg RC[+1] R[-1]C refers to cell same row above, or same column as previous. # Assume no cell depends on itself to avoid circular references. # step 1 replace [+ or - N with [expression xy+/-N] exact cell is found. set f1 $formula set f1 [regsub -all {C\[(.\d*)]} $f1 {C[expr $xAct\1]}] set f1 [regsub -all {R\[(.\d*)]} $f1 {R[expr $yAct\1]}] set f1 [subst $f1] # now convert the 'same row' formula and 'same col' with bracket after # care needed to avoid changing any letter C eg in a name such as sCheme to sCNNheme set f1 [string map {RC R${yAct}C} [subst -nocommands {$f1}]] set f1 [regsub -all {(R.\d*C)(\D)} $f1 {\1${xAct}\2}] set f1 [regsub -all {(R.\d*C)\Z} $f1 {\1${xAct}}] if {[catch {subst $f1} errmsg]} {puts "Erro subst $formula:> $f1 - $errmsg"} # substitute the expressions on return return [subst $f1] } ====== <> Glossary | Spreadsheet