[Arjen Markus] (9 March 2005) I created this page to collect small code fragments about the various tricks you can use to process text. Most of the time these tricks do not merit a page of their own, but they do merit a place on the Wiki :).

----

[AM] I had a problem with files that contain "continued lines". Here is a sketch:

    line with info \
    continued on the next line (see the backslash) \
    and the info is: Aha=BBBB
    another line with info - here the info is: Aha=CC

I needed to extract information from the complete lines. Usually I read files line by line and analyse the lines one by one. You cannot do that with this type of layout. Or can you?

Here is my little trick:

    set contents [read $infile]
    set contents [string map [list "\\\n" " "] $contents]
    foreach line [split $contents \n] {
        # ... process the line ...
    }

This little fragment of code:

   * reads the complete file (in my case the files were not very large)
   * replaces each trailing backslash (and the newline that follows it) by a single space
   * splits the contents into separate lines again

No need to check whether a line is complete or not - just use a few commands.

----

[CLN] When you get the contents of a text widget, you get an extra trailing newline. If you read the contents of a file, insert it in the widget, and just save those contents, you'll add a blank line at the end of the file for each save. The solution is to save one less character than [[$text get 1.0 end]] returns, something like [[puts $fid [[string range [[$text get 1.0 end]] 0 end-1]]]].

[ECS]: Why not this: [[puts $fid [[$text get 1.0 end-1c]]]]?

[CLN] Oops. I guess that'll work, too (though I haven't verified either).

[rdt] Well, isn't this because your puts is adding the newline? So wouldn't 'puts -nonewline $text' just do the job?

[CLN] No good deed goes unpunished! ;-) In my too-quick example, both the ''get'' from the text widget and the [[puts]] added newlines.
Try this:

    % text .t
    .t
    % pack .t
    % .t insert end "Foo"
    % string length [.t get 1.0 end]
    4

So, I guess to write out only what you see in the text widget, you'd have to do:

    puts -nonewline [.t get 1.0 end-1c]

[RS]: Note that text files not ending in a newline are considered ill-behaved, e.g. by ''diff''...

----

[schlenk] If a data file or text is already quite similar to a Tcl program, one can sometimes easily map it to a Tcl program and just execute it.

One example of this is a plotter data file like this:

    ;PU 640, 6900
    ;PD 640, 6909
    , 640, 6913
    , 640, 6917
    , 640, 6921
    , 640, 6924
    ;PU 641, 6928
    ;PD 641, 6932
    , 642, 6936
    , 643, 6940
    , 644, 6944
    , 645, 6947
    , 646, 6951

This already looks quite similar to a Tcl program; we just need to reformat it a little bit. This does the trick:

    # drop the newlines, start a new command at each ";", drop the commas
    set data [string map { \n "" ; \n , "" } $data]

Now we have to set up a nice evaluation environment, so we do not get surprised:

    # any command we did not define is silently ignored
    proc dummy_unknown {args} {return}

    proc PD {args} {
        foreach {x y} $args {
            puts "PenDown ( $x , $y )"
        }
    }
    proc PU {args} {
        foreach {x y} $args {
            puts "PenUp ( $x , $y )"
        }
    }

    set i [interp create -safe]
    # strip the commands that a safe interpreter still provides
    $i eval {namespace delete ::}
    # expose only our own handlers inside the interpreter
    interp alias $i unknown {} dummy_unknown
    interp alias $i PD {} PD
    interp alias $i PU {} PU

And finally, just evaluate our little program:

    interp eval $i $data

----

Splitting or processing a text file as a list of lines:

    set lines [split [read $fd] \n]

so [[llength $lines]] gives the number of lines in the file (plus one empty trailing element if the file ends with a newline), the n'th line in the file, counting from 0, is [[lindex $lines $n]], and so forth.

----

... next trick? ...

----

[[ [Category Word and Text processing] - [Category File] - [Category String Processing] ]]
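
A minimal sketch for the line-splitting trick above, in case the empty trailing element is unwanted: `read -nonewline` discards a single final newline before the split. The proc name `readLines` and the file name `example.txt` are assumptions made for illustration only; they do not come from the page.

```tcl
# Hypothetical helper (not from the original page): return the lines of a
# file as a Tcl list, without the empty element a final newline would add.
proc readLines {path} {
    set fd [open $path r]
    # -nonewline discards one trailing newline, so a file that ends in "\n"
    # does not produce a spurious empty last element after the split.
    set contents [read -nonewline $fd]
    close $fd
    return [split $contents \n]
}

# Usage: write a small sample file, then read it back.
set path [file join [pwd] example.txt]   ;# assumed file name
set fd [open $path w]
puts $fd "first line"
puts $fd "second line"
close $fd

set lines [readLines $path]
puts "[llength $lines] lines; line 0 is '[lindex $lines 0]'"
file delete $path
```

With this variant, [[llength $lines]] matches the number of lines even when the file ends with a newline, which is the usual case for well-behaved text files.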