Text widget improvements

A page to collect ideas for improving Tk's text widget, and to list current limitations.

lm 2009/05/18: Scrolling in the text widget is a real problem. In bioinformatics, it is common to work with files of 200-500 lines, each 2000-6000 characters long, all of them tagged. The problem was already pointed out by John Ousterhout in 1996: "Alas, very long lines are also slow in Tk text widgets, (...). Tk uses a b-tree to represent the contents of the widget, but the leaves of the b-tree are lines (everything between two newline characters), so Tk has to do linear searches through lines to find anything. It probably would have made sense to make the leaves of the tree smaller units than lines and perhaps some day we'll do this, but this is a bigger change than fixing the problems with large numbers of tags." Maybe this is a hint for how to improve the widget?

  • Text widget cloning or Text peer widgets (Fixed in Tk 8.5 - see [L1 ])
  • Foremost amongst these is the 'scrollbar problem' when there are wrapped lines of varying lengths. As you scroll, the length of the scroll box changes! (Fixed in Tk 8.5, see [L2 ])
  • Add a 'blockcursor' option to present a flashing block rather than thin line.
  • Add a method to count the number of characters between two index positions (currently the only way to do this is string length [.text get $idx1 $idx2])
  • Add helper code to implement a text replace operation which doesn't mess with the current insertion position or the current scroll position, unless such changes are actually required. (The naive use of delete followed by insert really isn't very satisfactory.) (At least 8.5.8+ has the "replace" sub-command which implements this feature.)
  • Add the ability to find the index at the beginning or end of any display line, whether that line is actually currently displayed or not.
  • Add the ability to move up or down by display lines.
  • Fix the 'slow deletion of lots of text with lots of tags' problem.

Vince has placed a work-in-progress patch at http://sourceforge.net/tracker/?func=detail&aid=791292&group_id=12997&atid=312997 which addresses all of the above issues.
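For readers on a Tk where these features are available, several of the items above can be tried directly. The following is only a minimal sketch, assuming Tk 8.5 or later; the widget name .demo and the sample text are made up for illustration.

```tcl
package require Tk 8.5

# -blockcursor shows the insertion cursor as a block instead of a thin line.
text .demo -wrap word -blockcursor 1
pack .demo -fill both -expand 1
.demo insert end "alpha beta gamma\nsecond line\n"

# Count characters between two indices without extracting the text.
set n [.demo count -chars 1.0 1.5]

# Replace a range in one step instead of delete followed by insert.
.demo replace 1.0 1.5 "ALPHA"

# Start and end of the display line that holds the insertion cursor.
set dStart [.demo index "insert display linestart"]
set dEnd   [.demo index "insert display lineend"]

# Move the insertion cursor down by one display line.
.demo mark set insert "insert + 1 display lines"
```

The display-line indices depend on the widget's current width and wrapping, so their values are only meaningful once the widget has been laid out.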

  • Printing from the widget
  • Backing with a Tcl variable
  • TV It would be nice to have a -autoindent 3 option, for programming (a la 'a+' earlier and currently 'set ai' in vi) (easy for me to say, not having contributed a line of code, and not in charge of keeping core sizes non-galactic..). Vince adds that this is indeed something very useful and nice, but really something that can be implemented in a few lines of Tcl by whatever piece of code is trying to turn the text widget into an editor of some kind (which will also have to add code to load, save, etc). I don't particularly see a need to add this to Tk's core.

TV I remember wanting to have it real-time, that is, to get an indent after typing a return, and though I might be famous (or infamous) for my one-liners, at that particular time it seemed not feasible without a delay to prevent the return from coming in after the binding that creates the indent, which always puts the cursor at the beginning of a line, and that is not the idea of prettyprint-while-you-type.

Vince adds -- you're definitely mistaken there. Take a look at Alphatk which is a Tcl-Tk based editor which does auto-indentation. You can easily achieve this sort of thing with the text widget as it stands.
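As a rough illustration of the "few lines of Tcl" approach (this is not Alphatk's code, just a minimal sketch), a widget-level <Return> binding can insert the newline and the indent itself and then use break so that the class binding does not add a second newline. The proc names leadingWhitespace and autoIndentReturn and the widget name .ed are hypothetical.

```tcl
package require Tk

# Hypothetical helper: return the leading spaces/tabs of a line.
proc leadingWhitespace {line} {
    regexp {^[ \t]*} $line ws
    return $ws
}

# Hypothetical helper: insert a newline followed by the indent of the
# current line, taken from the text between line start and the cursor.
proc autoIndentReturn {w} {
    set line [$w get "insert linestart" insert]
    $w insert insert "\n[leadingWhitespace $line]"
    $w see insert
}

text .ed
pack .ed -fill both -expand 1

# The widget binding runs before the Text class binding; "break" stops the
# class binding from inserting its own newline afterwards.
bind .ed <Return> {autoIndentReturn %W; break}
```

Because the newline and the indent are inserted in the same binding, no delay is needed and the cursor ends up after the indent.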

  • Add the possibility to "chain" a text between peer widgets, i.e. automatically manage them so that the -endline of the first peer is just before the -startline of the second peer, whose -endline is just before the -startline of the third, and so on; be able to go from one to another without having to use the Tab key; and be able to synchronize scrolling of the text across all the peer widgets. With this, it should be possible to do some column layouts or emulate the float algorithm of HTML.
  • Add background image capabilities
  • Add alpha channel handling (image, background color)
  • Add horizontal line special symbol.
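Until something like automatic chaining exists, two peers can be chained by hand. The sketch below assumes Tk 8.5 or later, where peers and the -startline/-endline options are available; the widget names and the 40 sample lines are invented for the example, and neither synchronized scrolling nor moving the cursor from one column to the next is handled.

```tcl
package require Tk 8.5

text .col1 -width 30 -height 20
for {set i 1} {$i <= 40} {incr i} {
    .col1 insert end "line $i\n"
}

# The peer shares the same underlying text; it starts at line 21.
.col1 peer create .col2 -width 30 -height 20 -startline 21

# -endline names the line just after the last one shown, so the first
# column now ends at line 20 and the second begins at line 21.
.col1 configure -endline 21

pack .col1 .col2 -side left
```

Keeping the boundary up to date as text is inserted or deleted is left to the application, which is exactly the bookkeeping the wish above asks Tk to take over.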

My biggest (and only) frustration with the text widget is that it is too slow or weak. Loading really large files into it is very slow and painful. That renders apps like Tkdiff or Tkgrep practically useless. -- RS: Well, comparing with Notepad, Word or such, text still loads quite acceptably, if you load the file line by line with updates in between - start reading on top, while bottom still loads. Even on a slow iPaq, iRead: a Gutenberg eBook reader is usable for files of say 400 KB (where PocketWord bails out at about 180KB)

I suppose you're right, but... Notepad and Word??? Were you out of good parameters when you wrote that? RS: Well, how else would I open a Word doc that someone sent me? [L3 ] And Notepad at least isn't bloated, and can handle exotic Unicodes sort of well. OTOH, emacs on Linux won't allow me to paste something from Windows if it's outside ASCII...


MGS [2003/08/22] - I would like to see Move cursor by display line in a text widget a standard feature. I guess it should be optional (and keep the current default method of cursor control for backwards compatibility).

Vince adds that he has just added '+Ndisplaylines' and '-Ndisplaylines' to the above patch so that basic display-line cursor control is now available.


Bryan Oakley: I agree with RS; I think it can also be said that the "uselessness" of an app is partly related to how the app is written. For example, tkdiff now works by leaving the whole window blank until all diffs have been processed. It is possible to write tkdiff where it would display data as it is available 1, 10 or 100 lines at a time, making it come up and be usable significantly faster. True, viewing the last diff will take just as long but most people use tools like tkdiff top down.

That's not to say I don't want the widget to be faster, but I do think it's a stretch to say tkdiff or tkgrep is "practically useless".

OK. What about a serious text editor, which is expected to display the entire document all the time? You can use that trick and display only portions at any given time, but programming most of the functionality becomes a lot more complex. Virtually every single event, even key presses, and their relation with what is displayed or hidden must be managed all the time. - RS: No, it's easiest to load big files in one go into the text widget - just update while you're loading :D Bryan Oakley: right. I wasn't suggesting you don't load the whole document, only that you load it a bit lazily.
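A minimal sketch of the "update while you're loading" idea: read the file line by line and let the event loop run every so often, so the top of the file is readable while the rest is still coming in. The proc name loadWithUpdates, the widget names and the default interval of 200 lines are assumptions made for this example.

```tcl
package require Tk

# Hypothetical helper: insert a file line by line, calling update every
# $every lines so the display and scrollbar refresh during the load.
proc loadWithUpdates {w filename {every 200}} {
    set f [open $filename r]
    set n 0
    while {[gets $f line] >= 0} {
        $w insert end "$line\n"
        if {[incr n] % $every == 0} { update }
    }
    close $f
    return $n
}

text .big -yscrollcommand {.sb set}
scrollbar .sb -orient vertical -command {.big yview}
pack .sb -side right -fill y
pack .big -side left -fill both -expand 1

# Usage, with any file name of your own:
# loadWithUpdates .big somefile.txt
```

Note that update also processes user events, so the user can type into the widget while the load is still running; an editor would probably want to guard against that.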


MGS 2003-08-22: Of course, you can load chunks of the input file during the idle event loop with something like this:

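# Read the channel in chunks of $size characters, returning to the event
# loop between chunks so the GUI stays responsive while loading.
proc text:load {text channel {size 1024}} {
    $text insert end [read $channel $size]

    if { [eof $channel] } {
        close $channel
        puts "channel closed at [clock format [clock seconds]]"
    } else {
        # Reschedule this same call (info level 0) once the event loop is idle.
        after idle [list after 0 [namespace code [info level 0]]]
    }
}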

text .t -yscrollcommand [list .y set]
scrollbar .y -orient vertical -command [list .t yview]

pack .y -side right -expand 0 -fill y
pack .t -side left  -expand 1 -fill both

set channel [open all.tcl r]
puts "channel opened at [clock format [clock seconds]]"
text:load .t $channel

Loading a 15 MB file took about 4 seconds on my 700 MHz K7 Athlon with 384 MB RAM.

The whiny fellow who's only contributed with pessimistic remarks in italics until now reports that the method above indeed seems to be quite efficient. But first let's put it through a serious stress test, then we'll see.

Please report your findings. We're only here to help! As far as performance goes, you might like to play with the size = 1024 value in the proc above. I only pulled that figure out of ... the air.

Yes, it looks better and better. 4096 seems to be the best size step on my machine. Don't know about other machines. And I still have to toy with [update]. Thanks.

Has anyone done any profiling to determine why loading a few megabytes into the text widget is slow? Is there an obvious bottleneck which could be improved?

On the same K7 Athlon as above, with the same 15 MB file, it takes about 750 milliseconds to read in the data and then about 1150 milliseconds to insert it into a text widget. That doesn't strike me as being so slow, but then I'm always open to performance improvements :-}
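For anyone who wants to repeat this kind of measurement, the read and the insert can be timed separately. This is only a sketch of one way to do it, not necessarily how the figures above were obtained; it assumes Tcl/Tk 8.5 or later for clock milliseconds, and the proc name timeLoad and widget name .timed are hypothetical.

```tcl
package require Tk 8.5

# Hypothetical helper: report how long reading a file and inserting it
# into a text widget take, in milliseconds.
proc timeLoad {w filename} {
    set t0 [clock milliseconds]
    set f [open $filename r]
    set data [read $f]
    close $f
    set t1 [clock milliseconds]

    $w insert end $data
    update idletasks          ;# include pending redisplay work in the insert time
    set t2 [clock milliseconds]

    return [list read [expr {$t1 - $t0}] insert [expr {$t2 - $t1}]]
}

text .timed
pack .timed -fill both -expand 1

# Usage, with any file name of your own:
# puts [timeLoad .timed somefile.txt]
```

Dropping the update idletasks call measures the raw insert alone, which helps show whether the time goes into the b-tree or into display layout.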


Parsing/Syntax coloring

Would it help to have some infrastructure support for genuine parsing of the contents of a text widget? For example, the ability to add to any line a particular code/Tcl_Obj (the parser state).


Peter Newman 17 March 2005: (moved from text) A useful (and surely very easily implemented) enhancement to the text widget would be to add a -fieldToFieldTabbing option - so that one could instantly create a multi-line text entry widget - simply by using the standard text widget, and setting that option ON.

The option itself would just switch the text widget's existing <Key-Tab>/<Shift-Key-Tab> and <Control-Key-Tab>/<Control-Shift-Key-Tab> bindings - so that Tab and Shift+Tab did the field to field tabbing that's usually desired with widgets in a data entry form - instead of the text editor/word processor type 8 space tabbing that they would otherwise do (and which is usually not really required, in a multi-line text entry widget).
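In the meantime the same effect can be had per widget with two bindings. A minimal sketch follows; fieldToFieldTabbing here is a hypothetical helper proc, not an existing Tk option, and the form widgets are invented. It relies on the standard tk_focusNext and tk_focusPrev commands and on the <<PrevWindow>> virtual event that Tk defines for Shift+Tab.

```tcl
package require Tk

# Hypothetical helper: make Tab and Shift+Tab move the focus out of a text
# widget instead of inserting a tab character.
proc fieldToFieldTabbing {w} {
    # "break" keeps the Text class binding from inserting a tab.
    bind $w <Tab>          {focus [tk_focusNext %W]; break}
    bind $w <<PrevWindow>> {focus [tk_focusPrev %W]; break}
}

entry .name
text  .notes -height 4 -width 30
entry .email
pack .name .notes .email -padx 4 -pady 2

fieldToFieldTabbing .notes
```

A literal tab can still be typed with Control+Tab swapped in by a further binding if a form needs it; that is left out here.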


One serious aesthetic glitch which I have found, is when text is centered or justified to the right... and when tags have been placed over the x.0 leftmost character - then the background color of that tag will extend all the way across to the left margin of the text area. The only way to fix this in 8.4, would seem to be by adding an elided character at the x.0 mark of every line.

Furthermore, I would really like a built-in option for preventing the mouse text block selection color from extending beyond the end of the line to the right (an example of this style is found in Microsoft's "wordpad.exe").

One other thing which I noticed, seems to have been already put into the plans for the 8.5 version of tk - but it's still worthwhile to note that when you elide more than one line of text at once in 8.4, the scrolling of the text widget becomes unpredictable. The elided text is included in the calculations for scrolling.

It would be nice to have the ability to use a particular designated font file (embed fonts) with the distribution of a program.

It would be nice to have an option of letting the arrow keys move through wrapped lines, in a manner which is more typical of other text widget programs out there. It might involve some mathematics, but it'd be nice that the cursor would move up and down smoothly through the screen lines of text.

Finally, I believe that there ought to be more "convenience" procedures for dealing with text dumps. This would encourage developers to write more polished textual programs, when they can work behind the scenes just as easily as working point-by-point, visually, on the screen. (One nice idea in this vein, which I saw mentioned on the 8.5a development page, was the ability to save a text dump and then use that dump to instantaneously create a new text window which is an exact duplicate of the first.)

-- Laif

dumping and restoring the text widget is fairly straight-forward. Sure, it would be nice if it were built-in, but the format of the output of the dump command is very simple to parse. For an old, still-working-but-not-very-optimal solution see http://www.purl.org/net/oakley/tcl/ttd/index.html

-- Anonymous reply
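To show how simple the dump format is, here is a minimal sketch (not the ttd code linked above) that rebuilds text, tag ranges and marks in an empty text widget from the flat key/value/index list that dump returns. The proc name restoreDump and the widget names are hypothetical; tag configuration, embedded images and embedded windows are not handled.

```tcl
package require Tk

# Hypothetical helper: rebuild text, tags and marks in an EMPTY text widget
# from the flat {key value index ...} list produced by "dump".
proc restoreDump {w dump} {
    # Pass 1: re-insert the text segments in order.
    foreach {key value index} $dump {
        if {$key eq "text"} { $w insert end $value }
    }
    # Pass 2: replay tag ranges and marks at their recorded indices, which
    # are valid again because the target widget started out empty.
    foreach {key value index} $dump {
        switch -- $key {
            tagon  { set start($value) $index }
            tagoff {
                if {[info exists start($value)]} {
                    $w tag add $value $start($value) $index
                    unset start($value)
                }
            }
            mark   { $w mark set $value $index }
        }
    }
    # Tags still open at the end of the dumped range run to the end of the text.
    foreach tag [array names start] {
        $w tag add $tag $start($tag) "end - 1 chars"
    }
}

text .src
text .copy
pack .src .copy
.src insert end "plain " {} "bold" strong " text"
restoreDump .copy [.src dump -text -tag -mark 1.0 "end - 1 chars"]
```

Dumping up to "end - 1 chars" leaves out the widget's final newline, so the copy does not gain an extra empty line.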

I appreciate that information. I'll check it out. Another comment I'll add here is that I believe elided text shouldn't be editable, as it is in 8.4. The fact that it's elided would usually mean that it has some mechanical function. This functionality shouldn't be inadvertently harmed by the user of the text widget as he runs his cursor back and forth.

-- Laif