[Arjen Markus] (10 August 2006) In the chatroom yesterday [Richard Suchenwirth] came up with an elegant solution for the following problem:

Given a string (actually the contents of a file) that contains empty lines (so \n\n), split it into parts on these empty lines.

You could use [regexp] to do this, but a much more elegant way (IMHO) is to replace the substring that you want to split on with a character that is not present in the original string and then use [split]. The choice of that character is of course a bit delicate, and the method is limited to ''fixed'' substrings. Still, the code is simple:

    set list [split [string map [list $substring $splitchar] $string] $splitchar]

Now, what character could you choose for $splitchar? One fascinating choice is '''\u0080''' - it belongs to the C1 control range of Unicode (U+0080 to U+009F), which is rarely used in ordinary text. That makes it very unlikely to be present in the original string (unless that is a binary string of course, in which case most if not all bets are off, but splitting binary strings is a rare and dangerous thing anyway).

If you need to split on a substring that may vary (for instance a sequence of one or more empty lines), check out the [[split_re]] method in [Tcllib].

----

[WJP] (10 August 2006) \u0080 is fairly safe, but you can't be quite sure, since it is a legal Unicode control character. A better choice is \uFFFE or \uFFFF. Both are guaranteed not to be characters and so are absolutely safe.

----

[JMN] 2006-11-02 My timings indicate that the above method is about 10x faster than textutil::split::splitx. However, Tcl's [split] alone on a single-char separator is 4x faster again. I'd love to see a multi-character 'split' in the core. [string split] perhaps?

----

[[ [Category String Processing] ]]
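
----

For reference, the [string map] plus [split] idiom discussed above could be wrapped up roughly as follows. This is only a minimal sketch: the proc name splitOnSubstring is made up for illustration (it is not part of Tcl or [Tcllib]), the default placeholder \uFFFF follows the suggestion by [WJP], and the guard against a placeholder that already occurs in the input is an added assumption rather than part of the original one-liner.

```tcl
# Hypothetical helper, not part of Tcl or Tcllib: split a string on a
# fixed multi-character substring via a one-character placeholder.
proc splitOnSubstring {text substring {splitchar \uFFFF}} {
    # Assumption: the placeholder must not occur in the input, otherwise
    # [split] would also break the string at those positions.
    if {[string first $splitchar $text] >= 0} {
        error "split character already occurs in the input"
    }
    # Replace every occurrence of the substring by the placeholder,
    # then split on that single character.
    return [split [string map [list $substring $splitchar] $text] $splitchar]
}

# Example: three blocks of text separated by empty lines.
set text "first paragraph\nline two\n\nsecond paragraph\n\nthird paragraph"
set parts [splitOnSubstring $text "\n\n"]

foreach part $parts {
    puts "<$part>"
}
```

Note that [string map] replaces non-overlapping occurrences from left to right, so a run of three newlines yields one split point followed by a leftover newline at the start of the next part. For separators of varying length the regexp-based splitter in [Tcllib] remains the better fit.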