Version 16 of regsub

Updated 2008-08-26 20:33:54 by JH

Perform substitutions based on regular expression pattern matching.


http://www.purl.org/tcl/home/man/tcl8.4/TclCmd/regsub.htm


*** Examples

See also Regular Expression Examples and Advanced Regular Expression Examples

[Feel free to add below various examples, demonstrating the use of the various flags, etc.]


One example of using regsub, from Brent Welch's book Practical Programming in Tcl and Tk:

 regsub -- {([^\.]*)\.c} file.c {cc -c & -o \1.o} ccCmd

The & is replaced by file.c, and the \1 is replaced by file.
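To make the two substitutions concrete, here is the same call run as a short script. Because a result variable (ccCmd) is supplied, regsub returns the number of substitutions and stores the new string in that variable; the variable name count is only illustrative.

```tcl
# Same pattern and replacement as above; ccCmd receives the result.
set count [regsub -- {([^\.]*)\.c} file.c {cc -c & -o \1.o} ccCmd]

puts $count   ;# number of substitutions performed
puts $ccCmd   ;# cc -c file.c -o file.o
```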


Recently in the Tcler's Wiki chat room, someone wanted to convert a string like this:

 rand ||=> this is some text <=|| rand

to

 rand ||=> some other text <=|| rand


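 set unique1 {\|\|=>}
 set unique2 {<=\|\|}
 set string {rand ||=> this is some text <=|| rand}
 set replacement {some other text}
 # With a result variable, regsub returns the number of substitutions made;
 # the spaces around $replacement restore the ones consumed by the pattern.
 set new [regsub -- "($unique1) .* ($unique2)" $string "\\1 $replacement \\2" string]
 puts $new     ;# 1, the substitution count
 puts $string  ;# the converted string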

Note that the regular expression metacharacters in unique1 and unique2 need to be quoted so they are not treated as metacharacters.
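When the delimiters are not known in advance, the quoting can be done programmatically rather than by hand. The sketch below uses the common idiom of prefixing every non-word character with a backslash; reQuote, openRe and closeRe are hypothetical names chosen for this illustration and are not part of Tcl.

```tcl
# Hypothetical helper: backslash-escape every non-word character so the
# text is matched literally when embedded in a regular expression.
proc reQuote {text} {
    regsub -all {\W} $text {\\&}
}

set openRe  [reQuote {||=>}]
set closeRe [reQuote {<=||}]
set string  {rand ||=> this is some text <=|| rand}

# The quoted delimiters can now be interpolated into the pattern safely.
regsub -- "($openRe) .* ($closeRe)" $string {\1 some other text \2} result
puts $result   ;# rand ||=> some other text <=|| rand
```

Escaping every non-word character is more than strictly needed, but in Tcl's advanced regular expressions a backslash followed by a non-alphanumeric character always stands for that character itself, so over-quoting is harmless.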


AM (7 October 2003) I asked about a complicated substitution in the chatroom:

Here is the question:

I have a fixed substring that delimits a variable number of characters. Anything in between (including the delimiters) must be replaced by a repetition of another string. For example:

        1234A000aadA12234 --> 1234BXBXBXBX12234

(A000aadA is 8 characters, my replacing string fits 4 times in that)

arjen: I do not think I can use some clever regexp to do this ... (note: things will always fit)

arjen: The regexp to identify the substring could be: {A[^A]*A}

arjen: But now to get the replacing string ...

CoderX2: easy... one sec

CoderX2:

   set string "1234A000aadA12234"
   set substring "BX"
   regsub -all {(A[^A]*A)} $string {[string repeat $substring [expr {[string length "\1"] / [string length $substring]}]]} new_string
   set new_string [subst $new_string]

(conversation edited to highlight this wonderful gem!)
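The subst trick above also evaluates any brackets, dollar signs or backslashes that happen to be in the input string. Where that matters, the same replacement can be done without subst by asking regexp for match indices and building the result by hand. This is only a sketch of an alternative, not what was posted in the chat; fillMatches is a hypothetical name, and like the original it assumes the filler always fits a whole number of times.

```tcl
# Hypothetical helper: replace every match of re in string with as many
# copies of filler as fit in the length of the match.
proc fillMatches {re string filler} {
    set out ""
    set pos 0
    # With no capture groups, each element is one {first last} index pair.
    foreach range [regexp -all -inline -indices -- $re $string] {
        foreach {first last} $range break
        set len [expr {$last - $first + 1}]
        # Copy the text between the previous match and this one unchanged.
        append out [string range $string $pos [expr {$first - 1}]]
        append out [string repeat $filler [expr {$len / [string length $filler]}]]
        set pos [expr {$last + 1}]
    }
    # Copy whatever follows the last match.
    append out [string range $string $pos end]
    return $out
}

puts [fillMatches {A[^A]*A} 1234A000aadA12234 BX]   ;# 1234BXBXBXBX12234
```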


Has a -eval flag to regsub ever been suggested? It would apply in the above example, and some other common idioms, e.g., URL-decoding:

  regsub -all -eval {%([[:xdigit:]][[:xdigit:]])} $str {binary format H2 \1} str

The idea is that the replacement string gets eval-ed after expanding the \1 instead of just being substituted in. Doing this safely without such a flag requires an extra call to regsub beforehand (to protect existing []s) and a call to subst afterwards to do the evaluation.

-JR

DKF: Yes, and I mean to do something about it sometime (too many things to do, too little time). Meantime, try this:

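 proc regsub-eval {re string cmd} {
    # Protect [, ], $ and \ already in the string, wrap each match in [$cmd],
    # then let subst evaluate the generated commands (-all handles every match).
    subst [regsub -all $re [string map {\[ \\[ \] \\] \$ \\$ \\ \\\\} $string] "\[$cmd\]"]
 }
 regsub-eval {%([[:xdigit:]][[:xdigit:]])} $str {binary format H2 \1}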

JH: Once upon a time I coded up regsub -eval in full in C (still have the patch around somewhere). I decided to not push it forward since it was actually slower than the full subst work-around. I believe this was due to the overhead of many small Tcl_Eval calls versus a one-time subst-pass that could be more effective. There are some newer Tcl_Eval* APIs to try and we should resuscitate this one.


elfring 2003-10-29: A Tcl variable can be marked as containing a compiled regular expression. REs can be pre-compiled by calling "regexp $RE {}".

DKF: Technically, the compiled RE is cached in the internal representation of the RE value and not the variable. The effect is pretty much indistinguishable though (in all sane programs).
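In practice this means keeping the pattern in one value and reusing that value, rather than rebuilding the pattern string on every call. A minimal sketch, with the variable names hexEscape and hits chosen only for illustration:

```tcl
# Keep the pattern in a single value so its compiled form can be reused.
set hexEscape {%([[:xdigit:]][[:xdigit:]])}

# Optional warm-up: the first use compiles the RE.
regexp $hexEscape {}

set hits 0
foreach s {a%20b plain %41%42} {
    # Without match variables, regexp -all returns the number of matches.
    incr hits [regexp -all $hexEscape $s]
}
puts $hits   ;# 3
```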


See also:


Tcl syntax help - Arts and Crafts of Tcl-Tk Programming - Category Command - Category String Processing