Error processing request

Parameters

CONTENT_LENGTH  0
REQUEST_METHOD  GET
REQUEST_URI  /revision/Mismatch+between+regexp+%2Dindices+and+switch+%2Dregexp+%2Dindexvar?V=11
QUERY_STRING  V=11
CONTENT_TYPE
DOCUMENT_URI  /revision/Mismatch+between+regexp+-indices+and+switch+-regexp+-indexvar
DOCUMENT_ROOT  /var/www/nikit/nikit/nginx/../docroot
SCGI  1
SERVER_PROTOCOL  HTTP/1.1
HTTPS  on
REMOTE_ADDR  172.70.126.42
REMOTE_PORT  13358
SERVER_PORT  4443
SERVER_NAME  wiki.tcl-lang.org
HTTP_HOST  wiki.tcl-lang.org
HTTP_CONNECTION  Keep-Alive
HTTP_ACCEPT_ENCODING  gzip, br
HTTP_X_FORWARDED_FOR  18.220.16.184
HTTP_CF_RAY  87e1a0ce8978e164-ORD
HTTP_X_FORWARDED_PROTO  https
HTTP_CF_VISITOR  {"scheme":"https"}
HTTP_ACCEPT  */*
HTTP_USER_AGENT  Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; [email protected])
HTTP_CF_CONNECTING_IP  18.220.16.184
HTTP_CDN_LOOP  cloudflare
HTTP_CF_IPCOUNTRY  US

Body


Error

Unknow state transition: LINE -> END

-code

1

-level

0

-errorstack

INNER {returnImm {Unknow state transition: LINE -> END} {}} CALL {my render_wikit {Mismatch between regexp -indices and switch -regexp -indexvar} {[TJE]
This little mismatch has bitten me in code, so I thought I'd shed a little more light on the matter for those who may not have picked up on it from the documentation...

The ranges placed in a switch statement's "-indexvar" target end at the index of the character AFTER the match.  This differs from the behavior of [regexp]'s "-indices" option, whose ranges end at the last matched character.

Here's a simple example:

  % set line {foo bar}
  foo bar
  % regexp -inline -indices {foo} $line
  {0 2}
  % switch -regexp -indexvar index -- $line {foo} {set index}
  {0 3}

As you can see, regexp reports the actual match (characters 0 through 2 match "foo"), whereas switch reports the match PLUS the character after (characters 0 through 3 match "foo ").

Don't get bitten!
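
One way to see which behaviour a given interpreter has is to run both commands on the same input and compare the end indices. The following is only a sketch: the proc name `indexvarEndOffset` is made up for illustration and is not part of Tcl.

```tcl
# Hypothetical helper (not part of Tcl): returns how far the end index
# reported by [switch -regexp -indexvar] lies beyond the end index
# reported by [regexp -indices] for the same pattern and string.
proc indexvarEndOffset {pattern str} {
    # Overall match range as regexp reports it.
    if {![regexp -indices -- $pattern $str viaRegexp]} {
        error "pattern does not match"
    }
    # Overall match range as switch reports it (first element of the list).
    switch -regexp -indexvar viaSwitch -- $str [list $pattern {}]
    set viaSwitch [lindex $viaSwitch 0]
    expr {[lindex $viaSwitch 1] - [lindex $viaRegexp 1]}
}

puts [indexvarEndOffset {foo} {foo bar}]
```

On an interpreter that behaves as described above, the offset would be 1; an offset of 0 would mean the two commands agree.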

----
Curiously, [TIP]#75 [http://tip.tcl.tk/75] (which seems to be the only one that specifies this feature) states (emphasis added):
    :   the new option `-indexvar` will also be provided which will name a variable into which a list of match indices (each a two item list of values ''in the same way that [[regexp -indices]]'' computes) will be placed
This rather suggests that the stated mismatch is a bug...

'''[DGP]''' Please have a look at the documentation for [regexp] and [switch].
[http://www.tcl.tk/man/tcl8.5/TclCmd/regexp.htm] [http://www.tcl.tk/man/tcl8.5/TclCmd/switch.htm].
It appears to me that the [switch] '''-indexvar''' option is operating exactly as
it is documented to do.  ''As a meta-comment, I think the Tracker is a much better
place to resolve questions like this than the wiki.''

Is it significant that tcl/tests/switch.test does, in fact, have tests
for use of -indexvar (in conjunction with -matchvar) and the tests appear
to be passing? Either the tests aren't testing -indexvar the way one would
think, the test writer ''cooked'' the tests so they would pass even
though not producing the real expected results, or the code is doing the
right thing, but has, perhaps, the wrong docs. See tcl/generic/tclCmdMZ.c,
function Tcl_SwitchObjCmd for the code which provides the [switch] functionality.

'''[male]''' - 2008-02-15 - For me, what the man page says is not really the point if it refers to the behaviour of regexp, which is different! Even documented behaviour can be buggy! In my eyes, both regexp-based features should behave the same, no matter what the man page says!
======
set string "The quick brown fox jumped over the lazy dogs."

set matches {}
set indexes {}

switch -regexp -matchvar matches -indexvar indexes -- $string {
  ^(.*)u([a-z]+)(.*)(o[a-z]+)(.*)\.  {
        puts "Found"
        puts " matches = .${matches}."
        puts " indexes = .${indexes}."
   }
  default {
        puts "string = $string"
  }
}

Found
 matches = .{The quick brown fox jumped over the lazy dogs.} {The quick brown fox j} mped { over the lazy d} ogs {}.
 indexes = .{0 46} {0 21} {22 26} {26 42} {42 45} {45 45}.
======

[DKF]: Match ranges are the same. Ending index not (one-off). File a bug.
<<br>>
Correction, not a bug. It's documented to be what it is. (Not saying whether it is "morally" right. Just not a bug ''per se''.)
----
[TJE] Note that I credit the documents (if indirectly) with correctness in this matter.  I don't LIKE the behavior, but it is, indeed, documented.  My code is fixed with a weird-looking little '-1' appendage.  Here $arg is the switched-upon value and $ipair is the extracted index pair I care about:

  set substring [string range $arg {*}$ipair-1]
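
The `{*}$ipair-1` form works because `$ipair-1` substitutes to something like `0 3-1`, `{*}` splits that into the two words `0` and `3-1`, and [string range] accepts `integer-integer` index arithmetic (Tcl 8.5 and later). A more explicit version of the same adjustment might look like the sketch below. The proc name `switchRange` and the sample values are illustrative assumptions, and the code assumes the end index is one past the match, as described on this page.

```tcl
# Hypothetical helper: extract the substring for an index pair taken from
# [switch -regexp -indexvar], assuming the end index is one past the match.
proc switchRange {str ipair} {
    lassign $ipair first pastEnd
    string range $str $first [expr {$pastEnd - 1}]
}

set arg {foo bar}
switch -regexp -indexvar indices -- $arg {
    foo { puts [switchRange $arg [lindex $indices 0]] }
}
```

If the assumption holds, this prints exactly the text matched by the pattern. Spelling out the subtraction costs a line, but it makes the off-by-one adjustment visible to the next reader.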

Whee!
----
!!!!!!
[Tcl syntax help]|[Arts and crafts of Tcl-Tk programming]
%| [Category String Processing] | [Category Example] |%
!!!!!!} regexp2} CALL {my render {Mismatch between regexp -indices and switch -regexp -indexvar} {...}} CALL {my revision {Mismatch between regexp -indices and switch -regexp -indexvar}} CALL {::oo::Obj5029839 process revision/Mismatch+between+regexp+%2Dindices+and+switch+%2Dregexp+%2Dindexvar} CALL {::oo::Obj5029837 process}

-errorcode

NONE

-errorinfo

Unknow state transition: LINE -> END
    while executing
"error $msg"
    (class "::Wiki" method "render_wikit" line 6)
    invoked from within
"my render_$default_markup $N $C $mkup_rendering_engine"
    (class "::Wiki" method "render" line 8)
    invoked from within
"my render $name $C"
    (class "::Wiki" method "revision" line 31)
    invoked from within
"my revision $page"
    (class "::Wiki" method "process" line 56)
    invoked from within
"$server process [string trim $uri /]"

-errorline

4