Error processing request

Error

Unknow state transition: LINE -> END

-code

1

-level

0

-errorstack

INNER {returnImm {Unknow state transition: LINE -> END} {}} CALL {my render_wikit {Tcl invoke performance} {**Summary**
See also [Tcl IO performance], where we reached the

'''Interim conclusion''' The time spent in the original benchmark seems to be decomposable in
   * 8% in startup and loop overhead (0.19/2.5)
   * 20% in command call and setting a variable in the called command (0.50/2.50) ''Further experimentation with a C-coded command that just returns TCL_OK shows that the cost of setting the variable is almost negligible''
   * 72% doing the file access, reading and conversion within [gets] itself
   * the cost of setting the variable in the called command seems to be below the measurable threshold for the larger dataset (read1a actually faster than read2 - the difference ''must'' be noise)

----
**Full Results, v1**
Further experiments
   * on a core instrumented with code developed together with [GPS] [https://sourceforge.net/tracker/index.php?func=detail&aid=1828178&group_id=10894&atid=310894]
   * running a tiny do-almost-nothing script (see below)
   * using a command '''empty''' defined in C to just ''return TCL_OK;''

provide the following timings for the most-exercised opcodes (measured in cpu ticks at 1.6MHz):

======
8.5a6 time: 1822167696, count: 20004698
----------------------------------------------------
op         %T       avgT         %ops         Nops
6       24.53     446.87         5.00      1000323      INST_INVOKE_STK1
80      22.68     413.32         5.00      1000019      INST_LIST_INDEX
29      10.04      91.47        10.00      2000048      INST_INCR_SCALAR1_IMM
105      9.77      44.52        20.00      4000452      INST_START_CMD
103      7.58     138.16         5.00      1000004      INST_LIST_INDEX_IMM
10       7.20      26.22        25.00      5000750      INST_LOAD_SCALAR1
17       5.78      52.67        10.00      2000236      INST_STORE_SCALAR1
47       5.23      95.32         5.00      1000001      INST_LT
1        4.19      38.11        10.00      2001039      INST_PUSH1
3        2.95      53.83         5.00      1000302      INST_POP
======
''Note'': later measurements were done on 5M instead of 1M runs. If you want to compare the total runtime, this first batch should be considered to take 9110838480 clicks (1822167696*5)


***Striking observations***
(These require an explanation)

Taking the fastest opcode as comparison basis:

   1. INST_LOAD_SCALAR1 is the fastest opcode (faster than INST_PUSH1 and INST_POP!), INST_STORE_SCALAR1 is pretty fast too
   1. [lindex] is amazingly slow - the non-immediate version is as slow as a command invocation
   1. command invocations are expensive (remark that only '''empty''' is invoked in the loop)
   1. comparisons (INST_LT) and basic arithmetic (INST_INCR_SCALAR1_IMM) are amazingly slow when compared to basic variable access
   1. the "pure loss" INST_START_CMD is amazingly slow

(do note that INST_LOAD_SCALAR1 provides a ''generous'' upper bound on the cost of opcode dispatch - and the possible savings when improving that part only)
***Test Script***
The script being run is
======
lappend auto_path /home/CVS/emptyFunc/
package require empty

exec /usr/bin/taskset -p 0x00000001 [pid]

proc main N {
    set y 0
    set a [list foo boo moo]
    for {set i 0} {$i < $N} {incr i} {
	empty 1
	incr y
	set z [lindex $a 1]
	set z 1
	lindex $a $z
    }
}

if {[llength $argv]} {
    main [lindex $argv 0]
} else {
    main 1000000
}
======

----
**Full Results, v2**
2007-11-09 Committed patch #1829248 [https://sourceforge.net/tracker/index.php?func=detail&aid=1829248&group_id=10894&atid=310894] that caches some frequently accessed TSD fields in the [ekeko]. '''INST_INVOKE_STK1 goes down from 503 ticks to 306''' (same measurements as above, did it get slower in the meantime??). New timings:

======

8.5a6 time: 8187552172, count: 100004702
----------------------------------------------------
op         %T       avgT         %ops         Nops
80      25.07     410.45         5.00      5000019      INST_LIST_INDEX
6       18.63     305.00         5.00      5000323      INST_INVOKE_STK1
29      11.37      93.12        10.00     10000048      INST_INCR_SCALAR1_IMM
105      9.67      39.58        20.00     20000453      INST_START_CMD
10       8.20      26.85        25.00     25000750      INST_LOAD_SCALAR1
47       7.35     120.43         5} regexp2} CALL {my render {Tcl invoke performance} {(same page text as in the render_wikit call above)}} CALL {my revision {Tcl invoke performance}} CALL {::oo::Obj5489392 process revision/Tcl+invoke+performance} CALL {::oo::Obj5489390 process}

-errorcode

NONE

-errorinfo

Unknow state transition: LINE -> END
    while executing
"error $msg"
    (class "::Wiki" method "render_wikit" line 6)
    invoked from within
"my render_$default_markup $N $C $mkup_rendering_engine"
    (class "::Wiki" method "render" line 8)
    invoked from within
"my render $name $C"
    (class "::Wiki" method "revision" line 31)
    invoked from within
"my revision $page"
    (class "::Wiki" method "process" line 56)
    invoked from within
"$server process [string trim $uri /]"

-errorline

4