'''[Strick]:''' In some languages, e.g. [Java], you can "[https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#intern()%|%intern]" a string, which gives you one unique "interned instance" of the string with that value. Here's a proc to do it in Tcl: ====== proc intern s { if { ! [info exists ::INTERN($s)] } { set ::INTERN($s) $s } set ::INTERN($s) } ====== I used Tcl a dozen years before needing this, but today i wrote a program that used many of the same long strings (file pathnames) in some Tcl lists. There's two different reasons you might want to intern a string: * To make string comparisons as fast as pointer comparisons * To save memory, when strings with the same value are used many times Most often you intern strings for the first reason, but in Tcl that can't work: there are no separate "value" and "address" comparisons. Today I interned for the second reason. [MS] remarks that Tcl has a shared pool of literal strings. Any code within [proc] bodies reuses literals from the shared pool. In a sense, all literal strings are already "interned", and this mechanism is mainly useful for strings that are generated at runtime. [Lars H]: remarks that all ''direct'' copies of a string are shared automatically (it's the same [Tcl_Obj]). For example, in the result of ====== proc lineindex {filename} { set res {} set F [open $filename r] while {![eof $F]} { lappend res [list $filename [tell $F]] gets $F ; # Skip ahead one line } close $F return $res } ====== there is one list element for every line of the file, but all of them share the Tcl_Obj for the $filename. [NEM] has interned strings for the first reason (fast comparisons). In Tcl, I'd do that like this: ====== set INTERN_COUNTER 0 proc intern str { if {![info exists ::INTERN($str)]} { set ::INTERN($str) [incr ::INTERN_COUNTER] } return $::INTERN($str) } ====== That returns a unique integer for each string and you can then do integer tests for equality. Interned strings are used often for programming language interpreters to represent symbols/ atoms/variables names to optimise for equality checks and hash-table lookups. ---- [JBR]: A quick look at tclExecute.c shows that the byte code interpreter in fact contains the trivial optimization of string compare for comparing an instance to itself. "tclExecute.c" line 2457 of 6089: ======c if (valuePtr == value2Ptr) { /* In the pure equality case, set lengths too for the checks below (or we could goto beyond it). */ iResult = s1len = s2len = 0; } else if ((valuePtr->typePtr == &tclByteArrayType) ====== Interning a string using method 1 above optimized string compare and memory usage. ---- [AK]: I have done Tcl-level interning of strings as a form of compression. Pathnames again, several thousand times the same ones. Think versions of the same file in a version control system, for example. Hm ... The abstract operation behind this is "moving the replicated parts of the data into a separate place". For relational databases this is normalization, creating a second table storing the replicated data using unique strings, and referencing them in the original table through a small key. ** See Also ** [Constant String Hack]: Another page about the same thing. <> String Processing | Performance