string length

'''`[string] length`''' returns the number of characters in a [value].



** Synopsis **

    :   '''string length''' ''string''



** Description **

A single character may require more than one byte of storage, so the length of
a string may be smaller than the number of bytes required to store it.  A
string in which every character requires only one byte of storage may be
represented internally as a '''[Tcl_Obj%|%ByteArray]'''.  `[binary format]`
creates such values, as does [read%|%reading] from a
[chan configure%|%binary-encoded] channel.  Because binary data is typically
represented as a string in which each character requires only one byte of
storage, `[string length]` is the right routine to use to get the size of
binary data.  `[string bytelength]`, a historical oddity, is deprecated.
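
As a minimal sketch of the difference between characters and bytes (the sample string is only an illustration, and UTF-8 is chosen here as an assumed storage encoding), `[encoding convertto]` yields a value with one character per encoded byte, so `string length` can count both:

======
# Illustrative sample: n, a, i-with-diaeresis, v, e.
set s "na\u00efve"

# Number of characters in the string.
set chars [string length $s]

# Convert to UTF-8; the result holds one character per encoded byte,
# so its length is the number of bytes the UTF-8 form occupies.
set utf8  [encoding convertto utf-8 $s]
set bytes [string length $utf8]

# Binary data: three one-byte integers produced by binary format.
set bin [binary format c3 {1 2 3}]

puts "chars=$chars bytes=$bytes binary=[string length $bin]"
# prints: chars=5 bytes=6 binary=3
======
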

The following `[alias]` has a name familiar from [C]:

======
interp alias {} strlen {} string length
======



** Implementing `string length` **

2003-10-17: In the [Tcl chatroom], some of us played around with silly pure-Tcl
implementations of `string length`:

======
proc strlen s {
    set n 0
    foreach char [split $s {}] {incr n}
    set n
} ;# RS


proc strlen s {
    llength [split $s {}]
} ;# AM


proc strlen string {
    regsub -all . $string +1 string
    expr 0$string
} ;# MS


interp alias {} strlen {} regexp -all . ;# MS, jcw


proc strlen string {
    expr 0[regsub -all . $string +1]
} ;# dkf


# The ''functional'' way:

proc strlen s {
    expr {[regexp {.(.*)} $s - s] ? (1+[strlen $s]) : 0}
} ;# EB
======


[ulis]: A ''recursive'' way:

======
proc strlen string {
    if {$string eq {}} {
        return 0
    } else {
        expr {[strlen [string range $string 1 end]] + 1}
    }
}
======

And the classical ''iterative'' way:

======
proc strlen {string} {
    set n 0
    while {$string ne {}} {
        set string [string range $string 1 end]
        incr n
    }
    return $n
}
======
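
Since every variant above is supposed to agree with the built-in, a quick comparison can catch mistakes. The following is only a sketch: `check_strlen` is a hypothetical helper name, the sample strings are arbitrary, and one of the implementations is repeated so that the block stands on its own.

======
# One of the implementations above, repeated so the sketch is self-contained.
proc strlen s {
    llength [split $s {}]
}

# check_strlen (hypothetical helper) returns the list of sample strings
# for which the command named by $cmd disagrees with string length.
proc check_strlen {{cmd strlen}} {
    set samples [list {} a hello "two\nlines" "na\u00efve" [string repeat x 100]]
    set failures {}
    foreach s $samples {
        if {[uplevel #0 [list $cmd $s]] != [string length $s]} {
            lappend failures $s
        }
    }
    return $failures
}

# An empty result means the implementation agreed on every sample.
puts [llength [check_strlen]]
======

Defining a different `strlen` and calling `check_strlen` again repeats the comparison for that variant.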

----

Powers of ten (works only for short strings):

======
proc strlen s {
    expr round(log10(1.[regsub -all . $s *10]))
} ;# RS
======

----

At times, people ask which is the best way to determine whether a string is the '''[empty string]''':

======
string equal x x$str

string equal {} $str

![string compare {} $str]

[string length $str] == 0

![string length $str]

$str eq {}
======

The `[string length]` and `[string equal]` forms are a bit quicker: the length
of a string is already known, and `[string equal]` checks whether the two
strings are the same size before comparing their contents.
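
To see how the variants behave on a particular installation, `[time]` can be used. This is a rough sketch, not a rigorous benchmark: the `empty_by_*` names are hypothetical wrappers around three of the tests listed above, and the timings will vary with Tcl version and platform.

======
# Hypothetical wrappers, each around one of the tests listed above.
proc empty_by_length str {expr {[string length $str] == 0}}
proc empty_by_equal  str {string equal {} $str}
proc empty_by_eq     str {expr {$str eq {}}}

# Try an empty string, a short string and a longer one.
foreach s [list {} x [string repeat x 1000]] {
    foreach cmd {empty_by_length empty_by_equal empty_by_eq} {
        # time reports the average microseconds per iteration.
        puts [format {%-16s len=%-5d empty=%d  %s} \
            $cmd [string length $s] [$cmd $s] [time {$cmd $s} 10000]]
    }
}
======

All three wrappers should report the same answer for a given string; only the timings differ.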



** [regexp] Bug **


'''Negative-length strings''': A bug (SF 230589) in [regexp] produces incredible consequences:

======none
% regexp {[ ]*(^|[^@])@} x@ m; puts [string length $m]
-109537
======

Numbers vary by platform; the above was Tcl 8.4.1 on Solaris. ([CMcC] via [RS])


<<categories>> Tcl syntax | Arts and Crafts of Tcl-Tk Programming | Command | String Processing