A didactic implementation of SHA-256 in pure Tcl. It follows exactly the indication of FIPS180-2 document.
LM 2006-02-13:
Note that all variables are unsigned 32-bit integers and all calculations have to be performed modulo 2^32.
A conservative programming technique was adopted; some optimisations are possible.
There is the possibility of calculating the sha256 hash value with a multistep strategy (see the following Long Message test).
Comments and suggestions are appreciated.
AK: Note that the CVS head of Tcllib has an implementation of sha224/256 as well.
(See sha2)
# sha-256.tcl # SHA-256 hash algorithm # # Lino Monaco - Feb 2006 # _________________________________________ proc sha256_init {} { global K global h0 h1 h2 h3 h4 h5 h6 h7 # Set the SHA-256 constants set K { 0x428a2f98 0x71374491 0xb5c0fbcf 0xe9b5dba5 0x3956c25b 0x59f111f1 0x923f82a4 0xab1c5ed5 0xd807aa98 0x12835b01 0x243185be 0x550c7dc3 0x72be5d74 0x80deb1fe 0x9bdc06a7 0xc19bf174 0xe49b69c1 0xefbe4786 0x0fc19dc6 0x240ca1cc 0x2de92c6f 0x4a7484aa 0x5cb0a9dc 0x76f988da 0x983e5152 0xa831c66d 0xb00327c8 0xbf597fc7 0xc6e00bf3 0xd5a79147 0x06ca6351 0x14292967 0x27b70a85 0x2e1b2138 0x4d2c6dfc 0x53380d13 0x650a7354 0x766a0abb 0x81c2c92e 0x92722c85 0xa2bfe8a1 0xa81a664b 0xc24b8b70 0xc76c51a3 0xd192e819 0xd6990624 0xf40e3585 0x106aa070 0x19a4c116 0x1e376c08 0x2748774c 0x34b0bcb5 0x391c0cb3 0x4ed8aa4a 0x5b9cca4f 0x682e6ff3 0x748f82ee 0x78a5636f 0x84c87814 0x8cc70208 0x90befffa 0xa4506ceb 0xbef9a3f7 0xc67178f2 } # ... and initial hash value set h0 0x6a09e667 set h1 0xbb67ae85 set h2 0x3c6ef372 set h3 0xa54ff53a set h4 0x510e527f set h5 0x9b05688c set h6 0x1f83d9ab set h7 0x5be0cd19 } # SHA-256 logical functions ___________________________________________________ proc Ch {x y z} { return [expr {($x & $y) ^ (~($x) & $z)}] } proc Maj {x y z} { return [expr {($x & $y) ^ ($x & $z) ^ ($y & $z)}] } proc SIGMA_0 x { set a [expr {(($x >> 2) & 0x3FFFFFFF) | (($x << (32 - 2)) & 0xFFFFFFFF)}] set b [expr {(($x >> 13) & 0x0007FFFF) | (($x << (32 - 13)) & 0xFFFFFFFF)}] set c [expr {(($x >> 22) & 0x000003FF) | (($x << (32 - 22)) & 0xFFFFFFFF)}] return [expr {$a ^ $b ^$c}] } proc SIGMA_1 x { set a [expr {(($x >> 6) & 0x03FFFFFF) | (($x << (32 - 6)) & 0xFFFFFFFF)}] set b [expr {(($x >> 11) & 0x001FFFFF) | (($x << (32 - 11)) & 0xFFFFFFFF)}] set c [expr {(($x >> 25) & 0x0000007F) | (($x << (32 - 25)) & 0xFFFFFFFF)}] return [expr {$a ^ $b ^$c}] } proc sigma0 x { set a [expr {(($x >> 7) & 0x01FFFFFF) | (($x << (32 - 7)) & 0xFFFFFFFF)}] set b [expr {(($x >> 18) & 0x00003FFF) | (($x << (32 - 18)) & 0xFFFFFFFF)}] set c [expr {($x >> 3) & 0x1FFFFFFF}] return [expr {$a ^ $b ^$c}] } proc sigma1 x { set a [expr {(($x >> 17) & 0x00007FFF) | (($x << (32 - 17)) & 0xFFFFFFFF)}] set b [expr {(($x >> 19) & 0x00001FFF) | (($x << (32 - 19)) & 0xFFFFFFFF)}] set c [expr {($x >> 10) & 0x003FFFFF}] return [expr {$a ^ $b ^$c}] } #______________________________________________________________________________ proc sha256_pad {msg len} { # Padding function: works only with messages that have a byte-aligned length # "len" is the total bytes length of whole message # append the value 0x80 to message append msg [binary format c 0x80] # append "0" bits until the message length is equal to 64 - 8 - 1 bytes set mlen [expr {($len + 8 + 1) % 64}] while {$mlen < 64} { append msg [binary format c 0x0] incr mlen } # append a 64-bits big-endian integer giving the original message length (in bits) append msg [binary format W [expr {$len*8}]] return $msg } proc sha256_round msg { global K global h0 h1 h2 h3 h4 h5 h6 h7 # Divide the message into 32-bits integer binary scan $msg I* crunch set len [llength $crunch] # Work 16 integers at a time for {set i 0} {$i < $len} {incr i 16} { # Prepare the message scheduler set W {} for {set j 0} {$j < 16} {incr j} { lappend W [lindex $crunch [expr {$i + $j}]] } for {set j 16} {$j < 64} {incr j} { set W15 [lindex $W [expr {$j - 15}]] set W2 [lindex $W [expr {$j - 2}]] set W16 [lindex $W [expr {$j - 16}]] set W7 [lindex $W [expr {$j - 7}]] set s0 [sigma0 $W15] set s1 [sigma1 $W2] lappend W [expr {($W16 + $s0 + $W7 + $s1) & 0xFFFFFFFF}] } # Initialize the working variables set a $h0 set b $h1 set c $h2 set d $h3 set e $h4 set f $h5 set g $h6 set h $h7 # 64 Hash rounds for {set j 0} {$j < 64} {incr j} { set s0 [SIGMA_0 $a] set maj [Maj $a $b $c] set t0 [expr {($s0 + $maj) & 0xFFFFFFFF}] set s1 [SIGMA_1 $e] set ch [Ch $e $f $g] set Kj [lindex $K $j] set Wj [lindex $W $j] set t1 [expr {($h + $s1 + $ch + $Kj + $Wj) & 0xFFFFFFFF}] set h $g set g $f set f $e set e [expr { ($d + $t1) & 0xFFFFFFFF}] set d $c set c $b set b $a set a [expr {($t0 + $t1) & 0xFFFFFFFF}] } # Compute the intermediate hash value set h0 [expr {($h0 + $a) & 0xffffffff}] set h1 [expr {($h1 + $b) & 0xffffffff}] set h2 [expr {($h2 + $c) & 0xffffffff}] set h3 [expr {($h3 + $d) & 0xffffffff}] set h4 [expr {($h4 + $e) & 0xffffffff}] set h5 [expr {($h5 + $f) & 0xffffffff}] set h6 [expr {($h6 + $g) & 0xffffffff}] set h7 [expr {($h7 + $h) & 0xffffffff}] } } proc sha256_end {} { global h0 h1 h2 h3 h4 h5 h6 h7 # format the hashing value set h0 [format %08X $h0] set h1 [format %08X $h1] set h2 [format %08X $h2] set h3 [format %08X $h3] set h4 [format %08X $h4] set h5 [format %08X $h5] set h6 [format %08X $h6] set h7 [format %08X $h7] return "$h0 $h1 $h2 $h3 $h4 $h5 $h6 $h7" } proc sha256 msg { # glue all the work sha256_init set msg [sha256_pad $msg [string bytelength $msg]] sha256_round $msg sha256_end } # Test vectors and results set msg abc puts "One block message test: $msg" puts "[sha256 $msg]\n" set msg abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq puts "Multi-block message test: $msg" puts [sha256 $msg]\n # In case of very long messages, the hash calculation can be executed # in more steps. # The message has to be divided into 64 bytes pieces calling repeatedly # "sha256_round" procedure. The last time giving as input parameter the # result of "sha256_pad". # # The padding procedure has to be called with the last bytes of the message # if < 64 or an empty string in case of a message with a length multiple of # 64 bytes. puts "Long Message test:" puts "a string which consists of 1,000,000 repetition of the character 'a'" set msg [string repeat a 64] sha256_init # 15625 * 64 = 1,000,000 for {set i 0} {$i < 15625} {incr i} { sha256_round $msg } sha256_round [sha256_pad {} 1000000] puts [sha256_end]\n
Test results:
One block message test: abc BA7816BF 8F01CFEA 414140DE 5DAE2223 B00361A3 96177A9C B410FF61 F20015AD Multi-block message test: abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq 248D6A61 D20638B8 E5C02693 0C3E6039 A33CE459 64FF2167 F6ECEDD4 19DB06C1 Long Message test: a string which consists of 1,000,000 repetition of the character 'a' CDC75E5C 9914FB92 81A1C7E2 84D73E67 F1809A48 A497200E 046D39CC C7112CD0
LM 2006-02-14
This is an optimised version of SHA-256 hash algorithm implementation: less code and it runs faster on my test machine.
time {sha256 abc} 1000 672 microseconds per iteration
vs.
766 microseconds per iteration
of the original version.
# sha-256.tcl # SHA-256 hash algorithm optimised implementation # # Lino Monaco - Feb 2006 # _______________________________________________ proc sha256_init {} { global K global H # Set the SHA-256 constants set K { 0x428a2f98 0x71374491 0xb5c0fbcf 0xe9b5dba5 0x3956c25b 0x59f111f1 0x923f82a4 0xab1c5ed5 0xd807aa98 0x12835b01 0x243185be 0x550c7dc3 0x72be5d74 0x80deb1fe 0x9bdc06a7 0xc19bf174 0xe49b69c1 0xefbe4786 0x0fc19dc6 0x240ca1cc 0x2de92c6f 0x4a7484aa 0x5cb0a9dc 0x76f988da 0x983e5152 0xa831c66d 0xb00327c8 0xbf597fc7 0xc6e00bf3 0xd5a79147 0x06ca6351 0x14292967 0x27b70a85 0x2e1b2138 0x4d2c6dfc 0x53380d13 0x650a7354 0x766a0abb 0x81c2c92e 0x92722c85 0xa2bfe8a1 0xa81a664b 0xc24b8b70 0xc76c51a3 0xd192e819 0xd6990624 0xf40e3585 0x106aa070 0x19a4c116 0x1e376c08 0x2748774c 0x34b0bcb5 0x391c0cb3 0x4ed8aa4a 0x5b9cca4f 0x682e6ff3 0x748f82ee 0x78a5636f 0x84c87814 0x8cc70208 0x90befffa 0xa4506ceb 0xbef9a3f7 0xc67178f2 } # ... and initial hash value set H {0x6a09e667 0xbb67ae85 0x3c6ef372 0xa54ff53a \ 0x510e527f 0x9b05688c 0x1f83d9ab 0x5be0cd19} } # SHA-256 logical functions ___________________________________________________ proc Ch {x y z} { return [expr {($x & $y) ^ (~($x) & $z)}] } proc Maj {x y z} { return [expr {($x & $y) ^ ($x & $z) ^ ($y & $z)}] } proc SIGMA_0 x { return [expr {((($x >> 2) & 0x3FFFFFFF) | ($x << (32 - 2))) ^ ((($x >> 13) & 0x0007FFFF) | ($x << (32 - 13))) ^ ((($x >> 22) & 0x000003FF) | ($x << (32 - 22))) & 0xffffffff }] } proc SIGMA_1 x { return [expr {((($x >> 6) & 0x03FFFFFF) | ($x << (32 - 6))) ^ ((($x >> 11) & 0x001FFFFF) | ($x << (32 - 11))) ^ ((($x >> 25) & 0x0000007F) | ($x << (32 - 25))) & 0xffffffff }] } proc sigma0 x { return [expr {((($x >> 7) & 0x01FFFFFF) | ($x << (32 - 7))) ^ ((($x >> 18) & 0x00003FFF) | ($x << (32 - 18))) ^ (($x >> 3) & 0x1FFFFFFF) & 0xffffffff}] } proc sigma1 x { return [expr {((($x >> 17) & 0x00007FFF) | ($x << (32 - 17))) ^ ((($x >> 19) & 0x00001FFF) | ($x << (32 - 19))) ^ (($x >> 10) & 0x003FFFFF) & 0xffffffff}] } #______________________________________________________________________________ proc sha256_pad {msg len} { # Padding function: works only with messages that have a byte-aligned length # "len" is the total bytes length of whole message # append the value 0x80 to message # append msg [binary format c 0x80] # append "0" bits until the message length is equal to 64 - 8 - 1 bytes # set mlen [expr {64 - (($len + 8 + 1) % 64)}] append msg [string repeat [binary format c 0x00] $mlen] # append a 64-bits big-endian integer giving the original message length (in bits) # append msg [binary format W [expr {$len*8}]] return $msg } proc sha256_round msg { global K global H # Divide the message into 32-bits integer binary scan $msg I* crunch set len [llength $crunch] # Work 16 integers at a time for {set i 0} {$i < $len} {incr i 16} { # Prepare the message scheduler # set W {} for {set j 0} {$j < 16} {incr j} { lappend W [lindex $crunch [expr {$i + $j}]] } for {set j 16} {$j < 64} {incr j} { lappend W [expr {( [lindex $W [expr {$j - 16}]] + [sigma0 [lindex $W [expr {$j - 15}]]] + [lindex $W [expr {$j - 7}]] + [sigma1 [lindex $W [expr {$j - 2}]]] ) & 0xffffffff }] } # Initialize the working variables foreach {a b c d e f g h} $H {} # 64 Hash rounds for {set j 0} {$j < 64} {incr j} { set t0 [expr {([SIGMA_0 $a] + [Maj $a $b $c]) & 0xffffffff}] set t1 [expr {( $h + [SIGMA_1 $e] + [Ch $e $f $g] + [lindex $K $j] + [lindex $W $j]) & 0xffffffff}] foreach {h g f e d c b a} [list $g $f $e [expr { ($d + $t1) & 0xFFFFFFFF}] \ $c $b $a [expr {($t0 + $t1) & 0xFFFFFFFF}]] {} } # Compute the intermediate hash value set H1 {} foreach hh $H tt [list $a $b $c $d $e $f $g $h] { lappend H1 [expr {($hh + $tt) & 0xffffffff}] } set H $H1 } } proc sha256_end {} { global H # format the hashing value set res {} foreach hh $H { append res [format {%08X } $hh] } return $res } proc sha256 msg { # glue all the work sha256_init set msg [sha256_pad $msg [string bytelength $msg]] sha256_round $msg sha256_end } # Test vectors and results set msg abc puts "One block message test: $msg" puts [sha256 $msg]\n set msg abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq puts "Multi-block message test: $msg" puts [sha256 $msg]\n puts {Long Message test:} puts {a string which consists of 1,000,000 repetition of the character 'a'} set msg [string repeat a 64] sha256_init # 15625 * 64 = 1,000,000 for {set i 0} {$i < 15625} {incr i} { sha256_round $msg } sha256_round [sha256_pad {} 1000000] puts [sha256_end]\n
RML: 2006-02-22
I found this code very helpful but I ran into two issue:
1) I found that when using multi-byte characters, this code fails. This was fixed by making the following change in the sha256 procedure:
set msg [sha256_pad $msg [string bytelength $msg]]
to:
set msg [sha256_pad $msg [string length $msg]]
2) In the sha256_pad procedure, this line:
append msg [binary format W [expr {$len*8}]]
only works in Tcl 8.4+ since that's when the W option was introduced. For folks, like myself, using earlier versions, this can be replaced by this line to get things working:
append msg [binary format II 0 [expr {$len*8}]]
PT: Let us reiterate - this algorithm is also implemented in tcllib where is has undergone the usual round of testing with tcl 8.2 and above. Also the tcllib implementations have been quite heavily optimised in pure Tcl and will additionaly make use of C compiled extensions where available (although we don't have one for this particular algorithm this is generally the case).