Base64 is a binary-to-text encoding that uses only ASCII characters.
Base64 represents arbitrary binary data using only printable ASCII characters so the data can be passed through channels that are not 8-bit clean.
Like uuencode, Base64 uses 4 characters to represent every 3 bytes of binary data, but it uses a different table to map the data into printable characters. Base64-encoded data therefore takes about one-third more space than the original data, plus any padding and line breaks.
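The size relationship can be checked with a few lines of Tcl. This is only a sketch: it assumes Tcl 8.6 or later (for binary encode base64), and b64_length is a made-up helper name, not part of any package.

```tcl
# Requires Tcl 8.6 or later for [binary encode base64].

# Hypothetical helper: length of the padded Base64 text for n input bytes.
proc b64_length {n} {
    # Every started group of 3 bytes becomes 4 output characters.
    return [expr {4 * (($n + 2) / 3)}]
}

# Compare the prediction with what the built-in encoder produces.
foreach n {1 2 3 4 30 300} {
    set encoded [binary encode base64 [string repeat x $n]]
    puts [format "%4d bytes -> %4d characters (predicted %d)" \
        $n [string length $encoded] [b64_length $n]]
}
```

For inputs whose length is not a multiple of 3, the padding makes the overhead slightly more than one-third.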
Base64 is used in MIME and by the Tk image commands.
BAS: There's also a derivative of base64 commonly referred to as base64url, which essentially just makes the base64-encoded string viable for inclusion as a query parameter in a URL. This is used in JWT. From what I understand, it replaces the + encoded char with - and the / with _, and omits the = padding. So, I believe it would be something like (using the Tcl 8.6 binary command):
set base64url [string map {+ - / _ = {}} [binary encode base64 $string]] ;# untested
And of course, decoding would just be the reverse process.
ray2501: Thanks to BAS for the information. Here is my version of base64url encode and decode.
proc base64url_encode string {
    tailcall string map {+ - / _ = {}} [binary encode base64 $string]
}

proc base64url_decode string {
    tailcall binary decode base64 [string map {- + _ /} $string]
}
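A round trip might look like the following sketch. It assumes Tcl 8.6 or later; the names b64url_enc and b64url_dec are illustrative only. The decoder here puts the = padding back before decoding, which is an extra precaution for decoders that insist on padded input, not something the procs above are known to need. Text outside ISO 8859-1 should be converted to bytes (for example UTF-8) before encoding.

```tcl
# Requires Tcl 8.6 or later. Proc names are illustrative only.

proc b64url_enc {bytes} {
    # Standard Base64, then swap the two URL-unsafe characters and drop padding.
    string map {+ - / _ = {}} [binary encode base64 $bytes]
}

proc b64url_dec {str} {
    # Undo the character swap.
    set str [string map {- + _ /} $str]
    # Restore '=' padding so the length is a multiple of 4 again.
    set pad [expr {(4 - [string length $str] % 4) % 4}]
    binary decode base64 $str[string repeat = $pad]
}

# Two bytes that encode to "+/8=" in standard Base64.
set raw [binary format H* fbff]
puts [b64url_enc $raw]                       ;# -_8

binary scan [b64url_dec [b64url_enc $raw]] H* hex
puts $hex                                    ;# fbff

# Text is converted to UTF-8 bytes first, and back again after decoding.
set token [b64url_enc [encoding convertto utf-8 "caf\u00e9"]]
puts [encoding convertfrom utf-8 [b64url_dec $token]]
```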
2006-07-23: RS could not resist doing base64 encoding as a little weekend fun project, like this:
proc b64en str {
    binary scan $str B* bits
    switch [expr {[string length $bits] % 6}] {
        0 {set tail {}}
        2 {append bits 0000; set tail ==}
        4 {append bits 00; set tail =}
    }
    return [string map {
        000000 A 000001 B 000010 C 000011 D
        000100 E 000101 F 000110 G 000111 H
        001000 I 001001 J 001010 K 001011 L
        001100 M 001101 N 001110 O 001111 P
        010000 Q 010001 R 010010 S 010011 T
        010100 U 010101 V 010110 W 010111 X
        011000 Y 011001 Z 011010 a 011011 b
        011100 c 011101 d 011110 e 011111 f
        100000 g 100001 h 100010 i 100011 j
        100100 k 100101 l 100110 m 100111 n
        101000 o 101001 p 101010 q 101011 r
        101100 s 101101 t 101110 u 101111 v
        110000 w 110001 x 110010 y 110011 z
        110100 0 110101 1 110110 2 110111 3
        111000 4 111001 5 111010 6 111011 7
        111100 8 111101 9 111110 + 111111 /
    } $bits]$tail
}
On short test strings, it delivers the same results as "the real thing". What's missing is insertion of whitespace for longer strings. In any case, I think this implementation is pretty educational...
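The missing line wrapping could be added as a separate step. The following is a minimal sketch, not part of RS's code: b64_wrap is a hypothetical helper that breaks an already encoded string into lines of at most 76 characters, the limit MIME uses. The built-in binary encode base64 (Tcl 8.6) and base64::encode in tcllib both offer a -maxlen option for the same purpose.

```tcl
# Hypothetical helper: split an encoded string into lines of at most
# $width characters, joined by newlines (no trailing newline).
proc b64_wrap {encoded {width 76}} {
    set lines {}
    set len [string length $encoded]
    for {set i 0} {$i < $len} {incr i $width} {
        lappend lines [string range $encoded $i [expr {$i + $width - 1}]]
    }
    return [join $lines \n]
}

# 100 characters of sample text: one line of 76 and one of 24.
puts [b64_wrap [string repeat ABCD 25]]
```

It could be applied to the result of b64en, as in b64_wrap [b64en $data].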
Jannis 2009-12-08: Wrote a base64 decoder in the same manner as above. No warranties, works for me and I learned a lot. Doesn't honor "wrap characters", just as the encoder above. Many thanks to RS for the encoder!
proc b64de str {
    set tail [expr {[string length $str] - [string length [string trimright $str =]]}]
    set str [string trimright $str =]
    set bits [string map {
        A 000000 B 000001 C 000010 D 000011
        E 000100 F 000101 G 000110 H 000111
        I 001000 J 001001 K 001010 L 001011
        M 001100 N 001101 O 001110 P 001111
        Q 010000 R 010001 S 010010 T 010011
        U 010100 V 010101 W 010110 X 010111
        Y 011000 Z 011001 a 011010 b 011011
        c 011100 d 011101 e 011110 f 011111
        g 100000 h 100001 i 100010 j 100011
        k 100100 l 100101 m 100110 n 100111
        o 101000 p 101001 q 101010 r 101011
        s 101100 t 101101 u 101110 v 101111
        w 110000 x 110001 y 110010 z 110011
        0 110100 1 110101 2 110110 3 110111
        4 111000 5 111001 6 111010 7 111011
        8 111100 9 111101 + 111110 / 111111
    } $str]
    set bytes [binary format B* $bits]
    return [string range $bytes 0 end-$tail]
}
DominicErnst 2011-12-06: I just tried Jannis' small base64 decoder, and it's not working properly! The problem is that the decoder does not remove the 2- or 4-bit tail at the end; in some cases, this deletes the last character of the decoded string. I fixed this issue with my own version of b64de as follows:
proc b64de {_str} {
    set nstr [string trimright $_str =]
    set dstr [string map {
        A 000000 B 000001 C 000010 D 000011
        E 000100 F 000101 G 000110 H 000111
        I 001000 J 001001 K 001010 L 001011
        M 001100 N 001101 O 001110 P 001111
        Q 010000 R 010001 S 010010 T 010011
        U 010100 V 010101 W 010110 X 010111
        Y 011000 Z 011001 a 011010 b 011011
        c 011100 d 011101 e 011110 f 011111
        g 100000 h 100001 i 100010 j 100011
        k 100100 l 100101 m 100110 n 100111
        o 101000 p 101001 q 101010 r 101011
        s 101100 t 101101 u 101110 v 101111
        w 110000 x 110001 y 110010 z 110011
        0 110100 1 110101 2 110110 3 110111
        4 111000 5 111001 6 111010 7 111011
        8 111100 9 111101 + 111110 / 111111
    } $nstr]
    switch [expr {[string length $_str]-[string length $nstr]}] {
        0 {#nothing to do}
        1 {set dstr [string range $dstr 0 {end-2}]}
        2 {set dstr [string range $dstr 0 {end-4}]}
    }
    return [binary format B* $dstr]
}
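The same idea can be written without spelling out the table and without counting the = characters. This is only an illustrative sketch under stated assumptions, not a replacement for either version above: b64de_sketch is a made-up name, the table is generated from the standard alphabet, whitespace ("wrap characters") is stripped first, and the leftover 2 or 4 bits are dropped by keeping only whole bytes. It does no validation of malformed input.

```tcl
# Hypothetical decoder sketch; no error checking on invalid characters.
proc b64de_sketch {str} {
    set alphabet ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/

    # Build the character -> 6-bit-string map instead of writing it out.
    set map {}
    for {set i 0} {$i < 64} {incr i} {
        binary scan [binary format c $i] B8 bits
        lappend map [string index $alphabet $i] [string range $bits 2 end]
    }

    # Drop whitespace used for line wrapping, and the '=' padding.
    set str [string map [list \n {} \r {} " " {} \t {} = {}] $str]

    # Each remaining character contributes 6 bits.
    set bits [string map $map $str]

    # Keep whole bytes only; this discards the 2- or 4-bit tail.
    set whole [expr {[string length $bits] / 8 * 8}]
    return [binary format B* [string range $bits 0 [expr {$whole - 1}]]]
}

puts [b64de_sketch "YQ=="]       ;# a
puts [b64de_sketch "YWI="]       ;# ab
puts [b64de_sketch "YW\nJj"]     ;# abc
```

The first call is the case that trips up the earlier decoder: two characters give 12 bits, of which only the first 8 belong to the data.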
There are incompatibilities between various base64 packages and versions of Tcl. CL has pursued this most. The root problem is a change in the semantics of binary. Tcl 8.0 with the base64 in tcllib 0.8 and before was definitely bad. This thread [L2 ] details this aspect of the change from 2.1 to 2.2 of the base64.tcl source code.
Note that there is some base64 decoding support built into Tk, which is different code from what is in tcllib. There may be other bits around as well. This can be relevant when fixing bugs, etc.
Older versions of Critcl also had a Base64 implementation, see "ascenc" in critlib and a critical mindset about policy