This VFS is based upon TclVFS, and allows accessing the content of a Compiled Html (CHM) file, as those used as help for Windows applications.
My intent was allowing the deployment of a whole site, by mean of a single file, yet available in such format. E.g. I don't want to download the whole html help for ActiveTcl, create a war, and deploy it. This comes from the need to distribute useful documentation among my development team, and also from the habit of downloading sites to my Palm, using Plucker and the like.
CHMvfs is not thoroughly tested, but worked fine with version ActiveTcl8.4.11.0.162119-html and ActiveTcl8.4.12.0.226725-html of documentation.
Main namespace is related to chm, as a filesystem. I'd like to add access to chm-specific resources, like search for indexes.
lzx namespace is a porting from chmtools, implemented by Matthew T. Russotto, whom I thank a lot, because his work was absolutely necessary to end this program. This LZX decruncher was pulled out of the program cabextract 0.2 by Stuart Caie <[email protected]> and modified to be useful as an LZX decruncher outside the context of CAB files. I simplified it even more.
The difference between CHMvfs and chmtools, is that I needed a Random Access, which I achieved using the block index at the end of the Reset Table: reading the whole file each time, uncompressing it, required a huge time.
If You like it, You can extend with all the things I forgot, and pulling all the loose strings. You can also use this as a starting point for a CABvfs.
I am not interested in using chm files other than for reading, so it is a read-only filesystems, like zips and other archives.
package provide vfs::chm 0.5 set ::verbose 0 package require vfs 1.0 package provide chmvfs 0.5 set ::vfs::debug 1 # Basic idea is having access to a site structure as saved in a chm file, then # serving it via a tcl-based http server. namespace eval vfs::chm { variable status array set status [list / /] } namespace eval ::lzx { variable bitbuf "" variable bufsiz 0 variable bufpos 0 variable status array set status [list / /] variable LZX_FRAME_SIZE 32768 variable LZX_CHECK_BLOCK [expr (32768 + 6144)*8] variable LZX_MIN_MATCH 2 variable LZX_MAX_MATCH 257 variable LZX_NUM_CHARS 256 #define LZX_BLOCKTYPE_VERBATIM (1) #define LZX_BLOCKTYPE_ALIGNED (2) #define LZX_BLOCKTYPE_UNCOMPRESSED (3) variable LZX_PRETREE_NUM_ELEMENTS 20 variable LZX_ALIGNED_NUM_ELEMENTS 8 variable LZX_NUM_PRIMARY_LENGTHS 7 variable LZX_NUM_SECONDARY_LENGTHS 249 variable LZX_LENGTH_MAXSYMBOLS [expr $LZX_NUM_SECONDARY_LENGTHS + 1] variable LZX_LENGTH_TABLEBITS 12 variable LZX_PRETREE_MAXSYMBOLS $LZX_PRETREE_NUM_ELEMENTS variable LZX_PRETREE_TABLEBITS 6 variable LZX_MAINTREE_MAXSYMBOLS [expr $LZX_NUM_CHARS + 50 * 8] variable LZX_MAINTREE_TABLEBITS 12 variable LZX_ALIGNED_MAXSYMBOLS $LZX_ALIGNED_NUM_ELEMENTS variable LZX_ALIGNED_TABLEBITS 7 variable extra_bits variable position_base # Initialize LZX lookup tables for { set i 0; set j 0} { $i < 51 } { incr i 2 } { # 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7... set extra_bits($i) $j set extra_bits([expr $i + 1]) $j # 0,0,1,2,3,4...15,16,17,17,17,17... if { ($i != 0) && ($j < 17) } { incr j } } for { set i 0; set j 0 } { $i < 51 } { incr i } { # 0,1,2,3,4,6,8,12,16,24,32,... set position_base($i) $j # 1,1,1,1,2,2,4,4,8,8,16,16,32,32,... incr j [expr 1 << $extra_bits($i)] } } proc vfs::chm::Mount {chm local} { if {![file exists $chm] || ![file isfile $chm] } { error "No such file $chm" } set fileHeader(size) 56 set fileHeader(structure) { {n type f a4} {n version f i} {n totLength f i} {n unknown f i} {n timestamp f i} {n langId f i} {n guid1 f h32} {n guid2 f h32}} set recordFormat "h32" set fl [::open $chm r] fconfigure $fl -translation binary set fmt "" set varList {} foreach headerElement $fileHeader(structure) { array set element $headerElement set fmt "$fmt$element(f)" set varList [concat $varList fileHeader($element(n))] } set content [read $fl $fileHeader(size)] eval binary scan \$content $fmt $varList if { $fileHeader(version) != 2 } { set hst(size) 40 set hst(structure) { {n offs0 f w} {n len0 f w} {n offs1 f w} {n len1 f w} {n offs2 f w}} } else { set hst(size) 32 set hst(structure) { {n offs0 f w} {n len0 f w} {n offs1 f w} {n len1 f w}} } set fmt "" set varList {} foreach headerElement $hst(structure) { array set element $headerElement set fmt "$fmt$element(f)" set varList [concat $varList hst($element(n))] } set content [read $fl $hst(size)] set offset $fileHeader(size) eval binary scan \$content $fmt $varList set hSect0(size) $hst(len0) set hSect0(structure) { {n unkn0 f H4} {n unkn1 f i} {n fileSize f w } {n unkn2 f i} {n unkn3 f i}} set fmt "" set varList {} foreach headerElement $hSect0(structure) { array set element $headerElement set fmt "$fmt$element(f)" set varList [concat $varList hSect0($element(n))] } set content [read $fl $hSect0(size)] eval binary scan \$content $fmt $varList set hSect1(size) 84 set hSect1(structure) { { n type f a4 } { n version f i } { n length f i } { n unkn0 f i } { n chunksz f i } { n density f i } { n depth f i } { n rtchunkno f i } { n PGML1 f i } { n PMGLls f i } { n unkn1 f i } { n dirChunks f i } { n wLangId f i } { n guid f h32 } { n len2 f i } { n unkn2 f i } { n unkn3 f i } { n unkn4 f i } } set fmt "" set varList {} foreach headerElement $hSect1(structure) { array set element $headerElement set fmt "$fmt$element(f)" set varList [concat $varList hSect1($element(n))] } seek $fl $hst(offs1) set content [read $fl $hSect1(size)] eval binary scan \$content $fmt $varList set offset [expr $hst(offs1) + $hSect1(length)] seek $fl $offset for {set chunk 0} { $chunk < $hSect1(dirChunks) } { incr chunk } { set chOffset 0 set content [read $fl $hSect1(chunksz)] incr offset $hSect1(chunksz) set fmt a4 eval binary scan \$content $fmt chunkType if { $chunkType == "PMGL" } { set begLstChunk(size) 20 set begLstChunk(structure) { { n type f a4 } { n length f i } { n zero f i } { n prev f i } { n next f i } } set fmt "" set varList {} foreach headerElement $begLstChunk(structure) { array set element $headerElement set fmt "$fmt$element(f)" set varList [concat $varList begLstChunk($element(n))] } eval binary scan \$content $fmt $varList set quickref [expr $chOffset + $hSect1(chunksz) - $begLstChunk(length)] set end [expr $chOffset + $hSect1(chunksz) - 2] set fmt s eval binary scan \$content @${end}$fmt begLstChunk(entryNum) lappend begLstChunk(structure) {n entryNum f s} set chunks($chunk) [array get begLstChunk] incr chOffset $begLstChunk(size) set chunkList {} for {set entryno 0} { $entryno < $begLstChunk(entryNum) } {incr entryno} { set chOffset [readEntry $content $chOffset entry] array set temp $entry set name [string trim $temp(NAME) /] if { $name == "/" } { set name "/" } set ::vfs::chm::tree${chm}($name) $entry lappend chunkList $entry } set chunks(lst$chunk) $chunkList } else { if { $chunkType == "PMGI" } { } else { error "Unknown chunk type $chunkType" } } } if { $fileHeader(version) != 2 } { set offset $hst(offs2) } else { incr offset $hSect1(chunksz) } close $fl if { $fileHeader(version) != 2 } { set contentStart $hst(offs2) } else { set contentStart $offset } set ::vfs::chm::tree${chm}(contentStart) $contentStart set nlfid [open $chm ::DataSpace/NameList r 438] set content [read $nlfid] close $nlfid binary scan $content ss len num set offset 4 for { set idx 0 } { $idx < $num } { incr idx } { binary scan $content @${offset}s fileNameLen incr offset 2 set beg $offset incr offset [expr 2 * $fileNameLen] if { $idx > 0 } { set fname [string range $content $beg $offset] set fname [join [split $fname "\0"] ""] array set sectData [set ::vfs::chm::tree${chm}(::DataSpace/Storage/$fname/Content)] incr sectData(OFFSET) $contentStart set ctrlfd [open $chm ::DataSpace/Storage/$fname/ControlData r 438] set control [read $ctrlfd] close $ctrlfd binary scan $control ia4iiiii wnum sectData(SIGN) sectData(VERSION) sectData(RSTINT) sectData(WINSZ) sectData(CACSZ) zero if { $sectData(VERSION) == 2 } { set sectData(WINSZ) [expr $sectData(WINSZ) * 32768] set sectData(RSTINT) [expr $sectData(RSTINT) * 32768] } switch $sectData(WINSZ) { 32768 { set sectData(WINBT) 15 } 65536 { set sectData(WINBT) 16 } 131072 { set sectData(WINBT) 17 } 262144 { set sectData(WINBT) 18 } 524288 { set sectData(WINBT) 19 } 1048576 { set sectData(WINBT) 20 } 2097152 { set sectData(WINBT) 21 } default { error "bad controldata window size" } } set ctrlFileName "::DataSpace/Storage/$fname/Transform/{7FC28940-9D31-11D0-9B27-00A0C91E9C7C}/InstanceData/ResetTable" set ctrlfd [open $chm $ctrlFileName r 438] set control [read $ctrlfd] close $ctrlfd binary scan $control iiiiwwwww sectData(unk) sectData(RSTBEN) sectData(RSTBSZ) sectData(RSTBLN) sectData(RSTBULN) sectData(RSTBCLN) sectData(RSTBBSZ) sectData(RSTBZRO) sectData(RSTBBND) array set rstTable [set ::vfs::chm::tree${chm}($ctrlFileName)] set sectData(RSTBLOFFS) $rstTable(OFFSET) set ::vfs::chm::tree${chm}(Section$idx) [array get sectData] } incr offset 2 } set ::vfs::chm::status($local) $chm set ::vfs::chm::tree${chm}(mountPoint) $local ::vfs::log "chm $chm mounted at $local" vfs::filesystem mount $local [list vfs::chm::handler $chm] vfs::RegisterMount $local [list vfs::chm::Unmount] return $local } proc vfs::chm::Unmount {local} { set chm $::vfs::chm::status($local) unset ::vfs::chm::tree$chm unset ::vfs::chm::status($local) vfs::filesystem unmount $local } proc vfs::chm::handler {chm cmd root relative actualpath args} { if {$cmd == "matchindirectory"} { eval [list $cmd $chm $relative $actualpath] $args } else { eval [list $cmd $chm $relative] $args } } # If we implement the commands below, we will have a perfect # virtual file system for Compiled HTML archives. proc vfs::chm::stat {chm name} { ::vfs::log "stat $name" if { $name == ""} { return [list type directory size 0 mode 0555 \ ino -1 depth 0 name $name atime 0 ctime 0 mtime 0 dev -1 \ uid -1 gid -1 nlink 1] } if {[catch "set ::vfs::chm::tree${chm}($name)" x]} { error "No such file: $x" } array set entry $x if { $entry(LENGTH) == 0 } { return [list type directory size 0 mode 0111 \ ino -1 depth 0 name $name atime 0 ctime 0 mtime 0 dev -1 \ uid -1 gid -1 nlink 1] } return [list type file size $entry(LENGTH) mode 0111 \ ino -1 depth 0 name $name atime 0 ctime 0 mtime 0 dev -1 \ uid -1 gid -1 nlink 1] } proc vfs::chm::access {chm name mode} { if {$mode & 2} { vfs::filesystem posixerror $::vfs::posix(EROFS) } if { $name == ""} { return 1 } if {[catch "set ::vfs::chm::tree${chm}($name)" x]} { vfs::filesystem posixerror $::vfs::posix(ENOENT) } return 1 } proc vfs::chm::exists {chm name} { if { $name == ""} { return 1 } if {[catch "set ::vfs::chm::tree${chm}($name)" x]} { vfs::filesystem posixerror $::vfs::posix(ENOENT) } return 1 } proc vfs::chm::matchindirectory {chm path actualpath pattern type} { set pattern [file join $path $pattern] set biggest [array name ::vfs::chm::tree${chm} $pattern] set root [set ::vfs::chm::tree${chm}(mountPoint)] set bigger [list] foreach p $biggest { if { [string match "${pattern}/*" $p ]} continue lappend bigger [file join $root $p] } #::vfs::log "got $newres" return [::vfs::matchCorrectTypes $type $bigger] } proc vfs::chm::open {chm name mode permissions} { switch -- $mode { "" - "r" { if {[catch "set ::vfs::chm::tree${chm}($name)" x]} { vfs::filesystem posixerror $::vfs::posix(ENOENT) } array set entry $x if { !$entry(LENGTH) } { # There are no empty files: they are folders! vfs::filesystem posixerror $::vfs::posix(EPERM) } set nfd [vfs::memchan] fconfigure $nfd -translation binary set chmfd [::open $chm r] fconfigure $chmfd -translation binary if { $entry(SECTION) } { # needs decompression array set sectData [set ::vfs::chm::tree${chm}(Section${entry(SECTION)})] set entryNum [expr $entry(OFFSET) / $sectData(RSTBBSZ) / 2] set entry(NUM) [expr $entryNum * 2] if { $entryNum > $sectData(RSTBEN) } { error "Entry $entryNum not present in Reset Table($sectData(RSTBEN))" } set entryOffset [expr [set ::vfs::chm::tree${chm}(contentStart)] + $sectData(RSTBLOFFS) + $sectData(RSTBLN) + $entry(NUM) * $sectData(RSTBSZ)] set fl [::open $chm r] fconfigure $fl -translation binary seek $fl $entryOffset set addr [read $fl $sectData(RSTBSZ)] if { $sectData(RSTBSZ) == 4 } { binary scan "$addr" i compOffset } else { binary scan "$addr" w compOffset } set entry(OFFS) $compOffset set skip [expr $entry(OFFSET) - $entry(NUM) * $sectData(RSTBBSZ)] seek $chmfd $sectData(OFFSET) seek $chmfd $entry(OFFS) current ::lzx::decompress $chmfd $nfd $entry(LENGTH) $skip $sectData(WINBT) } else { seek $chmfd [set ::vfs::chm::tree${chm}(contentStart)] start seek $chmfd $entry(OFFSET) current set data [read $chmfd $entry(LENGTH)] puts -nonewline $nfd $data } fconfigure $nfd -translation auto seek $nfd 0 return [list $nfd] } default { vfs::filesystem posixerror $::vfs::posix(EROFS) } } } proc vfs::chm::createdirectory {chm name} { #::vfs::log "createdirectory $name" vfs::filesystem posixerror $::vfs::posix(EROFS) } proc vfs::chm::removedirectory {chm name recursive} { #::vfs::log "removedirectory $name" vfs::filesystem posixerror $::vfs::posix(EROFS) } proc vfs::chm::deletefile {chm name} { #::vfs::log "deletefile $name" vfs::filesystem posixerror $::vfs::posix(EROFS) } proc vfs::chm::fileattributes {chm name args} { #::vfs::log "fileattributes $args" switch -- [llength $args] { 0 { # list strings return [list NAME SECTION OFFSET LENGTH] } 1 { # get value if {[catch "set ::vfs::chm::tree${chm}($name)" x]} { vfs::filesystem posixerror $::vfs::posix(ENOENT) } array set entry $x set index [lindex $args 0] set name [lindex [list NAME SECTION OFFSET LENGTH] $index] return $entry($name) } 2 { # set value set index [lindex $args 0] set val [lindex $args 1] vfs::filesystem posixerror $::vfs::posix(EROFS) } } } proc vfs::chm::utime {fd path actime mtime} { vfs::filesystem posixerror $::vfs::posix(EROFS) } proc vfs::chm::readEntry { content offset lst } { set l 0 set b -1 while { $b < 0 } { eval binary scan \$content @${offset}c1 b incr offset set l [expr ( $l << 7 ) | ($b & 127) ] } eval binary scan \$content @${offset}a$l n incr offset $l set s 0 set b -1 while { $b < 0 } { eval binary scan \$content @${offset}c1 b incr offset set s [expr ( $s << 7 ) | ($b & 127) ] } set o 0 set b -1 while { $b < 0 } { eval binary scan \$content @${offset}c1 b incr offset set o [expr ( $o << 7 ) | ($b & 127) ] } set l 0 set b -1 while { $b < 0 } { eval binary scan \$content @${offset}c1 b incr offset set l [expr ( $l << 7 ) | ($b & 127) ] } upvar $lst x set x [list NAME $n SECTION $s OFFSET $o LENGTH $l] return $offset } proc ::lzx::checkBuffer { fds } { variable bufpos variable bufsiz variable bitbuf # How many bytes to be skipped? set bigSkip [expr ($bufpos >> 3 ) & -2] # How many bits to be skipped? set smlSkip [expr $bufpos & 15] # In case of exhausted buffer, read some # We should have at least LZX_CHECK_BLOCK = ( LZX_FRAME_SIZE + 6k ) * 8 bits, # the maximum size an uncompressed block can achieve, but we # read a MegaByte at the time if { $bufsiz < $::lzx::LZX_CHECK_BLOCK } { set bytes [read $fds 1048576] set bitbuf [string range $bitbuf $bigSkip end] append bitbuf $bytes # Use real bytes length, in case of less input data incr bufsiz [expr [string length $bytes] * 8] set bufpos $smlSkip set bigSkip 0 } } proc ::lzx::bits { len } { variable bufpos variable bufsiz variable bitbuf # How many bytes to be skipped? set bigSkip [expr ($bufpos >> 3 ) & -2] # How many bits to be skipped? set smlSkip [expr $bufpos & 15] # I read an entire DWORD: it should be enough if { [binary scan $bitbuf @${bigSkip}ss num1 num2] < 2} { error "Exhausted input" } set num [expr ($num1 << 16) | ($num2 & 65535)] set res [expr $num >> ( 32 - $smlSkip - $len ) ] set res [expr $res & ( ( 1 << $len ) - 1 ) ] incr bufpos $len incr bufsiz -$len return $res } proc ::lzx::build_table { obj } { set nsyms [set ::lzx::LZX_${obj}_MAXSYMBOLS ] set nbits [set ::lzx::LZX_${obj}_TABLEBITS ] set table_mask [expr 1 << $nbits] set bit_mask [expr $table_mask >> 1] set next_symbol $bit_mask set bit_num 1 set pos 0 while { $bit_num <= $nbits } { for { set sym 0 } { $sym < $nsyms } { incr sym } { if { [set ::lzx::status(${obj}_len$sym)] == $bit_num } { set leaf $pos incr pos $bit_mask if { $pos > $table_mask } { error "Table overrun" } set fill $bit_mask while { $fill > 0 } { set ::lzx::status(${obj}_table$leaf) $sym incr fill -1 incr leaf } } } set bit_mask [expr $bit_mask >> 1] incr bit_num } if { $pos != $table_mask } { for { set sym $pos } { $sym < $table_mask } { incr sym } { set ::lzx::status(${obj}_table$sym) 0 } set pos [expr $pos << 16] set table_mask [expr $table_mask << 16] set bit_mask $::lzx::LZX_FRAME_SIZE while { $bit_num <= 16 } { for { set sym 0 } { $sym < $nsyms } { incr sym } { if { [set ::lzx::status(${obj}_len$sym)] == $bit_num } { set leaf [expr $pos >> 16] for { set fill 0 } { $fill < $bit_num - $nbits } { incr fill } { if { $::lzx::status(${obj}_table$leaf) == 0 } { set s [expr $next_symbol << 1] set ::lzx::status(${obj}_table$s) 0 incr s set ::lzx::status(${obj}_table$s) 0 set ::lzx::status(${obj}_table$leaf) $next_symbol incr next_symbol } set leaf [expr $::lzx::status(${obj}_table$leaf) << 1] if { ($pos >> (15-$fill)) & 1 } { incr leaf } } set ::lzx::status(${obj}_table$leaf) $sym incr pos $bit_mask if { $pos > $table_mask } { error "table overflow" } } } set bit_mask [expr $bit_mask >> 1] incr bit_num } } if { $pos == $table_mask } return } proc ::lzx::read_huffsym { obj } { set mp $::lzx::bufpos set ms $::lzx::bufsiz set bits [set ::lzx::LZX_${obj}_TABLEBITS ] set r [::lzx::bits $bits] set i $::lzx::status(${obj}_table$r) if { $i >= [set ::lzx::LZX_${obj}_MAXSYMBOLS ] } { set j [expr 1 << (32 - $bits)] set go 1 while { $go || $i >= [set ::lzx::LZX_${obj}_MAXSYMBOLS ] } { set j [expr $j >> 1] set i [expr $i << 1] set i [expr $i | [::lzx::bits 1]] if {!$j} { error "Illegal data" } set go 0 set i $::lzx::status(${obj}_table$i) } } set j $::lzx::status(${obj}_len$i) set ::lzx::bufpos $mp set ::lzx::bufsiz $ms incr ::lzx::bufpos $j incr ::lzx::bufsiz -$j return $i } proc ::lzx::read_lengths { obj first last } { for { set x 0 } { $x < 20 } { incr x } { set ::lzx::status(PRETREE_len$x) [::lzx::bits 4] } build_table PRETREE for { set x $first } { $x < $last} {} { set z [::lzx::read_huffsym PRETREE] if { $z == 17 } { set y [::lzx::bits 4]; incr y 4 while { $y} { incr y -1 set ::lzx::status(${obj}_len$x) 0 incr x } } else { if { $z == 18 } { set y [::lzx::bits 5]; incr y 20 while { $y} { incr y -1 set ::lzx::status(${obj}_len$x) 0 incr x } } else { if { $z == 19 } { set y [::lzx::bits 1]; incr y 4 set z [::lzx::read_huffsym PRETREE] set z [expr $::lzx::status(${obj}_len$x) - $z] if { $z < 0 } { incr z 17 } while { $y} { incr y -1 set ::lzx::status(${obj}_len$x) $z incr x } } else { set z [expr $::lzx::status(${obj}_len$x) - $z] if { $z < 0 } { incr z 17 } set ::lzx::status(${obj}_len$x) $z incr x } } } } } proc ::lzx::decompress {infd askfd len skip wndbit } { variable bufsiz variable bufpos variable bitbuf set bufsiz 0 set bufpos 0 set bitbuf "" if { $wndbit == 20 } { set posn_slots 42 } else { if { $wndbit == 21 } { set posn_slots 50 } else { set posn_slots [expr $wndbit * 2] } } if { $skip } { set outfd [vfs::memchan] fconfigure $outfd -translation binary } else { set outfd $askfd } # decompress as much as needed, but outputs only the content, skip previous # the idea is to start with initial values: LZX status is not used nor needed set window_posn 0 set last_window_posn 0 set window_size [expr 1 << $wndbit] set bufsiz 0 set main_elements [expr $::lzx::LZX_NUM_CHARS + ($posn_slots << 3)] set todo [expr $len + $skip] while { $todo > 0 } { ::lzx::checkBuffer $infd if { $window_posn == $window_size } { set window_posn 0 } if { $window_posn == 0 } { set header_read 0 for { set i 0 } { $i < $::lzx::LZX_MAINTREE_MAXSYMBOLS } { incr i } { set ::lzx::status(MAINTREE_len$i) 0 } for { set i 0 } { $i < $::lzx::LZX_LENGTH_MAXSYMBOLS } { incr i } { set ::lzx::status(LENGTH_len$i) 0 } set R0 1; set R1 1; set R2 1 } if { !$header_read } { set i 0; set j 0 set k [::lzx::bits 1] set intel_filesize 0 if { $k } { set intel_filesize [::lzx::bits 32] } set header_read 1 } set block_type [::lzx::bits 3] set block_remaining [::lzx::bits 24] set block_length $block_remaining switch $block_type { 1 { # LZX_BLOCKTYPE_VERBATIM read_lengths MAINTREE 0 256 read_lengths MAINTREE 256 $main_elements build_table MAINTREE if { [set ::lzx::status(MAINTREE_len232)] != 0} { set intel_started 1 } read_lengths LENGTH 0 $::lzx::LZX_NUM_SECONDARY_LENGTHS build_table LENGTH } 2 { # LZX_BLOCKTYPE_ALIGNED for { set i 0 } { $i < 8 } { incr i } { set ::lzx::status(ALIGNED_len$i) [::lzx::bits 3] } build_table ALIGNED read_lengths MAINTREE 0 256 read_lengths MAINTREE 256 $main_elements build_table MAINTREE if { [set ::lzx::status(MAINTREE_len232)] != 0} { set intel_started 1 } read_lengths LENGTH 0 $::lzx::LZX_NUM_SECONDARY_LENGTHS build_table LENGTH } 3 { # LZX_BLOCKTYPE_UNCOMPRESSED set intel_started 1 set b [expr $::lzx::bufpos & 15] if { $b } { set b [expr 16 - $b] incr bufpos $b incr bufsiz -$b } set R0 [::lzx::bits 32] set R1 [::lzx::bits 32] set R2 [::lzx::bits 32] } default { error "Illegal block type $block_type" } } set this_run $block_remaining while { $this_run > 0 && $todo > 0 } { if { $this_run > $todo } { set this_run $todo } incr todo -$this_run incr block_remaining -$this_run set last_window_posn $window_posn set window_posn [expr $window_posn & ( $window_size - 1 )] if { ($window_posn + $this_run) > $window_size } { error "Invalid format" } switch $block_type { 1 { # LZX_BLOCKTYPE_VERBATIM while { $this_run > 0 } { set main_element [read_huffsym MAINTREE] if { $main_element < $::lzx::LZX_NUM_CHARS } { puts -nonewline $outfd [binary format c $main_element] incr window_posn incr this_run -1 } else { incr main_element -$::lzx::LZX_NUM_CHARS set match_length [expr $main_element & $::lzx::LZX_NUM_PRIMARY_LENGTHS] if { $match_length == $::lzx::LZX_NUM_PRIMARY_LENGTHS } { incr match_length [read_huffsym LENGTH] } incr match_length $::lzx::LZX_MIN_MATCH set match_offset [expr $main_element >> 3] if { $match_offset > 2 } { if { $match_offset != 3 } { set extra $::lzx::extra_bits($match_offset) set verbatim_bits [::lzx::bits $extra] set match_offset [expr $::lzx::position_base($match_offset) - 2 + $verbatim_bits] } else { set match_offset 1 } set R2 $R1; set R1 $R0; set R0 $match_offset } elseif { $match_offset == 0 } { set match_offset $R0 } elseif { $match_offset == 1 } { set match_offset $R1 set R1 $R0; set R0 $match_offset } else { set match_offset $R2 set R2 $R0; set R0 $match_offset } set length $match_length while { $length } { set rundest [tell $outfd] seek $outfd -$match_offset end set match [read $outfd $length] seek $outfd $rundest puts -nonewline $outfd $match set l [string length $match] incr length -$l } incr window_posn $match_length incr this_run -$match_length } if { ($window_posn % 32768 ) == 0 && $window_posn != 0 } { set b [expr $::lzx::bufpos & 15] if { $b } { set b [expr 16 - $b] incr ::lzx::bufpos $b incr bufsiz -$b } } } } 2 { # LZX_BLOCKTYPE_ALIGNED while { $this_run > 0 } { set main_element [read_huffsym MAINTREE] if { $main_element < $::lzx::LZX_NUM_CHARS } { puts -nonewline $outfd [binary format c $main_element] incr window_posn incr this_run -1 } else { incr main_element -$::lzx::LZX_NUM_CHARS set match_length [expr $main_element & $::lzx::LZX_NUM_PRIMARY_LENGTHS] if { $match_length == $::lzx::LZX_NUM_PRIMARY_LENGTHS } { incr match_length [read_huffsym LENGTH] } incr match_length $::lzx::LZX_MIN_MATCH set match_offset [expr $main_element >> 3] if { $match_offset > 2 } { # It is not a repeated offset set extra $::lzx::extra_bits($match_offset) set match_offset [expr $::lzx::position_base($match_offset) - 2] if { $extra > 3 } { incr extra -3 set verbatim_bits [::lzx::bits $extra] incr match_offset [expr $verbatim_bits << 3] set aligned_bits [::lzx::read_huffsym ALIGNED] incr match_offset $aligned_bits } elseif { $extra == 3 } { set aligned_bits [::lzx::read_huffsym ALIGNED] incr match_offset $aligned_bits } elseif { $extra > 0 } { set verbatim_bits [::lzx::bits $extra] incr match_offset $verbatim_bits } else { set match_offset 1 } set R2 $R1; set R1 $R0; set R0 $match_offset } elseif { $match_offset == 0 } { set match_offset $R0 } elseif { $match_offset == 1 } { set match_offset $R1 set R1 $R0; set R0 $match_offset } else { set match_offset $R2 set R2 $R0; set R0 $match_offset } set length $match_length while { $length } { set rundest [tell $outfd] seek $outfd -$match_offset end set match [read $outfd $length] seek $outfd $rundest puts -nonewline $outfd $match set l [string length $match] incr length -$l } incr window_posn $match_length incr this_run -$match_length } if { ($window_posn % 32768 ) == 0 && $window_posn != 0 } { set b [expr $::lzx::bufpos & 15] if { $b } { set b [expr 16 - $b] incr bufpos $b incr bufsiz -$b } } } } 3 { # LZX_BLOCKTYPE_UNCOMPRESSED set tt [expr $::lzx::bufpos / 8] set part [string range $::lzx::bitbuf $tt [expr $tt + $this_run - 1]] puts -nonewline $outfd $part incr ::lzx::bufpos [expr $this_run * 8] incr bufsiz [expr -$this_run * 8] incr window_posn $this_run if { $::lzx::bufpos & 8 } { incr ::lzx::bufpos 8 incr bufsiz -8 } } default { error "Illegal block type $block_type" } } set this_run $block_remaining } } if { $skip } { seek $outfd $skip set tmp [read $outfd $len] puts -nonewline $askfd $tmp } } proc ::vfs::chm::test {} { ::vfs::chm::Mount c:/Tcl/doc/ActiveTclHelp8.4.chm c:/chm cd c:/chm set d [glob -types d -directory c:/chm -tails ActiveTcl8.*] cd $d puts [pwd] puts [join [lsort [glob *]] \n] catch "file mkdir test" err puts $err set fid [::open c:/chm/$d/aspn.css r] set content [read $fid] close $fid set fid [::open c:/aspn.css w] fconfigure $fid -translation binary puts $fid $content close $fid cd c:/ ::vfs::chm::Unmount c:/chm }
This is just a DustMote modification to browse the CHM file as a site.
package require vfs::chm #source chmvfs.tcl ::vfs::chm::Mount c:/Tcl/doc/ActiveTclHelp8.4.chm c:/chm set d [glob -types d -directory c:/chm -tails ActiveTcl8.*] set root "c:/chm/$d" set default "at.toc.html" set port 80 proc bgerror {trouble} {puts stdout "bgerror: $trouble"} proc answer {socketChannel host2 port2} { fileevent $socketChannel readable [list readIt $socketChannel] } proc readIt {socketChannel} { global root default fconfigure $socketChannel -blocking 0 set gotLine [gets $socketChannel] if { [fblocked $socketChannel] } then {return} fileevent $socketChannel readable "" set shortName "/" regexp {/[^ ]*} $gotLine shortName set many [string length $shortName] set last [string index $shortName [expr $many-1] ] if {$last=="/"} then {set shortName $shortName$default } set wholeName $root$shortName set err "" puts "Serving $wholeName" if [catch {set fileChannel [open $wholeName RDONLY] } err ] { puts $socketChannel "HTTP/1.0 404 Not found" puts $socketChannel "" puts $socketChannel "<html><head><title><No such URL.></title></head>" puts $socketChannel "<body><center>" puts $socketChannel "The URL you requested does not exist on this site." puts $socketChannel "</center>$err</body></html>" puts $err } else { fconfigure $fileChannel -translation binary fconfigure $socketChannel -translation binary -buffering full puts $socketChannel "HTTP/1.0 200 OK" puts $socketChannel "" set work [read $fileChannel] puts $socketChannel $work close $fileChannel } close $socketChannel } socket -server answer $port vwait forEver
AK: A number of questions.
DAG: A number of answers.
DDG: Very interesting work. However I can't read the aspn.css file. Getting the "illegal block type 4" error. The file exists however as the glob results shows. Any suggestions?
DAG: How unfortunate! May I know which version of file do you use? I am afraid the versions I own have only VERBATIM and ALIGNED blocks, so UNCOMPRESSED block part could be not tested at all.
Anyway, last night I found that there is still a bug related to Random Access: it seems probably something about odd blocks. I am afraid I access those with some misalignment. I am looking into it: next tests will copy all files from archive to another location.
Another thing I fixed is the buffer reset at the beginning of decompress proc.
DDG Thanks for your response. I use the file D:/ActiveTcl8.4.2/doc/ActiveTclHelp.chm. I can give you a link for downloading "my" chm file. It would be nice to use chmvfs for my dgHelpBrowser like the metakit files.
DAG: That's a good idea. Have a try with this last version: it works fine on both my versions, but for XOTcl pdf files. I'm working on it, but I think that probably it is still related to UNCOMPRESSED blocks. Even chmtools seems to have some problems with those pdf files, though.
I also added a small modification of DustMote, which I use to browse the CHM file as a site. In case it is not working for You, let me know where I can download Your file from.
DDG It now works also for my chm file. Great! I will see that I update dgHelpBrowser for reading chm-files on Unixes,Win and OS-X etc. I once tried to adjust chmlib for this purpose using Swig but I failed. Thanks.
DAG: I thought I found the last problem in UNCOMPRESSED blocks handling, realigning to 16-bit boundaries after each of them, if needed. Nonetheless, it seems that pdf extracted are not always identical to stored files. Still investigating.
DAG: I did it. At last I found the bug in UNCOMPRESSED blocks handling. Nevertheless, I discovered a .chm that puzzled me again. Filenames are stored in all lower-cases, but referred also with upper cases. This means that I need to add an option to mount the VFS in both modes: case-sensitive or case-insensitive. I think that, coming from Windows, the latter should be the default.