'''tarray - Typed Array extension for Tcl'''

Project is hosted at SourceForge [http://sourceforge.net/projects/tarray]. The latest download is V0.6, which adds new commands, bug fixes, and more parallelized operations. The web site and documentation are at http://tarray.sourceforge.net

**Overview**

The tarray extension implements a new Tcl collection data type - the typed array - and the associated commands `column` and `table`. A typed array stores elements of a specific data type in native format. The primary motivation for this extension is efficient memory utilization and speed of certain operations in applications dealing with very large numbers of elements. This is achieved through native storage formats and parallelization of operations on multi-core CPUs. See the benchmarks below.

The tarray extension was inspired in part by [Speed Tables] and to a lesser extent by [TclRal]. The philosophy behind tarray is to provide efficient facilities on top of which more sophisticated data structures, possibly customized for specific applications, can be easily scripted and experimented with. Therefore, unlike [Speed Tables], tarray does not require creation and recompilation of a new extension for each table definition. Moreover, tarray provides value-based semantics so that columns and tables can be used as basic building blocks. Additional facilities that [Speed Tables] provides, like remote access, are expected to be implemented at the script level.

**Benchmarks**

As expected, the benefits of native format and parallelization can be significant (more than two orders of magnitude for searches) as shown in the benchmarks below.

***Sorting***

The `lsort` column is the baseline showing the performance of `lsort` for each data type (using the `-real`, `-integer` etc. options). The next two columns show tarray performance without parallelization and parallelized to 2 threads. The numbers in parentheses show performance relative to the `lsort` baseline.

======
Version: 0.6
Run date: Thu Aug 14 10:20:06 IST 2014

    Size     Type      lsort    singlethread     multithread
   10000  strings       4159    3776 (1.10)     2367 (1.76)
   10000  any           4104    3787 (1.08)     2675 (1.53)
   10000  doubles       3087    1377 (2.24)      895 (3.45)
   10000  ints          2888    1100 (2.62)      733 (3.94)
  100000  strings      85702   55741 (1.54)    40372 (2.12)
  100000  any          84130   74371 (1.13)    63067 (1.33)
  100000  doubles      47588   17283 (2.75)    10354 (4.60)
  100000  ints         44223   13729 (3.22)     7770 (5.69)
 1000000  strings    1407869  816009 (1.73)   635882 (2.21)
 1000000  any        1379399 1204986 (1.14)  1015875 (1.36)
 1000000  doubles     781096  214341 (3.64)   132427 (5.90)
 1000000  ints        743424  148959 (4.99)    93574 (7.94)
======

***Searching***

Similar to the above, with the `lsearch` baseline using `-all` together with `-exact`, `-integer` or `-real` as appropriate for the data type.

======
    Size     Type    lsearch    singlethread      multithread
   10000  strings        200      48 (4.17)        82 (2.44)
   10000  any            238     270 (0.88)       253 (0.94)
   10000  doubles      10414      30 (347.13)     126 (82.65)
   10000  ints          1460      12 (121.67)      53 (27.55)
  100000  strings       6088     631 (9.65)       521 (11.69)
  100000  any           5756    8453 (0.68)      4926 (1.17)
  100000  doubles      92572     253 (365.90)     197 (469.91)
  100000  ints         22501     123 (182.93)     132 (170.46)
 1000000  strings      71039    5345 (13.29)     4315 (16.46)
 1000000  any          74739  107890 (0.69)     70007 (1.07)
 1000000  doubles     951024    2115 (449.66)    1164 (817.03)
 1000000  ints        280801     986 (284.79)     558 (503.23)
======
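As a quick illustration of the column-level commands the benchmarks above exercise, here is a minimal sketch. The `tarray::column create` and `tarray::column sort` forms are assumed from the project documentation; the `search` call mirrors the one used in the real-world example below. Check http://tarray.sourceforge.net for the exact syntax and options.

======
package require tarray

# Create a typed column of integers from a Tcl list.
set c [tarray::column create int {7 3 11 3 42 -5}]

# Sorted copy of the column. Columns are values, so $c itself
# is left unchanged (assuming sort returns a new column).
set sorted [tarray::column sort $c]

# Indices of all elements greater than 5, analogous to
# lsearch -all -integer on a plain Tcl list.
set matches [tarray::column search -all -gt $c 5]
======

Because columns have value semantics, the results above are new values; nothing is modified in place.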
***Real world example***

The following example compares storing the geographical data for cities from http://geonames.org as a list, as a [sqlite] in-memory database, and as a tarray table. The data set has just over 142000 records. The corresponding cities table definition is shown below.

======
tarray::table create {
    geonameid  int
    name       string
    country    string
    latitude   double
    longitude  double
    population wide
    elevation  int
}
======

The table is queried for all cities with more than a million people. Memory usage and timing are shown below. Even from the memory usage point of view, lists are not suitable for more than a few hundred thousand records. Note that tarray uses twice as much memory as sqlite but is two orders of magnitude faster on the search. '''This is not to say sqlite and tarray are comparable! sqlite is a database, tarray is not!''' Nevertheless, for use as an in-memory data structure, tarray can replace some uses of sqlite.

[APN] is somewhat surprised by the difference in speed. Any hints on optimizing the sqlite query would be appreciated. It would also have been nice to compare [Speed Tables], but it does not build on Windows and I do not have a Linux benchmarking host.

======
list results:
    Virtual Bytes: 112 MB
    Working Set:   111 MB
    Page file:     112 MB

sqlite results: {db eval {select name from geo where population > 1000000}}
    Virtual Bytes: 17 MB
    Working Set:    9 MB
    Page file:      9 MB
    23142.2 microseconds per iteration

tarray results: {table get -columns name $tab [column search -all -gt [table column $tab population] 1000000]}
    Virtual Bytes: 40 MB
    Working Set:   18 MB
    Page file:     19 MB
    187.7 microseconds per iteration
======

----

[SEH] -- It would be useful to be able to do dumps of raw binary data once a TArray has been constructed. Then one could, for example, use it with a [reflected channel] to duplicate the function of [memchan]. Or to do device I/O. (Is this already possible?)

[APN] Direct I/O of files/databases to and from typed arrays is on the to-do list but low down for a couple of reasons. First, there are still a bunch of basic operations and optimizations that have to be implemented to make the package more useful as a building block. Second, it is not clear what the output/input format should be. Even plain binary dumps raise questions of endianness etc.

----

'''[AK] - 2013-04-18 22:18:04'''

Tcllib already has a memchan emulation based on reflected channels. Documentation @ https://core.tcl.tk/tcllib/doc/trunk/embedded/www/tcllib/files/modules/virtchannel_base/tcllib_memchan.html

[SEH] -- Since that package is pure Tcl, I was hoping a TArray-based solution would be faster.

[AK] -- What then makes this different from the original memchan?

[SEH] -- Nothing at all, except that the author states above that TArray is intended to be a modular base for a range of specific solutions. If the function of memchan could be duplicated, that would be one less purpose-specific package to maintain, replaced by a flexible multi-purpose tool. I think it would be healthier for the Tcl ecosystem to have fewer of the former and more of the latter, for an equivalent range of applications.

<<categories>> Command | Package | Data Structure