Purpose: Definition of set functionality for an official extension.

This is a part of the Chart of proposed list functionality.

See also A set of Set operations.

See also Tcllib's module struct, package struct::set

- AK (1999, Mar 17/18)
- Had to revamp all the Set pages due to the removal of clutter from various scripts and the introduction of a new way of doing things, which had to be included into the timings, of course. This new way of operation uses the local namespace of a procedure as an implicit array, fast to retrieve, and fast to set. This principle was fastest among all currently tested functions and implementation variants. Its disadvantage is however that no set must contain the names of the internal variables required in the computation itself, or incorrect results will occur. This should be avoidable, if documented properly, but is nevertheless an incitement to switch to a C implementation in the future.

- Determine membership of an element in a list.
- Compute the union of 2 sets represented by 2 lists.
- Compute the intersection of 2 sets represented by 2 lists.
- Compute the symmetric difference of 2 sets represented by 2 lists (
**symmetric difference**being the set containing those elements which are in one but not the other, the inverse of intersection.). - Compute the onesided difference of 2 sets represented by 2 lists (
**onesided difference**being the set containing those elements which are in the first but not the second.). - Check for an empty list, eh, set.

Sometimes intersection and both onesided differences are required. A procedure computing them in parallel should be more efficient than doing it separately.

- A procedure to compute intersection and both onesided differences.

[AK] (Mar 20, 1999): The timing of the possible implementations proved me wrong, this procedure is not necessary.

- Remark
- The first implementation should use lists to represent sets, as they can be transfered by value. A C implementation OTOH should define a new Tcl_Obj-Type.

- Name of the extension
- SetOps

Implementation:

- Namespace
- ::setops

General features:

- Sets are represented on the tcl-level by lists. A C implementation does not exist yet, but may employ hash-tables to gain performance.

Functions:

- SetOps, Create ::setops::create elements
- SetOps, Contains ::setops::contains set element
- SetOps, Union ::setops::union set1 set2...
- SetOps, Intersection ::setops::intersection set1 set2...
- SetOps, Difference ::setops::diff set1 set2
- SetOps, Symmetrical difference ::setops::symdiff set1 set2
- SetOps, Emptiness ::setops::empty set
- SetOps, Intersect3 ::setops::intersect3 set1 set2

Code:

- SetOps, Code, 7.6 -- no inline documentation yet.
- SetOps, Code, 8.x -- no inline documentation yet.
- SetOps, Code, 8.x v2 -- by MS.
- C implementation available [L1 ]

Testcases:

**not yet**

Discussion area.

LV: Are there other list oriented languages from whom we can 'borrow' set functionality ? Perl's idea of adopting concepts found in other languages, done in the spirit of perl, is one that could really work in Tcl as well. Mar 20, 1999

*Edrx*: Icon has a built-in "set" type, with at least these built-in operators and functions (see [L2 ] and [L3 ]):

S1 ++ S2 set union; returns a set S1 ** S2 set intersection S1 -- S2 set difference *S1 the number of elements in S1 ?S returns a random element from S set(L) makes a set from a list member(S,x) fails if x is not a member of S, else returns x insert(S,x,...) insert x,... in S; return the new set delete(S,x,...) delete x,... from S; return the new set

There's also "!S", that "generates" each element of S, in no predictable order, but the concept of "generators" is too Icon-specific. If it's a matter of borrowing syntax, I would suggest using something like S++, ..., Smember... or like Set::++, ... . Nov 19, 1999

*DKF*: Membership test should be a boolean operator, and not something with a failure semantics (given correctly typed input.)

*escargo 11/11/2002*: Failure is not a bad thing in Icon. Icon has no boolean type! Expressions either success and return a value or fail. This is key to making backtracking work. It is not failure in the same sense as other languages might throw an exception. Languages that do not implement backtracking need failure less, but for Icon it is very much central.

AK: The **contains** command as defined above returns a boolean. Wrappers could be provided to make porting Icon apps easier.

*DKF*: Hmm. Reading comprehension is obviously not my strong point!

-- AK

http://web.uvic.ca/~erempel/tcl/sets/sets.html is another Tcl extension that provides set operations.

Obviously, a lot more has been done about set operations than I thought. But there will probably one subject left: infinite sets. I wrote a fairly lengthy page on finite sets and countably infinite sets (that part still has to be finished). Look for the page: Manipulating sets in Tcl. Arjen Markus