polymorphism

Polymorphism [L1 ] refers to the ability of some code to be used with multiple types of values. The word comes from Greek poly (many) and morph (form). Polymorphism is often associated with object-oriented programming (OOP), but it is also found in other techniques, such as functional programming (FP). Many of Tcl's built-in commands exhibit a sort of polymorphism due to everything is a string. Polymorphism relates to types and so is most often discussed in terms of static types. For Tcl and other dynamic languages these type descriptions are approximations, but knowing the typed versions helps shed light on the different variations. There are a number of different sorts of polymorphism:

Parametric Polymorphism

This form is common in FP languages, such as Haskell, and is present in C++, C#, and now Java in the form of generics. In this form the type of a term is allowed to contain variables which can be instantiated with specific types. For instance, in Haskell we can define a binary tree of any type as:

 data BinTree a = Empty | Leaf a | Branch (BinTree a) (BinTree a)

i.e. a binary tree with elements of any type "a" is empty, or consists of either a Leaf with a value of type "a", or a Branch with two trees. We can then instantiate this type, for instance:

 a       :: BinTree Integer                   -- An integer binary tree
 a       =  Branch (Leaf 12) (Branch (Leaf 9) (Leaf 14))

We can also define functions which operate over binary trees with any type of elements:

 size    :: BinTree a -> Int
 size Empty          = 0
 size (Leaf _)       = 1
 size (Branch l r)   = (size l) + (size r)

In its pure form as shown the polymorphic functions do not rely on any details of the parameterised type. Tcl exhibits similar behaviour to this due to its everything is a string philosophy (e.g. llength operates on any list, regardless of the "type" of elements it contains). While this owes more to Tcl being monotyped rather than explicit support for parametric polymorphism, the effect is the same.

Ad-hoc Polymorphism

Another common form of polymorphism is overloading; a form of ad-hoc polymorphism. In ad-hoc polymorphism a value is allowed to exhibit different behaviour under different types. In overloading the values are functions or methods which have different implementations for different types. For instance, in Haskell again, we can use a type-class to define a set of functions that can be instantiated for different types:

 class Collection c where
     size        :: c a -> Int
     elements    :: c a -> [a]
 instance Collection [] where
     size xs     = length xs
     elements xs = xs
 instance Collection BinTree where
     size Empty          = 0
     size (Leaf _)       = 1
     size (Branch l r)   = (size l) + (size r)
     elements Empty      = []
     elements (Leaf a)   = [a]
     elements (Branch l r)= (elements l) ++ (elements r)

Other typed languages also often allow you to overload a function based on the types of the parameters. For instance Java allows this:

 // Assume in a class context
 static int size(int elements[]) { return elements.length; }
 // Assume suitable definitions of Empty, Leaf and Branch classes
 static int size(Empty x)  { return 0; }
 static int size(Leaf x)   { return 1; }
 static int size(Branch x) { return size(x.left) + size(x.right); }

In addition, all OO languages allow overloading on the receiver object of some message. For instance, we can rewrite the above in XOTcl as:

 Class Empty
 Class Leaf -parameter item
 Class Branch -parameter {left right}
 Empty instproc size  {} { return 0 }
 Leaf instproc size   {} { return 1 }
 Branch instproc size {} { expr [[my left] size] + [[my right] size] }
 set a [Branch new -left  [Leaf new -item 12] \
                   -right [Branch new -left  [Leaf new -item 9] \
                                      -right [Leaf new -item 14]]]
 $a size ;# 3

Some OO languages, such as CLOS, feature a general form of overloading using multi-methods. Runtime type identification, such as Java's instanceof operator, and the introspection methods available in most/all Tcl OO extensions, allow for another form of ad-hoc polymorphism where you can explicitly examine types at runtime.

Overloading is essentially a dispatch mechanism so that lots of different functions with the same name can be called depending on their types. These different implementations might not share very much in common apart from the name, hence the term ad-hoc. For instance, in Java the "+" operator is overloaded to mean both addition and string concatenation. For this reason, overloading is sometimes not considered to be a "real" form of polymorphism.

Apart from overloading, another way to accomplish ad-hoc polymorphism is coercion, where you allow a value of one type to be coerced to another type which supports the given operation. Again, Tcl's everything is a string supports this type of polymorphism by different "types" having overlapping string representations. For instance, this allows foreach to operate on dicts by coercing them (internally) to lists:

 foreach {key value} $mydict { ... }

Ad-hoc and parametric polymorphism can be combined to create bounded parametric polymorphism where the type parameters are restricted by some constraints. For instance, in order to eliminate duplicates from a list we would need to be able to compare elements of the list for equality. Thus we required that our parametric types supply some overloaded equality predicate. In Haskell:

 removeDups        :: (Eq a) => [a] -> [a]
 removeDups []     = []
 removeDups (x:xs) = if (elem x xs) then removeDups xs else x:removeDups xs

(This says that removeDups is a function that takes a list of any type a in class Eq -- i.e. the type-class that defines the (==) function -- and returns another list of that type).

Subtype and Inclusion Polymorphism

A final form of polymorphism, again very popular in OO languages (indeed this is generally what is meant by "polymorphism" in OO), is that of subtype or inclusion polymorphism, where a value is allowed to belong to many types. Each supertype is more general than its subtypes, providing less information about instances. So, for instance, if A is a subtype of B then I can use instances of A anywhere that expects an instance of B (this is known as Liskov's Substitution Principle, LSP). OO languages often provide an inheritance mechanism which combines subtyping (which is about interfaces/types) and implementation reuse. More generally, there is the duck typing[L2 ] supported by many dynamic OO languages (e.g., Python, XOTcl, etc), which is strongly related to OCaml's structural subtyping. The basic idea is that a type A can be considered a subtype of a type B if it provides (at least) the same set of operations as type B, or in more dynamic languages, if it responds to the same set of messages. This is captured in the slogan "if it walks like a duck and quacks like a duck, it must be a duck."

Subtyping is often combined with overloading, so that implementations of some method in a subtype also restrict the arguments they take. There are three ways in which this can be done:

  • invariance: the subtype method takes exactly the same types as the supertype;
  • covariance: as well as narrowing the type of the receiver object (i.e. the implicit "self" or "this" parameter), the method also narrows the type of one or more arguments;
  • contravariance: the subtype method expands the types of its arguments to be more permissive.

In order to satisfy the aforementioned Liskov Substitution Principle methods in a subtype should be contravariant in their arguments, and covariant in their result type. For instance, consider this Java class hierarchy:

 class A {
     public B someMethod(C c) { ... }
 }
 class A1 extends A {
     public B1 someMethod(C1 c) { ... }
 }

In order to preserve LSP, we require that C1 be a superclass of C (or the same), and that B1 be a subclass of B (or the same). To see why this is the case, consider how these methods might be used polymorphically:

 // Calling the method
 A foo = new A1();
 foo.someMethod(new C()); // A1.foo must handle C, so C1 cannot be more restrictive
 B bar = foo.someMethod(new C()); // A1.foo must return a subtype of B 

See [L3 ] for an interesting discussion of why this is the case: is a Cow an Animal? Generally in a dynamic language like Tcl you would just define the method covariantly and then generate a runtime error for inappropriate inputs, but it's interesting to know the problems that arise if you try and statically type check some common OO situations.

On Understanding Types, Data Abstraction, and Polymorphism by Luca Cardelli and Peter Wegner [L4 ] categorises these forms of polymorphism into a tree:

                                +-- parametric
                +-- universal --|
                |               +-- inclusion/subtype
 polymorphism --|
                |               +-- overloading
                +--- ad-hoc ----|
                                +-- coercion

NEM Feel free to add comments/corrections etc.