Tcl's AREs (Advanced Regular Expressions) are discussed on the wiki page Regular Expressions.
Technically, a regular expression is an ingenious notation (invented by Kleene in the 1950s) for specifying a regular language. (A "language" in this terminology is a set of strings, very often an infinite set of strings. See http://en.wikipedia.org/wiki/Regular_language for formal definitions of regular language.)
Practically, a regular expression looks like a summary of the various forms that the set of strings it describes may take. One conventionally uses | as a separator between alternative branches of the expression. One may also write * (the Kleene star) after something to denote that it may be repeated zero or more times, and use parentheses to group things. A regular expression using all these things is
A((b|cc)a)*
which denotes a language consisting of the strings
A Aba Acca Ababa Abacca Accaba Accacca Abababa
and many others.
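The membership of the listed strings can be checked mechanically. The following is a minimal sketch in Python, chosen here only for illustration (its re module is not Tcl's regexp, but |, * and grouping behave the same way for this expression). The names PATTERN, in_language, listed and rejected are made up for the example.

```python
import re

# The example expression from the text. fullmatch() succeeds only if the
# whole string belongs to the language, not just a part of it.
PATTERN = re.compile(r"A((b|cc)a)*")

def in_language(s):
    return PATTERN.fullmatch(s) is not None

# The strings listed in the text, plus a few that should be rejected.
listed = ["A", "Aba", "Acca", "Ababa", "Abacca", "Accaba", "Accacca", "Abababa"]
rejected = ["", "Ab", "Aca", "ba", "AbaA"]

members = [s for s in listed if in_language(s)]
non_members = [s for s in rejected if not in_language(s)]
```

Every string in listed ends up in members, and every string in rejected ends up in non_members.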
An important factor for the popularity of regular expressions is their linear-time complexity: when a given string is to be matched against a regular expression, it is possible to do it so that every character in the string is only looked at once. This is attained by compiling the regular expression into a finite automaton — potentially a big chunk of work, but one that only needs to be done once for each regular expression — and then running the automaton with the string as input.
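To make the "every character is only looked at once" claim concrete, here is a hand-compiled deterministic automaton for the example expression A((b|cc)a)*, sketched in Python. The state names and the table layout are assumptions made for this illustration; a real engine would generate the table from the expression.

```python
# Transition table of a DFA for A((b|cc)a)*.
# A missing entry means "no transition", i.e. the string is rejected.
TRANSITIONS = {
    ("start", "A"): "accept",    # the leading A
    ("accept", "b"): "need_a",   # branch b
    ("accept", "c"): "need_c",   # first c of branch cc
    ("need_c", "c"): "need_a",   # second c of branch cc
    ("need_a", "a"): "accept",   # the a that closes one repetition
}
ACCEPTING = {"accept"}

def dfa_match(s):
    """Run the automaton over s, reading each character exactly once."""
    state = "start"
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state in ACCEPTING
```

The loop does one table lookup per input character, so the running time is linear in the length of the string regardless of how the expression was written.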
Another important factor is that regular expressions can be used for efficiently searching through a large body of text. A direct implementation of the above would produce an algorithm for matching a string against a regular expression, but most RE implementations — among them regexp — play a few tricks internally that make them operate in search mode instead. In order to get matching behaviour (often useful with switch -regexp), one uses the anchors ^ and $ to require that the particular position in the regular expression must correspond to the beginning and end of the string respectively (caveat: sometimes it is beginning and end of line instead; AREs have \A and \Z as alternatives).
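The difference between search mode and anchored matching can be seen in a small sketch. It uses Python's re module as a stand-in for regexp, so details may differ from Tcl; in particular, Python's $ also matches just before a trailing newline, which is why the sketch uses \A and \Z. The variable names are made up for the example.

```python
import re

RE = r"A((b|cc)a)*"
text = "xxAbaccayy"

# Search mode: the expression may match anywhere inside the text.
found = re.search(RE, text)

# Match mode via anchors: the expression must cover the whole string.
# The (?:...) group keeps the anchors outside any alternation in RE.
ANCHORED = r"\A(?:" + RE + r")\Z"
anchored_on_text = re.search(ANCHORED, text)
anchored_on_exact = re.search(ANCHORED, "Abacca")
```

Here found reports the substring Abacca at positions 2 to 8, anchored_on_text is None, and anchored_on_exact matches the whole of Abacca.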
Other common extensions to the regular expression syntax, which however do not make it any more powerful than the basic set described above, are:
A somewhat intriguing class of such extensions are the constraints, which only match the empty string but refuse to do so unless the material surrounding the position of this match satisfies some condition. The general constraints are:
(In order to implement these in a finite automaton, one roughly has to form the product of an automaton for the main RE and an automaton for the constraint RE, the idea being that one must keep track of both the state vis-à-vis the main RE and the state vis-à-vis the constraint RE. These are then slightly modified so that the transition from one side to the other of the constraint takes one to a constraint-initial state. Might there be a problem if two constraints overlap, though?) The Tcl regexp engine handles lookaheads by compiling and running the constraint RE separately; in this way it is a "hybrid" RE engine.
Another set of common extensions to the syntax concerns submatch extraction and greediness. These mainly become meaningful when regular expressions are used for searching, since in that case there are often several substrings that match a particular RE, and it matters which match is reported.
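A lookahead constraint in action, as a small sketch. It is written with Python's re module for illustration; the (?=...) and (?!...) notation is also what AREs use for positive and negative lookahead. The strings foobar and foobaz are made up for the example, and the sketch says nothing about how either engine implements the constraint internally.

```python
import re

# Positive lookahead: "foo" matches only if "bar" follows, but "bar"
# itself is not part of the match (the constraint matches the empty string).
hit = re.search(r"foo(?=bar)", "foobar")
miss = re.search(r"foo(?=bar)", "foobaz")

# Negative lookahead: "foo" matches only if "bar" does NOT follow.
neg = re.search(r"foo(?!bar)", "foobaz")
```

In the first case the reported match is just foo, spanning positions 0 to 3, even though the engine had to inspect bar to accept it.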
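A short sketch of why greediness matters once submatches are extracted. It uses Python's re module for illustration, and the sample text is made up; the point is only that the same text admits several matches and the quantifier's preference decides which one is reported.

```python
import re

text = "<a><b>"

# Greedy: .* takes as much as possible, so the match runs to the last ">".
greedy = re.search(r"<(.*)>", text)

# Non-greedy: .*? takes as little as possible, so the match stops at the
# first ">".
lazy = re.search(r"<(.*?)>", text)

greedy_whole, greedy_sub = greedy.group(0), greedy.group(1)
lazy_whole, lazy_sub = lazy.group(0), lazy.group(1)
```

Both expressions describe the same set of strings when used for pure yes/no matching; they differ only in which match and which submatch are reported.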
Many "regular expression" engines also support extensions to the syntax which allow them to go beyond the realm of regular languages. This is most common in backtracking RE engines (such as PCRE and the Perl engine), which ignore the finite automaton theory and instead use trial and error to find a match, but some have escaped into more general standards.
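One well-known extension of this kind is the back reference, which the text above does not name; it is used here only as an assumed example. The sketch below, in Python for illustration, recognises strings of the form a...a b a...a with equally many a's on both sides, a language that no finite automaton can recognise.

```python
import re

# \1 must repeat exactly what group 1 matched, so both runs of "a"
# are forced to have the same length.
BALANCED = re.compile(r"(a*)b\1")

def balanced(s):
    return BALANCED.fullmatch(s) is not None

results = {s: balanced(s) for s in ["b", "aba", "aabaa", "aaba", "abaa"]}
```

The strings b, aba and aabaa are accepted, while aaba and abaa are rejected because the two runs differ in length.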
This is a partial list of tricks. It also assumes some familiarity with finite automata theory, such as knowing what distinguishes an NFA from a DFA, how one runs them, and how one can construct one from the other (all of which is standard material in relevant computer science courses).
Given a match-mode regexp engine, as one would get from running a finite automaton over a string and inspecting whether the end state is final, one can run a regular expression re in search mode on it by running .*(re).* in match mode.
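This trick can be tried out with any engine that offers a whole-string match. A minimal sketch in Python, where re.fullmatch plays the role of the match-mode engine; the helper name search_via_match is made up, and the DOTALL flag is an assumption added so that . also covers newlines.

```python
import re

def search_via_match(regex, s):
    """Answer "does regex match somewhere in s?" using only match mode."""
    wrapped = r".*(" + regex + r").*"
    return re.fullmatch(wrapped, s, re.DOTALL) is not None

# Compare against the engine's built-in search mode on a few inputs.
cases = [("A((b|cc)a)*", "xxAbayy"), ("cc", "Aba"), ("b|cc", "Acca")]
agree = all(
    search_via_match(rx, s) == (re.search(rx, s, re.DOTALL) is not None)
    for rx, s in cases
)
```

As written, the helper only answers yes or no; it does not say where the match was found.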
As usually defined, finite automata can only answer "yes" or "no", so there's no way to get submatch information out of them.
An extension of the formalism (keeping track of positions within the string corresponding to positions in the regexp, as well as the basic automaton state) can be found in http://laurikari.net/ville/spire2000-tnfa.ps
This is a classical trick.
Given one (ε-free) automaton A1 for matching the regular expression re1 and another (ε-free) automaton A2 for matching the regular expression re2, it is straightforward to construct an automaton A for re1 AND re2 as follows:
What this means in practice is that running A is equivalent to running A1 and A2 simultaneously; each A state is a pair of an A1 state and an A2 state. A accepts a string only if both A1 and A2 would do so.
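The pairing of states can be sketched directly. The code below is an illustration in Python under simplifying assumptions: both automata are deterministic, and each is stored as a dict with the keys start, accepting and delta (a layout invented for this example). A1 recognises A((b|cc)a)* and A2 recognises strings of odd length over the characters A, a, b, c.

```python
def product_automaton(a1, a2):
    """Build the automaton whose states are pairs (A1 state, A2 state)."""
    delta = {}
    for (q1, ch1), r1 in a1["delta"].items():
        for (q2, ch2), r2 in a2["delta"].items():
            if ch1 == ch2:  # both components must read the same character
                delta[((q1, q2), ch1)] = (r1, r2)
    accepting = {(f1, f2) for f1 in a1["accepting"] for f2 in a2["accepting"]}
    return {"start": (a1["start"], a2["start"]),
            "accepting": accepting, "delta": delta}

def accepts(dfa, s):
    state = dfa["start"]
    for ch in s:
        state = dfa["delta"].get((state, ch))
        if state is None:  # no transition: reject
            return False
    return state in dfa["accepting"]

A1 = {"start": "s", "accepting": {"ok"},
      "delta": {("s", "A"): "ok", ("ok", "b"): "na", ("ok", "c"): "nc",
                ("nc", "c"): "na", ("na", "a"): "ok"}}
A2 = {"start": "even", "accepting": {"odd"},
      "delta": {(q, ch): ("odd" if q == "even" else "even")
                for q in ("even", "odd") for ch in "Aabc"}}

BOTH = product_automaton(A1, A2)
```

For instance, Aba (length 3) is accepted by BOTH, while Acca (length 4) is accepted by A1 but rejected by BOTH, and b is accepted by A2 but rejected by BOTH.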
Okay then - feel free to add information here on the other RE flavors available in Tcl...
LES says that there are no other RE flavors available in Tcl. Tcl only uses ARE. What I meant is that regular expressions may be construed as any one (or all) of its several variations, but Regular Expressions only discusses Tcl's ARE. I said that because this wiki discusses many things under several contexts, not necessarily that of Tcl, and I thought it would be good to note that, at least in this case, it is restricted to the context of Tcl. Anyone interested in a different or more ample discussion of Regular Expressions will have to look elsewhere. E.g. on PCRE.
RS: Well, the Wiki doesn't claim to have all info - it rather asks those in the know to contribute it :) Here's from man re_syntax: