Version 11 of PCRE

Updated 2003-07-18 15:12:04

Perl-Compatible Regular Expressions. A superset of Regular Expressions with a few extra features introduced by Perl, less a couple of features that cannot be implemented without Perl itself. Welcomed by many, hated by many others.

http://www.pcre.org/man.txt

Note that, in spite of the name, PCRE are not limited to Perl. Other programs and languages, such as PHP, implement them too.

Tcl uses ARE (Advanced Regular Expressions).


DKF - Wow. There's more Features from the Black Lagoon in there than you can shake a B-Movie at...


A big Regular Expression fan sees DKF's remark and says: Maybe Tcl uses ARE because Tcl is so merciful. More features for the bold and fewer features for the queasy.

Another (OK, the same) big Regular Expression fan also says: Regular Expressions are not very easy, granted, but they're also overly mystified, and PCRE take the blame for some extra mystification, even by those who are good friends with Regular Expressions.

The thing is that PCRE allow tricks that are impossible with traditional RE. Some people advocate avoiding PCRE completely and, instead, writing even more complex, long-winded, probably convoluted code to replace them.

Big Regular Expression Fan will never understand these people.


Here is a very quick summary of PCRE's most relevant features. Items marked with + are supported by ARE (thank God).

 + foo(?=bar)              match "foo" only if "bar" follows it
 + foo(?!bar)              match "foo" only if "bar" does NOT follow it
   (?<=foo)bar             match "bar" only if "foo" precedes it
   (?<!foo)bar             match "bar" only if "foo" does NOT precede it

   (?<!in|on|at)foo        match "foo" only if NOT preceded by "in", "on" or "at"
   (?<=\d{3})(?<!999)foo   match "foo" only if preceded by 3 digits other than "999"

 + (?i)abc                 case-insensitive match of abc, ABC, aBc, ABc, etc.
   ab(?i)c                 matches abc or abC in Perl and current PCRE: (?i) applies only from that point on
                           (early PCRE releases applied it to the whole pattern; ARE accepts options only at the start)
   (ab(?i)c)               matches abc or abC; the setting ends at the closing paren
 + (?m)                    multi-line mode: ^ and $ also match at line breaks; same as "s/FIND/REPL/M"
 + (?s)                    make "." match newline too; same as "s/FIND/REPL/S"
 + (?x)                    ignore whitespace and #comments in the pattern
 + (?:abc)foo              match "abcfoo", but do not capture 'abc' in \1
 + (?:ab|cd)ef             match "abef" or "cdef"; nothing is captured in \1
 + (?#remark)xy            match "xy"; remarks after "#" in the parens are ignored
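
A few of the constructs above can be tried out quickly. The sketch below uses Python's re module, which is not PCRE but accepts the same syntax for these particular items; the function names are made up for illustration, and behaviour in PCRE or ARE may differ in the details.

```python
import re

def foo_followed_by_bar(text):
    # Lookahead: "bar" must follow, but is not part of the match.
    return re.search(r"foo(?=bar)", text)

def foo_not_after_preposition(text):
    # Negative lookbehind; all alternatives have the same width (2),
    # which Python's re requires.
    return re.search(r"(?<!in|on|at)foo", text)

def foo_after_three_digits_not_999(text):
    # Two lookbehinds in a row: both conditions must hold at the same spot.
    return re.search(r"(?<=\d{3})(?<!999)foo", text)

def groups_of_noncapturing(text):
    # (?:...) groups without capturing, so the match has no \1.
    m = re.search(r"(?:ab|cd)ef", text)
    return m.groups() if m else None

def caseless_abc(text):
    # (?i) at the start of the pattern makes the whole match case-insensitive.
    return re.search(r"(?i)abc", text)
```

For example, foo_followed_by_bar("foobar") matches just "foo", and groups_of_noncapturing("cdef") returns an empty tuple because the group captures nothing.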

 \l                make the next letter lowercase
 \L                make letters lowercase until \E
 \u                make the next letter uppercase
 \U                make letters uppercase until \E
 \Q                quote (escape) all metacharacters until \E
 \E                end of the \L, \U or \Q modifier's action
 \G                match at the end of the previous match
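
Where an engine lacks these escapes, their effect can usually be had from the host language instead. As a rough sketch, again in Python (whose re module has no \Q or \u): re.escape plays the part of \Q...\E, and a replacement function stands in for \u. The helper names are hypothetical.

```python
import re

def find_literal(needle, haystack):
    # re.escape backslash-escapes the metacharacters in needle,
    # much like wrapping it in \Q...\E.
    return re.search(re.escape(needle), haystack)

def capitalize_words(text):
    # Upper-case the first lowercase letter of each word, roughly what
    # \u does to the next character in a Perl replacement.
    return re.sub(r"\b([a-z])", lambda m: m.group(1).upper(), text)
```

Here find_literal("a.b", "axb a.b") skips "axb" because the dot is taken literally.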

[ Category Acronym ]