RText uses Java regular expressions, so the definitive source of information on this topic is Sun's Javadoc for the java.util.regex.Pattern class. Visit the Java home page at http://java.sun.com, search for the latest API Specification, and open the Pattern class.
However, for the eager, what follows is a brief tutorial containing everything you probably want to know. This tutorial assumes you already understand the basic concepts of regular expressions.
I. The Basics
Java regular expressions contain every construct you'll need when
searching for text, including character classes, greedy and
reluctant quantifiers, and back references.
The basics are all here:
II. Characters and Character Classes
III. Boundary Characters
IV. Back References
Back references allow you to capture a subsequence in a regex
match and use that subsequence later in the same regex.
The subsequence is captured by a capturing group, which is a part
of the expression enclosed in parentheses. For example, in the regular
expression Fred([ \t])Joe\1Sue\1Tommy, the first capturing group is ([ \t]). Anywhere after it in the regular expression, you can refer back to it with \1; at that location, a match must contain exactly the text that the capturing group matched. Thus, in the example, Fred, Joe, Sue, and Tommy are all separated by the same character, either a space or a tab.
Note that you can have multiple capturing groups per regular expression. In this case, the first group is referred to as \1, the second as \2, and so on. Capturing groups can also be nested inside one another; that is, ((A)B)\1\2 is valid. Groups are numbered by the position of their opening parenthesis, counting from left to right, so in this expression \1 refers to the text matched by ((A)B) and \2 to the text matched by (A).
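To try the back-reference examples outside RText, the following is a minimal sketch using the standard java.util.regex classes. The class name BackReferenceDemo and the helper sameSeparator are made up for this illustration and are not part of RText. Note that each backslash in the regex must be doubled inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
class BackReferenceDemo {

    // The pattern from the text: \1 must repeat whatever group 1, ([ \t]), matched.
    static final Pattern SAME_SEPARATOR =
            Pattern.compile("Fred([ \\t])Joe\\1Sue\\1Tommy");

    // Nested groups are numbered by their opening parenthesis, left to right:
    // group 1 is ((A)B) and group 2 is (A).
    static final Pattern NESTED = Pattern.compile("((A)B)\\1\\2");

    // Returns true only if the whole input matches with one consistent separator.
    static boolean sameSeparator(String text) {
        return SAME_SEPARATOR.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(sameSeparator("Fred Joe Sue Tommy"));    // true: all spaces
        System.out.println(sameSeparator("Fred\tJoe\tSue\tTommy")); // true: all tabs
        System.out.println(sameSeparator("Fred Joe\tSue Tommy"));   // false: mixed separators

        Matcher m = NESTED.matcher("ABABA");
        if (m.matches()) {
            System.out.println(m.group(1)); // AB
            System.out.println(m.group(2)); // A
        }
    }
}
```

The mixed-separator input fails because the first separator (a space) is what \1 must repeat, and a tab appears in its place. In the nested case, the input ABABA is the text of group 1 (AB), followed by \1 (AB again) and \2 (A).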