regexp - Plan 9 regular expression notation
This manual page describes the regular expression syntax used by the Plan 9
regular expression library regexp9(3). It is the form used by egrep(1) before
egrep got complicated.
A regular expression specifies a set of strings of characters. A member of
this set of strings is said to be matched by the regular expression.
In many applications a delimiter character, commonly /, bounds a regular
expression. In the following specification for regular expressions the word
`character' means any character (rune) but newline.
The syntax for a regular expression e0 is
    e3:  literal | charclass | '.' | '^' | '$' | '(' e0 ')'
    e2:  e3
      |  e2 REP
    REP: '*' | '+' | '?'
    e1:  e2
      |  e1 e2
    e0:  e1
      |  e0 '|' e1
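The grammar implies an order of precedence: repetition binds more tightly than
concatenation, which binds more tightly than alternation. The sketch below
illustrates this using Python's re module, which is an assumption made for
illustration only. It is not the regexp9(3) library, but it groups these simple
patterns the same way. The helper name whole is hypothetical.

```python
import re

# Assumption: Python's re module stands in for regexp9(3) here.
# Under the grammar, ab|cd* groups as (ab)|(c(d*)).
implicit = re.compile(r"ab|cd*")
explicit = re.compile(r"(ab)|(c(d*))")

samples = ["ab", "c", "cddd", "abd", "a"]

def whole(pattern, text):
    """Report whether the pattern matches the entire string."""
    return pattern.fullmatch(text) is not None

# Each entry is (sample, result without parentheses, result with parentheses).
results = [(s, whole(implicit, s), whole(explicit, s)) for s in samples]
for sample, a, b in results:
    print(sample, a, b)
```

Both forms accept and reject the same samples, which is what the layering of
e0, e1, e2 and e3 predicts.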
A literal is any non-metacharacter, or a metacharacter (one of .*+?[]()|\^$),
or the delimiter preceded by \.
A charclass is a nonempty string s bracketed [s] (or [^s]); it matches any
character in (or not in) s. A negated character class never matches
newline. A substring a-b, with a and b in ascending order, stands for the
inclusive range of characters between a and b. In s, the metacharacters -, ],
an initial ^, and the regular expression delimiter must be preceded by a \;
other metacharacters have no special meaning and may appear unescaped.
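A few character classes may make these rules concrete. This is a minimal
sketch that assumes Python's re module as a stand-in for regexp9(3); the
bracket syntax shown is common to both, but note one difference: a negated
class in Python does match newline, whereas the notation described here never
does. The names hex_digit, not_digit, punct and matches_one are hypothetical.

```python
import re

# Assumption: Python's re module stands in for regexp9(3).
hex_digit = re.compile(r"[0-9a-f]")    # two ranges, each in ascending order
not_digit = re.compile(r"[^0-9]")      # negated class
punct = re.compile(r"[\-\]\^.*]")      # -, ] and ^ escaped; . and * are literal

def matches_one(pattern, ch):
    """Report whether the pattern matches exactly the given string."""
    return pattern.fullmatch(ch) is not None

print(matches_one(hex_digit, "c"), matches_one(hex_digit, "g"))
print(matches_one(not_digit, "x"), matches_one(not_digit, "5"))
print([matches_one(punct, c) for c in "-]^.*a"])
```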
A . matches any character.
A ^ matches the beginning of a line; $ matches the end of the line.
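The following sketch shows the dot and the two anchors on a multi-line string.
It assumes Python's re module rather than regexp9(3); in Python the
re.MULTILINE flag is needed for ^ and $ to act at every line rather than only
at the ends of the whole string. The variable names are hypothetical.

```python
import re

# Assumption: Python's re module stands in for regexp9(3).
text = "alpha\nbeta\ngamma"

# First character of each line: ^ anchors at the beginning of a line.
starts = re.findall(r"^.", text, re.MULTILINE)

# Last character of each line: $ anchors at the end of a line.
ends = re.findall(r".$", text, re.MULTILINE)

# Whole lines: . never matches newline, so .* stops at the line boundary.
lines = re.findall(r"^.*$", text, re.MULTILINE)

print(starts, ends, lines)
```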
The REP operators match zero or more (*), one or more (+), zero or one (?),
instances respectively of the preceding regular expression e2.
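As an illustration, the three operators can be compared on the same inputs.
This sketch assumes Python's re module in place of regexp9(3); the helper full
is hypothetical. The operator applies only to the immediately preceding
expression, so ab* repeats b alone, while (ab)+ repeats the parenthesized
pair.

```python
import re

def full(pattern, text):
    """Report whether the pattern matches the entire string."""
    return re.fullmatch(pattern, text) is not None

inputs = ("a", "ab", "abbb")

star = [full(r"ab*", s) for s in inputs]   # zero or more b
plus = [full(r"ab+", s) for s in inputs]   # one or more b
opt = [full(r"ab?", s) for s in inputs]    # zero or one b
grouped = full(r"(ab)+", "ababab")         # repetition of a parenthesized e0

print(star, plus, opt, grouped)
```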
A concatenated regular expression, e1e2, matches a match to e1 followed by a
match to e2.
An alternative regular expression, e0|e1, matches either a match to e0 or a
match to e1.
A match to any part of a regular expression extends as far as possible without
preventing a match to the remainder of the regular expression.
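The rule can be seen with a repetition followed by more pattern. This is a
sketch under the assumption that Python's re module behaves like regexp9(3)
for repetition; Python picks the first alternative that succeeds rather than
the longest, so alternation is deliberately left out. The variable names are
hypothetical.

```python
import re

# Assumption: Python's re module stands in for regexp9(3).

# With nothing after it, a* takes every available a.
alone = re.match(r"a*", "aaab").group()

# With "ab" still to match, a* gives one a back so the remainder can succeed.
overall = re.match(r"a*ab", "aaab").group()
parts = re.match(r"(a*)(ab)", "aaab").groups()

print(alone, overall, parts)
```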