Metacharacters (Linux in a Nutshell, 3rd Edition)


9.3. Metacharacters

The following characters have special meaning in search patterns:

Character	Meaning
`.`	Match any single character except newline.
`*`	Match zero or more occurrences of the single character that immediately precedes it. The preceding item can also be a regular expression; for example, since . (dot) matches any character, .* matches any number of any characters except newlines.
`^`	Match the beginning of the line or string.
`$`	Match the end of the line or string.
`[ ]`	Match any one of the enclosed characters. A hyphen (-) indicates a range of consecutive characters. A circumflex (^) as the first character in the brackets reverses the sense: it matches any one character not in the list. A hyphen or close bracket (]) as the first character is treated as a member of the list. All other metacharacters are treated as members of the list.
`[^ ]`	Match anything except enclosed characters.
`\{`n,m`\}`	Match a range of occurrences of the single character that immediately precedes it. The preceding character also can be a regular expression. \{n\} matches exactly n occurrences, \{n,\} matches at least n occurrences, and \{n,m\} matches any number of occurrences between n and m.
`{`n,m`}`	Like `\{`n,m`\}`. Available in egrep by default and in gawk with the -Wre-interval option.
`\`	Turn off the special meaning of the character that follows.
`\(\)`	Save the matched text enclosed between \( and \) in a special holding space. Up to nine patterns can be saved on a single line. They can be "replayed" in the same pattern or within substitutions by the escape sequences \1 to \9.
`\`n	Reuse matched text stored in nth \( \).
`()`	In egrep and gawk, save the matched text enclosed between ( and ) in a holding space to be replayed in substitutions by the escape sequences \1 to \9.
`\<\>`	Match the beginning (\<) or end (\>) of a word.
`+`	Match one or more instances of preceding regular expression.
`?`	Match zero or one instance of preceding regular expression.
`|`	Match the regular expression specified before or after.
`()`	Group regular expressions.
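
The entries above can be tried out with grep. The following is a minimal sketch, assuming GNU grep (the \< \> word anchors and the -o option are extensions rather than POSIX features); the sample words and the count helper are made up for illustration.

```shell
# Made-up sample input, one entry per line.
sample='cat
cot
coat
caat
concat the cat
banana'

# Hypothetical helper: count the sample lines that match the given pattern.
count() { printf '%s\n' "$sample" | grep -c "$@"; }

dot=$(count '^c.t$')              # . is any one character: cat, cot
star=$(count '^ca*t$')            # a* is zero or more a's: cat, caat
negated=$(count '^c[^a]t$')       # [^a] is any character but a: cot
interval=$(count '^ca\{2,3\}t$')  # two or three a's: caat
backref=$(count '\(an\)\1')       # \1 replays the saved "an": banana
extended=$(count -E '^co?a+t$')   # egrep-style ? and +: cat, coat, caat

# -o prints every match on its own line, so wc -l counts matches, not lines.
plain=$(printf '%s\n' "$sample" | grep -o 'cat' | wc -l)
whole=$(printf '%s\n' "$sample" | grep -o '\<cat\>' | wc -l)

echo "dot=$dot star=$star negated=$negated interval=$interval"
echo "backref=$backref extended=$extended plain=$plain whole=$whole"
```

The last two counts differ because \<cat\> skips the "cat" inside "concat", while the unanchored pattern matches it.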

Many utilities support POSIX character lists (also called character classes), which are useful for matching non-ASCII characters in languages other than English. These lists are recognized only inside [ ] bracket expressions, so the list name is written within an outer pair of brackets. A typical use is [[:lower:]], which in English is the same as [a-z].

The following table lists POSIX character lists:

Notation	Action
[:alnum:]	Alphanumeric characters
[:alpha:]	Alphabetic characters, uppercase and lowercase
[:blank:]	Printable whitespace: spaces and tabs but not control characters
[:cntrl:]	Control characters, such as ^A through ^Z
[:digit:]	Decimal digits
[:graph:]	Printable characters, excluding whitespace
[:lower:]	Lowercase alphabetic characters
[:print:]	Printable characters, including whitespace but not control characters
[:punct:]	Punctuation, a subclass of printable characters
[:space:]	Whitespace, including spaces, tabs, and some control characters
[:upper:]	Uppercase alphabetic characters
[:xdigit:]	Hexadecimal digits
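
As a rough illustration of the lists in this table, the sketch below applies a few of them with sed and grep. It assumes GNU tools (grep -o is not POSIX) and an English or C locale; the sample line is invented. Each list name sits inside an outer bracket pair, as described above.

```shell
# Invented sample line.
line='Total: 42 items, cost 0x1F'

# Delete every character that is not a decimal digit.
digits=$(printf '%s\n' "$line" | sed 's/[^[:digit:]]//g')

# Delete punctuation characters (here the colon and the comma).
no_punct=$(printf '%s\n' "$line" | sed 's/[[:punct:]]//g')

# Replace each uppercase letter with an underscore.
masked=$(printf '%s\n' "$line" | sed 's/[[:upper:]]/_/g')

# Print only the hexadecimal literal: 0x followed by hex digits.
hex=$(printf '%s\n' "$line" | grep -o '0x[[:xdigit:]]*')

printf '%s\n' "$digits" "$no_punct" "$masked" "$hex"
```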

The following characters have special meaning in replacement patterns:

Character	Meaning
`\`	Turn off the special meaning of the character that follows.
`\`n	Restore the text matched by the nth pattern previously saved by \( and \). n is a number from 1 to 9; saved patterns are numbered from left to right.
`&`	Reuse the search pattern as part of the replacement pattern.
`~`	Reuse the previous replacement pattern in the current replacement pattern.
`\e`	End replacement pattern started by `\L` or `\U`.
`\E`	End replacement pattern started by `\L` or `\U`.
`\l`	Convert first character of replacement pattern to lowercase.
`\L`	Convert replacement pattern to lowercase.
`\u`	Convert first character of replacement pattern to uppercase.
`\U`	Convert replacement pattern to uppercase.
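
Several of these replacement characters can be exercised with sed. This is a sketch under the assumption that GNU sed is in use: it accepts \u, \U, and \E as extensions, whereas ~ and \e belong to ex and vi and are not shown. The sample name is invented.

```shell
# Invented sample text.
name='john smith'

# & stands for the whole matched text.
wrapped=$(printf '%s\n' "$name" | sed 's/john/[&]/')

# \& is a literal ampersand, because \ turns off the special meaning.
literal=$(printf '%s\n' "$name" | sed 's/john/\&/')

# \1 and \2 replay the saved groups, numbered from left to right.
swapped=$(printf '%s\n' "$name" | sed 's/\(john\) \(smith\)/\2, \1/')

# GNU sed only: \U uppercases until \E; \u uppercases just the next character.
upper=$(printf '%s\n' "$name" | sed 's/\(john\) \(smith\)/\U\1\E \2/')
capitalized=$(printf '%s\n' "$name" | sed 's/\([a-z]*\) \([a-z]*\)/\u\1 \u\2/')

printf '%s\n' "$wrapped" "$literal" "$swapped" "$upper" "$capitalized"
```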


Copyright © 2001 O'Reilly & Associates. All rights reserved.