Regexp - Character Class


A character class defines a domain of permitted characters.


with square brackets



  • [ is the start character class definition
  • ] is the end character class definition

Inside the brackets, all characters can be used mixed with this Meta characters symbols

Construct Matches Operation
[abc] a, b, or c simple class
[^abc] Any character except a, b, or c negation
[a-zA-Z] a through z or A through Z, inclusive range
[a-d[m-p]] a through d, or m through p: [a-dm-p] union
[a-z&&[def]] d, e, or f intersection
[a-z&&[^bc]] a through z, except for b and c: [ad-z] subtraction
[a-z&&[^m-p]] a through z, and not m through p: [a-lq-z] subtraction


[..] Specifies one collation element, and can be a multicharacter element [.ch.] in Spanish
[:characterClass:] Specifies character classes. It matches any character within the character class. [:alpha:] See posix
[==] Specifies equivalence classes. [=a=] matches all characters having base letter 'a'.


Meta-character Description Example
\ general escape character
^ negate the class, but only if the first character [^abc] matches any character other than a, b, or c.
[^a-z] matches any single character that is not a lowercase letter from a to z.
- indicates character range [abc] matches a, b, or c. [a-z] specifies a range which matches any lowercase letter from a to z.
These forms can be mixed:
[abcx-z] matches a, b, c, x, y, and z,
as does [a-cx-z]



Since many ranges of characters depend on the chosen locale setting (i.e., in some settings letters are organized as abc…zABC…Z, while in some others as aAbBcC…zZ), the POSIX standard defines some classes or categories of characters as shown in the following table:

POSIX ASCII Description
[:alnum:] [A-Za-z0-9] Alphanumeric characters
[:alpha:] [A-Za-z] Alphabetic characters
[:lower:] [a-z] Lowercase letters
[:upper:] [A-Z] Uppercase letters
[:blank:] [ \s\t] Space and tab
[:cntrl:] [\x00-\x1F\x7F] Control characters
[:digit:] [0-9] Digits
[:graph:] [\x21-\x7E] Visible characters
[:print:] [\x20-\x7E] Visible characters and spaces
[:punct:] [-!"#$%&'()*+,./:;<=>[email protected][\\\]_`{|}~] Punctuation characters
[:space:] [ \t\r\n\v\f] Whitespace characters
[:xdigit:] [A-Fa-f0-9] Hexadecimal digits


The shorthand syntax begins with a slash followed by a letter. See Regular Expression - Backslash Generic Character Class (shorthand)

Powered by ComboStrap