Regexp - Word Characters

About

A word character is matched by the shorthand class \w and is specified as any letter, any digit, or the underscore character.

For ASCII text, it is equivalent to the character class [0-9A-Za-z_].
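
The equivalence can be checked with a short sketch. It assumes Python's re module (the page itself does not name an engine) and uses the re.ASCII flag so that \w is restricted to ASCII. The sample text and variable names are illustrative only.

```python
import re

# Illustrative sample text (not from the original page).
text = "user_42 says: hello-world!"

# re.ASCII restricts \w to ASCII letters, digits and the underscore.
shorthand = re.findall(r"\w+", text, flags=re.ASCII)

# The same search written with the explicit character class.
explicit = re.findall(r"[0-9A-Za-z_]+", text)

print(shorthand)              # ['user_42', 'says', 'hello', 'world']
print(shorthand == explicit)  # True
```

The hyphen, colon, space and exclamation mark are not word characters, so they split the text into four runs.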

That is, any character which can be part of a Perl “word”.

Definition of letters and digits versus Character Set

The definition of letters and digits is controlled by character tables, and may vary if locale-specific matching is taking place.

For example, in the “fr” (French) locale, some character codes greater than 128 are used for accented letters, and these are matched by \w.
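
A comparable effect can be sketched in Python's re module, which is an assumption here since the text above describes locale character tables rather than a specific API. Python does not consult the locale for text patterns. Instead, \w matches Unicode letters by default, and the re.ASCII flag narrows it to [0-9A-Za-z_]. The sample word is illustrative.

```python
import re

# "café", written with a precomposed e-acute (U+00E9).
word = "caf\u00e9"

# Default for str patterns: \w follows the Unicode definition of a letter.
unicode_words = re.findall(r"\w+", word)

# With re.ASCII the accented letter is no longer a word character.
ascii_words = re.findall(r"\w+", word, flags=re.ASCII)

print(unicode_words)  # ['café']
print(ascii_words)    # ['caf']
```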

Boundary

See Regular Expression - Boundary Matcher.

Word boundary

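A word boundary \b is a zero-width assertion that matches if the characters on either side of the current position are of different kinds: one matches \w and the other matches \W. The start and the end of the string count as non-word characters, so \b also matches at the start or end of the string when the first or last character is a word character.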

Example 1:
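
A minimal sketch of whole-word matching, assuming Python's re module and an illustrative sample string. \b matches only where a word character meets a non-word character or the edge of the string, so "cat" embedded in a longer word is skipped.

```python
import re

# Illustrative sample text (not from the original page).
text = "cat concatenate cat's bobcat"

# \bcat\b requires a word boundary on both sides of "cat".
whole_word_spans = [m.span() for m in re.finditer(r"\bcat\b", text)]

print(whole_word_spans)  # [(0, 3), (16, 19)]
```

The first match sits at the start of the string, and the second is followed by an apostrophe, which is not a word character. The occurrences inside "concatenate" and "bobcat" have a letter directly before them, so no boundary exists there.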

Non-word boundary

A non-word boundary \B is the negation of \b: it is a zero-width assertion that matches at every position where \b does not.

Example 1:
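
A minimal sketch, assuming Python's re module and an illustrative sample string. Wrapping a pattern in \B finds it only when it is embedded inside a longer word, since \B fails wherever a word boundary exists.

```python
import re

# Illustrative sample text (not from the original page).
text = "cat concatenate bobcat"

# \Bcat\B requires that neither side of "cat" is a word boundary.
embedded_spans = [m.span() for m in re.finditer(r"\Bcat\B", text)]

print(embedded_spans)  # [(7, 10)]
```

Only the occurrence inside "concatenate" qualifies. The standalone "cat" starts at a boundary, and the one in "bobcat" ends at the end of the string, which is also a boundary.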

Example 2:
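
A second sketch under the same assumptions (Python's re module, Python 3.7 or later, illustrative input). Because \B is zero-width, substituting it inserts text at each non-boundary position without consuming any characters.

```python
import re

# Insert a hyphen at every position that is not a word boundary.
# In "ab cd" those positions are between "a" and "b" and between "c" and "d".
marked = re.sub(r"\B", "-", "ab cd")

print(marked)  # a-b c-d
```

No hyphen appears next to the space or at either end of the string, because each of those positions is a word boundary.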

Documentation / Reference