About
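Dot . in a regular expression matches any character in the supported character set, with these characteristics by default:
- it does not match newlines. See the Match Newline (DOTALL) configuration.
- when it is quantified (for example .*), the match stops at the last occurrence of what follows, because quantifiers are greedy. See greedy mode versus lazy mode, which stops at the first occurrence.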
Dot has no special meaning in a character class.
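The difference between a dot outside and inside a character class can be checked with a minimal sketch like the one below. The class name DotInCharacterClass and the helper matches are hypothetical names chosen for illustration; only Pattern.matches comes from java.util.regex.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class DotInCharacterClass {

    // Returns true when the whole input matches the regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Outside a character class, the dot matches any character.
        System.out.println(matches("a.c", "abc"));   // true
        // Inside a character class, the dot is a literal dot.
        System.out.println(matches("a[.]c", "abc")); // false
        System.out.println(matches("a[.]c", "a.c")); // true
    }
}
```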
Configuration
Match Newline (DOTALL)
Dot does not match newlines by default. A modifier must be set when running the matching function.
Java example with the DOTALL flag:
- A pattern that captures the content between two XML nodes, even if it contains newlines.
Pattern pattern = Pattern.compile("<top>(.*?)</top>",Pattern.DOTALL);
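The pattern above can be tried with and without the flag. This is a minimal sketch: the class name DotAllExample, the helper extract and the sample input are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class DotAllExample {

    // Returns the text captured between <top> and </top>,
    // or null when the pattern does not match.
    static String extract(String input, int flags) {
        Pattern pattern = Pattern.compile("<top>(.*?)</top>", flags);
        Matcher matcher = pattern.matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String xml = "<top>line one\nline two</top>";
        // Without a flag (0), the dot cannot cross the newline: no match.
        System.out.println(extract(xml, 0));
        // With DOTALL, the dot also matches the newline: both lines are captured.
        System.out.println(extract(xml, Pattern.DOTALL));
    }
}
```

In Java, the same behavior can also be enabled inside the expression itself with the embedded flag (?s), for example (?s)<top>(.*?)</top>.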
Stop at
Last Occurrence (Greedy mode - default)
With the default greedy matching mode, a quantified dot such as .* matches as many characters as possible, so the match stops at the last occurrence of what follows.
First Occurrence (Lazy)
To make it lazy, add a ? after the quantifier. The match then stops at the first occurrence of what follows. See Regular Expression - (Lazy|Reluctant) Quantifier
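The two modes can be compared on the same input with a minimal sketch. The class name GreedyVersusLazy, the helper firstMatch and the sample input are hypothetical and chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class GreedyVersusLazy {

    // Returns the first substring matched by the regular expression,
    // or null when there is no match.
    static String firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String input = "<b>one</b> and <b>two</b>";
        // Greedy: .* runs up to the last </b>.
        System.out.println(firstMatch("<b>.*</b>", input));  // <b>one</b> and <b>two</b>
        // Lazy: .*? stops at the first </b>.
        System.out.println(firstMatch("<b>.*?</b>", input)); // <b>one</b>
    }
}
```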
Example
Basic
.at matches any three-character string ending with at, including:
- hat,
- cat,
- and bat.
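The list above can be verified with a short sketch. The class name DotAt and the helper matches are hypothetical. Pattern.matches tests the whole input, so strings of any other length are rejected.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class DotAt {

    // Returns true when the whole input matches the pattern .at
    static boolean matches(String input) {
        return Pattern.matches(".at", input);
    }

    public static void main(String[] args) {
        System.out.println(matches("hat"));  // true
        System.out.println(matches("cat"));  // true
        System.out.println(matches("bat"));  // true
        System.out.println(matches("at"));   // false: the dot needs one character
        System.out.println(matches("chat")); // false: four characters
    }
}
```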
Exclude newlines from the negation
A common mistake is to assume that a negated character set like [^#] behaves like the dot and does not match newlines. It does match them, with or without dot all.
In order to exclude newlines, they must be added to the set.
Example: every character that is neither # nor the Linux EOL \n is expressed as:
[^#\n]
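The effect of adding \n to the set can be seen in a minimal sketch. The class name NegatedClassNewline, the helper firstMatch and the sample input are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class NegatedClassNewline {

    // Returns the first substring matched by the regular expression,
    // or null when there is no match.
    static String firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String input = "key=value\nnext line # comment";
        // [^#] also matches the newline: the match runs across both lines up to #.
        System.out.println(firstMatch("[^#]+", input));
        // [^#\n] excludes the newline: the match stops at the end of the first line.
        System.out.println(firstMatch("[^#\\n]+", input)); // key=value
    }
}
```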