About
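The regular expression implementation in Java (package java.util.regex) uses a Perl-like syntax, but it is not PCRE itself. See the "Comparison to Perl 5" section of the Pattern class documentation for the differences.
Regular expressions are handled in Java with the following classes:
- Pattern: a compiled regular expression; its documentation describes the supported pattern syntax in detail
- Matcher: applies a pattern to an input string and gives access to the matches and their groups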
Snippet
Group
Capturing groups are numbered by counting their opening parentheses from left to right.
Example: the expression ((A)(B(C))) contains four such groups:
- 1: ((A)(B(C)))
- 2: (A)
- 3: (B(C))
- 4: (C)
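// Pattern and Matcher come from java.util.regex, Assert from JUnit 4 (org.junit.Assert)
Pattern pattern = Pattern.compile("((A)(B(C)))");
Matcher matcher = pattern.matcher("myInputString");
// groupCount() depends only on the pattern and does not count group 0 (the whole match)
Assert.assertEquals("There are 4 groups in this pattern", 4, matcher.groupCount());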
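The numbering rule above can be checked with a small sketch. The class name GroupNumberingSketch and the helper groupOf are hypothetical names chosen for this illustration; only Pattern and Matcher come from java.util.regex. The last line shows that a non-capturing group (?:...) does not take a number.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingSketch {

    // Hypothetical helper: returns the text captured by the given group number,
    // or null when the regex does not match the whole input.
    static String groupOf(String regex, String input, int groupNumber) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        // group(int) may only be called after a successful match operation
        return matcher.matches() ? matcher.group(groupNumber) : null;
    }

    public static void main(String[] args) {
        String regex = "((A)(B(C)))";
        for (int i = 1; i <= 4; i++) {
            // prints 1: ABC, 2: A, 3: BC, 4: C
            System.out.println(i + ": " + groupOf(regex, "ABC", i));
        }
        // (?:A) is non-capturing, so only 3 groups are counted
        System.out.println(Pattern.compile("((?:A)(B(C)))").matcher("").groupCount());
    }
}
```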
Iterate
Example of how to iterate over the pattern matches.
String sourceText = "ABC";
Pattern pattern = Pattern.compile("((A)(B(C)))");
Matcher matcher = pattern.matcher(sourceText);
// Number of matches found so far
int seqCounter = 0;
// Each call to find() moves the matcher to the next match, if any
while (matcher.find()) {
    seqCounter++;
    System.out.println();
    System.out.println("Sequence " + seqCounter);
    for (int i = 0; i <= matcher.groupCount(); i++) {
        // print() rather than printf(): the matched text must not be read as a format string
        System.out.print(i + " : " + matcher.group(i));
        if (i == 0) {
            System.out.println(" (the index 0 contains the whole string that matches)");
        } else {
            System.out.println();
        }
    }
}
Output:
Sequence 1
0 : ABC (the index 0 contains the whole string that matches)
1 : ABC
2 : A
3 : BC
4 : C
Parsing bash parameters
String text = "$1 love $2";
// Group 1 captures the whole parameter ($1, $2), group 2 only its digit
Pattern pattern = Pattern.compile("(\\$([0-9]))");
Matcher matcher = pattern.matcher(text);
System.out.println("Number of groups in the pattern: " + matcher.groupCount());
int matchCount = 0;
while (matcher.find()) {
    matchCount++;
    System.out.println("Values for match " + matchCount);
    for (int i = 0; i <= matcher.groupCount(); i++) {
        System.out.println(" - Group " + i + ": " + matcher.group(i));
        if (i == 0) {
            System.out.println("   Group 0 always represents the whole match");
        }
    }
}
Output:
Number of groups in the pattern: 2
Values for match 1
 - Group 0: $1
   Group 0 always represents the whole match
 - Group 1: $1
 - Group 2: 1
Values for match 2
 - Group 0: $2
   Group 0 always represents the whole match
 - Group 1: $2
 - Group 2: 2
Search and replace
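A minimal sketch of a search and replace, assuming the goal is to substitute the $1 and $2 parameters from the previous section with actual values. The class SearchAndReplaceSketch and the helper substitute are hypothetical names for this illustration. It relies on Matcher.appendReplacement and Matcher.appendTail, and on Matcher.quoteReplacement so that a $ or \ in a value is inserted literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SearchAndReplaceSketch {

    // Hypothetical helper: replaces each $N in the text with the N-th value (1-based).
    // Parameters without a corresponding value are left untouched.
    static String substitute(String text, String... values) {
        Matcher matcher = Pattern.compile("\\$([0-9])").matcher(text);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            int index = Integer.parseInt(matcher.group(1)) - 1;
            String replacement = (index >= 0 && index < values.length)
                    ? values[index]
                    : matcher.group();
            // quoteReplacement keeps $ and \ in the replacement literal
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        // Copy the rest of the input after the last match
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        // prints: I love Java
        System.out.println(substitute("$1 love $2", "I", "Java"));
        // In a replacement string, $1 refers to capturing group 1; prints: <1> love <2>
        System.out.println("$1 love $2".replaceAll("\\$([0-9])", "<$1>"));
    }
}
```

For simple cases, String.replaceAll or Matcher.replaceAll is enough. The explicit loop is only needed when the replacement has to be computed for each match.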
Flag
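Flags can be embedded in the pattern expression itself or passed as the second argument of Pattern.compile (for instance Pattern.CASE_INSENSITIVE).
Example with a case-insensitive match for the letters xxx: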
Pattern pattern = Pattern.compile("(?i:xxx)");
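The embedded flags are listed in the Pattern class documentation:
(?idmsux-idmsux:X)
where X is matched as a non-capturing group, with the flags i, d, m, s, u, x listed before the - turned on and those listed after it turned off.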
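The following sketch compares an embedded flag with the equivalent Pattern.compile flag and shows a flag being switched off for a sub-expression. The class FlagSketch and the helper fullMatch are hypothetical names for this illustration.

```java
import java.util.regex.Pattern;

class FlagSketch {

    // Hypothetical helper: true when the regex matches the whole input
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Embedded flag: case-insensitive inside the group
        System.out.println(fullMatch("(?i:xxx)", "XxX"));   // true
        // Without the flag the match is case-sensitive
        System.out.println(fullMatch("xxx", "XxX"));        // false
        // Same flag passed to Pattern.compile instead of being embedded
        System.out.println(
            Pattern.compile("xxx", Pattern.CASE_INSENSITIVE).matcher("XxX").matches()); // true
        // (?i) turns the flag on for the rest of the pattern, (?-i:b) turns it off for b only
        System.out.println(fullMatch("(?i)a(?-i:b)c", "AbC")); // true
        System.out.println(fullMatch("(?i)a(?-i:b)c", "ABC")); // false
    }
}
```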