About

The regular expression implementation in Java is modeled on Perl's regular expression syntax (it is not the PCRE library itself). See the "Comparison to Perl 5" section of the Pattern class documentation.

Regular expressions are managed within Java with the following classes:

  • Pattern, which compiles a string into a regular expression pattern
  • Matcher, which applies a pattern to an input and manages the matches

The Pattern class documentation describes the full syntax for writing a regular expression pattern.
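
As a minimal sketch of how the two classes work together: the class name PatternMatcherSketch, the helper names and the sample strings below are made up for illustration. Pattern.compile, Pattern.matcher, Matcher.find, Matcher.matches and Matcher.group are the standard java.util.regex calls.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternMatcherSketch {

    // Returns the first substring of input that matches regex, or null when nothing matches.
    static String firstMatch(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);  // compile the string into a pattern
        Matcher matcher = pattern.matcher(input);  // bind the pattern to an input
        return matcher.find() ? matcher.group() : null;
    }

    // Returns true only when the whole input matches regex.
    static boolean matchesEntirely(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("[0-9]+", "order 42 shipped"));      // 42
        System.out.println(matchesEntirely("[0-9]+", "order 42 shipped")); // false
        System.out.println(matchesEntirely("[0-9]+", "42"));               // true
    }
}
```

find() looks for the next matching subsequence anywhere in the input, whereas matches() requires the entire input to match.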

Snippet

Group

See also: Regular Expression - Group (Capture|Substitution)

Capturing groups are numbered by counting their opening parentheses from left to right.

Example: the expression ((A)(B(C))) contains four such groups:

  • 1: ((A)(B(C)))
  • 2: (A)
  • 3: (B(C))
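  • 4: (C)

// Assert is assumed to be JUnit 4's org.junit.Assert
Pattern pattern = Pattern.compile("((A)(B(C)))");
Matcher matcher = pattern.matcher("myInputString");
// groupCount() is derived from the pattern, so it does not depend on the input
Assert.assertEquals("There are 4 groups in this pattern", 4, matcher.groupCount());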

Iterate

Example of how to iterate over the matches of a pattern and over the groups of each match.

String sourceText = "ABC";
Pattern pattern = Pattern.compile("((A)(B(C)))");
Matcher matcher = pattern.matcher(sourceText);

// Number of matches found so far
int seqCounter = 0;
// Each call to find() moves the matcher to the next match
while (matcher.find()) {

	seqCounter++;

	System.out.println();
	System.out.println("Sequence " + seqCounter);

	// Group 0 is the whole match; groups 1..groupCount() are the capturing groups
	for (int i = 0; i <= matcher.groupCount(); i++) {

		System.out.print(i + " : " + matcher.group(i));
		if (i == 0) {
			System.out.println(" (the index 0 contains the whole string that matches)");
		} else {
			System.out.println();
		}
	}

}

Output:

Sequence 1
0 : ABC (the index 0 contains the whole string that matches)
1 : ABC
2 : A
3 : BC
4 : C

Parsing bash parameters

String text = "$1 love $2";
Pattern pattern = Pattern.compile("(\\$([0-9]))");
// to catch $1 and $2
Matcher matcher = pattern.matcher(text);
System.out.println("There are "+matcher.groupCount()+" groups in the pattern");
int matchCount = 0;
while (matcher.find()) {
	matchCount++;
	System.out.println("Values for match "+matchCount);
	for (int i=0;i<=matcher.groupCount();i++) {
		System.out.println(" - Group "+i+": "+ matcher.group(i));
		if (i==0){
			System.out.println("  Group 0 always represents the whole match");
		}
	}
}

Output:

There are 2 groups in the pattern
Values for match 1
 - Group 0: $1
  Group 0 always represents the whole match
 - Group 1: $1
 - Group 2: 1
Values for match 2
 - Group 0: $2
  Group 0 always represents the whole match
 - Group 1: $2
 - Group 2: 2

Search and replace
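
A minimal sketch of search and replace, assuming the standard Matcher.replaceAll method, where $1 and $2 in the replacement string refer to the capturing groups of the pattern. The class name SearchReplaceSketch, the helper names and the sample strings are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SearchReplaceSketch {

    // Swaps two whitespace-separated words using the group references $1 and $2.
    static String swapWords(String input) {
        Pattern pattern = Pattern.compile("(\\w+)\\s+(\\w+)");
        return pattern.matcher(input).replaceAll("$2 $1");
    }

    // Replaces every run of digits with a literal text. quoteReplacement escapes
    // '$' and '\' so that they are not interpreted as group references.
    static String maskNumbers(String input, String literal) {
        return Pattern.compile("[0-9]+")
                .matcher(input)
                .replaceAll(Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        System.out.println(swapWords("hello world"));          // world hello
        System.out.println(maskNumbers("pay 100 now", "$$"));  // pay $$ now
    }
}
```

When the replacement text comes from user input, passing it through Matcher.quoteReplacement avoids an accidental group reference.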

Flag

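Flags can be embedded in the pattern expression. They can also be passed as the second argument of Pattern.compile, for example Pattern.CASE_INSENSITIVE.

Example with a case-insensitive match for the letters xxx: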

Pattern pattern = Pattern.compile("(?i:xxx)");

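The embedded flag syntax is listed in the Pattern class documentation:

(?idmsux-idmsux:X)

where X is matched as a non-capturing group, with the flags listed before the - turned on and the flags listed after it turned off (available flags: i d m s u x).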
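
The flag syntax can be sketched as follows. The class name FlagSketch, the helper names and the sample inputs are made up for illustration; Pattern.compile(String, int) and Pattern.CASE_INSENSITIVE are part of the standard java.util.regex API.

```java
import java.util.regex.Pattern;

public class FlagSketch {

    // Case-insensitive flag embedded in the expression itself.
    static boolean embeddedFlag(String input) {
        return Pattern.compile("(?i:xxx)").matcher(input).matches();
    }

    // The same flag passed as an argument to Pattern.compile.
    static boolean compileFlag(String input) {
        return Pattern.compile("xxx", Pattern.CASE_INSENSITIVE).matcher(input).matches();
    }

    // (?i) turns the flag on for the rest of the expression,
    // then (?-i:b) turns it off for the inner non-capturing group only.
    static boolean flagTurnedOff(String input) {
        return Pattern.compile("(?i)a(?-i:b)").matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(embeddedFlag("XxX"));  // true
        System.out.println(compileFlag("XXX"));   // true
        System.out.println(flagTurnedOff("Ab"));  // true
        System.out.println(flagTurnedOff("AB"));  // false
    }
}
```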