# Regular Expression - Group (Capture|Substitution)

## 1 - About

A group is a part of a regular expression that normally captures the text it matches during parsing. You can then (extract|reference) the captured content.

Groups are delimited by parentheses.

## 2 - Articles Related

## 3 - Syntax

Every group must begin with an opening parenthesis and end with a closing parenthesis. Groups can be nested.

```
(myRegexp0 ( myRegexp1) ( myRegexp2) )
```

Construct | Definition |
---|---|
(?<name>X) | X, as a named-capturing group |
Non-Capturing | |
(?:X) | X, as a non-capturing group |
(?>X) | X, as an independent, non-capturing group |
Assertion (See Regexp - Look-around group (Assertion) - ( Lookahead \| Lookbehind )) | |
(?=X) | X, positive lookahead (via zero-width) |
(?!X) | X, negative lookahead (via zero-width) |
(?<=X) | X, positive lookbehind (via zero-width) |
(?<!X) | X, negative lookbehind (via zero-width) |
Flag | |
(?idmsuxU-idmsuxU) | Nothing, but turns match flags i d m s u x U on - off |
(?idmsux-idmsux:X) | X, as a non-capturing group with the given flags i d m s u x on - off |
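
The flag letters in the last two rows are the ones accepted by Java's java.util.regex.Pattern. As a minimal sketch, assuming that engine (the class and method names below are made up for illustration), the difference between the two flag constructs could look like this:

```java
import java.util.regex.Pattern;

class FlagGroupDemo {
    // (?i) switches case-insensitive matching on for the rest of the pattern.
    static boolean inlineFlag(String input) {
        return Pattern.matches("(?i)hello", input);
    }

    // (?i:h) applies the flag only inside the non-capturing group.
    static boolean scopedFlag(String input) {
        return Pattern.matches("(?i:h)ello", input);
    }

    public static void main(String[] args) {
        System.out.println(inlineFlag("HELLO")); // true
        System.out.println(scopedFlag("Hello")); // true
        System.out.println(scopedFlag("HELLO")); // false: "ello" stays case-sensitive
    }
}
```

Other engines may accept a different set of flag letters, so treat the flags as engine-specific.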

## 4 - Name

By default, a group is identified by its index, but you can give it a name with the following syntax:

```
(?<name>X)
```

where X is the regular expression pattern that you want to capture.

It's called a named-capturing group.
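
As an illustration, here is a minimal sketch assuming Java's java.util.regex, which accepts this syntax. The pattern, the group names year and month, and the class name are hypothetical. A named group can be read by name, and it still keeps its index:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    // Two named-capturing groups: "year" (index 1) and "month" (index 2).
    private static final Pattern DATE =
            Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})");

    // Read a group by its name.
    static String year(String input) {
        Matcher m = DATE.matcher(input);
        return m.find() ? m.group("year") : null;
    }

    // A named group is still reachable by its index.
    static String monthByIndex(String input) {
        Matcher m = DATE.matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(year("2024-05"));         // 2024
        System.out.println(monthByIndex("2024-05")); // 05
    }
}
```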

## 5 - Index

Capturing groups are numbered by counting their opening parentheses from left to right.

In the expression ((A)(B(C))), for example, there are the following groups:

- 0 - Group zero always stands for the entire expression - ((A)(B(C)))
- 1 - ((A)(B(C)))
- 2 - (A)
- 3 - (B(C))
- 4 - (C)
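
The numbering above can be checked with a short sketch, assuming Java's java.util.regex (the class and method names are illustrative only). It lists every group from index 0 up to the group count:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupIndexDemo {
    // Returns the text of group 0..groupCount() when the whole input matches.
    static List<String> groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();
        if (m.matches()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                result.add(m.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Index:            0    1    2  3   4
        // prints [ABC, ABC, A, BC, C]
        System.out.println(groups("((A)(B(C)))", "ABC"));
    }
}
```

Groups 0 and 1 hold the same text here only because the outermost parentheses wrap the whole expression.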

## 6 - Non-Capturing

### 6.1 - Basic

A non-capturing group is not indexed.

In the expression (?:A)(B)(C), for example, there are the following groups:

- 0 - Group zero always stands for the entire expression - (?:A)(B)(C)
- 1 - (B)
- 2 - (C)

The group (?:A) was not captured.
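
A minimal sketch of the same expression, assuming Java's java.util.regex (the class and method names are hypothetical), shows that the non-capturing group takes part in the match but is skipped by the numbering:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonCapturingDemo {
    // Number of capturing groups declared in the pattern (group 0 is not counted).
    static int countGroups(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Text of the given group when the whole input matches, else null.
    static String group(String regex, String input, int index) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(index) : null;
    }

    public static void main(String[] args) {
        System.out.println(countGroups("(A)(B)(C)"));         // 3
        System.out.println(countGroups("(?:A)(B)(C)"));       // 2
        System.out.println(group("(?:A)(B)(C)", "ABC", 0));   // ABC
        System.out.println(group("(?:A)(B)(C)", "ABC", 1));   // B
        System.out.println(group("(?:A)(B)(C)", "ABC", 2));   // C
    }
}
```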

### 6.2 - Look-around
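
Look-around groups (the assertion constructs of the syntax table) are non-capturing as well: they check what (follows|precedes) the current position without consuming it and without creating an indexed group. A minimal sketch, assuming Java's java.util.regex; the class name, patterns and sample strings are illustrative only:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookAroundDemo {
    // Positive lookahead: match "foo" only when "bar" follows; "bar" is not consumed.
    static String fooBeforeBar(String input) {
        Matcher m = Pattern.compile("foo(?=bar)").matcher(input);
        return m.find() ? m.group() : null;
    }

    // Positive lookbehind: match digits only when a dollar sign precedes them.
    static String digitsAfterDollar(String input) {
        Matcher m = Pattern.compile("(?<=\\$)\\d+").matcher(input);
        return m.find() ? m.group() : null;
    }

    // Look-around groups are not indexed: this pattern has no capturing group.
    static int lookaheadGroupCount() {
        return Pattern.compile("foo(?=bar)").matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(fooBeforeBar("foobar"));         // foo
        System.out.println(fooBeforeBar("foobaz"));         // null
        System.out.println(digitsAfterDollar("cost: $42")); // 42
        System.out.println(lookaheadGroupCount());          // 0
    }
}
```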

## 7 - Substitution

When you want to use the content of a captured group, you will generally use one of the following substitution constructs (the exact syntax depends on the regular expression engine):

- ${n} for the group index
- ${groupName} for the group name

When using a group index, the braces are required when:

- the group index is greater than 9
- a literal digit directly follows the substitution

Outside of these cases, the braces are not always mandatory:

- $n for the group index
- $groupName for the group name

There is also a shorthand notation for groups up to 9.

Symbol | Definition |
---|---|
\0 | backreference to the entire expression |
\1 | backreference to group 1 |
\2 | backreference to group 2 |
\n | backreference to group n |
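
Which of these notations is accepted depends on the tool. As a hedged sketch, assuming Java's java.util.regex: the replacement string accepts $n and ${groupName}, while \n is a backreference used inside the pattern itself. The class name, method names and group names below are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SubstitutionDemo {
    // $1 and $2 refer to the captured groups by index in the replacement string.
    static String swapByIndex(String input) {
        return input.replaceAll("([^ ]+) (.*)", "$2 $1");
    }

    // ${name} refers to a named-capturing group in the replacement string.
    static String swapByName(String input) {
        return input.replaceAll("(?<first>[^ ]+) (?<rest>.*)", "${rest} ${first}");
    }

    // \1 inside the pattern matches the same text that group 1 captured.
    static String firstDoubledChar(String input) {
        Matcher m = Pattern.compile("(\\w)\\1").matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(swapByIndex("Hello World"));  // World Hello
        System.out.println(swapByName("Hello World"));   // World Hello
        System.out.println(firstDoubledChar("Hello"));   // ll
    }
}
```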

## 8 - Example

The regular expression below has two groups:

```
([^ ]+) (.*)
```

where:

- the first group is [^ ]+ which matches one or more non-space characters.
- the second group is .* which matches all remaining characters.

If you parse the following text:

```
Hello World
```

You will get:

- in the first group, \1, the text Hello
- and in the second group, \2, the text World
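
This result can be reproduced with a minimal sketch, assuming Java's java.util.regex and a first group written as [^ ]+ (one or more non-space characters). The class and method names are illustrative only:

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HelloWorldDemo {
    // Returns {group 1, group 2}, or an empty array when the input does not match.
    static String[] capture(String input) {
        Matcher m = Pattern.compile("([^ ]+) (.*)").matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(capture("Hello World"))); // [Hello, World]
    }
}
```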

See more examples here: Notepad++ - Replace with Regular Expression