Grok is an extension of regular expressions that lets you name an expression and use it as a variable, so that it can be reused in other expressions.
In this example, we will construct an expression that matches the parts of a date string.
The statement below assigns a regular expression to the name MONTHDAY in grok:
MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])
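Every (?:___) is a non-capturing group: it groups its content without storing the match.
With the same syntax, we can define the month number and the year (the (?>___) in YEAR is an atomic group, which is also non-capturing):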
MONTHNUM (?:0?[1-9]|1[0-2])
YEAR (?>\d\d){1,2}
Now that we have defined the parts of our date string, we can reuse the previous variables to create a compound expression:
DATE_EU %{MONTHDAY}[./-]%{MONTHNUM}[./-]%{YEAR}
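The reuse mechanism can be sketched in a few lines of Python. This is only an illustration of the substitution idea, not how any grok implementation works internally: the expand function and the PATTERNS dictionary are names invented for this sketch, and YEAR is rewritten with a plain non-capturing group so that the example also runs on Python versions that lack atomic groups.

```python
import re

# Pattern definitions copied from the text. YEAR is written with a plain
# non-capturing group because Python's re only accepts the atomic group
# (?>...) from version 3.11 onward; it matches the same strings here.
PATTERNS = {
    "MONTHDAY": r"(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])",
    "MONTHNUM": r"(?:0?[1-9]|1[0-2])",
    "YEAR": r"(?:\d\d){1,2}",
    "DATE_EU": r"%{MONTHDAY}[./-]%{MONTHNUM}[./-]%{YEAR}",
}

# A reference to another pattern, such as %{MONTHDAY}.
REFERENCE = re.compile(r"%\{(\w+)\}")


def expand(expression, patterns):
    """Replace every %{NAME} reference with its definition, recursively."""
    def substitute(match):
        return expand(patterns[match.group(1)], patterns)
    return REFERENCE.sub(substitute, expression)


date_eu = re.compile(expand("%{DATE_EU}", PATTERNS))
print(bool(date_eu.fullmatch("31/12/2023")))  # True
print(bool(date_eu.fullmatch("31/13/2023")))  # False: 13 is not a month
```

Expanding the references first and compiling once keeps the final regular expression an ordinary one, which is why a compound pattern costs nothing extra at match time.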
In this example, we show an expression used to parse a web log (i.e. a web server request log).
Example: the line below is a sample log entry.
55.3.244.1 GET /index.html 15824 0.043
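where the fields are, in order: the client IP address, the HTTP method, the request path, the number of bytes and the duration.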
To parse this line, the below grok expression can be used.
%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}
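where each %{PATTERN:name} element matches the built-in pattern PATTERN and stores the matched text in the field called name (for instance, %{IP:client} stores the IP address in the field client).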
The output will be:
client: 55.3.244.1
method: GET
request: /index.html
bytes: 15824
duration: 0.043
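The extraction of these fields can be reproduced with a minimal Python sketch. It assumes simplified stand-ins for the IP, WORD, URIPATHPARAM and NUMBER patterns (the built-in definitions are more thorough), and grok_to_regex is a helper name invented for this illustration.

```python
import re

# Simplified stand-ins for the built-in patterns, written for this sketch.
# The real definitions in logstash-patterns-core are more thorough.
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",   # IPv4 only
    "WORD": r"\w+",
    "URIPATHPARAM": r"/\S*",
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
}

# One %{PATTERN:name} element of a grok expression.
REFERENCE = re.compile(r"%\{(\w+):(\w+)\}")


def grok_to_regex(expression):
    """Turn each %{PATTERN:name} into a Python named group (?P<name>...)."""
    def substitute(match):
        pattern_name, field = match.groups()
        return "(?P<{}>{})".format(field, PATTERNS[pattern_name])
    return re.compile(REFERENCE.sub(substitute, expression))


GROK = "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"
line = "55.3.244.1 GET /index.html 15824 0.043"

match = grok_to_regex(GROK).fullmatch(line)
fields = match.groupdict() if match else {}
print(fields)
```

Every value in fields is a string, including bytes and duration, because a regular expression only captures text.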
Grok patterns are used to extract information from log files.
The extracted data can then be used to create telemetry metrics, for example in a Prometheus exporter.
This section is about the syntax of a grok expression. Grok was first introduced by Logstash 1).
In a grok expression, you may use:
%{patternName:variableName[:type]}
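where patternName is the name of the pattern to match, variableName is the name of the field that receives the matched text, and type is an optional conversion applied to that text (Logstash supports int and float; without it, the value stays a string).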
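To make the three parts of this syntax concrete, here is a small Python sketch that splits one reference into its components and applies the optional type. The parse_reference and convert helpers and the CONVERTERS table are hypothetical names for this illustration; they are not part of any grok library.

```python
import re

# Hypothetical converter table for this sketch; Logstash's grok filter
# documents "int" and "float" as the supported conversions.
CONVERTERS = {"int": int, "float": float}

# Matches %{patternName:variableName} and %{patternName:variableName:type}.
REFERENCE = re.compile(
    r"%\{(?P<pattern>\w+):(?P<variable>\w+)(?::(?P<type>\w+))?\}"
)


def parse_reference(reference):
    """Split one grok reference into its pattern, variable and type parts."""
    match = REFERENCE.fullmatch(reference)
    if match is None:
        raise ValueError("not a grok reference: " + reference)
    return match.group("pattern"), match.group("variable"), match.group("type")


def convert(value, type_name):
    """Apply the optional type; captured text stays a string by default."""
    return CONVERTERS[type_name](value) if type_name else value


print(parse_reference("%{NUMBER:bytes:int}"))  # ('NUMBER', 'bytes', 'int')
print(parse_reference("%{IP:client}"))         # ('IP', 'client', None)
print(convert("15824", "int"))                 # 15824
print(convert("0.043", "float"))               # 0.043
```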
When no standard pattern fits your data, you can write your own expression 2) as a named capturing group.
Example:
(?<variableName>the pattern here)
where variableName is the name of the field that receives the matched text.
If you want to be able to reuse your expression, you can create a custom pattern file in the form:
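For comparison, the same construct can be tried out in Python, with one spelling difference: Python's re module writes a named group as (?P<name>...) rather than (?<name>...). The field name queue_id and the sample log text are made up for this sketch.

```python
import re

# Grok writes a named group as (?<name>...); Python's re module spells
# the same construct (?P<name>...).
# queue_id is a hypothetical field name chosen for this sketch.
QUEUE_ID = re.compile(r"(?P<queue_id>[0-9A-F]{10,11})")

match = QUEUE_ID.search("postfix/cleanup: BEF25A72965: message accepted")
print(match.group("queue_id"))  # BEF25A72965
```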
patternName regularExpression
Example:
MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])
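A pattern file in this form is easy to read programmatically: the name is everything before the first whitespace, and the rest of the line is the regular expression. The sketch below assumes that blank lines and lines starting with # are to be ignored; load_patterns and the sample content are hypothetical.

```python
def load_patterns(text):
    """Parse pattern-file text into a {patternName: regularExpression} dict."""
    patterns = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments (an assumption here)
        name, regex = line.split(None, 1)  # split at the first whitespace run
        patterns[name] = regex
    return patterns


# Hypothetical file content; a real file would be read from disk.
SAMPLE = """
# day of the month, with or without a leading zero
MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])
MONTHNUM (?:0?[1-9]|1[0-2])
"""

patterns = load_patterns(SAMPLE)
print(patterns["MONTHNUM"])  # (?:0?[1-9]|1[0-2])
```

Splitting only once matters because the regular expression itself may contain spaces.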
You can see the built-in pattern files in the logstash-patterns-core repository.
If you want to test your grok expression, see:
This section lists the known filter operations that implement Grok.