Grok

Regexp

About

Grok is an extension of regular expressions that lets you store expressions in named variables so that they can be reused.

Example

A time variable expression

In this example, we will build an expression that matches the date part of a time string.

The statement below assigns in grok:

  • the expression (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])
  • to the MONTHDAY variable

MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])

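Every (?:___) is a non-capturing group: it groups the alternatives without storing the matched text.

With the same syntax, we can define the month number and the year. In YEAR, (?>___) is an atomic group, which is also non-capturing.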

MONTHNUM (?:0?[1-9]|1[0-2])
YEAR (?>\d\d){1,2}

Now that we have defined the parts of our date string, we can reuse the previous variables to create a compound expression:

DATE_EU %{MONTHDAY}[./-]%{MONTHNUM}[./-]%{YEAR}
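The composition above can be sketched in Python as a small expander that replaces each %{NAME} reference with its definition. This is only an illustration of the idea, not how Logstash implements grok: the expand helper and the PATTERNS dictionary are hypothetical names, and the atomic group in YEAR is rewritten as a plain non-capturing group so that the sketch also runs on Python versions older than 3.11.

```python
import re

# Definitions copied from the text, except YEAR: (?>...) becomes (?:...)
# (assumption made for portability of this sketch).
PATTERNS = {
    "MONTHDAY": r"(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])",
    "MONTHNUM": r"(?:0?[1-9]|1[0-2])",
    "YEAR": r"(?:\d\d){1,2}",
    "DATE_EU": r"%{MONTHDAY}[./-]%{MONTHNUM}[./-]%{YEAR}",
}

# Matches a bare reference such as %{MONTHDAY}.
REFERENCE = re.compile(r"%\{(\w+)\}")


def expand(name: str) -> str:
    """Recursively replace every %{NAME} reference with its definition."""
    return REFERENCE.sub(lambda m: expand(m.group(1)), PATTERNS[name])


date_eu = re.compile(expand("DATE_EU"))
print(bool(date_eu.fullmatch("25/12/2023")))  # True
print(bool(date_eu.fullmatch("25/13/2023")))  # False: 13 is not a month
```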

Web Log

In this example, we show an expression used to parse a web log (i.e. a web server request log).

The line below is an example of such a log entry:

55.3.244.1 GET /index.html 15824 0.043

where there is, in order: a client IP address, an HTTP method, a request path, a number of bytes and a duration.

To parse this line, the grok expression below can be used:

%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}

where:

  • the IP address is matched with the IP base pattern and is stored in the variable client
  • the method is matched with the WORD pattern (i.e. a word) and is stored in the variable method
  • and so on

The output will be:

  • client: 55.3.244.1
  • method: GET
  • request: /index.html
  • bytes: 15824
  • duration: 0.043
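To see what the expression does, the same extraction can be approximated with a plain Python regular expression that uses named groups. The sub-expressions below are simplified stand-ins chosen for this sketch; the real IP, WORD, URIPATHPARAM and NUMBER patterns are stricter.

```python
import re

# Simplified stand-ins for the built-in patterns (assumptions for this sketch).
LOG_LINE = re.compile(
    r"(?P<client>\d{1,3}(?:\.\d{1,3}){3}) "   # roughly %{IP:client}
    r"(?P<method>\w+) "                       # roughly %{WORD:method}
    r"(?P<request>/\S*) "                     # roughly %{URIPATHPARAM:request}
    r"(?P<bytes>\d+) "                        # roughly %{NUMBER:bytes}
    r"(?P<duration>\d+(?:\.\d+)?)"            # roughly %{NUMBER:duration}
)

line = "55.3.244.1 GET /index.html 15824 0.043"
match = LOG_LINE.fullmatch(line)
fields = match.groupdict() if match else {}

for name, value in fields.items():
    print(f"{name}: {value}")
```

Every value comes back as a string, which mirrors the output listed above.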

Syntax

This section is about the syntax of a grok expression. Grok was first introduced by Logstash.

In a grok expression, you may use:

Standard Pattern

%{patternName:variableName[:type]}

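where:

  • patternName is the name of the pattern that matches the text (for instance IP or WORD)
  • variableName is the name of the variable that receives the matched text
  • type is an optional conversion of the value (Logstash supports int and float; without it, the value stays a string)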
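As an illustration of this syntax, the sketch below splits each reference of a grok expression into its three parts and applies the optional type conversion. The parse_tokens and convert helpers are hypothetical, and the sketch assumes that names contain only word characters.

```python
import re

# Hypothetical tokenizer for %{patternName:variableName[:type]} references.
TOKEN = re.compile(
    r"%\{(?P<pattern>\w+)(?::(?P<variable>\w+))?(?::(?P<type>\w+))?\}"
)

# int and float are the conversions documented for Logstash.
CONVERTERS = {"int": int, "float": float}


def parse_tokens(expression):
    """Return (patternName, variableName, type) for every reference."""
    return [
        (m["pattern"], m["variable"], m["type"])
        for m in TOKEN.finditer(expression)
    ]


def convert(value, type_name):
    """Apply the optional conversion; the value stays a string otherwise."""
    return CONVERTERS.get(type_name, str)(value)


tokens = parse_tokens("%{IP:client} %{NUMBER:bytes:int} %{NUMBER:duration:float}")
print(tokens)
print(convert("15824", "int"), convert("0.043", "float"))
```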

Custom Pattern

When the pattern does not exist among the standard patterns, you can create your own expression by writing a named capturing group.

Example:

 (?<variableName>the pattern here)

where variableName is the variable name
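For comparison, here is a minimal Python sketch of the same idea. Python's re module writes a named group as (?P<name>___) instead of (?<name>___); the queue_id variable and the sample line are made up for this illustration.

```python
import re

# Grok form:   (?<queue_id>[0-9A-F]{10,11})
# Python form: (?P<queue_id>[0-9A-F]{10,11})
custom = re.compile(r"(?P<queue_id>[0-9A-F]{10,11})")

sample = "cleanup BEF25A72965 queued"  # hypothetical log line
found = custom.search(sample)
queue_id = found.group("queue_id") if found else None
print(queue_id)  # BEF25A72965
```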

Custom Pattern File

If you want to reuse your expression, you can create a custom pattern file in which each line has the form:

patternName regularExpression

Example:

MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])
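A pattern file in this form is easy to read programmatically. The sketch below loads such a file into a dictionary; load_patterns is a hypothetical helper, and it assumes that blank lines and lines starting with # can be ignored.

```python
def load_patterns(text):
    """Parse 'patternName regularExpression' lines into a dictionary."""
    patterns = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines carry no pattern.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            name, regex = parts
            patterns[name] = regex
    return patterns


sample_file = """\
# date parts
MONTHDAY (?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])
MONTHNUM (?:0?[1-9]|1[0-2])
"""

patterns = load_patterns(sample_file)
print(sorted(patterns))  # ['MONTHDAY', 'MONTHNUM']
```

Splitting only on the first whitespace run keeps any spaces inside the regular expression itself intact.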

You can see the built-in pattern files in the logstash-patterns-core repository.

Grok Debug / Editor App

If you want to test your grok expression, see:

Filter

This section lists the known filter operations that implement Grok.




