Fluent Bit 1) from Calyptia is a log collector (i.e. an observability pipeline tool) written in C that runs on Linux and Windows.
It is the successor to Fluentd, with a smaller memory footprint 2).
To parse a log file, you need to describe its format with a parser.
Parsers are defined in a parser file 3).
A regex parser definition is based on named regular expression groups.
Example: in the parser definition, the Regex key holds the regular expression:
[PARSER]
    Name myparser
    Format regex
    Regex ^(?<INT>[^ ]+) (?<FLOAT>[^ ]+) (?<BOOL>[^ ]+) (?<STRING>.+)$
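where each regular expression group follows the same pattern: (?<name>pattern). The group name becomes the key of the parsed field, and the text matched by the pattern becomes its value.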
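To illustrate what the named groups capture, here is a minimal Python sketch that applies the regex of the example parser to a log line. It is not how Fluent Bit itself evaluates the expression: Python's re module writes named groups as (?P<name>...), so the helper to_python_regex (a hypothetical name) rewrites the group syntax first. The sample line and the function names are assumptions made for the illustration.

```python
import re

# Regex copied from the [PARSER] example (Ruby-style named groups).
FLUENT_BIT_REGEX = r"^(?<INT>[^ ]+) (?<FLOAT>[^ ]+) (?<BOOL>[^ ]+) (?<STRING>.+)$"


def to_python_regex(pattern: str) -> str:
    """Rewrite (?<name>...) groups as Python's (?P<name>...) form.

    Lookbehinds such as (?<= and (?<! are left untouched because a
    group name must start with a letter or an underscore.
    """
    return re.sub(r"\(\?<([A-Za-z_]\w*)>", r"(?P<\1>", pattern)


def parse_line(pattern: str, line: str) -> dict:
    """Return the named groups as a dict, or an empty dict if no match."""
    match = re.match(to_python_regex(pattern), line)
    return match.groupdict() if match else {}


# Hypothetical log line, chosen only to exercise the four groups.
record = parse_line(FLUENT_BIT_REGEX, "100 0.5 true This is an example")
print(record)
```

Each group name (INT, FLOAT, BOOL, STRING) ends up as a key of the resulting record. In this sketch every captured value is a plain string.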
The Fluent Bit repository already contains example regex parser files that you can use as a starting point for your own parser file.
The files are available in the conf directory on GitHub, and their names start with parsers.
Example: extract of the parsers.conf file for Apache and Nginx:
[PARSER]
    Name apache2
    Format regex
    Regex ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>.*)")?$
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z
[PARSER]
    Name nginx
    Format regex
    Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z
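As a rough check of the apache2 parser, the following Python sketch runs its regex against an access log line and then parses the field named by Time_Key with the Time_Format string. It only approximates what Fluent Bit does: the named groups are rewritten to Python's (?P<name>...) syntax, the sample line is made up, and parse_apache_line and the timestamp key are hypothetical names used for the illustration.

```python
import re
from datetime import datetime

# Regex and Time_Format copied from the apache2 [PARSER] entry; only the
# named groups are rewritten from (?<name>...) to Python's (?P<name>...).
APACHE2_REGEX = (
    r'^(?P<host>[^ ]*) [^ ]* (?P<user>[^ ]*) \[(?P<time>[^\]]*)\] '
    r'"(?P<method>\S+)(?: +(?P<path>[^ ]*) +\S*)?" (?P<code>[^ ]*) '
    r'(?P<size>[^ ]*)(?: "(?P<referer>[^\"]*)" "(?P<agent>.*)")?$'
)
TIME_KEY = "time"
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_apache_line(line: str) -> dict:
    """Return the named groups plus a parsed timestamp, or {} if no match."""
    match = re.match(APACHE2_REGEX, line)
    if match is None:
        return {}
    record = match.groupdict()
    # Assumption: mimic Time_Key / Time_Format by parsing the captured text.
    record["timestamp"] = datetime.strptime(record[TIME_KEY], TIME_FORMAT)
    return record


# Hypothetical access log line in the combined log format.
sample = (
    '192.168.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /index.html HTTP/1.0" 200 2326 '
    '"http://example.com/" "Mozilla/5.0"'
)
record = parse_apache_line(sample)
for key, value in record.items():
    print(f"{key}: {value}")
```

The time group only isolates the text between the square brackets. Time_Format is what describes how that text is to be read as a date, which is why the two keys appear together in the parser definition.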