logstash

grok

Status: stable

Parse arbitrary text and structure it. Grok is currently the best way in logstash to parse crappy unstructured log data (like syslog or apache logs) into something structured and queryable.

Grok allows you to match text without needing to be a regular expressions ninja. Logstash ships with about 120 patterns by default, and you can trivially add your own (see the patterns_dir setting).
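As a rough illustration of what "matching without regular expressions" looks like, the sketch below assumes a log line such as `55.3.244.1 GET /index.html 15824 0.043`. A pattern is referenced as %{NAME}, and %{NAME:field} stores the matched text in an event field. The field names (client, method, request, bytes, duration) are made up for this example, and the pattern names are assumed to be among the shipped defaults.

```conf
filter {
  grok {
    # Each %{NAME:field} pair names a shipped pattern and the
    # (hypothetical) event field that receives the matched text.
    pattern => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}"
  }
}
```

With the sample line above, the event would be expected to gain a client field holding the IP address, a method field holding GET, and so on.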

Flags

This plugin provides the following flags:
--grok-patterns-path PATH
Colon-delimited path of patterns to load

Synopsis

This is what it might look like in your config file:
filter {
  grok {
    /[A-Za-z0-9_-]+/ => ... # string (optional)
    add_field => ... # hash (optional), default: {}
    add_tag => ... # array (optional), default: []
    break_on_match => ... # boolean (optional), default: true
    drop_if_match => ... # boolean (optional)
    keep_empty_captures => ... # boolean (optional)
    match => ... # hash (optional), default: {}
    named_captures_only => ... # boolean (optional), default: true
    pattern => ... # array (optional)
    patterns_dir => ... # array (optional), default: []
    tags => ... # array (optional), default: []
    type => ... # string (optional), default: ""
  }
}

Details

/[A-Za-z0-9_-]+/

  • The configuration attribute name here is anything that matches the above regular expression.
  • Value type is string
  • There is no default value for this setting.

Any existing field name can be used as a config name here for matching against.

# this config:
foo => "some pattern"

# same as:
match => [ "foo", "some pattern" ]

add_field

  • Value type is hash
  • Default value is {}

If this filter is successful, add any arbitrary fields to this event. Example:

filter {
  myfilter {
    add_field => [ "sample", "Hello world, from %{@source}" ]
  }
}

On success, myfilter will then add field 'sample' with the value above and the %{@source} piece replaced with that value from the event.

add_tag

  • Value type is array
  • Default value is []

If this filter is successful, add arbitrary tags to the event. Tags can be dynamic and include parts of the event using the %{field} syntax. Example:

filter {
  myfilter {
    add_tag => [ "foo_%{somefield}" ]
  }
}

If the event has field "somefield" == "hello", this filter, on success, would add a tag "foo_hello".

break_on_match

  • Value type is boolean
  • Default value is true

Break on first match. The first successful match by grok will result in the filter being finished. If you want grok to try all patterns (maybe you are parsing different things), then set this to false.
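A minimal sketch of turning this off, assuming events carry a hypothetical field named agent in addition to @message. The field and capture names are illustrative only, and how patterns are grouped internally is not described here, so treat this as an assumption about typical use rather than a statement of exact behavior.

```conf
filter {
  grok {
    # Matched against @message (see the 'pattern' setting).
    pattern => "%{IP:client} %{GREEDYDATA:rest}"
    # Matched against the hypothetical 'agent' field, using the
    # field-name shorthand described under Details.
    agent => "%{WORD:browser}"
    # Keep trying the remaining patterns after the first match.
    break_on_match => false
  }
}
```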

drop_if_match

  • Value type is boolean
  • There is no default value for this setting.

Drop if matched. Note, this feature may not stay. It is preferable to combine grok + grep filters to do parsing + dropping.

requested in: googlecode/issue/26
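A minimal sketch of the setting in use, assuming you want to discard events whose message contains a health-check marker. The literal text "healthcheck" and the client field name are made up for this example.

```conf
filter {
  grok {
    # Hypothetical: events that match this pattern are dropped.
    pattern => "%{IP:client} healthcheck"
    drop_if_match => true
  }
}
```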

keep_empty_captures

  • Value type is boolean
  • There is no default value for this setting.

If true, keep empty captures as event fields.

match

  • Value type is hash
  • Default value is {}

A hash of matches of field => value.
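For example, the following sketch matches the @message field against a syslog pattern. It uses the same [ field, pattern ] form shown for the field-name shorthand above; SYSLOGLINE is assumed to be one of the shipped patterns.

```conf
filter {
  grok {
    # field name first, then the pattern to apply to that field
    match => [ "@message", "%{SYSLOGLINE}" ]
  }
}
```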

named_captures_only

  • Value type is boolean
  • Default value is true

If true, only store named captures from grok.
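A small sketch of the difference, under the assumption that a capture counts as "named" when it is written as %{NAME:field}. With the default (true), only client would be stored from the pattern below. With the setting turned off, the unnamed %{NUMBER} capture would presumably be kept as well; the field name it ends up under is not stated here.

```conf
filter {
  grok {
    # 'client' is a named capture; the trailing %{NUMBER} is not.
    pattern => "%{IP:client} %{NUMBER}"
    # Also store captures that were not given a field name.
    named_captures_only => false
  }
}
```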

pattern

  • Value type is array
  • There is no default value for this setting.

Specify a pattern to parse with. This will match the '@message' field.

To match fields other than @message, use the 'match' setting. Multiple patterns are fine.
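A sketch of supplying more than one pattern, assuming a single input that carries both syslog and Apache access-log lines. Both pattern names are assumed to be among the shipped defaults.

```conf
filter {
  grok {
    # Each entry is tried against the @message field.
    pattern => [
      "%{SYSLOGLINE}",
      "%{COMBINEDAPACHELOG}"
    ]
  }
}
```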

patterns_dir

  • Value type is array
  • Default value is []

Specify one or more directories containing grok pattern files. Logstash ships with a set of patterns by default, so you only need to set this if you are adding patterns of your own.

Pattern files are plain text with format:

NAME PATTERN

For example:

NUMBER \d+
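Putting the two pieces together, here is a sketch that assumes a directory ./patterns containing a file with one custom pattern. The directory name, the POSTFIX_QUEUEID pattern, and the queue_id and syslog_message fields are all hypothetical and chosen only for illustration.

```conf
# Contents of a pattern file inside ./patterns (NAME PATTERN format):
#   POSTFIX_QUEUEID [0-9A-F]{10,11}

filter {
  grok {
    # Load custom patterns in addition to the shipped ones.
    patterns_dir => [ "./patterns" ]
    # The custom pattern is referenced exactly like a shipped one.
    pattern => "%{SYSLOGBASE} %{POSTFIX_QUEUEID:queue_id}: %{GREEDYDATA:syslog_message}"
  }
}
```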

tags

  • Value type is array
  • Default value is []

Only handle events with all of these tags. Note that if you specify a type, the event must also match that type. Optional.

type

  • Value type is string
  • Default value is ""

The type to act on. If a type is given, then this filter will only act on messages with the same type. See any input plugin's "type" attribute for more. Optional.
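A sketch combining type and tags, assuming an input that marks its events with type "apache" and that some earlier step has tagged them "access". Both values are placeholders.

```conf
filter {
  grok {
    # Only events of this type are considered...
    type => "apache"
    # ...and only if they also carry every tag listed here.
    tags => [ "access" ]
    pattern => "%{COMBINEDAPACHELOG}"
  }
}
```

Events with a different type, or without the access tag, would pass through this filter untouched.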


This is documentation from lib/logstash/filters/grok.rb