Parse arbitrary text and structure it. Grok is currently the best way in logstash to parse crappy unstructured log data (like syslog or apache logs) into something structured and queryable.
Grok allows you to match text without needing to be a regular expressions ninja. Logstash ships with about 120 patterns by default, and you can trivially add your own (see the patterns_dir setting).
filter {
  grok {
    /[A-Za-z0-9_-]+/ => ... # string (optional)
    add_field => ... # hash (optional), default: {}
    add_tag => ... # array (optional), default: []
    break_on_match => ... # boolean (optional), default: true
    drop_if_match => ... # boolean (optional)
    keep_empty_captures => ... # boolean (optional)
    match => ... # hash (optional), default: {}
    named_captures_only => ... # boolean (optional), default: true
    pattern => ... # array (optional)
    patterns_dir => ... # array (optional), default: []
    tags => ... # array (optional), default: []
    type => ... # string (optional), default: ""
  }
}
Any existing field name can be used as a config name here; its value is the pattern to match that field against.
# this config:
foo => "some pattern"
# same as:
match => [ "foo", "some pattern" ]
If this filter is successful, add any arbitrary fields to this event. Example:
filter {
  myfilter {
    add_field => [ "sample", "Hello world, from %{@source}" ]
  }
}
On success, myfilter will then add field 'sample' with the value above and the %{@source} piece replaced with that value from the event.
If this filter is successful, add arbitrary tags to the event. Tags can be dynamic and include parts of the event using the %{field} syntax. Example:
filter {
  myfilter {
    add_tag => [ "foo_%{somefield}" ]
  }
}
If the event has field "somefield" == "hello", this filter, on success, would add a tag "foo_hello".
Break on first match. The first successful match by grok will result in the filter being finished. If you want grok to try all patterns (maybe you are parsing different things), then set this to false.
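As a rough sketch of the behavior described above, the following config lists two patterns and turns off break_on_match so that grok tries both instead of stopping at the first hit. The capture names "client" and "duration" are made up for illustration, and the IP and NUMBER patterns are assumed to be among the shipped defaults.

```
filter {
  grok {
    # Two independent patterns; either, both, or neither may match an event.
    pattern => [
      "%{IP:client}",
      "%{NUMBER:duration}"
    ]
    # Keep trying the remaining patterns after the first successful match.
    break_on_match => false
  }
}
```

With the default (break_on_match => true), an event matching the first pattern would never be tested against the second.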
Drop if matched. Note, this feature may not stay. It is preferable to combine grok + grep filters to do parsing + dropping.
requested in: googlecode/issue/26
If true, keep empty captures as event fields.
The match setting is a hash of field => pattern pairs. (To load additional pattern files, specify a path to a directory containing them with patterns_dir.)
If true, only store named captures from grok.
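To illustrate, here is a small sketch assuming grok's usual %{PATTERN:name} capture syntax, where the part after the colon names the capture. The field name "duration" is hypothetical, and the claim that the unnamed %{WORD} capture is discarded is an assumption about how named_captures_only behaves in this version.

```
filter {
  grok {
    # %{NUMBER:duration} has an explicit name, so it is stored as field "duration".
    # %{WORD} has no name; with named_captures_only => true it is assumed
    # to be left out of the event.
    pattern => "%{WORD} took %{NUMBER:duration}"
    named_captures_only => true
  }
}
```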
Specify a pattern to parse with. This will match the '@message' field.
If you want to match fields other than @message, use the 'match' setting. Multiple patterns are fine.
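A minimal sketch of the two settings side by side: 'pattern' is applied to @message, while 'match' targets another field. The field "client_info" and the capture names are hypothetical, and SYSLOGLINE and IP are assumed to be among the shipped patterns.

```
filter {
  grok {
    # Matched against the '@message' field.
    pattern => "%{SYSLOGLINE}"
    # Matched against a different (hypothetical) field, using the
    # [ "field", "pattern" ] form.
    match => [ "client_info", "%{IP:client_ip}" ]
  }
}
```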
Logstash ships with a bunch of patterns by default, so you don't need to set patterns_dir unless you are adding patterns of your own.
Pattern files are plain text with format:
NAME PATTERN
For example:
NUMBER \d+
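Putting the pattern-file format to use might look like the sketch below. The directory /etc/logstash/patterns, the file name, the MYAPP_ID pattern, and the "request_id" capture name are all hypothetical and only there to show the shape of the config.

```
# Hypothetical file /etc/logstash/patterns/myapp, containing one line
# in NAME PATTERN format:
#
#   MYAPP_ID [0-9A-F]{8}

filter {
  grok {
    # Load every pattern file found in this directory.
    patterns_dir => [ "/etc/logstash/patterns" ]
    # The custom pattern can then be referenced like a shipped one.
    pattern => "%{MYAPP_ID:request_id}"
  }
}
```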
Only handle events with all of these tags. Note that if you specify a type, the event must also match that type. Optional.
The type to act on. If a type is given, then this filter will only act on messages with the same type. See any input plugin's "type" attribute for more. Optional.
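For example, a filter restricted by both settings could be sketched as follows. The type "apache" and tag "access" are illustrative values, and COMBINEDAPACHELOG is assumed to be among the shipped patterns.

```
filter {
  grok {
    # Only act on events whose type is "apache" ...
    type => "apache"
    # ... and which also carry every tag listed here.
    tags => [ "access" ]
    pattern => "%{COMBINEDAPACHELOG}"
  }
}
```

Events of any other type, or lacking the "access" tag, would pass through this filter untouched.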