elasticsearch - grok regex in logstash to parse and extract field

Question

posted Oct 7, 2021 in Technique by 深蓝 (71.8m points)

I am trying to extract certain fields from a single message field. I want to do this with a grok regex in Logstash so that I can view the fields in Kibana.

My log event looks like this:

[2021-01-06 12:10:40] ApiLogger.INFO: API log data: {"endpoint":"/rest/thre_en/V1/temp-carts/13cEIQqUb6cUfxB/tryer-inform","http_method":"GET","payload":[],"user_id":0,"user_type":4,"http_response_code":200,"response":"{"pay_methods":[{"code":"frane","title":"R2 Partial redeem"}],"totals":{"grand_total":0,"base_grand_total":0}}

The full log contains more key-value pairs. Basically, I need the following information:

1. timestamp (I am able to get this)
2. log level (I am able to get this); I only want INFO, not the entire ApiLogger.INFO
3. endpoint
4. http_method
5. user_id
6. user_type
7. http_response_code
8. response

I am not able to get items 3-8. I tested it, and it seems to be due to the colon (:). This is what I tried in the Grok Debugger:

%{SYSLOG5424SD:logtime} %{JAVACLASS:loglevel}: (?<API>\w+ \w+ \w+):

I tried URI and other patterns, but they did not work, maybe because of the colon.

question from:https://stackoverflow.com/questions/65600601/grok-regex-in-logstash-to-parse-and-extract-field

1 Reply

深蓝 · Answer 1 · 2021-10-06T18:48:01+0000

You can use

%{SYSLOG5424SD:logtime} ApiLogger\.%{LOGLEVEL:loglevel}: (?<API>\w+ \w+ \w+):\s*%{GREEDYDATA:json_field}

Then you can parse json_field with the JSON filter.
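As a minimal sketch of how the two steps could fit together in a pipeline, the filter section below applies the pattern above to the event text and then hands json_field to the JSON filter. The source field name message and the target name api are assumptions for illustration, not something the question specifies. If the captured text is not valid JSON, the json filter tags the event with _jsonparsefailure by default instead of adding fields.

```
filter {
  grok {
    # Split the line into timestamp, log level, label, and the raw JSON tail.
    # Assumes the raw log line is in the "message" field.
    match => { "message" => "%{SYSLOG5424SD:logtime} ApiLogger\.%{LOGLEVEL:loglevel}: (?<API>\w+ \w+ \w+):\s*%{GREEDYDATA:json_field}" }
  }
  json {
    # Parse the captured JSON text into structured fields.
    source => "json_field"
    # Assumed target name: nests the parsed keys under [api].
    target => "api"
  }
}
```

With target set this way, the requested values would appear as [api][endpoint], [api][http_method], and so on. Omitting target places the parsed keys at the top level of the event.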

If you want to play around with regex, remember that a regex engine parses a string from left to right by default. If you want to capture several fields with one regular expression, you must make sure the regex engine can "walk" all the way from one part to the next. If you know which patterns occur and which types of chars sit between two fields, great. If not, you can only rely on a .* (%{GREEDYDATA}) or .*? (%{DATA}) pattern.

So, as an exercise, you might have a look at

%{SYSLOG5424SD:logtime} %{JAVACLASS:loglevel}: (?<API>\w+ \w+ \w+):\s*\{"endpoint":"(?<endpoint>[^"]*)","http_method":"(?<http_method>[A-Z]++).*?"user_id":(?<user_id>[0-9]++).*?"user_type":(?<user_type>[0-9]++).*?"http_response_code":(?<http_response_code>[0-9]++).*?"response":"(?<response>.*)"

Note the ++ in [0-9]++ and the .*? patterns between the fields. The ++ possessive quantifier makes sure the engine does not backtrack into the pattern it modifies if the subsequent patterns fail to match. [0-9]++ grabs a sequence of digits and does not give any of them back, so if the subsequent patterns fail, the whole match fails. .*? matches zero or more chars other than line break chars, as few as possible. The last .* is greedy because it must match as many chars other than line break chars as possible.
