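I'm getting Nginx log entries from http://researchscan1.eecs.berkeley.edu/ (and others) where the request contains a lot of special characters, and I'm trying to filter them out. For example: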
2016/07/19 09:54:49 [error] 2006#2006: *5878 testing "//http" existence failed (2: No such file or directory) while logging request, client: 169.229.3.91, server: common.example.co.uk, request: "J/¤nkb=© 2]rµÐ['lç¢î/€@I"- 2016/07/19 11:29:05 [error] 2007#2007: *5945 testing "//http" existence failed (2: No such file or directory) while logging request, client: 169.229.3.91, server: common.example.co.uk, request: "i•jœ»@d‹˜þˆ¿–j•c|B‹¤¯Dñ½°|ôáV*Õ8ÓãÎð€í)ÑYCæôì £¶›¬Dxîoÿv.N"
My usual Logcheck regular expression for this kind of request:
^[[:digit:]]{4}/[[:digit:]]{2}/[[:digit:]]{2} [[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2} \[error\] [#[:digit:]]+: \*[[:digit:]]+ testing .+ existence failed \(2: No such file or directory\) while logging request, .+$
doesn't catch them. I tried:
^[[:digit:]]{4}/[[:digit:]]{2}/[[:digit:]]{2} [[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2} \[error\] [#[:digit:]]+: \*[[:digit:]]+ testing .+ existence failed \(2: No such file or directory\) while logging request, (.|[[:cntrl:]])+$
but no luck. Both variants match the log entries in RegexBuddy when it is set to POSIX ERE. Can any Logcheck / regular expression experts help me?
You need to escape your slashes. I mean the forward slashes that separate the parts of the date.
^[[:digit:]]{4}\/[[:digit:]]{2}\/[[:digit:]]{2} [[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2} \[error\] [#[:digit:]]+: \*[[:digit:]]+ testing .+ existence failed \(2: No such file or directory\) while logging request, .+$
With that change, your usual expression works fine for me, even with those special characters at the end.
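If you want to sanity-check the pattern outside of Logcheck, something like the following rough Python sketch may help. Note the assumptions: Python's `re` module has no POSIX classes, so `[[:digit:]]` is rewritten as `[0-9]`; the sample line is a shortened version of the first entry in the question; and the `matches` helper is a hypothetical name used only here. Python's `re` is also not the engine Logcheck uses, and it treats `\/` and `/` the same, so this only checks the logic of the pattern, not how the rule behaves inside Logcheck.

```python
import re

# Translation of the rule above for Python's re module (assumption:
# [[:digit:]] becomes [0-9], since Python has no POSIX character classes).
PATTERN = re.compile(
    r"^[0-9]{4}\/[0-9]{2}\/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} "
    r"\[error\] [#0-9]+: \*[0-9]+ testing .+ existence failed "
    r"\(2: No such file or directory\) while logging request, .+$"
)

# Shortened sample in the same shape as the first entry in the question.
SAMPLE = (
    '2016/07/19 09:54:49 [error] 2006#2006: *5878 testing "//http" '
    "existence failed (2: No such file or directory) while logging request, "
    'client: 169.229.3.91, server: common.example.co.uk, '
    'request: "J/\u00a4nkb=\u00a9"'
)


def matches(line: str) -> bool:
    """Return True if the log line matches the translated rule."""
    return PATTERN.search(line) is not None


if __name__ == "__main__":
    print(matches(SAMPLE))
```

If the pattern matches here but the rule still doesn't fire, the difference is more likely in how the rule is being run than in the expression itself.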