ELK in Practice (Part 2): Collecting Nginx Logs

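Nginx access log
First, a note on how the Nginx access log is set up. A log format is normally defined in the main nginx.conf file:
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for" $request_time';
The format above is the default one with $request_time added at the end.
Each site configuration then only needs to refer to the format by name:
access_log logs/myapp.log main;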
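To make the format more concrete, here is a small Python sketch (not part of the original setup) that fills the same format string with example values. It relies on the coincidence that nginx variables and Python's string.Template both use $name placeholders. The sample values, including the user agent, are made up for illustration.

```python
from string import Template

# The log_format from nginx.conf, joined into one string.
LOG_FORMAT = (
    '$remote_addr - $remote_user [$time_local] "$request" '
    '$status $body_bytes_sent "$http_referer" '
    '"$http_user_agent" "$http_x_forwarded_for" $request_time'
)


def render_log_line(values: dict) -> str:
    """Fill the format with per-request values, roughly as nginx would."""
    return Template(LOG_FORMAT).substitute(values)


# Illustrative values for a single request (assumed, not taken from a real log).
sample = {
    "remote_addr": "172.16.10.1",
    "remote_user": "-",
    "time_local": "24/Sep/2018:09:04:40 +0000",
    "request": "GET /?time=2244 HTTP/1.1",
    "status": 200,
    "body_bytes_sent": 98086,
    "http_referer": "-",
    "http_user_agent": "curl/7.58.0",
    "http_x_forwarded_for": "-",
    "request_time": "0.002",
}

print(render_log_line(sample))
```

The last value on the rendered line is the request time in seconds, which is the piece added on top of the default format.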
Collecting logs with Filebeat and sending them to Elasticsearch
Configuration:
su - elk
cd /usr/local/elk
vim beats/filebeat/filebeat.test_nginx.yml
Configuration details:
filebeat.prospectors:
- type: log
  input_type: log
  paths:
    - /work/yphp/nginx/logs/*.log
  tags: ["ngx", "yujc"]
  fields:
    logIndex: nginx
    docType: nginx-access
  fields_under_root: true
  tail_files: false

output.elasticsearch:
  hosts: ["127.0.0.1:9200"]
  index: "test-nginx-%{+yyyy.MM.dd}"
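Configuration notes:
filebeat.prospectors:

type: log type; defaults to log
input_type: input type; defaults to log
paths: log files to collect; wildcards and multiple entries are supported
tags: custom tags, given as an array
fields: custom fields
fields_under_root: whether the custom fields are added at the root of the event. If false, they are nested under a key named fields
tail_files: whether to start reading from the end of the file
document_type: custom field that Logstash uses to tell sources apart; in Logstash it is available as the variable type

output.elasticsearch:

hosts: Elasticsearch nodes, given as an array; multiple nodes are supported
index: Elasticsearch index name; if not set, the default filebeat-* is used
protocol: protocol, for example http or https
username: Elasticsearch user name, for example elastic
password: Elasticsearch password, for example changeme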
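As a rough illustration of what fields_under_root changes, the following Python sketch models the two possible layouts of an event. It only models the shape of the resulting document and is not Filebeat's actual code; the function name apply_custom_fields is hypothetical.

```python
def apply_custom_fields(event: dict, fields: dict, fields_under_root: bool) -> dict:
    """Return a copy of the event with the custom fields attached."""
    result = dict(event)
    if fields_under_root:
        # Custom fields become top-level keys of the event.
        result.update(fields)
    else:
        # Custom fields are grouped under a single "fields" key.
        result["fields"] = dict(fields)
    return result


event = {"message": "one log line", "source": "/work/yphp/nginx/logs/hello71.log"}
custom = {"logIndex": "nginx", "docType": "nginx-access"}

print(apply_custom_fields(event, custom, fields_under_root=True))
print(apply_custom_fields(event, custom, fields_under_root=False))
```

With the configuration used in this article (fields_under_root: true), logIndex and docType therefore appear next to message and source rather than under fields.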

Set the file permissions to 600 and start Filebeat:
chmod -R 600 beats/filebeat/filebeat.test_nginx.yml

./beats/filebeat/filebeat -c beats/filebeat/filebeat.test_nginx.yml
Then access the Nginx application and check whether a new index has appeared in Elasticsearch:

$ curl 'http://127.0.0.1:9200/_cat/indices?v' | grep test-nginx
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
105  1161  105  1161    0     0   123k      0 --:--:-- --:--:-- --:--:--  125k
yellow open test-nginx-2018.09.24 ArxrVVOkTjG8ZlXJjb9bVg 5 1 1 0 11.6kb 11.6kb
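Each row of the _cat/indices output is a whitespace-separated record. The sketch below turns such a row into a dictionary so the individual values are easier to read. It assumes the usual column headers that ?v prints; the header row itself is not visible above because grep filters it out.

```python
# Column names as printed by _cat/indices?v, in order (assumed, see lead-in).
COLUMNS = [
    "health", "status", "index", "uuid", "pri", "rep",
    "docs.count", "docs.deleted", "store.size", "pri.store.size",
]


def parse_cat_indices_line(line: str) -> dict:
    """Split one whitespace-separated row into a column -> value dict."""
    values = line.split()
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {len(values)}")
    return dict(zip(COLUMNS, values))


row = parse_cat_indices_line(
    "yellow open test-nginx-2018.09.24 ArxrVVOkTjG8ZlXJjb9bVg 5 1 1 0 11.6kb 11.6kb"
)
print(row["index"], row["health"], row["pri"], row["rep"], row["docs.count"])
```

Read this way, the index has 5 primary shards, 1 replica, and 1 document. A yellow health status on a single-node setup usually just means that the replica shards have no second node to be allocated to.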

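Let's look at one document (the URL is quoted so that the shell does not interpret the & and * characters):
$ curl 'http://127.0.0.1:9200/test-nginx-2018.09.24/_search?q=*&size=1'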

{
  "_index": "test-nginx-2018.09.24",
  "_type": "doc",
  "_id": "AWYKkBqtJzfnbYlB_DRX",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-09-24T07:51:43.140Z",
    "beat": {
      "hostname": "2106567e5bce",
      "name": "2106567e5bce",
      "version": "5.6.2"
    },
    "docType": "nginx-access",
    "input_type": "log",
    "logIndex": "nginx",
    "message": "172.16.10.1 - - [24/Sep/2018:07:51:40 +0000] \"GET /?time=22 HTTP/1.1\" 200 97991 \"-\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36\" \"-\" 0.009",
    "offset": 5243,
    "source": "/work/yphp/nginx/logs/hello71.log",
    "tags": [
      "ngx",
      "yujc"
    ],
    "type": "log"
  },
  "fields": {
    "@timestamp": [
      1537775503140
    ]
  },
  "sort": [
    1537775503140
  ]
}
The data is there, but the whole log line is stored as a single field, message.
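When the same search is issued from a script instead of the shell, it is safer to let a URL library assemble the query string. A minimal Python sketch using only the standard library; the helper name build_search_url is hypothetical, and nothing is sent over the network here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(base: str, index: str, **params) -> str:
    """Return an _search URL with properly encoded query parameters."""
    return f"{base}/{index}/_search?{urlencode(params)}"


url = build_search_url(
    "http://127.0.0.1:9200", "test-nginx-2018.09.24", q="*", size=1
)
print(url)

# Decoding the query string again gives back the original parameters.
print(parse_qs(urlsplit(url).query))
```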
Collecting logs with Filebeat, filtering them in Logstash, and sending them to Elasticsearch
Configuration:
su - elk
cd /usr/local/elk
vim beats/filebeat/filebeat.test_nginx2.yml
Configuration details:
filebeat.prospectors:
- type: log
  input_type: log
  paths:
    - /work/yphp/nginx/logs/*.log
  tags: ["ngx", "yujc"]
  fields:
    logIndex: nginx
    docType: nginx-access
  fields_under_root: true
  tail_files: false

output.logstash:
  hosts: ["127.0.0.1:5044"]
Configuring Logstash
su - elk
cd /usr/local/elk
vim logstash/config/conf.d/filebeat.conf
Configuration details:
input {
  beats {
    port => 5044
  }
}

filter {
  grok {
    match => { "message" => "%{IPORHOST:remote_ip} - %{DATA:user_name} \[%{HTTPDATE:time}\] \"%{WORD:method} %{DATA:url} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:body_sent:bytes} \"%{DATA:referrer}\" \"%{DATA:agent}\" \"%{DATA:x_forwarded_for}\" %{NUMBER:request_time}" }
    remove_field => "message"
  }
}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "test-nginx2-%{type}-%{+YYYY.MM.dd}"
    document_type => "%{type}"
  }
  stdout { codec => rubydebug }
}
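The index option above mixes a field reference, %{type}, with a date reference, %{+YYYY.MM.dd}. The Python sketch below imitates that expansion for just these two cases, to show which index a given event ends up in. It is a simplified model, not Logstash's implementation: only the date tokens used in this article are translated, and expand_index is a hypothetical name.

```python
import re
from datetime import datetime, timezone

# Joda-style tokens used in this article's index names -> strftime equivalents.
_JODA_TO_STRFTIME = {"YYYY": "%Y", "yyyy": "%Y", "MM": "%m", "dd": "%d"}


def expand_index(template: str, event: dict) -> str:
    """Expand %{field} and %{+date} references in an index name template."""
    # The date part is taken from the event's @timestamp, which is in UTC.
    timestamp = datetime.strptime(
        event["@timestamp"], "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)

    def replace(match):
        ref = match.group(1)
        if ref.startswith("+"):
            fmt = ref[1:]
            for joda, strf in _JODA_TO_STRFTIME.items():
                fmt = fmt.replace(joda, strf)
            return timestamp.strftime(fmt)
        # Plain field reference: look the value up in the event.
        return str(event[ref])

    return re.sub(r"%\{([^}]+)\}", replace, template)


event = {"type": "log", "@timestamp": "2018-09-24T09:04:40.404Z"}
print(expand_index("test-nginx2-%{type}-%{+YYYY.MM.dd}", event))
```

Because the events here carry type log, all of them land in an index named test-nginx2-log- followed by the date.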
The nginx log format I use is the standard format with two fields appended, $http_x_forwarded_for and $request_time:
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for" $request_time';
Sample log line:
172.16.10.1 - - [24/Sep/2018:09:04:40 +0000] "GET /?time=2244 HTTP/1.1" 200 98086 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36" "-" 0.002
The grok expression used above is:
%{IPORHOST:remote_ip} - %{DATA:user_name} \[%{HTTPDATE:time}\] \"%{WORD:method} %{DATA:url} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:body_sent:bytes} \"%{DATA:referrer}\" \"%{DATA:agent}\" \"%{DATA:x_forwarded_for}\" %{NUMBER:request_time}
Note that the type conversions grok documents are int and float only. The :bytes suffix on body_sent is not one of them, and body_sent does come out as a string in the results shown later.
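For readers more familiar with regular expressions than with grok, this is approximately what the expression does, written as a Python regex with named groups. The grok patterns are replaced by simpler hand-written stand-ins (for example \S+ instead of IPORHOST), so it is an approximation rather than an exact equivalent.

```python
import re

# Hand-written approximation of the grok expression, one named group per field.
NGINX_LINE = re.compile(
    r'(?P<remote_ip>\S+) - (?P<user_name>.*?) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\w+) (?P<url>.*?) HTTP/(?P<http_version>[\d.]+)" '
    r'(?P<response_code>\d+) (?P<body_sent>\d+) '
    r'"(?P<referrer>.*?)" "(?P<agent>.*?)" "(?P<x_forwarded_for>.*?)" '
    r'(?P<request_time>[\d.]+)'
)


def parse_line(line: str):
    """Return the extracted fields as a dict, or None if the line does not match."""
    match = NGINX_LINE.match(line)
    return match.groupdict() if match else None


line = (
    '172.16.10.1 - - [24/Sep/2018:09:04:40 +0000] "GET /?time=2244 HTTP/1.1" '
    '200 98086 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36" "-" 0.002'
)

fields = parse_line(line)
print(fields["remote_ip"], fields["method"], fields["url"], fields["request_time"])
```

As with grok, every extracted value is a string, and a line that does not fit the format yields no fields at all.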
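Before starting anything, it is worth checking the expression in the online Grok Debugger. The first time around I started the services without testing it, and none of the fields that grok should have extracted showed up in Elasticsearch. The cause became visible in the terminal where Filebeat was running in the foreground (the event itself is printed by the stdout output with the rubydebug codec configured in Logstash above):
$ ./beats/filebeat/filebeat -c beats/filebeat/filebeat.test_nginx2.yml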

{
    "@timestamp" => 2018-09-24T09:01:19.555Z,
    "logIndex" => "nginx",
    "offset" => 6467,
    "docType" => "nginx-access",
    "@version" => "1",
    "input_type" => "log",
    "beat" => {
        "name" => "2106567e5bce",
        "hostname" => "2106567e5bce",
        "version" => "5.6.2"
    },
    "host" => "2106567e5bce",
    "source" => "/work/yphp/nginx/logs/hello71.log",
    "message" => "172.16.10.1 - - [24/Sep/2018:09:01:14 +0000] \"GET /?time=2244 HTTP/1.1\" 200 98087 \"-\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36\" \"-\" 0.195",
    "type" => "log",
    "tags" => [
        [0] "ngx",
        [1] "yujc",
        [2] "beats_input_codec_plain_applied",
        [3] "_grokparsefailure"
    ]
}
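The last tag, _grokparsefailure, shows that the grok expression did not match. I had adapted the configuration from online tutorials and was new to the stack, so it took a long debugging session before filebeat.conf finally worked.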
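Since a failed match only shows up as a tag on the event, it helps to check for that tag explicitly. A small Python sketch of such a check over already-decoded events; the helper name grok_failed is hypothetical and the second sample event is made up for comparison.

```python
def grok_failed(event: dict) -> bool:
    """True if the event carries the tag grok adds when no pattern matched."""
    return "_grokparsefailure" in event.get("tags", [])


events = [
    # Tags from the event shown above.
    {"tags": ["ngx", "yujc", "beats_input_codec_plain_applied", "_grokparsefailure"]},
    # A made-up event that parsed successfully.
    {"tags": ["ngx", "yujc", "beats_input_codec_plain_applied"]},
]

failed = [event for event in events if grok_failed(event)]
print(f"{len(failed)} of {len(events)} events failed grok parsing")
```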
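Open Grok Debugger, paste the content of the message field collected by Filebeat into the first input box, and paste the grok expression into the second.
Click Go to parse. If the result below is {}, parsing failed; you can then edit the expression and the tool parses again automatically. The final result: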
{
  "remote_ip": [["172.16.10.1"]],
  "HOSTNAME": [["172.16.10.1"]],
  "IP": [[null]],
  "IPV6": [[null]],
  "IPV4": [[null]],
  "user_name": [["-"]],
  "time": [["24/Sep/2018:08:47:59 +0000"]],
  "MONTHDAY": [["24"]],
  "MONTH": [["Sep"]],
  "YEAR": [["2018"]],
  "TIME": [["08:47:59"]],
  "HOUR": [["08"]],
  "MINUTE": [["47"]],
  "SECOND": [["59"]],
  "INT": [["+0000"]],
  "method": [["GET"]],
  "url": [["/?time=2244"]],
  "http_version": [["1.1"]],
  "BASE10NUM": [["1.1", "200", "98086", "0.002"]],
  "response_code": [["200"]],
  "body_sent": [["98086"]],
  "referrer": [["-"]],
  "agent": [["Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36"]],
  "x_forwarded_for": [["-"]],
  "request_time": [["0.002"]]
}
Logstash can now be started.
First check that the Logstash configuration is valid:
./logstash/bin/logstash -f logstash/config/conf.d/filebeat.conf --config.test_and_exit

Config Validation Result: OK. Exiting Logstash
# Start Logstash
./logstash/bin/logstash &

# Start Filebeat
./beats/filebeat/filebeat -c beats/filebeat/filebeat.test_nginx2.yml
Access the Nginx application again, then look at one document:
$ curl 'http://127.0.0.1:9200/test-nginx2-log-2018.09.24/_search?q=*&size=1&sort=@timestamp:desc'

{
  "took": 14,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": null,
    "hits": [
      {
        "_index": "test-nginx2-log-2018.09.24",
        "_type": "log",
        "_id": "AWYK0to8JzfnbYlB_DRx",
        "_score": null,
        "_source": {
          "response_code": "200",
          "agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36",
          "logIndex": "nginx",
          "offset": 6875,
          "method": "GET",
          "docType": "nginx-access",
          "user_name": "-",
          "input_type": "log",
          "http_version": "1.1",
          "source": "/work/yphp/nginx/logs/hello71.log",
          "message": """172.16.10.1 - - [24/Sep/2018:09:04:40 +0000] "GET /?time=2244 HTTP/1.1" 200 98086 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36" "-" 0.002""",
          "type": "log",
          "url": "/?time=2244",
          "tags": [
            "ngx",
            "yujc",
            "beats_input_codec_plain_applied"
          ],
          "x_forwarded_for": "-",
          "referrer": "-",
          "@timestamp": "2018-09-24T09:04:40.404Z",
          "remote_ip": "172.16.10.1",
          "request_time": "0.002",
          "@version": "1",
          "beat": {
            "name": "2106567e5bce",
            "hostname": "2106567e5bce",
            "version": "5.6.2"
          },
          "host": "2106567e5bce",
          "body_sent": "98086",
          "time": "24/Sep/2018:09:04:40 +0000"
        },
        "sort": [
          1537779880404
        ]
      }
    ]
  }
}
The document now contains all the fields that were parsed out of the log line.
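Note that every extracted value in this result is a string, including response_code, body_sent, and request_time. In Logstash itself this is typically handled with grok's :int and :float suffixes or with the mutate and date filters. The Python sketch below only illustrates the intended target types on the consumer side; coerce_fields is a hypothetical helper.

```python
from datetime import datetime


def coerce_fields(source: dict) -> dict:
    """Return a copy of the parsed fields with numeric and date values typed."""
    typed = dict(source)
    typed["response_code"] = int(source["response_code"])
    typed["body_sent"] = int(source["body_sent"])
    typed["request_time"] = float(source["request_time"])
    # The time field has the form "24/Sep/2018:09:04:40 +0000".
    typed["time"] = datetime.strptime(source["time"], "%d/%b/%Y:%H:%M:%S %z")
    return typed


source = {
    "response_code": "200",
    "body_sent": "98086",
    "request_time": "0.002",
    "time": "24/Sep/2018:09:04:40 +0000",
    "url": "/?time=2244",
}

typed = coerce_fields(source)
print(typed["response_code"], typed["body_sent"], typed["request_time"])
print(typed["time"].isoformat())
```

Typed values matter later on: numeric fields can be aggregated and range-filtered, which string fields cannot.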
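Viewing the data in Kibana
Open the Kibana web UI at http://127.0.0.1:5601 and go to Management -> Kibana -> Index Patterns, then choose Create Index Pattern:
a. For Index pattern, enter test-nginx2-*.
b. For Time Filter field name, select @timestamp.
c. Click Create.
Then open Discover and select test-nginx2-* to see the log data.
Expanding a document there shows the individual fields.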
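The index pattern test-nginx2-* is a simple wildcard over index names. As a rough illustration, Python's fnmatch can approximate which indices it would cover. The name filebeat-2018.09.24 is a made-up example, and fnmatch supports more wildcard syntax than Kibana does, so this is only a sketch.

```python
from fnmatch import fnmatchcase


def matching_indices(pattern: str, indices: list) -> list:
    """Return the index names that the wildcard pattern covers."""
    return [name for name in indices if fnmatchcase(name, pattern)]


indices = [
    "test-nginx-2018.09.24",       # written directly by Filebeat earlier
    "test-nginx2-log-2018.09.24",  # written through Logstash
    "filebeat-2018.09.24",         # hypothetical default-named index
]

print(matching_indices("test-nginx2-*", indices))
```

Only the index written through Logstash matches, so the index created directly by Filebeat earlier would need its own pattern, such as test-nginx-*.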
References
1. Filtering Nginx logs with grok in Logstash (part 2), Orgliny, cnblogs: https://www.cnblogs.com/Orgli…
2. Setting up an Rsyslog log service, K‘e0llm, cnblogs: http://www.cnblogs.com/Eivll0…
3. How to handle data mappings to ElasticSearch in Logstash, Cocowool, cnblogs: https://www.cnblogs.com/cocow…
4. ELK architecture: installing and configuring Logstash and Filebeat, 田园里的蟋蟀, cnblogs: http://www.cnblogs.com/xishua…
5. Building an ELK log analysis platform (part 2): setting up the kibana and logstash servers, zero 菌, 51CTO blog: http://blog.51cto.com/zero01/…
