Graylog is an open-source, professional tool for log aggregation, analysis, auditing, visualization, and alerting. It is similar to ELK but simpler. This post walks through how to deploy and use Graylog, and gives a brief overview of how it processes logs.
This is a fairly long post. It uses three machines, which host a Kafka cluster (2.3), an ES cluster (7.11.2), a MongoDB replica set (4.2), and a Graylog cluster (4.0.2). The logs collected are Kubernetes logs, shipped into Kafka by Filebeat (7.11.2) deployed as a DaemonSet. Starting from deployment, we will go step by step through how Graylog is set up and the basics of using it.
About Graylog
Components
As the architecture diagram shows, Graylog consists of three parts:
- MongoDB stores the configuration made in the Graylog console, the Graylog cluster state, and other metadata
- ES stores the log data and serves searches
- Graylog itself acts as the broker in the middle
MongoDB and ES need little explanation since their roles are clear, so let's focus on Graylog's own components and what each one does.
- Inputs: sources of log data; logs can be pulled actively via Graylog's Sidecar, or pushed in by Beats, syslog, and so on
- Extractors: log format conversion, mainly JSON parsing, key-value parsing, timestamp parsing, and regex parsing
- Streams: log classification; matching rules route messages into a designated index
- Indices: persistent storage; configure the index name, rotation/retention policy, shard count, replica count, flush interval, etc.
- Outputs: log forwarding; send the messages of a Stream on to another Graylog cluster
- Pipelines: log filtering; build data-cleaning rules, add or drop fields, filter on conditions, write custom functions
- Sidecar: a lightweight log collector manager
- Lookup Tables: lookups such as Whois queries by IP and threat intelligence based on the source IP
- Geolocation: map visualization based on the source IP
Workflow
Graylog collects logs through Inputs, for example from Kafka or Redis, or directly from Filebeat. Each Input can be configured with Extractors, which extract and transform fields from the messages; multiple Extractors can be attached and they run in order. Messages are then routed into Streams according to the matching rules defined on each Stream, and a Stream can be bound to a specific index set, so the messages end up stored in the corresponding ES indices. Once this is in place, you can pick a Stream by name in the console and view its logs.
Installing MongoDB
Following the official documentation, version 4.2.x is installed.
Time synchronization
Install ntpdate:
```shell
yum install ntpdate -y
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
```
Add it to cron:
```shell
# crontab -e
5 * * * * ntpdate -u ntp.ntsc.ac.cn
```
Configure the yum repository and install
```shell
vim /etc/yum.repos.d/mongodb-org.repo

[mongodb-org-4.2]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/4.2/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-4.2.asc
```
Then install:
```shell
yum makecache
yum -y install mongodb-org
```
Then start it:
```shell
systemctl daemon-reload
systemctl enable mongod.service
systemctl start mongod.service
systemctl --type=service --state=active | grep mongod
```
Edit the config file to set up the replica set
```shell
# vim /etc/mongod.conf

# mongod.conf
# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# Where and how to store data.
storage:
  dbPath: /var/lib/mongo
  journal:
    enabled: true
#  engine:
#  wiredTiger:

# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo

# network interfaces
net:
  port: 27017
  bindIp: 0.0.0.0  # Enter 0.0.0.0,:: to bind to all IPv4 and IPv6 addresses or, alternatively, use the net.bindIpAll setting.

#security:

#operationProfiling:

replication:
  replSetName: graylog-rs  # replica set name

#sharding:

## Enterprise-Only Options

#auditLog:

#snmp:
```
Initialize the replica set:
```shell
> use admin;
switched to db admin
> rs.initiate( {
...     _id : "graylog-rs",
...     members: [
...         { _id: 0, host: "10.0.105.74:27017"},
...         { _id: 1, host: "10.0.105.76:27017"},
...         { _id: 2, host: "10.0.105.96:27017"}
...     ]
... })
{
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1615885669, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1615885669, 1)
}
```
Check the replica set status
If all went well, the cluster has two roles: one node is PRIMARY and the other two are SECONDARY. You can check with:
```shell
rs.status()
```
It returns a lot of information, like this:
"members" : [ { "_id" : 0, "name" : "10.0.105.74:27017", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 623, "optime" : { "ts" : Timestamp(1615885829, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2021-03-16T09:10:29Z"), "syncingTo" : "", "syncSourceHost" : "", "syncSourceId" : -1, "infoMessage" : "", "electionTime" : Timestamp(1615885679, 1), "electionDate" : ISODate("2021-03-16T09:07:59Z"), "configVersion" : 1, "self" : true, "lastHeartbeatMessage" : "" }, { "_id" : 1, "name" : "10.0.105.76:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 162, "optime" : { "ts" : Timestamp(1615885829, 1), "t" : NumberLong(1) }, "optimeDurable" : { "ts" : Timestamp(1615885829, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2021-03-16T09:10:29Z"), "optimeDurableDate" : ISODate("2021-03-16T09:10:29Z"), "lastHeartbeat" : ISODate("2021-03-16T09:10:31.690Z"), "lastHeartbeatRecv" : ISODate("2021-03-16T09:10:30.288Z"), "pingMs" : NumberLong(0), "lastHeartbeatMessage" : "", "syncingTo" : "10.0.105.74:27017", "syncSourceHost" : "10.0.105.74:27017", "syncSourceId" : 0, "infoMessage" : "", "configVersion" : 1 }, { "_id" : 2, "name" : "10.0.105.96:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 162, "optime" : { "ts" : Timestamp(1615885829, 1), "t" : NumberLong(1) }, "optimeDurable" : { "ts" : Timestamp(1615885829, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2021-03-16T09:10:29Z"), "optimeDurableDate" : ISODate("2021-03-16T09:10:29Z"), "lastHeartbeat" : ISODate("2021-03-16T09:10:31.690Z"), "lastHeartbeatRecv" : ISODate("2021-03-16T09:10:30.286Z"), "pingMs" : NumberLong(0), "lastHeartbeatMessage" : "", "syncingTo" : "10.0.105.74:27017", "syncSourceHost" : "10.0.105.74:27017", "syncSourceId" : 0, "infoMessage" : "", "configVersion" : 1 } ]
Create users
Run this on any one of the machines:
```shell
use admin
db.createUser({user: "admin", pwd: "Root_1234", roles: ["root"]})
db.auth("admin","Root_1234")
```
Then, without exiting, create the user that Graylog will connect with:
```shell
use graylog
// the password must match the one used in mongodb_uri in the Graylog config later
db.createUser({
  user: "graylog",
  pwd: "Graylog_1234",
  roles: [
    { "role" : "dbOwner", "db" : "graylog" },
    { "role" : "readWrite", "db" : "graylog" }
  ]
})
```
Generate a keyFile
```shell
openssl rand -base64 756 > /var/lib/mongo/access.key
```
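As an optional sanity check (not part of the original steps), you can confirm that this command really produces 756 random bytes, base64-encoded:

```shell
# "openssl rand -base64 756" emits 756 random bytes as base64;
# decoding the output back should therefore yield exactly 756 bytes.
openssl rand -base64 756 | base64 -d | wc -c    # prints 756
```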
Set its ownership and permissions:
```shell
chown -R mongod.mongod /var/lib/mongo/access.key
chmod 600 /var/lib/mongo/access.key
```
After generating the key, copy it to the other two machines and set the same permissions there:
```shell
scp -r /var/lib/mongo/access.key 10.0.105.76:/var/lib/mongo/
```
Once copied, update the config file:
```shell
# vim /etc/mongod.conf
# add the following
security:
  keyFile: /var/lib/mongo/access.key
  authorization: enabled
```
Apply this on all three machines, then restart the service:
```shell
systemctl restart mongod
```
Then log in and verify two things:
- authentication succeeds
- the replica set status is healthy
If both check out, the MongoDB 4.2 replica set installed via yum is ready. Next, the ES cluster.
Deploying the ES cluster
The ES version is the latest at the time of writing: 7.11.x.
System tuning
- Kernel parameters
```shell
# vim /etc/sysctl.conf
fs.file-max=655360
vm.max_map_count=655360
vm.swappiness = 0
```
- Adjust limits
```shell
# vim /etc/security/limits.conf
* soft nproc 655350
* hard nproc 655350
* soft nofile 655350
* hard nofile 655350
* hard memlock unlimited
* soft memlock unlimited
```
- Add a regular user
ES must be started as a non-root user:
```shell
groupadd es
useradd -g es es
echo 123456 | passwd es --stdin
```
- Install the JDK
```shell
yum install -y java-1.8.0-openjdk-devel.x86_64
```
Set the environment variables:
```shell
# vim /etc/profile
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.282.b08-1.el7_9.x86_64/
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
```
Upload the tarball
ES download: https://artifacts.elastic.co/...
Extract it:
```shell
tar zxvf elasticsearch-7.11.2-linux-x86_64.tar.gz -C /usr/local/
```
Fix the ownership:
```shell
chown -R es.es /usr/local/elasticsearch-7.11.2
```
Configure ES
Cluster configuration:
```shell
# vim /usr/local/elasticsearch-7.11.2/config/elasticsearch.yml
cluster.name: graylog-cluster
node.name: node03    # must be unique on each node
path.data: /data/elasticsearch/data
path.logs: /data/elasticsearch/logs
bootstrap.memory_lock: true
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["10.0.105.74","10.0.105.76","10.0.105.96"]
cluster.initial_master_nodes: ["10.0.105.74","10.0.105.76"]
http.cors.enabled: true
http.cors.allow-origin: "*"
```
Set the JVM heap size in jvm.options:
```shell
# set to half of the host memory (jvm.options does not allow inline comments)
-Xms16g
-Xmx16g
```
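If you would rather derive "half of the host memory" than hard-code 16g, a small sketch (assuming a Linux host, where `/proc/meminfo` reports MemTotal in kB). Note that Elasticsearch recommends staying below about 31g so compressed object pointers remain enabled:

```shell
# Compute half of the host's physical memory, in whole gigabytes.
half_mem_g=$(awk '/^MemTotal/ {printf "%d", $2 / 2 / 1024 / 1024}' /proc/meminfo)
echo "-Xms${half_mem_g}g"
echo "-Xmx${half_mem_g}g"
```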
Manage the service with systemd
```shell
# vim /usr/lib/systemd/system/elasticsearch.service
[Unit]
Description=elasticsearch server daemon
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=es
Group=es
LimitMEMLOCK=infinity
LimitNOFILE=655350
LimitNPROC=655350
ExecStart=/usr/local/elasticsearch-7.11.2/bin/elasticsearch
Restart=always

[Install]
WantedBy=multi-user.target
```
Start it and enable it at boot:
```shell
systemctl daemon-reload
systemctl enable elasticsearch
systemctl start elasticsearch
```
A quick check:
```shell
# curl -XGET http://127.0.0.1:9200/_cluster/health?pretty
{
  "cluster_name" : "graylog-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 1,
  "active_shards" : 2,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
```
That completes the ES installation.
Deploying the Kafka cluster
Since my machines are shared and the Java environment was already installed earlier, that step is not repeated here.
Download the packages
```shell
kafka:     https://www.dogfei.cn/pkgs/kafka_2.12-2.3.0.tgz
zookeeper: https://www.dogfei.cn/pkgs/apache-zookeeper-3.6.0-bin.tar.gz
```
Extract:
```shell
tar zxvf kafka_2.12-2.3.0.tgz -C /usr/local/
tar zxvf apache-zookeeper-3.6.0-bin.tar.gz -C /usr/local/
```
Edit the config files
kafka
```shell
# grep -v -E "^#|^$" /usr/local/kafka_2.12-2.3.0/config/server.properties
# broker.id must be unique per broker (2 and 3 on the other nodes),
# and listeners must carry this node's own address
broker.id=1
listeners=PLAINTEXT://10.0.105.74:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dirs=/data/kafka/data
num.partitions=8
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
message.max.bytes=20971520
log.retention.hours=1
log.retention.bytes=1073741824
log.segment.bytes=536870912
log.retention.check.interval.ms=300000
zookeeper.connect=10.0.105.74:2181,10.0.105.76:2181,10.0.105.96:2181
zookeeper.connection.timeout.ms=1000000
zookeeper.sync.time.ms=2000
group.initial.rebalance.delay.ms=0
log.cleaner.enable=true
delete.topic.enable=true
```
zookeeper
```shell
# grep -v -E "^#|^$" /usr/local/apache-zookeeper-3.6.0-bin/conf/zoo.cfg
tickTime=10000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper/data
clientPort=2181
admin.serverPort=8888
server.1=10.0.105.74:22888:33888
server.2=10.0.105.76:22888:33888
server.3=10.0.105.96:22888:33888
```
And don't forget to create the corresponding directories.
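Besides the data directories, each ZooKeeper node also needs a myid file whose content matches its server.N entry in zoo.cfg; a member will refuse to join the quorum without it. A sketch for node 1 follows (ZK_DATA_DIR is a stand-in so the snippet is safe to try anywhere; on the real nodes it is the dataDir /data/zookeeper/data from above):

```shell
# Each node's myid must match the N in its "server.N=host:..." line in zoo.cfg.
ZK_DATA_DIR="${ZK_DATA_DIR:-./zookeeper-data}"
mkdir -p "$ZK_DATA_DIR"
echo 1 > "$ZK_DATA_DIR/myid"    # use 2 and 3 on the second and third node
cat "$ZK_DATA_DIR/myid"
```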
Add systemd units
kafka
```shell
# cat /usr/lib/systemd/system/kafka.service
[Unit]
Description=Kafka
After=zookeeper.service

[Service]
Type=simple
Environment=LOG_DIR=/data/kafka/logs
WorkingDirectory=/usr/local/kafka_2.12-2.3.0
ExecStart=/usr/local/kafka_2.12-2.3.0/bin/kafka-server-start.sh /usr/local/kafka_2.12-2.3.0/config/server.properties
ExecStop=/usr/local/kafka_2.12-2.3.0/bin/kafka-server-stop.sh
Restart=always

[Install]
WantedBy=multi-user.target
```
zookeeper
```shell
# cat /usr/lib/systemd/system/zookeeper.service
[Unit]
Description=zookeeper.service
After=network.target

[Service]
Type=forking
Environment=ZOO_LOG_DIR=/data/zookeeper/logs
ExecStart=/usr/local/apache-zookeeper-3.6.0-bin/bin/zkServer.sh start
ExecStop=/usr/local/apache-zookeeper-3.6.0-bin/bin/zkServer.sh stop
Restart=always

[Install]
WantedBy=multi-user.target
```
Start the services:
```shell
systemctl daemon-reload
systemctl start zookeeper
systemctl start kafka
systemctl enable zookeeper
systemctl enable kafka
```
Deploying Filebeat
Since we are collecting Kubernetes logs, Filebeat is deployed as a DaemonSet. Examples follow.
DaemonSet reference:
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: filebeat
  name: filebeat
  namespace: default
spec:
  selector:
    matchLabels:
      app: filebeat
  template:
    metadata:
      labels:
        app: filebeat
      name: filebeat
    spec:
      affinity: {}
      containers:
      - args:
        - -e
        - -E
        - http.enabled=true
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        image: docker.elastic.co/beats/filebeat:7.11.2
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - |
              #!/usr/bin/env bash -e
              curl --fail 127.0.0.1:5066
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: filebeat
        resources:
          limits:
            cpu: "1"
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        securityContext:
          privileged: false
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /usr/share/filebeat/filebeat.yml
          name: filebeat-config
          readOnly: true
          subPath: filebeat.yml
        - mountPath: /usr/share/filebeat/data
          name: data
        - mountPath: /opt/docker/containers/
          name: varlibdockercontainers
          readOnly: true
        - mountPath: /var/log
          name: varlog
          readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: filebeat
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      tolerations:
      - operator: Exists
      volumes:
      - configMap:
          defaultMode: 384
          name: filebeat-daemonset-config
        name: filebeat-config
      - hostPath:
          path: /opt/docker/containers
          type: ""
        name: varlibdockercontainers
      - hostPath:
          path: /var/log
          type: ""
        name: varlog
      - hostPath:
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
        name: data
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
```
ConfigMap reference:
```yaml
apiVersion: v1
data:
  filebeat.yml: |
    filebeat.inputs:
    - type: container
      paths:
        - /var/log/containers/*.log
      # merge multi-line messages
      multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
      multiline.negate: true
      multiline.match: after
      multiline.timeout: 30
      fields:
        # custom field so downstream consumers can identify k8s logs
        service: k8s-log
      # disable collection of the host.xxxx fields
      #publisher_pipeline.disable_host: true
    processors:
      - add_kubernetes_metadata:    # add k8s metadata fields
          default_indexers.enabled: true
          default_matchers.enabled: true
          host: ${NODE_NAME}
          matchers:
          - logs_path:
              logs_path: "/var/log/containers/"
      - drop_fields:    # redundant fields to drop
          fields: ["host", "tags", "ecs", "log", "prospector", "agent", "input", "beat", "offset"]
          ignore_missing: true
    output.kafka:
      hosts: ["10.0.105.74:9092","10.0.105.76:9092","10.0.105.96:9092"]
      topic: "dev-k8s-log"
      compression: gzip
      max_message_bytes: 1000000
kind: ConfigMap
metadata:
  labels:
    app: filebeat
  name: filebeat-daemonset-config
  namespace: default
```
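The multiline settings above treat any line starting with a yyyy-mm-dd date as the beginning of a new log event and fold continuation lines (such as stack traces) into the previous one. The pattern itself can be sanity-checked locally; the sample log lines below are made up for illustration:

```shell
# Lines matching the pattern start a new event; non-matching lines are appended.
pattern='^[0-9]{4}-[0-9]{2}-[0-9]{2}'
echo '2021-03-16 09:10:29 ERROR something broke' | grep -qE "$pattern" && echo start-of-event
echo '    at com.example.Main.run(Main.java:42)' | grep -qE "$pattern" || echo continuation
```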
Then apply these manifests and the pods will start.
Deploying the Graylog cluster
Install the repository rpm
```shell
rpm -Uvh https://packages.graylog2.org/repo/packages/graylog-4.0-repository_latest.rpm
```
Install:
```shell
yum install graylog-server -y
```
Start it and enable it at boot:
```shell
systemctl enable graylog-server
systemctl start graylog-server
```
Generate secrets
Generate two secrets, used for root_password_sha2 and password_secret in the config file.
# echo -n "Enter Password: " && head -1 </dev/stdin | tr -d '\n' | sha256sum | cut -d" " -f1# pwgen -N -1 -s 40 1 #这个命令要是没有,就找一台ubuntu机器,apt install pwgen下载下就能够了
Edit the config file
```shell
# vim /etc/graylog/server/server.conf
# is_master: exactly one node in the cluster is the master; set true there, false elsewhere
is_master = false
node_id_file = /etc/graylog/server/node-id
# password_secret and root_password_sha2 are the secrets generated above
password_secret = iMh21uM57Pt2nMHDicInjPvnE8o894AIs7rJj9SW
root_password_sha2 = 8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = 0.0.0.0:9000
http_publish_uri = http://10.0.105.96:9000/
web_enable = true
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 2
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = false
allow_highlighting = false
elasticsearch_analyzer = standard
output_batch_size = 5000
output_flush_interval = 120
output_fault_count_threshold = 8
output_fault_penalty_seconds = 120
processbuffer_processors = 20
outputbuffer_processors = 40
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://graylog:Graylog_1234@10.0.105.74:27017,10.0.105.76:27017,10.0.105.96:27017/graylog?replicaSet=graylog-rs
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
content_packs_dir = /usr/share/graylog-server/contentpacks
content_packs_auto_load = grok-patterns.json
proxied_requests_thread_pool_size = 32
elasticsearch_hosts = http://10.0.105.74:9200,http://10.0.105.76:9200,http://10.0.105.96:9200
elasticsearch_discovery_enabled = true
```
Pay attention to the MongoDB and ES connection settings. Since everything in this setup is clustered, the cluster connection strings are used:
```shell
mongodb_uri = mongodb://graylog:Graylog_1234@10.0.105.74:27017,10.0.105.76:27017,10.0.105.96:27017/graylog?replicaSet=graylog-rs
elasticsearch_hosts = http://10.0.105.74:9200,http://10.0.105.76:9200,http://10.0.105.96:9200
```
That completes the deployment. What remains is configuration in the Graylog console, but first Graylog has to be exposed, for example behind an nginx proxy. A reference nginx config:
```shell
user nginx;
worker_processes 2;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 65535;
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;
    include /etc/nginx/conf.d/*.conf;

    upstream graylog_servers {
        server 10.0.105.74:9000;
        server 10.0.105.76:9000;
        server 10.0.105.96:9000;
    }

    server {
        listen       80 default_server;
        server_name  graylog.example.com;    # set your own domain name here

        location / {
            proxy_set_header Host $http_host;
            proxy_set_header X-Forwarded-Host $host;
            proxy_set_header X-Forwarded-Server $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Graylog-Server-URL http://$server_name/;
            proxy_pass http://graylog_servers;
        }
    }
}
```
After that, restart nginx and open the site in a browser. The username is admin, and the password is the one you hashed with sha256 earlier.
Ingesting logs into Graylog
Configure an input
System --> Inputs
Raw/Plaintext Kafka ---> Launch new input
Set the Kafka and ZooKeeper addresses and the topic name, then save.
All inputs should end up in the running state.
Create an index set
System/Indices
Set the index information: index name, replica count, shard count, retention policy, and rotation policy.
Create a Stream
Add the matching rules.
Save, and the logs will show up on the overview page.
Summary
That completes a full deployment walkthrough: first how Graylog is deployed, then the basics of using it. Future posts will explore its other features, such as extracting fields from logs. Stay tuned.