About Filebeat: a brief introduction, collecting local log data with Filebeat, sending it to Logstash and then to Elasticsearch, and viewing the logs in Kibana

I. The Filebeat plugin

https://www.cnblogs.com/zsql/…

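First, Filebeat is a member of the Beats family. It is a lightweight shipper for forwarding and centralizing log data. Filebeat monitors the log files or locations you specify, collects log events, and forwards them to Elasticsearch or Logstash for indexing.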

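How Logstash and Filebeat relate: Logstash runs on the JVM and consumes a fairly large amount of resources, so its author later wrote logstash-forwarder in Go, a lightweight tool with fewer features but a much smaller resource footprint. That project is now called Filebeat.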

II. How Filebeat works

1. Components

Filebeat consists of two components: inputs and harvesters.

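A harvester is responsible for reading the content of a single file. It reads the file line by line and sends the content to the output.
An input is responsible for managing the harvesters and finding all the sources to read from. If the input type is log, the input finds all files on the drive that match the defined paths and starts a harvester for each file. Each input runs in its own goroutine. Filebeat currently supports several input types, and each input type can be defined multiple times. The log input checks each file to see whether a harvester needs to be started, whether one is already running, or whether the file can be ignored.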
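The division of labour between an input and its harvesters can be sketched in a few lines of Python. This is only an illustration of the idea, not Filebeat's implementation: the function names `harvest` and `run_input` are made up for this example, and the harvesters run one after another here, whereas Filebeat runs them concurrently and keeps following files as they grow.

```python
import glob
import os
import tempfile


def harvest(path):
    """Harvester: read a single file line by line and yield each line."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\r\n")


def run_input(patterns):
    """Input: find every file matching the glob patterns, one harvester per file."""
    events = {}
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            if path in events:
                continue  # this file has already been harvested
            events[path] = list(harvest(path))
    return events


def demo():
    """Create three files and harvest only those matching *.log."""
    with tempfile.TemporaryDirectory() as d:
        files = {"a.log": "one\ntwo\n", "b.log": "three\n", "c.txt": "ignored\n"}
        for name, body in files.items():
            with open(os.path.join(d, name), "w", encoding="utf-8") as f:
                f.write(body)
        result = run_input([os.path.join(d, "*.log")])
        return {os.path.basename(p): lines for p, lines in result.items()}
```

Calling `demo()` harvests `a.log` and `b.log` and leaves `c.txt` alone, because it does not match the configured path pattern.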

2. How Filebeat keeps the state of files

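Filebeat keeps the state of each file and frequently flushes that state to the registry file on disk. The state is used to remember the last offset a harvester read from and to ensure that all log lines are sent.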
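A minimal sketch of the offset idea, assuming a registry that is just a JSON file mapping each path to a byte offset. The file name `registry.json` and the helper names are invented for this example; Filebeat's real registry has its own format and also identifies files by more than their path.

```python
import json
import os
import tempfile


def load_registry(registry_path):
    """Return the saved {path: offset} mapping, or an empty one on first run."""
    if os.path.exists(registry_path):
        with open(registry_path, "r", encoding="utf-8") as f:
            return json.load(f)
    return {}


def read_new_lines(log_path, registry_path):
    """Read only the complete lines added since the offset stored in the registry."""
    registry = load_registry(registry_path)
    offset = registry.get(log_path, 0)
    lines = []
    with open(log_path, "rb") as f:
        f.seek(offset)
        for raw in f:
            if not raw.endswith(b"\n"):
                break  # partial last line: pick it up on the next pass
            offset += len(raw)
            lines.append(raw.decode("utf-8").rstrip("\r\n"))
    registry[log_path] = offset
    with open(registry_path, "w", encoding="utf-8") as f:
        json.dump(registry, f)  # flush the new state to disk
    return lines


def demo():
    """Read a log, append to it, and read again using the stored offset."""
    with tempfile.TemporaryDirectory() as d:
        log_path = os.path.join(d, "app.log")
        registry_path = os.path.join(d, "registry.json")
        with open(log_path, "wb") as f:
            f.write(b"first\nsecond\n")
        run1 = read_new_lines(log_path, registry_path)
        with open(log_path, "ab") as f:
            f.write(b"third\n")
        run2 = read_new_lines(log_path, registry_path)
        run3 = read_new_lines(log_path, registry_path)
        return run1, run2, run3
```

The second pass returns only the appended line and the third pass returns nothing, because the stored offset already points at the end of the file.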

3. How Filebeat ensures at-least-once delivery

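Filebeat guarantees that events are delivered to the configured output at least once, with no data loss. It can do this because it stores the delivery state of each event in the registry file. If the configured output is blocked and has not acknowledged all events, Filebeat keeps trying to send them until the output confirms that it has received them. If Filebeat shuts down while it is sending events, it does not wait for the output to acknowledge all of them before exiting. When Filebeat restarts, every event that was not acknowledged before the shutdown is sent to the output again. This ensures that each event is sent at least once, but it also means that duplicate events may reach the output. The shutdown_timeout option lets you configure Filebeat to wait a specific amount of time before shutting down.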
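The reason duplicates can appear is easier to see in a toy simulation. The sketch below is an assumption-laden model, not Filebeat code: `FlakyOutput` is a hypothetical output whose acknowledgement for one event is lost, and the `acked` set stands in for the delivery state that Filebeat persists in its registry.

```python
class FlakyOutput:
    """Hypothetical output that stores every event but loses some acknowledgements."""

    def __init__(self, lose_ack_for):
        self.received = []
        self.lose_ack_for = set(lose_ack_for)

    def send(self, event):
        self.received.append(event)  # the event does arrive
        if event in self.lose_ack_for:
            self.lose_ack_for.discard(event)  # the ack is lost only once
            return False
        return True


def ship(events, output, acked):
    """Send unacknowledged events in order; stop at the first missing ack.

    Stopping models a shutdown that happens before the output has confirmed.
    """
    for event in events:
        if event in acked:
            continue  # already confirmed on an earlier run
        if not output.send(event):
            return False
        acked.add(event)  # record the confirmed delivery
    return True


def demo():
    """Run once with a lost ack, then 'restart' and resend what was unconfirmed."""
    events = ["e1", "e2", "e3"]
    output = FlakyOutput(lose_ack_for=["e2"])
    acked = set()
    first_run = ship(events, output, acked)
    second_run = ship(events, output, acked)
    return first_run, second_run, output.received
```

After the simulated restart the output holds `e2` twice: nothing was lost, but one event was delivered more than once, which is exactly the at-least-once trade-off described above.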

III. Installation

https://www.elastic.co/cn/downloads/past-releases/filebeat-7-6-2

Install the Windows version.

IV. Running the pipeline
The steps are as follows.

1. Start Logstash

In Logstash, create stdin.conf:

input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => ["<Elasticsearch IP address>:9200"]
    index => "es_index20210311"
  }

  stdout {
    codec => json_lines
  }
}

Start Logstash first: ./logstash -f stdin.conf --config.reload.automatic

2. Run Filebeat

Edit the filebeat.yml file:

filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - c:\Users\34683\AppData\Local\JetBrains\IntelliJIdea2020.3\log\idea.log
output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]

Then start Filebeat: ./filebeat -e -c filebeat.yml -d "publish"

References: https://www.cnblogs.com/peter…
https://www.cnblogs.com/peter…

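Note: there is no need to worry if output like the following appears:
Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":359},"total":{"ticks":
This is not an error, only an informational message that shows up when the input file has stopped changing. Filebeat has already processed the file and recorded its progress in the data directory, so the data you wanted is already in ES, and the message keeps being printed for as long as the log is not updated. To import everything again, delete the data directory and then delete the corresponding logs in ES. (This is only my own understanding from getting things running; I have not looked into it in depth.)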

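V. Extensions
1. Fixing three problems: only one document shows up in ES, the lines of an exception are not kept together in one document, and there are too many fields

The cause is that the file is read and imported line by line, with no filtering applied.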
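The configuration that follows addresses the split stack traces with the multiline.pattern, multiline.negate and multiline.match options. What the combination negate: false with match: after does can be sketched in Python. This is an approximation for illustration only: Python's `re` module has no POSIX classes, so `[[:space:]]` is written as `\s`, the pattern requires whitespace after `at` or `...` (an assumption of this sketch), and the log lines are invented.

```python
import re

# Python rendering of a Java stack-trace pattern; \s stands in for [[:space:]].
PATTERN = re.compile(r"^\s+(at|\.{3})\s+\b|^Caused by:")


def group_multiline(lines, pattern=PATTERN):
    """Emulate negate: false, match: after.

    A line that matches the pattern is appended to the event before it;
    a line that does not match starts a new event.
    """
    events = []
    for line in lines:
        if events and pattern.search(line):
            events[-1] += "\n" + line
        else:
            events.append(line)
    return events


def demo():
    """Group a made-up Java stack trace followed by an ordinary log line."""
    lines = [
        'Exception in thread "main" java.lang.IllegalStateException: boom',
        "    at com.example.Foo.bar(Foo.java:42)",
        "Caused by: java.io.IOException: disk full",
        "    at com.example.Baz.qux(Baz.java:7)",
        "    ... 1 more",
        "2021-03-11 10:00:01 INFO recovered",
    ]
    return group_multiline(lines)
```

With this grouping the whole stack trace becomes a single event, so it ends up in one document instead of one document per line.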

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

#=========================== Filebeat inputs =============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
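    - c:\Users\34683\AppData\Local\JetBrains\IntelliJIdea2020.3\log\idea.log
    #- /var/log/*.log
    #- c:\programdata\elasticsearch\logs\*

  # Append the exception (stack-trace) lines in the log to the previous line.
  # Whitespace is required before \b so that "... N more" lines match as well.
  multiline.pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
  multiline.negate: false
  multiline.match: after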

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  #multiline.pattern: ^\[

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  #multiline.negate: false

  # Match can be set to "after" or "before". It is used to define if lines should be appended to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to next in Logstash
  #multiline.match: after


#============================= Filebeat modules ===============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

#==================== Elasticsearch template setting ==========================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false

#================================ General =====================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging


#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

#============================== Kibana =====================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify an additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

#============================= Elastic Cloud ==================================

# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

#================================ Outputs =====================================

# Configure what output to use when sending the data collected by the beat.

#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
  #hosts: ["localhost:9200"]

  # Protocol - either `http` (default) or `https`.
  #protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

#================================ Processors =====================================

# Configure processors to enhance or manipulate events generated by the beat.

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
  - drop_fields:
      fields: ["input_type", "input.type", "agent.hostname", "agent.type", "ecs.version", "agent.ephemeral_id", "agent.id", "agent.version", "fields.ics", "log.file.path", "log.flags"]

#================================ Logging =====================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]

#============================== X-Pack Monitoring ===============================
# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster.  This requires xpack monitoring to be enabled in Elasticsearch.  The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#monitoring.enabled: false

# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:

#================================= Migration ==================================

# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true
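
The drop_fields processor in the configuration above is what trims the number of fields per document. Its effect on an event can be pictured with the following Python sketch. The function and the sample event are hypothetical and only mimic the idea of removing dotted field paths; how Filebeat itself treats parent objects that become empty is not modelled here.

```python
def drop_fields(event, fields):
    """Remove each dotted field path from a nested event dict, ignoring missing ones."""
    for path in fields:
        keys = path.split(".")
        node = event
        for key in keys[:-1]:
            node = node.get(key)
            if not isinstance(node, dict):
                break  # the parent does not exist, nothing to drop
        else:
            node.pop(keys[-1], None)
    return event


def demo():
    """Drop a few metadata fields from a made-up event."""
    event = {
        "message": "hello",
        "agent": {"hostname": "pc", "type": "filebeat", "name": "pc"},
        "ecs": {"version": "1.4.0"},
        "log": {"file": {"path": "idea.log"}, "offset": 10},
    }
    fields = ["agent.hostname", "agent.type", "ecs.version", "log.file.path", "input.type"]
    return drop_fields(event, fields)
```

The message itself is untouched; only the listed metadata paths are removed, and a path that is absent from the event (here `input.type`) is skipped.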