How ElastAlert works

It works by combining Elasticsearch with two types of components, rule types and alerts. Elasticsearch is periodically queried and the data is passed to the rule type, which determines when a match is found. When a match occurs, it is given to one or more alerts, which take action based on the match.

In other words, once a rule matches, the alerts configured in that rule decide how the notification is delivered, for example by email or WeChat Work (企业微信).

Reference: ElastAlert documentation

Installation

Installing with pip install elastalert, as the official site suggests, failed with an error for me, so I used git clone to get a local copy instead.

Reference: ElastAlert official installation guide

If you do not have pip installed, see: pip installation guide

Dependencies

    yum install python-devel
    sudo yum install openssl-devel

Configuration

Next, open up config.yaml.example. In it, you will find several configuration options. ElastAlert may be run without changing any of these settings.

rules_folder is where ElastAlert will load rule configuration files from. It will attempt to load every .yaml file in the folder. Without any valid rules, ElastAlert will not start. ElastAlert will also load new rules, stop running missing rules, and restart modified rules as the files in this folder change.
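
To illustrate the reload behaviour just described (new rules loaded, missing rules stopped, modified rules restarted), here is a minimal Python sketch. It is not ElastAlert's actual implementation: the function names snapshot_rules and diff_rules and the file names are hypothetical, and comparing content hashes is only one assumed way to detect changes.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot_rules(folder):
    """Map every .yaml file name in `folder` to a hash of its contents."""
    return {
        path.name: hashlib.sha1(path.read_bytes()).hexdigest()
        for path in sorted(Path(folder).glob("*.yaml"))
    }


def diff_rules(old, new):
    """Classify rule files as added, removed, or modified between two snapshots."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    modified = sorted(
        name for name in set(old) & set(new) if old[name] != new[name]
    )
    return added, removed, modified


# Demo on a throwaway folder with hypothetical rule files.
with tempfile.TemporaryDirectory() as folder:
    rules = Path(folder)
    (rules / "a.yaml").write_text("name: a\n")
    (rules / "b.yaml").write_text("name: b\n")
    before = snapshot_rules(rules)

    (rules / "a.yaml").write_text("name: a\nnum_events: 5\n")  # modified rule
    (rules / "b.yaml").unlink()                                 # removed rule
    (rules / "c.yaml").write_text("name: c\n")                  # new rule
    after = snapshot_rules(rules)

changes = diff_rules(before, after)
print(changes)
```

A watcher built this way would call snapshot_rules on every polling interval and act on the three lists returned by diff_rules.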
For this tutorial, we will use the example_rules folder.

In this article, however, we copy config.yaml.example to config.yaml and create a new directory named rules:

    cp config.yaml.example config.yaml
    mkdir rules

Configuring the Elasticsearch server

Edit config.yaml as follows; the other settings do not need to change.

    # Folder that holds the rules we configure
    rules_folder: rules

    # How often ElastAlert will query Elasticsearch
    # The unit can be anything from weeks to seconds
    # Here: run once every minute
    run_every:
      minutes: 1

    # ElastAlert will buffer results from the most recent
    # period of time, in case some log sources are not in real time
    buffer_time:
      minutes: 15

    # The Elasticsearch hostname for metadata writeback
    # Note that every rule can have its own Elasticsearch host
    # Set the address and port of your Elasticsearch server here
    es_host: xxx.xx.xxx.xx

    # The Elasticsearch port
    es_port: 9200

Configuring rules

Example rules ship with the project. Here we use the frequency example, because we want to alert based on how often events occur.

    [example_rules]# tree
    .
    ├── example_cardinality.yaml
    ├── example_change.yaml
    ├── example_frequency.yaml
    ├── example_new_term.yaml
    ├── example_opsgenie_frequency.yaml
    ├── example_percentage_match.yaml
    ├── example_single_metric_agg.yaml
    ├── example_spike.yaml
    └── jira_acct.txt

Copy the frequency rule file into the new rules directory:

    cp example_rules/example_frequency.yaml rules/
    cd rules
    mv example_frequency.yaml app_frequency_mail.yaml

Email-based configuration

[Figure: sample email alert]

The configuration is walked through in detail below, although only a few of the fields are actually used.

    # Alert when the rate of events exceeds a threshold

    # (Optional)
    # Elasticsearch host
    # No change needed; the global setting is used
    # es_host: elasticsearch.example.com

    # (Optional)
    # Elasticsearch port
    # es_port: 14900

    # (Optional) Connect with SSL to Elasticsearch
    #use_ssl: True

    # (Optional) basic-auth username and password for Elasticsearch
    #es_username: someusername
    #es_password: somepassword

    # (Required)
    # Rule name, must be unique
    name: app frequency rule mail

    # (Required)
    # Type of alert.
    # the frequency rule type alerts when num_events events occur with timeframe time
    type: frequency

    # (Required)
    # Index to search, wildcard supported
    index: logstash-app-prod*

    # (Required, frequency specific)
    # Alert when this many documents matching the query occur within a timeframe
    # Here: five hits
    num_events: 5

    # (Required, frequency specific)
    # num_events must occur within this amount of time to trigger an alert
    # Five hits within ten minutes count as one rule match
    timeframe:
      # hours: 4
      minutes: 10

    # Aggregate by a field: aggregation_key is combined with the rule name to form
    # a group, and each group is alerted on separately (same message = same group)
    #aggregation_key:
    #  - message

    # Aggregate matches for 2 minutes
    aggregation:
      minutes: 2

    # Field used to suppress repeated alerts, together with realert:
    # each query_key value alerts at most once within 30 minutes
    query_key:
      - message
    realert:
      minutes: 30

    # (Required)
    # A list of Elasticsearch filters used to find events
    # These filters are joined with AND and nested in a filtered query
    # For more info: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html
    # This rule queries with a regular expression; see the query-dsl documentation
    # Note: Elasticsearch regexp patterns are anchored and must match the whole term,
    # so adjust the pattern to your own data
    filter:
    - query:
        regexp:
          category: "error-"
    #- term:
    #    category: "error-"

    # Mail settings
    smtp_host: smtp.qq.com
    smtp_port: 465
    smtp_ssl: true

    # Sender address
    from_addr: "xx@qq.com"

    # Path to the mail authentication file, which holds the user name and password.
    # For QQ Mail the password is an authorization code, shown in the QQ Mail
    # settings when you enable SMTP
    smtp_auth_file: /home/elastalert/smtp_auth_file.yaml

    # (Required)
    # The alert is used when a match is found
    # The alert type here is email
    alert:
    - "email"

    # Without the settings below, a default plain-text template is sent, which is
    # rather ugly, so we format the alert as HTML
    email_format: html
    alert_subject: "app production alert {}"
    # Formatted with Python's format
    alert_subject_args:
    - category

    # If this is removed, the default template content is sent along with alert_text
    alert_text_type: alert_text_only

    # Custom alert body (YAML folds the line breaks inside the quoted string into spaces)
    alert_text: "<div style='display:block;background-color: red;padding: 10px;border-radius: 5px;color: white;font-weight: bold;' > <p>{}</p></div><br>
      <a href='put-your-own-Kibana-URL-here' target='_blank' style='padding: 8px 16px;background-color: #46bc99;text-decoration:none;color: white;border-radius: 5px;'>Click to Kibana</a><br>
      <h3>Alert details</h3>
      <table>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>@timestamp:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>@version:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>_id:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>_index:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>_type:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>appType:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>appVersion:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>business:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>category:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>geoip:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>guid:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>host:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>message:</td><td style='padding:10px 5px;border-radius: 5px;background-color: red;color: white;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>num_hits:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>num_matches:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>path:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>server:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>uid:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>uri:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      <tr><td style='padding:5px;text-align: right;font-weight: bold;border-radius: 5px;background-color: #eef;'>userAgent:</td><td style='padding:5px;border-radius: 5px;background-color: #eef;'>{}</td></tr>
      </table>"

    # List every field that appears in alert_text here; they are substituted
    # in order, just like sprintf
    alert_text_args:
    - message
    - "@timestamp"
    - "@version"
    - _id
    - _index
    - _type
    - appType
    - appVersion
    - business
    - category
    - geoip
    - guid
    - host
    - message
    - num_hits
    - num_matches
    - path
    - server
    - uid
    - uri
    - userAgent

    # (required, email specific)
    # a list of email addresses to send alerts to
    # Recipient addresses go here
    email:
    - "xxx@xxx.com"

Mail authentication file

Next, the mail authentication file, smtp_auth_file.yaml:

    # QQ Mail address of the sender, which is also the user name
    user: xxx@qq.com
    # Not the QQ Mail login password, but the authorization code
    password: uxmmmmtefwqeibcjd

Running it is simple; configuring supervisor for high availability is covered later.

    nohup python -m elastalert.elastalert --rule app_frequency_mail.yaml --verbose &

Configuring WeChat Work (企业微信) alerts

Required information

Department

Create a department to receive the logs; a department id will be assigned.

Application

Create an application to send the logs; it will have an application id. In the application's visibility settings, add the relevant people.

Here we use an open-source WeChat Work sender plugin.

Git repository: https://github.com/anjia0532/elastalert-wechat-plugin
Plugin usage notes: https://anjia0532.github.io/2017/02/16/elastalert-wechat-plugin/

Create a new rule file the same way as for the email rule. From the alert key onward, configure the new alert method:

    alert:
    - "elastalert_modules.wechat_qiye_alert.WeChatAlerter"

    alert_text: "======start====== \nIndex: {}\nServer: {}\nEndpoint: {}\nAlert:\n{}"
    alert_text_type: alert_text_only

    # A WeChat Work alert should not carry too much data or be too long
    alert_text_args:
    - _index
    - server
    - path
    - message

    # In the admin console: [Settings] -> [Permission management] -> [General admin group]
    # -> [Create and set contacts and application permissions] -> [CorpID, Secret]
    # appid of the WeChat enterprise account
    corp_id: wxea4f5f73xxxx
    # Secret of the WeChat enterprise account
    secret: "xxxxxBGnxxxxxxxxxrBNHxxxxxxxE"
    # In the admin console: [Application center] -> [select the application] -> [application id]
    # Application id of the WeChat enterprise account
    agent_id: 100xxxx
    # Department id
    party_id: 14
    # User WeChat id
    user_id: "@all"
    # Tag id
    #tag_id:

Notes on the WeChat Work configuration

Looking at the plugin author's other project, https://github.com/anjia0532/weixin-qiye-alert, I found that user_id and tag_id follow these rules:

- If the specified tag name does not exist, a tag is created automatically through the API (in a locked state); an administrator has to unlock it manually and add members.
- If the specified tag has no members (adding a department to a tag has no effect), the message is sent according to the department id (PartyId) and user id (UserId) specified in cp.properties.
- If the department has no members and the specified user does not follow the enterprise account, the message is pushed to every member who follows the account. Keep this in mind when testing.

This suits us well, because we never send messages to just one person. What we want is to reach everyone in the log-alert department. So how do we do that?

In my tests, commenting out user_id meant no message was sent at all. Ideally, removing user_id would send to the whole department only, but that is not what happens. Looking at the source (the author comments practically every line, which is great), the comment says to use @all for everyone, which is why user_id is set to @all in the configuration above.

    self.party_id = self.rule.get('party_id')     # department id
    self.user_id = self.rule.get('user_id', '')   # user id; separate multiple users with |, use @all for everyone
    self.tag_id = self.rule.get('tag_id', '')     # tag id

[Figure: sample WeChat Work alert]
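
As noted for alert_subject_args and alert_text_args, the {} placeholders are filled in order, the way Python's str.format works. Below is a minimal sketch of that substitution using the four fields from the WeChat Work rule. The helper render_alert_text, the sample match values, and the "<missing>" fallback are assumptions made for illustration; this is not ElastAlert's own code.

```python
def render_alert_text(template, arg_names, match, missing="<missing>"):
    """Fill the {} placeholders in `template` with fields from `match`, in order.

    `arg_names` plays the role of alert_text_args: the i-th name supplies the
    value for the i-th placeholder. Fields absent from the match fall back to
    `missing` (a hypothetical default chosen for this sketch).
    """
    values = [match.get(name, missing) for name in arg_names]
    return template.format(*values)


# Template and argument list modelled on the WeChat Work rule.
template = "======start====== \nIndex: {}\nServer: {}\nEndpoint: {}\nAlert:\n{}"
arg_names = ["_index", "server", "path", "message"]

# Hypothetical matched document; real matches come from Elasticsearch.
match = {
    "_index": "logstash-app-prod-2019.01.01",
    "server": "app-01",
    "path": "/api/login",
    "message": "error-timeout",
}

rendered = render_alert_text(template, arg_names, match)
print(rendered)
```

Because the substitution is positional, the order of alert_text_args has to follow the order of the placeholders in alert_text; swapping two names silently swaps the two values in the message.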