
k8s and Monitoring: Deploying Grafana 6.0 on k8s

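Preface
This article introduces some of the new features in Grafana 6.0, the latest release, and shows how to deploy it to k8s.
Introduction to Grafana 6.0
This Grafana release introduces a new way to query and display data, support for log data, and a large number of other features.
The main highlights are: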

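Explore: a new query workflow for ad-hoc data exploration and troubleshooting.
Grafana Loki: integration with Grafana Labs' new open-source log aggregation system.
Gauge Panel: a new standalone panel for gauges.
New Panel Editor UX: improves panel editing and makes it easy to switch between visualizations.
Google Stackdriver Datasource: out of beta and officially released.
Azure Monitor: the plugin has moved from an external plugin to a core data source.
React Plugin: support that makes it easier to build plugins.
Named Colors: included in the new, improved color picker.
Removal of user session storage: makes Grafana easier to deploy and improves security.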

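As the list shows, Explore and Grafana Loki are features introduced specifically to strengthen Grafana's log display capabilities. However, Loki, a log storage and retrieval framework inspired by Prometheus, has not yet had a release, and the project itself does not recommend it for production use. Loki is still worth watching: it integrates deeply with k8s and can be used specifically to handle logs inside k8s.
Below is a screenshot of Explore being used to work with logs: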

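Deploying Grafana 6.0
This section describes how to deploy Grafana 6.0 to k8s.
Because our environment is k8s hosted on AWS, pay attention to the PVC and the Service: both need minor changes when you port these manifests to another environment.
Below is the ConfigMap, which contains two configuration files, ldap.toml and grafana.ini. ldap.toml is included because, in a real enterprise environment, Grafana has to integrate with the organization's LDAP.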
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: hawkeye-grafana
  name: hawkeye-grafana-cm
  namespace: sgt
data:
  ldap.toml: |-
    # To troubleshoot and get more log info enable ldap debug logging in grafana.ini
    # [log]
    # filters = ldap:debug

    [[servers]]
    # Ldap server host (specify multiple hosts space separated)
    host = "ldap.xxx.org"
    # Default port is 389 or 636 if use_ssl = true
    port = 389
    # Set to true if ldap server supports TLS
    use_ssl = false
    # Set to true if connect ldap server with STARTTLS pattern (create connection in insecure, then upgrade to secure connection with TLS)
    start_tls = false
    # set to true if you want to skip ssl cert validation
    ssl_skip_verify = false
    # set to the path to your root CA certificate or leave unset to use system defaults
    # root_ca_cert = "/path/to/certificate.crt"
    # Authentication against LDAP servers requiring client certificates
    # client_cert = "/path/to/client.crt"
    # client_key = "/path/to/client.key"

    # Search user bind dn
    bind_dn = "cn=Manager,dc=xxx,dc=com"
    # Search user bind password
    # If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
    bind_password = 'xxxxx'

    # User search filter, for example "(cn=%s)" or "(sAMAccountName=%s)" or "(uid=%s)"
    search_filter = "(cn=%s)"

    # An array of base dns to search through
    search_base_dns = ["ou=tech,cn=hawkeye,dc=xxxx,dc=com"]

    ## For Posix or LDAP setups that does not support member_of attribute you can define the below settings
    ## Please check grafana LDAP docs for examples
    # group_search_filter = "(&(objectClass=posixGroup)(memberUid=%s))"
    # group_search_base_dns = ["ou=groups,dc=grafana,dc=org"]
    # group_search_filter_user_attribute = "uid"

    # Specify names of the ldap attributes your ldap uses
    [servers.attributes]
    name = "givenName"
    surname = "sn"
    username = "cn"
    member_of = "memberOf"
    email = "email"

    # Map ldap groups to grafana org roles
    [[servers.group_mappings]]
    group_dn = "cn=admins,dc=grafana,dc=org"
    org_role = "Admin"
    # To make user an instance admin (Grafana Admin) uncomment line below
    # grafana_admin = true
    # The Grafana organization database id, optional, if left out the default org (id 1) will be used
    # org_id = 1

    [[servers.group_mappings]]
    group_dn = "cn=users,dc=grafana,dc=org"
    org_role = "Editor"

    [[servers.group_mappings]]
    # If you want to match all (or no ldap groups) then you can use wildcard
    group_dn = "*"
    org_role = "Viewer"

  grafana.ini: |-
    ##################### Grafana Configuration Example #####################
    #
    # Everything has defaults so you only need to uncomment things you want to
    # change

    # possible values : production, development
    ;app_mode = production

    # instance name, defaults to HOSTNAME environment variable value or hostname if HOSTNAME var is empty
    ;instance_name = ${HOSTNAME}

    #################################### Paths ####################################
    [paths]
    # Path to where grafana can store temp files, sessions, and the sqlite3 db (if that is used)
    ;data = /var/lib/grafana

    # Temporary files in `data` directory older than given duration will be removed
    ;temp_data_lifetime = 24h

    # Directory where grafana can store logs
    ;logs = /var/log/grafana

    # Directory where grafana will automatically scan and look for plugins
    ;plugins = /var/lib/grafana/plugins

    # folder that contains provisioning config files that grafana will apply on startup and while running.
    ;provisioning = conf/provisioning

    #################################### Server ####################################
    [server]
    # Protocol (http, https, socket)
    ;protocol = http

    # The ip address to bind to, empty will bind to all interfaces
    ;http_addr =

    # The http port to use
    http_port = 3000

    # The public facing domain name used to access grafana from a browser
    ;domain = localhost

    # Redirect to correct domain if host header does not match domain
    # Prevents DNS rebinding attacks
    ;enforce_domain = false

    # The full public facing url you use in browser, used for redirects and emails
    # If you use reverse proxy and sub path specify full url (with sub path)
    ;root_url = http://localhost:3000

    # Log web requests
    ;router_logging = false

    # the path relative working path
    ;static_root_path = public

    # enable gzip
    ;enable_gzip = false

    # https certs & key file
    ;cert_file =
    ;cert_key =

    # Unix socket path
    ;socket =

    #################################### Database ####################################
    [database]
    # You can configure the database connection by specifying type, host, name, user and password
    # as separate properties or as one string using the url properties.

    # Either "mysql", "postgres" or "sqlite3", it's your choice
    ;type = sqlite3
    ;host = 127.0.0.1:3306
    ;name = grafana
    ;user = root
    # If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
    ;password =

    # Use either URL or the previous fields to configure the database
    # Example: mysql://user:secret@host:port/database
    ;url =

    # For "postgres" only, either "disable", "require" or "verify-full"
    ;ssl_mode = disable

    # For "sqlite3" only, path relative to data_path setting
    ;path = grafana.db

    # Max idle conn setting default is 2
    ;max_idle_conn = 2

    # Max conn setting default is 0 (mean not set)
    ;max_open_conn =

    # Connection Max Lifetime default is 14400 (means 14400 seconds or 4 hours)
    ;conn_max_lifetime = 14400

    # Set to true to log the sql calls and execution times.
    log_queries =

    #################################### Session ####################################
    [session]
    # Either "memory", "file", "redis", "mysql", "postgres", default is "file"
    ;provider = file

    # Provider config options
    # memory: not have any config yet
    # file: session dir path, is relative to grafana data_path
    # redis: config like redis server e.g. `addr=127.0.0.1:6379,pool_size=100,db=grafana`
    # mysql: go-sql-driver/mysql dsn config string, e.g. `user:password@tcp(127.0.0.1:3306)/database_name`
    # postgres: user=a password=b host=localhost port=5432 dbname=c sslmode=disable
    ;provider_config = sessions

    # Session cookie name
    ;cookie_name = grafana_sess

    # If you use session in https only, default is false
    ;cookie_secure = false

    # Session life time, default is 86400
    ;session_life_time = 86400

    #################################### Data proxy ###########################
    [dataproxy]

    # This enables data proxy logging, default is false
    ;logging = false

    #################################### Analytics ####################################
    [analytics]
    # Server reporting, sends usage counters to stats.grafana.org every 24 hours.
    # No ip addresses are being tracked, only simple counters to track
    # running instances, dashboard and error counts. It is very helpful to us.
    # Change this option to false to disable reporting.
    ;reporting_enabled = true

    # Set to false to disable all checks to https://grafana.net
    # for new versions (grafana itself and plugins), check is used
    # in some UI views to notify that grafana or plugin update exists
    # This option does not cause any auto updates, nor send any information
    # only a GET request to http://grafana.com to get latest versions
    ;check_for_updates = true

    # Google Analytics universal tracking code, only enabled if you specify an id here
    ;google_analytics_ua_id =

    #################################### Security ####################################
    [security]
    # default admin user, created on startup
    ;admin_user = admin

    # default admin password, can be changed before first start of grafana, or in profile settings
    ;admin_password = admin

    # used for signing
    ;secret_key = xxxxx

    # Auto-login remember days
    ;login_remember_days = 7
    ;cookie_username = grafana_user
    ;cookie_remember_name = grafana_remember

    # disable gravatar profile images
    ;disable_gravatar = false

    # data source proxy whitelist (ip_or_domain:port separated by spaces)
    ;data_source_proxy_whitelist =

    # disable protection against brute force login attempts
    ;disable_brute_force_login_protection = false

    #################################### Snapshots ###########################
    [snapshots]
    # snapshot sharing options
    ;external_enabled = true
    ;external_snapshot_url = https://snapshots-origin.raintank.io
    ;external_snapshot_name = Publish to snapshot.raintank.io

    # remove expired snapshot
    ;snapshot_remove_expired = true

    #################################### Dashboards History ##################
    [dashboards]
    # Number dashboard versions to keep (per dashboard). Default: 20, Minimum: 1
    ;versions_to_keep = 20

    #################################### Users ###############################
    [users]
    # disable user signup / registration
    ;allow_sign_up = true

    # Allow non admin users to create organizations
    ;allow_org_create = true

    # Set to true to automatically assign new users to the default organization (id 1)
    ;auto_assign_org = true

    # Default role new users will be automatically assigned (if disabled above is set to true)
    ;auto_assign_org_role = Viewer

    # Background text for the user field on the login page
    ;login_hint = email or username

    # Default UI theme ("dark" or "light")
    ;default_theme = dark

    # External user management, these options affect the organization users view
    ;external_manage_link_url =
    ;external_manage_link_name =
    ;external_manage_info =

    # Viewers can edit/inspect dashboard settings in the browser. But not save the dashboard.
    ;viewers_can_edit = false

    [auth]
    # Set to true to disable (hide) the login form, useful if you use OAuth, defaults to false
    ;disable_login_form = false

    # Set to true to disable the signout link in the side menu. useful if you use auth.proxy, defaults to false
    ;disable_signout_menu = false

    # URL to redirect the user to after sign out
    ;signout_redirect_url =

    #################################### Anonymous Auth ##########################
    [auth.anonymous]
    # enable anonymous access
    ;enabled = false

    # specify organization name that should be used for unauthenticated users
    ;org_name = Main Org.

    # specify role for unauthenticated users
    ;org_role = Viewer

    #################################### Github Auth ##########################
    [auth.github]
    ;enabled = false
    ;allow_sign_up = true
    ;client_id = some_id
    ;client_secret = some_secret
    ;scopes = user:email,read:org
    ;auth_url = https://github.com/login/oauth/authorize
    ;token_url = https://github.com/login/oauth/access_token
    ;api_url = https://api.github.com/user
    ;team_ids =
    ;allowed_organizations =

    #################################### Google Auth ##########################
    [auth.google]
    ;enabled = false
    ;allow_sign_up = true
    ;client_id = some_client_id
    ;client_secret = some_client_secret
    ;scopes = https://www.googleapis.com/auth/userinfo.profile https://www.googleapis.com/auth/userinfo.email
    ;auth_url = https://accounts.google.com/o/oauth2/auth
    ;token_url = https://accounts.google.com/o/oauth2/token
    ;api_url = https://www.googleapis.com/oauth2/v1/userinfo
    ;allowed_domains =

    #################################### Generic OAuth ##########################
    [auth.generic_oauth]
    ;enabled = false
    ;name = OAuth
    ;allow_sign_up = true
    ;client_id = some_id
    ;client_secret = some_secret
    ;scopes = user:email,read:org
    ;auth_url = https://foo.bar/login/oauth/authorize
    ;token_url = https://foo.bar/login/oauth/access_token
    ;api_url = https://foo.bar/user
    ;team_ids =
    ;allowed_organizations =
    ;tls_skip_verify_insecure = false
    ;tls_client_cert =
    ;tls_client_key =
    ;tls_client_ca =

    #################################### Grafana.com Auth ####################
    [auth.grafana_com]
    ;enabled = false
    ;allow_sign_up = true
    ;client_id = some_id
    ;client_secret = some_secret
    ;scopes = user:email
    ;allowed_organizations =

    #################################### Auth Proxy ##########################
    [auth.proxy]
    ;enabled = false
    ;header_name = X-WEBAUTH-USER
    ;header_property = username
    ;auto_sign_up = true
    ;ldap_sync_ttl = 60
    ;whitelist = 192.168.1.1, 192.168.2.1
    ;headers = Email:X-User-Email, Name:X-User-Name

    #################################### Basic Auth ##########################
    [auth.basic]
    ;enabled = true

    #################################### Auth LDAP ##########################
    [auth.ldap]
    enabled = true
    ;config_file = /etc/grafana/ldap.toml
    ;allow_sign_up = true

    #################################### SMTP / Emailing ##########################
    [smtp]
    enabled = true
    host = smtp.exmail.qq.com:465
    user = noreply@xxx.com
    # If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
    password = AFxxxxxxYoQ2G
    from_address = noreply@xxxx.com
    from_name = Hawkeye
    ;cert_file =
    ;key_file =
    ;skip_verify = false
    ;from_address = admin@grafana.localhost
    ;from_name = Grafana
    # EHLO identity in SMTP dialog (defaults to instance_name)
    ;ehlo_identity = dashboard.example.com

    [emails]
    ;welcome_email_on_sign_up = false

    #################################### Logging ##########################
    [log]
    # Either "console", "file", "syslog". Default is console and file
    # Use space to separate multiple modes, e.g. "console file"
    ;mode = console file

    # Either "debug", "info", "warn", "error", "critical", default is "info"
    ;level = info

    # optional settings to set different levels for specific loggers. Ex filters = sqlstore:debug
    ;filters =

    # For "console" mode only
    [log.console]
    ;level =

    # log line format, valid options are text, console and json
    ;format = console

    # For "file" mode only
    [log.file]
    ;level =

    # log line format, valid options are text, console and json
    ;format = text

    # This enables automated log rotate(switch of following options), default is true
    ;log_rotate = true

    # Max line number of single file, default is 1000000
    ;max_lines = 1000000

    # Max size shift of single file, default is 28 means 1 << 28, 256MB
    ;max_size_shift = 28

    # Segment log daily, default is true
    ;daily_rotate = true

    # Expired days of log file(delete after max days), default is 7
    ;max_days = 7

    [log.syslog]
    ;level =

    # log line format, valid options are text, console and json
    ;format = text

    # Syslog network type and address. This can be udp, tcp, or unix. If left blank, the default unix endpoints will be used.
    ;network =
    ;address =

    # Syslog facility. user, daemon and local0 through local7 are valid.
    ;facility =

    # Syslog tag. By default, the process' argv[0] is used.
    ;tag =

    #################################### Alerting ############################
    [alerting]
    # Disable alerting engine & UI features
    ;enabled = true
    # Makes it possible to turn off alert rule execution but alerting UI is visible
    ;execute_alerts = true

    # Default setting for new alert rules. Defaults to categorize error and timeouts as alerting. (alerting, keep_state)
    ;error_or_timeout = alerting

    # Default setting for how Grafana handles nodata or null values in alerting. (alerting, no_data, keep_state, ok)
    ;nodata_or_nullvalues = no_data

    # Alert notifications can include images, but rendering many images at the same time can overload the server
    # This limit will protect the server from render overloading and make sure notifications are sent out quickly
    ;concurrent_render_limit = 5

    #################################### Explore #############################
    [explore]
    # Enable the Explore section
    ;enabled = false

    #################################### Internal Grafana Metrics ##########################
    # Metrics available at HTTP API Url /metrics
    [metrics]
    # Disable / Enable internal metrics
    ;enabled = true

    # Publish interval
    ;interval_seconds = 10

    # Send internal metrics to Graphite
    [metrics.graphite]
    # Enable by setting the address setting (ex localhost:2003)
    ;address =
    ;prefix = prod.grafana.%(instance_name)s.

    #################################### Distributed tracing ############
    [tracing.jaeger]
    # Enable by setting the address sending traces to jaeger (ex localhost:6831)
    ;address = localhost:6831
    # Tag that will always be included in when creating new spans. ex (tag1:value1,tag2:value2)
    ;always_included_tag = tag1:value1
    # Type specifies the type of the sampler: const, probabilistic, rateLimiting, or remote
    ;sampler_type = const
    # jaeger samplerconfig param
    # for "const" sampler, 0 or 1 for always false/true respectively
    # for "probabilistic" sampler, a probability between 0 and 1
    # for "rateLimiting" sampler, the number of spans per second
    # for "remote" sampler, param is the same as for "probabilistic"
    # and indicates the initial sampling rate before the actual one
    # is received from the mothership
    ;sampler_param = 1

    #################################### Grafana.com integration ##########################
    # Url used to import dashboards directly from Grafana.com
    [grafana_com]
    ;url = https://grafana.com

    #################################### External image storage ##########################
    [external_image_storage]
    # Used for uploading images to public servers so they can be included in slack/email messages.
    # you can choose between (s3, webdav, gcs, azure_blob, local)
    ;provider =

    [external_image_storage.s3]
    ;bucket =
    ;region =
    ;path =
    ;access_key =
    ;secret_key =

    [external_image_storage.webdav]
    ;url =
    ;public_url =
    ;username =
    ;password =

    [external_image_storage.gcs]
    ;key_file =
    ;bucket =
    ;path =

    [external_image_storage.azure_blob]
    ;account_name =
    ;account_key =
    ;container_name =

    [external_image_storage.local]
    # does not require any configuration

    [rendering]
    # Options to configure external image rendering server like https://github.com/grafana/grafana-image-renderer
    ;server_url =
    ;callback_url =


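Every value shown as xxx above has been changed to hide sensitive information about our company. Adjust the configuration to match your own environment.
If you do not need LDAP authentication, delete ldap.toml from the ConfigMap and, in the following section of grafana.ini, change enabled from true to false: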
#################################### Auth LDAP ##########################
[auth.ldap]
enabled = true
;config_file = /etc/grafana/ldap.toml
;allow_sign_up = true

deployment.yaml is as follows:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hawkeye-grafana
  namespace: sgt
  labels:
    app: hawkeye-grafana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hawkeye-grafana
  template:
    metadata:
      labels:
        app: hawkeye-grafana
    spec:
      containers:
        - image: grafana/grafana:6.0.0
          name: grafana
          imagePullPolicy: IfNotPresent
          # env:
          env:
            - name: GF_PATHS_PROVISIONING
              value: /var/lib/grafana/provisioning
          resources:
            # keep request = limit to keep this container in guaranteed class
            limits:
              cpu: 100m
              memory: 100Mi
            requests:
              cpu: 100m
              memory: 100Mi
          readinessProbe:
            httpGet:
              path: /login
              port: 3000
            # initialDelaySeconds: 30
            # timeoutSeconds: 1
          volumeMounts:
            - name: grafana-persistent-storage
              mountPath: /var/lib/grafana/
            - name: config
              mountPath: /etc/grafana/
      initContainers:
        - name: "init-chown-data"
          image: "busybox:latest"
          imagePullPolicy: "IfNotPresent"
          command: ["chown", "-R", "472:472", "/var/lib/grafana/"]
          volumeMounts:
            - name: grafana-persistent-storage
              mountPath: /var/lib/grafana/
              subPath: ""
      volumes:
        - name: config
          configMap:
            name: hawkeye-grafana-cm
        - name: grafana-persistent-storage
          persistentVolumeClaim:
            claimName: hawkeye-grafana-claim

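Note the added initContainers section. Its main purpose is to fix write permissions on the mounted volume: it changes the owner of /var/lib/grafana/ to 472:472 before the Grafana container starts.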
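As an alternative to the init container, a pod-level securityContext can often achieve the same result. The fragment below is a minimal sketch and is not part of the original manifests: it assumes the volume type honors fsGroup (typical for block-storage volumes) and reuses the 472 IDs from the chown command above.

```yaml
# Sketch only: placed under spec.template.spec of the Deployment,
# in place of the initContainers section.
securityContext:
  runAsUser: 472   # same UID the init container chowns the data directory to
  fsGroup: 472     # Kubernetes applies this GID to the mounted volume's files
```

Whether this is sufficient depends on the storage backend, so the init container remains the more portable choice.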
service.yaml is as follows:
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
  labels:
    app: hawkeye-grafana
  name: hawkeye-grafana
  namespace: sgt
spec:
  type: LoadBalancer
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: 3000
  selector:
    app: hawkeye-grafana
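
The aws-load-balancer-type annotation only has an effect on AWS. When porting to a cluster without a cloud load balancer, one option is to drop the annotation and expose the Service as a NodePort. The following is a sketch under that assumption; the port number 30300 is arbitrary and chosen only for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: hawkeye-grafana
  name: hawkeye-grafana
  namespace: sgt
spec:
  type: NodePort
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: 3000
      nodePort: 30300   # assumption: any free port in the default 30000-32767 range
  selector:
    app: hawkeye-grafana
```

Grafana would then be reachable on port 30300 of any node, instead of through the AWS network load balancer.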

pvc.yaml is as follows:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: hawkeye-grafana-claim
  namespace: sgt
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
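
The claim above does not name a StorageClass, so it relies on the cluster's default. When porting, it may be necessary to name one explicitly. In the sketch below, gp2 is an assumption (a class name often present on AWS-managed clusters); substitute whatever `kubectl get storageclass` lists in your environment.

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: hawkeye-grafana-claim
  namespace: sgt
spec:
  storageClassName: gp2   # assumption: replace with a class that exists in your cluster
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
```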

Once the manifests have been applied successfully, Grafana is reachable and you can log in with admin/admin.

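As you can see, a new Explore icon has been added to the left sidebar.
Summary
Grafana first reads its configuration from /usr/share/grafana/conf/defaults.ini and then from /etc/grafana/grafana.ini. When the same setting appears in both, the value in /etc/grafana/grafana.ini overrides the one in /usr/share/grafana/conf/defaults.ini. Parameters passed on the command line override the same setting in /etc/grafana/grafana.ini, and finally the same setting supplied through an environment variable overrides the command-line value.
Below are some of the default environment variables: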
GF_PATHS_CONFIG /etc/grafana/grafana.ini
GF_PATHS_DATA /var/lib/grafana
GF_PATHS_HOME /usr/share/grafana
GF_PATHS_LOGS /var/log/grafana
GF_PATHS_PLUGINS /var/lib/grafana/plugins
GF_PATHS_PROVISIONING /etc/grafana/provisioning
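
More generally, Grafana lets any grafana.ini key be overridden by an environment variable named GF_<SECTION>_<KEY>, with the section and key uppercased and dots replaced by underscores. The fragment below sketches how that could be used in the Deployment. The Secret name hawkeye-grafana-secret and its key admin-password are hypothetical and would have to be created separately.

```yaml
# Sketch only: env entries for the grafana container in deployment.yaml.
env:
  - name: GF_PATHS_PROVISIONING        # [paths] provisioning (already set in the Deployment)
    value: /var/lib/grafana/provisioning
  - name: GF_AUTH_LDAP_ENABLED         # [auth.ldap] enabled
    value: "false"
  - name: GF_SECURITY_ADMIN_PASSWORD   # [security] admin_password
    valueFrom:
      secretKeyRef:
        name: hawkeye-grafana-secret   # hypothetical Secret name
        key: admin-password            # hypothetical key inside that Secret
```

As the comment in grafana.ini notes, admin_password takes effect only before Grafana's first start, so this replaces the admin/admin default on a fresh volume but does not change an existing admin account.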
