1. pt-query-digest 简介
第三方工具. perl 脚本;
能够剖析 binlog, general log, slowlog; 也能够通过 show processlist 或者 通过 tcpdump 抓取 MySQL 协定数据来进行剖析.
能够把剖析后果输入到文件中. 剖析过程是 先对查问语句的条件 参数化, 而后对参数化的查问进行分组统计, 统计出各个查问的执行工夫, 次数, 占比等, 能够借助剖析后果, 找出问题, 进而进行优化.
2. pt-query-digest 装置
2.1 装置 perl 模块
须要装置 perl 模块
yum install -y perl-CPAN per-Time_HiRes
2.2 percona 上下载 percona-toolkit 工具
https://www.percona.com/downl…
wget https://www.percona.com/downl…
下载 rpm 包, 下载实现后:
yum localinstall -y percona-toolkit-3.2.1-1.el7.x86_64.rpm
装置即可
2.3 装置遇到的问题: libmysqlclient.so
我装置中遇到:
谬误:软件包:perl-DBD-MySQL-4.023-6.el7.x86_64 (base)
须要:libmysqlclient.so.18(libmysqlclient_18)(64bit)
谬误:软件包:perl-DBD-MySQL-4.023-6.el7.x86_64 (base)
须要:libmysqlclient.so.18()(64bit)
您能够尝试增加 –skip-broken 选项来解决该问题
您能够尝试执行:rpm -Va –nofiles –nodigest
解决形式: 找到 mysql 装置的 tar 包, 装置这两个:
[root@niewj download]# rpm -ivh mysql-community-libs-5.7.31-1.el7.x86_64.rpm
[root@niewj download]# rpm -ivh mysql-community-libs-compat-5.7.31-1.el7.x86_64.rpm
[root@niewj download]#yum localinstall -y –skip-broken percona-toolkit-3.2.1-1.el7.x86_64.rpm
最初装置 percona-toolkit 就能够了!
2.4 验证 pt-query-digest 的装置
[root@niewj download]# pt-query-digest --help
pt-query-digest analyzes MySQL queries from slow, general, and binary log files.
It can also analyze queries from C<SHOW PROCESSLIST> and MySQL protocol data
from tcpdump. By default, queries are grouped by fingerprint and reported in
descending order of query time (i.e. the slowest queries first). If no C<FILES>
are given, the tool reads C<STDIN>. The optional C<DSN> is used for certain
options like L<"--since"> and L<"--until">. For more details, please use the
--help option, or try 'perldoc /usr/bin/pt-query-digest' for complete
documentation.
Usage: pt-query-digest [OPTIONS] [FILES] [DSN]
Options:
--ask-pass Prompt for a password when connecting to MySQL
--attribute-aliases=a List of attribute|alias,etc (default db|Schema)
--attribute-value-limit=i A sanity limit for attribute values (default 0)
--charset=s -A Default character set
--config=A Read this comma-separated list of config files;
if specified, this must be the first option on
the command line
--[no]continue-on-error Continue parsing even if there is an error (default yes)
....
....
省略了大略 200 行
...
...
[root@niewj download]#
好长~
2.5 pt-summary 命令查看各种维度 summary 信息;
[root@niewj download]# pt-summary
# Percona Toolkit System Summary Report ######################
Date | 2020-09-10 15:49:20 UTC (local TZ: CST +0800)
Hostname | xxx
Uptime | 23 days, 7:09, 1 user, load average: 0.00, 0.02, 0.05
System | Alibaba Cloud; Alibaba Cloud ECS; vpc-i440fx-2.1 (Other)
Service Tag | xxxxxxxxxxxxxxxxxxxxx
Platform | Linux
Release | CentOS Linux release 7.8.2003 (Core)
..........
..........
省略 200 行
很长, 蕴含 cpu/memory/disk/filesystem/io/net 各种各样的维度;
2.6 pt-diskstats 命令 查看磁盘 io 信息, 每秒更新一次
2.7 pt-mysql-summary 命令
root@niewj download]# pt-mysql-summary --user=root --password=123456
mysql: [Warning] Using a password on the command line interface can be insecure.
....
省略
3. pt-query-digest 命令 - 外围
3.1 命令格局:
pt-query-digest /var/lib/mysql/niewj-slow.log
或者, 如果有信息显示不全, 加参数 –limit=100%
pt-query-digest --limit=100% /var/lib/mysql/niewj-slow.log
3.2 汇总局部:
Overall: 13 total, 4 unique, 0.00 QPS, 0.02x concurrency
总共多少个 sql, 去重后多少个
执行工夫 / 锁定工夫 /rows 发送总量 / 查问总量 以及各种维度信息;
3.3 明细局部:
# Query 1:
# Query 1: 0.00 QPS, 0.01x concurrency, ID 0xDA8417432ACB5A8CDF14F4107EA1D77C at byte 2309
…….. 等等略, 详见上面
select * from goods where id>10000G
[root@niewj download]# pt-query-digest /var/lib/mysql/niewj-slow.log
# 180ms user time, 20ms system time, 26.07M rss, 220.58M vsz
# Current date: Fri Sep 11 00:11:06 2020
# Hostname: niewj
# Files: /var/lib/mysql/niewj-slow.log
# Overall: 13 total, 4 unique, 0.00 QPS, 0.02x concurrency _______________
# Time range: 2020-09-10T09:08:18 to 2020-09-10T12:40:37
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Exec time 204s 4s 45s 16s 27s 12s 16s
# Lock time 891us 0 142us 68us 138us 57us 77us
# Rows sent 724.87k 0 146.48k 55.76k 143.37k 65.31k 15.04k
# Rows examine 724.87k 0 146.48k 55.76k 143.37k 65.31k 0
# Query size 266 11 37 20.46 36.69 11.06 16.89
# Profile
# Rank Query ID Response time Calls R/Call V/M
# ==== ============================ ============= ===== ======= ====
# 1 0xDA8417432ACB5A8CDF14F4107E 74.4643 36.5% 4 18.6161 3.31 SELECT goods
# 2 0xF98ECC2695E4FADE8DB1D907FB 66.7331 32.7% 6 11.1222 20.33 CALL Proc
# 3 0xF25AB949C6CDC4B9D30BE11F01 51.6891 25.4% 2 25.8445 0.39 SELECT goods
# 4 0x59A74D08D407B5EDF9A57DD5A4 11.0003 5.4% 1 11.0003 0.00 SELECT
# Query 1: 0.00 QPS, 0.01x concurrency, ID 0xDA8417432ACB5A8CDF14F4107EA1D77C at byte 2309
# This item is included in the report because it matches --limit.
# Scores: V/M = 3.31
# Time range: 2020-09-10T10:26:24 to 2020-09-10T12:40:37
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 30 4
# Exec time 36 74s 5s 24s 19s 24s 8s 23s
# Lock time 62 554us 134us 142us 138us 138us 3us 138us
# Rows sent 59 431.90k 31.51k 136.72k 107.97k 136.54k 44.57k 136.54k
# Rows examine 59 431.90k 31.51k 136.72k 107.97k 136.54k 44.57k 136.54k
# Query size 54 146 35 37 36.50 36.69 0.76 36.69
# String:
# Databases test
# Hosts 121.69.51.10
# Users root
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s #####################
# 10s+ ################################################################
# Tables
# SHOW TABLE STATUS FROM `test` LIKE 'goods'\G
# SHOW CREATE TABLE `test`.`goods`\G
# EXPLAIN /*!50100 PARTITIONS*/
select * from goods where id>10000\G
# Query 2: 0.04 QPS, 0.42x concurrency, ID 0xF98ECC2695E4FADE8DB1D907FB00F811 at byte 1426
# This item is included in the report because it matches --limit.
# Scores: V/M = 20.33
# Time range: 2020-09-10T10:06:36 to 2020-09-10T10:09:16
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 46 6
# Exec time 32 67s 4s 45s 11s 45s 15s 4s
# Lock time 15 138us 0 72us 23us 69us 31us 0
# Rows sent 0 0 0 0 0 0 0 0
# Rows examine 0 0 0 0 0 0 0 0
# Query size 24 66 11 11 11 11 0 11
# String:
# Databases test
# Hosts 121.69.51.10
# Users root
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s ################################################################
# 10s+ ############
call Proc()\G
# Query 3: 0.00 QPS, 0.05x concurrency, ID 0xF25AB949C6CDC4B9D30BE11F014A9A39 at byte 2087
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.39
# Time range: 2020-09-10T10:22:48 to 2020-09-10T10:39:56
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 15 2
# Exec time 25 52s 24s 28s 26s 28s 3s 26s
# Lock time 22 199us 85us 114us 99us 114us 20us 99us
# Rows sent 40 292.97k 146.48k 146.48k 146.48k 146.48k 0 146.48k
# Rows examine 40 292.97k 146.48k 146.48k 146.48k 146.48k 0 146.48k
# Query size 14 38 19 19 19 19 0 19
# String:
# Databases test
# Hosts 121.69.51.10
# Users root
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s
# 10s+ ################################################################
# Tables
# SHOW TABLE STATUS FROM `test` LIKE 'goods'\G
# SHOW CREATE TABLE `test`.`goods`\G
# EXPLAIN /*!50100 PARTITIONS*/
select * from goods\G
# Query 4: 0 QPS, 0x concurrency, ID 0x59A74D08D407B5EDF9A57DD5A41825CA at byte 0
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.00
# Time range: all events occurred at 2020-09-10T09:08:18
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 7 1
# Exec time 5 11s 11s 11s 11s 11s 0 11s
# Lock time 0 0 0 0 0 0 0 0
# Rows sent 0 1 1 1 1 1 0 1
# Rows examine 0 0 0 0 0 0 0 0
# Query size 6 16 16 16 16 16 0 16
# String:
# Databases sakila
# Hosts 121.69.51.10
# Users root
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms
# 100ms
# 1s
# 10s+ ################################################################
# EXPLAIN /*!50100 PARTITIONS*/
select sleep(11)\G
[root@niewj download]#
4. 查看 mysql 的主从状态: pt-slave-find(理解一下)
4.1 命令格局:
pt-slave-find --host=localhost --user=root --password=123456
4.2 命令示例:
[root@niewj download]# pt-slave-find --host=localhost --user=root --password=123456
localhost
Version 5.7.31
Server ID 0
Uptime 07:45:01 (started 2020-09-10T16:42:50)
Replication Is not a slave, has 0 slaves connected, is not read_only
Filters
Binary logging ROW
Slave status
Slave mode STRICT
Auto-increment increment 1, offset 1
InnoDB version 5.7.31
[root@niewj download]#
5. 查看 mysql 死锁信息: pt-deadlock-logger
命令:
pt-deadlock-logger --run-time=10 --interval=3 --create-dest-table --dest D=test,t=deadlocks u=root,p=123456
5.1 制作死锁
怎么制作一个死锁? 死锁的条件是什么呢? 两个事务, 先后依照不同的程序锁定各自须要的另一个数据, 就会死锁, 所以, 咱们筹备 2 个表: 外面都有 id= 1 的记录;
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| deadlocks |
| goods |
+----------------+
5.1.1 事务 1: 开启事务; 锁表 t2 的 id=1;
mysql> set autocommit=0;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from t2 where id=1 for update;
+----+---------+--------+
| id | name | price |
+----+---------+--------+
| 1 | 商品 1 | 200.17 |
+----+---------+--------+
1 row in set (0.00 sec)
5.1.2 事务 2: 开启事务; 锁表 goods 的 id=1;
mysql> set autocommit=0;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from goods where id=1 for update;
+----+---------+--------+
| id | name | price |
+----+---------+--------+
| 1 | 商品 1 | 200.17 |
+----+---------+--------+
1 row in set (0.00 sec)
5.1.3 事务 1: 锁表 goods 的 id=1;
mysql> select * from goods where id=1 for update;
+----+---------+--------+
| id | name | price |
+----+---------+--------+
| 1 | 商品 1 | 200.17 |
+----+---------+--------+
1 row in set (16.25 sec)
5.1.4 事务 2: 锁表 t2 的 id=1;
mysql> select * from t2 where id=1 for update;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
mysql>
能够看到, 此时已产生死锁:
5.2. 执行 5.1 的命令
[root@niewj download]# pt-deadlock-logger --run-time=10 --interval=3 --create-dest-table --dest D=test,t=deadlocks u=root,p=123456
server ts thread txn_id txn_time user hostname ip db tbl idx lock_type lock_mode wait_hold victim query
niewj 2020-09-11T00:45:08 97 0 96 root localhost test t2 GEN_CLUST_INDEX RECORD X w 1 select * from t2 where id=1 for update
niewj 2020-09-11T00:45:08 98 0 409 root localhost test goods id RECORD X w 0 select * from goods where id=1 for update
[root@niewj download]
5.3 查看死锁日志表中记录:
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from deadlocks;
+--------+---------------------+--------+--------+----------+------+-----------+----+------+-------+-----------------+-----------+-----------+-----------+--------+-------------------------------------------+
| server | ts | thread | txn_id | txn_time | user | hostname | ip | db | tbl | idx | lock_type | lock_mode | wait_hold | victim | query |
+--------+---------------------+--------+--------+----------+------+-----------+----+------+-------+-----------------+-----------+-----------+-----------+--------+-------------------------------------------+
| niewj | 2020-09-11 00:45:08 | 97 | 0 | 96 | root | localhost | | test | t2 | GEN_CLUST_INDEX | RECORD | X | w | 1 | select * from t2 where id=1 for update |
| niewj | 2020-09-11 00:45:08 | 98 | 0 | 409 | root | localhost | | test | goods | id | RECORD | X | w | 0 | select * from goods where id=1 for update |
+--------+---------------------+--------+--------+----------+------+-----------+----+------+-------+-----------------+-----------+-----------+-----------+--------+-------------------------------------------+
2 rows in set (0.00 sec)
下面的表格内容如下:
server | ts | thread | txn_id | txn_time | user | hostname | db | tbl | idx | lock_type | lock_mode | wait_hold | victim | query |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
niewj | 2020-09-11 00:45:08 | 97 | 0 | 96 | root | localhost | test | t2 | GEN_CLUST_INDEX | RECORD | X | w | 1 | select * from t2 where id=1 for update |
niewj | 2020-09-11 00:45:08 | 98 | 0 | 409 | root | localhost | test | goods | id | RECORD | X | w | 0 | select * from goods where id=1 for update |
6. 查看索引的应用 pt-index-usage
[root@niewj download]# pt-index-usage --user=root --password=123456 --host=localhost /var/lib/mysql/niewj-slow.log
[root@niewj download]#
7. 查看数据库中反复的索引 pt-duplicate-key-checker
[root@niewj download]# pt-duplicate-key-checker --user=root --password=123456 --host=localhost
# ########################################################################
# Summary of indexes
# ########################################################################
# Total Indexes 97
[root@niewj download]#
8. 查看数据库 IO 情况 pt-ioprofile (线上数据库忙时慎用)
[root@niewj download]# pt-ioprofile
2020 年 09 月 11 日 星期五 01:29:56 CST
Tracing process ID 18315
9. 查看表中数据容量大于某个值的: pt-find
pt-find –user=root –password=123456 –tablesize +2M
[root@niewj download]# pt-find --user=root --password=123456 --tablesize +2M
`sakila`.`payment`
`sakila`.`rental`
`test`.`goods`
[root@niewj download]# pt-find --user=root --password=123456 --tablesize +1M
`mysql`.`help_topic`
`sakila`.`payment`
`sakila`.`rental`
`test`.`goods`
[root@niewj download]#
能够看到, 大于 1M 的表有 4 个; 大于 2M 的表有 3 个;
10. 杀死查问时长超时的连贯 pt-kill
pt-kill –user=root –password=123456 –busy-time=5 –print 仅仅打印
pt-kill –user=root –password=123456 –busy-time=5 –kill 发现后 kill 掉
模仿:
关上一个 session;
10.1 pt-kill 仅监听 + 打印
- session1: pt-kill 监听
[root@niewj download]# pt-kill --user=root --password=Youareapig#3 --busy-time=5 --print
# 2020-09-11T01:40:57 KILL 108 (Query 6 sec) select sleep(18)
# 2020-09-11T01:40:59 KILL 108 (Query 8 sec) select sleep(18)
# 2020-09-11T01:41:01 KILL 108 (Query 10 sec) select sleep(18)
# 2020-09-11T01:41:03 KILL 108 (Query 12 sec) select sleep(18)
# 2020-09-11T01:41:05 KILL 108 (Query 14 sec) select sleep(18)
# 2020-09-11T01:41:07 KILL 108 (Query 16 sec) select sleep(18)
...
- session 2: 模仿超时
mysql> select sleep(18);
+-----------+
| sleep(18) |
+-----------+
| 0 |
+-----------+
1 row in set (18.00 sec)
mysql>
10.2 pt-kill 监听 +kill
session1:
[root@niewj download]# pt-kill --user=root --password=Youareapig#3 --busy-time=5 --kill --print
# 2020-09-11T01:45:27 KILL 112 (Query 5 sec) select sleep(18)
session2: 超时后 连贯失落(被杀掉了)
mysql> select sleep(18);
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql>
11. 利用工具查出三大类有问题的 sql
11.1 查问次数多 + 每次耗时的 sql(慢日志的前几个 query)
pt-query-digest 的前几个 query, 就是指标;
profile局部就是
11.2 IO 占用大的 sql(Rows Examine 大的)
pt-query-digest 中的 Rows Examine 占用大, 阐明扫描的行数多, 往往是优化的指标!
Rows Examine 占用大
examine: vt. 查看;考察;检测;考试
11.3 索引利用率低的 sql(Rows Examine 和 Rows Send 相差较大的)
检测 (扫描) 的行数 =Rows Examine
Rows Examine 越大, 就意味着扫描表的范畴越大 , 意味着 IO 大;
返回给调用方的数据行数 =Rows Send
Rows Send 越大, 意味着返回给调用方的数据越多;
Rows Examine 和 Rows Send 的关系:
退出扫描表 t1 中数据 Rows Examine 扫描了 100 万条; Rows Send 只有 100 条, 就是说破费了大的 IO, 才找到这么寥寥 100 条, 真的须要看看索引的命中率了;
论断: Rows Examine 和 Rows Send 相差不能太大, 相差太大, 阐明索引命中率不高, 须要思考优化了!