1. pt-query-digest简介

第三方工具. perl脚本;

能够剖析 binlog, general log, slowlog; 也能够通过 show processlist 或者 通过 tcpdump 抓取MySQL协定数据来进行剖析.

能够把剖析后果输入到文件中. 剖析过程是 先对查问语句的条件 参数化, 而后对参数化的查问进行分组统计, 统计出各个查问的执行工夫, 次数, 占比等, 能够借助剖析后果, 找出问题, 进而进行优化.

2. pt-query-digest装置

2.1 装置perl模块

须要装置perl模块

yum install -y perl-CPAN per-Time_HiRes

2.2 percona上下载 percona-toolkit工具

https://www.percona.com/downl...

wget https://www.percona.com/downl...

下载rpm包, 下载实现后:

yum localinstall -y percona-toolkit-3.2.1-1.el7.x86_64.rpm

装置即可

2.3 装置遇到的问题: libmysqlclient.so

我装置中遇到:

谬误:软件包:perl-DBD-MySQL-4.023-6.el7.x86_64 (base)

     须要:libmysqlclient.so.18(libmysqlclient_18)(64bit)

谬误:软件包:perl-DBD-MySQL-4.023-6.el7.x86_64 (base)

     须要:libmysqlclient.so.18()(64bit)

您能够尝试增加 --skip-broken 选项来解决该问题
您能够尝试执行:rpm -Va --nofiles --nodigest

解决形式: 找到mysql装置的tar包, 装置这两个:

[root@niewj download]# rpm -ivh mysql-community-libs-5.7.31-1.el7.x86_64.rpm
[root@niewj download]# rpm -ivh mysql-community-libs-compat-5.7.31-1.el7.x86_64.rpm
[root@niewj download]#yum localinstall -y --skip-broken percona-toolkit-3.2.1-1.el7.x86_64.rpm

最初装置 percona-toolkit就能够了!

2.4 验证 pt-query-digest 的装置

[root@niewj download]# pt-query-digest --helppt-query-digest analyzes MySQL queries from slow, general, and binary log files.It can also analyze queries from C<SHOW PROCESSLIST> and MySQL protocol datafrom tcpdump.  By default, queries are grouped by fingerprint and reported indescending order of query time (i.e. the slowest queries first).  If no C<FILES>are given, the tool reads C<STDIN>.  The optional C<DSN> is used for certainoptions like L<"--since"> and L<"--until">.  For more details, please use the--help option, or try 'perldoc /usr/bin/pt-query-digest' for completedocumentation.Usage: pt-query-digest [OPTIONS] [FILES] [DSN]Options:  --ask-pass                   Prompt for a password when connecting to MySQL  --attribute-aliases=a        List of attribute|alias,etc (default db|Schema)  --attribute-value-limit=i    A sanity limit for attribute values (default 0)  --charset=s              -A  Default character set  --config=A                   Read this comma-separated list of config files;                               if specified, this must be the first option on                               the command line  --[no]continue-on-error      Continue parsing even if there is an error (                               default yes)........省略了大略200行......[root@niewj download]# 
好长~

2.5 pt-summary命令查看各种维度summary信息;

[root@niewj download]# pt-summary# Percona Toolkit System Summary Report ######################        Date | 2020-09-10 15:49:20 UTC (local TZ: CST +0800)    Hostname | xxx      Uptime | 23 days,  7:09,  1 user,  load average: 0.00, 0.02, 0.05      System | Alibaba Cloud; Alibaba Cloud ECS; vpc-i440fx-2.1 (Other) Service Tag | xxxxxxxxxxxxxxxxxxxxx    Platform | Linux     Release | CentOS Linux release 7.8.2003 (Core)....................省略200行
很长, 蕴含 cpu/memory/disk/filesystem/io/net 各种各样的维度;

2.6 pt-diskstats命令 查看磁盘io信息, 每秒更新一次

2.7 pt-mysql-summary命令

root@niewj download]# pt-mysql-summary --user=root --password=123456mysql: [Warning] Using a password on the command line interface can be insecure.....省略

3. pt-query-digest命令-外围

3.1 命令格局:

pt-query-digest /var/lib/mysql/niewj-slow.log

或者, 如果有信息显示不全, 加参数 --limit=100%

pt-query-digest --limit=100% /var/lib/mysql/niewj-slow.log

3.2 汇总局部:

Overall: 13 total, 4 unique, 0.00 QPS, 0.02x concurrency

总共多少个sql, 去重后多少个

执行工夫/锁定工夫/rows发送总量/查问总量 以及各种维度信息;

3.3 明细局部:

# Query 1:

# Query 1: 0.00 QPS, 0.01x concurrency, ID 0xDA8417432ACB5A8CDF14F4107EA1D77C at byte 2309
........等等略, 详见上面
select * from goods where id>10000G

[root@niewj download]# pt-query-digest /var/lib/mysql/niewj-slow.log # 180ms user time, 20ms system time, 26.07M rss, 220.58M vsz# Current date: Fri Sep 11 00:11:06 2020# Hostname: niewj# Files: /var/lib/mysql/niewj-slow.log# Overall: 13 total, 4 unique, 0.00 QPS, 0.02x concurrency _______________# Time range: 2020-09-10T09:08:18 to 2020-09-10T12:40:37# Attribute          total     min     max     avg     95%  stddev  median# ============     ======= ======= ======= ======= ======= ======= =======# Exec time           204s      4s     45s     16s     27s     12s     16s# Lock time          891us       0   142us    68us   138us    57us    77us# Rows sent        724.87k       0 146.48k  55.76k 143.37k  65.31k  15.04k# Rows examine     724.87k       0 146.48k  55.76k 143.37k  65.31k       0# Query size           266      11      37   20.46   36.69   11.06   16.89# Profile# Rank Query ID                           Response time Calls R/Call  V/M # ==== ============================ ============= ===== ======= ====#    1 0xDA8417432ACB5A8CDF14F4107E 74.4643 36.5%     4 18.6161  3.31 SELECT goods#    2 0xF98ECC2695E4FADE8DB1D907FB 66.7331 32.7%     6 11.1222 20.33 CALL Proc#    3 0xF25AB949C6CDC4B9D30BE11F01 51.6891 25.4%     2 25.8445  0.39 SELECT goods#    4 0x59A74D08D407B5EDF9A57DD5A4 11.0003  5.4%     1 11.0003  0.00 SELECT# Query 1: 0.00 QPS, 0.01x concurrency, ID 0xDA8417432ACB5A8CDF14F4107EA1D77C at byte 2309# This item is included in the report because it matches --limit.# Scores: V/M = 3.31# Time range: 2020-09-10T10:26:24 to 2020-09-10T12:40:37# Attribute    pct   total     min     max     avg     95%  stddev  median# ============ === ======= ======= ======= ======= ======= ======= =======# Count         30       4# Exec time     36     74s      5s     24s     19s     24s      8s     23s# Lock time     62   554us   134us   142us   138us   138us     3us   138us# Rows sent     59 431.90k  31.51k 136.72k 107.97k 136.54k  44.57k 136.54k# Rows examine  59 431.90k  31.51k 136.72k 107.97k 136.54k  44.57k 136.54k# Query size    54     146      35      37   36.50   36.69    0.76   36.69# String:# Databases    test# Hosts        121.69.51.10# Users        root# Query_time distribution#   1us#  10us# 100us#   1ms#  10ms# 100ms#    1s  ######################  10s+  ################################################################# Tables#    SHOW TABLE STATUS FROM `test` LIKE 'goods'\G#    SHOW CREATE TABLE `test`.`goods`\G# EXPLAIN /*!50100 PARTITIONS*/select * from goods  where   id>10000\G# Query 2: 0.04 QPS, 0.42x concurrency, ID 0xF98ECC2695E4FADE8DB1D907FB00F811 at byte 1426# This item is included in the report because it matches --limit.# Scores: V/M = 20.33# Time range: 2020-09-10T10:06:36 to 2020-09-10T10:09:16# Attribute    pct   total     min     max     avg     95%  stddev  median# ============ === ======= ======= ======= ======= ======= ======= =======# Count         46       6# Exec time     32     67s      4s     45s     11s     45s     15s      4s# Lock time     15   138us       0    72us    23us    69us    31us       0# Rows sent      0       0       0       0       0       0       0       0# Rows examine   0       0       0       0       0       0       0       0# Query size    24      66      11      11      11      11       0      11# String:# Databases    test# Hosts        121.69.51.10# Users        root# Query_time distribution#   1us#  10us# 100us#   1ms#  10ms# 100ms#    1s  #################################################################  10s+  ############call Proc()\G# Query 3: 0.00 QPS, 0.05x concurrency, ID 0xF25AB949C6CDC4B9D30BE11F014A9A39 at byte 2087# This item is included in the report because it matches --limit.# Scores: V/M = 0.39# Time range: 2020-09-10T10:22:48 to 2020-09-10T10:39:56# Attribute    pct   total     min     max     avg     95%  stddev  median# ============ === ======= ======= ======= ======= ======= ======= =======# Count         15       2# Exec time     25     52s     24s     28s     26s     28s      3s     26s# Lock time     22   199us    85us   114us    99us   114us    20us    99us# Rows sent     40 292.97k 146.48k 146.48k 146.48k 146.48k       0 146.48k# Rows examine  40 292.97k 146.48k 146.48k 146.48k 146.48k       0 146.48k# Query size    14      38      19      19      19      19       0      19# String:# Databases    test# Hosts        121.69.51.10# Users        root# Query_time distribution#   1us#  10us# 100us#   1ms#  10ms# 100ms#    1s#  10s+  ################################################################# Tables#    SHOW TABLE STATUS FROM `test` LIKE 'goods'\G#    SHOW CREATE TABLE `test`.`goods`\G# EXPLAIN /*!50100 PARTITIONS*/select * from goods\G# Query 4: 0 QPS, 0x concurrency, ID 0x59A74D08D407B5EDF9A57DD5A41825CA at byte 0# This item is included in the report because it matches --limit.# Scores: V/M = 0.00# Time range: all events occurred at 2020-09-10T09:08:18# Attribute    pct   total     min     max     avg     95%  stddev  median# ============ === ======= ======= ======= ======= ======= ======= =======# Count          7       1# Exec time      5     11s     11s     11s     11s     11s       0     11s# Lock time      0       0       0       0       0       0       0       0# Rows sent      0       1       1       1       1       1       0       1# Rows examine   0       0       0       0       0       0       0       0# Query size     6      16      16      16      16      16       0      16# String:# Databases    sakila# Hosts        121.69.51.10# Users        root# Query_time distribution#   1us#  10us# 100us#   1ms#  10ms# 100ms#    1s#  10s+  ################################################################# EXPLAIN /*!50100 PARTITIONS*/select sleep(11)\G[root@niewj download]#

4. 查看mysql的主从状态: pt-slave-find(理解一下)

4.1 命令格局:

pt-slave-find --host=localhost --user=root --password=123456

4.2 命令示例:

[root@niewj download]# pt-slave-find --host=localhost --user=root --password=123456localhostVersion         5.7.31Server ID       0Uptime          07:45:01 (started 2020-09-10T16:42:50)Replication     Is not a slave, has 0 slaves connected, is not read_onlyFilters         Binary logging  ROWSlave status    Slave mode      STRICTAuto-increment  increment 1, offset 1InnoDB version  5.7.31[root@niewj download]# 

5. 查看mysql死锁信息: pt-deadlock-logger

命令:

pt-deadlock-logger --run-time=10 --interval=3 --create-dest-table --dest D=test,t=deadlocks u=root,p=123456

5.1 制作死锁

怎么制作一个死锁? 死锁的条件是什么呢? 两个事务, 先后依照不同的程序锁定各自须要的另一个数据, 就会死锁, 所以, 咱们筹备2个表: 外面都有id=1的记录;

mysql> show tables;+----------------+| Tables_in_test |+----------------+| deadlocks      || goods          |+----------------+

5.1.1 事务1: 开启事务; 锁表 t2的id=1;

mysql> set autocommit=0;Query OK, 0 rows affected (0.00 sec)mysql> select * from t2 where id=1 for update;+----+---------+--------+| id | name    | price  |+----+---------+--------+|  1 | 商品1   | 200.17 |+----+---------+--------+1 row in set (0.00 sec)

5.1.2 事务2: 开启事务; 锁表 goods的id=1;

mysql> set autocommit=0;Query OK, 0 rows affected (0.00 sec)mysql> select * from goods where id=1 for update;+----+---------+--------+| id | name    | price  |+----+---------+--------+|  1 | 商品1   | 200.17 |+----+---------+--------+1 row in set (0.00 sec)

5.1.3 事务1: 锁表 goods的id=1;

mysql> select * from goods where id=1 for update;+----+---------+--------+| id | name    | price  |+----+---------+--------+|  1 | 商品1   | 200.17 |+----+---------+--------+1 row in set (16.25 sec)

5.1.4 事务2: 锁表 t2的id=1;

mysql> select * from t2 where id=1 for update;ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transactionmysql> 

能够看到, 此时已产生死锁:

5.2. 执行5.1的命令

[root@niewj download]# pt-deadlock-logger --run-time=10 --interval=3 --create-dest-table --dest D=test,t=deadlocks u=root,p=123456server ts thread txn_id txn_time user hostname ip db tbl idx lock_type lock_mode wait_hold victim queryniewj 2020-09-11T00:45:08 97 0 96 root localhost  test t2 GEN_CLUST_INDEX RECORD X w 1 select * from t2 where id=1 for updateniewj 2020-09-11T00:45:08 98 0 409 root localhost  test goods id RECORD X w 0 select * from goods where id=1 for update[root@niewj download]

5.3 查看死锁日志表中记录:

mysql> use test;Reading table information for completion of table and column namesYou can turn off this feature to get a quicker startup with -ADatabase changedmysql> select * from deadlocks;+--------+---------------------+--------+--------+----------+------+-----------+----+------+-------+-----------------+-----------+-----------+-----------+--------+-------------------------------------------+| server | ts                  | thread | txn_id | txn_time | user | hostname  | ip | db   | tbl   | idx             | lock_type | lock_mode | wait_hold | victim | query                                     |+--------+---------------------+--------+--------+----------+------+-----------+----+------+-------+-----------------+-----------+-----------+-----------+--------+-------------------------------------------+| niewj  | 2020-09-11 00:45:08 |     97 |      0 |       96 | root | localhost |    | test | t2    | GEN_CLUST_INDEX | RECORD    | X         | w         |      1 | select * from t2 where id=1 for update    || niewj  | 2020-09-11 00:45:08 |     98 |      0 |      409 | root | localhost |    | test | goods | id              | RECORD    | X         | w         |      0 | select * from goods where id=1 for update |+--------+---------------------+--------+--------+----------+------+-----------+----+------+-------+-----------------+-----------+-----------+-----------+--------+-------------------------------------------+2 rows in set (0.00 sec)

下面的表格内容如下:

servertsthreadtxn_idtxn_timeuserhostnamedbtblidxlock_typelock_modewait_holdvictimquery
niewj2020-09-11 00:45:0897096rootlocalhosttestt2GEN_CLUST_INDEXRECORDXw1select * from t2 where id=1 for update
niewj2020-09-11 00:45:08980409rootlocalhosttestgoodsidRECORDXw0select * from goods where id=1 for update

6. 查看索引的应用 pt-index-usage

[root@niewj download]# pt-index-usage --user=root --password=123456 --host=localhost /var/lib/mysql/niewj-slow.log [root@niewj download]#

7. 查看数据库中反复的索引 pt-duplicate-key-checker

[root@niewj download]# pt-duplicate-key-checker --user=root --password=123456 --host=localhost# ######################################################################### Summary of indexes                                                      # ######################################################################### Total Indexes  97[root@niewj download]# 

8. 查看数据库IO情况 pt-ioprofile (线上数据库忙时慎用)

[root@niewj download]# pt-ioprofile2020年 09月 11日 星期五 01:29:56 CSTTracing process ID 18315

9. 查看表中数据容量大于某个值的: pt-find

pt-find --user=root --password=123456 --tablesize +2M

[root@niewj download]# pt-find --user=root --password=123456 --tablesize +2M`sakila`.`payment``sakila`.`rental``test`.`goods`[root@niewj download]# pt-find --user=root --password=123456 --tablesize +1M`mysql`.`help_topic``sakila`.`payment``sakila`.`rental``test`.`goods`[root@niewj download]# 

能够看到, 大于1M的表有4个; 大于2M的表有3个;

10. 杀死查问时长超时的连贯 pt-kill

pt-kill --user=root --password=123456 --busy-time=5 --print 仅仅打印

pt-kill --user=root --password=123456 --busy-time=5 --kill 发现后kill掉

模仿:

关上一个session;

10.1 pt-kill 仅监听+打印

  1. session1: pt-kill监听
[root@niewj download]# pt-kill --user=root --password=Youareapig#3 --busy-time=5 --print# 2020-09-11T01:40:57 KILL 108 (Query 6 sec) select sleep(18)# 2020-09-11T01:40:59 KILL 108 (Query 8 sec) select sleep(18)# 2020-09-11T01:41:01 KILL 108 (Query 10 sec) select sleep(18)# 2020-09-11T01:41:03 KILL 108 (Query 12 sec) select sleep(18)# 2020-09-11T01:41:05 KILL 108 (Query 14 sec) select sleep(18)# 2020-09-11T01:41:07 KILL 108 (Query 16 sec) select sleep(18)...
  1. session 2: 模仿超时
mysql> select sleep(18);+-----------+| sleep(18) |+-----------+|         0 |+-----------+1 row in set (18.00 sec)mysql> 

10.2 pt-kill 监听+kill

session1:

[root@niewj download]# pt-kill --user=root --password=Youareapig#3 --busy-time=5 --kill --print# 2020-09-11T01:45:27 KILL 112 (Query 5 sec) select sleep(18)

session2: 超时后 连贯失落(被杀掉了)

mysql> select sleep(18);ERROR 2013 (HY000): Lost connection to MySQL server during querymysql> 

11. 利用工具查出三大类有问题的sql

11.1 查问次数多+每次耗时的sql(慢日志的前几个query)

pt-query-digest的前几个query, 就是指标;

profile局部就是

11.2 IO占用大的sql(Rows Examine大的)

pt-query-digest中的 Rows Examine 占用大, 阐明扫描的行数多, 往往是优化的指标!

Rows Examine 占用大

examine: vt. 查看;考察; 检测;考试

11.3 索引利用率低的sql(Rows Examine和Rows Send相差较大的)

检测(扫描)的行数=Rows Examine

Rows Examine 越大, 就意味着扫描表的范畴越大 , 意味着IO大;

返回给调用方的数据行数=Rows Send

Rows Send 越大, 意味着返回给调用方的数据越多;

Rows Examine 和 Rows Send的关系:

退出扫描表t1中数据 Rows Examine 扫描了100万条; Rows Send只有 100条, 就是说破费了大的IO, 才找到这么寥寥100条, 真的须要看看索引的命中率了;

论断: Rows Examine 和 Rows Send 相差不能太大, 相差太大, 阐明索引命中率不高, 须要思考优化了!

12. 重点小结: pt-query-digest工具+3种有问题的sql

12.1 pt-query-digest命令看慢查问日志的办法

12.2 三种问题SQL

12.2.1 慢日志前几个query

12.2.2 Rows Examine 比拟大的-IO占用大

12.2.3 Rows Examine 和Rows Send 相差较大的, 索引命中率低