- 1. 说在后面的话
- 2. 装置 employees 测试库
- 3. 观测 SQL 运行状态
-
- 3.1 观测 SQL 运行时的内存耗费
- 3.2 观测 SQL 运行时的其余开销
- 3.3 观测 SQL 运行进度
感知 SQL 运行时的状态
1. 说在后面的话
在 MySQL 里,一条 SQL 运行时产生多少磁盘 I /O,占用多少内存,是否有创立长期表,这些指标如果都能观测到,有助于更快发现 SQL 瓶颈,点燃潜在隐患。
从 MySQL 5.7 版本开始,performance_schema
就默认启用了,并且还减少了sys schema
,到了 8.0 版本又进一步失去加强晋升,在 SQL 运行时就能察看到很多有用的信息,实现肯定水平的可观测性。
上面举例说明如何进行观测,以及次要观测哪些指标。
2. 装置 employees 测试库
装置 MySQL 官网提供的 employees
测试数据库,戳此链接 (https://dev.mysql.com/doc/index-other.html) 下载,解压缩后开始装置:
$ mysql -f < employees.sql;
INFO
CREATING DATABASE STRUCTURE
INFO
storage engine: InnoDB
INFO
LOADING departments
INFO
LOADING employees
INFO
LOADING dept_emp
INFO
LOADING dept_manager
INFO
LOADING titles
INFO
LOADING salaries
data_load_time_diff
00:00:37
MySQL 还提供了相应的应用文档:https://dev.mysql.com/doc/employee/en/
本次测试采纳 GreatSQL 8.0.32-24 版本,且运行在 MGR 环境中:
greatsql> \s
...
Server version: 8.0.32-24 GreatSQL, Release 24, Revision 3714067bc8c
...
greatsql> select MEMBER_ID, MEMBER_ROLE, MEMBER_VERSION from performance_schema.replication_group_members;
+--------------------------------------+-------------+----------------+
| MEMBER_ID | MEMBER_ROLE | MEMBER_VERSION |
+--------------------------------------+-------------+----------------+
| 2adec6d2-febb-11ed-baca-d08e7908bcb1 | SECONDARY | 8.0.32 |
| 2f68fee2-febb-11ed-b51e-d08e7908bcb1 | ARBITRATOR | 8.0.32 |
| 5e34a5e2-feb6-11ed-b288-d08e7908bcb1 | PRIMARY | 8.0.32 |
+--------------------------------------+-------------+----------------+
3. 观测 SQL 运行状态
查看以后连贯 / 会话的连贯 ID、外部线程 ID:
greatsql> select processlist_id, thread_id from performance_schema.threads where processlist_id = connection_id();
+----------------+-----------+
| processlist_id | thread_id |
+----------------+-----------+
| 110 | 207 |
+----------------+-----------+
查问失去以后的连贯 ID=110,外部线程 ID=207。
P.S,因为本文整顿过程不是间断的,所以上面看到的 thread_id 值可能会有好几个,每次都不同。
3.1 观测 SQL 运行时的内存耗费
执行上面的 SQL,查问所有员工的薪资总额,按员工号分组,并按薪资总额倒序,取前 10 条记录:
greatsql> explain select emp_no, sum(salary) as total_salary from salaries group by emp_no order by total_salary desc limit 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: salaries
partitions: NULL
type: index
possible_keys: PRIMARY
key: PRIMARY
key_len: 7
ref: NULL
rows: 2838426
filtered: 100.00
Extra: Using temporary; Using filesort
看到须要全索引扫描(其实也等同于全表扫描,因为是基于 PRIMARY 索引),并且还须要生成长期表,以及额定的 filesort。
在正式运行该 SQL 之前,在另外的窗口中新建一个连贯会话,执行上面的 SQL 先察看该连贯 / 会话以后的内存分配情况:
greatsql> select * from sys.x$memory_by_thread_by_current_bytes where thread_id = 207\G
*************************** 1. row ***************************
thread_id: 207
user: root@localhost
current_count_used: 9
current_allocated: 26266
current_avg_alloc: 2918.4444
current_max_alloc: 16464
total_allocated: 30311
等到该 SQL 执行完了,再一次查问内存分配情况:
greatsql> select * from sys.x$memory_by_thread_by_current_bytes where thread_id = 207\G
*************************** 1. row ***************************
thread_id: 207
user: root@localhost
current_count_used: 13
current_allocated: 24430
current_avg_alloc: 1879.2308
current_max_alloc: 16456
total_allocated: 95719
咱们留神到几个数据的变动状况,用上面表格来展现:
指标 | 运行前 | 运行后 |
---|---|---|
total_allocated | 30311 | 95719 |
也就是说,SQL 运行时,须要调配的内存是:95719 – 30311 = 65408 字节。
3.2 观测 SQL 运行时的其余开销
通过观察 performance_schema.status_by_thread
表,能够晓得相应连贯 / 会话中 SQL 运行的一些状态指标。在 SQL 运行完结后,执行上面的 SQL 命令即可查看:
greatsql> select * from performance_schema.status_by_thread where thread_id = 207;
...
| 207 | Created_tmp_disk_tables | 0 |
| 207 | Created_tmp_tables | 0 |
...
| 207 | Handler_read_first | 1 |
| 207 | Handler_read_key | 1 |
| 207 | Handler_read_last | 0 |
| 207 | Handler_read_next | 2844047 |
| 207 | Handler_read_prev | 0 |
| 207 | Handler_read_rnd | 0 |
| 207 | Handler_read_rnd_next | 0 |
| 207 | Handler_rollback | 0 |
| 207 | Handler_savepoint | 0 |
| 207 | Handler_savepoint_rollback | 0 |
| 207 | Handler_update | 0 |
| 207 | Handler_write | 0 |
| 207 | Last_query_cost | 286802.914893 |
| 207 | Last_query_partial_plans | 1 |
...
| 207 | Select_full_join | 0 |
| 207 | Select_full_range_join | 0 |
| 207 | Select_range | 0 |
| 207 | Select_range_check | 0 |
| 207 | Select_scan | 1 |
| 207 | Slow_launch_threads | 0 |
| 207 | Slow_queries | 1 |
| 207 | Sort_merge_passes | 0 |
| 207 | Sort_range | 0 |
| 207 | Sort_rows | 1 |
| 207 | Sort_scan | 1 |
...
下面咱们只列举了局部比拟重要的状态指标。从这个后果也能够佐证 slow query log 中的后果,的确没创立长期表。
作为参照,查看这条 SQL 对应的 slow query log 记录:
# Query_time: 0.585593 Lock_time: 0.000002 Rows_sent: 10 Rows_examined: 2844057 Thread_id: 110 Errno: 0 Killed: 0 Bytes_received: 115 Bytes_sent: 313 Read_first: 1 Read_last: 0 Read_key: 1 Read_next: 2844047 Read_prev: 0 Read_rnd: 0 Read_rnd_next: 0 Sort_merge_passes: 0 Sort_range_count: 0 Sort_rows: 10 Sort_scan_count: 1 Created_tmp_disk_tables: 0 Created_tmp_tables: 0 Start: 2023-07-06T10:06:01.438376+08:00 End: 2023-07-06T10:06:02.023969+08:00 Schema: employees Rows_affected: 0
# Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0
# InnoDB_trx_id: 0
# Full_scan: Yes Full_join: No Tmp_table: No Tmp_table_on_disk: No
# Filesort: Yes Filesort_on_disk: No Merge_passes: 0
# InnoDB_IO_r_ops: 0 InnoDB_IO_r_bytes: 0 InnoDB_IO_r_wait: 0.000000
# InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000
# InnoDB_pages_distinct: 4281
use employees;
SET timestamp=1688609161;
select emp_no, sum(salary) as total_salary from salaries group by emp_no order by total_salary desc limit 10;
能够看到,Created_tmp_disk_tables
, Created_tmp_tables
, Handler_read_next
, Select_full_join
, Select_scan
, Sort_rows
, Sort_scan
, 等几个指标的数值是一样的。
还能够查看该 SQL 运行时的 I /O latency 状况,SQL 运行前后两次查问比照:
greatsql> select * from sys.io_by_thread_by_latency where thread_id = 207;
+----------------+-------+---------------+-------------+-------------+-------------+-----------+----------------+
| user | total | total_latency | min_latency | avg_latency | max_latency | thread_id | processlist_id |
+----------------+-------+---------------+-------------+-------------+-------------+-----------+----------------+
| root@localhost | 7 | 75.39 us | 5.84 us | 10.77 us | 22.12 us | 207 | 110 |
+----------------+-------+---------------+-------------+-------------+-------------+-----------+----------------+
...
greatsql> select * from sys.io_by_thread_by_latency where thread_id = 207;
+----------------+-------+---------------+-------------+-------------+-------------+-----------+----------------+
| user | total | total_latency | min_latency | avg_latency | max_latency | thread_id | processlist_id |
+----------------+-------+---------------+-------------+-------------+-------------+-----------+----------------+
| root@localhost | 8 | 85.29 us | 5.84 us | 10.66 us | 22.12 us | 207 | 110 |
+----------------+-------+---------------+-------------+-------------+-------------+-----------+----------------+
能够看到这个 SQL 运行时的 I /O latency 是:85.29 – 75.39 = 9.9us。
3.3 观测 SQL 运行进度
咱们晓得,运行完一条 SQL 后,能够利用 PROFLING 性能查看它各个阶段的耗时,然而在运行时如果也想查看各阶段耗时该怎么办呢?
从 MySQL 5.7 版本开始,能够通过 performance_schema.events_stages_%
相干表查看 SQL 运行过程以及各阶段耗时,须要先批改相干设置:
# 确认是否对所有主机 & 用户都启用
greatsql> SELECT * FROM performance_schema.setup_actors;
+------+------+------+---------+---------+
| HOST | USER | ROLE | ENABLED | HISTORY |
+------+------+------+---------+---------+
| % | % | % | NO | NO |
+------+------+------+---------+---------+
# 批改成对所有主机 & 用户都启用
greatsql> UPDATE performance_schema.setup_actors
SET ENABLED = 'YES', HISTORY = 'YES'
WHERE HOST = '%' AND USER = '%';
# 批改 setup_instruments & setup_consumers 设置
greatsql> UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME LIKE '%events_statements_%';
greatsql> UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME LIKE '%events_stages_%';
这就实时能够观测 SQL 运行过程中的状态了。
在 SQL 运行过程中,从另外的窗口查看该 SQL 对应的 EVENT_ID
:
greatsql> SELECT EVENT_ID, TRUNCATE(TIMER_WAIT/1000000000000,6) as Duration, SQL_TEXT FROM performance_schema.events_statements_history WHERE thread_id = 85 order by event_id desc limit 5;
+----------+----------+-------------------------------------------------------------------------------------------------------------------------------+
| EVENT_ID | Duration | SQL_TEXT |
+----------+----------+-------------------------------------------------------------------------------------------------------------------------------+
| 149845 | 0.6420 | select emp_no, sum(salary) as total_salary, sleep(0.000001) from salaries group by emp_no order by total_salary desc limit 10 |
| 149803 | 0.6316 | select emp_no, sum(salary) as total_salary, sleep(0.000001) from salaries group by emp_no order by total_salary desc limit 10 |
| 149782 | 0.6245 | select emp_no, sum(salary) as total_salary, sleep(0.000001) from salaries group by emp_no order by total_salary desc limit 10 |
| 149761 | 0.6361 | select emp_no, sum(salary) as total_salary, sleep(0.000001) from salaries group by emp_no order by total_salary desc limit 10 |
| 149740 | 0.6245 | select emp_no, sum(salary) as total_salary, sleep(0.000001) from salaries group by emp_no order by total_salary desc limit 10 |
+----------+----------+-------------------------------------------------------------------------------------------------------------------------------+
# 再依据 EVENT_ID 值去查问 events_stages_history_long
greatsql> SELECT thread_id ,event_Id, event_name AS Stage, TRUNCATE(TIMER_WAIT/1000000000000,6) AS Duration FROM performance_schema.events_stages_history_long WHERE NESTING_EVENT_ID = 149845 order by event_id;
+-----------+----------+------------------------------------------------+----------+
| thread_id | event_Id | Stage | Duration |
+-----------+----------+------------------------------------------------+----------+
| 85 | 149846 | stage/sql/starting | 0.0000 |
| 85 | 149847 | stage/sql/Executing hook on transaction begin. | 0.0000 |
| 85 | 149848 | stage/sql/starting | 0.0000 |
| 85 | 149849 | stage/sql/checking permissions | 0.0000 |
| 85 | 149850 | stage/sql/Opening tables | 0.0000 |
| 85 | 149851 | stage/sql/init | 0.0000 |
| 85 | 149852 | stage/sql/System lock | 0.0000 |
| 85 | 149854 | stage/sql/optimizing | 0.0000 |
| 85 | 149855 | stage/sql/statistics | 0.0000 |
| 85 | 149856 | stage/sql/preparing | 0.0000 |
| 85 | 149857 | stage/sql/Creating tmp table | 0.0000 |
| 85 | 149858 | stage/sql/executing | 0.6257 |
| 85 | 149859 | stage/sql/end | 0.0000 |
| 85 | 149860 | stage/sql/query end | 0.0000 |
| 85 | 149861 | stage/sql/waiting for handler commit | 0.0000 |
| 85 | 149862 | stage/sql/closing tables | 0.0000 |
| 85 | 149863 | stage/sql/freeing items | 0.0000 |
| 85 | 149864 | stage/sql/logging slow query | 0.0000 |
| 85 | 149865 | stage/sql/cleaning up | 0.0000 |
+-----------+----------+------------------------------------------------+----------+
下面就是这条 SQL 的运行进度展现,以及各个阶段的耗时,和 PROFILING
的输入一样,当咱们理解一条 SQL 运行所须要经验的各个阶段时,从下面的输入后果中也就能估算出该 SQL 大略还要多久能跑完,决定是否要提前 kill 它。
如果想要察看 DDL SQL 的运行进度,能够参考这篇文章:不必 MariaDB/Percona 也能查看 DDL 的进度。
更多的观测指标、维度还有待持续开掘,当前有机会再写。
另外,也能够利用 MySQL Workbench 工具,或 MySQL Enterprise Monitor,都已集成了很多可观测性指标,相当不错的体验。
延长浏览
- Query Profiling Using Performance Schema, https://dev.mysql.com/doc/refman/8.0/en/performance-schema-qu…
- 不必 MariaDB/Percona 也能查看 DDL 的进度,https://mp.weixin.qq.com/s?__biz=MjM5NzAzMTY4NQ==&mid=2653931…
- 事件记录 | performance_schema 全方位介绍,http://mp.weixin.qq.com/s?__biz=MjM5NzAzMTY4NQ==&mid=26539350…
- 内存调配统计视图 | 全方位意识 sys 零碎库,http://mp.weixin.qq.com/s?__biz=MjM5NzAzMTY4NQ==&mid=26539351…
Enjoy GreatSQL :)
## 对于 GreatSQL
GreatSQL 是实用于金融级利用的国内自主开源数据库,具备高性能、高牢靠、高易用性、高平安等多个外围个性,能够作为 MySQL 或 Percona Server 的可选替换,用于线上生产环境,且完全免费并兼容 MySQL 或 Percona Server。
相干链接:GreatSQL 社区 Gitee GitHub Bilibili
GreatSQL 社区:
社区博客有奖征稿详情:https://greatsql.cn/thread-100-1-1.html
技术交换群:
微信:扫码增加
GreatSQL 社区助手
微信好友,发送验证信息加群
。