共计 16571 个字符,预计需要花费 42 分钟才能阅读完成。
- GreatSQL 社区原创内容未经受权不得随便应用,转载请分割小编并注明起源。
导语
在数据检索的过程中,常常会有多个列的匹配需要,明天介绍下联结索引的应用以及最左匹配准则的案例。
最左匹配准则作用在联结索引中,如果表中有一个联结索引 (tcol01,tcol02,tcol03)
,只有当 SQL 应用到tcol01、tcol02
索引的前提下,tcol03
的索引才会被应用;同理只有 tcol01
的索引被应用的前提下,tcol02
的索引才会被应用。
上面咱们来列举几个例子来阐明。
步骤
应用 mysql_random_data_load
创立测试数据
建库和建表
CREATE DATABASE IF NOT EXISTS test;
CREATE TABLE `test`.`t3` (`id` int(11) NOT NULL AUTO_INCREMENT,
`tcol01` tinyint(4) DEFAULT NULL,
`tcol02` smallint(6) DEFAULT NULL,
`tcol03` mediumint(9) DEFAULT NULL,
`tcol04` int(11) DEFAULT NULL,
`tcol05` bigint(20) DEFAULT NULL,
`tcol06` float DEFAULT NULL,
`tcol07` double DEFAULT NULL,
`tcol08` decimal(10,2) DEFAULT NULL,
`tcol09` date DEFAULT NULL,
`tcol10` datetime DEFAULT NULL,
`tcol11` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`tcol12` time DEFAULT NULL,
`tcol13` year(4) DEFAULT NULL,
`tcol14` varchar(100) DEFAULT NULL,
`tcol15` char(2) DEFAULT NULL,
`tcol16` blob,
`tcol17` text,
`tcol18` mediumtext,
`tcol19` mediumblob,
`tcol20` longblob,
`tcol21` longtext,
`tcol22` mediumtext,
`tcol23` varchar(3) DEFAULT NULL,
`tcol24` varbinary(10) DEFAULT NULL,
`tcol25` enum('a','b','c') DEFAULT NULL,
`tcol26` set('red','green','blue') DEFAULT NULL,
`tcol27` float(5,3) DEFAULT NULL,
`tcol28` double(4,2) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
建联结索引
CREATE INDEX idx_tcol123 ON t1(`tcol01`,`tcol02`,`tcol03`);
写入 100w 条测试数据
./mysql_random_data_load test t1 1000000 --user=root --password=GreatSQL --config-file=/data/GreatSQL/my.cnf
联结索引数据存储形式
先对索引中第一列的数据进行排序,而后在满足第一列数据排序的前提下,再对第二列数据进行排序,以此类推。
如下图:
索引最左准则案例
状况 1:三个索引都能应用上
试验 1:仅有 where 子句
# 三个条件都应用上,优化器能够本人调整程序满足索引要求
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol02=167 AND tcol03=202019 AND tcol01=1;
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------------+------+----------+-------+
| 1 | SIMPLE | t1 | NULL | ref | idx_tcol123 | idx_tcol123 | 9 | const,const,const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------------+------+----------+-------+
1 row in set, 1 warning (0.11 sec)
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol02=167 AND tcol01=1 AND tcol03=202019 ;
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------------+------+----------+-------+
| 1 | SIMPLE | t1 | NULL | ref | idx_tcol123 | idx_tcol123 | 9 | const,const,const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------------+------+----------+-------+
试验 2:WHERE 加 order by 子句
# 解析进去只有用到 tcol01,tcol02 索引,因为 `explain` 不会统计 `order by` 索引的信息,所有看起来 `key_len` 长度只有 5;当 tcol03 倒序的时候就会用到 `Backward index scan` 性能
[test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol01=1 AND tcol02=167 order by tcol03;
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+-------+
| 1 | SIMPLE | t1 | NULL | ref | idx_tcol123 | idx_tcol123 | 5 | const,const | 269 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+-------+
1 row in set, 1 warning (0.01 sec)
[test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol01=1 AND tcol02=167 order by tcol03 desc;
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+---------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+---------------------+
| 1 | SIMPLE | t1 | NULL | ref | idx_tcol123 | idx_tcol123 | 5 | const,const | 269 | 100.00 | Backward index scan |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+---------------------+
1 row in set, 1 warning (0.00 sec)
# 当 order by 中的字段不蕴含在联结索引中的时候,就会用到 `Using filesort`
[root@GreatSQL][test]>EXPLAIN SELECT /* NO_CACHE */ * FROM t1 WHERE tcol01=1 AND tcol02=167 order by tcol04;
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+----------------+
| 1 | SIMPLE | t1 | NULL | ref | idx_tcol123 | idx_tcol123 | 5 | const,const | 269 | 100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+----------------+
1 row in set, 1 warning (0.00 sec)
试验 3:仅 order by 子句
# 优化器默认采取全副扫描了,因为是查问出所有数据,所以全表扫描回比索引更快,节俭回表的工夫
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 ORDER BY tcol01,tcol02,tcol03;
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 941900 | 100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+
1 row in set, 1 warning (0.00 sec)
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 force index(`idx_tcol123`) ORDER BY tcol01,tcol02,tcol03;
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+--------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+--------+----------+-------+
| 1 | SIMPLE | t1 | NULL | index | NULL | idx_tcol123 | 9 | NULL | 941900 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+--------+----------+-------+
1 row in set, 1 warning (0.00 sec)
# 只筛选索引列,也会应用到索引,也就是所谓的笼罩索引
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ tcol01,tcol02,tcol03 FROM t1 ORDER BY tcol01,tcol02,tcol03;
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+--------+----------+-------------+
| 1 | SIMPLE | t1 | NULL | index | NULL | idx_tcol123 | 9 | NULL | 941900 | 100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
# 如果是筛选局部数据,那么就会应用到索引而不会全表扫描
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 ORDER BY tcol01,tcol02,tcol03 limit 10000,11110;
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------+
| 1 | SIMPLE | t1 | NULL | index | NULL | idx_tcol123 | 9 | NULL | 21110 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------+
1 row in set, 1 warning (0.00 sec)
# 调整字段程序后,就变成 `Using filesort` 且没有用到索引,所以当应用 order by 语句,确保与联结索引的程序要统一
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 ORDER BY tcol02,tcol01,tcol03 limit 10000,11110;
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 941900 | 100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+
1 row in set, 1 warning (0.00 sec)
状况 2:上面的 SQL 最多只能用到索引 tcol1,tcol2 局部
# tcol02 范畴查找后,导致数据乱序,于是 tcol03 索引条件用不上,同时回呈现 `Using index condition` 和 `Using MRR`。[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol01=1 AND tcol02>100 AND tcol03=202019;
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+----------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+----------------------------------+
| 1 | SIMPLE | t1 | NULL | range | idx_tcol123 | idx_tcol123 | 5 | NULL | 77976 | 10.00 | Using index condition; Using MRR |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+----------------------------------+
1 row in set, 1 warning (0.00 sec)
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol01=1 AND tcol02>100;
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+----------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+----------------------------------+
| 1 | SIMPLE | t1 | NULL | range | idx_tcol123 | idx_tcol123 | 5 | NULL | 77976 | 100.00 | Using index condition; Using MRR |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+----------------------------------+
1 row in set, 1 warning (0.00 sec)
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol01=1 AND tcol03=202019 ORDER BY tcol02;
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+--------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+--------+----------+-----------------------+
| 1 | SIMPLE | t1 | NULL | ref | idx_tcol123 | idx_tcol123 | 2 | const | 126670 | 10.00 | Using index condition |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+--------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
# 关掉 `Using index condition` 和 `Using MRR` 后再看一下执行打算,理论测试效率要高很多。# 这是因为 ICP 缩小了引擎层和 server 层之间的数据传输和回表申请,不满足条件的申请,间接过滤无需回表
# 实际上开启 ICP 后下面语句有用到 tcol03 的索引局部。[root@GreatSQL][test]>SET optimizer_switch = 'MRR=off';
[root@GreatSQL][test]>SET optimizer_switch = 'index_condition_pushdown=off';
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol01=1 AND tcol02>100 AND tcol03=202019;
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------------+
| 1 | SIMPLE | t1 | NULL | range | idx_tcol123 | idx_tcol123 | 5 | NULL | 77976 | 10.00 | Using where |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
1 row in set (1.81 sec) /* 敞开 ICP 和 MRR 后执行工夫 */
1 row in set (0.01 sec) /* 开启 ICP 和 MRR 后执行工夫 */
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol01=1 AND tcol02>100;
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------------+
| 1 | SIMPLE | t1 | NULL | range | idx_tcol123 | idx_tcol123 | 5 | NULL | 77976 | 100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
40252 rows in set (2.04 sec) /* 敞开 ICP 和 MRR 后执行工夫 */
40252 rows in set (1.58 sec) /* 开启 ICP 和 MRR 后执行工夫 */
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol01=1 AND tcol03=202019 ORDER BY tcol02;
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+--------+----------+-------------+
| 1 | SIMPLE | t1 | NULL | ref | idx_tcol123 | idx_tcol123 | 2 | const | 126670 | 10.00 | Using where |
+----+-------------+-------+------------+------+---------------+-------------+---------+-------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
1 row in set (1.99 sec) /* 敞开 ICP 和后执行工夫 */
1 row in set (0.01 sec) /* 开启 ICP 和后执行工夫 */
Using index condition
请看文章 https://mp.weixin.qq.com/s/pt6mr3Ge1ya2aa6WlrpIvQ
Using MRR
前面再介绍。
状况 3:上面的 SQL 用不到索引
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol02=167;
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 941900 | 10.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol02=167 AND tcol03 >=1;
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 941900 | 3.33 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.01 sec)
[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 ORDER BY tcol02 limit 10000,11000;
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 941900 | 100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+
1 row in set, 1 warning (0.00 sec)
联结索引有哪些益处
- 1. 缩小开销。建一个联结索引
(tcol01, tcol02, tcol03)
,相当于建设三个索引(tcol01),(tcol01,tcol02),(tcol01,tcol02,tcol03)
的性能。每个索引都会占用写入开销和磁盘开销,对于大量数据的表,应用联结索引会大大的缩小开销。 - 2. 笼罩索引。对联结索引
(tcol01, tcol02, tcol03)
,如果有如下的 SQL:select tcol01,tcol02,tcol03 from t1 where tcol01=? and tcol02=? and tcol03=?
那么就能够应用到笼罩索引的性能,查问数据无需回表,缩小随机 IO。 - 3. 效率高。多列条件的查问下,索引列越多,通过索引筛选出的数据就越少。
联结索引应用倡议
- 1. 查问条件中的
where、order by、group by
波及多个字段,个别须要创立多列索引,比方后面的select * from t1 where tcol01=100 and tcol02=50;
- 2. 创立联结索引的时候,要将区分度高的字段放在后面,如果有一张学生表蕴含学号和姓名,那么在建设联结索引的时候,学号放在姓名后面,因为学号是唯一性的,能过滤更多的数据。
- 3. 尽量避免
>、<、between、or、like 首字母为 %
的范畴查找,范畴查问可能导致无奈应用索引。 - 4. 只筛选须要的数据字段,满足笼罩索引的要求,不要用
select *
筛选所有列数据。
Enjoy GreatSQL :)
文章举荐:
面向金融级利用的 GreatSQL 正式开源
https://mp.weixin.qq.com/s/cI…
Changes in GreatSQL 8.0.25 (2021-8-18)
https://mp.weixin.qq.com/s/qc…
MGR 及 GreatSQL 资源汇总
https://mp.weixin.qq.com/s/qX…
GreatSQL MGR FAQ
https://mp.weixin.qq.com/s/J6…
在 Linux 下源码编译装置 GreatSQL/MySQL
https://mp.weixin.qq.com/s/WZ…
# 对于 GreatSQL
GreatSQL 是由万里数据库保护的 MySQL 分支,专一于晋升 MGR 可靠性及性能,反对 InnoDB 并行查问个性,是实用于金融级利用的 MySQL 分支版本。
Gitee:
https://gitee.com/GreatSQL/Gr…
GitHub:
https://github.com/GreatSQL/G…
Bilibili:
https://space.bilibili.com/13…
微信 &QQ 群:
可搜寻增加 GreatSQL 社区助手微信好友,发送验证信息“加群”退出 GreatSQL/MGR 交换微信群
QQ 群:533341697
微信小助手:wanlidbc
本文由博客一文多发平台 OpenWrite 公布!