• GreatSQL社区原创内容未经受权不得随便应用,转载请分割小编并注明起源。

导语

在数据检索的过程中,常常会有多个列的匹配需要,明天介绍下联结索引的应用以及最左匹配准则的案例。

最左匹配准则作用在联结索引中,如果表中有一个联结索引(tcol01,tcol02,tcol03),只有当SQL应用到tcol01、tcol02索引的前提下,tcol03的索引才会被应用;同理只有tcol01的索引被应用的前提下,tcol02的索引才会被应用。

上面咱们来列举几个例子来阐明。

步骤

应用 mysql_random_data_load 创立测试数据

建库和建表

CREATE DATABASE IF NOT EXISTS test;CREATE TABLE `test`.`t3` (  `id` int(11) NOT NULL AUTO_INCREMENT,  `tcol01` tinyint(4) DEFAULT NULL,  `tcol02` smallint(6) DEFAULT NULL,  `tcol03` mediumint(9) DEFAULT NULL,  `tcol04` int(11) DEFAULT NULL,  `tcol05` bigint(20) DEFAULT NULL,  `tcol06` float DEFAULT NULL,  `tcol07` double DEFAULT NULL,  `tcol08` decimal(10,2) DEFAULT NULL,  `tcol09` date DEFAULT NULL,  `tcol10` datetime DEFAULT NULL,  `tcol11` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,  `tcol12` time DEFAULT NULL,  `tcol13` year(4) DEFAULT NULL,  `tcol14` varchar(100) DEFAULT NULL,  `tcol15` char(2) DEFAULT NULL,  `tcol16` blob,  `tcol17` text,  `tcol18` mediumtext,  `tcol19` mediumblob,  `tcol20` longblob,  `tcol21` longtext,  `tcol22` mediumtext,  `tcol23` varchar(3) DEFAULT NULL,  `tcol24` varbinary(10) DEFAULT NULL,  `tcol25` enum('a','b','c') DEFAULT NULL,  `tcol26` set('red','green','blue') DEFAULT NULL,  `tcol27` float(5,3) DEFAULT NULL,  `tcol28` double(4,2) DEFAULT NULL,  PRIMARY KEY (`id`)) ENGINE=InnoDB;

建联结索引

CREATE INDEX idx_tcol123 ON t1(`tcol01`,`tcol02`,`tcol03`);

写入100w条测试数据

./mysql_random_data_load test t1 1000000 --user=root --password=GreatSQL --config-file=/data/GreatSQL/my.cnf

联结索引数据存储形式

先对索引中第一列的数据进行排序,而后在满足第一列数据排序的前提下,再对第二列数据进行排序,以此类推。

如下图:

索引最左准则案例

状况1:三个索引都能应用上

试验1:仅有where子句

# 三个条件都应用上,优化器能够本人调整程序满足索引要求[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol02=167 AND tcol03=202019 AND tcol01=1;+----+-------------+-------+------------+------+---------------+-------------+---------+-------------------+------+----------+-------+| id | select_type | table | partitions | type | possible_keys | key         | key_len | ref               | rows | filtered | Extra |+----+-------------+-------+------------+------+---------------+-------------+---------+-------------------+------+----------+-------+|  1 | SIMPLE      | t1    | NULL       | ref  | idx_tcol123   | idx_tcol123 | 9       | const,const,const |    1 |   100.00 | NULL  |+----+-------------+-------+------------+------+---------------+-------------+---------+-------------------+------+----------+-------+1 row in set, 1 warning (0.11 sec)[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol02=167 AND tcol01=1 AND tcol03=202019 ;+----+-------------+-------+------------+------+---------------+-------------+---------+-------------------+------+----------+-------+| id | select_type | table | partitions | type | possible_keys | key         | key_len | ref               | rows | filtered | Extra |+----+-------------+-------+------------+------+---------------+-------------+---------+-------------------+------+----------+-------+|  1 | SIMPLE      | t1    | NULL       | ref  | idx_tcol123   | idx_tcol123 | 9       | const,const,const |    1 |   100.00 | NULL  |+----+-------------+-------+------------+------+---------------+-------------+---------+-------------------+------+----------+-------+

试验2:WHERE 加 order by子句

# 解析进去只有用到tcol01,tcol02索引,因为`explain`不会统计`order by`索引的信息,所有看起来`key_len`长度只有5;当tcol03倒序的时候就会用到`Backward index scan`性能[test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol01=1 AND tcol02=167 order by tcol03;+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+-------+| id | select_type | table | partitions | type | possible_keys | key         | key_len | ref         | rows | filtered | Extra |+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+-------+|  1 | SIMPLE      | t1    | NULL       | ref  | idx_tcol123   | idx_tcol123 | 5       | const,const |  269 |   100.00 | NULL  |+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+-------+1 row in set, 1 warning (0.01 sec)[test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol01=1 AND tcol02=167 order by tcol03 desc;+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+---------------------+| id | select_type | table | partitions | type | possible_keys | key         | key_len | ref         | rows | filtered | Extra               |+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+---------------------+|  1 | SIMPLE      | t1    | NULL       | ref  | idx_tcol123   | idx_tcol123 | 5       | const,const |  269 |   100.00 | Backward index scan |+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+---------------------+1 row in set, 1 warning (0.00 sec)# 当order by中的字段不蕴含在联结索引中的时候,就会用到`Using filesort`[root@GreatSQL][test]>EXPLAIN SELECT /* NO_CACHE */ * FROM t1 WHERE tcol01=1 AND tcol02=167 order by tcol04;+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+----------------+| id | select_type | table | partitions | type | possible_keys | key         | key_len | ref         | rows | filtered | Extra          |+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+----------------+|  1 | SIMPLE      | t1    | NULL       | ref  | idx_tcol123   | idx_tcol123 | 5       | const,const |  269 |   100.00 | Using filesort |+----+-------------+-------+------------+------+---------------+-------------+---------+-------------+------+----------+----------------+1 row in set, 1 warning (0.00 sec)

试验3:仅order by子句

# 优化器默认采取全副扫描了,因为是查问出所有数据,所以全表扫描回比索引更快,节俭回表的工夫[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 ORDER BY tcol01,tcol02,tcol03;+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra          |+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 941900 |   100.00 | Using filesort |+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+1 row in set, 1 warning (0.00 sec)[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 force index(`idx_tcol123`) ORDER BY tcol01,tcol02,tcol03;+----+-------------+-------+------------+-------+---------------+-------------+---------+------+--------+----------+-------+| id | select_type | table | partitions | type  | possible_keys | key         | key_len | ref  | rows   | filtered | Extra |+----+-------------+-------+------------+-------+---------------+-------------+---------+------+--------+----------+-------+|  1 | SIMPLE      | t1    | NULL       | index | NULL          | idx_tcol123 | 9       | NULL | 941900 |   100.00 | NULL  |+----+-------------+-------+------------+-------+---------------+-------------+---------+------+--------+----------+-------+1 row in set, 1 warning (0.00 sec)# 只筛选索引列,也会应用到索引,也就是所谓的笼罩索引[root@GreatSQL][test]>explain  SELECT /* NO_CACHE */ tcol01,tcol02,tcol03 FROM t1  ORDER BY tcol01,tcol02,tcol03;+----+-------------+-------+------------+-------+---------------+-------------+---------+------+--------+----------+-------------+| id | select_type | table | partitions | type  | possible_keys | key         | key_len | ref  | rows   | filtered | Extra       |+----+-------------+-------+------------+-------+---------------+-------------+---------+------+--------+----------+-------------+|  1 | SIMPLE      | t1    | NULL       | index | NULL          | idx_tcol123 | 9       | NULL | 941900 |   100.00 | Using index |+----+-------------+-------+------------+-------+---------------+-------------+---------+------+--------+----------+-------------+1 row in set, 1 warning (0.00 sec)# 如果是筛选局部数据,那么就会应用到索引而不会全表扫描[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 ORDER BY tcol01,tcol02,tcol03 limit 10000,11110;+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------+| id | select_type | table | partitions | type  | possible_keys | key         | key_len | ref  | rows  | filtered | Extra |+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------+|  1 | SIMPLE      | t1    | NULL       | index | NULL          | idx_tcol123 | 9       | NULL | 21110 |   100.00 | NULL  |+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------+1 row in set, 1 warning (0.00 sec)# 调整字段程序后,就变成`Using filesort`且没有用到索引,所以当应用order by语句,确保与联结索引的程序要统一[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1  ORDER BY tcol02,tcol01,tcol03 limit 10000,11110;+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra          |+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 941900 |   100.00 | Using filesort |+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+1 row in set, 1 warning (0.00 sec)

状况2:上面的SQL最多只能用到索引tcol1,tcol2局部

# tcol02范畴查找后,导致数据乱序,于是tcol03索引条件用不上,同时回呈现`Using index condition`和 `Using MRR`。[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1  WHERE tcol01=1 AND tcol02>100 AND tcol03=202019;+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+----------------------------------+| id | select_type | table | partitions | type  | possible_keys | key         | key_len | ref  | rows  | filtered | Extra                            |+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+----------------------------------+|  1 | SIMPLE      | t1    | NULL       | range | idx_tcol123   | idx_tcol123 | 5       | NULL | 77976 |    10.00 | Using index condition; Using MRR |+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+----------------------------------+1 row in set, 1 warning (0.00 sec)[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1  WHERE tcol01=1 AND tcol02>100;+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+----------------------------------+| id | select_type | table | partitions | type  | possible_keys | key         | key_len | ref  | rows  | filtered | Extra                            |+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+----------------------------------+|  1 | SIMPLE      | t1    | NULL       | range | idx_tcol123   | idx_tcol123 | 5       | NULL | 77976 |   100.00 | Using index condition; Using MRR |+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+----------------------------------+1 row in set, 1 warning (0.00 sec)[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1  WHERE tcol01=1 AND tcol03=202019 ORDER BY tcol02;+----+-------------+-------+------------+------+---------------+-------------+---------+-------+--------+----------+-----------------------+| id | select_type | table | partitions | type | possible_keys | key         | key_len | ref   | rows   | filtered | Extra                 |+----+-------------+-------+------------+------+---------------+-------------+---------+-------+--------+----------+-----------------------+|  1 | SIMPLE      | t1    | NULL       | ref  | idx_tcol123   | idx_tcol123 | 2       | const | 126670 |    10.00 | Using index condition |+----+-------------+-------+------------+------+---------------+-------------+---------+-------+--------+----------+-----------------------+1 row in set, 1 warning (0.00 sec)# 关掉`Using index condition`和`Using MRR`后再看一下执行打算,理论测试效率要高很多。# 这是因为ICP缩小了引擎层和server层之间的数据传输和回表申请,不满足条件的申请,间接过滤无需回表# 实际上开启ICP后下面语句有用到tcol03的索引局部。[root@GreatSQL][test]>SET optimizer_switch = 'MRR=off';[root@GreatSQL][test]>SET optimizer_switch = 'index_condition_pushdown=off';[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1  WHERE tcol01=1 AND tcol02>100 AND tcol03=202019;+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------------+| id | select_type | table | partitions | type  | possible_keys | key         | key_len | ref  | rows  | filtered | Extra       |+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------------+|  1 | SIMPLE      | t1    | NULL       | range | idx_tcol123   | idx_tcol123 | 5       | NULL | 77976 |    10.00 | Using where |+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------------+1 row in set, 1 warning (0.00 sec)1 row in set (1.81 sec) /* 敞开ICP和MRR后执行工夫 */1 row in set (0.01 sec) /* 开启ICP和MRR后执行工夫 */[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1  WHERE tcol01=1 AND tcol02>100;+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------------+| id | select_type | table | partitions | type  | possible_keys | key         | key_len | ref  | rows  | filtered | Extra       |+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------------+|  1 | SIMPLE      | t1    | NULL       | range | idx_tcol123   | idx_tcol123 | 5       | NULL | 77976 |   100.00 | Using where |+----+-------------+-------+------------+-------+---------------+-------------+---------+------+-------+----------+-------------+1 row in set, 1 warning (0.00 sec)40252 rows in set (2.04 sec) /* 敞开ICP和MRR后执行工夫 */40252 rows in set (1.58 sec) /* 开启ICP和MRR后执行工夫 */[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1  WHERE tcol01=1 AND tcol03=202019 ORDER BY tcol02;+----+-------------+-------+------------+------+---------------+-------------+---------+-------+--------+----------+-------------+| id | select_type | table | partitions | type | possible_keys | key         | key_len | ref   | rows   | filtered | Extra       |+----+-------------+-------+------------+------+---------------+-------------+---------+-------+--------+----------+-------------+|  1 | SIMPLE      | t1    | NULL       | ref  | idx_tcol123   | idx_tcol123 | 2       | const | 126670 |    10.00 | Using where |+----+-------------+-------+------------+------+---------------+-------------+---------+-------+--------+----------+-------------+1 row in set, 1 warning (0.00 sec)1 row in set (1.99 sec) /* 敞开ICP和后执行工夫 */1 row in set (0.01 sec) /* 开启ICP和后执行工夫 */

Using index condition 请看文章 https://mp.weixin.qq.com/s/pt6mr3Ge1ya2aa6WlrpIvQ

Using MRR 前面再介绍。

状况3:上面的SQL用不到索引

[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol02=167;+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 941900 |    10.00 | Using where |+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+1 row in set, 1 warning (0.00 sec)[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 WHERE tcol02=167 AND tcol03 >=1;+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 941900 |     3.33 | Using where |+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+1 row in set, 1 warning (0.01 sec)[root@GreatSQL][test]>explain SELECT /* NO_CACHE */ * FROM t1 ORDER BY tcol02 limit 10000,11000;+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra          |+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 941900 |   100.00 | Using filesort |+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+----------------+1 row in set, 1 warning (0.00 sec)

联结索引有哪些益处

  • 1.缩小开销。建一个联结索引(tcol01, tcol02, tcol03),相当于建设三个索引(tcol01),(tcol01,tcol02),(tcol01,tcol02,tcol03)的性能。每个索引都会占用写入开销和磁盘开销,对于大量数据的表,应用联结索引会大大的缩小开销。
  • 2.笼罩索引。对联结索引(tcol01, tcol02, tcol03),如果有如下的SQL: select tcol01,tcol02,tcol03 from t1 where tcol01=? and tcol02=? and tcol03=? 那么就能够应用到笼罩索引的性能,查问数据无需回表,缩小随机IO。
  • 3.效率高。多列条件的查问下,索引列越多,通过索引筛选出的数据就越少。

联结索引应用倡议

  • 1.查问条件中的 where、order by、group by 波及多个字段,个别须要创立多列索引,比方后面的 select * from t1 where tcol01=100 and tcol02=50;
  • 2.创立联结索引的时候,要将区分度高的字段放在后面,如果有一张学生表蕴含学号和姓名,那么在建设联结索引的时候,学号放在姓名后面,因为学号是唯一性的,能过滤更多的数据。
  • 3.尽量避免>、<、between、or、like首字母为%的范畴查找,范畴查问可能导致无奈应用索引。
  • 4.只筛选须要的数据字段,满足笼罩索引的要求,不要用 select * 筛选所有列数据。

Enjoy GreatSQL :)

文章举荐:

面向金融级利用的GreatSQL正式开源
https://mp.weixin.qq.com/s/cI...

Changes in GreatSQL 8.0.25 (2021-8-18)
https://mp.weixin.qq.com/s/qc...

MGR及GreatSQL资源汇总
https://mp.weixin.qq.com/s/qX...

GreatSQL MGR FAQ
https://mp.weixin.qq.com/s/J6...

在Linux下源码编译装置GreatSQL/MySQL
https://mp.weixin.qq.com/s/WZ...

# 对于 GreatSQL

GreatSQL是由万里数据库保护的MySQL分支,专一于晋升MGR可靠性及性能,反对InnoDB并行查问个性,是实用于金融级利用的MySQL分支版本。

Gitee:

https://gitee.com/GreatSQL/Gr...

GitHub:

https://github.com/GreatSQL/G...

Bilibili:

https://space.bilibili.com/13...

微信&QQ群:

可搜寻增加GreatSQL社区助手微信好友,发送验证信息“加群”退出GreatSQL/MGR交换微信群

QQ群:533341697

微信小助手:wanlidbc

本文由博客一文多发平台 OpenWrite 公布!