关于sql优化:第22期索引设计组合索引适用场景

建设在多个列上的索引即组合索引（联结索引），实用在多个列必须一起应用或者是从左到右方向局部间断列一起应用的业务场景。

组合索引和单值索引相似，索引上的每个键值依照肯定的大小排序。比方针对三个字段的组合索引有以下组合：

(f1, f2, f3)
(f1, f2, f3 desc)
(f1, f2 desc, f3)
(f1 desc, f2, f3)
(f1 desc, f2 desc, f3 desc)
...

明天探讨的组合索引只基于默认排序形式，也就是 (f1,f2,f3)，等价于 (f1 asc, f2 asc, f3 asc)。

组合索引的语法：

alter table t1 add key idx_multi(f1 [asc/desc],f2 [asc/desc],f3 [asc/desc]) [using btree/using hash]

MySQL 里，组合索引最大反对 16 个列。能够基于 B+ 树，也能够基于哈希，这篇次要探讨基于 B 树，并且索引程序默认升序，基于 HASH 只有一种用法，就是所有列的都必须等值过滤【仅限于上面 SQL 3】。

应用组合索引的必备条件为：列 f1 必须存在于 SQL 语句过滤条件中！也就是说组合索引的第一个列（最左列）在过滤条件中必须存在，而且最好是等值过滤。

思考以下 15 条 SQL 语句，别离对表 t1 字段 f1、f2、f3 有不同的组合过滤，并且都蕴含了列 f1，也就是说满足了组合索引应用的必备条件。

# SQL 1select * from t1 where f1 = 1;# SQL 2select * from t1 where f1 = 1 and f2 = 1;# SQL 3select * from t1 where f1 = 1 and f2 = 1 and f3 = 1 ;# SQL 4select f1,f2 from t1 where 1 order by f1,f2;# SQL 5select f1,f2,f3 from t1 where 1 order by f1,f2,f3;# SQL 6select f1,f2,count(*) from t1 group by f1,f2;# SQL 7select f1,f2,f3,count(*) from t1 group by f1,f2,f3;# SQL 8select * from t1 where f1 = 10 and f2 = 5 and f3 > 10# SQL 9select  * from t1 where f1 = 10 and f2 > 5;# SQL 10select * from t1 where f1 < 10;# SQL 11select * from t1 where f1 < 10 and f2 > 5;# SQL 12select * from t1 where f1 < 10 and f2 > 5 and f3 < 10;# SQL 13select * from t1 where f1 < 10 and f2 = 5 and f3 < 10;# SQL 14select * from t1 where f1 < 10 and f2 = 5 and f3 = 10;# SQL 15select * from t1 where f1 = 1 and f3 = 1;

SQL 1、SQL 2、 SQL 3 三条 SQL 别离基于组合索引 idx_multi 过滤后回表；其中 SQL 3 是组合索引中每个字段都能过滤到的最完满查问。来看看 SQL 3的执行打算：

(127.0.0.1:3400)|(ytt)>explain  select * from t1 where f1 = 1 and f2 = 1 and f3 = 1\G*************************** 1. row ***************************           id: 1  select_type: SIMPLE        table: t1   partitions: NULL         type: refpossible_keys: idx_multi          key: idx_multi      key_len: 15          ref: const,const,const         rows: 1     filtered: 100.00        Extra: NULL1 row in set, 1 warning (0.00 sec)

SQL 3 的执行打算趋近完满，当然最完满的是索引 idx_multi 是主键或者惟一索引。比方上面对表 t3 查问，t3 的索引 udx_multi 是一个惟一索引。

(127.0.0.1:3400)|(ytt)>explain  select * from t3 where f1 = 9 and f2 = 52 and f3 = 35\G*************************** 1. row ***************************           id: 1  select_type: SIMPLE        table: t3   partitions: NULL         type: constpossible_keys: udx_multi          key: udx_multi      key_len: 15          ref: const,const,const         rows: 1     filtered: 100.00        Extra: NULL1 row in set, 1 warning (0.00 sec)

SQL 4、SQL 5、SQL 6、SQL 7 四条 SQL 走笼罩索引扫描，不必回表，利用索引 idx_multi 升序输入后果。轻易打印下其中 SQL 7 的执行打算看看：

127.0.0.1:3400)|(ytt)>explain    -> select f1,f2,f3,count(*) from t1 group by f1,f2,f3\G*************************** 1. row ***************************           id: 1  select_type: SIMPLE        table: t1   partitions: NULL         type: indexpossible_keys: idx_multi          key: idx_multi      key_len: 15          ref: NULL         rows: 32194     filtered: 100.00        Extra: Using index1 row in set, 1 warning (0.00 sec)

后果 type 列为 index，间接索引返回后果。

其实这里遇到一个十分常见的疑难：SQL 1 过滤条件里只有字段 f1, SQL 2 过滤字段里只有 f1,f2，针对这两种场景可否应该建设如下单值索引让查问运行的更加高效？

alter table t1 add key idx_f1(f1);alter table t1 add key idx_multi_sub(f1,f2);

其实针对列 f1 独自建设索引没有必要，因为 f1 为索引 idx_multi 的第一个字段，查问时如果仅仅蕴含字段 f1，那 MySQL 也仅仅只应用 f1 的索引数据，不会让索引 idx_multi 的所有列都被应用；

同理，基于字段 (f1,f2) 再建设一个组合索引也没有必要，(f1, f2) 能够看作以这样的形式存在与组合索引中：((f1, f2), f3)

那别离看下 SQL 2 用索引 idx_multi 和 idx_multi_sub 的执行打算来证实以上的说法：

(127.0.0.1:3400)|(ytt)>explain select * from t1 where f1 = 1 and f2 = 1\G*************************** 1. row ***************************           id: 1  select_type: SIMPLE        table: t1   partitions: NULL         type: refpossible_keys: idx_multi,idx_f1,idx_multi_sub          key: idx_multi      key_len: 10          ref: const,const         rows: 3     filtered: 100.00        Extra: NULL1 row in set, 1 warning (0.00 sec)(127.0.0.1:3400)|(ytt)>explain select * from t1 force index (idx_multi_sub) where f1 = 1 and f2 = 1\G*************************** 1. row ***************************           id: 1  select_type: SIMPLE        table: t1   partitions: NULL         type: refpossible_keys: idx_multi_sub          key: idx_multi_sub      key_len: 10          ref: const,const         rows: 3     filtered: 100.00        Extra: NULL1 row in set, 1 warning (0.00 sec)

从以上后果来看，执行打算没有任何不同，所以如果查问过滤条件为组合索引里第一个列或者是蕴含第一个列的间断前缀列，不须要独自再建设局部字段的组合索引，保留原来组合索引即可。

(127.0.0.1:3400)|(ytt)>alter table t1 drop key idx_multi_sub, drop key idx_f1;Query OK, 0 rows affected (0.06 sec)Records: 0  Duplicates: 0  Warnings: 0

那接下来看看 SQL 8。SQL 8 里前两列也是间断的，不同的是前面列 f3 的过滤条件是一个范畴，那看下 SQL 8 执行打算：

(127.0.0.1:3400)|(ytt)>explain select * from t1 where f1 = 10 and f2 = 5 and f3 > 10\G*************************** 1. row ***************************           id: 1  select_type: SIMPLE        table: t1   partitions: NULL         type: rangepossible_keys: idx_multi          key: idx_multi      key_len: 15          ref: NULL         rows: 154     filtered: 100.00        Extra: Using index condition1 row in set, 1 warning (0.00 sec)

SQL 8 的执行打算显示：因为列f3是过滤条件是一个范畴，并不影响应用组合索引，因为前两列 f1,f2 是间断的。

SQL 9 和 SQL 8 一样，因为列 f1 等值过滤，之后是 f2 范畴过滤。

SQL 10 只用到了列 f1，这点相似于单值索引 (f1)。

SQL 11、 SQL 12 这两条 SQL 和 SQL 10 是相似的，尽管过滤字段程序和索引字段程序一样，然而因为第一个列是一个范畴，只能用到组合索引的第一列。

SQL 13 、SQL 14 有点不一样，尽管列f1是范畴过滤，然而 SQL 13 里 f2 是等值过滤，SQL 14 里 f2,f3 是等值过滤。

所以如果 SQL 13、SQL 14 也运行较为频繁的话，能够另外加一个组合索引 (f2, f3)。

(127.0.0.1:3400)|(ytt)>alter table t1 add key idx_multi2(f2,f3);Query OK, 0 rows affected (0.38 sec)Records: 0  Duplicates: 0  Warnings: 0

看下 SQL 13 和 SQL 14 的执行打算：

(127.0.0.1:3400)|(ytt)>explain  select * from t1 where f1 < 10 and f2 = 5 and f3 < 10\G*************************** 1. row ***************************           id: 1  select_type: SIMPLE        table: t1   partitions: NULL         type: rangepossible_keys: idx_multi,idx_multi2          key: idx_multi2      key_len: 10          ref: NULL         rows: 174     filtered: 7.99        Extra: Using index condition; Using where1 row in set, 1 warning (0.00 sec)(127.0.0.1:3400)|(ytt)>explain  select * from t1 where f1 < 10 and f2 = 5 and f3 = 10\G*************************** 1. row ***************************           id: 1  select_type: SIMPLE        table: t1   partitions: NULL         type: refpossible_keys: idx_multi,idx_multi2          key: idx_multi2      key_len: 10          ref: const,const         rows: 14     filtered: 7.99        Extra: Using where1 row in set, 1 warning (0.00 sec)(127.0.0.1:3400)|(ytt)>

那此时下面两条 SQL 又回到了组合索引的经典应用场景。

再看下最初一条 SQL，SQL 15。SQL 15 过滤条件只有 (f1=1 and f3=1)，也就是不匹配组合索引的过滤连续性特色，然而因为列 f1 是等值过滤，所以也能够应用组合索引 idx_multi，看下执行打算：

(127.0.0.1:3400)|(ytt)>explain  select * from t1 where f1 = 1 and f3 = 1\G*************************** 1. row ***************************           id: 1  select_type: SIMPLE        table: t1   partitions: NULL         type: refpossible_keys: idx_multi          key: idx_multi      key_len: 5          ref: const         rows: 312     filtered: 10.00        Extra: Using index condition1 row in set, 1 warning (0.00 sec)

从执行打算后果看出，也只用到了列 f1，因为 f2 不存在。所以这条 SQL 如果运行也很频繁，能够再次建设一个新的组合索引 (f1, f3)。

(127.0.0.1:3400)|(ytt)>alter table t1 add key idx_multi3(f1,f3);Query OK, 0 rows affected (0.36 sec)Records: 0  Duplicates: 0  Warnings: 0

再看下执行打算：

(127.0.0.1:3400)|(ytt)>explain  select * from t1 where f1 = 1 and f3 = 1\G*************************** 1. row ***************************           id: 1  select_type: SIMPLE        table: t1   partitions: NULL         type: refpossible_keys: idx_multi,idx_multi3          key: idx_multi3      key_len: 10          ref: const,const         rows: 1     filtered: 100.00        Extra: NULL1 row in set, 1 warning (0.00 sec)

那此时又回到了组合索引的过滤连续性特色场景。

文中示例用到的表构造：

(127.0.0.1:3400)|(ytt)>show create table t1\G*************************** 1. row ***************************       Table: t1Create Table: CREATE TABLE `t1` (  `id` int NOT NULL,  `f1` int DEFAULT NULL,  `f2` int DEFAULT NULL,  `f3` int DEFAULT NULL,  `f4` int DEFAULT NULL,  PRIMARY KEY (`id`),  KEY `idx_multi` (`f1`,`f2`,`f3`)) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci1 row in set (0.00 sec)(127.0.0.1:3400)|(ytt)>show create table t3\G*************************** 1. row ***************************       Table: t3Create Table: CREATE TABLE `t3` (  `id` int NOT NULL,  `f1` int DEFAULT NULL,  `f2` int DEFAULT NULL,  `f3` int DEFAULT NULL,  `f4` int DEFAULT NULL,  UNIQUE KEY `udx_multi` (`f1`,`f2`,`f3`)) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci1 row in set (0.00 sec)

这里探讨了组合索引的种种应用点，日常业务中，如果一个列曾经在组合索引，并且在第一位，该当防止建设额定的单个索引。