On indexes: ORDER BY ... LIMIT causes the optimizer to choose the wrong index

Original: https://developer.aliyun.com/…

MySQL · Bug Tracking · ORDER BY LIMIT causes the optimizer to choose the wrong index

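Problem description

The bug is triggered under the following conditions:

  1. The optimizer first chooses an index on a column from the WHERE clause, and that index is highly selective;
  2. The SQL must contain ORDER BY ... LIMIT, which leads the optimizer to try the index on the ORDER BY column as an optimization; that attempt is ultimately rejected on cost.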

Reproduction case

Table structure

create table t1(
      id int auto_increment primary key,
      a int, b int, c int,
      key iabc (a, b, c),
      key ic (c)
) engine = innodb; 

Building the data

insert into t1 select null,null,null,null;
insert into t1 select null,null,null,null from t1;
insert into t1 select null,null,null,null from t1;
insert into t1 select null,null,null,null from t1;
insert into t1 select null,null,null,null from t1;
insert into t1 select null,null,null,null from t1;
update t1 set a = id / 2, b = id / 4, c = 6 - id / 8; 

Triggering SQL

mysql> explain select id from t1 where a<3 and b in (1, 13) and c>=3 order by c limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: index
possible_keys: iabc,ic
          key: iabc
      key_len: 15
          ref: NULL
         rows: 32
        Extra: Using where; Using index; Using filesort

With force index, the optimizer picks the selective access path (range on iabc, 3 rows):

mysql> explain select id from t1 force index(iabc) where a<3 and b in (1, 13) and c>=3 order by c limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: range
possible_keys: iabc
          key: iabc
      key_len: 5
          ref: NULL
         rows: 3
        Extra: Using where; Using index; Using filesort

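Problem analysis

optimizer_trace helps analyze this problem. The trace has to be enabled in the session (SET optimizer_trace="enabled=on";) before the statement is run; it can then be read with:

SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE\G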

 "range_scan_alternatives": [
                  {
                    "index": "iabc",
                    "ranges": ["NULL < a < 3"],
                    "index_dives_for_eq_ranges": true,
                    "rowid_ordered": false,
                    "using_mrr": false,
                    "index_only": true,
                    "rows": 3,
                    "cost": 1.6146,
                    "chosen": true
                  },
                  {
                    "index": "ic",
                    "ranges": ["3 <= c"],
                    "index_dives_for_eq_ranges": true,
                    "rowid_ordered": false,
                    "using_mrr": false,
                    "index_only": false,
                    "rows": 17,
                    "cost": 21.41,
                    "chosen": false,
                    "cause": "cost"
                  }
                ], 

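range_scan_alternatives computes the cost of a range scan for each index. The result above shows that the composite index iabc is cheaper (cost 1.6146, 3 rows) than ic (cost 21.41, 17 rows), so iabc should be chosen.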
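
The comparison can be reproduced mechanically. The following is a minimal Python sketch, not part of the original analysis: it takes the two range_scan_alternatives entries copied from the trace above (trimmed to the fields used here) and picks the one with the lowest cost. The helper name cheapest_alternative is hypothetical.

```python
import json

# Entries copied from the trace above, trimmed to the fields used below.
TRACE_FRAGMENT = """
[
  {"index": "iabc", "rows": 3,  "cost": 1.6146, "chosen": true},
  {"index": "ic",   "rows": 17, "cost": 21.41,  "chosen": false, "cause": "cost"}
]
"""


def cheapest_alternative(alternatives):
    """Return the alternative with the lowest estimated cost (hypothetical helper)."""
    return min(alternatives, key=lambda alt: alt["cost"])


alternatives = json.loads(TRACE_FRAGMENT)
best = cheapest_alternative(alternatives)
print(best["index"], best["rows"], best["cost"])  # iabc 3 1.6146
```

The cheapest entry is also the one the trace marks as chosen, which is why the plan looks correct at this stage.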

 "considered_execution_plans": [
          {"plan_prefix": [],
            "table": "`t1`",
            "best_access_path": {
              "considered_access_paths": [
                {
                  "access_type": "range",
                  "rows": 3,
                  "cost": 2.2146,
                  "chosen": true
                }
              ]
            },
            "cost_for_plan": 2.2146,
            "rows_for_plan": 3,
            "chosen": true
          }
        ] 

considered_execution_plans shows the index selection process for the table: access_type is range and rows_for_plan = 3. Up to this point the execution plan is still as expected.

 {
        "clause_processing": {
          "clause": "ORDER BY",
          "original_clause": "`t1`.`c`",
          "items": [
            {"item": "`t1`.`c`"}
          ],
          "resulting_clause_is_simple": true,
          "resulting_clause": "`t1`.`c`"
        }
      },
      {
        "refine_plan": [
          {
            "table": "`t1`",
            "access_type": "index_scan"
          }
        ]
      },
      {
        "reconsidering_access_paths_for_index_ordering": {
          "clause": "ORDER BY",
          "index_order_summary": {
            "table": "`t1`",
            "index_provides_order": false,
            "order_direction": "undefined",
            "index": "unknown",
            "plan_changed": false
          }
        }
      } 

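clause_processing simplifies the ORDER BY clause. After clause_processing, refine_plan reports access_type index_scan (a full index scan, which filters worse than range: EXPLAIN shows 32 rows instead of 3). This is where the result stops matching expectations.

We can therefore infer that the optimizer went wrong while trying to optimize ORDER BY:

  • Phase 1: the optimizer chooses index iabc and uses range access;
  • Phase 2: the optimizer tries to improve the plan further by accessing the table through the ORDER BY column, and discards the phase 1 result;
  • Phase 3: the optimizer finds that access through the ORDER BY column costs more than the phase 1 result, but that result has already been discarded and cannot be restored, so it falls back to a more expensive access method (index_scan), which triggers the bug.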
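
The three phases can be illustrated with a toy Python model. This is only a sketch of the inferred behaviour, not MySQL's code: the Plan class and the function name are hypothetical, the row counts and costs are taken from the trace and EXPLAIN output above, and the candidate's cost reuses the ic figure from range_scan_alternatives as a stand-in, since the trace does not print a separate cost for the ORDER BY attempt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Plan:
    access_type: str        # "range" or "index_scan"
    index: str
    rows: int
    cost: Optional[float]   # None when the article gives no figure


# Figures taken from the trace and EXPLAIN output in this article.
PHASE1 = Plan("range", "iabc", rows=3, cost=2.2146)
ORDER_BY_CANDIDATE = Plan("range", "ic", rows=17, cost=21.41)  # stand-in cost
FULL_INDEX_SCAN = Plan("index_scan", "iabc", rows=32, cost=None)


def optimize_order_by_buggy(phase1: Plan, candidate: Plan) -> Plan:
    """Toy model of phases 2 and 3; this is not MySQL's real code."""
    phase1_cost = phase1.cost
    saved = None                      # phase 2: the phase 1 plan is discarded
    if candidate.cost < phase1_cost:  # phase 3: compare costs
        return candidate
    # The candidate lost on cost, but there is nothing left to restore.
    return saved if saved is not None else FULL_INDEX_SCAN


final = optimize_order_by_buggy(PHASE1, ORDER_BY_CANDIDATE)
print(final.access_type, final.index, final.rows)  # index_scan iabc 32
```

The model ends up with the same plan as the first EXPLAIN: a full scan of iabc over 32 rows, even though a 3-row range plan had already been found.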

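Solution

  1. At the very start of the index optimization function SQL_SELECT::test_quick_select, we save the access plan variable (quick);
  2. If the index has not changed, we restore this variable;
  3. If the index has changed, we delete this variable.

Without modifying the MySQL source code, the bug can be avoided by using force index to pin the index.

The call stack of SQL_SELECT::test_quick_select is as follows: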
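
The save/restore idea in these steps can be sketched as a toy Python model. The real patch is C++ inside MySQL and is not shown in the article, so everything here is an assumption made for illustration: the Select and Quick classes are simplified stand-ins, and passing the re-evaluation result in as new_quick (None when the attempt was rejected on cost) is a hypothetical simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quick:
    """Stand-in for the saved access plan (the variable the article calls quick)."""
    index: str
    rows: int


class Select:
    """Toy stand-in for SQL_SELECT; only the save/restore logic is modelled."""

    def __init__(self, quick: Optional[Quick]):
        self.quick = quick

    def test_quick_select(self, new_quick: Optional[Quick]) -> None:
        """new_quick is what the re-evaluation produced, or None if it was rejected."""
        saved = self.quick            # step 1: save the plan found earlier
        self.quick = None             # the old plan is cleared during re-evaluation
        index_changed = new_quick is not None and (
            saved is None or new_quick.index != saved.index
        )
        if index_changed:
            self.quick = new_quick    # step 3: index changed, the saved plan is dropped
        else:
            self.quick = saved        # step 2: index unchanged, restore the saved plan


select = Select(Quick("iabc", rows=3))
select.test_quick_select(None)        # the ORDER BY index was rejected on cost
print(select.quick.index, select.quick.rows)  # iabc 3
```

With the saved plan restored, the rejected ORDER BY attempt no longer degrades the query to a full index scan.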

    #0  SQL_SELECT::test_quick_select
    #1  make_join_select
    #2  JOIN::optimize
    #3  mysql_execute_select
    #4  mysql_select
    #5  mysql_explain_unit
    #6  explain_query_expression
    #7  execute_sqlcom_select
    #8  mysql_execute_command
    #9  mysql_parse
    #10 dispatch_command
    #11 do_command
