
On Databases: Reading the TiDB Source with a Question in Mind: Power BI Desktop Fails When Connecting to TiDB Through the MySQL Driver

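It is often said that reading source code is a rite of passage for every good engineer. Yet for a system as complex as TiDB, reading the source is a huge undertaking. For TiDB users, starting from a problem they hit in daily use and working back into the source is a good entry point, so we planned the "Reading the Source with a Question in Mind" series.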

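This article is the second in the series. Using a case where Power BI Desktop misbehaves against TiDB as the example, it walks through discovering and locating the problem, then filing an issue and writing a PR in the open-source community to fix it. We troubleshoot from the perspective of the code, in the hope of helping you understand the TiDB source better.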

First, let's reproduce the failure (TiDB 5.1.1 on macOS) by creating a simple table with a single column:

CREATE TABLE test(name VARCHAR(1) PRIMARY KEY);

Loading this table in Power BI Desktop works against MySQL but fails against TiDB with the following error:

DataSource.Error: An error happened while reading data from the provider: ‘Failed to enable constraints. One or more rows contain values violating non-null, unique, or foreign-key constraints.’
Details:

DataSourceKind=MySql
DataSourcePath=localhost:4000;test

According to the general log, the last SQL statement run on TiDB was:

select COLUMN_NAME, ORDINAL_POSITION, IS_NULLABLE, DATA_TYPE, case when NUMERIC_PRECISION is null then null when DATA_TYPE in ('FLOAT', 'DOUBLE') then 2 else 10 end AS NUMERIC_PRECISION_RADIX, NUMERIC_PRECISION, NUMERIC_SCALE,            CHARACTER_MAXIMUM_LENGTH, COLUMN_DEFAULT, COLUMN_COMMENT AS DESCRIPTION, COLUMN_TYPE  from INFORMATION_SCHEMA.COLUMNS  where table_schema = 'test' and table_name = 'test';

We start a TiDB cluster with tiup and run the statement with tiup client, which also reports an error:

error: mysql: sql: Scan error on column index 4, name "NUMERIC_PRECISION_RADIX": converting NULL to int64 is unsupported

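So we focus on the problem with this statement. Let's first see what the error reported by tiup client means. tiup client uses the Go library xo/usql, but the error message cannot be found there: grepping for the keyword converting returns only a handful of unrelated results. Next we look at xo/usql's mysql driver, which in turn uses go-sql-driver/mysql. Downloading its code and grepping for converting returns only one entry in the changelog, so the error most likely does not originate in that library either. Browsing the go-sql-driver/mysql code shows that it depends on database/sql, so we look there. database/sql is part of the Go standard library, so we need to download the Go source. Grepping for converting in the database directory of the Go source quickly finds code that matches the error message: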

go/src/database/sql/convert.go

case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
        if src == nil {
                return fmt.Errorf("converting NULL to %s is unsupported", dv.Kind())
        }
        s := asString(src)
        i64, err := strconv.ParseInt(s, 10, dv.Type().Bits())
        if err != nil {
                err = strconvErr(err)
                return fmt.Errorf("converting driver.Value type %T (%q) to a %s: %v", src, s, dv.Kind(), err)
        }
        dv.SetInt(i64)
        return nil

Tracing further to see where the destination type comes from, we end up back in go-sql-driver/mysql:

mysql/fields.go

        case fieldTypeLongLong:
                if mf.flags&flagNotNULL != 0 {
                        if mf.flags&flagUnsigned != 0 {
                                return scanTypeUint64
                        }
                        return scanTypeInt64
                }
                return scanTypeNullInt

This code parses the column definitions in the response to a statement and converts them into Go types. We can connect with mysql --host 127.0.0.1 --port 4000 -u root --column-type-info and inspect the column metadata returned by the problematic SQL:

MySQL

Field 5: `NUMERIC_PRECISION_RADIX`
Catalog: `def`
Database: `` 
Table: ``
Org_table: ``
Type: LONGLONG
Collation: binary (63)
Length: 3
Max_length: 0
Decimals: 0
Flags: BINARY NUM

TiDB

Field 5: `NUMERIC_PRECISION_RADIX`
Catalog: `def`
Database: ``
Table: ``
Org_table: ``
Type: LONGLONG
Collation: binary (63)
Length: 2
Max_length: 0
Decimals: 0
Flags: NOT_NULL BINARY NUM

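It is clear that the column definition of NUMERIC_PRECISION_RADIX, the field named in the tiup client error, is wrong on TiDB: it is marked NOT_NULL in TiDB's response. That is unreasonable, because the field can obviously be NULL, as MySQL's response shows. With NOT_NULL set, the driver picks a plain int64 scan type, so xo/usql fails when it processes a NULL value. We now know why the client reports an error. Next we need to find out why TiDB returns a wrong column definition.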

From the TiDB Dev Guide we know the rough execution flow of a DQL statement in TiDB. Starting from the entry point server/conn.go#clientConn.Run, we follow the call chain server/conn.go#clientConn.dispatch -> server/conn.go#clientConn.handleQuery -> server/conn.go#clientConn.handleStmt -> server/driver_tidb.go#TiDBContext.ExecuteStmt -> session/session.go#session.ExecuteStmt -> executor/compiler.go#Compiler.Compile -> planner/optimize.go#Optimize -> planner/optimize.go#optimize -> planner/core/planbuilder.go#PlanBuilder.Build -> planner/core/logical_plan_builder.go#PlanBuilder.buildSelect. In buildSelect we can see the series of steps the TiDB planner applies to a query. From there we reach planner/core/expression_rewriter.go#PlanBuilder.rewriteWithPreprocess -> planner/core/expression_rewriter.go#PlanBuilder.rewriteExprNode. rewriteExprNode resolves the problematic field NUMERIC_PRECISION_RADIX, and the CASE expression is finally handled in expression/builtin_control.go#caseWhenFunctionClass.getFunction. This is where the column definition returned for a CASE expression is computed (it relies on traversing the AST produced by the compiler):

    for i := 1; i < l; i += 2 {
        fieldTps = append(fieldTps, args[i].GetType())
        decimal = mathutil.Max(decimal, args[i].GetType().Decimal)
        if args[i].GetType().Flen == -1 {
            flen = -1
        } else if flen != -1 {
            flen = mathutil.Max(flen, args[i].GetType().Flen)
        }
        isBinaryStr = isBinaryStr || types.IsBinaryStr(args[i].GetType())
        isBinaryFlag = isBinaryFlag || !types.IsNonBinaryStr(args[i].GetType())
    }
    if l%2 == 1 {
        fieldTps = append(fieldTps, args[l-1].GetType())
        decimal = mathutil.Max(decimal, args[l-1].GetType().Decimal)
        if args[l-1].GetType().Flen == -1 {
            flen = -1
        } else if flen != -1 {
            flen = mathutil.Max(flen, args[l-1].GetType().Flen)
        }
        isBinaryStr = isBinaryStr || types.IsBinaryStr(args[l-1].GetType())
        isBinaryFlag = isBinaryFlag || !types.IsNonBinaryStr(args[l-1].GetType())
    }

    fieldTp := types.AggFieldType(fieldTps)
    // Here we turn off NotNullFlag. Because if all when-clauses are false,
    // the result of case-when expr is NULL.
    types.SetTypeFlag(&fieldTp.Flag, mysql.NotNullFlag, false)
    tp := fieldTp.EvalType()

    if tp == types.ETInt {
        decimal = 0
    }
    fieldTp.Decimal, fieldTp.Flen = decimal, flen
    if fieldTp.EvalType().IsStringKind() && !isBinaryStr {
        fieldTp.Charset, fieldTp.Collate = DeriveCollationFromExprs(ctx, args...)
        if fieldTp.Charset == charset.CharsetBin && fieldTp.Collate == charset.CollationBin {
            // When args are Json and Numerical type(eg. Int), the fieldTp is String.
            // Both their charset/collation is binary, but the String need a default charset/collation.
            fieldTp.Charset, fieldTp.Collate = charset.GetDefaultCharsetAndCollate()
        }
    } else {
        fieldTp.Charset, fieldTp.Collate = charset.CharsetBin, charset.CollationBin
    }
    if isBinaryFlag {
        fieldTp.Flag |= mysql.BinaryFlag
    }
    // Set retType to BINARY(0) if all arguments are of type NULL.
    if fieldTp.Tp == mysql.TypeNull {
        fieldTp.Flen, fieldTp.Decimal = 0, types.UnspecifiedLength
        types.SetBinChsClnFlag(fieldTp)
    }

Reading the code above that computes the column definition flags, we find that whatever the CASE expression looks like, the NOT_NULL flag is always cleared, so the problem is not here. We can only go back along the code path and check whether the column definition is modified later. Eventually we find that server/conn.go#clientConn.handleStmt calls server/conn.go#clientConn.writeResultSet, which in turn leads through server/conn.go#clientConn.writeChunks -> server/conn.go#clientConn.writeColumnInfo -> server/column.go#ColumnInfo.Dump -> server/column.go#dumpFlag. In dumpFlag, the previously generated column definition flag is modified:

func dumpFlag(tp byte, flag uint16) uint16 {
    switch tp {
    case mysql.TypeSet:
        return flag | uint16(mysql.SetFlag)
    case mysql.TypeEnum:
        return flag | uint16(mysql.EnumFlag)
    default:
        if mysql.HasBinaryFlag(uint(flag)) {
            return flag | uint16(mysql.NotNullFlag)
        }
        return flag
    }
}

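At last we have found why TiDB returns a wrong column definition: for any type other than SET and ENUM, dumpFlag adds NotNullFlag whenever BinaryFlag is set, and the integer-typed CASE expression carries BinaryFlag. In fact, this bug has already been fixed in the latest TiDB release, 5.2.0: *: fix some problems related to notNullFlag by wjhuang2016 · Pull Request #27697 · pingcap/tidb.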
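
The interaction can be reproduced in a few lines. The sketch below copies the dumpFlag logic into a standalone program, with the type codes and flag bits of the MySQL protocol inlined as local constants. dumpFlagKeepNull is a hypothetical variant that simply leaves the NOT_NULL bit alone; it is only an illustration and is not necessarily what the actual PR does.

```go
package main

import "fmt"

// Type codes and flag bits from the MySQL protocol, inlined so the sketch is self-contained.
const (
	typeLonglong byte = 0x08
	typeEnum     byte = 0xf7
	typeSet      byte = 0xf8

	notNullFlag uint16 = 1 << 0
	binaryFlag  uint16 = 1 << 7
	enumFlag    uint16 = 1 << 8
	setFlag     uint16 = 1 << 11
)

// dumpFlagBuggy mirrors the dumpFlag logic shown above.
func dumpFlagBuggy(tp byte, flag uint16) uint16 {
	switch tp {
	case typeSet:
		return flag | setFlag
	case typeEnum:
		return flag | enumFlag
	default:
		if flag&binaryFlag != 0 {
			return flag | notNullFlag
		}
		return flag
	}
}

// dumpFlagKeepNull is a hypothetical variant that never touches the NOT_NULL bit.
func dumpFlagKeepNull(tp byte, flag uint16) uint16 {
	switch tp {
	case typeSet:
		return flag | setFlag
	case typeEnum:
		return flag | enumFlag
	default:
		return flag
	}
}

func main() {
	// Flags inferred for the CASE expression: BINARY set, NOT_NULL cleared.
	inferred := binaryFlag

	fmt.Println(dumpFlagBuggy(typeLonglong, inferred)&notNullFlag != 0)    // true: NOT_NULL was added back
	fmt.Println(dumpFlagKeepNull(typeLonglong, inferred)&notNullFlag != 0) // false: nullability preserved
}
```

This matches what --column-type-info showed: the planner clears NOT_NULL, but the flag reappears when the column definition is serialized for the client.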

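Finally, while reading the code above, it helps a lot to see what the AST looks like after TiDB parses the statement, so that we are not fumbling in the dark when traversing it. The TiDB Dev Guide has a parser chapter that explains how to debug the parser, and parser/quickstart.md at master · pingcap/parser also includes an example that prints the generated AST. Printing it naively is of little use, though. We can use davecgh/go-spew to dump the nodes generated by the parser directly, which gives a tree a human can read: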

package main

import (
        "fmt"

        "github.com/davecgh/go-spew/spew"
        "github.com/pingcap/parser"
        "github.com/pingcap/parser/ast"
        _ "github.com/pingcap/parser/test_driver"
)

func parse(sql string) (*ast.StmtNode, error) {
        p := parser.New()
        stmtNodes, _, err := p.Parse(sql, "", "")
        if err != nil {
                return nil, err
        }
        return &stmtNodes[0], nil
}

func main() {
        spew.Config.Indent = " "
        astNode, err := parse("SELECT a, b FROM t")
        if err != nil {
                fmt.Printf("parse error: %v\n", err.Error())
                return
        }
        fmt.Printf("%s\n", spew.Sdump(*astNode))
}

Output:

(*ast.SelectStmt)(0x140001dac30)({
    dmlNode: (ast.dmlNode) {
        stmtNode: (ast.stmtNode) {
            node: (ast.node) {
                text: (string) (len=18) "SELECT a, b FROM t"
            }
        }
    },
    resultSetNode: (ast.resultSetNode) {
        resultFields: ([]*ast.ResultField) <nil>
    },
    SelectStmtOpts: (*ast.SelectStmtOpts)(0x14000115bc0)({
        Distinct: (bool) false,
        SQLBigResult: (bool) false,
        SQLBufferResult: (bool) false,
        SQLCache: (bool) true,
        SQLSmallResult: (bool) false,
        CalcFoundRows: (bool) false,
        StraightJoin: (bool) false,
        Priority: (mysql.PriorityEnum) 0,
        TableHints: ([]*ast.TableOptimizerHint) <nil>
    }),
    Distinct: (bool) false,
    From: (*ast.TableRefsClause)(0x140001223c0)({
        node: (ast.node) {
            text: (string) ""
        },
        TableRefs: (*ast.Join)(0x14000254100)({
            node: (ast.node) {
                text: (string) ""
            },
            resultSetNode: (ast.resultSetNode) {
                resultFields: ([]*ast.ResultField) <nil>
            },
            Left: (*ast.TableSource)(0x14000156480)({
                node: (ast.node) {
                    text: (string) ""
                },
                Source: (*ast.TableName)(0x1400013a370)({
                    node: (ast.node) {
                        text: (string) ""
                    },
                    resultSetNode: (ast.resultSetNode) {
                        resultFields: ([]*ast.ResultField) <nil>
                    },
                    Schema: (model.CIStr) ,
                    Name: (model.CIStr) t,
                    DBInfo: (*model.DBInfo)(<nil>),
                    TableInfo: (*model.TableInfo)(<nil>),
                    IndexHints: ([]*ast.IndexHint) <nil>,
                    PartitionNames: ([]model.CIStr) {}
                }),
                AsName: (model.CIStr)
            }),
            Right: (ast.ResultSetNode) <nil>,
            Tp: (ast.JoinType) 0,
            On: (*ast.OnCondition)(<nil>),
            Using: ([]*ast.ColumnName) <nil>,
            NaturalJoin: (bool) false,
            StraightJoin: (bool) false
        })
    }),
    Where: (ast.ExprNode) <nil>,
    Fields: (*ast.FieldList)(0x14000115bf0)({
        node: (ast.node) {
            text: (string) ""
        },
        Fields: ([]*ast.SelectField) (len=2 cap=2) {
            (*ast.SelectField)(0x140001367e0)({
                node: (ast.node) {
                    text: (string) (len=1) "a"
                },
                Offset: (int) 7,
                WildCard: (*ast.WildCardField)(<nil>),
                Expr: (*ast.ColumnNameExpr)(0x14000254000)({
                    exprNode: (ast.exprNode) {
                        node: (ast.node) {
                            text: (string) ""
                        },
                        Type: (types.FieldType) unspecified,
                        flag: (uint64) 8
                    },
                    Name: (*ast.ColumnName)(0x1400017dc70)(a),
                    Refer: (*ast.ResultField)(<nil>)
                }),
                AsName: (model.CIStr) ,
                Auxiliary: (bool) false
            }),
            (*ast.SelectField)(0x14000136840)({
                node: (ast.node) {
                    text: (string) (len=1) "b"
                },
                Offset: (int) 10,
                WildCard: (*ast.WildCardField)(<nil>),
                Expr: (*ast.ColumnNameExpr)(0x14000254080)({
                    exprNode: (ast.exprNode) {
                        node: (ast.node) {
                            text: (string) ""
                        },
                        Type: (types.FieldType) unspecified,
                        flag: (uint64) 8
                    },
                    Name: (*ast.ColumnName)(0x1400017dce0)(b),
                    Refer: (*ast.ResultField)(<nil>)
                }),
                AsName: (model.CIStr) ,
                Auxiliary: (bool) false
            })
        }
    }),
    GroupBy: (*ast.GroupByClause)(<nil>),
    Having: (*ast.HavingClause)(<nil>),
    WindowSpecs: ([]ast.WindowSpec) <nil>,
    OrderBy: (*ast.OrderByClause)(<nil>),
    Limit: (*ast.Limit)(<nil>),
    LockTp: (ast.SelectLockType) none,
    TableHints: ([]*ast.TableOptimizerHint) <nil>,
    IsAfterUnionDistinct: (bool) false,
    IsInBraces: (bool) false,
    QueryBlockOffset: (int) 0,
    SelectIntoOpt: (*ast.SelectIntoOption)(<nil>)
})