作者:阙秀平

爱可生 dble 团队测试成员,次要负责 dble 需要测试,自动化编写和社区问题解答。热衷rua雍正。

本文起源:原创投稿

*爱可生开源社区出品,原创内容未经受权不得随便应用,转载请分割小编并注明起源。


一、性能背景

咱们晓得, dble 是基于 MySQL 的⾼可扩展性的分布式中间件,而 MySQL 善于的是联机事务处理(OLTP),那么面对越来越多的联机剖析(OLAP)需要,MySQL 就显得有些顾此失彼了。为了可能提供良好的联机剖析(OLAP)能力,dble 在 3.22.01 版本提供了解决方案,能够在后端节点反对 ClickHouse ,借助 ClickHouse 的能力,提供弱小的联机剖析(OLAP)服务。

那什么是 ClickHouse?为什么抉择ClickHouse呢?ClickHouse是一个用于联机剖析(OLAP)的列式数据库管理系统(DBMS)。绝对于行式数据库 MySQL ,对于大多数查问而言,列式数据库处理速度至多进步100倍。

那 dble 是怎么实现的呢?上面来简略操作一下。

二、环境筹备

服务器上部署四个 docker 容器:

1.部署两个 5.7.25 版本的 MySQL 数据库。

2.部署 3.22.01 版本 dble 环境,装置可参考:【0.3 疾速开始 · dble manual】(https://actiontech.github.io/...)

3.部署 22.6.1 版本 ClickHouse 数据库,装置可参考:【装置 | ClickHouse Docs】(https://clickhouse.com/docs/z...),验证 ClickHosue 可用。

# clickhouse-client -u test --password password -h 127.0.0.1ClickHouse client version 22.6.1.1985 (official build).Connecting to 127.0.0.1:9000 as user test.Connected to ClickHouse server version 22.6.1 revision 54455.clickhouse-1 :) show databases;SHOW DATABASESQuery id: b2a417e7-7a76-4461-896d-961540eb60a1┌─name─────────┐│ INFORMATION_SCHEMA ││ default            ││ information_schema ││ system             │└────────────┘4 rows in set. Elapsed: 0.003 sec.

三、dble+ClickHouse

1.进入{install_dir}/dble/conf/user.xml ,新增用户 analysisUser 。

<?xml version='1.0' encoding='utf-8'?><!DOCTYPE dble:user SYSTEM "user.dtd"><dble:user xmlns:dble="http://dble.cloud/">     <managerUser name="root" password="password"/>     <analysisUser name="ana1" password="password" dbGroup="ha_group3" /></dble:user>
  • user.xml 配置留神:

    (1). 一个 analysisUser 仅对应一个 dbGroup 。

    (2). 多个 analysisUser 可援用同一个 dbGroup 。

2.进入{install_dir}/dble/conf/db.xml,新增 ha_group3

<?xml version='1.0' encoding='utf-8'?><!DOCTYPE dble:db SYSTEM "db.dtd"><dble:db xmlns:dble="http://dble.cloud/">    <dbGroup rwSplitMode="0" name="ha_group3" delayThreshold="100" >        <heartbeat>select user()</heartbeat>        <dbInstance name="hostM3" password="password" url="172.100.9.13:9004" user="test" maxCon="1000" minCon="10" primary="true" databaseType="clickhouse"/>    </dbGroup></dble:db>
  • db.xml配置留神:

    (1). heartbeat 中下发心跳的语句须要是 ClickHouse 反对的语法。

    (2). analysisUser 用户所援用的 dbgroup 中配置的数据库必须和 databaseType 对立。

    (3). ClickHosue 的端口需配置 9004 ,9004 提供 MySQL 的协定。

3.dble 启动胜利,应用 analysisUser 用户登录。

# mysql -h127.0.0.1 -uana1 -ppassword -P8066#下发show databases语句返回的后果可知,dble未应用分库分表性能,仅展现后端 clickhouse 的数据库信息mysql> show databases;+--------------------+| name               |+--------------------+| INFORMATION_SCHEMA || default            || information_schema || system             |+--------------------+4 rows in set (0.00 sec)#进入default库mysql> use default;Database changed, 1 warning#建表mysql>  CREATE TABLE t1 (x String) ENGINE = TinyLog;;Query OK, 1 row affected (0.03 sec)#检查表是否正确mysql> desc t1;+------+--------+--------------+--------------------+---------+------------------+----------------+| name | type   | default_type | default_expression | comment | codec_expression | ttl_expression |+------+--------+--------------+--------------------+---------+------------------+----------------+| x    | String |              |                    |         |                  |                |+------+--------+--------------+--------------------+---------+------------------+----------------+1 row in set (0.00 sec)#插入数据mysql> INSERT INTO t1 VALUES(1);Query OK, 1 row affected (0.00 sec)#检查数据是否正确插入mysql>  SELECT x, toTypeName(x) FROM t1;+------+---------------+| x    | toTypeName(x) |+------+---------------+| 1    | String        |+------+---------------+1 row in set (0.00 sec)

进入 ClickHouse 中验证。

# clickhouse-client -u test --password password -h 127.0.0.1#进入默认库clickhouse-1 :) use default;USE defaultQuery id: eac19b96-6da6-4d77-8258-e5a827c31685Ok.0 rows in set. Elapsed: 0.002 sec.# 查看表是否建设胜利clickhouse-1 :) show tables;SHOW TABLESQuery id: 02f5018d-4b7b-4348-be5a-89fdcdbc3aa5┌─name─┐│ t1   │└──────┘1 row in set. Elapsed: 0.003 sec.#查看表是否正确clickhouse-1 :) desc t1;DESCRIBE TABLE t1Query id: 6721fa63-c52c-4236-8c4a-27a1ffdcd059┌─name─┬─type───┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐│ x    │ String │              │                    │         │                  │                │└──────┴────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘1 row in set. Elapsed: 0.002 sec.# 查看数据是否和dble查问的统一clickhouse-1 :) SELECT x, toTypeName(x) FROM t1;SELECT    x,    toTypeName(x)FROM t1Query id: 548ea88e-b1a1-4362-912d-ffa770c7c1be┌─x─┬─toTypeName(x)─┐│ 1 │ String        │└───┴───────────────┘2 rows in set. Elapsed: 0.002 sec.

4.在 ClickHouse 导入 4.4G 的数据,比照 ClickHouse 和 dble 的查问后果。

#clickhouse容器装置门路下 下载数据,网速好的话五分钟左右可下载实现wget http://prod.publicdata.landregistry.gov.uk.s3-website-eu-west-1.amazonaws.com/pp-complete.csv#clickhouse-client -u test --password password -h 127.0.0.1,登录默认库建表CREATE TABLE uk_price_paid(    price UInt32,    date Date,    postcode1 LowCardinality(String),    postcode2 LowCardinality(String),    type Enum8('terraced' = 1, 'semi-detached' = 2, 'detached' = 3, 'flat' = 4, 'other' = 0),    is_new UInt8,    duration Enum8('freehold' = 1, 'leasehold' = 2, 'unknown' = 0),    addr1 String,    addr2 String,    street LowCardinality(String),    locality LowCardinality(String),    town LowCardinality(String),    district LowCardinality(String),    county LowCardinality(String),    category UInt8) ENGINE = MergeTree ORDER BY (postcode1, postcode2, addr1, addr2);#clickhouse容器装置门路下 导入数据,可能耗时40s左右clickhouse-local --input-format CSV --structure '    uuid String,    price UInt32,    time DateTime,    postcode String,    a String,    b String,    c String,    addr1 String,    addr2 String,    street String,    locality String,    town String,    district String,    county String,    d String,    e String' --query "    WITH splitByChar(' ', postcode) AS p    SELECT        price,        toDate(time) AS date,        p[1] AS postcode1,        p[2] AS postcode2,        transform(a, ['T', 'S', 'D', 'F', 'O'], ['terraced', 'semi-detached', 'detached', 'flat', 'other']) AS type,        b = 'Y' AS is_new,        transform(c, ['F', 'L', 'U'], ['freehold', 'leasehold', 'unknown']) AS duration,        addr1,        addr2,        street,        locality,        town,        district,        county,        d = 'B' AS category    FROM table" --date_time_input_format best_effort < pp-complete.csv | clickhouse-client --query "INSERT INTO uk_price_paid FORMAT TSV"

登录 dble 和 ClickHouse 查看比照数据。

#ClickHouse侧后果clickhouse-1 :) use default;0 rows in set. Elapsed: 0.001 sec.clickhouse-1 :) SELECT count() FROM uk_price_paid;┌──count()─┐│ 27176256 │└──────────┘1 row in set. Elapsed: 0.003 sec.clickhouse-1 :) SELECT formatReadableSize(total_bytes) FROM system.tables WHERE name = 'uk_price_paid';┌─formatReadableSize(total_bytes)─┐│ 235.39 MiB                      │└─────────────────────────────────┘1 row in set. Elapsed: 0.003 sec.clickhouse-1 :) SELECT toYear(date) AS year, round(avg(price)) AS price, bar(price, 0, 1000000, 80) FROM uk_price_paid GROUP BY year ORDER BY year;┌─year─┬──price─┬─bar(round(avg(price)), 0, 1000000, 80)─┐│ 1995 │  67933 │ █████▍                                 ││ 1996 │  71507 │ █████▋                                 ││ 1997 │  78536 │ ██████▎                                ││ 1998 │  85439 │ ██████▋                                ││ 1999 │  96038 │ ███████▋                               ││ 2000 │ 107486 │ ████████▌                              ││ 2001 │ 118888 │ █████████▌                             ││ 2002 │ 137945 │ ███████████                            ││ 2003 │ 155893 │ ████████████▍                          ││ 2004 │ 178887 │ ██████████████▎                        ││ 2005 │ 189356 │ ███████████████▏                       ││ 2006 │ 203530 │ ████████████████▎                      ││ 2007 │ 219379 │ █████████████████▌                     ││ 2008 │ 217054 │ █████████████████▎                     ││ 2009 │ 213418 │ █████████████████                      ││ 2010 │ 236107 │ ██████████████████▊                    ││ 2011 │ 232803 │ ██████████████████▌                    ││ 2012 │ 238381 │ ███████████████████                    ││ 2013 │ 256923 │ ████████████████████▌                  ││ 2014 │ 279984 │ ██████████████████████▍                ││ 2015 │ 297263 │ ███████████████████████▋               ││ 2016 │ 313470 │ █████████████████████████              ││ 2017 │ 346297 │ ███████████████████████████▋           ││ 2018 │ 350486 │ ████████████████████████████           ││ 2019 │ 351985 │ ████████████████████████████▏          ││ 2020 │ 375697 │ ██████████████████████████████         ││ 2021 │ 379729 │ ██████████████████████████████▍        ││ 2022 │ 370402 │ █████████████████████████████▋         │└──────┴────────┴────────────────────────────────────────┘28 rows in set. Elapsed: 0.064 sec. Processed 27.18 million rows, 163.06 MB (423.68 million rows/s., 2.54 GB/s.)


#dble侧后果mysql> SELECT count() FROM uk_price_paid;+----------+| count()  |+----------+| 27176256 |+----------+1 row in set (0.00 sec)mysql>mysql> SELECT count() FROM uk_price_paid;+----------+| count()  |+----------+| 27176256 |+----------+1 row in set (0.00 sec)mysql> SELECT toYear(date) AS year, round(avg(price)) AS price, bar(price, 0, 1000000, 80) FROM uk_price_paid GROUP BY year ORDER BY year;+------+--------+-----------------------------------------------------------------------------------------------+| year | price  | bar(round(avg(price)), 0, 1000000, 80)                                                        |+------+--------+-----------------------------------------------------------------------------------------------+| 1995 |  67933 | █████▍                                                                                        || 1996 |  71507 | █████▋                                                                                        || 1997 |  78536 | ██████▎                                                                                       || 1998 |  85439 | ██████▋                                                                                       || 1999 |  96038 | ███████▋                                                                                      || 2000 | 107486 | ████████▌                                                                                     || 2001 | 118888 | █████████▌                                                                                    || 2002 | 137945 | ███████████                                                                                   || 2003 | 155893 | ████████████▍                                                                                 || 2004 | 178887 | ██████████████▎                                                                               || 2005 | 189356 | ███████████████▏                                                                              || 2006 | 203530 | ████████████████▎                                                                             || 2007 | 219379 | █████████████████▌                                                                            || 2008 | 217054 | █████████████████▎                                                                            || 2009 | 213418 | █████████████████                                                                             || 2010 | 236107 | ██████████████████▊                                                                           || 2011 | 232803 | ██████████████████▌                                                                           || 2012 | 238381 | ███████████████████                                                                           || 2013 | 256923 | ████████████████████▌                                                                         || 2014 | 279984 | ██████████████████████▍                                                                       || 2015 | 297263 | ███████████████████████▋                                                                      || 2016 | 313470 | █████████████████████████                                                                     || 2017 | 346297 | ███████████████████████████▋                                                                  || 2018 | 350486 | ████████████████████████████                                                                  || 2019 | 351985 | ████████████████████████████▏                                                                 || 2020 | 375697 | ██████████████████████████████                                                                || 2021 | 379729 | ██████████████████████████████▍                                                               || 2022 | 370402 | █████████████████████████████▋                                                                |+------+--------+-----------------------------------------------------------------------------------------------+28 rows in set (0.07 sec)

四、dble+MySQL+ClickHouse

1.批改 user.xml 的配置

<?xml version='1.0' encoding='utf-8'?><!DOCTYPE dble:user SYSTEM "user.dtd"><dble:user xmlns:dble="http://dble.cloud/">   <managerUser name="root" password="password"/>   <shardingUser name="test" password="password" schemas="schema1"/>   <analysisUser name="ana1" password="password" dbGroup="ha_group3" /></dble:user

2.批改 db.xml 的配置

<!DOCTYPE dble:db SYSTEM "db.dtd"><dble:db xmlns:dble="http://dble.cloud/">   <dbGroup rwSplitMode="0" name="ha_group1" delayThreshold="100" >        <heartbeat>select user()</heartbeat>        <dbInstance name="hostM1" password="password" url="172.100.9.5:3307" user="test" maxCon="1000" minCon="10" primary="true">        </dbInstance>    </dbGroup>    <dbGroup rwSplitMode="0" name="ha_group2" delayThreshold="100" >        <heartbeat>select user()</heartbeat>        <dbInstance name="hostM2" password="password" url="172.100.9.6:3307" user="test" maxCon="1000" minCon="10" primary="true">        </dbInstance>    </dbGroup>        <dbGroup rwSplitMode="0" name="ha_group3" delayThreshold="100" >        <heartbeat>select user()</heartbeat>        <dbInstance name="hostM3" password="password" url="172.100.9.13:9004" user="test" maxCon="1000" minCon="10" primary="true" databaseType="clickhouse"/>    </dbGroup>    </dble:db>
  • db.xml 配置留神

    (1). shardingUser 用户所援用的 dbgroup 中 databaseType 的参数值只能是 MySQL ,当 databaseType 未设置时,默认是 MySQL 。

    (2). shardingUser 援用的 dbGroup ,须要被配置的 schemas 对应的 sharding.xml 中的 shardingNode 所援用

3.批改 sharding.xml 配置

<?xml version='1.0' encoding='utf-8'?><!DOCTYPE dble:sharding SYSTEM "sharding.dtd"><dble:sharding xmlns:dble="http://dble.cloud/">    <schema shardingNode="dn1" name="schema1" sqlMaxLimit="100">        <shardingTable name="sharding_4_t1" shardingNode="dn1,dn2,dn3,dn4" function="hash-four" shardingColumn="id"/>    </schema>    <shardingNode dbGroup="ha_group1" database="db1" name="dn1" />    <shardingNode dbGroup="ha_group2" database="db1" name="dn2" />    <shardingNode dbGroup="ha_group1" database="db2" name="dn3" />    <shardingNode dbGroup="ha_group2" database="db2" name="dn4" />    <function class="Hash" name="hash-four">      <property name="partitionCount">4</property>      <property name="partitionLength">1</property>   </function>    </dble:sharding>

4.dble 启动胜利,别离应用 shardingUser 用户和 analysisUser 用户登录。

#mysql -h127.0.0.1 -utest -ppassword -P8066 分库分表用户mysql> show databases;+----------+| DATABASE |+----------+| schema1  |+----------+1 row in set (0.01 sec)


#mysql -h127.0.0.1 -uana1 -ppassword -P8066 analysisUser用户mysql> show databases;+--------------------+| name               |+--------------------+| INFORMATION_SCHEMA || default            || information_schema || system             |+--------------------+4 rows in set (0.00 sec)

一种可能的业务架构:

1.OLTP:client 端会发送申请至 dble ,dble(shardingUser,rwSplitUser)会将语句发送至 MySQL ,而后返回申请。

2.复制:dble 后端的 MySQL 的数据会同步至 MySQL 汇总数据库,为了应用 ClickHouse 剖析数据,应用工具把 MySQL 汇总数据库的数据同步至 ClickHouse 。

3.OLAP:client 端会发送申请至 dble ,dble(analysisUser)会将语句发送至 ClickHouse ,而后返回申请。实现 dble+MySQL+ClickHouse 的数据分析。