关于数据库:使用TPCH-进行GreatSQL并行查询测试

31次阅读

共计 11370 个字符,预计需要花费 29 分钟才能阅读完成。

筹备工作

数据库版本

GreatSQL-8.0.25-17

生成数据

应用 TPC-H 生成数据

#TPC-H Population Generator (Version 3.0.0)

#生成 10G 的数据
$ ./dbgen -vf -s 10

批改 my.cnf

vim /etc/my.cnf 

#设置 IPB 为 8G
innodb_buffer_pool_size = 8G

#设置并行查问的应用最大内存(此处为 8G,依据具体配置设置)
parallel_memory_limit= 8G

#关上并行查问
force_parallel_execute=1

 #设置双 1(不便导入数据)innodb_flush_log_at_trx_commit = 1 
sync_binlog = 1

#敞开 binlog
skip-log_bin

datadir    = /data/GreatSQL
socket    = mysql.sock

启动数据库后,能够查看配置是否失效

mysql> show variables like '%double%';
mysql> show variables like 'log_bin';
mysql> show variables like 'sync_binlog';
mysql> show variables like 'innodb_flush_log_at_trx_commit';
mysql> show variables like 'innodb_buffer_pool_size';

并行查问相干参数

mysql> show global variables like '%parall%';
+----------------------------------+----------------+
| force_parallel_execute           | ON             |
| innodb_parallel_dblwr_encrypt    | OFF            |
| innodb_parallel_doublewrite_path | xb_doublewrite |
| innodb_parallel_read_threads     | 4              |
| parallel_cost_threshold          | 1000           |
| parallel_default_dop             | 4              |
| parallel_max_threads             | 64             |
| parallel_memory_limit            | 8589934592     |
| parallel_queue_timeout           | 0              |
| slave_parallel_type              | LOGICAL_CLOCK  |
| slave_parallel_workers           | 2              |
+----------------------------------+----------------+
11 rows in set (0.01 sec)

启动数据库

启动数据库:

$ systemctl start greatsql.service 

文件筹备

本次的工作在 /data/tpch

可执行程序为 dbgen, 依赖一个数据分布文件 dists.dss。能够将 dbgen 和 dists.dss 拷贝到同一目录应用

dss.ddl dss.ri 文件

筹备表构造和索引文件 dss.ddldss.ri 到工作目录

$ cd /data/tpch/tpch_2.18.0/dbgen
$ cp dss.ri /data/tpch/
$ cp dss.ddl /data/tpch/

load.sql

批改 load.sql 文件 批改文件的门路

$ cd /data/tpch/
$ cp load.sql loadfix.sql 
$ vim loadfix.sql

导入数据

$ /usr/localGreatSQL-8.0.25-17/bin/mysql -uroot -S /data/GreatSQL/mysql.sock

#创立数据库
mysql> create database tpch;
mysql> use tpch;

#导入表构造
mysql> source /data/tpch/dss.ddl;

#导入数据(文件见附录)mysql> sh loadfile

#导入索引、外键等
mysql> source /data/tpch/dssfix.ri

注:binlog 要关再导入, 否则 binlog 会爆

Multi-statement transaction required more than 'max_binlog_cache_size' bytes of storage; increase this mysqld variable and try again

查看导入的表(20G 数据)

mysql> select table_name,table_rows 
from information_schema.tables 
where table_name in 
('customer','lineitem','nation','orders','part','partsupp','region','supplier');

+------------+------------+
| TABLE_NAME | TABLE_ROWS |
+------------+------------+
| region     |          5 |
| nation     |         25 |
| part       |    3860136 |
| supplier   |     197853 |
| customer   |    2884322 |
| partsupp   |   17084176 |
| orders     |   29678499 |
| lineitem   |   87786966 |
+------------+------------+
8 rows in set (0.00 sec)

编写并运行测试脚本

# 测试脚本见附录
#这里是在 tmux 中运行,防止因为终端敞开导致测试终止
$ sh auto.sh&

#测试完结后在以后脚本的目录查看生成的日志
$ cat tpch-PQ-******.log

测试运行时,察看相干指标。

mysql> show global status like '%PQ%';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| PQ_memory_refused  | 0     |
| PQ_memory_used     | 0     |
| PQ_threads_refused | 0     |
| PQ_threads_running | 0     |
+--------------------+-------+
4 rows in set (0.00 sec)


mysql> show processlist;
mysql> explain for connection **;

测试后果

开启并行查问(16 线程)的执行工夫,与不开启并行查问的执行工夫如下:

SQL1 SQL3 SQL5 SQL6 SQL10 SQL12 SQL19
PQ16 1m25.645s 1m5.514s 8m56.306s 35.451s 44.564s 59.115s 5.771s
NOPQ 6m1.724s 5m19.083s 37m42.078s 2m16.331s 1m57.998s 2m39.672s 24.907s

注:本文章重点讲测试过程,具体的测试后果就不开展了。

**
**

附录 - 相干文件

导入脚本

$ cat loadfile
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile'/data/tpch/data/region.tbl'into table region FIELDS TERMINATED BY'|';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile'/data/tpch/data/nation.tbl'into table nation FIELDS TERMINATED BY'|';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile'/data/tpch/data/supplier.tbl'into table supplier FIELDS TERMINATED BY'|';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile'/data/tpch/data/part.tbl'into table part FIELDS TERMINATED BY'|';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile'/data/tpch/data/customer.tbl'into table customer FIELDS TERMINATED BY'|';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch -f -e "set session foreign_key_checks=0;load data infile'/data/tpch/data/partsupp.tbl'into table partsupp FIELDS TERMINATED BY'|';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile'/data/tpch/data/orders.tbl'into table orders FIELDS TERMINATED BY'|';" tpch &
/usr/localGreatSQL-8.0.25-17/bin/mysql -S /data/GreatSQL/mysql.sock -Dtpch-f -e "set session foreign_key_checks=0;load data infile'/data/tpch/data/lineitem.tbl'into table lineitem FIELDS TERMINATED BY'|';" tpch &

测试脚本:

脚本是七拼八凑的,写的不好,心愿有大佬能领导一下

$ cat auto.sh 
#include <iostream>TH=$PATH:/usr/local/bin
export PATH
#set -u
#set -x
#set -e
. ~/.bash_profile > /dev/null 2>&1
exec 3>&1 4>&2 1>> tpch-PQ-`date +'%Y%m%d%H%M%S'`.log 2>&1

# 定义要执行的 SQL 文件寄存的目录
SQL_DIR="/data/tpch/SQLs"

# 判断目录是否存在
if [! -d "$SQL_DIR"]; then
    echo "SQL 文件目录不存在!"
    exit 1
fi

# 进入 SQL 文件目录
cd $SQL_DIR


I=1
II=3
while [$I -le $II]
do
    # 执行 SQL 文件
    for file in `ls *.sql`
    do
        echo "正在执行:$file"
        time /usr/localGreatSQL-8.0.25-17/bin/mysql -uroot -S /data/GreatSQL/mysql.sock -Dtpch < $file
        
        echo "SQL:$file, 执行实现"
        echo -e
        echo "劳动 100s"
        sleep 100
        echo -e
    done
    echo "第 $I 次循环执行实现!"
I=`expr $I + 1`
done
echo "脚本完结"

dss.ddl

-- Sccsid:     @(#)dss.ddl    2.1.8.1
drop database tpch;
create database tpch;
use tpch;
CREATE TABLE NATION  ( N_NATIONKEY  INTEGER NOT NULL,
                            N_NAME       CHAR(25) NOT NULL,
                            N_REGIONKEY  INTEGER NOT NULL,
                            N_COMMENT    VARCHAR(152));

CREATE TABLE REGION  ( R_REGIONKEY  INTEGER NOT NULL,
                            R_NAME       CHAR(25) NOT NULL,
                            R_COMMENT    VARCHAR(152));

CREATE TABLE PART  ( P_PARTKEY     INTEGER NOT NULL,
                          P_NAME        VARCHAR(55) NOT NULL,
                          P_MFGR        CHAR(25) NOT NULL,
                          P_BRAND       CHAR(10) NOT NULL,
                          P_TYPE        VARCHAR(25) NOT NULL,
                          P_SIZE        INTEGER NOT NULL,
                          P_CONTAINER   CHAR(10) NOT NULL,
                          P_RETAILPRICE DECIMAL(15,2) NOT NULL,
                          P_COMMENT     VARCHAR(23) NOT NULL );

CREATE TABLE SUPPLIER ( S_SUPPKEY     INTEGER NOT NULL,
                             S_NAME        CHAR(25) NOT NULL,
                             S_ADDRESS     VARCHAR(40) NOT NULL,
                             S_NATIONKEY   INTEGER NOT NULL,
                             S_PHONE       CHAR(15) NOT NULL,
                             S_ACCTBAL     DECIMAL(15,2) NOT NULL,
                             S_COMMENT     VARCHAR(101) NOT NULL);

CREATE TABLE PARTSUPP ( PS_PARTKEY     INTEGER NOT NULL,
                             PS_SUPPKEY     INTEGER NOT NULL,
                             PS_AVAILQTY    INTEGER NOT NULL,
                             PS_SUPPLYCOST  DECIMAL(15,2)  NOT NULL,
                             PS_COMMENT     VARCHAR(199) NOT NULL );

CREATE TABLE CUSTOMER ( C_CUSTKEY     INTEGER NOT NULL,
                             C_NAME        VARCHAR(25) NOT NULL,
                             C_ADDRESS     VARCHAR(40) NOT NULL,
                             C_NATIONKEY   INTEGER NOT NULL,
                             C_PHONE       CHAR(15) NOT NULL,
                             C_ACCTBAL     DECIMAL(15,2)   NOT NULL,
                             C_MKTSEGMENT  CHAR(10) NOT NULL,
                             C_COMMENT     VARCHAR(117) NOT NULL);

CREATE TABLE ORDERS  ( O_ORDERKEY       INTEGER NOT NULL,
                           O_CUSTKEY        INTEGER NOT NULL,
                           O_ORDERSTATUS    CHAR(1) NOT NULL,
                           O_TOTALPRICE     DECIMAL(15,2) NOT NULL,
                           O_ORDERDATE      DATE NOT NULL,
                           O_ORDERPRIORITY  CHAR(15) NOT NULL,  
                           O_CLERK          CHAR(15) NOT NULL, 
                           O_SHIPPRIORITY   INTEGER NOT NULL,
                           O_COMMENT        VARCHAR(79) NOT NULL);

CREATE TABLE LINEITEM ( L_ORDERKEY    INTEGER NOT NULL,
                             L_PARTKEY     INTEGER NOT NULL,
                             L_SUPPKEY     INTEGER NOT NULL,
                             L_LINENUMBER  INTEGER NOT NULL,
                             L_QUANTITY    DECIMAL(15,2) NOT NULL,
                             L_EXTENDEDPRICE  DECIMAL(15,2) NOT NULL,
                             L_DISCOUNT    DECIMAL(15,2) NOT NULL,
                             L_TAX         DECIMAL(15,2) NOT NULL,
                             L_RETURNFLAG  CHAR(1) NOT NULL,
                             L_LINESTATUS  CHAR(1) NOT NULL,
                             L_SHIPDATE    DATE NOT NULL,
                             L_COMMITDATE  DATE NOT NULL,
                             L_RECEIPTDATE DATE NOT NULL,
                             L_SHIPINSTRUCT CHAR(25) NOT NULL,
                             L_SHIPMODE     CHAR(10) NOT NULL,
                             L_COMMENT      VARCHAR(44) NOT NULL);

dss.ri

-- Sccsid:     @(#)dss.ri    2.1.8.1
-- tpch Benchmark Version 8.0

-- For table REGION
ALTER TABLE tpch.REGION
ADD PRIMARY KEY (R_REGIONKEY);

-- For table NATION
ALTER TABLE tpch.NATION
ADD PRIMARY KEY (N_NATIONKEY);

ALTER TABLE tpch.NATION
ADD FOREIGN KEY NATION_FK1 (N_REGIONKEY) references tpch.REGION(R_REGIONKEY); 

COMMIT WORK;

-- For table PART
ALTER TABLE tpch.PART
ADD PRIMARY KEY (P_PARTKEY);

COMMIT WORK;

-- For table SUPPLIER
ALTER TABLE tpch.SUPPLIER
ADD PRIMARY KEY (S_SUPPKEY);

ALTER TABLE tpch.SUPPLIER
ADD FOREIGN KEY SUPPLIER_FK1 (S_NATIONKEY) references tpch.NATION(N_NATIONKEY);

COMMIT WORK;

-- For table PARTSUPP
ALTER TABLE tpch.PARTSUPP
ADD PRIMARY KEY (PS_PARTKEY,PS_SUPPKEY);

COMMIT WORK;

-- For table CUSTOMER
ALTER TABLE tpch.CUSTOMER
ADD PRIMARY KEY (C_CUSTKEY);

ALTER TABLE tpch.CUSTOMER
ADD FOREIGN KEY CUSTOMER_FK1 (C_NATIONKEY) references tpch.NATION(N_NATIONKEY); 

COMMIT WORK;

-- For table LINEITEM
ALTER TABLE tpch.LINEITEM
ADD PRIMARY KEY (L_ORDERKEY,L_LINENUMBER);

COMMIT WORK;

-- For table ORDERS
ALTER TABLE tpch.ORDERS
ADD PRIMARY KEY (O_ORDERKEY);

COMMIT WORK;

-- For table PARTSUPP
ALTER TABLE tpch.PARTSUPP
ADD FOREIGN KEY PARTSUPP_FK1 (PS_SUPPKEY) references tpch.SUPPLIER(S_SUPPKEY);

COMMIT WORK;

ALTER TABLE tpch.PARTSUPP
ADD FOREIGN KEY PARTSUPP_FK2 (PS_PARTKEY) references tpch.PART(P_PARTKEY);

COMMIT WORK;

-- For table ORDERS
ALTER TABLE tpch.ORDERS
ADD FOREIGN KEY ORDERS_FK1 (O_CUSTKEY) references tpch.CUSTOMER(C_CUSTKEY);

COMMIT WORK;

-- For table LINEITEM
ALTER TABLE tpch.LINEITEM
ADD FOREIGN KEY LINEITEM_FK1 (L_ORDERKEY)  references tpch.ORDERS(O_ORDERKEY);

COMMIT WORK;

ALTER TABLE tpch.LINEITEM
ADD FOREIGN KEY LINEITEM_FK2 (L_PARTKEY,L_SUPPKEY) references 
        tpch.PARTSUPP(PS_PARTKEY,PS_SUPPKEY);

COMMIT WORK;

SQL 语句

--SQL1
 select /*+ PQ(16) */
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(*) as count_order
from
    lineitem
where
    l_shipdate <= date '1998-12-01' - interval '88' day
group by
    l_returnflag,
    l_linestatus
order by
    l_returnflag,
    l_linestatus limit 1;

--SQL3
select /*+ PQ(16) */
    l_orderkey,
    sum(l_extendedprice * (1 - l_discount)) as revenue,
    o_orderdate,
    o_shippriority
from
    customer,
    orders,
    lineitem
where
    c_mktsegment = 'MACHINERY'
    and c_custkey = o_custkey
    and l_orderkey = o_orderkey
    and o_orderdate < date '1995-03-01'
    and l_shipdate > date '1995-03-01'
group by
    l_orderkey,
    o_orderdate,
    o_shippriority
order by
    revenue desc,
    o_orderdate limit 10;


--SQL6
select /*+ PQ(16) */
    sum(l_extendedprice * l_discount) as revenue
from
    lineitem
where
    l_shipdate >= date '1993-01-01'
    and l_shipdate < date '1993-01-01' + interval '1' year
    and l_discount between 0.02 - 0.01 and 0.02 + 0.01
    and l_quantity < 24 limit 1;
    



--SQL10
select /*+ PQ(16) */
    c_custkey,
    c_name,
    sum(l_extendedprice * (1 - l_discount)) as revenue,
    c_acctbal,
    n_name,
    c_address,
    c_phone,
    c_comment
from
    customer,
    orders,
    lineitem,
    nation
where
    c_custkey = o_custkey
    and l_orderkey = o_orderkey
    and o_orderdate >= date '1994-05-01'
    and o_orderdate < date '1994-05-01' + interval '3' month
    and l_returnflag = 'R'
    and c_nationkey = n_nationkey
group by
    c_custkey,
    c_name,
    c_acctbal,
    c_phone,
    n_name,
    c_address,
    c_comment
order by
    revenue desc limit 20;





--SQL12
select /*+ PQ(16) */
    l_shipmode,
    sum(case
        when o_orderpriority = '1-URGENT'
            or o_orderpriority = '2-HIGH'
            then 1
        else 0
    end) as high_line_count,
    sum(case
        when o_orderpriority <> '1-URGENT'
            and o_orderpriority <> '2-HIGH'
            then 1
        else 0
    end) as low_line_count
from
    orders,
    lineitem
where
    o_orderkey = l_orderkey
    and l_shipmode in ('TRUCK', 'FOB')
    and l_commitdate < l_receiptdate
    and l_shipdate < l_commitdate
    and l_receiptdate >= date '1996-01-01'
    and l_receiptdate < date '1996-01-01' + interval '1' year
group by
    l_shipmode
order by
    l_shipmode limit 1 ;



--SQL19
select /*+ PQ(16) */
    sum(l_extendedprice* (1 - l_discount)) as revenue
from
    lineitem,
    part
where
    (
        p_partkey = l_partkey
        and p_brand = 'Brand#12'
        and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
        and l_quantity >= 10 and l_quantity <= 10 + 10
        and p_size between 1 and 5
        and l_shipmode in ('AIR', 'AIR REG')
        and l_shipinstruct = 'DELIVER IN PERSON'
    )
    or
    (
        p_partkey = l_partkey
        and p_brand = 'Brand#22'
        and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
        and l_quantity >= 15 and l_quantity <= 15 + 10
        and p_size between 1 and 10
        and l_shipmode in ('AIR', 'AIR REG')
        and l_shipinstruct = 'DELIVER IN PERSON'
    )
    or
    (
        p_partkey = l_partkey
        and p_brand = 'Brand#43'
        and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
        and l_quantity >= 22 and l_quantity <= 22 + 10
        and p_size between 1 and 15
        and l_shipmode in ('AIR', 'AIR REG')
        and l_shipinstruct = 'DELIVER IN PERSON'
    ) limit 1 ;
    

相干问题

参考资料

https://imysql.com/2012/12/21/tpch-for-mysql-manual.html


Enjoy GreatSQL :)

## 对于 GreatSQL

GreatSQL 是由万里数据库保护的 MySQL 分支,专一于晋升 MGR 可靠性及性能,反对 InnoDB 并行查问个性,是实用于金融级利用的 MySQL 分支版本。

相干链接:GreatSQL 社区 Gitee GitHub Bilibili

GreatSQL 社区:

社区博客有奖征稿详情:https://greatsql.cn/thread-100-1-1.html

技术交换群:

微信:扫码增加 GreatSQL 社区助手 微信好友,发送验证信息 加群

正文完
 0