关于列式数据库:MergeTree-创建方式和存储结构

创立形式和存储构造Mergetree在写入数据时,数据总会以数据片段的模式写入磁盘,为了防止片段过多,ClickHouse会通过后盾线程,定期合并这些数据片段,属于雷同分区的数据片段会被合并成一个新的片段,正式合并树名称的由来。 创立形式CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster](    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],    ...    INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1,    INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2) ENGINE = MergeTree()[PARTITION BY expr][ORDER BY expr][PRIMARY KEY expr][SAMPLE BY expr][SETTINGS name=value, ...]主要参数: PARTITION BY分区键,不申明分区键,则会默认生成一个名为all的分区。 ORDER BY必填,排序键,默认状况下主键与排序键雷同。 PRIMARY KEY会依据主键字段生成一级索引,用于减速查问,可不申明,默认是ORDER BY定义的字段。 SAMPLE BY抽样表达式,申明数据以何种规范进行采样,如果应用此配置,必须子主键的配置中也申明同样的表达式。ORDER BY (CounterID,intHash32(UserID))SAMPLE BY intHash32(UserID) SETTINGSindex_granularity:索引粒度,默认8192,也就每隔8192行才生成一条索引enable_mixed_granularity_parts:是否开启自适应索引距离性能,默认开启index_granularity_bytes:索引粒度,依据每一批次写入数据的大小,动静划分距离大小,默认10M(10*1024*1024) 存储构造创立测试表 CREATE TABLE test.part_v1( `ID` String, `URL` String, `age` UInt8 DEFAULT 0, `EventTime` Date)ENGINE = MergeTree()PARTITION BY toYYYYMMDD(EventTime)ORDER BY IDSETTINGS index_granularity = 8192插入数据 ...

August 27, 2020 · 2 min · jiezi

关于列式数据库:ClickHouse-Merge性能测试

ClickHouse 性能测试为了验证ClickHouse性能,将结合实际业务场景对clickhouse进行多维度测试。造测试数据在理论业务中最常见的业务场景,有二张表,订单主表和订单明细表 通常二张表会join查问,或者group by查问,上面就会应用clickhouse对这种状况进行测试 定义表构造test_order: 主表表构造: CREATE TABLE `test_order` ( `id` bigint(11) NOT NULL AUTO_INCREMENT, `field_name_1` varchar(60) NOT NULL, `field_name_2` varchar(60) NOT NULL, `field_name_3` varchar(60) NOT NULL, `field_name_4` varchar(60) NOT NULL, `field_name_5` varchar(60) NOT NULL, `field_name_6` varchar(60) NOT NULL, `field_name_7` varchar(60) NOT NULL, `field_name_8` varchar(60) NOT NULL, `field_name_9` varchar(60) NOT NULL, `field_name_10` varchar(60) NOT NULL, `field_id_1` int(11) NOT NULL, `field_id_2` int(11) NOT NULL, `field_id_3` int(11) NOT NULL, `field_id_4` int(11) NOT NULL, `field_id_5` int(11) NOT NULL, `field_id_6` int(11) NOT NULL, `field_id_7` int(11) NOT NULL, `field_id_8` int(11) NOT NULL, `field_id_9` int(11) NOT NULL, `field_id_10` int(11) NOT NULL, `field_date_1` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, `field_date_2` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, `field_date_3` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP, `field_date_4` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, `field_date_5` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, `field_date_6` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, `field_date_7` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, `field_date_8` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, `field_date_9` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY (`id`), KEY `idx_field_1` (`field_name_1`,`field_id_1`) USING BTREE) ENGINE=InnoDB AUTO_INCREMENT=1043 DEFAULT CHARSET=utf8mb4;test_order_detail: 明细表,为了减少sql查问简单的,定义了41个字段表构造 ...

August 24, 2020 · 5 min · jiezi