前文回顾
实现一个简略的Database1(译文)
实现一个简略的Database2(译文)
实现一个简略的Database3(译文)
译注:cstsck在github保护了一个简略的、相似SQLite的数据库实现,通过这个简略的我的项目,能够很好的了解数据库是如何运行的。本文是第四篇,次要是应用rspec对目前实现的性能进行测试并解决测试呈现BUG
译注:cstsck在github保护了一个简略的、相似sqlite的数据库实现,通过这个简略的我的项目,能够很好的了解数据库是如何运行的。本文是第四篇,次要是应用rspec对目前实现的性能进行测试并解决测试呈现BUG
Part 4 咱们的第一个测试(和BUG)
咱们曾经取得插入数据到数据库并打印所有数据的能力。当初来测试一下目前microseconds已有的性能。<br/>
我要应用rspec来写我的测试,因为我对rspec很相熟,它的语法也相当易读。<br/>
_译注:rsepec 是一个基于Ruby的测试框架,语法非常简单,能够很不便的测试各种可执行程序,判断输入_<br/>
我定义一个短小的help来发送一个帮忙命令列表到数据库,而后对输入进行断言。
describe 'database' do def run_script(commands) raw_output = nil IO.popen("./db", "r+") do |pipe| commands.each do |command| pipe.puts command end pipe.close_write # Read entire output raw_output = pipe.gets(nil) end raw_output.split("\n") end it 'inserts and retrieves a row' do result = run_script([ "insert 1 user1 [email protected]", "select", ".exit", ]) expect(result).to match_array([ "db > Executed.", "db > (1, user1, [email protected])", "Executed.", "db > ", ]) endend
这个简略的测试是确认咱们的输出可能获取返回后果。并确保能通过测试:
bundle exec rspec.Finished in 0.00871 seconds (files took 0.09506 seconds to load)1 example, 0 failures
当初测试插入更多行数据到数据库是可行的:
it 'prints error message when table is full' do script = (1..1401).map do |i| "insert #{i} user#{i} person#{i}@example.com" end script << ".exit" result = run_script(script) expect(result[-2]).to eq('db > Error: Table full.')end
再次运行测试:
bundle exec rspec..Finished in 0.01553 seconds (files took 0.08156 seconds to load)2 examples, 0 failures
妙啊,测试通过了!咱们的数据库当初可能hold住1400行数据,这是因为咱们设置pages最大数量是100页,每页能够寄存14行数据。
查看咱们目前写的的代码,我意识到咱们可能没有正确处理存储文本字段。很容易用上面的例子测试进去(插入边界长度的字符串):
it 'allows inserting strings that are the maximum length' do long_username = "a"*32 long_email = "a"*255 script = [ "insert 1 #{long_username} #{long_email}", "select", ".exit", ] result = run_script(script) expect(result).to match_array([ "db > Executed.", "db > (1, #{long_username}, #{long_email})", "Executed.", "db > ", ])end
测试失败了:
Failures: 1) database allows inserting strings that are the maximum length Failure/Error: raw_output.split("\n") ArgumentError: invalid byte sequence in UTF-8 # ./spec/main_spec.rb:14:in `split' # ./spec/main_spec.rb:14:in `run_script' # ./spec/main_spec.rb:48:in `block (2 levels) in <top (required)>'
如果是咱们本人人工来测试,当咱们打印行数据时,会看到有一些奇怪的字符(例子中,我把很长的字符串进行了缩写):
db > insert 1 aaaaa... aaaaa...Executed.db > select(1, aaaaa...aaa\�, aaaaa...aaa\�)Executed.db >
产生了什么?如果看一下代码中咱们定义的Row构造,咱们的确为username字段调配了32个字节长度,为email字段调配255个字节长度。然而C语言的strings是以一个null字符来作为结尾的,这个字符咱们没有为它调配空间。解决办法就是多调配一个额定的字节(来寄存这个null字符):
const uint32_t COLUMN_EMAIL_SIZE = 255;typedef struct { uint32_t id;- char username[COLUMN_USERNAME_SIZE];- char email[COLUMN_EMAIL_SIZE];+ char username[COLUMN_USERNAME_SIZE + 1];+ char email[COLUMN_EMAIL_SIZE + 1];} Row;
这样的确解决了下面的问题(从新运行下面插入边界长度字符串的测试):
bundle exec rspec...Finished in 0.0188 seconds (files took 0.08516 seconds to load)3 examples, 0 failures
咱们不容许插入的username或者email的长度超过固定的列的长度。这样的超出长度要求的spec测试看起来就像上面这样:
it 'prints error message if strings are too long' do long_username = "a"*33 long_email = "a"*256 script = [ "insert 1 #{long_username} #{long_email}", "select", ".exit", ] result = run_script(script) expect(result).to match_array([ "db > String is too long.", "db > Executed.", "db > ", ])end
我了可能反对下面这种成果,咱们须要降级咱们的解析器(parser)。揭示一下,咱们当初应用的是scanf():
if (strncmp(input_buffer->buffer, "insert", 6) == 0) { statement->type = STATEMENT_INSERT; int args_assigned = sscanf( input_buffer->buffer, "insert %d %s %s", &(statement->row_to_insert.id), statement->row_to_insert.username, statement->row_to_insert.email); if (args_assigned < 3) { return PREPARE_SYNTAX_ERROR; } return PREPARE_SUCCESS;}
然而scanf()有一些毛病。如果读取的string大于正在读取它的缓存(buffer),就会引起缓存溢出(buffer overflow)并写入到意想不到的中央。所以咱们须要在拷贝string到Row构造前查看每一个string的长度。为了查看string长度,咱们须要用空格宰割输出。
我应用 strtok() 来做这些。如果你看到过程就会感觉它很容易了解:<br/>
译注: strtok: 字符串处理函数,char strtok ( char str, const char * delimiters ); 合成字符串为一组字符串。str为要合成的字符,delimiters为分隔符字符(如果传入字符串,则传入的字符串中每个字符均为宰割符)
+PrepareResult prepare_insert(InputBuffer* input_buffer, Statement* statement) {+ statement->type = STATEMENT_INSERT;++ char* keyword = strtok(input_buffer->buffer, " ");+ char* id_string = strtok(NULL, " ");+ char* username = strtok(NULL, " ");+ char* email = strtok(NULL, " ");++ if (id_string == NULL || username == NULL || email == NULL) {+ return PREPARE_SYNTAX_ERROR;+ }++ int id = atoi(id_string);+ if (strlen(username) > COLUMN_USERNAME_SIZE) {+ return PREPARE_STRING_TOO_LONG;+ }+ if (strlen(email) > COLUMN_EMAIL_SIZE) {+ return PREPARE_STRING_TOO_LONG;+ }++ statement->row_to_insert.id = id;+ strcpy(statement->row_to_insert.username, username);+ strcpy(statement->row_to_insert.email, email);++ return PREPARE_SUCCESS;+}+ PrepareResult prepare_statement(InputBuffer* input_buffer, Statement* statement) { if (strncmp(input_buffer->buffer, "insert", 6) == 0) {+ return prepare_insert(input_buffer, statement);- statement->type = STATEMENT_INSERT;- int args_assigned = sscanf(- input_buffer->buffer, "insert %d %s %s", &(statement->row_to_insert.id),- statement->row_to_insert.username, statement->row_to_insert.email);- if (args_assigned < 3) {- return PREPARE_SYNTAX_ERROR;- }- return PREPARE_SUCCESS; }
每当输出到一个分隔符时(在咱们的例子中就是空格),就在输出缓冲(input buffer)上间断调用strtok(),把它分解成子字符串。它返回一个指向子字符串开始地位的指针。
咱们能够在每个文本值上调用strlen(),看看它是否太长(strlen()函数,获取字符串的长度)。
咱们能够像解决其余错误码一样处理错误:
enum PrepareResult_t { PREPARE_SUCCESS,+ PREPARE_STRING_TOO_LONG, PREPARE_SYNTAX_ERROR, PREPARE_UNRECOGNIZED_STATEMENT};
switch (prepare_statement(input_buffer, &statement)) { case (PREPARE_SUCCESS): break;+ case (PREPARE_STRING_TOO_LONG):+ printf("String is too long.\n");+ continue; case (PREPARE_SYNTAX_ERROR): printf("Syntax error. Could not parse statement.\n"); continue;
这样就能通过测试了。
bundle exec rspec....Finished in 0.02284 seconds (files took 0.116 seconds to load)4 examples, 0 failures
到了这里,咱们无妨再多解决一个谬误状况(id值插入一个负值):
it 'prints an error message if id is negative' do script = [ "insert -1 cstack [email protected]", "select", ".exit", ] result = run_script(script) expect(result).to match_array([ "db > ID must be positive.", "db > Executed.", "db > ", ])end
enum PrepareResult_t { PREPARE_SUCCESS,+ PREPARE_NEGATIVE_ID, PREPARE_STRING_TOO_LONG, PREPARE_SYNTAX_ERROR, PREPARE_UNRECOGNIZED_STATEMENT@@ -148,9 +147,6 @@ PrepareResult prepare_insert(InputBuffer* input_buffer, Statement* statement) { } int id = atoi(id_string);+ if (id < 0) {+ return PREPARE_NEGATIVE_ID;+ } if (strlen(username) > COLUMN_USERNAME_SIZE) { return PREPARE_STRING_TOO_LONG; }@@ -230,9 +226,6 @@ int main(int argc, char* argv[]) { switch (prepare_statement(input_buffer, &statement)) { case (PREPARE_SUCCESS): break;+ case (PREPARE_NEGATIVE_ID):+ printf("ID must be positive.\n");+ continue; case (PREPARE_STRING_TOO_LONG): printf("String is too long.\n"); continue;
好了,测试做的差不多了。接下来是十分重要的性能:长久化!咱们要实现保留咱们的数据库到一个文件,再把它从文件中读取进去。(目前它还在内存当中)
当初它越来越牛了。
上面是和上一部分代码,批改地位的比照:
@@ -22,6 +22,8 @@ enum PrepareResult_t { PREPARE_SUCCESS,+ PREPARE_NEGATIVE_ID,+ PREPARE_STRING_TOO_LONG, PREPARE_SYNTAX_ERROR, PREPARE_UNRECOGNIZED_STATEMENT };@@ -34,8 +36,8 @@ #define COLUMN_EMAIL_SIZE 255 typedef struct { uint32_t id;- char username[COLUMN_USERNAME_SIZE];- char email[COLUMN_EMAIL_SIZE];+ char username[COLUMN_USERNAME_SIZE + 1];+ char email[COLUMN_EMAIL_SIZE + 1]; } Row;@@ -150,18 +152,40 @@ MetaCommandResult do_meta_command(InputBuffer* input_buffer, Table *table) { } }-PrepareResult prepare_statement(InputBuffer* input_buffer,- Statement* statement) {- if (strncmp(input_buffer->buffer, "insert", 6) == 0) {+PrepareResult prepare_insert(InputBuffer* input_buffer, Statement* statement) { statement->type = STATEMENT_INSERT;- int args_assigned = sscanf(- input_buffer->buffer, "insert %d %s %s", &(statement->row_to_insert.id),- statement->row_to_insert.username, statement->row_to_insert.email- );- if (args_assigned < 3) {++ char* keyword = strtok(input_buffer->buffer, " ");+ char* id_string = strtok(NULL, " ");+ char* username = strtok(NULL, " ");+ char* email = strtok(NULL, " ");++ if (id_string == NULL || username == NULL || email == NULL) { return PREPARE_SYNTAX_ERROR; }++ int id = atoi(id_string);+ if (id < 0) {+ return PREPARE_NEGATIVE_ID;+ }+ if (strlen(username) > COLUMN_USERNAME_SIZE) {+ return PREPARE_STRING_TOO_LONG;+ }+ if (strlen(email) > COLUMN_EMAIL_SIZE) {+ return PREPARE_STRING_TOO_LONG;+ }++ statement->row_to_insert.id = id;+ strcpy(statement->row_to_insert.username, username);+ strcpy(statement->row_to_insert.email, email);+ return PREPARE_SUCCESS;++}+PrepareResult prepare_statement(InputBuffer* input_buffer,+ Statement* statement) {+ if (strncmp(input_buffer->buffer, "insert", 6) == 0) {+ return prepare_insert(input_buffer, statement); } if (strcmp(input_buffer->buffer, "select") == 0) { statement->type = STATEMENT_SELECT;@@ -223,6 +247,12 @@ int main(int argc, char* argv[]) { switch (prepare_statement(input_buffer, &statement)) { case (PREPARE_SUCCESS): break;+ case (PREPARE_NEGATIVE_ID):+ printf("ID must be positive.\n");+ continue;+ case (PREPARE_STRING_TOO_LONG):+ printf("String is too long.\n");+ continue; case (PREPARE_SYNTAX_ERROR): printf("Syntax error. Could not parse statement.\n"); continue;
并且咱们增加了一些测试:
+describe 'database' do+ def run_script(commands)+ raw_output = nil+ IO.popen("./db", "r+") do |pipe|+ commands.each do |command|+ pipe.puts command+ end++ pipe.close_write++ # Read entire output+ raw_output = pipe.gets(nil)+ end+ raw_output.split("\n")+ end++ it 'inserts and retrieves a row' do+ result = run_script([+ "insert 1 user1 [email protected]",+ "select",+ ".exit",+ ])+ expect(result).to match_array([+ "db > Executed.",+ "db > (1, user1, [email protected])",+ "Executed.",+ "db > ",+ ])+ end++ it 'prints error message when table is full' do+ script = (1..1401).map do |i|+ "insert #{i} user#{i} person#{i}@example.com"+ end+ script << ".exit"+ result = run_script(script)+ expect(result[-2]).to eq('db > Error: Table full.')+ end++ it 'allows inserting strings that are the maximum length' do+ long_username = "a"*32+ long_email = "a"*255+ script = [+ "insert 1 #{long_username} #{long_email}",+ "select",+ ".exit",+ ]+ result = run_script(script)+ expect(result).to match_array([+ "db > Executed.",+ "db > (1, #{long_username}, #{long_email})",+ "Executed.",+ "db > ",+ ])+ end++ it 'prints error message if strings are too long' do+ long_username = "a"*33+ long_email = "a"*256+ script = [+ "insert 1 #{long_username} #{long_email}",+ "select",+ ".exit",+ ]+ result = run_script(script)+ expect(result).to match_array([+ "db > String is too long.",+ "db > Executed.",+ "db > ",+ ])+ end++ it 'prints an error message if id is negative' do+ script = [+ "insert -1 cstack [email protected]",+ "select",+ ".exit",+ ]+ result = run_script(script)+ expect(result).to match_array([+ "db > ID must be positive.",+ "db > Executed.",+ "db > ",+ ])+ end+end
Enjoy GreatSQL :)
## 对于 GreatSQL
GreatSQL是由万里数据库保护的MySQL分支,专一于晋升MGR可靠性及性能,反对InnoDB并行查问个性,是实用于金融级利用的MySQL分支版本。
相干链接: GreatSQL社区 Gitee GitHub Bilibili
GreatSQL社区:
欢送来GreatSQL社区发帖发问
https://greatsql.cn/
技术交换群:
微信:扫码增加GreatSQL社区助手
微信好友,发送验证信息加群
。