内存数据库

通过TCP与简易内存数据库进行数据交互的实现

目的运用TCP相关原理，实现一个简单的server端和client端的数据库交互程序，可以将client端输入的指令被server端解析，将返回信息又返送给client端。之前的简单内存数据库的实现：T-Tree、T*-Tree的理解与简单内存数据库的实现TCP/IP,socket等相关计算机网络原理七层网络模型七层网络模型TCP/IPTCP/IP是互联网协议簇的统称。TCP-transmission control protocal-传输控制协议IP-Internet Protocal-因特网协议UDP 是User Datagram Protocol 是无连接类型的传输层协议，socket是什么socket是对TCP/IP协议的封装，socket翻译为套接字，socket是一个接口/插座。TCPIP 是Socket的一种实现，Socket并不只有TCP/IP。两种socket:stream sockets,datagram socketsocket其实不止两种。stream socket：串流式socket。是有连接类型的，网页浏览器所使用的 HTTP 协议是用 stream sockets 取得网页。datagram socket：讯息式socket。是无连接类型的，用于语音通信，视频传输较多。TCP serverserver端socket()函数#include <sys/types.h>#include <sys/socket.h>int socket(int domain, int type, int protocol);domain一个地址描述。目前仅支持AF_INET格式，也就是说ARPA Internet地址格式。type指定socket类型。新套接口的类型描述类型，如TCP（SOCK_STREAM）和UDP（SOCK_DGRAM）。常用的socket类型有，SOCK_STREAM、SOCK_DGRAM、SOCK_RAW、SOCK_PACKET、SOCK_SEQPACKET等等。protocol指定协议。套接口所用的协议。如不想指定，可用0。常用的协议有，IPPROTO_TCP、IPPROTO_UDP、IPPROTO_STCP、IPPROTO_TIPC等，它们分别对应TCP传输协议、UDP传输协议、STCP传输协议、TIPC传输协议。bind()函数#include <sys/types.h>#include <sys/socket.h>int bind(int sockfd, struct sockaddr *my_addr, int addrlen);bind()将一本地地址与一socket捆绑.sockfdsockfd 是 socket() 传回的 socket file descriptor。my_addrmy_addr是指向包含你的地址资料丶名称及 IP address 的 struct sockaddr 之指针。addrlenaddrlen 是以 byte 为单位的地址长度。connect#include <sys/types.h>#include <sys/socket.h>int connect(int sockfd, struct sockaddr *serv_addr, int addrlen);connect()从client端连接到server端listen(),accept()int listen(int sockfd, int backlog);—————————————————————–#include <sys/types.h>#include <sys/socket.h>int accept(int sockfd, struct sockaddr *addr, socklen_t addrlen);send(),recv()int send(int sockfd, const void msg, int len, int flags);————————————————————int recv(int sockfd, void buf, int len, int flags);send() 会返回实际有送出的 byte 数，可能会少与所要传送的数目。recv()若返回0，则说明远端那边已经关闭了你的连接close()close(sockfd);关闭socket。一个tcp server 的示例代码：#include <netdb.h> #include <netinet/in.h> #include <stdlib.h> #include <string.h> #include <sys/socket.h> #include <sys/types.h> #define MAX 80 #define PORT 8080 #define SA struct sockaddr // Function designed for chat between client and server. void func(int sockfd) { char buff[MAX]; int n; // infinite loop for chat for (;;) { bzero(buff, MAX); // read the message from client and copy it in buffer read(sockfd, buff, sizeof(buff)); // print buffer which contains the client contents printf(“From client: %s\t To client : “, buff); bzero(buff, MAX); n = 0; // copy server message in the buffer while ((buff[n++] = getchar()) != ‘\n’) ; // and send that buffer to client write(sockfd, buff, sizeof(buff)); // if msg contains “Exit” then server exit and chat ended. if (strncmp(“exit”, buff, 4) == 0) { printf(“Server Exit…\n”); break; } } } // Driver function int main() { int sockfd, connfd, len; struct sockaddr_in servaddr, cli; // socket create and verification sockfd = socket(AF_INET, SOCK_STREAM, 0); if (sockfd == -1) { printf(“socket creation failed…\n”); exit(0); } else printf(“Socket successfully created..\n”); bzero(&servaddr, sizeof(servaddr)); // assign IP, PORT servaddr.sin_family = AF_INET; servaddr.sin_addr.s_addr = htonl(INADDR_ANY); servaddr.sin_port = htons(PORT); // Binding newly created socket to given IP and verification if ((bind(sockfd, (SA)&servaddr, sizeof(servaddr))) != 0) { printf(“socket bind failed…\n”); exit(0); } else printf(“Socket successfully binded..\n”); // Now server is ready to listen and verification if ((listen(sockfd, 5)) != 0) { printf(“Listen failed…\n”); exit(0); } else printf(“Server listening..\n”); len = sizeof(cli); // Accept the data packet from client and verification connfd = accept(sockfd, (SA)&cli, &len); if (connfd < 0) { printf(“server acccept failed…\n”); exit(0); } else printf(“server acccept the client…\n”); // Function for chatting between client and server func(connfd); // After chatting close the socket close(sockfd); } client端一个tcp client的示例代码：// Write CPP code here #include <netdb.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/socket.h> #define MAX 80 #define PORT 8080 #define SA struct sockaddr void func(int sockfd) { char buff[MAX]; int n; for (;;) { bzero(buff, sizeof(buff)); printf(“Enter the string : “); n = 0; while ((buff[n++] = getchar()) != ‘\n’) ; write(sockfd, buff, sizeof(buff)); bzero(buff, sizeof(buff)); read(sockfd, buff, sizeof(buff)); printf(“From Server : %s”, buff); if ((strncmp(buff, “exit”, 4)) == 0) { printf(“Client Exit…\n”); break; } } } int main() { int sockfd, connfd; struct sockaddr_in servaddr, cli; // socket create and varification sockfd = socket(AF_INET, SOCK_STREAM, 0); if (sockfd == -1) { printf(“socket creation failed…\n”); exit(0); } else printf(“Socket successfully created..\n”); bzero(&servaddr, sizeof(servaddr)); // assign IP, PORT servaddr.sin_family = AF_INET; servaddr.sin_addr.s_addr = inet_addr(“127.0.0.1”); servaddr.sin_port = htons(PORT); // connect the client socket to server socket if (connect(sockfd, (SA)&servaddr, sizeof(servaddr)) != 0) { printf(“connection with the server failed…\n”); exit(0); } else printf(“connected to the server..\n”); // function for chat func(sockfd); // close the socket close(sockfd); } 通过TCP与简易内存数据库的数据交互为了让client端和server中的内存数据库，通信如果server端和client建立连接之后则进入一个大循环，大循环的退出条件是client端发去“EXIT”，client端随即断开连接。通过TCP和内存数据库通信的所以代码所有代码都在github里：slarsar/ttree_mmdb_tcp_server_client参考GeeksforGeeksBeej’s Guide to Network Programming 简体中文版 ...

T-Tree、T*-Tree的理解、实现与简单的内存数据库应用

章节目录T*-tree的介绍T*-tree节点与Ｃ语言实现T*-tree的插入、删除、查找与旋转实现简单的key-value内存数据库参考文献T-tree和T*-tree极为相似，他们的不同主要是T×-tree的节点结构比T-tree多了一个指向successor的指针位,指向successor的指针的存在是的树的寻找和遍历的时间复杂度.注：本文中关于ttree的head file ：ttree.h和ttree_defs.h来源于Dan Kruchinin <dkruchinin@acm.org>，Github：dkruchinin/libttreeT*-tree的介绍在计算机科学中，T-tree是一种二叉树，它有一个左子树和一个右子树，由主存储器数据库使用，例如Datablitz，EXtremeDB，MySQL Cluster，Oracle TimesTen和MobileLite。T树是一种平衡的索引树数据结构，针对索引和实际数据都完全保存在内存中的情况进行了优化，就像B树是针对面向块的辅助存储设备（如硬盘）上的存储而优化的索引结构。 T树寻求获得内存树结构（如AVL树）的性能优势，同时避免它们常见的大存储空间开销。T树不保留索引树节点本身内的索引数据字段的副本。相反，它们利用了这样的事实：实际数据总是与索引一起在主存储器中，因此它们只包含指向实际数据字段的指针。虽然T树以前被广泛用于主存数据库，但最近的研究表明它们在现代硬件上的表现并不比B树好。主要原因是现代的内存和硬盘的速度差异越来越大了，内存访问速度比硬盘访问速度快，并且CPU的核心缓存容量也越来越大。T*-tree的节点结构与C语言实现T树节点通常由指向父节点，左右子节点，有序数据指针数组和一些额外控制数据的指针组成。具有两个子树的节点称为内部节点（internal nodes），没有子树的节点称为叶子节点（leaf nodes），而仅具有一个子树的节点称为半叶节点（half-leaf nodes）。如果值在节点的当前最小值和最大值之间，则该节点称为（bounding node）。对于每个内部节点,它的左子树中会有一个叶子节点或半叶子节点中有一个predecessor（称为最大下限-GLB(Greatest Lower Bound)），还在右子树中包含其最大数据值的后继者（LUB-lower upper bound）的节点(包含GLB或LUP的节点或许距离参照内部节点的距离很远,但也有可能恰好相邻。正因为T-tree的每个节点是有序的，并且不同节点之间保证左子树的数据都比节点数据的最小值小，右子树的数据都比节点数据的最大值大，因此B-tree最左边的节点中最左边的数据是整棵树最小的数据，最右边的节点中的最大值是整棵树最大的数据。叶子和半叶节点中的数据元素量在1~最大元素量之间，内部节点将其占用保持在预定义的最小元素量和最大元素数之间。如图：T-tree，T-treenode的插入、删除、旋转和查找代码来自于:Github:dkruchinin/libttree typedef struct ttree_node { struct ttree_node *parent; /< Pointer to node’s parent */ struct ttree_node *successor; /< Pointer to node’s soccussor */ union { struct ttree_node *sides[2]; struct { struct ttree_node *left; /< Pointer to node’s left child */ struct ttree_node *right; /< Pointer to node’s right child */ }; }; union { uint32_t pad; struct { signed min_idx :12; /< Index of minimum item in node’s array */ signed max_idx :12; /< Index of maximum item in node’s array */ signed bfc :4; /< Node’s balance factor */ unsigned node_side :4; /< Node’s side(TNODE_LEFT, TNODE_RIGHT or TNODE_ROOT) / }; };T-tree的插入、删除、查找与旋转插入int ttree_insert(Ttree *ttree, void item){ TtreeCursor cursor; / * If the tree already contains the same key item has and * tree’s wasn’t allowed to hold duplicate keys, signal an error. */ if (ttree_lookup(ttree, ttree_item2key(ttree, item), &cursor) && ttree->keys_are_unique) { return -1; } ttree_insert_at_cursor(&cursor, item); return 0;}void ttree_insert_at_cursor(TtreeCursor *cursor, void *item){ Ttree *ttree = cursor->ttree; TtreeNode *at_node, *n; TtreeCursor tmp_cursor; void key; TTREE_ASSERT(cursor->ttree != NULL); //TTREE_ASSERT(cursor->state == CURSOR_PENDING); key = ttree_item2key(ttree, item); n = at_node = cursor->tnode; if (!ttree->root) { / The root node has to be created. / at_node = allocate_ttree_node(ttree); at_node->keys[first_tnode_idx(ttree)] = key; at_node->min_idx = at_node->max_idx = first_tnode_idx(ttree); ttree->root = at_node; tnode_set_side(at_node, TNODE_ROOT); ttree_cursor_open_on_node(cursor, ttree, at_node, TNODE_SEEK_START); return; } if (cursor->side == TNODE_BOUND) { if (tnode_is_full(ttree, n)) { / * If node is full its max item should be removed and * new key should be inserted into it. Removed key becomes * new insert value that should be put in successor node. */ void tmp = n->keys[n->max_idx–]; increase_tnode_window(ttree, n, &cursor->idx); n->keys[cursor->idx] = key; key = tmp; ttree_cursor_copy(&tmp_cursor, cursor); cursor = &tmp_cursor; / * If current node hasn’t successor and right child * New node have to be created. It’ll become the right child * of the current node. / if (!n->successor || !n->right) { cursor->side = TNODE_RIGHT; cursor->idx = first_tnode_idx(ttree); goto create_new_node; } at_node = n->successor; / * If successor hasn’t any free rooms, new value is inserted * into newly created node that becomes left child of the current * node’s successor. / if (tnode_is_full(ttree, at_node)) { cursor->side = TNODE_LEFT; cursor->idx = first_tnode_idx(ttree); goto create_new_node; } / * If we’re here, then successor has free rooms and key * will be inserted to one of them. */ cursor->idx = at_node->min_idx; cursor->tnode = at_node; } increase_tnode_window(ttree, at_node, &cursor->idx); at_node->keys[cursor->idx] = key; cursor->state = CURSOR_OPENED; return; }create_new_node: n = allocate_ttree_node(ttree); n->keys[cursor->idx] = key; n->min_idx = n->max_idx = cursor->idx; n->parent = at_node; at_node->sides[cursor->side] = n; tnode_set_side(n, cursor->side); cursor->tnode = n; cursor->state = CURSOR_OPENED; fixup_after_insertion(ttree, n, cursor);}删除void *ttree_delete(Ttree *ttree, void *key){ TtreeCursor cursor; void *ret; ret = ttree_lookup(ttree, key, &cursor); if (!ret) { return ret; } ttree_delete_at_cursor(&cursor); return ret;}void *ttree_delete_at_cursor(TtreeCursor *cursor){ Ttree *ttree = cursor->ttree; TtreeNode *tnode, n; void ret; TTREE_ASSERT(cursor->ttree != NULL); TTREE_ASSERT(cursor->state == CURSOR_OPENED); tnode = cursor->tnode; ret = ttree_key2item(ttree, tnode->keys[cursor->idx]); decrease_tnode_window(ttree, tnode, &cursor->idx); cursor->state = CURSOR_CLOSED; if (UNLIKELY(cursor->idx > tnode->max_idx)) { cursor->idx = tnode->max_idx; } / * If after a key was removed, T-tree node contains more than * minimum allowed number of items, the proccess is completed. / if (tnode_num_keys(tnode) > min_tnode_entries(ttree)) { return ret; } if (is_internal_node(tnode)) { int idx; / * If it is an internal node, we have to recover number * of items from it by moving one item from its successor. / n = tnode->successor; idx = tnode->max_idx + 1; increase_tnode_window(ttree, tnode, &idx); tnode->keys[idx] = n->keys[n->min_idx++]; if (UNLIKELY(cursor->idx > tnode->max_idx)) { cursor->idx = tnode->max_idx; } if (!tnode_is_empty(n) && is_leaf_node(n)) { return ret; } / * If we’re here, then successor is either a half-leaf * or an empty leaf. / tnode = n; } if (!is_leaf_node(tnode)) { int items, diff; n = tnode->left ? tnode->left : tnode->right; items = tnode_num_keys(n); / * If half-leaf can not be merged with a leaf, * the proccess is completed. / if (items > (ttree->keys_per_tnode - tnode_num_keys(tnode))) { return ret; } if (tnode_get_side(n) == TNODE_RIGHT) { / * Merge current node with its right leaf. Items from the leaf * are placed after the maximum item in a node. */ diff = (ttree->keys_per_tnode - tnode->max_idx - items) - 1; if (diff < 0) { memcpy(tnode->keys + tnode->min_idx + diff, tnode->keys + tnode->min_idx, sizeof(void *) * tnode_num_keys(tnode)); tnode->min_idx += diff; tnode->max_idx += diff; if (cursor->tnode == tnode) { cursor->idx += diff; } } memcpy(tnode->keys + tnode->max_idx + 1, n->keys + n->min_idx, sizeof(void ) * items); tnode->max_idx += items; } else { / * Merge current node with its left leaf. Items the leaf * are placed before the minimum item in a node. */ diff = tnode->min_idx - items; if (diff < 0) { register int i; for (i = tnode->max_idx; i >= tnode->min_idx; i–) { tnode->keys[i - diff] = tnode->keys[i]; } tnode->min_idx -= diff; tnode->max_idx -= diff; if (cursor->tnode == tnode) { cursor->idx -= diff; } } memcpy(tnode->keys + tnode->min_idx - items, n->keys + n->min_idx, sizeof(void ) * items); tnode->min_idx -= items; } n->min_idx = 1; n->max_idx = 0; tnode = n; } if (!tnode_is_empty(tnode)) { return ret; } / if we’re here, then current node will be removed from the tree. */ n = tnode->parent; if (!n) { ttree->root = NULL; free(tnode); return ret; } n->sides[tnode_get_side(tnode)] = NULL; fixup_after_deletion(ttree, tnode, NULL); free(tnode); return ret;}查找void *ttree_lookup(Ttree *ttree, void *key, TtreeCursor *cursor){ TtreeNode *n, *marked_tn, *target; int side = TNODE_BOUND, cmp_res, idx; void item = NULL; enum ttree_cursor_state st = CURSOR_PENDING; / * Classical T-tree search algorithm is O(log(2N/M) + log(M - 2)) * Where N is total number of items in the tree and M is a number of * items per node. In worst case each node on the path requires 2 * comparison(with its min and max items) plus binary search in the last * node(bound node) excluding its first and last items. * * Here is used another approach that was suggested in * “Tobin J. Lehman , Michael J. Carey, A Study of Index Structures for * Main Memory Database Management Systems”. * It reduces O(log(2N/M) + log(M - 2)) to true O(log(N)). * This algorithm compares the search * key only with minimum item in each node. If search key is greater, * current node is marked for future consideration. / target = n = ttree->root; marked_tn = NULL; idx = first_tnode_idx(ttree); if (!n) { goto out; } while (n) { target = n; cmp_res = ttree->cmp_func(key, tnode_key_min(n)); if (cmp_res < 0) side = TNODE_LEFT; else if (cmp_res > 0) { marked_tn = n; / mark current node for future consideration. / side = TNODE_RIGHT; } else { / ok, key is found, search is completed. / side = TNODE_BOUND; idx = n->min_idx; item = ttree_key2item(ttree, tnode_key_min(n)); st = CURSOR_OPENED; goto out; } n = n->sides[side]; } if (marked_tn) { int c = ttree->cmp_func(key, tnode_key_max(marked_tn)); if (c <= 0) { side = TNODE_BOUND; target = marked_tn; if (!c) { item = ttree_key2item(ttree, tnode_key_max(target)); idx = target->max_idx; st = CURSOR_OPENED; } else { / make internal binary search / struct tnode_lookup tnl; tnl.key = key; tnl.low_bound = target->min_idx + 1; tnl.high_bound = target->max_idx - 1; item = lookup_inside_tnode(ttree, target, &tnl, &idx); st = (item != NULL) ? CURSOR_OPENED : CURSOR_PENDING; } goto out; } } / * If we’re here, item wasn’t found. So the only thing * needs to be done is to determine the position where search key * may be placed to. If target node is not empty, key may be placed * to its min or max positions. */ if (!tnode_is_full(ttree, target)) { side = TNODE_BOUND; idx = ((marked_tn != target) || (cmp_res < 0)) ? target->min_idx : (target->max_idx + 1); st = CURSOR_PENDING; }out: if (cursor) { ttree_cursor_open_on_node(cursor, ttree, target, TNODE_SEEK_START); cursor->side = side; cursor->idx = idx; cursor->state = st; } return item;}旋转static void __rotate_single(TtreeNode **target, int side){ TtreeNode *p, *s; int opside = opposite_side(side); p = *target; TTREE_ASSERT(p != NULL); s = p->sides[side]; TTREE_ASSERT(s != NULL); tnode_set_side(s, tnode_get_side(p)); p->sides[side] = s->sides[opside]; s->sides[opside] = p; tnode_set_side(p, opside); s->parent = p->parent; p->parent = s; if (p->sides[side]) { p->sides[side]->parent = p; tnode_set_side(p->sides[side], side); } if (s->parent) { if (s->parent->sides[side] == p) s->parent->sides[side] = s; else s->parent->sides[opside] = s; } target = s;}/ * There are two cases of single rotation possible: * 1) Right rotation (side = TNODE_LEFT) * [P] [L] * / \ / \ * [L] x1 => x2 [P] * / \ / \ * x2 x3 x3 x1 * * 2) Left rotation (side = TNODE_RIHGT) * [P] [R] * / \ / \ * x1 [R] => [P] x2 * / \ / \ * x3 x2 x1 x3 */static void rotate_single(TtreeNode **target, int side){ TtreeNode *n; __rotate_single(target, side); n = (target)->sides[opposite_side(side)]; / * Recalculate balance factors of nodes after rotation. * Let X was a root node of rotated subtree and Y was its * child. After single rotation Y is new root of subtree and X is its child. * Y node may become either balanced or overweighted to the * same side it was but 1 level less. * X node scales at 1 level down and possibly it has new child, so * its balance should be recalculated too. If it still internal node and * its new parent was not overwaighted to the opposite to X side, * X is overweighted to the opposite to its new parent side, * otherwise it’s balanced. If X is either half-leaf or leaf, * balance racalculation is obvious. */ if (is_internal_node(n)) { n->bfc = (n->parent->bfc != side2bfc(side)) ? side2bfc(side) : 0; } else { n->bfc = !!(n->right) - !!(n->left); } (*target)->bfc += side2bfc(opposite_side(side)); TTREE_ASSERT((abs(n->bfc < 2) && (abs((target)->bfc) < 2)));}/ * There are two possible cases of double rotation: * 1) Left-right rotation: (side == TNODE_LEFT) * [P] [r] * / \ / \ * [L] x1 [L] [P] * / \ => / \ / \ * x2 [r] x2 x4 x3 x1 * / \ * x4 x3 * * 2) Right-left rotation: (side == TNODE_RIGHT) * [P] [l] * / \ / \ * x1 [R] [P] [R] * / \ => / \ / \ * [l] x2 x1 x3 x4 x2 * / \ * x3 x4 */static void rotate_double(TtreeNode **target, int side){ int opside = opposite_side(side); TtreeNode *n = (target)->sides[side]; __rotate_single(&n, opside); / * Balance recalculation is very similar to recalculation after * simple single rotation. */ if (is_internal_node(n->sides[side])) { n->sides[side]->bfc = (n->bfc == side2bfc(opside)) ? side2bfc(side) : 0; } else { n->sides[side]->bfc = !!(n->sides[side]->right) - !!(n->sides[side]->left); } TTREE_ASSERT(abs(n->sides[side]->bfc) < 2); n = n->parent; __rotate_single(target, side); if (is_internal_node(n)) { n->bfc = ((target)->bfc == side2bfc(side)) ? side2bfc(opside) : 0; } else { n->bfc = !!(n->right) - !!(n->left); } / * new root node of subtree is always ideally balanced * after double rotation. */ TTREE_ASSERT(abs(n->bfc) < 2); (*target)->bfc = 0;}static void rebalance(Ttree *ttree, TtreeNode **node, TtreeCursor *cursor){ int lh = left_heavy(*node); int sum = abs((*node)->bfc + (node)->sides[opposite_side(lh)]->bfc); if (sum >= 2) { rotate_single(node, opposite_side(lh)); goto out; } rotate_double(node, opposite_side(lh)); / * T-tree rotation rules difference from AVL rules in only one aspect. * After double rotation is done and a leaf became a new root node of * subtree and both its left and right childs are half-leafs. * If the new root node contains only one item, N - 1 items should * be moved into it from one of its childs. * (N is a number of items in selected child node). */ if ((tnode_num_keys(*node) == 1) && is_half_leaf((*node)->left) && is_half_leaf((*node)->right)) { TtreeNode n; int offs, nkeys; / * If right child contains more items than left, they will be moved * from the right child. Otherwise from the left one. */ if (tnode_num_keys((*node)->right) >= tnode_num_keys((node)->left)) { / * Right child was selected. So first N - 1 items will be copied * and inserted after parent’s first item. */ n = (*node)->right; nkeys = tnode_num_keys(n); (*node)->keys[0] = (*node)->keys[(*node)->min_idx]; offs = 1; (*node)->min_idx = 0; (*node)->max_idx = nkeys - 1; if (!cursor) { goto no_cursor; } else if (cursor->tnode == n) { if (cursor->idx < n->max_idx) { cursor->tnode = *node; cursor->idx = (node)->min_idx + (cursor->idx - n->min_idx + 1); } else { cursor->idx = first_tnode_idx(ttree); } } } else { / * Left child was selected. So its N - 1 items * (starting after the min one) * will be copied and inserted before parent’s single item. / n = (node)->left; nkeys = tnode_num_keys(n); (node)->keys[ttree->keys_per_tnode - 1] = (node)->keys[(node)->min_idx]; (node)->min_idx = offs = ttree->keys_per_tnode - nkeys; (node)->max_idx = ttree->keys_per_tnode - 1; if (!cursor) { goto no_cursor; } else if (cursor->tnode == n) { if (cursor->idx > n->min_idx) { cursor->tnode = node; cursor->idx = (node)->min_idx + (cursor->idx - n->min_idx); } else { cursor->idx = first_tnode_idx(ttree); } } n->max_idx = n->min_idx++; }no_cursor: memcpy((node)->keys + offs, n->keys + n->min_idx, sizeof(void ) * (nkeys - 1)); n->keys[first_tnode_idx(ttree)] = n->keys[n->max_idx]; n->min_idx = n->max_idx = first_tnode_idx(ttree); }out: if (ttree->root->parent) { ttree->root = node; }}实现简单的key-value内存数据库实现简单的key-value内存数据库,用hashtable来链接key-value的关系。key不光插入到ttree中，而且还存到hash-table中。hash_table采用了macro：hash-table（uthash.h）`uthash.h的帮助文档:macro：uthash.h帮助文档hashkey-value对的插入:插入之前先HASH_FIND_INT看看，key-value存不存在，如果不存在则可以插，存在的话不能插入。void add_user(int user_id, char name) { struct my_struct s; HASH_FIND_INT(users, &user_id, s); / id already in the hash? / if (s==NULL) { s = (struct my_struct )malloc(sizeof s); s->id = user_id; HASH_ADD_INT( users, id, s ); / id: name of key field / } strcpy(s->name, name);}解析存有key-value格式的文件找到一个key-value格式的实例:xlarge.delfopen()读取文件,读完之后 fclose()关闭。因为待会儿要用strtok来拆开每一行，所以malloc个file_line FILE * fp; fp = fopen("/home/vory/programing/c/key_value_mmdb/xlarge.del",“r”); file_line = malloc(1000 * sizeof(char)); memset(file_line, 0, 1000 * sizeof(char)); …… fclose(fp); fgets获取每一行: char * buf; buf = malloc(sizeof(input)); memset(buf,0,sizeof(input)); while(fgets(input ,256,fp)){ strcpy(buf,input); …… }strtok切割每一行为若干字符串。strtok将目标字符串的符合delim中的元素全部替换为'0’,strtok用了之后，原来的目标代码就被破坏了，因此，新malloc一个接受复制的字符串，将目标字符串strcpy()之后，对复制的字符串进行strtok()操作。用完之后free()。 strcpy(buf,source_string); token = strtok(buf,delim);// printf("%s\n",token); parameter[parametercount] = malloc(sizeof(input)); strcpy(parameter[parametercount],token); parametercount++; token = strtok(NULL,delim);// printf("%s\n",token); parameter[parametercount] = malloc(sizeof(input)); strcpy(parameter[parametercount],token);　实例的xlarge.del 文件的内容大概是这样：每一行分成两部分，KEY 和VALUE被逗号隔开着。41231234,“Teenage Caveman"3061234,“Banger Sisters, The"18861234,“Hope Floats"29381234,“No Looking Back"1288,“Erotic Confessions: Volume 8"2954,“Normal Life"43901234,“Utomlyonnye solntsem"20801234,“Island of Dr. Moreau, The"3019,“One Hell of a Guy"6712341,“Adventures of Pluto Nash, The"33031234,“Pronto"34701234,“Ripper, The"106612341,“Devotion"39481234,“Starship Troopers"32381234,“Polish Wedding"30551234,“Oscar and Lucinda"42391,“Tomcats"1661123411,“Gojira ni-sen mireniamu"10611234,“Devil in a Blue Dress"61612341,“Bully"102612341,“Defenders: Taking the First, The"1650,“Go Fish"43512341,“Black Rose of Harlem"解析从文件读来的每一行：每一行最多解析为２个参数([KEY] [VALUE])。void parse_file(char * string){ char * buf; char * delim; delim = NULL; delim = “,”; char * token; token = NULL; buf = malloc(1000sizeof(char)); memset(buf,0, 1000sizeof(char)); if (!parf0){ parf0 =malloc(500sizeof(char)); } memset(parf0,0, 500sizeof(char)); if (!parf1){ parf1 =malloc(500sizeof(char)); } memset(parf1,0, 500sizeof(char)); strcpy(buf, string); token = strtok(buf, delim); if(token != NULL) { strcpy(parf0, token); } token = strtok(NULL, delim); if (token != NULL){ strcpy(parf1, token); } free(buf);}strtol将字符型的数据转换成long int型:long int strtol(const char nptr,char endptr,int base);strtol不仅可以识别十进制整数，还可以识别其它进制的整数，取决于base参数，base为10，则识别十进制整数。 all_items[bcount].key = strtol(parf0,NULL ,10); bcount++; hash_user_id = strtol(parf0,NULL ,10); strcpy(hash_name,parf1);解析从命令行输入的命令（[COMMANDS] [KEY] [VALUE])从文件读取每一行是 [KEY] [VALUE]的格式，但是从命令行读取是[COMMANDS] [KEY] [VALUE]的格式。我将copy_string,par1，par2，par3定义在了别处。这样就可以解析从stdin输入的句子了，被空格隔开的句子最多分为发出3路给par1，par2，par3。如果stdin输入的句子只包含了一个空格（也就是含有[COMMANDS] [KEY]的结构）则只能被分发为2路给par1和par2.malloc()之后要free(),我在别处free() 了。void parseinput(char string){ char * delim; delim = " “; copy_string = malloc(100sizeof(char)); memset(copy_string,0,100 sizeof(char)); char * token; token = NULL; par1 = malloc(50sizeof(char)); par2 = malloc(50sizeof(char)); par3 = malloc(50sizeof(char)); memset(par1,0,50sizeof(char)); memset(par2,0,50sizeof(char)); memset(par3,0,50sizeof(char)); strcpy(copy_string,string); printf("%s is copystring .\n “,copy_string); printf("%s is string . \n”,string); token = strtok(copy_string,delim); if (token != NULL){ printf("%s is token1 \n”,token); strcpy(par1,token); } token = strtok(NULL,delim); if (token != NULL){ printf("%s is token2 \n”,token); strcpy(par2,token); } token = strtok(NULL,delim); if (token != NULL){ printf("%s is token3 \n”,token); strcpy(par3,token); } free(copy_string);}初始化T-tree#define ttree_init(ttree, num_keys, is_unique, cmpf, data_struct, key_field) _ttree_init(ttree, num_keys, is_unique, cmpf, offsetof(data_struct, key_field))int __ttree_init(Ttree ttree, int num_keys, bool is_unique, ttree_cmp_func_fn cmpf, size_t key_offs);……….. ret = ttree_init(&ttree, 8, false, __cmpfunc, struct item, key); if (ret < 0) { fprintf(stderr, “Failed to initialize T-tree. [ERR=%d]\n”, ret); free(all_items); exit(EXIT_FAILURE); }将读取的每一行插入ttree，并将key-value插入hashtable在一个循环中解析每一行，当真个文件的所有行都读完则跳出循环。 while (fgets(file_line, 1000, fp)) { parse_file(file_line); all_items[bcount].key = strtol(parf0, NULL, 10); hash_name = malloc(500 * sizeof(char)); memset(hash_name, 0, 500 * sizeof(char)); hash_user_id = strtol(parf0, NULL, 10); strcpy(hash_name, parf1); s = find_user(hash_user_id); if (s == NULL) { add_user(hash_user_id, hash_name); } free(hash_name); memset(file_line, 0, 1000 * sizeof(char)); } for (i = 0; i < num_keys; i++) { ret = ttree_insert(&ttree, &all_items[i]); if (ret < 0) { fprintf(stderr, “Failed to insert item %d with key %ld! [ERR=%d]\n”, i, all_items[i].key, ret); free(all_items); exit(EXIT_FAILURE); } } 打印出ttree的所有key for (i = 0; i < num_keys; i++) { printf("%ld “, all_items[i].key); }给ttree的所有key排序从小到大排序，递归实现。 printf("\nSorted keys:\n”); printf(”{ “); tnode = ttree_node_leftmost(ttree.root); while (tnode) { tnode_for_each_index(tnode, i) { printf("%d “, (int ) tnode_key(tnode, i)); } tnode = tnode->successor; } printf(”}\n”);程序结束前free()，释放内存空间ttree_destroy(&ttree);free(all_items);附件&代码github代码所有的代码参考文献[1].Tobin J. Lehman and Michael J. Carey. 1986. A Study of Index Structures for Main Memory Database Management Systems. In Proceedings of the 12th International Conference on Very Large Data Bases (VLDB ‘86), Wesley W. Chu, Georges Gardarin, Setsuo Ohsuga, and Yahiko Kambayashi (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 294-303.[2].Kong-Rim Choi and Kyung-Chang Kim, “T-tree: a main memory database index structure for real time applications,” Proceedings of 3rd International Workshop on Real-Time Computing Systems and Applications, Seoul, South Korea, 1996, pp. 81-88.doi: 10.1109/RTCSA.1996.554964[3].wikipidia about T-tree[4].An Open-source T-tree Library ...

PHP面试常考内容之Memcache和Redis（1）

你好，是我琉忆。继上周（2019.2-11至2-15）发布的“PHP面试常考内容之面向对象”专题后，发布的第二个专题，感谢你的阅读。本周（2019.2-18至2-22）的文章内容点为以下几点，更新时间为每周一三五，可以关注本栏持续关注，感谢你的支持。一、什么是Memcache？二、Memcache有什么特征？三、Memcache的内存管理机制是什么样的？四、Memcache和Memcached有什么区别？五、如何操作Memcache？六、如何使用Memcache做Session共享？七、什么是Redis？八、如何使用Redis？九、使用Redis需要注意哪些问题？十、Memcache和Redis常考的面试题本章节的内容将会被分为三篇文章进行讲解完整块内容，第一篇主要讲解一到六，第二篇主要讲解七到九，第三篇围绕第十点。以下正文的部分内容来自《PHP程序员面试笔试宝典》书籍，如果转载请保留出处：一、什么是Memcache？Memcache是一个开源、免费、高性能的分布式对象缓存系统，它基于一个存储键/值对的hashmap来存储数据到内存中。它的作用是减少对数据库的读取以提高Web应用的性能。虽然它的守护进程（daemon ）是用 C语言实现的，但是客户端可以用任何语言来编写，并通过Memcache协议与守护进程通信。需要注意的是，当某个服务器停止运行或崩溃了，所有存放在该服务器上的键/值对都将丢失。Memcache的服务器端没有提供分布式功能，各个Memcache应用不会互相通信以共享信息。想要实现分布式通信，可以搭建几个Memcache应用，通过算法实现此效果。下面介绍Memcache的两个重要概念。slab：为了防止内存碎片化，Memcache服务器端会预先将数据空间划分为一系列slab；举个例子，现在有一个100m3的房间，为了合理规划这个房间放置东西，会在这个房间里放置30个1m3的盒子、20个1.25m3的盒子、15个1.5m3的盒子……这些盒子就是slab。LRU：最近最少使用算法；当同一个slab的格子满了，这时需要新加一个值时，不会考虑将这个新数据放到比当前slat更大的空闲slab，而是使用LRU移除旧数据，放入这个新数据。二、Memcache有什么特征？Memcache的特征如下：1）协议简单。2）基于libevent的事件处理。3）内置内存存储方式。4）Memcached不互相通信的分布式。Memcache的特性如下：（1）单个item 最大的数据为1MB。（2）单进程最大的使用内存为2GB，需要更多内存时可开多个端口。（3）Memcached是多线程，非阻塞io复用的网络模型，Redis是单线程。（4）键长最大为250字节。三、Memcache的内存管理机制是什么样的？Memcache将内存分割成各种尺寸的块（chunk），并把尺寸相同的块分成组（chunk的集合）。page是分配给slab的内存空间，默认是1MB，根据slab大小切分成chunk，chunk是用户缓存记录的内存空间，slab class是特定chunk的组。在存储数据时，Memcache会压缩数据的大小进行存储，一般压缩后的数据为原数据大小的30%左右，从而节省了70%的传输性能消耗。如果存储的数是小于100字节的键值对，那么压缩后可能带来膨胀，但Memcache都是按照固定大小分块存储的，最小也有88B，所以小数据带来的压缩膨胀也并不会有什么影响。具体流程如下图所示。自己整理了一篇“如何解决Memcache的缓存雪崩现象？”的文章，关注公众号：“琉忆编程库”，回复：“me”，我发给你。四、Memcache和Memcached有什么区别？其实 memcache 和 memcached 说的是 PHP 的扩展。Memcache是一个自由和开放源代码、高性能、分配的内存对象缓存系统。用于加速动态web应用程序，减轻数据库负载。它可以应对任意多个连接，使用非阻塞的网络IO。由于它的工作机制是在内存中开辟一块空间，然后建立一个Hash表，Memcached自管理这些Hash表。Memcache是该系统的项目名称，Memcached是该系统的主程序文件（字母d可以理解为daemon），以守护程序方式运行于一个或多个服务器中，随时接受客户端的连接操作，使用共享内存存取数据。Memcached处理的原子是每一个（key，value）对（以下简称kv对），key会通过一个hash算法转化成hash-key，便于查找、对比以及做到尽可能的散列。同时，memcached用的是一个二级散列，通过一张大hash表来维护。Memcached有两个核心组件组成：服务端（Server）和客户端（Client），在一个memcached的查询中，Client先通过计算key的hash值来确定kv对所处在的Server位置。当Server确定后，客户端就会发送一个查询请求给对应的Server，让它来查找确切的数据。因为这之间没有交互以及多播协议，所以 memcached交互带给网络的五、如何操作Memcache？Memcached的操作命令可以分为存储命令、查找命令、统计命令等三大类。第一类，Memcached的存储命令：第二类，Memcached的查找命令：第三类，Memcached的统计命令：使用示例代码参考：<?php $m = new Memcached(); $m->addServer(“127.0.0.1”,11211); //连接Memcache服务器 $m->set(“mkey”,“123”,600);//设置一个键名为mkey，值为123，600s过期，0为永久有效 echo $m->get(“mkey”);//输出123,get可以获取键名为mkey的值 $m->delete(“mkey”);//删除键名为mkey的值?>具体全部的使用方法不在此一一罗列，可根据上面的方法表和参考示例进行模仿使用。六、如何使用Memcache做Session共享？默认状态下，PHP使用文件方式做Session存储，磁盘的I/O性能肯定没有内存中存取快，因此改用memcache来存储Session在读写速度上会快很多，而且在多台服务器集群时，使用memcahced能够有效地解决Session共享的问题。我们配置好memcached服务器后，修改php.ini配置文件的代码：session.save_handler = memcachesession.save_path = “tcp://127.0.0.1:11211”也可以使用在网站目录下放置Apache的.htaccess文件：php_value session.save_handler “memcache”php_value session.save_path “tcp://127.0.0.1:11211”执行PHP脚本时写入以下两行：ini_set(“session.save_handler”,”memcache”);ini_set(“session.save_path”,”tcp://127.0.0.1:11211”);可以在使用多个memcached服务器时用都好隔开，还可以额外带参数：persistent、weight、timeout、retry_interval等。例如：$session_save_path = “tcp://<memcache_server1>:11211?persistent=1&weight=1&timeout=1&retry_interval=15,udp://<memcache_server2>:11211”;如果想使用udp支持，需要确保你的memcached服务器启用，也确保Web服务器连接到memcache服务器端口正常。重新启动Apache后，将开始使用的memcache存储PHP的Session。测试案例：<?php Session_start(); $_SESSION[‘test’] = time(); echo $_SESSION[‘test’].”<br>”; echo session_id();?>测试memcache是否已存储成功，通过sessionid去取对应的数据：$m = memcache_connect(‘localhost’,11211);echo $m->get(“13765595df9dsa8f9dsa8fa9sf8saf865”);得到结果：17559898989如果使用memcached作为Session存储，要确保memcached的服务正常工作，否则Session相关功能将不起作用，这样PHP的处理就多了一层外面的依赖。因为memcached是使用内存的，这样当用户量比较大时，就可能由于内存方面原因导致Session时长上的问题，Session的实际失效时长达不到设定的失效时长（由memcached在内存不够时的处理机制决定）。Memcache可扩展考察的内容有很多，由于整理的篇幅内容很多，在此只是罗列了普遍遇到的Memcache相关的知识考点。额外的考点还有：（1）Memcache的工作流程是什么样的？（2）Memcache如何实现分布式？（3）Memcache的分布式算法有哪些？（4）使用Memcache有哪些技术限制？（5）Memcache和Redis有什么相同和区别的地方？更多相关面试知识点可以阅读《PHP程序员面试笔试宝典》。预告：PHP面试常考内容之Memcache和Redis（2）将于本周三（2019.2-20）更新。以上内容摘自《PHP程序员面试笔试宝典》书籍，该书已在天猫、京东、当当等电商平台销售。更多PHP相关的面试知识、考题可以关注公众号获取：琉忆编程库对本文有什么问题或建议都可以进行留言，我将不断完善追求极致，感谢你们的支持。