About flask: designing a flask-based, high-concurrency, highly available HTTP service for IP lookup

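The basic architecture is flask + gunicorn + load balancing, where load balancing is provided either by Alibaba Cloud's hardware load balancing service or by nginx acting as a software load balancer. gunicorn is managed by supervisor.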

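Architecture diagram using nginx software load balancing

Architecture diagram using the Alibaba Cloud hardware load balancing service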

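Because the flask app has to keep the IP tree and the country, province, and city dictionaries in memory, it uses a lot of memory. One gunicorn worker takes about 300M and nginx's 4 workers take little (under 100M), so the whole deployment takes about 1.3G, which is consistent with four gunicorn workers (4 x 300M plus nginx) and calls for a server with 2G of memory. When any one gunicorn node goes down or is degraded, the other node keeps serving, so the overall service is not affected.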


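An IP library (also called an IP address database) is compiled by specialist technicians over a long period using a variety of technical means, and is updated, maintained, and supplemented by specialists on an ongoing basis.

IP database parsing and lookup code

Implemented with a binary search tree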

import os
import struct
import sys
from socket import inet_aton, inet_ntoa


def _unpack_V(b):
    """Decode 4 bytes as a little-endian unsigned 32-bit integer."""
    return struct.unpack("<L", b)


def _unpack_N(b):
    """Decode 4 bytes as a big-endian unsigned 32-bit integer."""
    return struct.unpack(">L", b)


class IpTree:
    def __init__(self):
        self.ip_dict = {}               # first octet -> search tree for that octet
        self.country_codes = {}         # country name -> country code
        self.china_province_codes = {}  # province name -> province code
        self.china_city_codes = {}      # city name -> city code

    def load_country_codes(self, file_name):
        # Column 0 is the country name, column 1 the country code.
        try:
            path = os.path.abspath(file_name)
            with open(path, encoding="utf-8") as f:
                for line in f:
                    data = line.rstrip("\r\n").split("\t")
                    self.country_codes[data[0]] = data[1]
        except Exception as ex:
            print("cannot load file %s: %s" % (file_name, ex), file=sys.stderr)
            sys.exit(1)

    def load_china_province_codes(self, file_name):
        # Column 0 is the province code, column 2 the province name.
        try:
            path = os.path.abspath(file_name)
            with open(path, encoding="utf-8") as f:
                for line in f:
                    data = line.rstrip("\r\n").split("\t")
                    self.china_province_codes[data[2]] = data[0]
        except Exception as ex:
            print("cannot load file %s: %s" % (file_name, ex), file=sys.stderr)
            sys.exit(1)

    def load_china_city_codes(self, file_name):
        # Column 0 is the city code, column 3 the city name.
        try:
            path = os.path.abspath(file_name)
            with open(path, encoding="utf-8") as f:
                for line in f:
                    data = line.rstrip("\r\n").split("\t")
                    self.china_city_codes[data[3]] = data[0]
        except Exception as ex:
            print("cannot load file %s: %s" % (file_name, ex), file=sys.stderr)
            sys.exit(1)

    def _make_entry(self, ip, raw_content):
        """Build an (ip, fields) pair from one tab-separated content record."""
        contents = raw_content.decode("utf-8").split("\t")
        return (ip, (contents[0], self.lookup_country_code(contents[0]),
                     contents[1], self.lookup_china_province_code(contents[1]),
                     contents[2], self.lookup_china_city_code(contents[2]),
                     contents[3], contents[4]))

    def loadfile(self, file_name):
        try:
            path = os.path.abspath(file_name)
            with open(path, "rb") as f:
                local_binary0 = f.read()
            # The first 4 bytes (big-endian) give the offset where the index area ends.
            local_offset, = _unpack_N(local_binary0[:4])
            local_binary = local_binary0[4:local_offset]
            # The first 1024 bytes of the index hold one entry number per first
            # octet. Buckets 254 down to 0 are loaded; bucket 255 has no
            # following entry to mark its end and is skipped.
            for ipdot0 in range(254, -1, -1):
                lis = []
                middle_ip = None
                middle_content = None
                begin_offset = ipdot0 * 4
                end_offset = (ipdot0 + 1) * 4
                start_index, = _unpack_V(local_binary[begin_offset:begin_offset + 4])
                start_index = start_index * 8 + 1024
                end_index, = _unpack_V(local_binary[end_offset:end_offset + 4])
                end_index = end_index * 8 + 1024
                # Each 8-byte record: 4-byte IP, 3-byte content offset, 1-byte content length.
                while start_index < end_index:
                    content_offset, = _unpack_V(local_binary[start_index + 4:start_index + 7] + b"\x00")
                    content_length = local_binary[start_index + 7]
                    content_offset = local_offset + content_offset - 1024
                    content = local_binary0[content_offset:content_offset + content_length]
                    # Consecutive records with identical content are merged into one node.
                    if middle_content is not None and middle_content != content:
                        lis.append(self._make_entry(middle_ip, middle_content))
                    middle_content = content
                    middle_ip = inet_ntoa(local_binary[start_index:start_index + 4])
                    start_index += 8
                # Emit the final run of the bucket, which the loop above never reaches.
                if middle_content is not None:
                    lis.append(self._make_entry(middle_ip, middle_content))
                self.ip_dict[ipdot0] = self.generate_tree(lis)
        except Exception as ex:
            print("cannot load file %s: %s" % (file_name, ex), file=sys.stderr)
            sys.exit(1)

    def lookup_country(self, country_code):
        # Reverse lookup: country code -> (country name, country code)
        for item_country, item_country_code in self.country_codes.items():
            if country_code == item_country_code:
                return item_country, item_country_code
        return 'None', 'None'

    def lookup_country_code(self, country):
        return self.country_codes.get(country, 'None')

    def lookup_china_province(self, province_code):
        for item_province, item_province_code in self.china_province_codes.items():
            if province_code == item_province_code:
                return item_province, item_province_code
        return 'None', 'None'

    def lookup_china_province_code(self, province):
        return self.china_province_codes.get(province, 'None')

    def lookup_china_city(self, city_code):
        for item_city, item_city_code in self.china_city_codes.items():
            if city_code == item_city_code:
                return item_city, item_city_code
        return 'None', 'None'

    def lookup_china_city_code(self, city):
        return self.china_city_codes.get(city, 'None')

    def lookup(self, ip):
        ipdot = ip.split('.')
        if len(ipdot) != 4:
            return None
        ipdot0 = int(ipdot[0])
        if ipdot0 < 0 or ipdot0 > 255:
            return None
        d = self.ip_dict.get(ipdot0)
        if d is None:
            return None
        return self.lookup1(inet_aton(ip), d)

    def lookup1(self, net_ip, node):
        # Packed 4-byte addresses compare in the same order as their numeric values.
        net_ip1, content, lefts, rights = node
        if net_ip < net_ip1:
            if lefts is None:
                return content
            return self.lookup1(net_ip, lefts)
        elif net_ip > net_ip1:
            if rights is None:
                return content
            return self.lookup1(net_ip, rights)
        else:
            return content

    def generate_tree(self, ip_list):
        # Split the sorted list in half; the last element of the left half becomes
        # the node key. The depth is about log2(len(ip_list)), so the default
        # recursion limit is ample.
        length = len(ip_list)
        if length > 1:
            lefts = ip_list[:length // 2]
            rights = ip_list[length // 2:]
            (ip, content) = lefts[length // 2 - 1]
            return inet_aton(ip), content, self.generate_tree(lefts), self.generate_tree(rights)
        elif length == 1:
            (ip, content) = ip_list[0]
            return inet_aton(ip), content, None, None
        else:
            return None


if __name__ == "__main__":
    ip_tree = IpTree()
    # The code tables must be loaded before the IP data, which refers to them.
    ip_tree.load_country_codes("doc/country_list.txt")
    ip_tree.load_china_province_codes("doc/china_province_code.txt")
    ip_tree.load_china_city_codes("doc/china_city_code.txt")
    ip_tree.loadfile("doc/mydata4vipday2.dat")
    print(ip_tree.lookup('123.12.23.45'))
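
To make the lookup rule concrete: each node stores the last address of a range, and a query returns the first range whose last address is not below the queried IP. The sketch below applies the same rule with a sorted list and `bisect` from the standard library in place of the hand-built tree. This is a swapped-in technique for illustration only, not the code used by the service, and the sample ranges and labels are made up.

```python
import bisect
import socket
import struct

# Hypothetical sample data: (last address of the range, label), sorted by address.
SAMPLE_RANGES = [
    ("10.255.255.255", "range-a"),
    ("123.12.255.255", "range-b"),
    ("123.255.255.255", "range-c"),
]


def ip_to_int(ip):
    """Convert a dotted-quad IPv4 string to an unsigned 32-bit integer."""
    return struct.unpack(">L", socket.inet_aton(ip))[0]


RANGE_ENDS = [ip_to_int(ip) for ip, _ in SAMPLE_RANGES]
LABELS = [label for _, label in SAMPLE_RANGES]


def range_lookup(ip):
    """Return the label of the first range whose last address is >= ip, or None."""
    index = bisect.bisect_left(RANGE_ENDS, ip_to_int(ip))
    if index == len(RANGE_ENDS):
        return None  # ip lies beyond the last known range
    return LABELS[index]


print(range_lookup("123.12.23.45"))
```

With this sample data, 123.12.23.45 falls into the range that ends at 123.12.255.255.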

GET and POST requests for the IP lookup service

from flask import Flask, jsonify, request

ip_app = Flask(__name__)

# InvalidUsage (an exception carrying an HTTP status code), is_ip (an IP
# validator) and ip_tree (a loaded IpTree instance) are defined elsewhere in
# the application and are not shown in this post.


def lookup_response(ip):
    """Validate the address, look it up in the IP tree and build the JSON response."""
    if not is_ip(ip):
        raise InvalidUsage('{} is not an IP address'.format(ip), status_code=400)
    try:
        res = ip_tree.lookup(ip)
    except Exception as e:
        raise InvalidUsage('internal error: {}'.format(e), status_code=500)
    if res is None:
        raise InvalidUsage('no ip info in ip db for ip: {}'.format(ip), status_code=501)
    return jsonify(res)


@ip_app.route('/api/ip_query', methods=['POST'])
def ip_query():
    try:
        ip = request.json['ip']
    except (KeyError, TypeError) as e:
        # TypeError covers request.json being None when the body was not sent as JSON.
        raise InvalidUsage('bad request: no key ip in your request json body. {}'.format(e), status_code=400)
    return lookup_response(ip)


@ip_app.route('/api/ip_query', methods=['GET'])
def ip_query_get():
    # request.values.get returns None instead of raising when the parameter is missing.
    ip = request.values.get('ip')
    if ip is None:
        raise InvalidUsage('bad request: no param ip in your request.', status_code=400)
    return lookup_response(ip)
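
The routes rely on two helpers that the post does not show, `InvalidUsage` and `is_ip`. A minimal sketch of what they might look like follows, modelled on the custom API exception pattern from the Flask documentation. The class body, the `to_dict` method, and the validation rule are assumptions for illustration, not the author's code.

```python
import ipaddress


class InvalidUsage(Exception):
    """Exception carrying an HTTP status code and a message for the client."""

    status_code = 400

    def __init__(self, message, status_code=None, payload=None):
        super().__init__(message)
        self.message = message
        if status_code is not None:
            self.status_code = status_code
        self.payload = payload

    def to_dict(self):
        """Return the response body as a plain dict."""
        body = dict(self.payload or {})
        body["message"] = self.message
        return body


def is_ip(value):
    """Return True only for a dotted-quad IPv4 string such as '165.118.213.9'."""
    if not isinstance(value, str):
        return False
    try:
        ipaddress.IPv4Address(value)
    except ValueError:
        return False
    return True
```

In the Flask app, such an exception would typically be turned into a JSON response by a handler registered with `@ip_app.errorhandler(InvalidUsage)` that returns `jsonify(error.to_dict())` together with `error.status_code`.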

A POST request must carry a JSON field like the following in its request body:

{"ip": "165.118.213.9"}

A GET request has the form: http://127.0.0.1:5000/api/ip_query?ip=165.118.213.9
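
As an illustration of the two request forms, the following sketch builds both requests with Python's standard `urllib`. The base URL is the local address from the GET example above, and the helper names are made up for this sketch. Actually sending a request requires the service to be running.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:5000/api/ip_query"  # local address from the GET example


def build_get_request(ip):
    """Build a GET request that passes the IP as a query parameter."""
    query = urllib.parse.urlencode({"ip": ip})
    return urllib.request.Request(BASE_URL + "?" + query, method="GET")


def build_post_request(ip):
    """Build a POST request that carries the IP in a JSON body."""
    body = json.dumps({"ip": ip}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(BASE_URL, data=body, headers=headers, method="POST")


def send(req, timeout=5):
    """Send a prepared request and decode the JSON response (service must be up)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `send(build_post_request("165.118.213.9"))` would return the decoded lookup result once the service is listening.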

Installing dependencies

The dependencies in requirements.txt are as follows:

certifi==2017.7.27.1
chardet==3.0.4
click==6.7
Flask==0.12.2
gevent==1.1.1
greenlet==0.4.12
gunicorn==19.7.1
idna==2.5
itsdangerous==0.24
Jinja2==2.9.6
locustio==0.7.5
MarkupSafe==1.0
meld3==1.0.2
msgpack-python==0.4.8
requests==2.18.3
supervisor==3.3.3
urllib3==1.22
Werkzeug==0.12.2

To install: pip install -r requirements.txt

Configuring supervisor

vim /etc/supervisor/conf.d/ip_query_http_service.conf, with the following content:

[program:ip_query_http_service]
directory = /root/qk_python/ip_query
command = gunicorn -w10 -b0.0.0.0:8080 ip_query_app:ip_app --worker-class gevent
autostart = true
startsecs = 5
autorestart = true
startretries = 3
user = root
stdout_logfile=/root/qk_python/ip_query/log/gunicorn.log
stderr_logfile=/root/qk_python/ip_query/log/gunicorn.err

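After adding this content, create the directory that holds stdout_logfile and stderr_logfile (here /root/qk_python/ip_query/log), otherwise supervisor reports an error on startup. Then update supervisor so that it starts the ip_query_http_service process.

# Start supervisor
supervisord -c /etc/supervisor/supervisord.conf

# Reload the supervisor configuration
supervisorctl update

For common supervisor operations, see the references at the end.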
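
Creating the log directory can be scripted so the step is not forgotten. The sketch below reads a supervisor program file with the standard `configparser` module and creates the parent directory of every `stdout_logfile` and `stderr_logfile` it finds. The function name is made up for this sketch, and it assumes the paths are plain absolute paths without supervisor's `%(...)s` expansions.

```python
import configparser
import os


def ensure_log_dirs(conf_path):
    """Create the parent directories of the log files named in a supervisor conf."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(conf_path)
    created = set()
    for section in parser.sections():
        for key in ("stdout_logfile", "stderr_logfile"):
            log_file = parser.get(section, key, fallback=None)
            if not log_file:
                continue
            log_dir = os.path.dirname(log_file)
            if log_dir:
                os.makedirs(log_dir, exist_ok=True)
                created.add(log_dir)
    return sorted(created)
```

It would be run once before `supervisorctl update`, for example as `ensure_log_dirs("/etc/supervisor/conf.d/ip_query_http_service.conf")`.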

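Installing nginx

The software load balancing setup requires nginx. For compiling and installing nginx, see the references at the end.

Configuring nginx

vim /usr/local/nginx/nginx.conf and change the configuration file to the following: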

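#user  nobody;
# Number of nginx worker processes; setting it to the total number of CPU cores is recommended.
worker_processes  4;
#error_log  logs/error.log;
#error_log  logs/error.log  notice;
# Global error log and its level: [debug | info | notice | warn | error | crit]
error_log  logs/error.log  info;
# PID file
pid        logs/nginx.pid;
# Maximum number of file descriptors one nginx process may open. In theory this is the
# system limit (ulimit -n) divided by the number of nginx processes, but nginx does not
# distribute requests evenly, so keeping it equal to ulimit -n is recommended.
worker_rlimit_nofile 65535;
events {
    # Event model; use epoll on Linux
    use epoll;
    # Maximum connections per worker process (total maximum = worker_connections * worker_processes)
    worker_connections  65535;
}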

http {
    include       mime.types;
    default_type  application/octet-stream;
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';
    access_log  logs/access.log  main;
    sendfile        on;
    #keepalive_timeout  0;
    keepalive_timeout  65;
    tcp_nopush on; # helps avoid network congestion
    tcp_nodelay on; # helps avoid network congestion
    #gzip  on;
    server {
        # Proxy port on which nginx accepts client connections for the service.
        listen       9000;
        server_name  localhost;
        #charset koi8-r;
        #access_log  logs/host.access.log  main;
        location / {
            #            root   html;
            #            index  index.html index.htm;
            proxy_pass http://127.0.0.1:8080; # the port gunicorn binds to in the supervisor command
            proxy_redirect off;
            proxy_set_header X-Real-IP $remote_addr;
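            # Backend web servers can read the client's real IP from X-Forwarded-For
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Host $host;
            client_max_body_size 10m; # maximum size of a client request body
            client_body_buffer_size 128k; # buffer size for reading the client request body
            proxy_buffer_size 4k; # buffer nginx uses for the header part of the upstream response
            proxy_temp_file_write_size 64k; # amount written to a temporary file at a time when an upstream response does not fit the buffers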
        }

        #error_page  404              /404.html;
        # redirect server error pages to the static page /50x.html
        #
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {root   html;}       
    }
}

Choosing the right tool is the first step in stress testing. Of the tools below, jmeter is mostly run on Windows machines; the others are best run on *nix machines.

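Tool	Pros and cons	Recommendation
ApacheBench (ab)	Simple commands, efficient, thorough statistics, low memory pressure on the load-generating machine	Recommended
locust	Written in Python, inefficient, limited by the GIL, requires writing Python test scripts	Not recommended
wrk	Simple commands, efficient, concise statistics, few pitfalls, few errors	Most recommended
jmeter	Java-based, Apache open source, graphical interface, easy to operate	Recommended
webbench	Simple to use, but does not support POST requests	Average
tsung	Written in Erlang, many configuration templates, fairly complex	Not recommended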

I have used all six tools myself. The following briefly covers installing and using ab, wrk, and jmeter; for the other tools, search online as needed.

Installing ab

apt-get install apache2-utils

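Common options

option	Meaning
-r	Do not exit when a socket receive error occurs
-t	Maximum time to spend sending requests, in seconds (implies -n 50000)
-c	Concurrency: the number of requests issued at a time
-n	Number of requests to send
-p	postfile: the file containing the POST data
-T	content-type: the Content-Type of the request body for POST and PUT requests

Usage

Testing a GET request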

ab -r -t 120 -c 5000 http://127.0.0.1:8080/api/ip_query?ip=165.118.213.9

Testing a POST request

ab -r -t 120 -c 5000 -p /tmp/post_data.txt -T 'application/json' http://127.0.0.1:8080/api/ip_query

Here /tmp/post_data.txt contains the data to send, in the format given by -T, which is JSON in this case:

{"ip": "125.118.213.9"}

http://www.restran.net/2016/0…

Installing wrk

apt-get install libssl-dev
git clone https://github.com/wg/wrk.git
cd wrk
make
cp wrk /usr/sbin

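Common options

option	Meaning
-c	Number of open connections, i.e. the concurrency
-d	Test duration: the maximum time to spend sending requests
-t	Number of threads used on the load-generating machine
-s	lua script to load
--latency	Print latency statistics

Usage

Testing a GET request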

wrk -t10 -c5000 -d120s --latency http://127.0.0.1:8080/api/ip_query?ip=165.118.213.9

Testing a POST request

wrk -t50 -c5000 -d120s --latency -s /tmp/wrk_post.lua http://127.0.0.1:8080

Here /tmp/wrk_post.lua is the lua script to load; it sets the path, header, and body of the POST request:

request = function()
  path = "/api/ip_query"
  wrk.headers["Content-Type"] = "application/json"
  wrk.body = "{\"ip\":\"125.118.213.9\"}"
  return wrk.format("POST", path)
end

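Installing jmeter

jmeter requires JDK 1.8 to be installed first. jmeter itself can then be downloaded from the Apache website.

Usage

The figure for this section comes from an expert tester and is very detailed; the complete xmind file can be downloaded as: jmeter- 张蓓.xmind

For entry-level jmeter usage, see also this entry in the references at the end: Concurrent stress testing with Apache Jmeter

wrk GET request stress test results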

root@ubuntu:/tmp# wrk -t10 -c5000 -d60s --latency http://127.0.0.1:8080/api/ip_query?ip=165.118.213.9
Running 1m test @ http://127.0.0.1:8080/api/ip_query?ip=165.118.213.9
  10 threads and 5000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   897.19ms  322.83ms   1.99s    70.52%
    Req/Sec   318.80    206.03     2.14k    68.84%
  Latency Distribution
     50%  915.29ms
     75%    1.11s 
     90%    1.29s 
     99%    1.57s 
  187029 requests in 1.00m, 51.01MB read
  Socket errors: connect 0, read 0, write 0, timeout 38
Requests/sec:   3113.27
Transfer/sec:    869.53KB

ab GET request stress test results

root@ubuntu:/tmp# ab -r -t 60 -c 5000 http://127.0.0.1:8080/api/ip_query?ip=165.118.213.9
This is ApacheBench, Version 2.3 <$Revision: 1796539 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, https://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 5000 requests
Completed 10000 requests
Completed 15000 requests
Completed 20000 requests
Completed 25000 requests
Completed 30000 requests
Completed 35000 requests
Completed 40000 requests
Completed 45000 requests
Completed 50000 requests
Finished 50000 requests


Server Software:        gunicorn/19.7.1
Server Hostname:        127.0.0.1
Server Port:            8080

Document Path:          /api/ip_query?ip=165.118.213.9
Document Length:        128 bytes

Concurrency Level:      5000
Time taken for tests:   19.617 seconds
Complete requests:      50000
Failed requests:        2
   (Connect: 0, Receive: 0, Length: 1, Exceptions: 1)
Total transferred:      14050000 bytes
HTML transferred:       6400000 bytes
Requests per second:    2548.85 [#/sec] (mean)
Time per request:       1961.668 [ms] (mean)
Time per request:       0.392 [ms] (mean, across all concurrent requests)
Transfer rate:          699.44 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0  597 1671.8      4   15500
Processing:     4  224 201.4    173    3013
Waiting:        4  223 200.1    172    2873
Total:          7  821 1694.4    236   15914

Percentage of the requests served within a certain time (ms)
  50%    236
  66%    383
  75%   1049
  80%   1155
  90%   1476
  95%   3295
  98%   7347
  99%   7551
 100%  15914 (longest request)

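jmeter GET request stress test results

Result analysis

The three tools give broadly similar results, with RPS (requests per second) at roughly 3000 (about 3113 with wrk and about 2549 with ab). The machine has 4 cores and 4G of memory, gunicorn runs 10 workers, and memory usage is 3.2G. About 3000 RPS on a single machine is low for this configuration, and the cause needs further analysis. The plan is to add a second machine; behind the load balancer the total has to exceed 5000 RPS to meet the requirement.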
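
The derived figures in the ab report follow from three raw numbers: concurrency, completed requests, and elapsed time. As a sanity check, the sketch below recomputes them from the values in the ab output above; the function name is made up for this sketch.

```python
def ab_summary(concurrency, completed, elapsed_seconds):
    """Recompute ab's derived metrics from its raw counters."""
    rps = completed / elapsed_seconds
    # Mean time per request as seen by one of the concurrent clients.
    per_request_ms = concurrency * elapsed_seconds * 1000 / completed
    # Mean time per request across all concurrent clients.
    per_request_all_ms = elapsed_seconds * 1000 / completed
    return rps, per_request_ms, per_request_all_ms


# Values taken from the ab report: 5000 clients, 50000 requests, 19.617 seconds.
rps, per_request_ms, per_request_all_ms = ab_summary(5000, 50000, 19.617)
print("requests per second: %.1f" % rps)
print("time per request: %.1f ms" % per_request_ms)
print("time per request across all clients: %.3f ms" % per_request_all_ms)
```

The results agree with the report (about 2549 requests per second, about 1962 ms, and 0.392 ms) up to the rounding of the elapsed time.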

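Open file limit

Stress testing places demands on the open-file limit of the load-generating machine: far more than 1024 open files are needed, so the Linux open-file limit has to be raised as follows: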

# Show the current limits, including open files
ulimit -a
# Raise the open-file limit
ulimit -n 500000
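
The limit can also be checked from Python before a test run. The sketch below uses the standard `resource` module (Unix only) to report whether the current soft limit covers a planned number of connections. The headroom of 100 descriptors and the function name are assumptions for this sketch.

```python
import resource


def open_file_limit_ok(connections, headroom=100):
    """Return (ok, soft_limit) for a planned number of concurrent connections."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return True, soft
    return soft >= connections + headroom, soft


ok, soft = open_file_limit_ok(5000)
print("soft limit:", soft, "enough for 5000 connections:", ok)
```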

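SYN flood protection

Linux has a parameter for this: the net.ipv4.tcp_syncookies field in the /etc/sysctl.conf configuration file. Its default value is 1, which means the system detects SYN flood attacks and turns on protection. As a result, if a stress test sends a large volume of repetitive requests, the machine under test enables SYN cookies once its SYN queue overflows, and many requests time out and fail. Alibaba Cloud's load balancer has SYN flood detection and DDoS detection, so two points need attention during stress testing:

Where appropriate, turn off net.ipv4.tcp_syncookies on the load balancing machine during the test.
When generating test data, avoid large amounts of repetitive data so that the traffic is not identified as an attack.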
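
One way to avoid repetitive payloads is to generate a different address for each request. The sketch below produces JSON request bodies with random, globally routable IPv4 addresses and can write them one per line to a file. The file layout and function names are assumptions for this sketch. Since `ab -p` sends one fixed body, varied data would have to be fed through a tool that can be scripted, for example a wrk lua script.

```python
import ipaddress
import json
import random


def random_public_ipv4(rng):
    """Draw random IPv4 addresses until one is globally routable."""
    while True:
        candidate = ipaddress.IPv4Address(rng.getrandbits(32))
        if candidate.is_global:
            return str(candidate)


def make_payloads(count, seed=None):
    """Return `count` JSON request bodies with randomly chosen addresses."""
    rng = random.Random(seed)
    return [json.dumps({"ip": random_public_ipv4(rng)}) for _ in range(count)]


def write_payloads(path, count, seed=None):
    """Write one JSON body per line to `path`."""
    with open(path, "w") as f:
        for line in make_payloads(count, seed):
            f.write(line + "\n")
```

Passing a fixed `seed` makes a test run repeatable while still spreading the requests over many different addresses.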

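For the choice of gunicorn, see the test report: Python WSGI Server performance analysis

With gunicorn chosen as the WSGI server, the number of workers and the worker-class of each worker have to be chosen to suit the machine.

Choosing the number of workers

Each worker runs as a separate child process and holds its own copy of the in-memory data, so every worker added or removed changes the system's memory usage by a correspondingly large step. Initially a single machine ran gunicorn with 3 workers and the system sustained only 1000 RPS. After scaling to 9 workers, it sustained 3000 RPS. So when there is enough memory, the number of workers can be increased.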
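
The trade-off can be written as a small calculation: take the worker count suggested by the CPU, then cap it by what fits in memory. The gunicorn documentation suggests (2 x cores) + 1 as a starting point. The figure of about 320 MB per worker is derived from the 3.2G used by 10 workers in the stress test, and the 512 MB reserved for the operating system is an assumption for this sketch.

```python
def suggest_workers(cpu_cores, total_memory_mb, per_worker_mb, reserved_mb=512):
    """Suggest a gunicorn worker count limited by both CPU and memory."""
    cpu_bound = 2 * cpu_cores + 1
    memory_bound = max(1, (total_memory_mb - reserved_mb) // per_worker_mb)
    return min(cpu_bound, memory_bound)


# 4 cores with 4 GB of memory: the CPU rule is the limit.
print(suggest_workers(4, 4096, 320))  # 9
# 4 cores with only 2 GB of memory: memory is the limit.
print(suggest_workers(4, 2048, 320))  # 4
```

For the 4-core test machine the CPU rule gives 9 workers, which matches the worker count that reached 3000 RPS.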

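Choosing the worker-class

See the two articles in the references at the end: Common gunicorn settings, and Performance comparison of several Gunicorn worker classes.

After changing gunicorn's worker-class from the default sync to gevent, the system's RPS nearly doubled (about 1.76 times in the ab test below).

worker-class	Number of workers	RPS measured by ab
sync	3	573.90
gevent	3	1011.84

The gevent worker depends on gevent >= 0.13, so gevent has to be installed with pip first. The gunicorn command that starts the flask application then becomes: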

gunicorn -w10 -b0.0.0.0:8080 ip_query_app:ip_app --worker-class gevent
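
The same options can be kept in a gunicorn configuration file, which is plain Python. The following is a minimal sketch equivalent to the command above; the file name `gunicorn.conf.py` is a common choice rather than something the original setup prescribes.

```python
# gunicorn.conf.py: equivalent of "-w10 -b0.0.0.0:8080 --worker-class gevent"
bind = "0.0.0.0:8080"
workers = 10
worker_class = "gevent"  # requires the gevent package to be installed
```

The application would then be started with `gunicorn -c gunicorn.conf.py ip_query_app:ip_app`.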

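Improving IP database accuracy

Trade efficiency for accuracy: with a single IP database, some IPs return no result, and IPs outside China are usually accurate only to the country level. The accuracy and coverage of several IP databases can be balanced against each other: when one database cannot return precise address information, query the others.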
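
A fallback chain of this kind could look like the sketch below. Each database is represented by a callable that returns a result or `None`. The function name, the `is_precise` rule, and the sample data are assumptions for this sketch, not part of the service.

```python
def lookup_with_fallback(ip, lookups, is_precise=lambda result: result is not None):
    """Query each database in order and return the first precise result.

    If no database gives a precise answer, the first non-empty result is
    returned, so a coarse answer still beats no answer.
    """
    best = None
    for lookup in lookups:
        result = lookup(ip)
        if result is None:
            continue
        if is_precise(result):
            return result
        if best is None:
            best = result
    return best


# Hypothetical databases: the primary one only knows the country for this IP.
primary = {"203.0.113.7": ("country-x", None)}.get
secondary = {"203.0.113.7": ("country-x", "city-y")}.get


def has_region(result):
    """Treat a result as precise when its second field is filled in."""
    return result[1] is not None


print(lookup_with_fallback("203.0.113.7", [primary, secondary], has_region))
```

Each extra database adds lookup time only for addresses that the earlier databases could not resolve precisely, which keeps the efficiency cost limited to the hard cases.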

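Increasing single-machine concurrency

From the moment a request is issued, through the WSGI server, to the application endpoint, and down to the IP lookup, the throughput per second of each stage has to be measured separately. That shows where the system's bottleneck is and makes it possible to raise single-machine concurrency at the root.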
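
For the innermost stage, the lookup itself, a rough figure can be obtained with a timing loop like the one below. The helper name is made up for this sketch, and the stand-in function in the usage lines only keeps the example self-contained; in practice `ip_tree.lookup` would be passed in.

```python
import time


def calls_per_second(func, arg, repeat=100000):
    """Call func(arg) `repeat` times and return the measured calls per second."""
    start = time.perf_counter()
    for _ in range(repeat):
        func(arg)
    elapsed = time.perf_counter() - start
    return repeat / elapsed if elapsed > 0 else float("inf")


# Stand-in workload; replace it with ip_tree.lookup to measure the real lookup.
rate = calls_per_second(lambda ip: ip.split("."), "123.12.23.45", repeat=10000)
print("about %.0f calls per second" % rate)
```

Comparing this figure with the RPS seen at the HTTP level indicates whether the lookup or the layers in front of it limit throughput.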

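References

Global IPv4 address geolocation database (IPIP.NET edition)
Developing a RESTful API server with flask (5): deploying a flask application to nginx
python web deployment: nginx + gunicorn + supervisor + flask deployment notes
flowsnow: compiling and installing nginx
Recommended supervisor tutorial: managing processes with supervisor
Wikipedia: binary search tree
Jianshu: stress testing a POST endpoint with wrk
Concurrent stress testing with Apache Jmeter
Common gunicorn settings
Performance comparison of several Gunicorn worker classes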


I am 职场亮哥, a senior software engineer at YY with four years of work experience.


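All of my articles and answers are registered with a copyright protection platform. Copyright belongs to 职场亮哥; reproduction without authorization will be pursued.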
