Preface
- CNI (Container Network Interface) is a standard, generic interface specification. In the early days there were several container platforms (Docker, Kubernetes, Mesos) and several container network solutions (Flannel, Calico, Weave). As long as both sides implement one standard interface, any network solution that satisfies the specification can provide networking for every container platform that also satisfies it; CNI is exactly that standard interface specification.
- CNI connects the container management system with network plugins. Given the network namespace a container lives in, the plugin inserts a network interface into that namespace (for example one end of a veth pair), performs the necessary configuration on the host (for example attaching the other end of the veth pair to a bridge), and finally configures IP addresses and routes for the interface inside the namespace.
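The handoff described above follows a simple contract: the runtime executes the plugin binary with a handful of `CNI_*` environment variables and passes the network configuration as JSON on stdin. A minimal sketch of what the runtime hands over (the container ID and netns path here are hypothetical placeholders, not values from the cluster below):

```python
import json

# Minimal CNI network configuration that the runtime feeds to the
# plugin binary on stdin; field names follow the CNI specification.
net_conf = {
    "cniVersion": "0.3.1",
    "name": "cbr0",
    "type": "flannel",
}

# Environment variables set when invoking the plugin. CNI_NETNS points
# at the Pod's network namespace: the plugin moves one end of a veth
# pair into it and configures IP and routes there.
cni_env = {
    "CNI_COMMAND": "ADD",                      # also DEL / CHECK / VERSION
    "CNI_CONTAINERID": "example-container-id", # hypothetical ID
    "CNI_NETNS": "/var/run/netns/example",     # hypothetical netns path
    "CNI_IFNAME": "eth0",
    "CNI_PATH": "/opt/cni/bin",
}

print(json.dumps(net_conf))
```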
Kubernetes mainly involves four types of communication:
- container-to-container: happens inside a Pod, over the loopback interface (lo);
- Pod-to-Pod: communication between Pods; Kubernetes does not solve this itself but delegates it to third-party solutions through the CNI interface (the interface that preceded CNI was called kubenet);
- Service-to-Pod: implemented by the iptables or ipvs rules that kube-proxy generates;
- ExternalClients-to-Service: brings external traffic into the cluster via hostPort, hostNetwork, NodePort Services, LoadBalancer Services, externalIP Services, and Ingress;
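For the Service-to-Pod case, the iptables/ipvs rules that kube-proxy generates boil down to destination NAT from a ClusterIP to one of the Service's backend Pods. A toy Python model of that translation (all addresses are invented examples, not taken from the cluster below):

```python
import random

# Sketch of Service-to-Pod translation: a ClusterIP:port is DNAT'ed to
# one of the Service's backend Pod endpoints, much as kube-proxy's
# iptables rules do with probabilistic matching.
endpoints = {
    ("10.96.0.10", 80): [("10.244.1.10", 8080), ("10.244.2.11", 8080)],
}

def dnat(dst_ip: str, dst_port: int):
    """Rewrite the destination to a randomly picked backend Pod;
    traffic to non-Service addresses passes through unchanged."""
    backends = endpoints.get((dst_ip, dst_port))
    return random.choice(backends) if backends else (dst_ip, dst_port)

print(dnat("10.96.0.10", 80))  # one of the two backend Pods
```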
Flannel overview
- Flannel is a network fabric for Kubernetes designed by the CoreOS team. In short, its job is to give the Docker containers created on every node in the cluster virtual IP addresses that are unique cluster-wide.
- In the default Docker configuration, the Docker daemon on each node independently assigns IP addresses to the containers on that node. The problem this causes is that containers on different nodes may end up with the same internal IP address, yet we want these containers to be able to find each other by IP address, that is, to ping each other.
- Flannel's design goal is to re-plan the IP address allocation rules for all nodes in the cluster, so that containers on different nodes obtain non-conflicting addresses that "belong to the same internal network", and so that containers on different nodes can communicate directly over that internal network.
- Flannel is essentially an "overlay network": TCP packets are wrapped inside another kind of network packet for routing, forwarding and communication. It currently supports udp, vxlan, host-gw, aws-vpc, gce and alloc routing as forwarding backends; the default inter-node forwarding mode is UDP.
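The cost of that extra encapsulation is easy to quantify for the vxlan backend: each inner Ethernet frame gains an outer IP header, a UDP header and a VXLAN header before it hits the wire, so flannel lowers the MTU of its vxlan interface accordingly. A quick calculation, assuming a standard 1500-byte underlay MTU:

```python
# Per-packet overhead of VXLAN encapsulation: the inner Ethernet frame
# is wrapped in VXLAN + UDP + IP headers on the underlay network.
IP_HDR, UDP_HDR, VXLAN_HDR, INNER_ETH = 20, 8, 8, 14
overhead = IP_HDR + UDP_HDR + VXLAN_HDR + INNER_ETH  # 50 bytes total

underlay_mtu = 1500                  # typical physical-NIC MTU
flannel_mtu = underlay_mtu - overhead
print(flannel_mtu)  # 1450, the MTU flannel sets on its vxlan interface
```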
Flannel's characteristics in brief
- Gives the Docker containers created on different Nodes in the cluster virtual IP addresses that are unique cluster-wide.
- Builds an overlay network through which packets are delivered unchanged to the target container. An overlay network is a virtual network built on top of, and supported by the infrastructure of, another network; it decouples network services from the underlying infrastructure by encapsulating one packet inside another. After the encapsulated packet is forwarded to the endpoint, it is decapsulated.
- Creates a new virtual NIC (flannel0) to receive traffic from the docker bridge and, by maintaining the routing table, encapsulates and forwards the received data (vxlan).
- etcd guarantees that the flanneld daemons on all nodes see a consistent configuration. At the same time, flanneld on each node watches for data changes in etcd, so it learns of node changes in the cluster in real time.
Flannel supports three Pod network models, each called a "backend" in flannel:
- vxlan: Pod-to-Pod traffic is tunnel-encapsulated; the nodes only need to be able to reach each other and are not required to share a layer-2 network. Drawback: the extra layer of encapsulation lowers throughput. Advantage: nodes need not be on the same layer-2 network.
- vxlan with DirectRouting: Pods on different nodes that share the same layer-2 network communicate without tunnel encapsulation, while Pods on nodes not on the same layer-2 network still use the tunnel. This is the best overall option.
- host-gw: Pod-to-Pod traffic is routed directly with no tunnel encapsulation, which requires all nodes to be on the same layer-2 network. Highest throughput, but limited to a single layer-2 network.
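The DirectRouting variant can be thought of as a per-peer decision: compare the peer node's address against the local layer-2 subnet and only tunnel when it falls outside. A simplified sketch (the subnet default is a hypothetical example matching the lab network used below):

```python
import ipaddress

def pick_path(local_ip: str, peer_ip: str,
              l2_subnet: str = "192.168.54.0/24") -> str:
    """DirectRouting-style decision (sketch): route via the host
    interface when both nodes share the same L2 subnet, otherwise
    fall back to the VXLAN tunnel."""
    net = ipaddress.ip_network(l2_subnet)
    same_l2 = (ipaddress.ip_address(local_ip) in net
               and ipaddress.ip_address(peer_ip) in net)
    return "direct via host interface" if same_l2 else "vxlan tunnel"

print(pick_path("192.168.54.171", "192.168.54.172"))  # direct via host interface
print(pick_path("192.168.54.171", "10.9.8.7"))        # vxlan tunnel
```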
Flannel download and installation:
https://github.com/flannel-io...
Example 1: deploy flannel with the vxlan backend
# The flannel deployment manifest's ConfigMap describes the network type
[root@k8s-master plugin]# cat kube-flannel.yml
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",            # provides the virtual network
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",            # port mapping, e.g. for NodePort
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"                 # vxlan is the default mode
      }
    }
[root@k8s-master plugin]# kubectl apply -f kube-flannel.yml
- In vxlan mode, the routes to Pod subnets point at flannel.1
[root@k8s-master plugin]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.54.2    0.0.0.0         UG    101    0        0 eth4
10.244.0.0      0.0.0.0         255.255.255.0   U     0      0        0 cni0        # this node's virtual bridge
10.244.1.0      10.244.1.0      255.255.255.0   UG    0      0        0 flannel.1
10.244.2.0      10.244.2.0      255.255.255.0   UG    0      0        0 flannel.1
10.244.3.0      10.244.3.0      255.255.255.0   UG    0      0        0 flannel.1
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.4.0     0.0.0.0         255.255.255.0   U     100    0        0 eth0
192.168.54.0    0.0.0.0         255.255.255.0   U     101    0        0 eth4

[root@k8s-node1 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.54.2    0.0.0.0         UG    101    0        0 eth4
10.244.0.0      10.244.0.0      255.255.255.0   UG    0      0        0 flannel.1
10.244.1.0      0.0.0.0         255.255.255.0   U     0      0        0 cni0        # this node's virtual bridge
10.244.2.0      10.244.2.0      255.255.255.0   UG    0      0        0 flannel.1
10.244.3.0      10.244.3.0      255.255.255.0   UG    0      0        0 flannel.1
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.4.0     0.0.0.0         255.255.255.0   U     100    0        0 eth0
192.168.54.0    0.0.0.0         255.255.255.0   U     101    0        0 eth4

# Permanent neighbour entries generated by flannel improve forwarding efficiency
[root@k8s-master plugin]# ip neighbour | grep flannel.1
10.244.1.0 dev flannel.1 lladdr ba:98:1c:fa:3a:51 PERMANENT
10.244.3.0 dev flannel.1 lladdr da:29:42:38:29:55 PERMANENT
10.244.2.0 dev flannel.1 lladdr fa:48:c1:29:0b:dd PERMANENT

[root@k8s-master plugin]# bridge fdb show flannel.1 | grep flannel.1
ba:98:1c:fa:3a:51 dev flannel.1 dst 192.168.54.171 self permanent
22:85:29:77:e1:00 dev flannel.1 dst 192.168.54.173 self permanent
fa:48:c1:29:0b:dd dev flannel.1 dst 192.168.54.172 self permanent
da:29:42:38:29:55 dev flannel.1 dst 192.168.54.173 self permanent

# Capture the flannel traffic; UDP 8472 is flannel's default VXLAN port
[root@k8s-node3 ~]# tcpdump -i eth4 -nn udp port 8472
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth4, link-type EN10MB (Ethernet), capture size 262144 bytes
17:08:15.113389 IP 192.168.54.172.46879 > 192.168.54.173.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.2.9 > 10.244.3.92: ICMP echo request, id 2816, seq 61, length 64
17:08:15.113498 IP 192.168.54.173.55553 > 192.168.54.172.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.3.92 > 10.244.2.9: ICMP echo reply, id 2816, seq 61, length 64
17:08:16.114359 IP 192.168.54.172.46879 > 192.168.54.173.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.2.9 > 10.244.3.92: ICMP echo request, id 2816, seq 62, length 64
17:08:16.114447 IP 192.168.54.173.55553 > 192.168.54.172.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.3.92 > 10.244.2.9: ICMP echo reply, id 2816, seq 62, length 64
17:08:17.115558 IP 192.168.54.172.46879 > 192.168.54.173.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.2.9 > 10.244.3.92: ICMP echo request, id 2816, seq 63, length 64
17:08:17.115717 IP 192.168.54.173.55553 > 192.168.54.172.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.3.92 > 10.244.2.9: ICMP echo reply, id 2816, seq 63, length 64
17:08:18.117498 IP 192.168.54.172.46879 > 192.168.54.173.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.244.2.9 > 10.244.3.92: ICMP echo request, id 2816, seq 64, length 64
- You can see the Pod-to-Pod traffic 10.244.2.9 > 10.244.3.92 being carried, after one layer of encapsulation, from node 192.168.54.172 (source port 46879) to node 192.168.54.173 (port 8472).
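The `flags [I] (0x08), instance 1` that tcpdump prints is the VXLAN header itself: the I bit marks a valid VNI, and flannel uses VNI 1 by default. A small parser for that 8-byte header (RFC 7348 layout) reproduces what tcpdump decoded:

```python
def parse_vxlan_header(data: bytes) -> dict:
    """Parse the 8-byte VXLAN header that follows the UDP header on
    flannel's default port 8472. RFC 7348 layout:
    flags(1) | reserved(3) | VNI(3) | reserved(1)."""
    flags = data[0]
    vni = int.from_bytes(data[4:7], "big")
    return {"i_flag": bool(flags & 0x08), "vni": vni}

# Header matching the capture above: flags [I] (0x08), instance (VNI) 1
hdr = bytes([0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00])
print(parse_vxlan_header(hdr))  # {'i_flag': True, 'vni': 1}
```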
Example 2: add DirectRouting to the flannel vxlan backend
- With DirectRouting enabled, nodes on the same layer-2 network communicate directly via the host network interface, while nodes separated at layer 3 still communicate through the VXLAN tunnel. This combination is flannel's ideal network type.
- Because every node in this test environment is on the same layer-2 network, the routing tables below will no longer show any routes via the flannel.1 interface.
[root@k8s-master ~]# kubectl get cm -n kube-system
NAME                                 DATA   AGE
coredns                              1      57d
extension-apiserver-authentication   6      57d
kube-flannel-cfg                     2      57d
kube-proxy                           2      57d
kubeadm-config                       2      57d
kubelet-config-1.19                  1      57d
[root@k8s-master ~]# kubectl edit cm kube-flannel-cfg -n kube-system
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan",
        "DirectRouting": true    # added
      }
    }
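The edit amounts to adding one key to the Backend object of net-conf.json. A sketch of the resulting structure, built with plain JSON handling:

```python
import json

# net-conf.json as originally deployed, plus the one key the edit adds.
net_conf = json.loads('{"Network": "10.244.0.0/16", "Backend": {"Type": "vxlan"}}')
net_conf["Backend"]["DirectRouting"] = True  # enable direct routes on shared L2
print(json.dumps(net_conf, indent=2))
```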
- Restart the flannel Pods (in a production environment, use a blue-green rollout)
[root@k8s-master ~]# kubectl get pod -n kube-system --show-labels
NAME                                 READY   STATUS    RESTARTS   AGE     LABELS
coredns-f9fd979d6-l9zck              1/1     Running   16         57d     k8s-app=kube-dns,pod-template-hash=f9fd979d6
coredns-f9fd979d6-s8fp5              1/1     Running   15         57d     k8s-app=kube-dns,pod-template-hash=f9fd979d6
etcd-k8s-master                      1/1     Running   12         57d     component=etcd,tier=control-plane
kube-apiserver-k8s-master            1/1     Running   16         57d     component=kube-apiserver,tier=control-plane
kube-controller-manager-k8s-master   1/1     Running   40         57d     component=kube-controller-manager,tier=control-plane
kube-flannel-ds-6sppx                1/1     Running   1          7d23h   app=flannel,controller-revision-hash=585c88d56b,pod-template-generation=2,tier=node
kube-flannel-ds-j5g9s                1/1     Running   3          7d23h   app=flannel,controller-revision-hash=585c88d56b,pod-template-generation=2,tier=node
kube-flannel-ds-nfz77                1/1     Running   1          7d23h   app=flannel,controller-revision-hash=585c88d56b,pod-template-generation=2,tier=node
kube-flannel-ds-sqhq2                1/1     Running   1          7d23h   app=flannel,controller-revision-hash=585c88d56b,pod-template-generation=2,tier=node
kube-proxy-42vln                     1/1     Running   4          25d     controller-revision-hash=565786c69c,k8s-app=kube-proxy,pod-template-generation=1
kube-proxy-98gfb                     1/1     Running   3          21d     controller-revision-hash=565786c69c,k8s-app=kube-proxy,pod-template-generation=1
kube-proxy-nlnnw                     1/1     Running   4          17d     controller-revision-hash=565786c69c,k8s-app=kube-proxy,pod-template-generation=1
kube-proxy-qbsw2                     1/1     Running   4          25d     controller-revision-hash=565786c69c,k8s-app=kube-proxy,pod-template-generation=1
kube-scheduler-k8s-master            1/1     Running   38         57d     component=kube-scheduler,tier=control-plane
metrics-server-6849f98b-fsvf8        1/1     Running   15         8d      k8s-app=metrics-server,pod-template-hash=6849f98b
[root@k8s-master ~]# kubectl delete pod -n kube-system -l app=flannel
pod "kube-flannel-ds-6sppx" deleted
pod "kube-flannel-ds-j5g9s" deleted
pod "kube-flannel-ds-nfz77" deleted
pod "kube-flannel-ds-sqhq2" deleted
[root@k8s-master ~]#
- Check the master and node routing tables again
[root@k8s-master ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.54.2    0.0.0.0         UG    101    0        0 eth4
10.244.0.0      0.0.0.0         255.255.255.0   U     0      0        0 cni0
10.244.1.0      10.244.1.0      255.255.255.0   UG    0      0        0 eth4
10.244.2.0      192.168.54.172  255.255.255.0   UG    0      0        0 eth4
10.244.3.0      192.168.54.173  255.255.255.0   UG    0      0        0 eth4
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.4.0     0.0.0.0         255.255.255.0   U     100    0        0 eth0
192.168.54.0    0.0.0.0         255.255.255.0   U     101    0        0 eth4
[root@k8s-node1 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.54.2    0.0.0.0         UG    101    0        0 eth4
10.244.0.0      192.168.54.170  255.255.255.0   UG    0      0        0 eth4
10.244.1.0      0.0.0.0         255.255.255.0   U     0      0        0 cni0
10.244.2.0      192.168.54.172  255.255.255.0   UG    0      0        0 eth4
10.244.3.0      192.168.54.173  255.255.255.0   UG    0      0        0 eth4
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.4.0     0.0.0.0         255.255.255.0   U     100    0        0 eth0
192.168.54.0    0.0.0.0         255.255.255.0   U     101    0        0 eth4

# Network-related Pods now use the host network interface addresses directly
[root@k8s-master ~]# kubectl get pod -n kube-system -o wide
NAME                                 READY   STATUS    RESTARTS   AGE     IP              NODE         NOMINATED NODE   READINESS GATES
coredns-f9fd979d6-l9zck              1/1     Running   16         57d     10.244.0.42     k8s-master   <none>           <none>
coredns-f9fd979d6-s8fp5              1/1     Running   15         57d     10.244.0.41     k8s-master   <none>           <none>
etcd-k8s-master                      1/1     Running   12         57d     192.168.4.170   k8s-master   <none>           <none>
kube-apiserver-k8s-master            1/1     Running   16         57d     192.168.4.170   k8s-master   <none>           <none>
kube-controller-manager-k8s-master   1/1     Running   40         57d     192.168.4.170   k8s-master   <none>           <none>
kube-flannel-ds-d79nx                1/1     Running   0          2m12s   192.168.4.170   k8s-master   <none>           <none>
kube-flannel-ds-m48m7                1/1     Running   0          2m14s   192.168.4.172   k8s-node2    <none>           <none>
kube-flannel-ds-pxmnf                1/1     Running   0          2m14s   192.168.4.171   k8s-node1    <none>           <none>
kube-flannel-ds-vm9kt                1/1     Running   0          2m19s   192.168.4.173   k8s-node3    <none>           <none>
kube-proxy-42vln                     1/1     Running   4          25d     192.168.4.172   k8s-node2    <none>           <none>   # uses the host network interface
kube-proxy-98gfb                     1/1     Running   3          21d     192.168.4.173   k8s-node3    <none>           <none>
kube-proxy-nlnnw                     1/1     Running   4          17d     192.168.4.171   k8s-node1    <none>           <none>
kube-proxy-qbsw2                     1/1     Running   4          25d     192.168.4.170   k8s-master   <none>           <none>
kube-scheduler-k8s-master            1/1     Running   38         57d     192.168.4.170   k8s-master   <none>           <none>
metrics-server-6849f98b-fsvf8        1/1     Running   15         8d      10.244.2.250    k8s-node2    <none>           <none>
- Capture packets to inspect the encapsulation
[root@k8s-master plugin]# kubectl get pod -o wide
NAME                         READY   STATUS    RESTARTS   AGE   IP             NODE        NOMINATED NODE   READINESS GATES
client-1639                  1/1     Running   0          52s   10.244.1.222   k8s-node1   <none>           <none>
replicaset-demo-v1.1-lgf6b   1/1     Running   0          59m   10.244.1.221   k8s-node1   <none>           <none>
replicaset-demo-v1.1-mvvfq   1/1     Running   0          59m   10.244.3.169   k8s-node3   <none>           <none>
replicaset-demo-v1.1-tn49t   1/1     Running   0          59m   10.244.2.136   k8s-node2   <none>           <none>

# From the Pod on node2, access the Pod on node3
[root@k8s-master plugin]# kubectl exec replicaset-demo-v1.1-tn49t -it -- /bin/sh
[root@replicaset-demo-v1 /]# curl 10.244.3.169
iKubernetes demoapp v1.1 !! ClientIP: 10.244.2.136, ServerName: replicaset-demo-v1.1-mvvfq, ServerIP: 10.244.3.169!
[root@replicaset-demo-v1 /]# curl 10.244.3.169

# Capture on node3
[root@k8s-node3 ~]# tcpdump -i eth4 -nn tcp port 80
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth4, link-type EN10MB (Ethernet), capture size 262144 bytes
11:03:57.508877 IP 10.244.2.136.49656 > 10.244.3.169.80: Flags [S], seq 1760692242, win 64860, options [mss 1410,sackOK,TS val 4266124446 ecr 0,nop,wscale 7], length 0
11:03:57.509245 IP 10.244.3.169.80 > 10.244.2.136.49656: Flags [S.], seq 3150629627, ack 1760692243, win 64308, options [mss 1410,sackOK,TS val 1453973317 ecr 4266124446,nop,wscale 7], length 0
11:03:57.510198 IP 10.244.2.136.49656 > 10.244.3.169.80: Flags [.], ack 1, win 507, options [nop,nop,TS val 4266124447 ecr 1453973317], length 0
11:03:57.510373 IP 10.244.2.136.49656 > 10.244.3.169.80: Flags [P.], seq 1:77, ack 1, win 507, options [nop,nop,TS val 4266124447 ecr 1453973317], length 76: HTTP: GET / HTTP/1.1
11:03:57.510427 IP 10.244.3.169.80 > 10.244.2.136.49656: Flags [.], ack 77, win 502, options [nop,nop,TS val 1453973318 ecr 4266124447], length 0
11:03:57.713241 IP 10.244.3.169.80 > 10.244.2.136.49656: Flags [P.], seq 1:18, ack 77, win 502, options [nop,nop,TS val 1453973521 ecr 4266124447], length 17: HTTP: HTTP/1.0 200 OK
11:03:57.713821 IP 10.244.2.136.49656 > 10.244.3.169.80: Flags [.], ack 18, win 507, options [nop,nop,TS val 4266124651 ecr 1453973521], length 0
11:03:57.733459 IP 10.244.3.169.80 > 10.244.2.136.49656: Flags [P.], seq 18:155, ack 77, win 502, options [nop,nop,TS val 1453973541 ecr 4266124651], length 137: HTTP
11:03:57.733720 IP 10.244.3.169.80 > 10.244.2.136.49656: Flags [FP.], seq 155:271, ack 77, win 502, options [nop,nop,TS val 1453973541 ecr 4266124651], length 116: HTTP
11:03:57.735862 IP 10.244.2.136.49656 > 10.244.3.169.80: Flags [.], ack 155, win 506, options [nop,nop,TS val 4266124671 ecr 1453973541], length 0
11:03:57.735883 IP 10.244.2.136.49656 > 10.244.3.169.80: Flags [F.], seq 77, ack 272, win 506, options [nop,nop,TS val 4266124672 ecr 1453973541], length 0
11:03:57.736063 IP 10.244.3.169.80 > 10.244.2.136.49656: Flags [.], ack 78, win 502, options [nop,nop,TS val 1453973543 ecr 4266124672], length 0
11:03:58.650891 IP 10.244.2.136.49662 > 10.244.3.169.80: Flags [S], seq 3494935965, win 64860, options [mss 1410,sackOK,TS val 4266125588 ecr 0,nop,wscale 7], length 0
- You can see that the traffic is no longer encapsulated: it travels over the flannel network directly between Pod IPs.
Example 3: change the flannel backend to host-gw (note that host-gw only works within a single layer-2 network)
Because all nodes here sit on the same layer-2 network, the result is in theory identical to adding DirectRouting above, so the details are not repeated.
[root@k8s-master plugin]# vim kube-flannel.yml
...
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "host-gw"    # change the backend type to host-gw
      }
    }
...
[root@k8s-master plugin]# kubectl apply -f kube-flannel.yml
# Check the routing table
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.54.2    0.0.0.0         UG    101    0        0 eth4
10.244.0.0      0.0.0.0         255.255.255.0   U     0      0        0 cni0
10.244.1.0      192.168.54.171  255.255.255.0   UG    0      0        0 eth4
10.244.2.0      192.168.54.172  255.255.255.0   UG    0      0        0 eth4
10.244.3.0      192.168.54.173  255.255.255.0   UG    0      0        0 eth4
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.4.0     0.0.0.0         255.255.255.0   U     102    0        0 eth0
192.168.54.0    0.0.0.0         255.255.255.0   U     101    0        0 eth4
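A host-gw route table only works if every Pod-subnet next hop is itself reachable on the shared layer-2 segment; this is exactly the same-layer-2 requirement noted above. A quick check of the routes from the table above:

```python
import ipaddress

# host-gw routes from the table above: each Pod subnet's next hop must
# be a node address on the shared L2 network, since there is no tunnel.
routes = {
    "10.244.1.0/24": "192.168.54.171",
    "10.244.2.0/24": "192.168.54.172",
    "10.244.3.0/24": "192.168.54.173",
}
node_l2 = ipaddress.ip_network("192.168.54.0/24")
for subnet, gw in routes.items():
    # host-gw precondition: next hop on the local segment
    assert ipaddress.ip_address(gw) in node_l2, f"{gw} not on node L2"
print("all host-gw next hops sit on the local L2 segment")
```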