Containers: Preserving the Client Source IP in Istio

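Author

Yin Ye (尹烨), expert engineer at Tencent and product lead for Tencent Cloud TCM, with many years of hands-on experience in K8s and Service Mesh.

Introduction

Many backend services need the client source IP. Cloud load balancers such as Tencent Cloud CLB can pass the client source IP through to backend services. With istio, however, the istio ingressgateway and the sidecar sit on the path, so obtaining the client source IP in a backend service becomes more complicated, especially for layer-4 protocols.

Main text

In many business scenarios we want the client source IP. Cloud load balancers such as Tencent Cloud CLB can pass the client IP to backend services, and TKE/TCM integrate this capability well.

With istio, however, the istio ingressgateway and the sidecar sit in the middle of the path, so obtaining the client IP in a backend service becomes more complicated, especially for layer-4 protocols.

The application service sees only the connection coming from Envoy.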

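Common ways to preserve the source IP

Let's first look at how common load balancers and proxies preserve the source IP. Application protocols are generally layer-4 or layer-7 protocols.

Source IP preservation for layer-7 protocols

Preserving the client source IP at layer 7 is relatively simple. The most representative mechanism is the HTTP header XFF (X-Forwarded-For): XFF records the original client's source IP and is passed through to the backend, where the application parses the XFF header to get the client source IP. Common layer-7 proxies such as Nginx and Haproxy, as well as Envoy, support this.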
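As a minimal sketch of the application side, assuming every proxy on the path is trusted and the left-most XFF entry is the client, the header could be parsed like this in Python. The helper name client_ip_from_xff is hypothetical and not part of any library.

```python
from ipaddress import ip_address


def client_ip_from_xff(xff_value):
    """Return the left-most address in an X-Forwarded-For value, or None."""
    for hop in xff_value.split(","):
        hop = hop.strip()
        if hop:
            # ip_address() raises ValueError on malformed input.
            return str(ip_address(hop))
    return None


# The second entry stands for an intermediate proxy that appended its peer.
print(client_ip_from_xff("106.52.131.116, 172.17.0.54"))
```

If requests can reach the service without passing through a trusted proxy, the header can be forged, so the left-most entry should not be trusted blindly.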

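Source IP preservation for layer-4 protocols

DNAT

Both IPVS and iptables support DNAT. The client reaches the LB through a VIP. When a request packet arrives, the LB picks a backend server according to its connection scheduling algorithm, rewrites the packet's destination address from the VIP to the selected server's address and the destination port to that server's port, and then forwards the modified packet to the selected server. Because the LB does not modify the packet's source IP when forwarding, the backend server sees the client's source IP.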
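The rewrite can be illustrated with a toy model in Python. This is only a sketch: the Packet type, the make_dnat helper, the VIP 10.0.0.1 and the round-robin picker are all assumptions made for illustration, and a real LB tracks the choice per connection rather than per packet.

```python
from dataclasses import dataclass, replace
from itertools import cycle


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def make_dnat(backends):
    """Return a function that rewrites only the destination of a packet."""
    picker = cycle(backends)  # round-robin stands in for the scheduling algorithm

    def dnat(packet):
        ip, port = next(picker)
        # Source fields are copied untouched; only the destination changes.
        return replace(packet, dst_ip=ip, dst_port=port)

    return dnat


dnat = make_dnat([("172.17.0.59", 80)])
incoming = Packet("106.52.131.116", 6093, "10.0.0.1", 8000)  # 10.0.0.1 is a made-up VIP
forwarded = dnat(incoming)
print(forwarded)
```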

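Transparent Proxy

Nginx and Haproxy support transparent proxying. With this option enabled, when the LB connects to a backend service it binds the socket's source IP to the client's IP address. This relies on the kernel's TPROXY and the socket option IP_TRANSPARENT.

Note that with both of the approaches above, the backend's responses must go back through the LB before returning to the client, which usually also requires policy routing.

TOA

TOA (TCP Option Address) is a way to obtain the real source IP at layer 4 (TCP). In essence, it inserts the source IP address into the Options field of the TCP header. The kernel needs the corresponding TOA kernel module installed.

Proxy Protocol

Proxy Protocol is a layer-4 source address preservation scheme implemented by Haproxy. The principle is very simple: after the proxy establishes a TCP connection to the backend server, and before sending any actual application data, it first sends a Proxy Protocol header (containing the client source IP/port, the destination IP/port, and so on). The backend server parses this header to get the real client source IP address.

Proxy Protocol requires both the proxy and the server to support the protocol. In return, it can preserve the source IP across multiple layers of intermediate proxies, a design similar in spirit to XFF at layer 7.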
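To make the idea concrete, here is a small Python sketch that builds the human-readable version 1 header a proxy would send before any application data. The function name build_proxy_v1_header is hypothetical, and the sketch covers only TCP over IPv4 or IPv6.

```python
from ipaddress import ip_address


def build_proxy_v1_header(src_ip, src_port, dst_ip, dst_port):
    """Build a PROXY protocol version 1 header line as bytes."""
    src, dst = ip_address(src_ip), ip_address(dst_ip)
    if src.version != dst.version:
        raise ValueError("source and destination must be the same IP family")
    family = "TCP4" if src.version == 4 else "TCP6"
    # One ASCII line terminated by CRLF, sent ahead of the payload.
    line = f"PROXY {family} {src} {dst} {src_port} {dst_port}\r\n"
    return line.encode("ascii")


header = build_proxy_v1_header("106.52.131.116", 6093, "172.17.0.54", 8000)
print(header)
```

A proxy would write these bytes to the upstream connection first and then relay the client's data unchanged.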

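Preserving the source IP in istio

In istio, the istio ingressgateway and the sidecar make it harder for the application to obtain the client source IP address. To support transparent proxying, however, Envoy itself supports Proxy Protocol; combined with TPROXY, this lets services in istio obtain the source IP.

East-west traffic

For east-west service access in istio, a sidecar is injected, so all traffic into and out of the service is intercepted and proxied by Envoy, which then forwards the request to the application. The source address of the request the application receives is therefore the address Envoy connects from, 127.0.0.6: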

# kubectl -n foo apply -f samples/httpbin/httpbin.yaml
# kubectl -n foo apply -f samples/sleep/sleep.yaml
# kubectl -n foo get pods -o wide
NAME                       READY   STATUS    RESTARTS   AGE    IP            NODE           NOMINATED NODE   READINESS GATES
httpbin-74fb669cc6-qvlb5   2/2     Running   0          4m9s   172.17.0.57   10.206.2.144   <none>           <none>
sleep-74b7c4c84c-9nbtr     2/2     Running   0          6s     172.17.0.58   10.206.2.144   <none>           <none>


# kubectl -n foo exec -it deploy/sleep -c sleep -- curl http://httpbin:8000/ip
{"origin": "127.0.0.6"}

As shown, the source IP that httpbin sees is 127.0.0.6. The socket information confirms this:

# kubectl -n foo exec -it deploy/httpbin -c httpbin -- netstat -ntp | grep 80
tcp        0      0 172.17.0.57:80          127.0.0.6:56043         TIME_WAIT   -
  • Enabling TPROXY in istio

We patch the httpbin deployment to use TPROXY (note that httpbin's IP changes to 172.17.0.59):

# kubectl patch deployment -n foo httpbin -p '{"spec":{"template":{"metadata":{"annotations":{"sidecar.istio.io/interceptionMode":"TPROXY"}}}}}'
# kubectl -n foo get pods -l app=httpbin  -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP            NODE           NOMINATED NODE   READINESS GATES
httpbin-6565f59ff8-plnn7   2/2     Running   0          43m   172.17.0.59   10.206.2.144   <none>           <none>

# kubectl -n foo exec -it deploy/sleep -c sleep -- curl http://httpbin:8000/ip
{"origin": "172.17.0.58"}

As shown, httpbin now gets the real IP of the sleep side.

Socket state:

# kubectl -n foo exec -it deploy/httpbin -c httpbin -- netstat -ntp | grep 80                  
tcp        0      0 172.17.0.59:80          172.17.0.58:35899       ESTABLISHED 9/python3           
tcp        0      0 172.17.0.58:35899       172.17.0.59:80          ESTABLISHED -

The first line is httpbin's receiving socket; the second line is envoy's sending socket.

httpbin envoy log:

{"bytes_received":0,"upstream_local_address":"172.17.0.58:35899",
"downstream_remote_address":"172.17.0.58:46864","x_forwarded_for":null,
"path":"/ip","istio_policy_status":null,
"response_code":200,"upstream_service_time":"1",
"authority":"httpbin:8000","start_time":"2022-05-30T02:09:13.892Z",
"downstream_local_address":"172.17.0.59:80","user_agent":"curl/7.81.0-DEV","response_flags":"-",
"upstream_transport_failure_reason":null,"request_id":"2b2ab6cc-78da-95c0-b278-5b3e30b514a0",
"protocol":"HTTP/1.1","requested_server_name":null,"duration":1,"bytes_sent":30,"route_name":"default",
"upstream_cluster":"inbound|80||","upstream_host":"172.17.0.59:80","method":"GET"}

As shown:

  • downstream_remote_address: 172.17.0.58:46864 ## address of sleep
  • downstream_local_address: 172.17.0.59:80 ## destination address that sleep accessed
  • upstream_local_address: 172.17.0.58:35899 ## local address of the httpbin envoy's connection to httpbin (sleep's IP)
  • upstream_host: 172.17.0.59:80 ## destination address the httpbin envoy connects to

The local address of the httpbin envoy's connection to httpbin is sleep's IP address.
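The same check can be automated when reading many log lines. The Python sketch below assumes the JSON access log format shown above and IPv4 addresses; the helpers host_of and source_ip_preserved are hypothetical names. The second entry is constructed for illustration from the earlier non-TPROXY socket output, where envoy connected from 127.0.0.6.

```python
import json

# Address fields copied from the httpbin envoy log above (TPROXY enabled).
LOG_LINE = (
    '{"upstream_local_address":"172.17.0.58:35899",'
    '"downstream_remote_address":"172.17.0.58:46864",'
    '"downstream_local_address":"172.17.0.59:80",'
    '"upstream_host":"172.17.0.59:80"}'
)


def host_of(addr):
    """Return the IP part of an 'ip:port' string (IPv4 only in this sketch)."""
    return addr.rsplit(":", 1)[0]


def source_ip_preserved(entry):
    """True when envoy connected upstream from the downstream peer's IP."""
    return host_of(entry["upstream_local_address"]) == host_of(
        entry["downstream_remote_address"]
    )


entry = json.loads(LOG_LINE)
print(source_ip_preserved(entry))

# Constructed example for the default mode, where envoy connects from 127.0.0.6.
redirect_entry = {
    "upstream_local_address": "127.0.0.6:56043",
    "downstream_remote_address": "172.17.0.58:46864",
}
print(source_ip_preserved(redirect_entry))
```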

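North-south traffic

For north-south traffic, the client first sends the request to the CLB, the CLB forwards it to the ingressgateway, and the ingressgateway forwards it to the backend service. The extra ingressgateway hop in the middle makes obtaining the client source IP even harder.

We access httpbin over TCP: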

apiVersion: v1
kind: Service
metadata:
  name: httpbin
  namespace: foo
  labels:
    app: httpbin
    service: httpbin
spec:
  ports:
  - name: tcp
    port: 8000
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: httpbin-gw
  namespace: foo
spec:
  selector:
    istio: ingressgateway # use istio default controller
  servers:
  - port:
      number: 8000
      name: tcp
      protocol: TCP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
  namespace: foo
spec:
  hosts:
    - "*"
  gateways:
    - httpbin-gw
  tcp:
    - match:
      - port: 8000
      route:
        - destination:
            port:
              number: 8000
            host: httpbin

Access httpbin through the ingressgateway:

# export GATEWAY_URL=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
# curl http://$GATEWAY_URL:8000/ip
{"origin": "172.17.0.54"}

As shown, the address httpbin sees is the ingressgateway's address:

# kubectl -n istio-system get pods -l istio=ingressgateway -o wide
NAME                                    READY   STATUS    RESTARTS   AGE     IP            NODE           NOMINATED NODE   READINESS GATES
istio-ingressgateway-5d5b776b7b-pxc2g   1/1     Running   0          3d15h   172.17.0.54   10.206.2.144   <none>           <none>

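Although we enabled transparent proxying on the httpbin envoy, the ingressgateway cannot pass the client's source address to the httpbin envoy. Envoy's Proxy Protocol implementation solves this problem.

Enable Proxy Protocol support on both the ingressgateway and httpbin through EnvoyFilter: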

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: ingressgw-pp
  namespace: istio-system
spec:
  configPatches:
  - applyTo: CLUSTER
    patch:
      operation: MERGE
      value:
        transport_socket:
          name: envoy.transport_sockets.upstream_proxy_protocol
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.transport_sockets.proxy_protocol.v3.ProxyProtocolUpstreamTransport
            config:
              version: V1
            transport_socket:
              name: "envoy.transport_sockets.raw_buffer"
  workloadSelector:
    labels:
      istio: ingressgateway
---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: httpbin-pp
  namespace: foo
spec:
  configPatches:
  - applyTo: LISTENER
    match:
      context: SIDECAR_INBOUND
    patch:
      operation: MERGE
      value:
        listener_filters:
        - name: envoy.filters.listener.proxy_protocol
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.listener.proxy_protocol.v3.ProxyProtocol
        - name: envoy.filters.listener.original_dst
        - name: envoy.filters.listener.original_src
  workloadSelector:
    labels:
      app: httpbin

Access httpbin through the LB again:

# curl http://$GATEWAY_URL:8000/ip
{"origin": "106.52.131.116"}

httpbin now gets the client's source IP.

  • ingressgateway envoy log
{"istio_policy_status":null,"protocol":null,"bytes_sent":262,"downstream_remote_address":"106.52.131.116:6093","start_time":"2022-05-30T03:33:33.759Z",
"upstream_service_time":null,"authority":null,"requested_server_name":null,"user_agent":null,"request_id":null,
"upstream_cluster":"outbound|8000||httpbin.foo.svc.cluster.local","upstream_transport_failure_reason":null,"duration":37,"response_code":0,
"method":null,"downstream_local_address":"172.17.0.54:8000","route_name":null,"upstream_host":"172.17.0.59:80","bytes_received":83,"path":null,
"x_forwarded_for":null,"upstream_local_address":"172.17.0.54:36162","response_flags":"-"}

As shown:

  • downstream_remote_address: 106.52.131.116:6093 ## client source address
  • downstream_local_address: 172.17.0.54:8000
  • upstream_local_address: 172.17.0.54:36162 ## ingressgw local addr
  • upstream_host: 172.17.0.59:80 ## httpbin address

  • httpbin envoy log
{"istio_policy_status":null,"response_flags":"-","protocol":null,"method":null,"upstream_transport_failure_reason":null,"authority":null,"duration":37,
"x_forwarded_for":null,"user_agent":null,"downstream_remote_address":"106.52.131.116:6093","downstream_local_address":"172.17.0.59:80",
"bytes_sent":262,"path":null,"requested_server_name":null,"upstream_service_time":null,"request_id":null,"bytes_received":83,"route_name":null,
"upstream_local_address":"106.52.131.116:34431","upstream_host":"172.17.0.59:80","response_code":0,"start_time":"2022-05-30T03:33:33.759Z","upstream_cluster":"inbound|80||"}

As shown:

  • downstream_remote_address: 106.52.131.116:6093 ## client source address
  • downstream_local_address: 172.17.0.59:80 ## httpbin address
  • upstream_local_address: 106.52.131.116:34431 ## keeps the client IP; the port differs
  • upstream_host: 172.17.0.59:80 ## httpbin address

Notably, the httpbin envoy's upstream_local_address keeps the client's IP, so the source IP that httpbin sees is the client's real IP.

  • Data flow

Analysis of the implementation

TPROXY

For the kernel implementation of TPROXY, see net/netfilter/xt_TPROXY.c.

istio-iptables sets up the following iptables rules, which put a mark on packets:

-A PREROUTING -p tcp -j ISTIO_INBOUND
-A PREROUTING -p tcp -m mark --mark 0x539 -j CONNMARK --save-mark --nfmask 0xffffffff --ctmask 0xffffffff
-A OUTPUT -p tcp -m connmark --mark 0x539 -j CONNMARK --restore-mark --nfmask 0xffffffff --ctmask 0xffffffff
-A ISTIO_DIVERT -j MARK --set-xmark 0x539/0xffffffff
-A ISTIO_DIVERT -j ACCEPT
-A ISTIO_INBOUND -p tcp -m conntrack --ctstate RELATED,ESTABLISHED -j ISTIO_DIVERT
-A ISTIO_INBOUND -p tcp -j ISTIO_TPROXY
-A ISTIO_TPROXY ! -d 127.0.0.1/32 -p tcp -j TPROXY --on-port 15006 --on-ip 0.0.0.0 --tproxy-mark 0x539/0xffffffff

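It is worth noting that TPROXY does not depend on NAT; it can redirect packets by itself. In addition, policy routing sends the non-local packets through the local lo device: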

# ip rule list
0:    from all lookup local 
32765:    from all fwmark 0x539 lookup 133 
32766:    from all lookup main 
32767:    from all lookup default 

# ip route show table 133
local default dev lo scope host

For a more detailed introduction to TPROXY, see the kernel documentation listed in the references.
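How the mark and the routing rules work together can be sketched with a toy lookup in Python. This is a simplified model, not the kernel's algorithm: it assumes the local table only resolves local destinations, that the main table has a default route, and that the default table is empty. The lookup function is hypothetical.

```python
TPROXY_MARK = 0x539  # the mark set by the iptables rules above (1337 in decimal)

# (priority, required fwmark or None, table), in the order printed by `ip rule list`.
RULES = [
    (0, None, "local"),
    (32765, 0x539, "133"),
    (32766, None, "main"),
    (32767, None, "default"),
]


def lookup(dst_is_local, fwmark):
    """Return the name of the first table that resolves the packet."""
    table_has_route = {
        "local": dst_is_local,  # only routes for addresses owned by this host
        "133": True,            # `local default dev lo` matches every destination
        "main": True,           # assumption: main has a default route
        "default": False,       # assumption: empty
    }
    for _priority, mark, table in RULES:
        if mark is not None and mark != fwmark:
            continue  # the rule's fwmark selector does not match
        if table_has_route[table]:
            return table
    return None


# A marked packet for a non-local address is delivered locally via table 133.
print(lookup(dst_is_local=False, fwmark=TPROXY_MARK))
# The same packet without the mark would be routed out through main.
print(lookup(dst_is_local=False, fwmark=0))
```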

Proxy Protocol implementation in Envoy

  • proxy protocol header format

Version 1 (human-readable header format) is used here, as follows:

0000   50 52 4f 58 59 20 54 43 50 34 20 31 30 36 2e 35   PROXY TCP4 106.5
0010   32 2e 31 33 31 2e 31 31 36 20 31 37 32 2e 31 37   2.131.116 172.17
0020   2e 30 2e 35 34 20 36 30 39 33 20 38 30 30 30 0d   .0.54 6093 8000.
0030   0a                                                .

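As shown, the header contains the IP:PORT of the client and of the ingressgateway. For more details, see the Haproxy PROXY protocol document listed in the references.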
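The dump above can be decoded with a few lines of Python. This is a minimal sketch of a version 1 parser for TCP4/TCP6 only; parse_proxy_v1 is a hypothetical helper and does not handle the UNKNOWN family or the binary version 2 format.

```python
# Hex bytes copied from the capture above.
HEX_DUMP = """
50 52 4f 58 59 20 54 43 50 34 20 31 30 36 2e 35
32 2e 31 33 31 2e 31 31 36 20 31 37 32 2e 31 37
2e 30 2e 35 34 20 36 30 39 33 20 38 30 30 30 0d
0a
"""


def parse_proxy_v1(raw):
    """Split a PROXY v1 header into ((src_ip, src_port), (dst_ip, dst_port))."""
    if not raw.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    fields = raw[:-2].decode("ascii").split(" ")
    if len(fields) != 6 or fields[0] != "PROXY" or fields[1] not in ("TCP4", "TCP6"):
        raise ValueError("not a PROXY v1 TCP header")
    _, _, src_ip, dst_ip, src_port, dst_port = fields
    return (src_ip, int(src_port)), (dst_ip, int(dst_port))


raw = bytes.fromhex("".join(HEX_DUMP.split()))
src, dst = parse_proxy_v1(raw)
print(src, dst)
```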

  • ProxyProtocolUpstreamTransport

As the sender, the ingressgateway uses ProxyProtocolUpstreamTransport to build the Proxy Protocol header:

/// source/extensions/transport_sockets/proxy_protocol/proxy_protocol.cc

void UpstreamProxyProtocolSocket::generateHeaderV1() {
  // Default to local addresses (used if no downstream connection exists e.g. health checks)
  auto src_addr = callbacks_->connection().addressProvider().localAddress();
  auto dst_addr = callbacks_->connection().addressProvider().remoteAddress();

  if (options_ && options_->proxyProtocolOptions().has_value()) {
    const auto options = options_->proxyProtocolOptions().value();
    src_addr = options.src_addr_;
    dst_addr = options.dst_addr_;
  }

  Common::ProxyProtocol::generateV1Header(*src_addr->ip(), *dst_addr->ip(), header_buffer_);
}
  • envoy.filters.listener.proxy_protocol

As the receiver, the httpbin envoy configures the ListenerFilter (envoy.filters.listener.proxy_protocol) to parse the Proxy Protocol header:

/// source/extensions/filters/listener/proxy_protocol/proxy_protocol.cc

ReadOrParseState Filter::onReadWorker() {
  Network::ConnectionSocket& socket = cb_->socket(); /// ConnectionHandlerImpl::ActiveTcpSocket
...
  if (proxy_protocol_header_.has_value() && !proxy_protocol_header_.value().local_command_) {
...
    // Only set the local address if it really changed, and mark it as address being restored.
    if (*proxy_protocol_header_.value().local_address_ !=
        *socket.addressProvider().localAddress()) { /// proxy protocol header: 172.17.0.54:8000
      socket.addressProvider().restoreLocalAddress(proxy_protocol_header_.value().local_address_); /// => 172.17.0.54:8000
    } /// Network::ConnectionSocket
    socket.addressProvider().setRemoteAddress(proxy_protocol_header_.value().remote_address_); /// set downstream_remote_address to 106.52.131.116
  }

  // Release the file event so that we do not interfere with the connection read events.
  socket.ioHandle().resetFileEvents();
  cb_->continueFilterChain(true); /// ConnectionHandlerImpl::ActiveTcpSocket
  return ReadOrParseState::Done;
}

Note that when envoy.filters.listener.proxy_protocol parses the proxy protocol header, local_address is the sender's dst_addr (172.17.0.54:8000) and remote_address is the sender's src_addr (106.52.131.116). The order is exactly reversed.

After proxy_protocol processing, the connection's downstream_remote_address is changed to the client's source address.
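The address handling in the excerpt above can be mirrored in a few lines of Python. This is only a model of the logic shown, with hypothetical names (ConnAddresses, apply_proxy_header); the starting addresses are taken from the logs and the header values from the capture earlier in this article.

```python
from dataclasses import dataclass


@dataclass
class ConnAddresses:
    local: tuple    # (ip, port) the connection was accepted on
    remote: tuple   # (ip, port) of the TCP peer
    local_restored: bool = False


def apply_proxy_header(conn, header_src, header_dst):
    """Apply a parsed PROXY header the way the excerpt above does."""
    # The header's destination becomes the local address, only if it differs.
    if header_dst != conn.local:
        conn.local = header_dst
        conn.local_restored = True
    # The header's source always becomes the remote address.
    conn.remote = header_src
    return conn


# Socket accepted by the httpbin envoy from the ingressgateway.
conn = ConnAddresses(local=("172.17.0.59", 80), remote=("172.17.0.54", 36162))
apply_proxy_header(conn, ("106.52.131.116", 6093), ("172.17.0.54", 8000))
print(conn)
```

The httpbin envoy log above still reports 172.17.0.59:80 as downstream_local_address, which suggests that a later step in the listener filter chain sets the local address again; the sketch stops at the proxy_protocol step.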

  • envoy.filters.listener.original_src

With sidecar.istio.io/interceptionMode: TPROXY, the virtualInbound listener adds envoy.filters.listener.original_src:

# istioctl -n foo pc listeners deploy/httpbin --port 15006 -o json
[
    {
        "name": "virtualInbound",
        "address": {
            "socketAddress": {
                "address": "0.0.0.0",
                "portValue": 15006
            }
        },
        "filterChains": [...],
        "listenerFilters": [
            {
                "name": "envoy.filters.listener.original_dst",
                "typedConfig": {"@type": "type.googleapis.com/envoy.extensions.filters.listener.original_dst.v3.OriginalDst"}
            },
            {
                "name": "envoy.filters.listener.original_src",
                "typedConfig": {
                    "@type": "type.googleapis.com/envoy.extensions.filters.listener.original_src.v3.OriginalSrc",
                    "mark": 1337
                }
            }
        ...
        ]
        "listenerFiltersTimeout": "0s",
        "continueOnListenerFiltersTimeout": true,
        "transparent": true,
        "trafficDirection": "INBOUND",
        "accessLog": [...]
    }
]

envoy.filters.listener.original_src uses socket options to set upstream_local_address to the downstream_remote_address, which passes the client IP through.

/// source/extensions/filters/listener/original_src/original_src.cc

Network::FilterStatus OriginalSrcFilter::onAccept(Network::ListenerFilterCallbacks& cb) {
  auto& socket = cb.socket(); /// ConnectionHandlerImpl::ActiveTcpSocket.socket()
  auto address = socket.addressProvider().remoteAddress();   /// get downstream_remote_address
  ASSERT(address);

  ENVOY_LOG(debug,
            "Got a new connection in the original_src filter for address {}. Marking with {}",
            address->asString(), config_.mark());

...
  auto options_to_add =
      Filters::Common::OriginalSrc::buildOriginalSrcOptions(std::move(address), config_.mark()); 
  socket.addOptions(std::move(options_to_add)); /// Network::Socket::Options
  return Network::FilterStatus::Continue;
}
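What such socket options could look like at the system call level is sketched below in Python. This is an assumption about the effect, not a transcription of Envoy's code: it assumes the options amount to IP_TRANSPARENT, an optional SO_MARK, and a bind to the client's IP with port 0, which would match the log above where the client IP is kept but the port differs. plan_original_src and open_upstream are hypothetical helpers, and open_upstream only works on Linux with CAP_NET_ADMIN.

```python
import socket

# Linux values; getattr falls back to the numbers from the kernel headers.
IP_TRANSPARENT = getattr(socket, "IP_TRANSPARENT", 19)
SO_MARK = getattr(socket, "SO_MARK", 36)


def plan_original_src(client_addr, mark):
    """Return (socket options, bind address) for the upstream socket."""
    client_ip, _client_port = client_addr
    options = [(socket.SOL_IP, IP_TRANSPARENT, 1)]  # allow binding a non-local IP
    if mark:
        options.append((socket.SOL_SOCKET, SO_MARK, mark))
    # Port 0 lets the kernel choose a fresh source port.
    return options, (client_ip, 0)


def open_upstream(client_addr, mark, upstream_addr):
    """Connect to upstream_addr from the client's IP (needs CAP_NET_ADMIN)."""
    options, bind_addr = plan_original_src(client_addr, mark)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    for level, name, value in options:
        sock.setsockopt(level, name, value)
    sock.bind(bind_addr)
    sock.connect(upstream_addr)
    return sock


options, bind_addr = plan_original_src(("106.52.131.116", 6093), 1337)
print(options, bind_addr)
```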
  • envoy.filters.listener.original_dst

In addition, the httpbin envoy, as the receiver for the ingressgateway, also has the ListenerFilter envoy.filters.listener.original_dst configured on its virtualInbound listener. Let's look at what it does.

// source/extensions/filters/listener/original_dst/original_dst.cc

Network::FilterStatus OriginalDstFilter::onAccept(Network::ListenerFilterCallbacks& cb) {
  ENVOY_LOG(debug, "original_dst: New connection accepted");
  Network::ConnectionSocket& socket = cb.socket();

  if (socket.addressType() == Network::Address::Type::Ip) { /// socket SO_ORIGINAL_DST option
    Network::Address::InstanceConstSharedPtr original_local_address = getOriginalDst(socket); /// origin dst address

    // A listener that has the use_original_dst flag set to true can still receive
    // connections that are NOT redirected using iptables. If a connection was not redirected,
    // the address returned by getOriginalDst() matches the local address of the new socket.
    // In this case the listener handles the connection directly and does not hand it off.
    if (original_local_address) { /// change local address to origin dst address
      // Restore the local address to the original one.
      socket.addressProvider().restoreLocalAddress(original_local_address);
    }
  }

  return Network::FilterStatus::Continue;
}

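In istio, iptables intercepts the original request and redirects it to port 15006 (inbound requests) or 15001 (outbound requests), so the local address of the socket handling the request is not the request's original dst address. The original_dst ListenerFilter is responsible for changing the socket's local address to the original dst address.

For the virtualOutbound listener, envoy.filters.listener.original_dst is not added directly; instead use_original_dst is set to true, and envoy then adds envoy.filters.listener.original_dst automatically. The virtualOutbound listener also hands the request off to the listener associated with the request's original destination address.

For the virtualInbound listener, envoy.filters.listener.original_dst is added directly. Unlike the virtualOutbound listener, it only changes the address to the original dst address and does not hand the request off to a matching listener (for inbound requests there is no listener for the dst address). Inbound requests are in fact handled by the FilterChain.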
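For reference, the SO_ORIGINAL_DST socket option mentioned in the code comment above can be queried from Python as well. The sketch below assumes IPv4 and a connection redirected by iptables NAT; the option value 80 comes from linux/netfilter_ipv4.h, and original_dst and decode_sockaddr_in are hypothetical helpers. Only the decoder is exercised here, on a hand-built address.

```python
import socket
import struct

SO_ORIGINAL_DST = 80  # from linux/netfilter_ipv4.h; not exported by the socket module


def decode_sockaddr_in(raw):
    """Decode a 16-byte struct sockaddr_in into (ip, port)."""
    # Skip the 2-byte family, read the port and address in network byte order.
    port, packed_ip = struct.unpack("!2xH4s8x", raw[:16])
    return socket.inet_ntoa(packed_ip), port


def original_dst(conn):
    """Ask the kernel for the pre-redirect destination of an accepted socket."""
    raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
    return decode_sockaddr_in(raw)


# A hand-built sockaddr_in for 172.17.0.59:80.
sample = (
    struct.pack("=H", socket.AF_INET)
    + struct.pack("!H4s", 80, socket.inet_aton("172.17.0.59"))
    + bytes(8)
)
print(decode_sockaddr_in(sample))
```

In TPROXY mode no NAT rewrite takes place, so the accepted socket's own local address is already the original destination; the getsockopt call above is relevant to NAT-based redirection.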

See the istio code that generates the virtualInbound listener:

// istio/istio/pilot/pkg/networking/core/v1alpha3/listener_builder.go

func (lb *ListenerBuilder) aggregateVirtualInboundListener(passthroughInspectors map[int]enabledInspector) *ListenerBuilder {
    // Deprecated by envoyproxy. Replaced
    // 1. filter chains in this listener
    // 2. explicit original_dst listener filter
    // UseOriginalDst: proto.BoolTrue,
    lb.virtualInboundListener.UseOriginalDst = nil
    lb.virtualInboundListener.ListenerFilters = append(lb.virtualInboundListener.ListenerFilters,
        xdsfilters.OriginalDestination, /// add envoy.filters.listener.original_dst
    )
    if lb.node.GetInterceptionMode() == model.InterceptionTproxy { /// TPROXY mode
        lb.virtualInboundListener.ListenerFilters =
            append(lb.virtualInboundListener.ListenerFilters, xdsfilters.OriginalSrc)
    }
...

Summary

With TPROXY and Proxy Protocol, we can preserve the client source IP for layer-4 protocols in istio.

References

  • istio doc: Configuring Gateway Network Topology
  • IP Transparency and Direct Server Return with NGINX and NGINX Plus as Transparent Proxy
  • Kernel doc: Transparent proxy support
  • Haproxy doc: The PROXY protocol
  • Envoy doc: IP Transparency
  • [IstioCon 2021] How to Preserve the Source Address in Istio?
