A Deep Dive into eBPF: Implementing Efficient DNS Monitoring

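eBPF makes it possible to extend Linux kernel mechanisms flexibly. Using a DNS monitoring tool as a worked example, this article shows how to develop a practical eBPF application. Original: A Deep Dive into eBPF: Writing an Efficient DNS Monitoring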

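eBPF is a virtual machine built into the kernel. It provides a high-level library, an instruction set, and an execution environment inside the Linux kernel, and it is used in many kernel subsystems, especially networking, tracing, debugging, and security. It can both change how the kernel processes packets and be used to program network devices such as SmartNICs.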

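There are already plenty of introductory articles about eBPF in various languages, so this one will not go deep into eBPF internals. Although many of those articles are informative, none of them answers the most important question: how does eBPF process packets and monitor the packets that a host sends to its users? This article builds a practical application from scratch and gradually extends it to monitor DNS requests, responses, and the processes behind them, with explanations, comments, and links to the source code along the way. Because we want to show several examples rather than a solution to a single problem, we will occasionally go off on a tangent. The hope is that people who want to get familiar with eBPF can spend less time hunting for useful material and start programming sooner.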

Introduction

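Suppose a host is allowed to send legitimate DNS requests, but the IP address it sends them to is unknown. The network filter logs show requests arriving continuously, and it is not clear whether they are legitimate or whether information has already leaked to an attacker. It would be easy if we knew the domain of the server the data is sent to. Unfortunately, PTR records have fallen out of fashion, and SecurityTrails shows either nothing at all or far too much junk for this IP.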

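We could run tcpdump, but who wants to stare at a monitor all day? And what if there are several servers? The ELK stack has packetbeat, a monster that can eat all the CPU capacity on a server. Osquery is also a good tool; it knows a lot about network connections but nothing about DNS queries, and the related support is no longer provided. Zeek is a tool I came across while looking for ways to track DNS queries. It looks decent, but two things put me off: it monitors more than just DNS, which means resources are spent on work I do not need (although the protocols can perhaps be selected in its settings), and it does not know which process sent a request.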

We will write the code in Python and start with the simplest part, to understand how Python interacts with eBPF. First, install these packages:

# apt install python3-bpfcc bpfcc-tools libbpfcc linux-headers-$(uname -r)

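This is the command for Ubuntu, but if you are willing to dig into the kernel, finding the necessary packages for other distributions should not be a problem. Now let's get started: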

#!/usr/bin/env python3
from bcc import BPF
FIRST_BPF = r"""
int first(void *ctx) {
    bpf_trace_printk("Hello world! execve() is calling\n");
    return 0;
}
"""
bpf = BPF(text=FIRST_BPF)
bpf.attach_kprobe(event=bpf.get_syscall_fnname("execve"), fn_name="first")
while True:
    try:
        (_, _, _, _, _, event_b) = bpf.trace_fields()
        events = event_b.decode('utf8')
        if 'Hello world' in events:
            print(events)
    except ValueError:
        continue
    except KeyboardInterrupt:
        break

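Note: In Ubuntu 20.04 LTS and 18.04 LTS, unprivileged users are allowed to load eBPF programs by default, but in more recent Ubuntu releases (21.10 and 22.04 LTS) this is disabled by default for security reasons. It can be re-enabled with the following command: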

$ sudo sysctl kernel.unprivileged_bpf_disabled=0

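Like every hello-world example, it does nothing useful; it only introduces the basics. Whenever any program on the host calls the execve() system call, the first() function is executed. To trigger it, run a command such as ls, cat, grep, or clear (or anything else that involves execve()) in another console, and our code will run. eBPF programs can also be invoked on various other kernel events; attach_kprobe() means the program is triggered when a particular kernel function is called. But we are more used to dealing with system calls, and who knows the names of the corresponding kernel functions? That is what the helper get_syscall_fnname() is for: it converts a system call name into the kernel function name.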

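The simplest output option in eBPF is the bpf_trace_printk() function, but it is meant for debugging output only. Everything passed to this function can be read from the file /sys/kernel/debug/tracing/trace_pipe. To avoid reading this file in another console, we use trace_fields(), which reads the file and hands its contents to our program.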
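To make the fields that come out of the trace pipe more concrete, here is a minimal sketch that parses one trace_pipe-style line with a regular expression. The sample line and the parse_trace_line() helper are assumptions made for illustration: the exact column layout of trace_pipe varies between kernel versions, and this is not BCC's own implementation.

```python
import re

# Pattern for one trace_pipe-style line (assumed layout, see the note above).
TRACE_LINE = re.compile(
    r"^\s*(?P<task>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+(?P<flags>\S+)\s+"
    r"(?P<ts>\d+\.\d+):\s+(?P<func>[^:]+):\s(?P<msg>.*)$"
)


def parse_trace_line(line):
    """Return the fields of a trace line as a dict, or None if it does not match."""
    match = TRACE_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])
    fields["cpu"] = int(fields["cpu"])
    fields["ts"] = float(fields["ts"])
    return fields


# Hypothetical line, shaped like the output of the hello-world program.
sample = ("              ls-12345   [002] d...  1234.567890: "
          "bpf_trace_printk: Hello world! execve() is calling")
event = parse_trace_line(sample)
if event and "Hello world" in event["msg"]:
    print(event["task"], event["pid"], event["msg"])
```

The task name, PID, CPU, flags, timestamp, and message correspond to the six values that the hello-world loop unpacks from trace_fields().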

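The rest of the code is straightforward: in an infinite loop that can be interrupted with Ctrl-C, we read the debug output and, if the string "Hello world" appears in it, print it in full.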

Note: bpf_trace_printk() supports printf()-style formatted text, but with important restrictions: it takes no more than 3 arguments, and only one of them may be %s.

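Now that we have a rough idea of how to use eBPF, let's start building a practical application that monitors all DNS requests and responses and records who asked what and which answer came back.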

Getting Started

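We start with eBPF. The simplest way to process packets is to attach a program to a network socket; every packet will then trigger our program. We will explain in detail how this is done later, but for now we need to pick out the UDP packets on port 53 from all the traffic. To do that, we have to take the packet structure apart ourselves and separate all the nested protocols in C. The cursor_advance macro helps here: it moves a cursor (pointer) within the packet, returning its current position and advancing it by the specified amount: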

#include <linux/if_ether.h>
#include <linux/in.h>
#include <bcc/proto.h>
int dns_matching(struct __sk_buff *skb) {
    u8 *cursor = 0;
    // Checking the IP protocol:
    struct ethernet_t *ethernet = cursor_advance(cursor, sizeof(*ethernet));
    if (ethernet->type == ETH_P_IP) {…

The ethernet_t structure is described in proto.h:

struct ethernet_t {
  unsigned long long  dst:48;
  unsigned long long  src:48;
  unsigned int        type:16;
} BPF_PACKET_HEADER;

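The Ethernet frame format itself is very simple: 6 bytes (48 bits) of destination address, a source address of the same size, and then two bytes (16 bits) for the payload type.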

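The payload type is encoded by the constant ETH_P_IP, equal to 0x0800 and defined in if_ether.h; checking it ensures that the next-layer protocol is IP (this code and the other possible values are specified by the IEEE).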
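The same 14-byte layout can be checked from user space. The following is a minimal sketch, not part of the original tool: parse_ethernet_header() is a hypothetical helper, and the sample frame uses made-up MAC addresses.

```python
import struct

ETH_P_IP = 0x0800  # EtherType for IPv4, same value as in linux/if_ether.h


def parse_ethernet_header(frame):
    """Split the 14-byte Ethernet header into (dst, src, ethertype)."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # 6 bytes destination, 6 bytes source, 2 bytes type, network byte order.
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return dst.hex(":"), src.hex(":"), ethertype


# Hypothetical frame: made-up MAC addresses followed by the IPv4 EtherType.
sample = bytes.fromhex("ffffffffffff" "001122334455" "0800") + b"payload"
dst, src, ethertype = parse_ethernet_header(sample)
print(dst, src, hex(ethertype), ethertype == ETH_P_IP)
```

bytes.hex() with a separator needs Python 3.8 or newer.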

Next we check whether the IP packet carries UDP on port 53:

// Checking the UDP protocol:
struct ip_t *ip = cursor_advance(cursor, sizeof(*ip));
if (ip->nextp == IPPROTO_UDP) {
    // Checking port 53:
    struct udp_t *udp = cursor_advance(cursor, sizeof(*udp));
    if (udp->dport == 53) {
        // Request
        return -1;
    }
    if (udp->sport == 53) {
        // Response
        return -1;
    }
}

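ip_t and udp_t are likewise defined in proto.h, while IPPROTO_UDP comes from in.h. Strictly speaking, this example is not entirely correct. The IP header is already somewhat more complex: it has optional fields, so the header length can vary. The right approach is to read the length from the header first and then apply the offset, but we are only just getting started and do not need to make things too complicated.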
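For comparison, here is a minimal user-space sketch of the "read the length first, then offset" approach. It is an illustration under stated assumptions rather than the code used in this article: udp_ports() and build_frame() are hypothetical helpers, and the addresses and ports are made up.

```python
import struct

ETH_HLEN = 14
IPPROTO_UDP = 17


def udp_ports(frame):
    """Return (sport, dport) of a UDP-over-IPv4 frame, or None."""
    ip = frame[ETH_HLEN:]
    if len(ip) < 20:
        return None
    version_ihl = ip[0]
    ihl_bytes = (version_ihl & 0x0F) * 4  # IHL counts 32-bit words
    if version_ihl >> 4 != 4 or ihl_bytes < 20 or ip[9] != IPPROTO_UDP:
        return None
    udp = ip[ihl_bytes:]  # skip the real header length, options included
    if len(udp) < 8:
        return None
    sport, dport = struct.unpack("!HH", udp[:4])
    return sport, dport


def build_frame(ip_options=b""):
    """Build a test frame; ip_options must be a multiple of 4 bytes."""
    ihl = 5 + len(ip_options) // 4
    ip_header = struct.pack(
        "!BBHHHBBH4s4s", (4 << 4) | ihl, 0, 0, 0, 0, 64, IPPROTO_UDP, 0,
        bytes([192, 168, 44, 3]), bytes([1, 1, 1, 1]))
    udp_header = struct.pack("!HHHH", 57915, 53, 8, 0)
    return b"\x00" * ETH_HLEN + ip_header + ip_options + udp_header


print(udp_ports(build_frame()))             # header without options
print(udp_ports(build_frame(b"\x01" * 4)))  # header with 4 bytes of options
```

Both calls find the same ports, whereas a parser that assumes a fixed 20-byte IP header would read the option bytes as the UDP header in the second case.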

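That makes it easy to find DNS packets; next we need to analyze their structure. To keep things simple, we hand the packet over to user space (hence the return value -1; a return code of 0 means the packet does not need to be copied).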

Back in Python, we again begin by attaching the program to a socket:

#!/usr/bin/env python3
import dnslib
import sys
from bcc import BPF
...
bpf = BPF(text=BPF_PROGRAM)
function_dns_matching = bpf.load_func("dns_matching", BPF.SOCKET_FILTER)
BPF.attach_raw_socket(function_dns_matching, '')

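Unlike the previous example, the program is now invoked for every packet rather than on a function call. The empty argument to attach_raw_socket means "all network interfaces"; to monitor a specific interface, pass its name instead.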

Put the socket into blocking mode:

import fcntl
import os
socket_fd = function_dns_matching.sock
fl = fcntl.fcntl(socket_fd, fcntl.F_GETFL)
fcntl.fcntl(socket_fd, fcntl.F_SETFL, fl & ~os.O_NONBLOCK)

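The rest is simple: in a similar infinite loop, read data from the socket, strip all the headers to get to the DNS packet itself, and decode it.

The complete code looks like this: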

#!/usr/bin/env python3

import dnslib
import fcntl
import os
import sys

from bcc import BPF

BPF_APP = r'''
#include <linux/if_ether.h>
#include <linux/in.h>
#include <bcc/proto.h>
int dns_matching(struct __sk_buff *skb) {
    u8 *cursor = 0;
    // Checking the IP protocol:
    struct ethernet_t *ethernet = cursor_advance(cursor, sizeof(*ethernet));
    if (ethernet->type == ETH_P_IP) {
        // Checking the UDP protocol:
        struct ip_t *ip = cursor_advance(cursor, sizeof(*ip));
        if (ip->nextp == IPPROTO_UDP) {
            // Checking port 53:
            struct udp_t *udp = cursor_advance(cursor, sizeof(*udp));
            if (udp->dport == 53 || udp->sport == 53) {
                return -1;
            }
        }
    }
    return 0;
}
'''


bpf = BPF(text=BPF_APP)
function_dns_matching = bpf.load_func("dns_matching", BPF.SOCKET_FILTER)
BPF.attach_raw_socket(function_dns_matching, '')

socket_fd = function_dns_matching.sock
fl = fcntl.fcntl(socket_fd, fcntl.F_GETFL)
fcntl.fcntl(socket_fd, fcntl.F_SETFL, fl & ~os.O_NONBLOCK)

while True:
    try:
        packet_str = os.read(socket_fd, 2048)
    except KeyboardInterrupt:
        sys.exit(0)

    packet_bytearray = bytearray(packet_str)

    ETH_HLEN = 14
    UDP_HLEN = 8

    # IP header length
    ip_header_length = packet_bytearray[ETH_HLEN]
    ip_header_length = ip_header_length & 0x0F
    ip_header_length = ip_header_length << 2

    # Starting the DNS packet
    payload_offset = ETH_HLEN + ip_header_length + UDP_HLEN

    payload = packet_bytearray[payload_offset:]

    dnsrec = dnslib.DNSRecord.parse(payload)

    # If it’s the response:
    if dnsrec.rr:
        print(f'Resp: {dnsrec.rr[0].rname} {dnslib.QTYPE.get(dnsrec.rr[0].rtype)} {", ".join([repr(dnsrec.rr[i].rdata) for i in range(0, len(dnsrec.rr))])}')
    # If it’s the request:
    else:
        print(f'Request: {dnsrec.questions[0].qname} {dnslib.QTYPE.get(dnsrec.questions[0].qtype)}')

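This example shows which DNS requests and responses pass through our network interfaces, but it still does not tell us which process is behind them. In other words, we only get limited information, and that lack of information is exactly why I did not choose Zeek.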

From Packets to Processes

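To get information about the process in eBPF, the following functions are available: bpf_get_current_pid_tgid(), bpf_get_current_uid_gid(), and bpf_get_current_comm(char *buf, int size_of_buf). They can be used when the program is bound to a kernel function call (as in the first example). UID/GID should be clear enough, but for those who have not dealt with the details of kernel operation before, the rest needs some explanation. What the kernel treats as the PID is shown in user space as the thread ID of the process, and what user space calls the PID is the thread group ID (TGID) in the kernel. Similarly, bpf_get_current_comm() returns not the usual process name (the one shown by ps) but the thread name.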

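Fine, so we can get the process data. But how do we pass it to user space? That is what tables are for: one is created with BPF_PERF_OUTPUT(event), data is submitted with event.perf_submit(ctx, data, data_size), and it is received by polling with b.perf_buffer_poll(). After that, whenever data is available, the callback() function registered with b["event"].open_perf_buffer(callback) is called.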

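This mechanism is described in detail below; for now, let's continue with the theory. We can transfer the data, and we can also transfer the packet itself. To do that, however, we have to choose a variable of a specific length for the transferred data. Which length? The obvious answer is 512 bytes, but that is not correct: it does not account for EDNS, and we also want to track DNS over TCP properly. We would therefore have to reserve a lot of space, larger packets would still be dropped, and in most cases we would allocate more memory than needed. I do not like this approach. Fortunately, there is another one: perf_submit_skb(). Besides the data, it also transfers a specified number of bytes of the packet from the buffer. Note, however, that this method is only available to eBPF networking programs (socket filters, XDP), and those programs cannot obtain information about the process.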

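Fortunately, several eBPF programs can be used together and exchange data with each other! This is also done through tables. The declaration looks like this: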

BPF_TABLE_PUBLIC("hash", key, val, name, max_elements);

This makes the table available to other eBPF programs. Another program accesses it as follows:

BPF_TABLE("extern", key, val, name, max_elements);

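So that no packets are lost, the key is built from the 5-tuple (protocol, source address, source port, destination address, and destination port) and has the following structure: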

struct port_key {
    u8 proto;
    u32 saddr;
    u32 daddr;
    u16 sport;
    u16 dport;
};

The value is everything we want to know about the process:

struct port_val {
    u32 ifindex;
    u32 pid;
    u32 tgid;
    u32 uid;
    u32 gid;
    char comm[64];
};

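ifindex is the network device index; it will be filled in by the other program, the one running on the socket. It is included here so that the whole structure can later be passed to user space.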
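Once this structure reaches user space, Python needs a matching layout to read it. Below is a minimal sketch using ctypes; the PortVal class name and the sample bytes are assumptions made for illustration, with made-up field values.

```python
import ctypes as ct
import sys


class PortVal(ct.Structure):
    """User-space mirror of the C struct port_val shown above."""
    _fields_ = [
        ("ifindex", ct.c_uint32),
        ("pid", ct.c_uint32),
        ("tgid", ct.c_uint32),
        ("uid", ct.c_uint32),
        ("gid", ct.c_uint32),
        ("comm", ct.c_char * 64),
    ]


# Hypothetical raw bytes, laid out the way the structure sits in memory.
raw = b"".join(n.to_bytes(4, sys.byteorder) for n in (2, 10738, 10739, 0, 0))
raw += b"dig".ljust(64, b"\x00")

val = PortVal.from_buffer_copy(raw)
print(ct.sizeof(PortVal), val.ifindex, val.pid, val.tgid, val.comm.decode())
```

Five 32-bit fields plus the 64-byte name give 84 bytes with no padding, so whatever follows those 84 bytes in a received event belongs to the packet.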

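To summarize: when the kernel function that sends a packet is called, we store information about the process involved. When a packet shows up on a network interface (outgoing or incoming), we check whether we have any information for a packet between these endpoints over this protocol. If we do, we pass it to Python together with the packet, and the rest of the work is done there.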

Good, we have covered the basic logic of the program. Let's start coding!

My Name Is Process

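We start by getting information about the relevant process. The functions udp_sendmsg() and tcp_sendmsg() are used to send packets, and both take a sock structure as their first argument. In eBPF there are two ways to access the arguments of the traced function: declare them as parameters of our function, or use the PT_REGS_PARMx macros, where x is the argument number. Both options are shown below. This is the first program, C_BPF_KPROBE: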

#include <net/sock.h>

// The structure that will be used as the key for
// eBPF table 'proc_ports':
struct port_key {
    u8 proto;
    u32 saddr;
    u32 daddr;
    u16 sport;
    u16 dport;
};
// The structure that will be stored in the eBPF table 'proc_ports' 
// contains information about the process:
struct port_val {
    u32 ifindex;
    u32 pid;
    u32 tgid;
    u32 uid;
    u32 gid;
    char comm[64];
};
// Public (accessible from other eBPF programs) eBPF table in which 
// information about the process is written. 
// It's read when a packet appears on the socket:
BPF_TABLE_PUBLIC("hash", struct port_key, struct port_val, proc_ports, 20480);
// These are two ways to get access to the function arguments:
// int trace_udp_sendmsg(struct pt_regs *ctx) {
//     struct sock *sk = (struct sock *)PT_REGS_PARM1(ctx);
int trace_udp_sendmsg(struct pt_regs *ctx, struct sock *sk) {
    u16 sport = sk->sk_num;
    u16 dport = sk->sk_dport;
  
    // Processing packets only on port 53.
    // 13568 = ntohs(53);
    if (sport == 13568 || dport == 13568) {
        // Preparing the data:
        u32 saddr = sk->sk_rcv_saddr;
        u32 daddr = sk->sk_daddr;
        u64 pid_tgid = bpf_get_current_pid_tgid();
        u64 uid_gid = bpf_get_current_uid_gid();
        // Forming the key structure.
        // These strange transformations will be explained below.
        struct port_key key = {.proto = 17};
        key.saddr = htonl(saddr);
        key.daddr = htonl(daddr);
        key.sport = sport;
        key.dport = htons(dport);
        // Forming a structure with the process properties:
        struct port_val val = {};
        val.pid = pid_tgid >> 32;
        val.tgid = (u32)pid_tgid;
        val.uid = (u32)uid_gid;
        val.gid = uid_gid >> 32;
        bpf_get_current_comm(val.comm, 64);
        //Writing the value into the eBPF table:
        proc_ports.update(&key, &val);
    }
    return 0;
}

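The tcp_sendmsg version is exactly the same; the only difference is that the proto field of port_key is set to 6. The two values (17 and 6) are the protocol numbers of UDP and TCP respectively, and they can be looked up in /etc/protocols.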

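Both bpf_get_current_* functions return 64 bits, so we take the upper and lower 32 bits separately to extract the data. For PID/TGID we can store them in the familiar form right away (for example, the pid field receives the upper 32 bits, which hold what the kernel regards as the TGID).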
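The bit arithmetic can be tried out in plain Python. This is a minimal sketch with made-up values; split_u64() is a hypothetical helper, and the packing follows the helpers' documented layout (kernel TGID and GID in the upper halves, kernel PID and UID in the lower halves).

```python
def split_u64(value):
    """Return (upper 32 bits, lower 32 bits) of a 64-bit integer."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


# Hypothetical values, packed the way the two helpers pack them.
pid_tgid = (10738 << 32) | 10739  # upper: kernel TGID, lower: kernel PID
uid_gid = (1001 << 32) | 1000     # upper: GID, lower: UID

process_id, thread_id = split_u64(pid_tgid)
gid, uid = split_u64(uid_gid)
print(process_id, thread_id, uid, gid)
```

Here process_id is what user-space tools such as ps call the PID, and thread_id is the ID of the individual thread.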

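Next, let's look at the conversions applied to the key structure. In the next section we will build a similar structure in another program, but there the data will come from eBPF's __sk_buff rather than from the kernel's sock structure, and the data is stored in the following form: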

__u32 remote_ip4; /* Stored in network byte order */
__u32 local_ip4; /* Stored in network byte order */
__u32 remote_port; /* Stored in network byte order */
__u32 local_port; /* stored in host byte order */
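The constant 13568 in the kprobe program is simply port 53 as it looks when a value stored in network (big-endian) byte order is read as an integer on a little-endian machine. The sketch below reproduces that arithmetic; swap16() is a hypothetical helper written for this illustration.

```python
import struct


def swap16(value):
    """Swap the two bytes of a 16-bit integer."""
    return int.from_bytes(value.to_bytes(2, "big"), "little")


port = 53
wire = struct.pack("!H", port)         # network byte order: b"\x00\x35"
as_le = struct.unpack("<H", wire)[0]   # same bytes read as little-endian

print(wire.hex(), as_le, swap16(port))  # 0035 13568 13568
```

This is why fields stored in different byte orders have to be converted before they can be used in the same key.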

Extracting to User Space

Our second program, BPF_SOCK_TEXT, will "hang" on the socket, look up the process information for each packet, and transfer it to user space together with the packet itself:

#include <linux/if_ether.h>
#include <linux/in.h>
#include <bcc/proto.h>

// The structure that will be used as the key for
// eBPF table 'proc_ports':
struct port_key {
    u8 proto;
    u32 saddr;
    u32 daddr;
    u16 sport;
    u16 dport;
};
// The structure that will be stored in the eBPF table 'proc_ports',
// Contains information about the process:
struct port_val {
    u32 ifindex;
    u32 pid;
    u32 tgid;
    u32 uid;
    u32 gid;
    char comm[64];
};
// eBPF table from which information about the process is extracted.
// Filled when calling kernel functions udp_sendmsg()/tcp_sendmsg():
BPF_TABLE("extern", struct port_key, struct port_val, proc_ports, 20480);
// Table for transferring data to the user space:
BPF_PERF_OUTPUT(dns_events);
// Look for DNS packets among the data passing through the socket and 
// check if there is any information about the process:
int dns_matching(struct __sk_buff *skb) {
    u8 *cursor = 0;
    // Checking the IP protocol:
    struct ethernet_t *ethernet = cursor_advance(cursor, sizeof(*ethernet));
    if (ethernet->type == ETH_P_IP) {
        struct ip_t *ip = cursor_advance(cursor, sizeof(*ip));
        u8 proto;
        u16 sport;
        u16 dport;
        // Checking the transport layer protocol:
        if (ip->nextp == IPPROTO_UDP) {
            struct udp_t *udp = cursor_advance(cursor, sizeof(*udp));
            proto = 17;
            // Getting the data about the ports:
            sport = udp->sport;
            dport = udp->dport;
        } else if (ip->nextp == IPPROTO_TCP) {
            struct tcp_t *tcp = cursor_advance(cursor, sizeof(*tcp));
            // We don't need packets where no data is transmitted:
            if (!tcp->flag_psh) {
                return 0;
            }
            proto = 6;
            // Getting the data about the ports:
            sport = tcp->src_port;
            dport = tcp->dst_port;
        } else {
            return 0;
        }
        // If it's a DNS query:
        if (dport == 53 || sport == 53) {
            // Form a key structure:
            struct port_key key = {};
            key.proto = proto;
            if (skb->ingress_ifindex == 0) {
                key.saddr = ip->src;
                key.daddr = ip->dst;
                key.sport = sport;
                key.dport = dport;
            } else {
                key.saddr = ip->dst;
                key.daddr = ip->src;
                key.sport = dport;
                key.dport = sport;
            }
            // By the key, look for a value in the eBPF table:
            struct port_val *p_val;
            p_val = proc_ports.lookup(&key);
            // If no value is found, then we have no information about the 
            // process and there is no point in continuing:
            if (!p_val) {
                return 0;
            }
            // Network device index:
            p_val->ifindex = skb->ifindex;
            // Transmit the structure with the process information along with 
            // skb->len bytes sent to the socket:
            dns_events.perf_submit_skb(skb, skb->len, p_val,
                                       sizeof(struct port_val));
            return 0;
        } //dport == 53 || sport == 53
    } //ethernet->type == ETH_P_IP
    return 0;
}

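This program starts the same way as our earlier packet-parsing example: we move a pointer through the packet and collect information from the protocols at different layers. It still ignores the actual length of the IP header, but it does add something new: for TCP packets we check the flags and filter out packets that carry no data (SYN, ACK, and so on).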
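The flag test can be reproduced in user space to see which segments pass the filter. This is a minimal sketch, not the article's code: tcp_summary() is a hypothetical helper and the two segments are hand-built with made-up ports.

```python
import struct

TCP_FLAG_SYN = 0x02
TCP_FLAG_PSH = 0x08
TCP_FLAG_ACK = 0x10


def tcp_summary(segment):
    """Return (sport, dport, header length in bytes, PSH flag set?)."""
    sport, dport, _seq, _ack, offset_byte, flags = struct.unpack(
        "!HHIIBB", segment[:14])
    header_len = (offset_byte >> 4) * 4  # data offset counts 32-bit words
    return sport, dport, header_len, bool(flags & TCP_FLAG_PSH)


def build_segment(flags):
    """Build a 20-byte TCP header with the given flags (no options, no data)."""
    return struct.pack("!HHIIBBHHH", 57915, 53, 0, 0, 5 << 4, flags, 65535, 0, 0)


syn = build_segment(TCP_FLAG_SYN)
push = build_segment(TCP_FLAG_PSH | TCP_FLAG_ACK)
print(tcp_summary(syn))
print(tcp_summary(push))
```

Only the second segment has PSH set, so only it would get past the check in the eBPF program.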

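We then have to reconstruct the key in order to fetch data from the proc_ports table, and for that we must distinguish the direction of the traffic. After all, when we wrote the entry into the table, we were the source; for incoming packets, the source is the remote server. To tell which way a packet is travelling, I treat an ingress_ifindex of 0 as the mark of outgoing traffic.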
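The key reconstruction can be modelled in a few lines of Python. In this sketch a dict stands in for the shared eBPF table, and PortKey and lookup_key() are hypothetical names; the addresses, ports, and interface index are made up.

```python
from typing import NamedTuple


class PortKey(NamedTuple):
    proto: int
    saddr: str
    daddr: str
    sport: int
    dport: int


def lookup_key(proto, src, dst, sport, dport, ingress_ifindex):
    """Build the table key from the local host's point of view."""
    if ingress_ifindex == 0:
        # Outgoing packet: we are the source, use the fields as they are.
        return PortKey(proto, src, dst, sport, dport)
    # Incoming packet: the remote server is the source, so swap both ends.
    return PortKey(proto, dst, src, dport, sport)


# A dict stands in for proc_ports, filled when the request was sent.
proc_ports = {
    PortKey(17, "192.168.44.3", "1.1.1.1", 57915, 53): {"pid": 10738, "comm": "dig"},
}

request = lookup_key(17, "192.168.44.3", "1.1.1.1", 57915, 53, ingress_ifindex=0)
response = lookup_key(17, "1.1.1.1", "192.168.44.3", 53, 57915, ingress_ifindex=2)
print(proc_ports[request]["comm"], proc_ports[response]["comm"])
```

Because the response is normalised to the same key as the request, both directions resolve to the process that sent the query.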

Putting It into Service

There are three things to do in Python: load the programs into the kernel, get the data from the kernel, and process it.

The first two tasks are simple. Besides, we have already seen both ways of using eBPF in the earlier examples:

# BPF initialization:
bpf_kprobe = BPF(text=C_BPF_KPROBE)
bpf_sock = BPF(text=BPF_SOCK_TEXT)
# Send UDP:
bpf_kprobe.attach_kprobe(event="udp_sendmsg", fn_name="trace_udp_sendmsg")
# Send TCP:
bpf_kprobe.attach_kprobe(event="tcp_sendmsg", fn_name="trace_tcp_sendmsg")
# Socket:
function_dns_matching = bpf_sock.load_func("dns_matching", BPF.SOCKET_FILTER)
BPF.attach_raw_socket(function_dns_matching, '')

The code for receiving the data is even shorter:

bpf_sock["dns_events"].open_perf_buffer(print_dns)
while True:
    try:
        bpf_sock.perf_buffer_poll()
    except KeyboardInterrupt:
        exit()

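Processing the data is more tedious. Although ready-made modules exist, we decided to parse the protocol headers ourselves. First, I wanted to work out for myself how it is done (even though handling the IP header length correctly is pointless in the current situation, because packets whose headers carry extra options are dropped in eBPF anyway). Second, it reduces the dependency on modules. For parsing DNS itself, however, I still (so far) use a ready-made module, since the DNS structure is somewhat more complex than IP/TCP. One more module (ctypes) is needed to handle C data types.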

def print_dns(cpu, data, size):
    import ctypes as ct
    from socket import if_indextoname
    from struct import unpack
    class SkbEvent(ct.Structure):
        _fields_ = [("ifindex", ct.c_uint32),
            ("pid", ct.c_uint32),
            ("tgid", ct.c_uint32),
            ("uid", ct.c_uint32),
            ("gid", ct.c_uint32),
            ("comm", ct.c_char * 64),
            ("raw", ct.c_ubyte * (size - ct.sizeof(ct.c_uint32 * 5) - ct.sizeof(ct.c_char * 64)))
        ]
    # We get our 'port_val' structure and also the packet itself in the 'raw' field:
    sk = ct.cast(data, ct.POINTER(SkbEvent)).contents
    # Protocols:
    NET_PROTO = {6: "TCP", 17: "UDP"}
    # eBPF operates on thread names.
    # Sometimes they coincide with process names, but often not.
    # So we try to get the process name by its PID:
    try:
        with open(f'/proc/{sk.pid}/comm', 'r') as proc_comm:
            proc_name = proc_comm.read().rstrip()
    except OSError:
        proc_name = sk.comm.decode()
    # Get the name of the network interface by index:
    ifname = if_indextoname(sk.ifindex)
    # The length of the Ethernet frame header is 14 bytes:
    ip_packet = bytes(sk.raw[14:])
    # The length of the IP packet header is not fixed due to the arbitrary
    # number of parameters.
    # Of all the possible IP header we are only interested in 20 bytes:
    (length, _, _, _, _, proto, _, saddr, daddr) = unpack('!BBHLBBHLL', ip_packet[:20])
    # The header length is stored in the lower half of the first byte (0b00001111 = 15):
    len_iph = length & 15
    # Length is written in 32-bit words, convert it to bytes:
    len_iph = len_iph * 4
    # Convert addresses from numbers into IPs, assembling it into octets:
    saddr = ".".join(map(str, [saddr >> 24 & 0xff, saddr >> 16 & 0xff, saddr >> 8 & 0xff, saddr & 0xff]))
    daddr = ".".join(map(str, [daddr >> 24 & 0xff, daddr >> 16 & 0xff, daddr >> 8 & 0xff, daddr & 0xff]))
    # If the transport layer protocol is UDP:
    if proto == 17:
        udp_packet = ip_packet[len_iph:]
        (sport, dport) = unpack('!HH', udp_packet[:4])
        # UDP datagram header length is 8 bytes:
        dns_packet = udp_packet[8:]
    # If the transport layer protocol is TCP:
    elif proto == 6:
        tcp_packet = ip_packet[len_iph:]
        # TCP packet header length is also not fixed due to the optional
        # options. Of the entire TCP header, we are only interested in the data up to the 13th
        # byte (header length):
        (sport, dport, _, length) = unpack('!HHQB', tcp_packet[:13])
        # The direct length is written in the first half (4 bits):
        len_tcph = length >> 4
        # Length is written in 32-bit words, converted to bytes:
        len_tcph = len_tcph * 4
        # That's the tricky part.
        # DNS over TCP prefixes every message with a 2-byte length field
        # (RFC 1035, section 4.2.2), so the DNS packet starts 2 bytes after the TCP header:
        dns_packet = tcp_packet[len_tcph + 2:]
    # other protocols are not handled:
    else:
        return
    # DNS data decoding:
    dns_data = dnslib.DNSRecord.parse(dns_packet)
    # Resource record types:
    DNS_QTYPE = {1: "A", 28: "AAAA"}
    # Query:
    if dns_data.header.qr == 0:
        # We are only interested in A (1) and AAAA (28) records:
        for q in dns_data.questions:
            if q.qtype == 1 or q.qtype == 28:
                print(f'COMM={proc_name} PID={sk.pid} TGID={sk.tgid} DEV={ifname} PROTO={NET_PROTO[proto]} SRC={saddr} DST={daddr} SPT={sport} DPT={dport} UID={sk.uid} GID={sk.gid} DNS_QR=0 DNS_NAME={q.qname} DNS_TYPE={DNS_QTYPE[q.qtype]}')
    # Response:
    elif dns_data.header.qr == 1:
        # We are only interested in A (1) and AAAA (28) records:
        for rr in dns_data.rr:
            if rr.rtype == 1 or rr.rtype == 28:
                print(f'COMM={proc_name} PID={sk.pid} TGID={sk.tgid} DEV={ifname} PROTO={NET_PROTO[proto]} SRC={saddr} DST={daddr} SPT={sport} DPT={dport} UID={sk.uid} GID={sk.gid} DNS_QR=1 DNS_NAME={rr.rname} DNS_TYPE={DNS_QTYPE[rr.rtype]} DNS_DATA={rr.rdata}')
    else:
        print('Invalid DNS query type.')
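The 2-byte offset in the TCP branch comes from the DNS wire format: RFC 1035 (section 4.2.2) says that a message sent over TCP is prefixed with a two-byte length field. The sketch below shows how that prefix can be used to split a TCP payload into DNS messages. It is an illustration only: split_tcp_dns() is a hypothetical helper, the query is hand-built, and real traffic would also require reassembling messages that span several TCP segments.

```python
import struct


def split_tcp_dns(stream):
    """Split a TCP payload into the length-prefixed DNS messages it carries."""
    messages = []
    pos = 0
    while pos + 2 <= len(stream):
        (length,) = struct.unpack("!H", stream[pos:pos + 2])
        body = stream[pos + 2:pos + 2 + length]
        if len(body) < length:
            break  # incomplete message, more data would be needed
        messages.append(body)
        pos += 2 + length
    return messages


# Hand-built DNS query for google.com, type A, class IN (ID 0x1234 is made up).
query = (struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
         + b"\x06google\x03com\x00"
         + struct.pack("!HH", 1, 1))
framed = struct.pack("!H", len(query)) + query

print(len(query), split_tcp_dns(framed + framed) == [query, query])
```

Reading the length instead of skipping a fixed 2 bytes also makes it possible to notice truncated messages.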

Finally

Start the Python application and, in another console, send a request with the dig tool:

# dig @1.1.1.1 google.com +tcp

If everything works, the program output should look like this:

# python3 final_code_eBPF_dns.py
The program is running. Press Ctrl-C to abort.
COMM=dig PID=10738 TGID=10739 DEV=ens18 PROTO=TCP SRC=192.168.44.3 DST=1.1.1.1 SPT=57915 DPT=53 UID=0 GID=0 DNS_QR=0 DNS_NAME=google.com. DNS_TYPE=A
COMM=dig PID=10738 TGID=10739 DEV=ens18 PROTO=TCP SRC=1.1.1.1 DST=192.168.44.3 SPT=53 DPT=57915 UID=0 GID=0 DNS_QR=1 DNS_NAME=google.com. DNS_TYPE=A DNS_DATA=142.251.12.101
COMM=dig PID=10738 TGID=10739 DEV=ens18 PROTO=TCP SRC=1.1.1.1 DST=192.168.44.3 SPT=53 DPT=57915 UID=0 GID=0 DNS_QR=1 DNS_NAME=google.com. DNS_TYPE=A DNS_DATA=142.251.12.113
COMM=dig PID=10738 TGID=10739 DEV=ens18 PROTO=TCP SRC=1.1.1.1 DST=192.168.44.3 SPT=53 DPT=57915 UID=0 GID=0 DNS_QR=1 DNS_NAME=google.com. DNS_TYPE=A DNS_DATA=142.251.12.102
COMM=dig PID=10738 TGID=10739 DEV=ens18 PROTO=TCP SRC=1.1.1.1 DST=192.168.44.3 SPT=53 DPT=57915 UID=0 GID=0 DNS_QR=1 DNS_NAME=google.com. DNS_TYPE=A DNS_DATA=142.251.12.139
COMM=dig PID=10738 TGID=10739 DEV=ens18 PROTO=TCP SRC=1.1.1.1 DST=192.168.44.3 SPT=53 DPT=57915 UID=0 GID=0 DNS_QR=1 DNS_NAME=google.com. DNS_TYPE=A DNS_DATA=142.251.12.100
COMM=dig PID=10738 TGID=10739 DEV=ens18 PROTO=TCP SRC=1.1.1.1 DST=192.168.44.3 SPT=53 DPT=57915 UID=0 GID=0 DNS_QR=1 DNS_NAME=google.com. DNS_TYPE=A DNS_DATA=142.251.12.138

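That's it: we have built a useful application that shows all DNS queries on the system. I hope the explanations above are detailed enough to make getting started easier if you are interested in writing eBPF programs. This code has already helped me better understand what is happening on my servers. The complete code is available at the link below.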

Full code

Conclusion

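Can this code be improved? Of course! First, IPv6 support should be added. Second, it should stop relying on a fixed IP header length and parse the header instead. There is a reason I declined to use Python libraries for packet handling: in C this still has to be done by hand. Third, it would be nice to rewrite the code in C and drop Python altogether, adding a few lines for JSON output along the way, which would make it easier to build a UI dashboard later. That leads to the fourth point: parsing DNS packets by hand. Finally, the most intriguing idea is to stop checking ports (since DNS packets may not always go through port 53) and instead try to analyze every packet, looking for those that match the DNS format. That would let us detect DNS traffic even on non-standard ports.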

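Hi, I'm Yu Fan (俞凡). I did R&D at Motorola and now work on technology at Mavenir. I have a long-standing interest in communications, networking, backend architecture, cloud native, DevOps, CI/CD, blockchain, AI, and related technologies. I enjoy reading and thinking, and I believe in continuous learning and lifelong growth. Feel free to get in touch and learn together.
WeChat official account: DeepNoMind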
