Original source:
https://github.com/cssaheel/d...

1.1 Introduction

Cuckoo Sandbox is an automated malware analysis system developed by Claudio Guarnieri. Cuckoo is primarily a lightweight solution that performs automated dynamic analysis of provided Windows binaries, and it can return comprehensive reports on key API calls and network activity. This document introduces a library that processes network capture files (PCAP, or packet capture, files) and returns a report of the results. The library dissects packet fields and extracts as much information as possible from network packets. It is also aware of TCP reassembly, and it can recover files downloaded over HTTP and FTP as well as emails sent over SMTP.

1.2 Description

This library depends on the Scapy library. The supported protocols are: TCP, UDP, ICMP, DNS, SMB, HTTP, FTP, IRC, SIP, TELNET, SSH, SMTP, IMAP and POP. Although the first five protocols are already supported by Scapy, this library provides its own interface to them. This figure demonstrates the transparent structure of the library:

1- The main component of this library is the dissector. It receives a path to a pcap file and returns a dictionary, keyed by supported protocol, that holds the dissected packets. This component also specifies how the data is represented and is responsible for importing the Scapy classes and the library's own classes. In addition, it preprocesses the TCP sequence numbers and implements TCP reassembly.

2- The protocol files. Each file has one or more classes that are responsible for dissecting the corresponding protocol's packets.

3- A set of Scapy classes is used in this library: the Packet class, which is inherited by the "Protocols classes", and the Field class, which is inherited by the "Fields classes". The library also uses rdpcap, which takes a path to a pcap file and returns a list of packets.
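To give a feel for what "TCP reassembly" means for the dissector component, here is a minimal sketch in plain Python. It is not the library's implementation: the function name reassemble, the (seq, payload) tuple layout and the decision to stop at the first gap are all assumptions made for illustration, and sequence number wraparound is ignored.

```python
def reassemble(segments, initial_seq):
    """Join the payloads of one TCP direction in sequence-number order.

    segments: iterable of (seq, payload_bytes), possibly out of order or
              retransmitted (hypothetical layout, not the library's).
    initial_seq: sequence number of the first payload byte.
    """
    stream = bytearray()
    next_seq = initial_seq
    for seq, payload in sorted(segments, key=lambda item: item[0]):
        if seq + len(payload) <= next_seq:
            continue  # retransmission: these bytes are already in the stream
        if seq > next_seq:
            break  # gap: a segment is missing, keep only the contiguous part
        # drop any overlapping prefix, then append the new bytes
        stream += payload[next_seq - seq:]
        next_seq = seq + len(payload)
    return bytes(stream)


# segments captured out of order, with one retransmission
segments = [
    (1008, b"ex.html "),
    (1000, b"GET /ind"),
    (1016, b"HTTP/1.1\r\n"),
    (1000, b"GET /ind"),
]
print(reassemble(segments, 1000))
```

Only once the payload is back in order like this can a protocol class see a complete request line or a complete downloaded file.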

1.3 General Protocol File Structure

Future development does not require deep knowledge of Scapy, since this library does not use Scapy's advanced features. Below is the simplest (pseudo code) form of the protocol file structure I followed in this library, annotated with some comments:

    from scapy.all import Packet, StrField, TCP, bind_layers

    # FTPDataField, FTPResField, FTPResArgField and FTPReqField are
    # "Fields classes" of this library; they are not defined in this snippet.


    class FTPData(Packet):
        """
        class for dissecting the ftp data
        @attention: it inherits Packet class from Scapy library
        """
        name = "ftp"
        fields_desc = [FTPDataField("data", "")]


    class FTPResponse(Packet):
        """
        class for dissecting the ftp responses
        @attention: it inherits Packet class from Scapy library
        """
        name = "ftp"
        fields_desc = [FTPResField("command", "", "H"),
                       FTPResArgField("argument", "", "H")]


    class FTPRequest(Packet):
        """
        class for dissecting the ftp requests
        @attention: it inherits Packet class from Scapy library
        """
        name = "ftp"
        fields_desc = [FTPReqField("command", "", "H"),
                       StrField("argument", "", "H")]


    # tell Scapy which class dissects the TCP payload on each port
    bind_layers(TCP, FTPResponse, sport=21)
    bind_layers(TCP, FTPRequest, dport=21)
    bind_layers(TCP, FTPData, dport=20)

Are we done with the protocol file? Well, not yet. As you can see in fields_desc in the previous code, we used a class named FTPReqField. This is a "Field class", which means that one way or another it must inherit from Scapy's Field class. The other class, StrField, also inherits from Field, but it is predefined by Scapy. Now let us have a look at the FTPReqField class.

    class FTPReqField(StrField):
        holds_packets = 1
        name = "FTPReqField"

        def getfield(self, pkt, s):
            """
            this method gets the packet, takes what needs to be taken and
            lets the remainder go, so it returns two values: first the
            remaining data, which still needs to be dissected by the other
            "field classes", and second the value that belongs to this field.
            @param pkt: holds the whole packet
            @param s: holds only the remaining data which is not dissected yet.
            """
            ls = s.split()
            if not ls:
                # nothing left to dissect
                return "", ""
            if ls[0].lower() == "retr":
                # remember the requested file name so the download can be
                # recovered; add_file is not shown here (assumed to be a
                # helper defined elsewhere in the library)
                filename = "".join(ls[1:])
                if len(filename) > 0:
                    add_file(filename)
            # the first word is the command; the rest is left for the
            # following "argument" field
            value = ls[0]
            remain = " ".join(ls[1:])
            return remain, value

        def __init__(self, name, default, fmt, remain=0):
            """
            class constructor for initializing the instance variables
            @param name: name of the field
            @param default: Scapy has many formats to represent the data
            internal, human and machine. anyway, you may set this param to None.
            @param fmt: specifying the format, this has been set to "H"
            @param remain: this parameter specifies the size of the remaining
            data so make it 0 to handle all of the data.
            """
            self.name = name
            StrField.__init__(self, name, default, fmt, remain)
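The getfield contract above ("take what belongs to this field, hand back the rest") is easier to see without Scapy in the way. The following is a toy sketch in plain Python, not Scapy's code: WordField, RestField and dissect are made-up names, and the loop only imitates the way the fields in fields_desc are asked, one after another, to consume the remaining data.

```python
class WordField:
    """Toy field: consumes the first whitespace-separated word."""

    def __init__(self, name):
        self.name = name

    def getfield(self, pkt, s):
        parts = s.split(None, 1)
        if not parts:
            return "", ""
        value = parts[0]
        remain = parts[1] if len(parts) > 1 else ""
        return remain, value


class RestField:
    """Toy field: consumes everything that is left."""

    def __init__(self, name):
        self.name = name

    def getfield(self, pkt, s):
        return "", s.strip()


def dissect(fields_desc, raw):
    """Pass the remaining data through each field in order."""
    values = {}
    remaining = raw
    for field in fields_desc:
        remaining, values[field.name] = field.getfield(None, remaining)
    return values


request = dissect([WordField("command"), RestField("argument")],
                  "RETR report.pdf\r\n")
print(request)
```

In the FTPRequest class the same thing happens: FTPReqField keeps the command and returns the rest, and the StrField named "argument" then receives that rest.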

1.4 Protocols Details and Notes

Different protocols have different properties, especially once you go into the details. Here I am going to list the characteristics and features of the implemented protocols.

1.5 Requirements

This library has been tested with Python version 2.6.5 and Scapy version 2.1.0.

1.6 Usage

Here is a simple use of this library. Suppose our file usedissector.py is as follows:

    from dissector import *

    """
    this file is a test unit for a pcap library (mainly dissector.py
    and its associated protocols classes). This library uses and
    depends on Scapy library.
    """

    # instance of dissector class
    dissector = Dissector()
    # sending the pcap file to be dissected
    pkts = dissector.dissect_pkts("/root/Desktop/ssh.cap")
    print(pkts)

The output will be similar to this:

    {'ftp': [....], 'http': [....], ....}
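Since the result is an ordinary dictionary keyed by protocol name, it can be walked with plain Python. The sketch below uses a hand-written stand-in for the result: the layout of the individual packet entries is an assumption made for illustration only, and summarize is a hypothetical helper, not part of the library.

```python
# Hypothetical stand-in for the dictionary returned by dissect_pkts;
# the fields inside each packet entry are made up for this example.
pkts = {
    "ftp": [{"command": "RETR", "argument": "report.pdf"}],
    "http": [{"method": "GET"}, {"method": "POST"}],
    "dns": [],
}


def summarize(result):
    """Return {protocol: packet count} for protocols that have packets."""
    return {proto: len(items) for proto, items in result.items() if items}


summary = summarize(pkts)
for proto in sorted(summary):
    print("%-5s %d packet(s)" % (proto, summary[proto]))
```

A quick count like this is a convenient first look at a capture before digging into one protocol's packets.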

1.7 Downloaded Files Recovery

I have written a dedicated section on file recovery to explain how this feature works for HTTP, FTP and SMTP. All of these protocols create a directory named downloaded in the current working directory (CWD) to store the recovered files. If you want to change the default and store the recovered files in another directory, pass the path to change_dfolder, like this:

    from dissector import *

    # instance of dissector class
    dissector = Dissector()
    # now the downloaded files will be stored on the desktop
    dissector.change_dfolder("/root/Desktop/")
    # sending the pcap file to be dissected
    pkts = dissector.dissect_pkts("/root/Desktop/ssh.cap")

For HTTP, the file name is taken from the start line of the HTTP request. If another file with the same name already exists in the target directory, or if the name contains special characters, a random name is generated instead. The same applies to FTP, which takes the file name from the RETR command, whereas SMTP simply gives the file a random name.
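The naming rule just described can be sketched roughly as follows. This is not the library's code: choose_name is a hypothetical helper, and both the set of characters treated as "not special" and the format of the random name are assumptions.

```python
import os
import re
import tempfile
import uuid

# assumed whitelist: letters, digits, dot, underscore and hyphen
SAFE_NAME = re.compile(r"^[A-Za-z0-9._-]+$")


def choose_name(folder, requested):
    """Keep the requested name if it is safe and unused, else use a random one."""
    name = os.path.basename(requested)
    usable = (
        name not in ("", ".", "..")
        and SAFE_NAME.match(name) is not None
        and not os.path.exists(os.path.join(folder, name))
    )
    return name if usable else uuid.uuid4().hex


folder = tempfile.mkdtemp()
first = choose_name(folder, "/files/report.pdf")    # name is free: kept
open(os.path.join(folder, first), "w").close()
second = choose_name(folder, "/files/report.pdf")   # name is taken: random
third = choose_name(folder, "/files/get?id=1")      # special characters: random
print(first, second, third)
```

Falling back to a random name avoids overwriting an earlier recovered file and avoids writing names that the file system may reject.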

1.8 Source Code

The source code of this library is on GitHub:
$ git clone https://github.com/cssaheel/d...