C++: OCLint Explained, a Source Code Walkthrough and Workflow Analysis

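Target readers: front-line engineers and architects.

Estimated reading time: 15-20 minutes.

What you will take away:
The principles behind static code analysis
The workflow of a static code analysis tool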

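Because OCLint is a static code analysis tool built on Clang tooling, Clang deserves a mention first.
Clang, a subproject of LLVM, is a compiler for C, C++, and Objective-C.

OCLint itself is built on Clang tooling; in other words, it is a layer of wrapping around it.
Its core capability is to analyze the Clang AST, output information about code that violates the rules, and export a report in a specified format.

Next, let's look at what the Clang AST, which serves as OCLint's input, looks like.

The Clang AST is an intermediate product of compilation. The compiler front end runs lexical analysis, then syntax analysis (which produces the AST), then semantic analysis, and finally generates intermediate code.

Here is a small program to give a first impression of an abstract syntax tree.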

// Example.c
#include <stdio.h>

int global;

void myPrint(int param) {
    if (param == 1)
        printf("param is 1");
    for (int i = 0; i < 10; i++) {
        global += i;
    }
}

int main(int argc, char *argv[]) {
    int param = 1;
    myPrint(param);
    return 0;
}

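In the resulting AST you can clearly see how every element of this code relates to its child nodes. Nodes fall into two broad families. One is Stmt, for statements, which perform some operation; the expression class Expr also derives from Stmt. The other is Decl, for declarations: every class, method, function, and variable is a Decl. (The two families are not interchangeable, so a special container node such as DeclStmt is needed to bridge them.) The data structure also shows that the tree is one-directional: you can only walk downward from some top-level element.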
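To make the shape of such a tree concrete, here is a minimal sketch of a toy node model. ToyNode, dumpTree, and buildMyPrintTree are names invented for this illustration and are not Clang classes; the node labels only approximate what Clang would report for myPrint() in Example.c. Each node owns its children and has no link back to its parent, which is why traversal can only go downward.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a Clang AST node; these are NOT Clang's real classes.
struct ToyNode {
    enum class Kind { Decl, Stmt };
    Kind kind;
    std::string label;                               // e.g. "FunctionDecl myPrint"
    std::vector<std::unique_ptr<ToyNode>> children;  // children only, no parent link

    ToyNode(Kind k, std::string l) : kind(k), label(std::move(l)) {}

    // Appends a child and returns a reference to it so calls can be chained.
    ToyNode &add(Kind k, std::string l) {
        children.push_back(std::make_unique<ToyNode>(k, std::move(l)));
        return *children.back();
    }
};

// Renders the tree top-down, two spaces of indentation per level.
std::string dumpTree(const ToyNode &node, int depth = 0) {
    std::string out(static_cast<std::size_t>(depth) * 2, ' ');
    out += node.label + "\n";
    for (const auto &child : node.children) {
        out += dumpTree(*child, depth + 1);
    }
    return out;
}

// Builds a tiny tree resembling the shape of myPrint() from Example.c.
ToyNode buildMyPrintTree() {
    ToyNode fn(ToyNode::Kind::Decl, "FunctionDecl myPrint");
    fn.add(ToyNode::Kind::Decl, "ParmVarDecl param");
    ToyNode &body = fn.add(ToyNode::Kind::Stmt, "CompoundStmt");
    body.add(ToyNode::Kind::Stmt, "IfStmt");
    ToyNode &loop = body.add(ToyNode::Kind::Stmt, "ForStmt");
    // A DeclStmt is the Stmt that wraps the Decl of the loop variable.
    loop.add(ToyNode::Kind::Stmt, "DeclStmt").add(ToyNode::Kind::Decl, "VarDecl i");
    return fn;
}
```

Calling dumpTree(buildMyPrintTree()) yields an indented listing in which the VarDecl for i sits under a DeclStmt, mirroring the bridging role described above.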

You can inspect the syntax tree from a terminal with the following command:

clang -Xclang -ast-dump -fsyntax-only Example.c

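Both Stmt and Decl come with iterators, so you can conveniently walk all child nodes and act on each one according to its type. Clang offers something more convenient still: derive from the RecursiveASTVisitor class.
It is a recursive AST traverser that visits every node of an AST. Its most commonly used entry points are TraverseStmt and TraverseDecl.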

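For example, to visit every function in a piece of code, that is, every FunctionDecl, and print each function's name, I override the following method (in a custom checker):

bool VisitFunctionDecl(FunctionDecl *decl) {
    std::string name = decl->getNameAsString();
    printf("%s\n", name.c_str()); // printf needs a C string, not a std::string
    return true;                  // returning true continues the traversal
}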

This way we visit every FunctionDecl node in the AST and print the name of each function.
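The visitor idea can be tried without linking against Clang. The following is a minimal sketch, assuming a toy Node type and a toy ToyRecursiveVisitor base; both are hypothetical stand-ins written for this illustration. It borrows two traits of RecursiveASTVisitor: the derived class is passed as a template argument (CRTP), and a callback that returns false stops the traversal.

```cpp
#include <string>
#include <vector>

// Toy node type; a hypothetical stand-in, not Clang's FunctionDecl/Stmt classes.
struct Node {
    std::string kind;            // e.g. "FunctionDecl", "IfStmt"
    std::string name;            // empty for unnamed nodes
    std::vector<Node> children;
};

// CRTP base: walks the tree depth-first and dispatches to the derived class.
template <typename Derived>
class ToyRecursiveVisitor {
public:
    // Returns false as soon as any visit callback asks to stop.
    bool traverse(const Node &node) {
        if (node.kind == "FunctionDecl" &&
            !static_cast<Derived *>(this)->visitFunctionDecl(node)) {
            return false;
        }
        for (const Node &child : node.children) {
            if (!traverse(child)) {
                return false;
            }
        }
        return true;
    }

    // Default callback: do nothing and keep going.
    bool visitFunctionDecl(const Node &) { return true; }
};

// Collects the name of every function declaration it encounters.
class FunctionNameCollector : public ToyRecursiveVisitor<FunctionNameCollector> {
public:
    std::vector<std::string> names;

    bool visitFunctionDecl(const Node &node) {
        names.push_back(node.name);
        return true;  // keep traversing
    }
};

// Convenience wrapper: runs the collector over a whole tree.
std::vector<std::string> collectFunctionNames(const Node &root) {
    FunctionNameCollector collector;
    collector.traverse(root);
    return collector.names;
}
```

For a toy translation unit containing a variable global and the functions myPrint and main, collectFunctionNames returns the two function names in source order. A real run over Example.c would also report the functions declared by stdio.h.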

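Next, let's read the OCLint source code and see how OCLint actually works.

Start with the core class diagram for a first impression, then we will read the code.

1 First, locate the entry file oclint/driver/main.cpp and its entry function main()

A condensed skeleton of that file looks like this: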

int main(int argc, const char **argv)
{
    llvm::cl::SetVersionPrinter(oclintVersionPrinter);
    // Construct the command-line options parser
    CommonOptionsParser optionsParser(argc, argv, OCLintOptionCategory);
    // Process the configuration
    oclint::option::process(argv[0]);

    ...

    // Construct the analyzer
    oclint::RulesetBasedAnalyzer analyzer(oclint::option::rulesetFilter().filteredRules());
    // Construct the driver
    oclint::Driver driver;

    // Run the analysis
    driver.run(optionsParser.getCompilations(), optionsParser.getSourcePathList(), analyzer);

    std::unique_ptr<oclint::Results> results(std::move(getResults()));

    ostream *out = outStream();
    // Write the report
    reporter()->report(results.get(), *out);
    disposeOutStream(out);

    return handleExit(results.get());
}

2 Next, look at the key code of the core Driver class. Three functions matter most: constructCompilers(), invoke(), and run()

// Build the compilers
static void constructCompilers(std::vector<oclint::CompilerInstance *> &compilers,
    CompileCommandPairs &compileCommands,
    std::string &mainExecutable)
{
    for (auto &compileCommand : compileCommands) // iterate over the compile commands
    {
        std::vector<std::string> adjustedCmdLine =
            adjustArguments(compileCommand.second.CommandLine, compileCommand.first);

#ifndef NDEBUG
        printCompileCommandDebugInfo(compileCommand, adjustedCmdLine);
#endif

        LOG_VERBOSE("Compiling");
        LOG_VERBOSE(compileCommand.first.c_str());
        std::string targetDir = stringReplace(compileCommand.second.Directory, "\\", " ");

        if (chdir(targetDir.c_str()))
        {
            throw oclint::GenericException("Cannot change dictionary into \"" +
                targetDir + "\", "
                "please make sure the directory exists and you have permission to access!");
        }
        // Create the clang CompilerInvocation object
        clang::CompilerInvocation *compilerInvocation =
            newCompilerInvocation(mainExecutable, adjustedCmdLine);
        // Use the clang CompilerInvocation to create OCLint's own CompilerInstance wrapper
        oclint::CompilerInstance *compiler = newCompilerInstance(compilerInvocation);
        compiler->start(); // in essence: obtain the clang::FrontendAction and execute it
        if (!compiler->getDiagnostics().hasErrorOccurred() && compiler->hasASTContext())
        {
            LOG_VERBOSE("- Success");
            compilers.push_back(compiler); // keep the OCLint CompilerInstance in the collection
        }
        else
        {
            LOG_VERBOSE("- Failed");
        }
        LOG_VERBOSE_LINE("");
    }
}

// The function that actually kicks off the analysis
static void invoke(CompileCommandPairs &compileCommands,
    std::string &mainExecutable, oclint::Analyzer &analyzer)
{
    std::vector<oclint::CompilerInstance *> compilers; // container for the compilers
    constructCompilers(compilers, compileCommands, mainExecutable); // build the compilers

    // collect a collection of AST contexts
    std::vector<clang::ASTContext *> localContexts;
    for (auto compiler : compilers) // iterate over the compilers
    {
        localContexts.push_back(&compiler->getASTContext()); // collect each AST context
    }

    // use the analyzer to do the actual analysis
    analyzer.preprocess(localContexts);  // pre-processing
    analyzer.analyze(localContexts);     // analysis
    analyzer.postprocess(localContexts); // post-processing

    // send out the signals to release or simply leak resources
    for (size_t compilerIndex = 0; compilerIndex != compilers.size(); ++compilerIndex)
    {
        compilers.at(compilerIndex)->end();
        delete compilers.at(compilerIndex);
    }
}

// The core function called from main.cpp; runs the analysis
void Driver::run(const clang::tooling::CompilationDatabase &compilationDatabase,
    llvm::ArrayRef<std::string> sourcePaths, oclint::Analyzer &analyzer)
{
    CompileCommandPairs compileCommands; // container of compile command pairs
    constructCompileCommands(compileCommands, compilationDatabase, sourcePaths); // fill it

    static int staticSymbol; // static symbol whose address identifies this executable
    // Get the path of the oclint executable
    std::string mainExecutable = llvm::sys::fs::getMainExecutable("oclint", &staticSymbol);

    if (option::enableGlobalAnalysis()) // global analysis enabled
    {
        // Call invoke once with all compile commands; note that analyzer is passed along
        invoke(compileCommands, mainExecutable, analyzer);
    }
    else
    {
        // Global analysis disabled: analyze the compile commands one by one
        for (auto &compileCommand : compileCommands)
        {
            CompileCommandPairs oneCompileCommand { compileCommand };
            invoke(oneCompileCommand, mainExecutable, analyzer);
        }
    }

    if (option::enableClangChecker()) // clang checker enabled
    {
        invokeClangStaticAnalyzer(compileCommands, mainExecutable); // run the Clang static analyzer
    }
}
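The difference between the two branches of Driver::run can be pictured as a batching decision. Below is a minimal sketch, assuming a hypothetical helper planInvocations that works on plain file names instead of compile commands. With global analysis there is one batch, so a single invoke() call holds every AST context at the same time; otherwise each file gets its own batch and its resources are released before the next file is compiled.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: groups source files the way Driver::run groups compile
// commands before handing each group to invoke().
std::vector<std::vector<std::string>> planInvocations(
    const std::vector<std::string> &sourceFiles, bool globalAnalysis) {
    std::vector<std::vector<std::string>> batches;
    if (globalAnalysis) {
        // One invoke() call sees every file, so all ASTs are alive together.
        batches.push_back(sourceFiles);
    } else {
        // One invoke() call per file; each AST is released before the next.
        for (const std::string &file : sourceFiles) {
            batches.push_back({file});
        }
    }
    return batches;
}
```

For three input files this returns one batch of three when globalAnalysis is true and three batches of one when it is false. The per-file mode presumably trades cross-file visibility for lower peak memory; that trade-off is an inference from the code above rather than something the source states.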

3 Finally there is the RulesetBasedAnalyzer class. It contains very little code, shown below

void RulesetBasedAnalyzer::analyze(std::vector<clang::ASTContext *> &contexts)
{
    for (const auto& context : contexts)
    {
        LOG_VERBOSE("Analyzing");
        auto violationSet = new ViolationSet();
        // The rule carrier: context is the data handed to the rules for analysis,
        // violationSet receives the results the rules produce
        auto carrier = new RuleCarrier(context, violationSet);
        LOG_VERBOSE(carrier->getMainFilePath().c_str());
        for (RuleBase *rule : _filteredRules) // iterate over the filtered rules
        {
            rule->takeoff(carrier); // call the rule's takeoff
        }
        ResultCollector *results = ResultCollector::getInstance(); // get the result collector
        results->add(violationSet); // hand the rules' findings to the collector
        LOG_VERBOSE_LINE("- Done");
    }
}

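As the code above shows, the analyzer iterates over the rule set and calls each rule's takeoff method. Every rule derives from RuleBase, which holds a RuleCarrier instance as a member. The RuleCarrier contains the ASTContext of the file being analyzed together with a violationSet, which stores information about the violations found.
A rule's job is to inspect the ASTContext of its ruleCarrier member and, whenever it finds a violation, write the result into the ruleCarrier's violationSet.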
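The division of labor between analyzer, carrier, and rule can be sketched with plain standard-library types. Everything below (ToyViolation, ToyCarrier, ToyRule, NoGotoRule, analyze) is a simplified, hypothetical model written for this illustration; OCLint's real classes carry an ASTContext rather than raw text and have richer interfaces.

```cpp
#include <string>
#include <vector>

// All types below are simplified, hypothetical stand-ins for OCLint's classes.
struct ToyViolation {
    std::string ruleName;
    std::string message;
};

// Carries the per-file input (here just a path and its text) plus a pointer
// to the output container that rules append to.
struct ToyCarrier {
    std::string filePath;
    std::string source;
    std::vector<ToyViolation> *violations;
};

class ToyRule {
public:
    virtual ~ToyRule() = default;
    virtual std::string name() const = 0;

    // Mirrors the takeoff() idea: remember the carrier, then run the check.
    void takeoff(ToyCarrier *carrier) {
        _carrier = carrier;
        apply();
    }

protected:
    virtual void apply() = 0;

    void addViolation(const std::string &message) {
        _carrier->violations->push_back({name(), message});
    }

    ToyCarrier *_carrier = nullptr;
};

// Example rule: flags any file whose text contains "goto".
class NoGotoRule : public ToyRule {
public:
    std::string name() const override { return "no goto"; }

protected:
    void apply() override {
        if (_carrier->source.find("goto") != std::string::npos) {
            addViolation(_carrier->filePath + " uses goto");
        }
    }
};

// The analyzer loop: every file is shown to every rule.
std::vector<ToyViolation> analyze(const std::vector<ToyCarrier> &files,
                                  const std::vector<ToyRule *> &rules) {
    std::vector<ToyViolation> collected;
    for (ToyCarrier file : files) {  // copy, so it can point at `collected`
        file.violations = &collected;
        for (ToyRule *rule : rules) {
            rule->takeoff(&file);
        }
    }
    return collected;
}
```

The rule never decides where results go; it only writes through the carrier it was handed, which is what lets the analyzer swap in a fresh result container per file.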

At this point we understand OCLint's basic usage and its workflow.

Next comes the part that is more flexible but also harder to use: custom rules.

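A rule must implement the RuleBase class or one of the abstract classes derived from it. Different rules work at different levels of abstraction: some have to dig deep into the code's control flow, whereas others detect defects just by reading the source code as strings.

OCLint provides three abstract classes for writing custom rules:
AbstractSourceCodeReaderRule (source code reader rule), AbstractASTVisitorRule (AST visitor rule), and AbstractASTMatcherRule (AST matcher rule).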
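The simplest of the three styles, a rule that only reads source text, can be sketched as follows. This is a standalone illustration, not the AbstractSourceCodeReaderRule API: LineViolation and findLongLines are hypothetical names, and the check itself (flagging over-long lines) is just an example of a defect that needs no AST at all.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result record; OCLint's real violation type carries more fields.
struct LineViolation {
    int lineNumber;      // 1-based
    std::size_t length;  // number of characters on that line
};

// Returns every line that is longer than `maxLength` characters.
std::vector<LineViolation> findLongLines(const std::string &source,
                                         std::size_t maxLength) {
    std::vector<LineViolation> violations;
    std::istringstream stream(source);
    std::string line;
    int lineNumber = 0;
    while (std::getline(stream, line)) {
        ++lineNumber;
        if (line.size() > maxLength) {
            violations.push_back({lineNumber, line.size()});
        }
    }
    return violations;
}
```

A text-level rule like this is cheap to write but blind to structure: it cannot tell a long string literal from a long expression, which is where the AST-based styles come in.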

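According to the official documentation, AST matcher rules are highly readable, so unless performance is a major concern, most of the time we would probably choose to write an AST matcher rule.

An AST visitor rule is based on the visitor pattern. You only need to override certain methods (the abstract class provides a callback for each kind of node visited) to implement the checking logic for that node. (Because OCLint works on the abstract syntax tree generated by Clang, knowing the Clang AST API helps a great deal when writing rules.)

An AST matcher rule is based on pattern matching. You construct some matchers and register them. Whenever a match is found, the callback method is invoked with the matching AST node as its argument, and you collect the violation information inside the callback. (See the matcher documentation for more information.)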
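The matcher style boils down to composing predicates and attaching a callback. Here is a minimal sketch under that reading; Node, kindIs, hasDirectChild, both, and runMatcher are hypothetical helpers written for this illustration and are not the Clang AST matcher API, although the composition is meant to feel similar.

```cpp
#include <functional>
#include <string>
#include <vector>

// Toy AST node; a hypothetical stand-in for Clang's node classes.
struct Node {
    std::string kind;
    std::vector<Node> children;
};

using Matcher = std::function<bool(const Node &)>;
using Callback = std::function<void(const Node &)>;

// Matches nodes of the given kind, e.g. kindIs("IfStmt").
Matcher kindIs(const std::string &kind) {
    return [kind](const Node &node) { return node.kind == kind; };
}

// Matches nodes that have at least one direct child accepted by `inner`.
Matcher hasDirectChild(const Matcher &inner) {
    return [inner](const Node &node) {
        for (const Node &child : node.children) {
            if (inner(child)) {
                return true;
            }
        }
        return false;
    };
}

// Matches nodes accepted by both matchers.
Matcher both(const Matcher &first, const Matcher &second) {
    return [first, second](const Node &node) { return first(node) && second(node); };
}

// Walks the whole tree and fires the callback once per matching node.
void runMatcher(const Node &root, const Matcher &matcher, const Callback &onMatch) {
    if (matcher(root)) {
        onMatch(root);
    }
    for (const Node &child : root.children) {
        runMatcher(child, matcher, onMatch);
    }
}

// Counts if statements whose direct children include a goto statement.
int countIfWithGoto(const Node &root) {
    int count = 0;
    runMatcher(root, both(kindIs("IfStmt"), hasDirectChild(kindIs("GotoStmt"))),
               [&count](const Node &) { ++count; });
    return count;
}
```

The rule author states what to look for declaratively and the traversal is someone else's problem, which is the readability advantage the documentation refers to.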

That is all for now; the point is simply that OCLint provides abstract classes for implementing custom rules. How to write a rule is covered in the next section.

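The scaffoldRule script is a scaffold provided by OCLint; it is described in the documentation on creating rules with the scaffold.
The script makes it easy to create a custom rule.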

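Having read the official OCLint documentation and the introduction to the Clang AST, we now know roughly how OCLint works. First, it calls the Clang API to generate an AST for each source file, one by one. Second, it traverses every node of the AST and, according to the rules, writes violations into the violation result set. Finally, it writes the violations out as a report in the configured format.

Here is a mind map of how to approach writing an OCLint rule; a first impression is enough.

Following the steps above, we now have an xcodeproj project, so we can open the cpp source file of the rule we created.

The first thing you notice is that the template generated by the scaffold is nearly 2000 lines long. A little alarming? Don't worry. Most of the template consists of methods whose names begin with Visit. These are callbacks that OCLint provides, that is, methods triggered when the corresponding node of the AST is visited.

Now let's look at a real case, a rule already used in our iOS team's code checks.
The rule checks the formatting of if/else branches against the Cocoa conventions.
Specifically, if and else must be separated from the adjacent parentheses and braces by a space or a line break.
Sample code: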

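void example()
{
    int a = 1;
    if(a > 0) { // no space or line break before "(": non-compliant
        a = 10;
    }
    
    if (a > 0){ // no space or line break after ")": non-compliant
        a = 10;
    }
    
    if (a > 0)
    {a = 10;}else { // no space or line break after "}": non-compliant
        a = -1;
    }
    
    if (a > 0)
    {a = 10;} else{ // no space or line break before "{": non-compliant
        a = -1;
    }
}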

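1 First, dump the AST in a terminal (how to view the AST was covered above; if you skipped it, go back and read it first).

A stream of colorful characters scrolls past on the screen and finally stops here.
Yes, this is exactly what we are looking for.

You can clearly see the variable declaration VarDecl at the top and the conditional statement IfStmt below it.

2 The node type we need to check is now settled: IfStmt.
3 Next, find the corresponding callback in the generated rule template.
I guessed it would be called something like VisitXXIfStmt.
Sure enough, there it is: the VisitIfStmt method looks like exactly what we need.
4 Then obtain the node's source text and the description to report. (See the complete rule file below for the actual code.)
5 Finally, decide whether that source text conforms to the rule. (You may use any functions that LLVM, Clang, and std provide, if they meet your needs.)
6 If the text does not conform, add the node and the description to the violationSet.

That completes the whole writing process. After reading the sample code below and a few more of the officially provided rules, you should soon be able to apply the same approach to write rules of your own.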

Here is the complete implementation of the rule described above:

#include "oclint/AbstractASTVisitorRule.h"
#include "oclint/RuleSet.h"

using namespace std;
using namespace clang;
using namespace oclint;

class KirinzerTestRule : public AbstractASTVisitorRule<KirinzerTestRule>
{
public:
    virtual const string name() const override
    {
        return "if else format";
    }

    virtual int priority() const override
    {
        return 2;
    }

    virtual const string category() const override
    {
        return "controversial";
    }

#ifdef DOCGEN
    virtual const std::string since() const override
    {
        return "20.11";
    }

    virtual const std::string description() const override
    {
        return "Checks whether the parentheses and braces of if/else branches follow the coding standard";
    }

    virtual const std::string example() const override
    {
        return R"rst(
.. code-block:: cpp

    void example()
    {
        int a = 1;
        if(a > 0) { // no space or line break before "(": non-compliant
            a = 10;
        }

        if (a > 0){ // no space or line break after ")": non-compliant
            a = 10;
        }

        if (a > 0)
        {a = 10;}else { // no space or line break after "}": non-compliant
            a = -1;
        }

        if (a > 0)
        {a = 10;} else{ // no space or line break before "{": non-compliant
            a = -1;
        }
    }
        )rst";
    }

#endif
    
    bool VisitIfStmt(IfStmt *node)
    {
        clang::SourceManager *sourceManager = &_carrier->getSourceManager();

        SourceLocation begin = node->getIfLoc();
        SourceLocation elseLoc = node->getElseLoc();
        SourceLocation end = node->getEndLoc();

        // Length of this node's source text
        int length = sourceManager->getFileOffset(end) - sourceManager->getFileOffset(begin) + 1;
        // Read that many characters starting at the node's first character
        string sourceCode = StringRef(sourceManager->getCharacterData(begin), length).str();
        // printf("%s\n", sourceCode.c_str());

        // Check the opening parenthesis of if
        std::size_t found = sourceCode.find("if (");
        if (found == std::string::npos) {
            // printf("bad format: if (\n");
            AppendToViolationSet(node, Description());
        }

        // Check the closing parenthesis of if
        found = sourceCode.find(") {");
        if (found == std::string::npos) {
            found = sourceCode.find(")\n");
            if (found == std::string::npos) {
                // printf("bad format: closing parenthesis of if\n");
                AppendToViolationSet(node, Description());
            }
        }

        // No else branch: nothing more to check
        if (!elseLoc.isValid()) {
            return true;
        }

        // Check the brace before else
        found = sourceCode.find("} else");
        if (found == std::string::npos) {
            found = sourceCode.find("}\n");
            if (found == std::string::npos) {
                // printf("bad format: } else\n");
                AppendToViolationSet(node, Description());
            }
        }

        // Check the brace after else
        found = sourceCode.find("else {");
        if (found == std::string::npos) {
            found = sourceCode.find("else\n");
            if (found == std::string::npos) {
                // printf("bad format: else {\n");
                AppendToViolationSet(node, Description());
            }
        }

        return true;
    }

    // Append the violation to the result set.
    // Declared void: the original returned bool without a return statement.
    void AppendToViolationSet(IfStmt *node, string description)
    {
        addViolation(node, this, description);
    }

    string Description()
    {
        return "incorrect format";
    }
};

static RuleSet rules(new KirinzerTestRule());
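The four substring checks in VisitIfStmt can be exercised without Clang by lifting them into a function over plain strings. The sketch below is such an extraction: checkIfElseFormat is a hypothetical helper, its hasElse parameter plays the role of elseLoc.isValid(), and the returned labels are made up here to identify which check failed.

```cpp
#include <string>
#include <vector>

// Re-implements the four substring checks from VisitIfStmt on a plain string.
// `hasElse` stands in for elseLoc.isValid(); the labels are illustrative only.
std::vector<std::string> checkIfElseFormat(const std::string &sourceCode, bool hasElse) {
    const auto missing = [&sourceCode](const std::string &needle) {
        return sourceCode.find(needle) == std::string::npos;
    };
    std::vector<std::string> problems;

    if (missing("if (")) {
        problems.push_back("if (");
    }
    if (missing(") {") && missing(")\n")) {
        problems.push_back(") {");
    }
    if (!hasElse) {
        return problems;  // no else branch, nothing more to check
    }
    if (missing("} else") && missing("}\n")) {
        problems.push_back("} else");
    }
    if (missing("else {") && missing("else\n")) {
        problems.push_back("else {");
    }
    return problems;
}
```

Running it on small inputs also exposes a limitation of searching the whole node text: a nested, well-formatted statement can mask a badly formatted outer one. For example, the text `if(a) { if (b) { } }` yields no problems, because "if (" is found in the inner statement. Checking only the characters adjacent to the locations returned by getIfLoc() and getElseLoc() would presumably be more precise.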

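From what we have learned so far, we know that a rule ultimately takes the form of a dylib file. If there were no way to debug while writing the cpp, it would be a nightmare. That brings us to the problem at hand: how do you debug an OCLint rule?

1 First, you need an Xcode project.

The OCLint project maintains its dependencies with CMakeLists files, and we can use CMake to generate an xcodeproj from them. You can generate an Xcode project for each folder; here we generate one for oclint-rules.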

# In the OCLint source directory, create a folder; here it is named oclint-xcoderules
mkdir oclint-xcoderules
cd oclint-xcoderules
# Then run the following command
cmake -G Xcode -D CMAKE_CXX_COMPILER=../build/llvm-install/bin/clang++ -D CMAKE_C_COMPILER=../build/llvm-install/bin/clang -D OCLINT_BUILD_DIR=../build/oclint-core -D OCLINT_SOURCE_DIR=../oclint-core -D OCLINT_METRICS_SOURCE_DIR=../oclint-metrics -D OCLINT_METRICS_BUILD_DIR=../build/oclint-metrics -D LLVM_ROOT=../build/llvm-install/ ../oclint-rules

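2 Once the Xcode project has been created, add launch arguments to the relevant Scheme. In the Scheme's Info tab, choose Executable and select the oclint executable built earlier.

Tip: the compiled oclint executable is in build/oclint-release/bin under the root directory. With OCLint 20.11, the latest version at the time of writing, the generated file is named oclint-20.11, which Finder identifies as a Document because it takes .11 for a file extension. This does not affect invoking it directly from a terminal, but later, when debugging, we have to pick this executable through Finder from within Xcode, and the wrongly identified type makes it impossible to select. So remove the dot and rename the executable to oclint-2011, with no extension at all. (When renaming, right-click, choose Get Info, and edit the Name & Extension field; also check whether the extension is hidden.)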
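The naming pitfall is easy to reproduce with the "text after the last dot" convention that std::filesystem also follows. This is only an analogy to what Finder appears to do, not a description of Finder's implementation; apparentExtension is a hypothetical helper.

```cpp
#include <filesystem>
#include <string>

// Returns what a "last dot" rule treats as the extension of a file name.
std::string apparentExtension(const std::string &fileName) {
    return std::filesystem::path(fileName).extension().string();
}
```

Under this convention oclint-20.11 has the extension ".11", while the renamed oclint-2011 has none.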

The launch arguments are as follows
(the first argument is the rule load path, the second is the file used to test the rule):

-R=/Users/developer/TempData/oclint/oclint-xcoderules/rules.dl/Debug /Users/developer/TempData/oclint/oclint-xcoderules/test2.m -- -x objective-c -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk

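Once this is set up, you can run the rule, and the console shows the rule's results together with your debug output.

After a rule written in Xcode has been compiled, the corresponding dylib file is in the Products group in Xcode.

By default, rules are loaded from the $(/path/to/bin/oclint)/../lib/oclint/rules directory, which we call the "rule search path" or "rule load path". A rule search path consists of a set of dynamic libraries, whose extensions are so, dylib, and dll on Linux, macOS, and Windows respectively.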
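As a rough illustration of how that default location relates to the executable, the sketch below resolves the path lexically and picks the platform's library extension. It assumes that $(/path/to/bin/oclint) denotes the directory containing the oclint binary; defaultRulesDir, ruleLibraryExtension, and looksLikeRuleLibrary are hypothetical helpers, not OCLint functions.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Resolves <dir of oclint binary>/../lib/oclint/rules without touching the disk.
// Assumption: the rules directory is located relative to the binary's directory.
fs::path defaultRulesDir(const fs::path &oclintExecutable) {
    return (oclintExecutable.parent_path() / ".." / "lib" / "oclint" / "rules")
        .lexically_normal();
}

// Dynamic library extension for the platform this code is compiled on.
std::string ruleLibraryExtension() {
#if defined(_WIN32)
    return ".dll";
#elif defined(__APPLE__)
    return ".dylib";
#else
    return ".so";
#endif
}

// True when `file` has the dynamic library extension of the current platform.
bool looksLikeRuleLibrary(const fs::path &file) {
    return file.extension() == ruleLibraryExtension();
}
```

For an executable at /opt/oclint/bin/oclint (an example path), defaultRulesDir gives /opt/oclint/lib/oclint/rules on a POSIX system.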

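New rules are available as soon as they are dropped into the rule load path. So we only need to put the dylib built from our custom rule into the default rule load directory. The rule directory is also configurable: one project can use several rule search paths, and different projects can be given different rule load paths.

For more detailed configuration, see the official documentation:

Selecting OCLint Inspection Rules

Static code checking tools find potential problems in code efficiently. During continuous delivery they raise developers' attention to coding standards, keep the code from degrading, and reduce errors caused by carelessness. I hope the static checking tool covered here, together with the explanation of how to write custom rules, helps everyone write higher-quality, more elegant, and better-looking code.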

A Brief Introduction to LLVM, Clang, and Their Relationship
Clang Tutorial
Clang Users Manual
oclint-docs v20.11


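Target readers

Estimated reading time

What you will take away

Clang, which has to be mentioned

Clang AST

Abstract syntax tree example

Visiting the abstract syntax tree

OCLint source code analysis

Advanced: custom rules

Creating a rule: the scaffoldRule script

Writing a rule

Debugging a rule

Using a rule

Summary

References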
