AST in Practice



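I recently became interested in ASTs, and after digging in I found they have far more uses than I expected: many tools we rely on in everyday development, such as Babel, ESLint and Prettier, are built on them. Besides covering some basic AST concepts, this article focuses on practice and shows how to use an AST to modify code.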

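AST stands for Abstract Syntax Tree. It is the bridge between source code and machine code. When a browser parses JS, it first tokenizes the source string according to the ECMAScript specification, splitting it into individual syntactic units (tokens). It then walks these tokens, performs syntactic analysis, and builds the AST. Finally, the engine turns the AST into executable code; older versions of V8, for example, used the JIT compiler's full code generator to translate the AST into native machine code. Take the following code: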

function add(a, b) {return a + b;}

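Tokenizing it yields a flat list of tokens: the keyword function, the identifier add, the punctuator (, the identifier a, and so on.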
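
As a rough illustration of tokenization, here is a minimal sketch of a tokenizer in plain JavaScript. The token type names (Keyword, Identifier, Numeric, Punctuator) and the `tokenize` function are assumptions made for this example; a real tokenizer follows the full ECMAScript lexical grammar and handles strings, comments, multi-character operators and much more.

```javascript
// Only the two keywords that appear in the sample are recognized here.
const KEYWORDS = new Set(["function", "return"]);

// Split a source string into { type, value } tokens.
// Characters the pattern does not know about are silently skipped.
function tokenize(source) {
  const pattern = /[A-Za-z_$][\w$]*|\d+|[(){},;+\-*\/=]/g;
  return Array.from(source.matchAll(pattern), ([value]) => {
    let type;
    if (KEYWORDS.has(value)) type = "Keyword";
    else if (/^[A-Za-z_$]/.test(value)) type = "Identifier";
    else if (/^\d/.test(value)) type = "Numeric";
    else type = "Punctuator";
    return { type, value };
  });
}

const tokens = tokenize("function add(a, b) {return a + b;}");
for (const token of tokens) {
  console.log(token.type, token.value);
}
```

For the sample function this produces 14 tokens, starting with the keyword function and the identifier add.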

Analyzing these tokens eventually produces an AST like this one (simplified):

{
  "type": "Program",
  "body": [
    {
      "type": "FunctionDeclaration",
      "id": {
        "type": "Identifier",
        "name": "add"
      },
      "params": [
        {
          "type": "Identifier",
          "name": "a"
        },
        {
          "type": "Identifier",
          "name": "b"
        }
      ],
      "body": {
        "type": "BlockStatement",
        "body": [
          {
            "type": "ReturnStatement",
            "argument": {
              "type": "BinaryExpression",
              "left": {
                "type": "Identifier",
                "name": "a"
              },
              "operator": "+",
              "right": {
                "type": "Identifier",
                "name": "b"
              }
            }
          }
        ]
      }
    }
  ],
  "sourceType": "module"
}

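Once the AST is available, it can be translated into machine code according to a set of rules, which this article will not go into.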
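
Real engines compile the tree, but the idea of processing an AST by applying one rule per node type can be shown with a much simpler consumer. The sketch below is a tiny tree-walking evaluator for the `a + b` expression from the AST above; the `evaluate` function and its `scope` argument are hypothetical names for this example, and it only supports the two node types it needs.

```javascript
// The BinaryExpression node from the AST above (the `a + b` part).
const expression = {
  type: "BinaryExpression",
  left: { type: "Identifier", name: "a" },
  operator: "+",
  right: { type: "Identifier", name: "b" },
};

// Recursively evaluate a node, looking identifiers up in `scope`.
function evaluate(node, scope) {
  switch (node.type) {
    case "Identifier":
      if (!(node.name in scope)) {
        throw new ReferenceError(`${node.name} is not defined`);
      }
      return scope[node.name];
    case "BinaryExpression": {
      const left = evaluate(node.left, scope);
      const right = evaluate(node.right, scope);
      if (node.operator === "+") return left + right;
      if (node.operator === "-") return left - right;
      throw new Error(`Unsupported operator: ${node.operator}`);
    }
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

const result = evaluate(expression, { a: 1, b: 2 });
console.log(result); // 3
```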

Besides being compiled to machine code, an AST can be put to many other uses. Babel, for example, analyzes the AST to convert ES6 code into ES5.

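Babel's compilation process has 3 stages:

Parse: parse the code string into an abstract syntax tree
Transform: apply transformations to the abstract syntax tree
Generate: generate a new code string from the transformed abstract syntax tree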
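
To make the three stages concrete, here is a toy pipeline in plain JavaScript for a deliberately tiny language: identifiers joined by `+`. The `parse`, `transform` and `generate` functions below are hand-written for this sketch and are not Babel's implementation; they only mirror the shape of the process.

```javascript
// Stage 1, parse: turn "a + b + c" into a left-associative tree
// of BinaryExpression nodes with Identifier leaves.
function parse(code) {
  return code
    .split("+")
    .map((part) => ({ type: "Identifier", name: part.trim() }))
    .reduce((left, right) => ({
      type: "BinaryExpression",
      operator: "+",
      left,
      right,
    }));
}

// Stage 2, transform: rename identifiers in place.
// `renames` is a Map from old name to new name.
function transform(node, renames) {
  if (node.type === "Identifier" && renames.has(node.name)) {
    node.name = renames.get(node.name);
  } else if (node.type === "BinaryExpression") {
    transform(node.left, renames);
    transform(node.right, renames);
  }
  return node;
}

// Stage 3, generate: print the tree back out as a code string.
function generate(node) {
  if (node.type === "Identifier") return node.name;
  return `${generate(node.left)} ${node.operator} ${generate(node.right)}`;
}

const ast = parse("a + b");
transform(ast, new Map([["a", "x"]]));
const output = generate(ast);
console.log(output); // x + b
```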

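Babel implements a parser in JS, Babel parser (@babel/parser), which converts a JS string into an AST represented as a JSON-like structure. To make the tree easier to walk and transform, Babel also provides the traverse utility function (@babel/traverse). Once the AST has been modified, generator (@babel/generator) produces the new code.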

Let's look in detail at how to manipulate an AST. Start with the following code template:

import parser from "@babel/parser";
import generator from "@babel/generator";
import t from "@babel/types";
import traverser from "@babel/traverse";

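// With Babel 7 imported from an ES module in Node, the callable
// function of these two packages is found on the .default property.
const generate = generator.default;
const traverse = traverser.default;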

const code = ``;
const ast = parser.parse(code);

// AST transformations go here

const output = generate(ast, {}, code);

console.log("Input \n", code);
console.log("Output \n", output.code);

Building a hello world

Open AST Explorer, clear the code on the left, then type 'hello world'. This shows what the AST looks like before and after:

// empty
{
  "type": "Program",
  "body": [],
  "sourceType": "module"
}

// hello world
{
  "type": "Program",
  "body": [
    {
      "type": "ExpressionStatement",
      "expression": {
        "type": "Literal",
        "value": "hello world",
        "raw": "'hello world'"
      },
      "directive": "hello world"
    }
  ],
  "sourceType": "module"
}

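Next, let's build this ExpressionStatement in code. (Babel's own AST calls the string node StringLiteral rather than Literal, hence t.stringLiteral.)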

const code = ``;
const ast = parser.parse(code);

// create the literal
const literal = t.stringLiteral('hello world')
// create the expressionStatement
const exp = t.expressionStatement(literal)
// push the expression statement into body
ast.program.body.push(exp)

const output = generate(ast, {}, code);

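As you can see, building an AST means creating nodes from the bottom up. Here we use the types object provided by Babel to create each kind of node. The full list of node types is in the @babel/types documentation.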
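
The bottom-up idea does not depend on Babel: AST nodes are plain objects with a `type` field. The sketch below uses two hypothetical helpers, `stringLiteral` and `expressionStatement`, as stand-ins for the builders of the same name. They are written for this example only; the real @babel/types builders also validate their arguments and set additional fields.

```javascript
// Hypothetical stand-ins for the builders: each returns a plain node object.
function stringLiteral(value) {
  return { type: "StringLiteral", value };
}

function expressionStatement(expression) {
  return { type: "ExpressionStatement", expression };
}

// An empty program, shaped like the "empty" AST shown earlier.
const program = { type: "Program", body: [], sourceType: "module" };

// Leaves first, then the node that wraps them, then attach to the tree.
const literal = stringLiteral("hello world");
const statement = expressionStatement(literal);
program.body.push(statement);

console.log(JSON.stringify(program, null, 2));
```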

In the same way, let's see how to build a variable declaration:

const code = ``;
const ast = parser.parse(code);
 
// create the identifier
const id = t.identifier('str')
// create the literal
const literal = t.stringLiteral('hello world')
// create the variableDeclarator
const declarator = t.variableDeclarator(id, literal)
// create the variableDeclaration
const declaration = t.variableDeclaration('const', [declarator])

// push the declaration into body
ast.program.body.push(declaration)

const output = generate(ast, {}, code);

Getting nodes from the AST

Next, we will work with this code:

export default {
  data() {
    return {count: 0}
  },
  methods: {
    add() {++this.count},
    minus() {--this.count}
  }
}

Suppose we want to get the data method in this code. We can access it directly like this:

const dataProperty = ast.program.body[0].declaration.properties[0]

Or we can use the traverse utility provided by Babel:

const code = `
export default {
  data() {
    return {count: 0}
  },
  methods: {
    add() {++this.count},
    minus() {--this.count}
  }
}
`;

const ast = parser.parse(code, {sourceType: 'module'});

// const dataProperty = ast.program.body[0].declaration.properties[0]

traverse(ast, {
  ObjectMethod(path) {
    if (path.node.key.name === 'data') {
      path.node.key.name = 'myData';
      // stop traversing
      path.stop();
    }
  }
})

const output = generate(ast, {}, code);

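The second argument to traverse is a visitor object: give it a method named after a node type, and that method is called for every node of that type. The path parameter gives access to the node's information, which lets us pick out the node we want to work on. In the code above, once we find the method named data, we rename it to myData, stop the traversal, and generate the new code.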
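
The visitor mechanism can be pictured with a small recursive walker in plain JavaScript. This is only a sketch of the dispatch idea, not how @babel/traverse is implemented: the hypothetical `walk` function below passes the bare node to the visitor instead of a path object, and it has no way to stop early.

```javascript
// Visit every node in the tree and call the visitor method
// named after node.type, if the visitor defines one.
function walk(node, visitor) {
  if (node === null || typeof node !== "object") return;
  if (Array.isArray(node)) {
    node.forEach((child) => walk(child, visitor));
    return;
  }
  if (typeof node.type === "string" && typeof visitor[node.type] === "function") {
    visitor[node.type](node);
  }
  // Recurse into every property; non-object values are ignored above.
  Object.values(node).forEach((child) => walk(child, visitor));
}

// A hand-written, heavily simplified tree for the object in the example above.
const tree = {
  type: "ObjectExpression",
  properties: [
    { type: "ObjectMethod", key: { type: "Identifier", name: "data" } },
    {
      type: "ObjectProperty",
      key: { type: "Identifier", name: "methods" },
      value: {
        type: "ObjectExpression",
        properties: [
          { type: "ObjectMethod", key: { type: "Identifier", name: "add" } },
          { type: "ObjectMethod", key: { type: "Identifier", name: "minus" } },
        ],
      },
    },
  ],
};

const methodNames = [];
walk(tree, {
  ObjectMethod(node) {
    methodNames.push(node.key.name);
  },
});
console.log(methodNames.join(", ")); // data, add, minus
```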

Replacing nodes in the AST

Nodes can be replaced with replaceWith and replaceWithSourceString, for example:

// change this.count to this.data.count

const code = `this.count`;
const ast = parser.parse(code);

traverse(ast, {
  MemberExpression(path) {
    if (
      t.isThisExpression(path.node.object) &&
      t.isIdentifier(path.node.property, {name: "count"})
    ) {
      // replace this with this.data
      path
        .get("object")
        .replaceWith(
          t.memberExpression(t.thisExpression(), t.identifier("data"))
        );

      // the following is equivalent to the statement above and is more readable
      // path.get("object").replaceWithSourceString("this.data");
    }
  }
});

const output = generate(ast, {}, code);

Inserting new nodes

Nodes can be inserted with methods such as pushContainer, insertBefore and insertAfter:

// This example demonstrates 3 ways of inserting a node

const code = `
const obj = {
  count: 0,
  message: 'hello world'
}
`;

const ast = parser.parse(code);

const property = t.objectProperty(
  t.identifier("new"),
  t.stringLiteral("new property")
);

traverse(ast, {
  ObjectExpression(path) {
    path.pushContainer("properties", property);

    // path.node.properties.push(property);
  }
});

/*
traverse(ast, {
  ObjectProperty(path) {
    if (t.isIdentifier(path.node.key, {name: "message"})) {
      path.insertAfter(property);
    }
  }
});
*/

const output = generate(ast, {}, code);
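
As a mental model, these insertions end up changing the `properties` array of the object node. The sketch below uses plain objects and a hypothetical `property` helper instead of Babel to show the array operations that appending and inserting after a sibling roughly correspond to. Babel's path methods do more than this, such as keeping their internal path bookkeeping up to date, so this is an approximation rather than a description of their implementation.

```javascript
// Hypothetical helper that builds a minimal ObjectProperty-shaped node.
function property(name, valueNode) {
  return {
    type: "ObjectProperty",
    key: { type: "Identifier", name },
    value: valueNode,
  };
}

const objectNode = {
  type: "ObjectExpression",
  properties: [
    property("count", { type: "NumericLiteral", value: 0 }),
    property("message", { type: "StringLiteral", value: "hello world" }),
  ],
};

// Appending: roughly what pushing into the "properties" container does.
objectNode.properties.push(
  property("last", { type: "StringLiteral", value: "appended" })
);

// Inserting after "message": find the sibling, then splice in behind it.
const index = objectNode.properties.findIndex(
  (prop) => prop.key.name === "message"
);
objectNode.properties.splice(
  index + 1,
  0,
  property("afterMessage", { type: "StringLiteral", value: "inserted" })
);

const keys = objectNode.properties.map((prop) => prop.key.name);
console.log(keys.join(", ")); // count, message, afterMessage, last
```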

Removing nodes

Use the remove method to delete a node:

const code = `
const obj = {
  count: 0,
  message: 'hello world'
}
`;

const ast = parser.parse(code);

traverse(ast, {
  ObjectProperty(path) {
    if (t.isIdentifier(path.node.key, {name: "message"})) {
      path.remove();
    }
  }
});

const output = generate(ast, {}, code);

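This article introduced some basic AST concepts and showed how to use the APIs provided by Babel to add, remove, modify and look up AST nodes. With this skill and a little imagination, you can build practical code analysis and transformation tools.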

