
Creating a Custom Web Editor Using TypeScript, React, ANTLR and Monaco Editor (Part 2)

Translation source

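    Welcome to the second part of the series on creating a custom web editor using TypeScript, React, ANTLR4 and Monaco Editor. Before reading on, I recommend reading Creating a Custom Web Editor Using TypeScript, React, ANTLR4 and Monaco Editor (Part 1).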

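    In this article I will show how to implement the language service. In the editor, the language service does the heavy lifting of parsing the text the user types. We will use the abstract syntax tree (AST) produced by the Parser to find syntax or lexical errors, format the text, and offer intelligent suggestions for TODOs as the user types (I will not implement auto-completion in this article). Basically, the language service exposes the following functions: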

  • format(code: string): string
  • validate(code: string): Errors[]
  • autoComplete(code: string, currentPosition: Position): string[]
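
As a rough sketch, the three functions above could be captured in a TypeScript interface. The `CursorPosition` and `LanguageError` shapes below are assumptions made for illustration (they stand in for `Position` and `Errors`), and `noopLanguageService` is a hypothetical placeholder, not code from the article.

```typescript
// Hypothetical shapes, assumed for illustration only.
interface CursorPosition {
    lineNumber: number; // 1-based line
    column: number;     // 1-based column
}

interface LanguageError {
    message: string;
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
}

// The surface of the language service as listed above.
interface LanguageService {
    format(code: string): string;
    validate(code: string): LanguageError[];
    autoComplete(code: string, currentPosition: CursorPosition): string[];
}

// A do-nothing placeholder that satisfies the interface; a real
// implementation fills in each method step by step.
const noopLanguageService: LanguageService = {
    format: (code) => code,
    validate: () => [],
    autoComplete: () => [],
};
```

Starting from an interface like this keeps the editor glue code independent of how parsing is actually done.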

Add ANTLR, Generate Lexer and Parser From the Grammar

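I will add the ANTLR library and a script that generates the Parser and Lexer from the TodoLangGrammar.g4 grammar file. First, add the two required libraries: antlr4ts and antlr4ts-cli. Parsers generated by the antlr4 TypeScript target have a runtime dependency on the antlr4ts package. antlr4ts-cli, as the name suggests, is the CLI we will use to generate the Parser and Lexer for the language.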

npm add antlr4ts
npm add -D antlr4ts-cli

In the project root, create the file TodoLangGrammar.g4 containing the TodoLang grammar rules:

grammar TodoLangGrammar;

todoExpressions : (addExpression)* (completeExpression)*;
addExpression : ADD TODO STRING;
completeExpression : COMPLETE TODO STRING;

ADD : 'ADD';
TODO : 'TODO';
COMPLETE: 'COMPLETE';
STRING: '"' ~["]* '"';
EOL: [\r\n]+ -> skip;
WS: [ \t] -> skip;
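
To get a feel for what the lexer rules do, here is a hand-written approximation in TypeScript. It is only a sketch of the behaviour described by the rules above, not the code ANTLR generates; `TodoToken` and `tokenizeTodoLang` are hypothetical names.

```typescript
// Token kinds mirroring the lexer rules of the grammar.
type TodoTokenType = "ADD" | "TODO" | "COMPLETE" | "STRING";

interface TodoToken {
    type: TodoTokenType;
    text: string;
}

// Hand-written approximation of the lexer rules (illustration only).
function tokenizeTodoLang(code: string): TodoToken[] {
    const tokens: TodoToken[] = [];
    let index = 0;
    while (index < code.length) {
        // Alternatives: skipped whitespace (WS, EOL), a STRING, or a keyword.
        const match = /^(?:[ \t\r\n]+|"[^"]*"|ADD|TODO|COMPLETE)/.exec(code.slice(index));
        if (match === null) {
            throw new Error(`token recognition error at offset ${index}`);
        }
        const text = match[0];
        index += text.length;
        if (/^[ \t\r\n]/.test(text)) {
            continue; // WS and EOL produce no token, like `-> skip` in the grammar
        }
        const type: TodoTokenType = text.charAt(0) === '"' ? "STRING" : (text as TodoTokenType);
        tokens.push({ type, text });
    }
    return tokens;
}

const sampleTokens = tokenizeTodoLang('ADD TODO "Create an editor"\nCOMPLETE TODO "Create an editor"');
```

Note that the STRING token keeps its surrounding quotes, which is also what the text of the STRING terminal contains later on.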

Now add a script to package.json that generates the Parser and Lexer with antlr4ts-cli:

"antlr4ts": "antlr4ts ./TodoLangGrammar.g4 -o ./src/ANTLR"

Run the antlr4ts script and you will find the generated parser's TypeScript source in the ./src/ANTLR directory:

npm run antlr4ts

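As we can see, there is a Lexer and a Parser. If you look at the Parser file, you will see that it exports the TodoLangGrammarParser class, whose constructor is constructor(input: TokenStream). It takes as its argument the TokenStream that TodoLangGrammarLexer produces for the given code. TodoLangGrammarLexer in turn has a constructor that takes the code as its input: constructor(input: CharStream).

The Parser file contains the method public todoExpressions(): TodoExpressionsContext, which returns the context object for all the TodoExpressions defined in the code. Can you guess where the name TodoExpressions comes from? It is derived from the first rule of our grammar file: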

todoExpressions : (addExpression)* (completeExpression)*;

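TodoExpressionsContext is the root of the AST. Each node in it is the context of another rule. The tree contains terminal nodes and rule contexts, and the terminals hold the final tokens (the ADD token, the TODO token, and the token for the todo's name).

TodoExpressionsContext contains the lists of addExpression and completeExpression contexts, which come from the following three rules: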

todoExpressions : (addExpression)* (completeExpression)*; 
addExpression : ADD TODO STRING;
completeExpression : COMPLETE TODO STRING;

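Each context class, in turn, contains terminal nodes, which basically hold the text (the code fragment or token, e.g. ADD, COMPLETE, or the string representing the TODO). The complexity of the AST depends on the grammar rules you write.

Let's look at AddExpressionContext. It contains the ADD, TODO and STRING terminal nodes, corresponding to the rule: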

addExpression : ADD TODO STRING;

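The STRING terminal node holds the text of the Todo we are adding. Let's parse a simple piece of TodoLang code to see how the AST works. In the ./src/language-service directory, create a file parser.ts with the following content: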

import {TodoLangGrammarParser, TodoExpressionsContext} from "../ANTLR/TodoLangGrammarParser";
import {TodoLangGrammarLexer} from "../ANTLR/TodoLangGrammarLexer";
import {ANTLRInputStream, CommonTokenStream} from "antlr4ts";

export default function parseAndGetASTRoot(code: string): TodoExpressionsContext {
    const inputStream = new ANTLRInputStream(code);
    const lexer = new TodoLangGrammarLexer(inputStream);
    const tokenStream = new CommonTokenStream(lexer);
    const parser = new TodoLangGrammarParser(tokenStream);
    // Parse the input; `todoExpressions` is the entry rule defined in the grammar
    return parser.todoExpressions();
}

parser.ts exports the parseAndGetASTRoot(code) method, which takes TodoLang code and produces the corresponding AST. Let's parse the following TodoLang code:

parseAndGetASTRoot(`
ADD TODO "Create an editor"
COMPLETE TODO "Create an editor"
`)
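
For this input, the tree has a todoExpressions root with one addExpression child and one completeExpression child, each holding three terminal nodes. Purely as an illustration, the plain-object literal below mirrors that shape. The `SketchNode` type and the `leafTexts` helper are made up for this sketch and are not part of antlr4ts, whose context classes carry much more information.

```typescript
// Simplified stand-in for a parse tree node (hypothetical, not the antlr4ts types).
interface SketchNode {
    rule: string;            // rule name, or token type for terminals
    text?: string;           // only terminals carry text
    children?: SketchNode[]; // only rule contexts have children
}

const sampleTree: SketchNode = {
    rule: "todoExpressions",
    children: [
        {
            rule: "addExpression",
            children: [
                { rule: "ADD", text: "ADD" },
                { rule: "TODO", text: "TODO" },
                { rule: "STRING", text: '"Create an editor"' },
            ],
        },
        {
            rule: "completeExpression",
            children: [
                { rule: "COMPLETE", text: "COMPLETE" },
                { rule: "TODO", text: "TODO" },
                { rule: "STRING", text: '"Create an editor"' },
            ],
        },
    ],
};

// Collect the text of every terminal, left to right.
function leafTexts(node: SketchNode): string[] {
    if (!node.children) {
        return node.text === undefined ? [] : [node.text];
    }
    let result: string[] = [];
    for (const child of node.children) {
        result = result.concat(leafTexts(child));
    }
    return result;
}
```

Reading the leaves from left to right gives back the original tokens, which is the property the formatter relies on later.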

Implementing Lexical and Syntax Validation

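In this section I will walk you step by step through adding syntax validation to the editor. ANTLR produces lexical and syntax errors for us out of the box. We only need to implement the ANTLRErrorListener interface and provide it to the Lexer and Parser, so that we can collect errors while ANTLR parses the code.

Create a TodoLangErrorListener.ts file in the ./src/language-service directory. The file exports the TodoLangErrorListener class, which implements the ANTLRErrorListener interface: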

import {ANTLRErrorListener, RecognitionException, Recognizer} from "antlr4ts";

export interface ITodoLangError {
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
    message: string;
    code: string;
}

export default class TodoLangErrorListener implements ANTLRErrorListener<any> {
    private errors: ITodoLangError[] = []

    syntaxError(recognizer: Recognizer<any, any>, offendingSymbol: any, line: number, charPositionInLine: number, message: string, e: RecognitionException | undefined): void {
        this.errors.push(
            {
                startLineNumber: line,
                endLineNumber: line,
                // charPositionInLine is 0-based, while Monaco columns are 1-based
                startColumn: charPositionInLine + 1,
                endColumn: charPositionInLine + 2, // Let's suppose the length of the error is only 1 char for simplicity
                message,
                code: "1" // This is the error code; you can customize it as you want
            }
        )
    }

    getErrors(): ITodoLangError[] {return this.errors;}
}
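
One detail worth keeping in mind when mapping positions: ANTLR reports `line` starting at 1 but `charPositionInLine` starting at 0, while Monaco markers expect 1-based lines and columns. A small helper along the following lines could do the conversion. `MarkerRange`, `toMarkerRange` and the `length` parameter are hypothetical names for this sketch and are not part of either library.

```typescript
// The four position fields a Monaco marker needs (1-based).
interface MarkerRange {
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
}

// Convert an ANTLR position (1-based line, 0-based column) into a 1-based range
// covering `length` characters on the same line.
function toMarkerRange(line: number, charPositionInLine: number, length: number = 1): MarkerRange {
    const startColumn = charPositionInLine + 1; // 0-based -> 1-based
    return {
        startLineNumber: line,
        startColumn,
        endLineNumber: line,
        endColumn: startColumn + Math.max(length, 1), // end column is exclusive
    };
}

const firstCharacter = toMarkerRange(1, 0);
const fiveCharacters = toMarkerRange(3, 4, 5);
```

Keeping the conversion in one place avoids off-by-one markers when semantic errors are added later.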

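Each time ANTLR encounters an error while parsing the code, it calls this TodoLangErrorListener and gives it information about the error. The listener collects the position in the code where the error occurred together with the error message. Now let's attach TodoLangErrorListener to the Lexer and Parser in parser.ts, e.g.: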

import {TodoLangGrammarParser, TodoExpressionsContext} from "../ANTLR/TodoLangGrammarParser";
import {TodoLangGrammarLexer} from "../ANTLR/TodoLangGrammarLexer";
import {ANTLRInputStream, CommonTokenStream} from "antlr4ts";
import TodoLangErrorListener, {ITodoLangError} from "./TodoLangErrorListener";

function parse(code: string): {ast: TodoExpressionsContext, errors: ITodoLangError[]} {
    const inputStream = new ANTLRInputStream(code);
    const lexer = new TodoLangGrammarLexer(inputStream);
    lexer.removeErrorListeners()
    const todoLangErrorsListner = new TodoLangErrorListener();
    lexer.addErrorListener(todoLangErrorsListner);
    const tokenStream = new CommonTokenStream(lexer);
    const parser = new TodoLangGrammarParser(tokenStream);
    parser.removeErrorListeners();
    parser.addErrorListener(todoLangErrorsListner);
    const ast =  parser.todoExpressions();
    const errors: ITodoLangError[]  = todoLangErrorsListner.getErrors();
    return {ast, errors};
}
export function parseAndGetASTRoot(code: string): TodoExpressionsContext {
    const {ast} = parse(code);
    return ast;
}
export function parseAndGetSyntaxErrors(code: string): ITodoLangError[] {
    const {errors} = parse(code);
    return errors;
}

Create LanguageService.ts in the ./src/language-service directory. Here is what it exports:


import {TodoExpressionsContext} from "../ANTLR/TodoLangGrammarParser";
import {parseAndGetASTRoot, parseAndGetSyntaxErrors} from "./parser";
import {ITodoLangError} from "./TodoLangErrorListener";

export default class TodoLangLanguageService {
    validate(code: string): ITodoLangError[] {
        const syntaxErrors: ITodoLangError[] = parseAndGetSyntaxErrors(code);
        //Later we will append semantic errors
        return syntaxErrors;
    }
}

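Nice, we have implemented error detection for the editor. To wire it up, I will create the web worker discussed in the previous article and add the worker service proxy, which will call the language service to carry out the editor's advanced features.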

Creating the web worker

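First, we call monaco.editor.createWebWorker, which uses built-in ES6 Proxies to create a proxy for TodoLangWorker. TodoLangWorker will use the language service to perform the editor features. The methods that run in the web worker are proxied by monaco, so calling a method of the web worker simply means calling the proxied method on the main thread.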
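
The idea of a proxied worker can be sketched with a plain ES6 Proxy. This is not Monaco's actual implementation, only an illustration of the pattern: every method call on the client object is turned into a message, and the caller gets a Promise for the reply. `SketchWorkerApi`, `createClientProxy` and the `send` callback are hypothetical names.

```typescript
// The methods we want to call "remotely"; in the article this role is played by TodoLangWorker.
interface SketchWorkerApi {
    doValidation(): Promise<string[]>;
}

// Turns every method call on the returned object into a message handed to `send`.
// `send` stands in for posting a message to the worker and waiting for its reply.
function createClientProxy<T extends object>(
    send: (method: string, args: unknown[]) => Promise<unknown>
): T {
    return new Proxy({} as T, {
        get(_target, property) {
            return (...args: unknown[]) => send(String(property), args);
        },
    });
}

// Record the method names that would be sent to the worker.
const sentMessages: string[] = [];
const sketchClient = createClientProxy<SketchWorkerApi>((method) => {
    sentMessages.push(method);
    return Promise.resolve([]); // pretend the worker answered with no errors
});

const pendingValidation = sketchClient.doValidation();
```

Because the proxy answers with a Promise, code on the main thread has to await every worker call, which is why the adapters later in this article are asynchronous.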

Create TodoLangWorker.ts in the ./src/todo-lang folder with the following content:

import * as monaco from "monaco-editor-core";
import IWorkerContext = monaco.worker.IWorkerContext;
import TodoLangLanguageService from "../language-service/LanguageService";
import {ITodoLangError} from "../language-service/TodoLangErrorListener";

export class TodoLangWorker {
    private _ctx: IWorkerContext;
    private languageService: TodoLangLanguageService;
    constructor(ctx: IWorkerContext) {
        this._ctx = ctx;
        this.languageService = new TodoLangLanguageService();
    }

    doValidation(): Promise<ITodoLangError[]> {
        const code = this.getTextDocument();
        return Promise.resolve(this.languageService.validate(code));
    }

    private getTextDocument(): string {
        // take the first mirrored model; with several open files this array has more entries
        const model = this._ctx.getMirrorModels()[0];
        return model.getValue();
    }
}

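We created the language service instance and added the doValidation method, which in turn calls the language service's validate method. We also added the getTextDocument method, which returns the text value of the editor. The TodoLangWorker class can be extended with many more features if you want to support multi-file editing and so on. _ctx: IWorkerContext is the context object of the editor, and it holds the model information of the files.

Now let's create the web worker file todolang.worker.ts in the ./src/todo-lang directory: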

import * as worker from 'monaco-editor-core/esm/vs/editor/editor.worker';
import {TodoLangWorker} from './TodoLangWorker';

self.onmessage = () => {
    worker.initialize((ctx) => {
        return new TodoLangWorker(ctx)
    });
};

We use the built-in worker.initialize to initialize our worker and proxy the necessary methods with TodoLangWorker.

This is a web worker, so we have to make webpack output the corresponding worker file:

// webpack.config.js
entry: {
        app: './src/index.tsx',
        "editor.worker": 'monaco-editor-core/esm/vs/editor/editor.worker.js',
        "todoLangWorker": './src/todo-lang/todolang.worker.ts'
    },
    output: {
        globalObject: 'self',
        filename: (chunkData) => {
            switch (chunkData.chunk.name) {
                case 'editor.worker':
                    return 'editor.worker.js';
                case 'todoLangWorker':
                    return "todoLangWorker.js"
                default:
                    return 'bundle.[hash].js';
            }
        },
        path: path.resolve(__dirname, 'dist')
    }

We named the worker file todoLangWorker.js. Now let's add getWorkerUrl in the editor setup function:

    (window as any).MonacoEnvironment = {
        getWorkerUrl: function (moduleId, label) {
            if (label === languageID)
                return "./todoLangWorker.js";
            return './editor.worker.js';
        }
    }

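This is how monaco gets the URL of the web worker. Note that if the worker's label is the ID of TodoLang, we return the same file name that webpack uses for the bundled worker output. If you build the project now, you should find a file named todoLangWorker.js (or, in the dev tools, you will find two workers in the Threads section).

Now create a WorkerManager that manages the creation of the worker and obtains the proxied worker client: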

import * as monaco from "monaco-editor-core";

import Uri = monaco.Uri;
import {TodoLangWorker} from './TodoLangWorker';
import {languageID} from './config';

export class WorkerManager {

    private worker: monaco.editor.MonacoWebWorker<TodoLangWorker>;
    private workerClientProxy: Promise<TodoLangWorker>;

    constructor() {
        this.worker = null;
    }

    private getClientproxy(): Promise<TodoLangWorker> {
        if (!this.workerClientProxy) {
            this.worker = monaco.editor.createWebWorker<TodoLangWorker>({
                moduleId: 'TodoLangWorker',
                label: languageID,
                createData: {
                    languageId: languageID,
                }
            });
            this.workerClientProxy = <Promise<TodoLangWorker>><any>this.worker.getProxy();
        }

        return this.workerClientProxy;
    }

    async getLanguageServiceWorker(...resources: Uri[]): Promise<TodoLangWorker> {
        const _client: TodoLangWorker = await this.getClientproxy();
        await this.worker.withSyncedResources(resources)
        return _client;
    }
}

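We use createWebWorker to create the web worker proxied by monaco, then we get and return the proxied client object. We use workerClientProxy to call the proxied methods. Let's create the DiagnosticsAdapter class, which connects the Monaco markers API with the errors returned by the language service, so that parse errors are marked correctly in monaco: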

import * as monaco from "monaco-editor-core";
import {WorkerAccessor} from "./setup";
import {languageID} from "./config";
import {ITodoLangError} from "../language-service/TodoLangErrorListener";

export default class DiagnosticsAdapter {
    constructor(private worker: WorkerAccessor) {
        const onModelAdd = (model: monaco.editor.IModel): void => {
            let handle: any;
            model.onDidChangeContent(() => {
                // Debounce the user's changes: every time a new change arrives, we wait 500ms before validating;
                // if the user is still typing, we cancel the pending validation
                clearTimeout(handle);
                handle = setTimeout(() => this.validate(model.uri), 500);
            });

            this.validate(model.uri);
        };
        monaco.editor.onDidCreateModel(onModelAdd);
        monaco.editor.getModels().forEach(onModelAdd);
    }
    private async validate(resource: monaco.Uri): Promise<void> {
        const worker = await this.worker(resource)
        const errorMarkers = await worker.doValidation();
        const model = monaco.editor.getModel(resource);
        monaco.editor.setModelMarkers(model, languageID, errorMarkers.map(toDiagnostics));
    }
}
function toDiagnostics(error: ITodoLangError): monaco.editor.IMarkerData {
    return {
        ...error,
        severity: monaco.MarkerSeverity.Error,
    };
}
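
The clearTimeout/setTimeout pair in the adapter is a debounce. Isolated from Monaco, the pattern looks roughly like the sketch below. The `debounce` helper and the 20ms delay are assumptions chosen for illustration (the adapter above uses 500ms).

```typescript
// Returns a function that runs `action` only after `delayMs` have passed
// without another call; each new call cancels the pending one.
function debounce(action: () => void, delayMs: number): () => void {
    let handle: ReturnType<typeof setTimeout> | undefined;
    return () => {
        if (handle !== undefined) {
            clearTimeout(handle);
        }
        handle = setTimeout(action, delayMs);
    };
}

let validationRuns = 0;
const scheduleValidation = debounce(() => {
    validationRuns += 1; // stands in for calling the worker's doValidation
}, 20);

// Three quick "keystrokes" should lead to a single validation run.
scheduleValidation();
scheduleValidation();
scheduleValidation();
```

Debouncing keeps the worker from re-parsing the whole document on every keystroke.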

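The onDidChangeContent listener watches the model for changes. When the model changes, we wait 500ms and then call the web worker to validate the code and add error markers. setModelMarkers tells monaco to add the error markers. To complete the editor's syntax validation, make sure to call them in the setup function, and note that we are using WorkerManager to get the proxied worker: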

    monaco.languages.onLanguage(languageID, () => {
        monaco.languages.setMonarchTokensProvider(languageID, monarchLanguage);
        monaco.languages.setLanguageConfiguration(languageID, richLanguageConfiguration);
        const client = new WorkerManager();
        const worker: WorkerAccessor = (...uris: monaco.Uri[]): Promise<TodoLangWorker> => {
            return client.getLanguageServiceWorker(...uris);
        };
        //Call the errors provider
        new DiagnosticsAdapter(worker);
    });
}

export type WorkerAccessor = (...uris: monaco.Uri[]) => Promise<TodoLangWorker>;

Now everything is ready. Run the project and type some invalid TodoLang code, and you will see the errors marked under the code.

Implementing Semantic Validation

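Now let's add semantic validation to the editor. Remember the two semantic rules I mentioned in the previous article:

  • If a TODO has been defined with the ADD TODO instruction, we cannot re-add it.
  • A COMPLETE instruction must not be applied to a TODO that has not yet been declared with ADD TODO.

To check whether a TODO is already defined, all we have to do is walk the AST, look at each ADD expression, and check whether its TODO already exists in definedTodos. If it does, that is a semantic error, so we take the position of the error from the context of the ADD expression and push the error into an array. Otherwise we push the TODO into definedTodos. The same goes for the second rule: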

function checkSemanticRules(ast: TodoExpressionsContext): ITodoLangError[] {
    const errors: ITodoLangError[] = [];
    const definedTodos: string[] = [];
    // children may be undefined when nothing was parsed (e.g. an empty document)
    (ast.children || []).forEach(node => {
        if (node instanceof AddExpressionContext) {
            // if an Add expression: ADD TODO "STRING"
            const todo = node.STRING().text;
            // If a TODO is defined using the ADD TODO instruction, we cannot re-add it.
            if (definedTodos.some(todo_ => todo_ === todo)) {
                // node has everything needed to know where this expression is in the code
                errors.push({
                    code: "2",
                    // charPositionInLine is 0-based, Monaco columns are 1-based; the end is start + token length
                    endColumn: node.stop.charPositionInLine + 1 + (node.stop.stopIndex - node.stop.startIndex + 1),
                    endLineNumber: node.stop.line,
                    message: `Todo ${todo} already defined`,
                    startColumn: node.stop.charPositionInLine + 1,
                    startLineNumber: node.stop.line
                });
            } else {
                definedTodos.push(todo);
            }
        } else if (node instanceof CompleteExpressionContext) {
            const todoToComplete = node.STRING().text;
            if (definedTodos.every(todo_ => todo_ !== todoToComplete)) {
                // the todo is not defined yet; we only check the todos defined before this expression,
                // which means the order is important
                errors.push({
                    code: "2",
                    endColumn: node.stop.charPositionInLine + 1 + (node.stop.stopIndex - node.stop.startIndex + 1),
                    endLineNumber: node.stop.line,
                    message: `Todo ${todoToComplete} is not defined`,
                    startColumn: node.stop.charPositionInLine + 1,
                    startLineNumber: node.stop.line
                });
            }
        }
    })
    return errors;
}

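Now call the checkSemanticRules function in the language service's validate method and return the syntax and semantic errors merged together. Our editor now supports semantic validation.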
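
The merge itself can be sketched independently of ANTLR. In the snippet below, `createValidator` is a hypothetical helper, and the two callbacks stand in for parseAndGetSyntaxErrors and for checkSemanticRules applied to the parsed AST. The error shape is reduced to the fields needed for the illustration.

```typescript
// Reduced error shape, assumed for this sketch only.
interface SketchError {
    message: string;
    startLineNumber: number;
    startColumn: number;
}

// Builds a validate function that runs a syntax pass and a semantic pass
// and returns both lists as one, syntax errors first.
function createValidator(
    getSyntaxErrors: (code: string) => SketchError[],
    getSemanticErrors: (code: string) => SketchError[]
): (code: string) => SketchError[] {
    return (code) => getSyntaxErrors(code).concat(getSemanticErrors(code));
}

// Stub passes that each report one fixed error, just to show the merge.
const sketchValidate = createValidator(
    () => [{ message: "mismatched input", startLineNumber: 1, startColumn: 1 }],
    () => [{ message: 'Todo "x" is not defined', startLineNumber: 2, startColumn: 15 }]
);

const mergedErrors = sketchValidate('COMPLETE TODO "x"');
```

Since both passes produce the same error shape, the DiagnosticsAdapter needs no changes to display semantic errors.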

Implementing Auto-Formatting

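For the editor's auto-formatting feature, you need to provide and register a formatting provider for Monaco by calling the Monaco API registerDocumentFormattingEditProvider. See the monaco-editor documentation for more details. Walking the AST lets you produce the prettified code: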

// LanguageService.ts
format(code: string): string {
    // if the code contains errors there is no point in formatting it, because this way of formatting
    // would drop part of the code; to keep things simple, we only format valid code
    if (this.validate(code).length > 0)
        return code;
    let formattedCode = "";
    const ast: TodoExpressionsContext = parseAndGetASTRoot(code);
    // children may be undefined when nothing was parsed (e.g. an empty document)
    (ast.children || []).forEach(node => {
        if (node instanceof AddExpressionContext) {
            // if an Add expression: ADD TODO "STRING"
            const todo = node.STRING().text;
            formattedCode += `ADD TODO ${todo}\n`;
        } else if (node instanceof CompleteExpressionContext) {
            // if a Complete expression: COMPLETE TODO "STRING"
            const todoToComplete = node.STRING().text;
            formattedCode += `COMPLETE TODO ${todoToComplete}\n`;
        }
    });
    return formattedCode;
}

Add a format method to TodoLangWorker. This format method uses the language service's format method.

Now create the TodoLangFormattingProvider class, which implements the DocumentFormattingEditProvider interface:

import * as monaco from "monaco-editor-core";
import {WorkerAccessor} from "./setup";

export default class TodoLangFormattingProvider implements monaco.languages.DocumentFormattingEditProvider {
    constructor(private worker: WorkerAccessor) { }

    provideDocumentFormattingEdits(model: monaco.editor.ITextModel, options: monaco.languages.FormattingOptions, token: monaco.CancellationToken): monaco.languages.ProviderResult<monaco.languages.TextEdit[]> {
        return this.format(model);
    }

    private async format(model: monaco.editor.ITextModel): Promise<monaco.languages.TextEdit[]> {
        // get the worker proxy
        const worker = await this.worker(model.uri)
        // call the format method proxy from the language service and get the formatted code
        const formattedCode = await worker.format(model.getValue());
        return [
            {
                text: formattedCode,
                // replace the whole document; getFullModelRange() spans from line 1, column 1 to the end
                range: model.getFullModelRange()
            }
        ]
    }
}
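
A formatting edit that replaces the whole document needs a range covering all of the text, and Monaco counts both lines and columns from 1. Monaco's ITextModel offers getFullModelRange() for exactly this; the sketch below only shows the arithmetic behind such a range on a plain string. `EditRange` and `fullTextRange` are hypothetical names.

```typescript
// A 1-based range in the style Monaco uses for edits.
interface EditRange {
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
}

// Compute the range that covers every character of `code`.
function fullTextRange(code: string): EditRange {
    const lines = code.split(/\r\n|\r|\n/);
    const lastLine = lines[lines.length - 1];
    return {
        startLineNumber: 1,
        startColumn: 1,
        endLineNumber: lines.length,
        endColumn: lastLine.length + 1, // column just past the last character
    };
}

const twoLineRange = fullTextRange('ADD TODO "a"\nCOMPLETE TODO "a"');
const emptyRange = fullTextRange("");
```

The end column is taken from the last line, not from the shortest or longest one, so the range stays valid for any document.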

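TodoLangFormattingProvider calls the format method exposed by the worker, passing the model's value (model.getValue()) as input, and gives monaco the formatted code together with the range of code to replace. Now go to the setup function and register the formatting provider with the Monaco registerDocumentFormattingEditProvider API. Re-run the app and you will see that the editor now supports auto-formatting: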

monaco.languages.registerDocumentFormattingEditProvider(languageID, new TodoLangFormattingProvider(worker));

Try clicking Format Document or pressing Shift + Alt + F, and the editor will reformat the code.

Implementing Auto-Completion

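To make auto-completion support the defined TODOs, all you have to do is get all the defined TODOs from the AST and supply a completion provider by calling registerCompletionItemProvider in setup. The completion provider gives you the code and the current position of the cursor, so you can check the context in which the user is typing. If they are typing a TODO inside a COMPLETE expression, you can suggest the TODOs that are already defined. Keep in mind that, by default, Monaco Editor offers auto-completion for the tokens already present in the code, so you may need to disable that and implement your own suggestions to make them smarter and more context-aware.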
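
The suggestion logic could be sketched as a pure function that a completion provider would call. To stay self-contained, the sketch below finds the defined TODOs with a regular expression instead of walking the AST as the article suggests. `getDefinedTodos` and `suggestTodos` are hypothetical names, and positions are 1-based as in Monaco.

```typescript
// Collect the name (quotes included) of every TODO declared with ADD TODO "...".
function getDefinedTodos(code: string): string[] {
    const todos: string[] = [];
    const addPattern = /ADD\s+TODO\s+("[^"]*")/g;
    let match: RegExpExecArray | null;
    while ((match = addPattern.exec(code)) !== null) {
        if (todos.indexOf(match[1]) === -1) {
            todos.push(match[1]);
        }
    }
    return todos;
}

// Suggest the defined TODOs only when the text before the cursor ends with COMPLETE TODO.
function suggestTodos(code: string, lineNumber: number, column: number): string[] {
    const lines = code.split(/\r\n|\r|\n/);
    const line = lines[lineNumber - 1] || "";
    const textBeforeCursor = line.slice(0, column - 1);
    if (!/COMPLETE\s+TODO\s*$/.test(textBeforeCursor)) {
        return [];
    }
    return getDefinedTodos(code);
}

const completionCode = 'ADD TODO "a"\nADD TODO "b"\nCOMPLETE TODO ';
const suggestionsAtEnd = suggestTodos(completionCode, 3, 15);
const suggestionsAtStart = suggestTodos(completionCode, 1, 1);
```

In a completion provider, each returned string would be wrapped in a completion item before being handed to Monaco.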

