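Welcome to part two of the series on creating a custom web editor with TypeScript, React, ANTLR4, and Monaco Editor. Before going further, I recommend reading "Create a custom web editor using TypeScript, React, ANTLR4, and Monaco Editor (Part 1)".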
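In this article, I will cover how to implement the language service. The language service does the heavy lifting of parsing the text typed into the editor. We will use the abstract syntax tree (AST) produced by the parser to find syntax or lexical errors, format the text, and offer intelligent suggestions for TODOs as the user types (I will not implement auto-completion in this article). Basically, the language service exposes the following functions: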
format(code: string): string
validate(code: string): Errors[]
autoComplete(code: string, currentPosition: Position): string[]
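As a rough sketch, the three functions above could be grouped into one TypeScript interface. The names Position, LanguageError, LanguageService, and noopLanguageService are assumptions made for illustration; only the three method signatures come from the list above.

```typescript
// Hypothetical supporting types; the article only names the three methods.
interface Position {
    lineNumber: number; // 1-based line of the cursor
    column: number;     // 1-based column of the cursor
}

interface LanguageError {
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
    message: string;
}

interface LanguageService {
    format(code: string): string;
    validate(code: string): LanguageError[];
    autoComplete(code: string, currentPosition: Position): string[];
}

// A do-nothing placeholder that satisfies the contract; the real
// implementation is built step by step in the rest of the article.
const noopLanguageService: LanguageService = {
    format: (code) => code,
    validate: () => [],
    autoComplete: () => [],
};
```

Coding against an interface like this keeps the editor glue code independent of how parsing is actually done.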
Add ANTLR, Generate Lexer and Parser From the Grammar
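I will add the ANTLR library and a script that generates the parser and lexer from the TodoLang grammar file. First, add the two required libraries: antlr4ts and antlr4ts-cli. Parsers generated by the ANTLR4 TypeScript target have a runtime dependency on the antlr4ts package. antlr4ts-cli, as its name suggests, is the CLI we will use to generate the parser and lexer for the language.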
npm add antlr4ts
npm add -D antlr4ts-cli
In the project root, create the file TodoLangGrammar.g4 containing the TodoLang grammar rules:
grammar TodoLangGrammar;

todoExpressions : (addExpression)* (completeExpression)*;
addExpression : ADD TODO STRING;
completeExpression : COMPLETE TODO STRING;

ADD : 'ADD';
TODO : 'TODO';
COMPLETE: 'COMPLETE';
STRING: '"' ~ ["]* '"';
EOL: [\r\n] + -> skip;
WS: [ \t] -> skip;
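To get a feel for what the lexer rules in this grammar produce, here is a small hand-written tokenizer in plain TypeScript. It is only a sketch that mimics the four token rules and the two skip rules with a regular expression; it is not the code ANTLR generates, and the names TokenSketch and tokenize are made up for this example.

```typescript
type TokenType = "ADD" | "TODO" | "COMPLETE" | "STRING";

interface TokenSketch {
    type: TokenType;
    text: string;
}

// One alternative per lexer rule: whitespace and line breaks (skipped),
// the three keywords, and a double-quoted string.
const tokenPattern = /^(?:[ \t\r\n]+|(ADD|TODO|COMPLETE)|("[^"]*"))/;

function tokenize(code: string): TokenSketch[] {
    const tokens: TokenSketch[] = [];
    let index = 0;
    while (index < code.length) {
        const match = tokenPattern.exec(code.slice(index));
        if (match === null) {
            throw new Error(`Unexpected character at offset ${index}`);
        }
        const keyword = match[1];
        const quoted = match[2];
        if (keyword !== undefined) {
            tokens.push({ type: keyword as TokenType, text: keyword });
        } else if (quoted !== undefined) {
            tokens.push({ type: "STRING", text: quoted });
        }
        // Whitespace matches produce no token, mirroring `-> skip`.
        index += match[0].length;
    }
    return tokens;
}
```

For the input `ADD TODO "Create an editor"` this yields three tokens of types ADD, TODO, and STRING, which is the sequence the addExpression rule expects.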
Now add a script to package.json that generates the parser and lexer with antlr4ts-cli:
"antlr4ts": "antlr4ts ./TodoLangGrammar.g4 -o ./src/ANTLR"
Run the antlr4ts script, and the generated TypeScript source of the parser appears in the ./src/ANTLR directory:
npm run antlr4ts
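As we can see, there is a Lexer and a Parser. If you open the parser file, you will find that it exports the class TodoLangGrammarParser. Its constructor, constructor(input: TokenStream), takes as its argument the TokenStream that TodoLangGrammarLexer generates for the given code. TodoLangGrammarLexer, in turn, has a constructor that takes the code as input: constructor(input: CharStream).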
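The parser file contains the method public todoExpressions(): TodoExpressionsContext, which returns the context object for all the TodoExpressions defined in the code. Guess where the name TodoExpressions comes from? It derives from the first rule in our grammar file: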
todoExpressions : (addExpression)* (completeExpression)*;
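TodoExpressionsContext is the root of the AST. Each node in it is another context for another rule. The tree contains terminals and node contexts, where a terminal holds a final token (an ADD token, a TODO token, or the token for the name of the todo).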
TodoExpressionsContext contains the lists of addExpressions and completeExpressions, which come from the following three rules:
todoExpressions : (addExpression)* (completeExpression)*;
addExpression : ADD TODO STRING;
completeExpression : COMPLETE TODO STRING;
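Each context class, in turn, contains terminal nodes, which basically hold the text (a code fragment or token, for example ADD, COMPLETE, or the string that represents the TODO). The complexity of the AST depends on the grammar rules you write.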
Let's take a look at AddExpressionContext. It contains the ADD, TODO, and STRING terminal nodes, which correspond to the rule:
addExpression : ADD TODO STRING;
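The STRING terminal node holds the text of the todo we want to add. Let's parse a simple piece of TodoLang code to understand how the AST works. In the ./src/language-service directory, create a file parser.ts with the following content: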
import { TodoLangGrammarParser, TodoExpressionsContext } from "../ANTLR/TodoLangGrammarParser";
import { TodoLangGrammarLexer } from "../ANTLR/TodoLangGrammarLexer";
import { ANTLRInputStream, CommonTokenStream } from "antlr4ts";

export default function parseAndGetASTRoot(code: string): TodoExpressionsContext {
    const inputStream = new ANTLRInputStream(code);
    const lexer = new TodoLangGrammarLexer(inputStream);
    const tokenStream = new CommonTokenStream(lexer);
    const parser = new TodoLangGrammarParser(tokenStream);
    // Parse the input starting at `todoExpressions`, the entry rule of our grammar
    return parser.todoExpressions();
}
parser.ts exports the parseAndGetASTRoot(code) function, which takes TodoLang code and generates the corresponding AST. Let's parse the following TodoLang code:
parseAndGetASTRoot(`
ADD TODO "Create an editor"
COMPLETE TODO "Create an editor"
`)
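As a mental model, the tree returned for that input has roughly the shape below. This is a sketch built from plain objects: the generated antlr4ts context classes carry much more (tokens, parent links, positions), and the names TerminalSketch, RuleSketch, sampleTree, and terminalTexts are invented for the illustration.

```typescript
// Plain-object stand-ins for terminal nodes and rule contexts.
interface TerminalSketch {
    kind: "terminal";
    text: string;
}

interface RuleSketch {
    kind: "rule";
    rule: string;
    children: NodeSketch[];
}

type NodeSketch = TerminalSketch | RuleSketch;

const terminal = (text: string): TerminalSketch => ({ kind: "terminal", text });

// Root context (todoExpressions) with one child context per expression.
const sampleTree: RuleSketch = {
    kind: "rule",
    rule: "todoExpressions",
    children: [
        {
            kind: "rule",
            rule: "addExpression",
            children: [terminal("ADD"), terminal("TODO"), terminal('"Create an editor"')],
        },
        {
            kind: "rule",
            rule: "completeExpression",
            children: [terminal("COMPLETE"), terminal("TODO"), terminal('"Create an editor"')],
        },
    ],
};

// Depth-first walk that collects the text of every terminal node.
function terminalTexts(node: NodeSketch): string[] {
    if (node.kind === "terminal") {
        return [node.text];
    }
    return node.children.reduce<string[]>(
        (texts, child) => texts.concat(terminalTexts(child)),
        []
    );
}
```

Walking the terminals in order gives back the original tokens, which is the property the formatter later in the article relies on.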
Implementing Lexical and Syntax Validation
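In this section, I will walk you step by step through adding syntax validation to the editor. ANTLR generates lexical and syntax errors for us out of the box. We only need to implement the ANTLRErrorListener interface and provide it to the lexer and the parser, so we can collect errors while ANTLR parses the code.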
In the ./src/language-service directory, create the file TodoLangErrorListener.ts. It exports the TodoLangErrorListener class, which implements the ANTLRErrorListener interface:
import { ANTLRErrorListener, RecognitionException, Recognizer } from "antlr4ts";

export interface ITodoLangError {
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
    message: string;
    code: string;
}

export default class TodoLangErrorListener implements ANTLRErrorListener<any> {
    private errors: ITodoLangError[] = [];

    syntaxError(recognizer: Recognizer<any, any>, offendingSymbol: any, line: number, charPositionInLine: number, message: string, e: RecognitionException | undefined): void {
        this.errors.push({
            startLineNumber: line,
            endLineNumber: line,
            // ANTLR columns are 0-based, while Monaco marker columns are 1-based
            startColumn: charPositionInLine + 1,
            // For simplicity, assume the error spans a single character
            endColumn: charPositionInLine + 2,
            message,
            code: "1" // The error code; customize it as you want
        });
    }

    getErrors(): ITodoLangError[] {
        return this.errors;
    }
}
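Every time ANTLR encounters an error while parsing the code, it calls this TodoLangErrorListener and gives it information about the error. The listener collects these errors, each with the position in the code where parsing failed and the error message. Now let's attach the TodoLangErrorListener to the lexer and the parser in parser.ts: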
import { TodoLangGrammarParser, TodoExpressionsContext } from "../ANTLR/TodoLangGrammarParser";
import { TodoLangGrammarLexer } from "../ANTLR/TodoLangGrammarLexer";
import { ANTLRInputStream, CommonTokenStream } from "antlr4ts";
import TodoLangErrorListener, { ITodoLangError } from "./TodoLangErrorListener";

function parse(code: string): { ast: TodoExpressionsContext, errors: ITodoLangError[] } {
    const inputStream = new ANTLRInputStream(code);
    const lexer = new TodoLangGrammarLexer(inputStream);
    // Replace the default console listener with ours, on both the lexer and the parser
    lexer.removeErrorListeners();
    const todoLangErrorsListener = new TodoLangErrorListener();
    lexer.addErrorListener(todoLangErrorsListener);
    const tokenStream = new CommonTokenStream(lexer);
    const parser = new TodoLangGrammarParser(tokenStream);
    parser.removeErrorListeners();
    parser.addErrorListener(todoLangErrorsListener);
    const ast = parser.todoExpressions();
    const errors: ITodoLangError[] = todoLangErrorsListener.getErrors();
    return { ast, errors };
}

export function parseAndGetASTRoot(code: string): TodoExpressionsContext {
    const { ast } = parse(code);
    return ast;
}

export function parseAndGetSyntaxErrors(code: string): ITodoLangError[] {
    const { errors } = parse(code);
    return errors;
}
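One detail worth keeping in mind when these errors later reach Monaco: ANTLR reports line as 1-based but charPositionInLine as 0-based, while Monaco lines and columns both start at 1. A minimal sketch of the conversion follows; the helper name toMarkerRange and the MarkerRange type are assumptions for the example, not part of either library.

```typescript
interface MarkerRange {
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
}

// Converts an ANTLR error position (1-based line, 0-based column) into a
// 1-based editor range covering `length` characters on the same line.
function toMarkerRange(line: number, charPositionInLine: number, length: number = 1): MarkerRange {
    const startColumn = charPositionInLine + 1;
    return {
        startLineNumber: line,
        startColumn,
        endLineNumber: line,
        // The end column points just past the last highlighted character.
        endColumn: startColumn + Math.max(length, 1),
    };
}
```

For example, an error on the first character of line 2 that spans three characters maps to columns 1 through 4 on line 2.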
In the ./src/language-service directory, create LanguageService.ts, which exports the following:
import { TodoExpressionsContext } from "../ANTLR/TodoLangGrammarParser";
// The file created earlier is named parser.ts, so the import path is lowercase
import { parseAndGetASTRoot, parseAndGetSyntaxErrors } from "./parser";
import { ITodoLangError } from "./TodoLangErrorListener";

export default class TodoLangLanguageService {
    validate(code: string): ITodoLangError[] {
        const syntaxErrors: ITodoLangError[] = parseAndGetSyntaxErrors(code);
        // Later we will append semantic errors
        return syntaxErrors;
    }
}
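Great, we have implemented error parsing for the editor. Next, I will create the web worker discussed in the previous article and add a worker service proxy that calls the language service to implement the advanced features of the editor.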
Creating the web worker
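First, we call monaco.editor.createWebWorker, which uses built-in ES6 Proxies to create a proxy for TodoLangWorker. TodoLangWorker uses the language service to perform the editor features. The methods that run in the web worker are proxied by Monaco, so calling a method of the web worker amounts to calling the proxied method on the main thread.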
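The general idea of proxying method calls can be sketched with a plain ES6 Proxy. This is not how Monaco implements its worker protocol (that is internal to Monaco); the message shape and the names WorkerMessage, createForwardingProxy, and ValidationApi are assumptions made for the illustration.

```typescript
// Hypothetical message shape; Monaco's real worker protocol differs.
interface WorkerMessage {
    method: string;
    args: unknown[];
}

// Returns an object on which every method call becomes a message for `send`.
function createForwardingProxy<T extends object>(send: (message: WorkerMessage) => unknown): T {
    return new Proxy({} as T, {
        get: (_target, property) =>
            (...args: unknown[]) => send({ method: String(property), args }),
    });
}

// Usage: the "worker" side is faked by an array that records what was sent.
interface ValidationApi {
    doValidation(): unknown;
}

const sentMessages: WorkerMessage[] = [];
const client = createForwardingProxy<ValidationApi>((message) => {
    sentMessages.push(message);
    return [];
});
client.doValidation();
```

The caller only sees an object with a doValidation method. In a real setup, `send` would post the message to the worker and return a promise for the reply.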
In the ./src/todo-lang folder, create TodoLangWorker.ts with the following content:
import * as monaco from "monaco-editor-core";
import IWorkerContext = monaco.worker.IWorkerContext;
import TodoLangLanguageService from "../language-service/LanguageService";
import { ITodoLangError } from "../language-service/TodoLangErrorListener";

export class TodoLangWorker {
    private _ctx: IWorkerContext;
    private languageService: TodoLangLanguageService;

    constructor(ctx: IWorkerContext) {
        this._ctx = ctx;
        this.languageService = new TodoLangLanguageService();
    }

    doValidation(): Promise<ITodoLangError[]> {
        const code = this.getTextDocument();
        return Promise.resolve(this.languageService.validate(code));
    }

    // Returns the text of the first mirrored model (the file open in the editor)
    private getTextDocument(): string {
        const model = this._ctx.getMirrorModels()[0];
        return model.getValue();
    }
}
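We created the language service instance and added the doValidation method, which in turn calls the validate method of the language service. We also added the getTextDocument method, which returns the text value of the editor. The TodoLangWorker class could be extended with much more functionality, for example if you want to support multi-file editing. _ctx: IWorkerContext is the context object of the editor; it holds the model information of the files.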
Now let's create the web worker file todolang.worker.ts in the ./src/todo-lang directory:
import * as worker from 'monaco-editor-core/esm/vs/editor/editor.worker';
// Matches the file name TodoLangWorker.ts (matters on case-sensitive file systems)
import { TodoLangWorker } from './TodoLangWorker';

self.onmessage = () => {
    worker.initialize((ctx) => {
        return new TodoLangWorker(ctx);
    });
};
We use the built-in worker.initialize to initialize our worker, and TodoLangWorker provides the methods that get proxied.
This is a web worker, so we have to ask webpack to output a separate bundle for it:
// webpack.config.js
entry: {
    app: './src/index.tsx',
    "editor.worker": 'monaco-editor-core/esm/vs/editor/editor.worker.js',
    "todoLangWorker": './src/todo-lang/todolang.worker.ts'
},
output: {
    globalObject: 'self',
    filename: (chunkData) => {
        switch (chunkData.chunk.name) {
            case 'editor.worker':
                return 'editor.worker.js';
            case 'todoLangWorker':
                return "todoLangWorker.js";
            default:
                return 'bundle.[hash].js';
        }
    },
    path: path.resolve(__dirname, 'dist')
}
We named the worker bundle todoLangWorker.js. Now add getWorkerUrl in the editor setup function:
(window as any).MonacoEnvironment = {
    getWorkerUrl: function (moduleId, label) {
        if (label === languageID)
            return "./todoLangWorker.js";
        return './editor.worker.js';
    }
}
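This is how Monaco gets the URL of the web worker. Note that if the label of the worker is the ID of TodoLang, we return the worker bundle that Webpack emits under the same name. If you build the project now, you should find a file named todoLangWorker.js (or, in dev tools, two workers in the threads section).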
Now create a WorkerManager that manages the creation of the worker and gives access to the proxied worker client:
import * as monaco from "monaco-editor-core";
import Uri = monaco.Uri;
// Matches the file name TodoLangWorker.ts (matters on case-sensitive file systems)
import { TodoLangWorker } from './TodoLangWorker';
import { languageID } from './config';

export class WorkerManager {
    private worker: monaco.editor.MonacoWebWorker<TodoLangWorker>;
    private workerClientProxy: Promise<TodoLangWorker>;

    constructor() {
        this.worker = null;
    }

    // Lazily creates the worker and caches the proxy to it
    private getClientproxy(): Promise<TodoLangWorker> {
        if (!this.workerClientProxy) {
            this.worker = monaco.editor.createWebWorker<TodoLangWorker>({
                moduleId: 'TodoLangWorker',
                label: languageID,
                createData: {
                    languageId: languageID,
                }
            });
            this.workerClientProxy = <Promise<TodoLangWorker>><any>this.worker.getProxy();
        }
        return this.workerClientProxy;
    }

    async getLanguageServiceWorker(...resources: Uri[]): Promise<TodoLangWorker> {
        const _client: TodoLangWorker = await this.getClientproxy();
        // Make sure the worker has an up-to-date copy of the given models
        await this.worker.withSyncedResources(resources);
        return _client;
    }
}
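We use createWebWorker to create the web worker proxied by Monaco, then we get and return the proxied client object. We use workerClientProxy to call the proxied methods. Next, let's create the DiagnosticsAdapter class, which connects the Monaco markers API to the errors returned by the language service, so that parse errors are marked correctly in the editor: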
import * as monaco from "monaco-editor-core";
import { WorkerAccessor } from "./setup";
import { languageID } from "./config";
import { ITodoLangError } from "../language-service/TodoLangErrorListener";

export default class DiagnosticsAdapter {
    constructor(private worker: WorkerAccessor) {
        const onModelAdd = (model: monaco.editor.IModel): void => {
            let handle: any;
            model.onDidChangeContent(() => {
                // Debounce user changes: after every change we wait 500ms before validating;
                // if the user is still typing, we cancel the pending validation and start over
                clearTimeout(handle);
                handle = setTimeout(() => this.validate(model.uri), 500);
            });

            this.validate(model.uri);
        };
        monaco.editor.onDidCreateModel(onModelAdd);
        monaco.editor.getModels().forEach(onModelAdd);
    }

    private async validate(resource: monaco.Uri): Promise<void> {
        // Get the worker proxy, run the validation, then turn the errors into markers
        const worker = await this.worker(resource);
        const errorMarkers = await worker.doValidation();
        const model = monaco.editor.getModel(resource);
        monaco.editor.setModelMarkers(model, languageID, errorMarkers.map(toDiagnostics));
    }
}

function toDiagnostics(error: ITodoLangError): monaco.editor.IMarkerData {
    return {
        ...error,
        severity: monaco.MarkerSeverity.Error,
    };
}
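The onDidChangeContent listener watches the model. Whenever the content changes, we wait until 500 ms have passed without a further change, then call the web worker to validate the code and add error markers. setModelMarkers tells Monaco to add the error markers. To complete the syntax validation feature, make sure to call these in the setup function, and note that we use the WorkerManager to get the proxied worker: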
// setup.ts, inside the setup function
    monaco.languages.onLanguage(languageID, () => {
        monaco.languages.setMonarchTokensProvider(languageID, monarchLanguage);
        monaco.languages.setLanguageConfiguration(languageID, richLanguageConfiguration);
        const client = new WorkerManager();
        const worker: WorkerAccessor = (...uris: monaco.Uri[]): Promise<TodoLangWorker> => {
            return client.getLanguageServiceWorker(...uris);
        };
        // Call the errors provider
        new DiagnosticsAdapter(worker);
    });
} // end of the setup function

export type WorkerAccessor = (...uris: monaco.Uri[]) => Promise<TodoLangWorker>;
Now everything is in place. Run the project and type some invalid TodoLang code; you should see the errors underlined in the code.
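The debouncing done inline in DiagnosticsAdapter can also be isolated into a small helper. The following is a minimal sketch of that technique; the name debounce and its signature are assumptions, not something the article's code or Monaco provides.

```typescript
// Wraps `action` so that it only runs once `delayMs` milliseconds have
// passed without another call; every new call cancels the pending one.
function debounce<A extends unknown[]>(
    action: (...args: A) => void,
    delayMs: number
): (...args: A) => void {
    let handle: ReturnType<typeof setTimeout> | undefined;
    return (...args: A): void => {
        if (handle !== undefined) {
            clearTimeout(handle);
        }
        handle = setTimeout(() => action(...args), delayMs);
    };
}
```

With such a helper, the content listener could be written as model.onDidChangeContent(debounce(() => this.validate(model.uri), 500)), which keeps one timer per model just like the inline version.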
Implementing Semantic Validation
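Now let's add semantic validation to the editor. Remember the two semantic rules I mentioned in the previous article:
- If a TODO is defined using an ADD TODO instruction, we cannot add it again.
- A COMPLETE instruction should not be applied to a TODO that has not yet been declared using ADD TODO.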
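To check whether a TODO is already defined, all we have to do is walk the AST, take each ADD expression, and push its TODO into definedTodos. Before pushing, we check whether the TODO already exists in definedTodos. If it does, that is a semantic error, so we get the position of the error from the context of the ADD expression and push the error into the array. The same goes for the second rule.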
import { TodoExpressionsContext, AddExpressionContext, CompleteExpressionContext } from "../ANTLR/TodoLangGrammarParser";
import { ITodoLangError } from "./TodoLangErrorListener";

function checkSemanticRules(ast: TodoExpressionsContext): ITodoLangError[] {
    const errors: ITodoLangError[] = [];
    const definedTodos: string[] = [];
    ast.children.forEach(node => {
        if (node instanceof AddExpressionContext) {
            // An Add expression: ADD TODO "STRING"
            const todo = node.STRING().text;
            // A TODO that was already defined with ADD TODO cannot be added again
            if (definedTodos.some(todo_ => todo_ === todo)) {
                // node.stop is the last token (the STRING), so it locates the error in the code
                errors.push({
                    code: "2",
                    // Token length is stopIndex - startIndex + 1; Monaco columns are 1-based
                    endColumn: node.stop.charPositionInLine + node.stop.stopIndex - node.stop.startIndex + 2,
                    endLineNumber: node.stop.line,
                    message: `Todo ${todo} already defined`,
                    startColumn: node.stop.charPositionInLine + 1,
                    startLineNumber: node.stop.line
                });
            } else {
                definedTodos.push(todo);
            }
        } else if (node instanceof CompleteExpressionContext) {
            // A Complete expression: COMPLETE TODO "STRING"
            const todoToComplete = node.STRING().text;
            // Only todos defined before this expression count, so the order matters
            if (definedTodos.every(todo_ => todo_ !== todoToComplete)) {
                errors.push({
                    code: "2",
                    endColumn: node.stop.charPositionInLine + node.stop.stopIndex - node.stop.startIndex + 2,
                    endLineNumber: node.stop.line,
                    message: `Todo ${todoToComplete} is not defined`,
                    startColumn: node.stop.charPositionInLine + 1,
                    startLineNumber: node.stop.line
                });
            }
        }
    });
    return errors;
}
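Now call the checkSemanticRules function in the validate method of the language service and return the syntax and semantic errors merged together. Our editor now supports semantic validation.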
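The merge itself is just a concatenation of the two error lists. The sketch below passes the two checks in as parameters so that it stands on its own without the generated parser; in the article's code they would be parseAndGetSyntaxErrors and a function that parses the code and runs checkSemanticRules on the resulting AST. The names ErrorSketch, ErrorCheck, and createValidate are made up for the example.

```typescript
// Minimal stand-in for ITodoLangError, just enough for the sketch.
interface ErrorSketch {
    message: string;
    code: string;
}

type ErrorCheck = (code: string) => ErrorSketch[];

// Builds a validate function that reports syntax errors first,
// followed by semantic errors.
function createValidate(getSyntaxErrors: ErrorCheck, getSemanticErrors: ErrorCheck): ErrorCheck {
    return (code) => getSyntaxErrors(code).concat(getSemanticErrors(code));
}
```

Keeping syntax errors first means the markers for broken code show up before rule violations, but the order has no effect on how Monaco displays them.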
Implementing Auto-Formatting
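For the auto-formatting feature, you need to provide and register a formatting provider with Monaco by calling the Monaco API registerDocumentFormattingEditProvider. See the monaco-editor documentation for more details. Walking the AST and re-emitting each expression gives you the prettified code: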
// languageService.ts
format(code: string): string {
    // If the code contains errors, do not format it: this way of formatting
    // would remove some of the code. To keep things simple, we only format valid code.
    if (this.validate(code).length > 0)
        return code;
    let formattedCode = "";
    const ast: TodoExpressionsContext = parseAndGetASTRoot(code);
    ast.children.forEach(node => {
        if (node instanceof AddExpressionContext) {
            // An Add expression: ADD TODO "STRING"
            const todo = node.STRING().text;
            formattedCode += `ADD TODO ${todo}\n`;
        } else if (node instanceof CompleteExpressionContext) {
            // A Complete expression: COMPLETE TODO "STRING"
            const todoToComplete = node.STRING().text;
            formattedCode += `COMPLETE TODO ${todoToComplete}\n`;
        }
    });
    return formattedCode;
}
Add a format method to TodoLangWorker; it calls the format method of the language service.
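The article does not show this worker method, so here is a sketch of what it could look like, modeled on doValidation. To keep the snippet self-contained, it uses a stand-in class and a minimal interface; FormattingService and FormattingWorkerSketch are hypothetical names, and in the project the method would live on TodoLangWorker and use its languageService field.

```typescript
// The only part of the language service this sketch relies on.
interface FormattingService {
    format(code: string): string;
}

class FormattingWorkerSketch {
    constructor(private readonly languageService: FormattingService) {}

    // Wrapped in a promise, like doValidation, because callers on the
    // main thread reach worker methods asynchronously through the proxy.
    format(code: string): Promise<string> {
        return Promise.resolve(this.languageService.format(code));
    }
}
```

The code to format is passed in as an argument here, which matches how the formatting provider calls worker.format(code).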
Now create the TodoLangFormattingProvider class, which implements the DocumentFormattingEditProvider interface:
import * as monaco from "monaco-editor-core";
import { WorkerAccessor } from "./setup";

export default class TodoLangFormattingProvider implements monaco.languages.DocumentFormattingEditProvider {

    constructor(private worker: WorkerAccessor) {
    }

    provideDocumentFormattingEdits(model: monaco.editor.ITextModel, options: monaco.languages.FormattingOptions, token: monaco.CancellationToken): monaco.languages.ProviderResult<monaco.languages.TextEdit[]> {
        return this.format(model);
    }

    private async format(model: monaco.editor.ITextModel): Promise<monaco.languages.TextEdit[]> {
        // Get the worker proxy
        const worker = await this.worker(model.uri);
        // Call the format method proxy of the language service and get the formatted code
        const formattedCode = await worker.format(model.getValue());
        return [
            {
                text: formattedCode,
                // Replace the whole document; Monaco lines and columns are 1-based
                range: model.getFullModelRange()
            }
        ];
    }
}
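TodoLangFormattingProvider calls the format method exposed by the worker, passing the content of the editor (model.getValue()) as input, and gives Monaco the formatted code together with the range of code to replace. Now go to the setup function and register the formatting provider with the Monaco registerDocumentFormattingEditProvider API. Rerun the application, and you should see that the editor supports auto-formatting: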
monaco.languages.registerDocumentFormattingEditProvider(languageID, new TodoLangFormattingProvider(worker));
Try clicking Format Document or pressing Shift + Alt + F, and the code in the editor gets reformatted.
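The range handed to Monaco along with the formatted text has to cover the whole document, with 1-based lines and columns. Monaco models offer getFullModelRange() for this; the sketch below shows the same computation done by hand on a plain string, with the hypothetical names TextRange and fullTextRange.

```typescript
interface TextRange {
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
}

// Computes the 1-based range that spans all of `code`.
function fullTextRange(code: string): TextRange {
    const lines = code.split(/\r\n|\r|\n/);
    const lastLine = lines[lines.length - 1] || "";
    return {
        startLineNumber: 1,
        startColumn: 1,
        endLineNumber: lines.length,
        // The end column points just past the last character of the last line.
        endColumn: lastLine.length + 1,
    };
}
```

A range that starts at line 0 or column 0, or that ends before the last line, would either be rejected or leave part of the old text in place after formatting.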
Implementing Auto-Completion
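To make auto-completion support the defined TODOs, all you have to do is get all defined TODOs from the AST and provide a completion provider by calling registerCompletionItemProvider in setup. The completion provider gives you the code and the current position of the cursor, so you can check the context in which the user is typing; if they are typing a TODO inside a COMPLETE expression, you can suggest the already-defined TODOs. Keep in mind that, by default, Monaco Editor offers completion for the tokens that already exist in your code, so you may need to disable that and implement your own to make completion more intelligent and contextual.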
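As a starting point, the suggestion logic could look like the sketch below. Note the swapped-in technique: it scans the text with regular expressions instead of walking the AST as described above, purely to stay self-contained, and the function name suggestTodos and its offset-based cursor parameter are assumptions for the example.

```typescript
// Suggests already-defined todos when the cursor sits right after
// `COMPLETE TODO` (optionally inside an unfinished string).
function suggestTodos(code: string, cursorOffset: number): string[] {
    const beforeCursor = code.slice(0, cursorOffset);
    // Only suggest while the user is typing the todo of a COMPLETE expression.
    if (!/COMPLETE\s+TODO\s+"?[^"\n]*$/.test(beforeCursor)) {
        return [];
    }
    const definedTodos: string[] = [];
    const addPattern = /ADD\s+TODO\s+("[^"]*")/g;
    let match: RegExpExecArray | null;
    // Collect every todo defined before the cursor, without duplicates.
    while ((match = addPattern.exec(beforeCursor)) !== null) {
        const todo = match[1];
        if (todo !== undefined && definedTodos.indexOf(todo) === -1) {
            definedTodos.push(todo);
        }
    }
    return definedTodos;
}
```

A completion provider would map each returned string to a completion item. Only todos defined before the cursor are offered, mirroring the order-sensitive semantic rule for COMPLETE.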