This article analyzes webpack version 5.74.0.

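Preface

  1. The webpack 5 codebase is very complex. To keep the complexity manageable, every analysis in this article covers only the js file type; other types (css, image) are not analyzed, and all examples use js.
  2. To improve readability, the source code is trimmed, reordered, and altered; treat all source code in this article as pseudocode.
  3. The article assumes the reader already knows the basics of tapable, loader, and plugin. Code involving asyncQueue, tapable, loader, and plugin is shown directly without much explanation.
  4. Because the webpack 5 codebase is very complex, only the core code is extracted for analysis and explanation.
What counts as core code is my own judgment, so some parts that readers consider core will inevitably be missing. If you notice a gap, please refer to other articles or let me know by private message or in the comments.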

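Article outline

We start from compilation entry -> make -> seal, then give an overview of the whole seal stage (as a flow chart and simplified code), and then analyze in detail the core modules extracted from the flow chart. The analysis focuses on:

  • The relationships among Module, Chunk, ChunkGroup, and ChunkGraph
  • The differences between the seal stage and the make stage
  • An in-depth look at the SplitChunksPlugin source code

The goal is a clear understanding of how Chunks are built in complex cases.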

1. Overview of the seal stage

1.1 Compilation entry -> make -> seal

    // node_modules/webpack/lib/webpack.js
    const webpack = (options, callback) => {
        const { compiler, watch, watchOptions } = create(options);
        compiler.run();
        return compiler;
    };

    // node_modules/webpack/lib/Compiler.js
    class Compiler {
        run(callback) {
            // onCompiled (omitted here) runs once compile() has finished
            const run = () => {
                this.compile(onCompiled);
            };
            run();
        }
        compile(callback) {
            const params = this.newCompilationParams();
            this.hooks.beforeCompile.callAsync(params, err => {
                const compilation = this.newCompilation(params);
                // make stage: build the module graph starting from the entries
                this.hooks.make.callAsync(compilation, err => {
                    // seal stage: build chunks, optimize, generate code
                    compilation.seal(err => {
                        this.hooks.afterCompile.callAsync(compilation, err => {
                            return callback(null, compilation);
                        });
                    });
                });
            });
        }
    }

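The previous article, 「Webpack5源码」make阶段(流程图)剖析, analyzed the main modules of the make stage in detail. Starting from the entry file, webpack resolves the dependency path, transforms the file content with loaders, and parses the result into an AST to find the dependencies of that file. It then repeats the same cycle for every dependency: resolve the path -> transform the content with loaders -> find dependencies in the AST. Once everything has been processed, compilation.seal() is triggered.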

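1.2 The seal stage as a whole

  • create chunks: iterate this.entries and build the Chunks, including the Chunks formed by entry files, the Chunks formed by async dependencies, and so on
  • optimize: optimize the Chunks that were built; this involves SplitChunksPlugin
  • code generation: produce the final code from the Chunks above, including the runtime and the code of each module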

    seal(callback) {
        const chunkGraph = new ChunkGraph(
            this.moduleGraph,
            this.outputOptions.hashFunction
        );
        this.chunkGraph = chunkGraph;
        //...
        this.logger.time("create chunks");
        /** @type {Map<Entrypoint, Module[]>} */
        const chunkGraphInit = new Map();
        for (const [name, { dependencies, includeDependencies, options }] of this.entries) {
            const chunk = this.addChunk(name);
            const entrypoint = new Entrypoint(options);
            //...
        }
        //...
        buildChunkGraph(this, chunkGraphInit);
        this.logger.timeEnd("create chunks");

        this.logger.time("optimize");
        //...
        while (this.hooks.optimizeChunks.call(this.chunks, this.chunkGroups)) {
            /* empty */
        }
        //...
        this.logger.timeEnd("optimize");

        this.logger.time("code generation");
        this.codeGeneration(err => {
            //...
            this.logger.timeEnd("code generation");
        });
    }

    const buildChunkGraph = (compilation, inputEntrypointsAndModules) => {
        // PART ONE
        logger.time("visitModules");
        visitModules(...);
        logger.timeEnd("visitModules");

        // PART TWO
        logger.time("connectChunkGroups");
        connectChunkGroups(...);
        logger.timeEnd("connectChunkGroups");

        for (const [chunkGroup, chunkGroupInfo] of chunkGroupInfoMap) {
            for (const chunk of chunkGroup.chunks)
                chunk.runtime = mergeRuntime(chunk.runtime, chunkGroupInfo.runtime);
        }

        // Cleanup work
        logger.time("cleanup");
        cleanupUnconnectedGroups(compilation, allCreatedChunkGroups);
        logger.timeEnd("cleanup");
    };

1.3 Flow chart of the seal stage

1.4 Key concepts

Dependency & Module

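Each file is first represented as a Dependency. There are different Dependency classes for different kinds of references, for example EntryDependency or HarmonyImportSideEffectDependency.
Each type of Dependency is converted into a Module (a NormalModule for ordinary js files) by its corresponding ModuleFactory.

Besides the original source code, the NormalModule built from a file carries a lot of useful information, for example the loaders it uses, its dependencies, and its exports.
The figure below is from "An in-depth perspective on webpack's bundling process".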

Chunk & ChunkGroup & EntryPoint

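A Chunk wraps one or more Modules.
A ChunkGroup consists of one or more Chunks. A ChunkGroup can be the parent or the child of other ChunkGroups.
An EntryPoint is a ChunkGroup of the entry type; it contains the entry Chunk.

The figure below is from "An in-depth perspective on webpack's bundling process".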
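
The containment relationships above can be sketched with a few toy classes. This is only an illustration under simplifying assumptions: `MiniChunk`, `MiniChunkGroup`, `MiniEntrypoint`, and `connect` are hypothetical names, not webpack's real classes, and they keep only the fields needed to show the two-way links.

```javascript
// Toy model (hypothetical classes, not webpack's implementation).
class MiniChunk {
  constructor(name) {
    this.name = name;
    this.groups = new Set(); // ChunkGroups this chunk belongs to
  }
  addGroup(group) {
    this.groups.add(group);
  }
}

class MiniChunkGroup {
  constructor(name) {
    this.name = name;
    this.chunks = [];
    this.parents = new Set();
    this.children = new Set();
  }
  // Returns true only when the chunk was not in the group yet.
  pushChunk(chunk) {
    if (this.chunks.includes(chunk)) return false;
    this.chunks.push(chunk);
    return true;
  }
  // Parent/child links are recorded on both sides.
  addChild(group) {
    this.children.add(group);
    group.parents.add(this);
  }
}

// An entrypoint is a ChunkGroup that also remembers its entry chunk.
class MiniEntrypoint extends MiniChunkGroup {
  setEntrypointChunk(chunk) {
    this.entrypointChunk = chunk;
  }
}

// Two-way link between a group and a chunk.
const connect = (group, chunk) => {
  if (group.pushChunk(chunk)) chunk.addGroup(group);
};

const main = new MiniEntrypoint("main");
const mainChunk = new MiniChunk("main");
main.setEntrypointChunk(mainChunk);
connect(main, mainChunk);

const lazy = new MiniChunkGroup("lazy");
connect(lazy, new MiniChunk("lazy"));
main.addChild(lazy); // "lazy" is loaded from "main"
```

Connecting the same pair twice is a no-op in this sketch, because `pushChunk` reports whether anything changed.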

ChunkGraph

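ChunkGraph manages the relationships among module, chunk, and chunkGroup.

The class diagrams below do not list every property, only the ones I consider important. The two diagrams are meant to help understand the role and the bookkeeping logic of ChunkGraph; they are not meant as a complete summary.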
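
As a rough illustration of that bookkeeping, the sketch below models ChunkGraph as two lookup tables, one keyed by module and one keyed by chunk. `MiniChunkGraph` is a hypothetical, heavily simplified stand-in; the method names follow the ChunkGraph methods quoted later in this article, but the real class stores much more.

```javascript
// Toy ChunkGraph (hypothetical): every relation is stored on both sides.
class MiniChunkGraph {
  constructor() {
    this._modules = new Map(); // module -> { chunks, entryInChunks }
    this._chunks = new Map(); // chunk -> { modules, entryModules }
  }
  _getChunkGraphModule(module) {
    let cgm = this._modules.get(module);
    if (cgm === undefined) {
      cgm = { chunks: new Set(), entryInChunks: undefined };
      this._modules.set(module, cgm);
    }
    return cgm;
  }
  _getChunkGraphChunk(chunk) {
    let cgc = this._chunks.get(chunk);
    if (cgc === undefined) {
      cgc = { modules: new Set(), entryModules: new Map() };
      this._chunks.set(chunk, cgc);
    }
    return cgc;
  }
  connectChunkAndModule(chunk, module) {
    this._getChunkGraphModule(module).chunks.add(chunk);
    this._getChunkGraphChunk(chunk).modules.add(module);
  }
  connectChunkAndEntryModule(chunk, module, entrypoint) {
    const cgm = this._getChunkGraphModule(module);
    if (cgm.entryInChunks === undefined) cgm.entryInChunks = new Set();
    cgm.entryInChunks.add(chunk);
    this._getChunkGraphChunk(chunk).entryModules.set(module, entrypoint);
  }
  isModuleInChunk(module, chunk) {
    return this._getChunkGraphChunk(chunk).modules.has(module);
  }
}

const graph = new MiniChunkGraph();
const chunk = { name: "main" };
const entrypoint = { name: "main" };
const indexModule = { request: "./index.js" };
const utilModule = { request: "./util.js" };

graph.connectChunkAndEntryModule(chunk, indexModule, entrypoint);
graph.connectChunkAndModule(chunk, indexModule);
graph.connectChunkAndModule(chunk, utilModule);
```

Keeping the relation outside the Module and Chunk objects means that a question such as "which chunks contain this module?" is answered by one lookup in the graph.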


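2. Iterate this.entries and create Chunk and ChunkGroup

  1. Initialize new ChunkGraph()
  2. Iterate the this.entries collection; for each name, call addChunk() to create a new Chunk and create the corresponding new Entrypoint(), which is a ChunkGroup
  3. Store a series of objects for later use: namedChunkGroups, entrypoints, and chunkGroups
  4. Connect the chunk and the ChunkGroup with connectChunkGroupAndChunk()
  5. Finally, iterate the dependencies of each item in this.entries, because one entry Chunk may contain several files. For example, with entry: {A: ["1.js", "2.js"]}, Chunk A contains 1.js and 2.js, and the dependencies of that entry are 1.js and 2.js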

    seal() {
        const chunkGraph = new ChunkGraph(
            this.moduleGraph,
            this.outputOptions.hashFunction
        );
        this.chunkGraph = chunkGraph;
        for (const [name, { dependencies, includeDependencies, options }] of this.entries) {
            // 1. Get the chunk object
            const chunk = this.addChunk(name);

            // 2. Create an Entrypoint from options; an entrypoint is a chunkGroup object
            const entrypoint = new Entrypoint(options);

            // 3. Fill several lookup structures
            if (!options.dependOn && !options.runtime) {
                entrypoint.setRuntimeChunk(chunk); // used later when generating the runtime code
            }
            entrypoint.setEntrypointChunk(chunk);
            this.namedChunkGroups.set(name, entrypoint);
            this.entrypoints.set(name, entrypoint);
            this.chunkGroups.push(entrypoint);

            // 4. Connect chunkGroup and chunk
            // const connectChunkGroupAndChunk = (chunkGroup, chunk) => {
            //     if (chunkGroup.pushChunk(chunk)) {
            //         chunk.addGroup(chunkGroup);
            //     }
            // };
            connectChunkGroupAndChunk(entrypoint, chunk);

            // 5. Connect every entry dependency's module to the chunk
            for (const dep of [...this.globalEntry.dependencies, ...dependencies]) {
                entrypoint.addOrigin(null, { name }, /** @type {any} */(dep).request);
                const module = this.moduleGraph.getModule(dep);
                if (module) {
                    chunkGraph.connectChunkAndEntryModule(chunk, module, entrypoint);
                    //...
                }
            }
        }
    }

2.1 this.entries

What is this.entries?

From the analysis of hooks.make.tapAsync(), we know that the entry file is passed in at the very beginning. createDependency() builds an EntryDependency from it, and compilation.addEntry() then starts the make stage.

    // node_modules/webpack/lib/EntryPlugin.js
    apply(compiler) {
        const { entry, options, context } = this;
        const dep = EntryPlugin.createDependency(entry, options);
        compiler.hooks.make.tapAsync("EntryPlugin", (compilation, callback) => {
            compilation.addEntry(context, dep, options, err => {
                callback(err);
            });
        });
    }
    static createDependency(entry, options) {
        const dep = new EntryDependency(entry);
        // TODO webpack 6 remove string option
        dep.loc = { name: typeof options === "object" ? options.name : options };
        return dep;
    }

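Inside addEntry():

  • Create the entryData object
  • entryData[target].push(entry)
  • this.entries.set(name, entryData)

In other words, this.entries maps each entry name to an entryData object whose dependencies array holds the entry-type Dependency objects.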

    // node_modules/webpack/lib/Compilation.js
    addEntry(context, entry, optionsOrName, callback) {
        // A plain string is treated as the entry name
        const options =
            typeof optionsOrName === "object"
                ? optionsOrName
                : { name: optionsOrName };
        this._addEntryItem(context, entry, "dependencies", options, callback);
    }
    _addEntryItem(context, entry, target, options, callback) {
        const { name } = options;
        let entryData =
            name !== undefined ? this.entries.get(name) : this.globalEntry;
        if (entryData === undefined) {
            // First dependency for this entry name: create the record
            entryData = {
                dependencies: [],
                includeDependencies: [],
                options: {
                    name: undefined,
                    ...options
                }
            };
            entryData[target].push(entry);
            this.entries.set(name, entryData);
        } else {
            entryData[target].push(entry);
            //...
        }
        //...
        this.addModuleTree(/* ...arguments omitted */);
    }
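
To make the shape of this.entries concrete, here is a minimal sketch that keeps only the bookkeeping of _addEntryItem(). The function `addEntryItem`, the standalone `entries` map, and the plain `{ request }` objects standing in for EntryDependency are assumptions made for illustration.

```javascript
// Hypothetical stand-in for Compilation#_addEntryItem, bookkeeping only.
const entries = new Map();

function addEntryItem(entry, target, options) {
  const { name } = options;
  let entryData = entries.get(name);
  if (entryData === undefined) {
    entryData = {
      dependencies: [],
      includeDependencies: [],
      options: { name: undefined, ...options }
    };
    entries.set(name, entryData);
  }
  entryData[target].push(entry);
  return entryData;
}

// entry: { A: ["./1.js", "./2.js"] } yields one dependency per file,
// both stored under the same entry name.
addEntryItem({ request: "./1.js" }, "dependencies", { name: "A" });
addEntryItem({ request: "./2.js" }, "dependencies", { name: "A" });

const requests = entries.get("A").dependencies.map(dep => dep.request);
console.log(requests); // [ './1.js', './2.js' ]
```

The seal stage later destructures exactly this record: `for (const [name, { dependencies, includeDependencies, options }] of this.entries)`.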

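Back to the seal stage that this article analyzes: iterating this.entries at the start is simply iterating the entry files, where name is the name of the entry and dependencies holds its EntryDependency objects. To summarize:

During the iteration, for each entry we call addChunk() to build a Chunk object and new Entrypoint() to build a ChunkGroup object, and then use connectChunkGroupAndChunk() to link the ChunkGroup and the Chunk.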

    seal() {
        const chunkGraph = new ChunkGraph(
            this.moduleGraph,
            this.outputOptions.hashFunction
        );
        this.chunkGraph = chunkGraph;
        for (const [name, { dependencies, includeDependencies, options }] of this.entries) {
            // 1. Get the chunk object
            const chunk = this.addChunk(name);

            // 2. Create an Entrypoint from options; an entrypoint is a chunkGroup object
            const entrypoint = new Entrypoint(options);

            // 3. Fill several lookup structures
            if (!options.dependOn && !options.runtime) {
                entrypoint.setRuntimeChunk(chunk); // used later when generating the runtime code
            }
            entrypoint.setEntrypointChunk(chunk);
            this.namedChunkGroups.set(name, entrypoint);
            this.entrypoints.set(name, entrypoint);
            this.chunkGroups.push(entrypoint);

            // 4. Connect chunkGroup and chunk
            // const connectChunkGroupAndChunk = (chunkGroup, chunk) => {
            //     if (chunkGroup.pushChunk(chunk)) {
            //         chunk.addGroup(chunkGroup);
            //     }
            // };
            connectChunkGroupAndChunk(entrypoint, chunk);
            //...
        }
    }

    addChunk(name) {
        // If a chunk with this name already exists in namedChunks, return it
        if (name) {
            const chunk = this.namedChunks.get(name);
            if (chunk !== undefined) {
                return chunk;
            }
        }
        // Otherwise create a new chunk instance
        const chunk = new Chunk(name, this._backCompat);
        this.chunks.add(chunk);
        if (this._backCompat)
            // Register in the ChunkGraphForChunk map
            ChunkGraph.setChunkGraphForChunk(chunk, this.chunkGraph);
        if (name) {
            // Register in the namedChunks map
            this.namedChunks.set(name, chunk);
        }
        return chunk;
    }

2.2 this.entries.dependencies

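For example, with entry: {A: ["1.js", "2.js"]}, Chunk A contains 1.js and 2.js, and the dependencies of that entry are 1.js and 2.js.

  1. Use dep to get the corresponding NormalModule, that is, look up the Module object that belongs to the dependency
  2. Use chunkGraph.connectChunkAndEntryModule() to link chunk, module, and chunkGroup
  3. assignDepths() walks all dependencies reachable from the entry modules and assigns a depth to every module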

    seal() {
        const chunkGraph = new ChunkGraph(
            this.moduleGraph,
            this.outputOptions.hashFunction
        );
        this.chunkGraph = chunkGraph;
        /** @type {Map<Entrypoint, Module[]>} */
        const chunkGraphInit = new Map();
        for (const [name, { dependencies, includeDependencies, options }] of this.entries) {
            // For every entry: new Chunk() and new Entrypoint() (a ChunkGroup)
            // Connect chunkGroup and chunk
            //...

            // Connect chunk, module, and chunkGroup
            const entryModules = new Set();
            for (const dep of [...this.globalEntry.dependencies, ...dependencies]) {
                entrypoint.addOrigin(null, { name }, /** @type {any} */(dep).request);
                const module = this.moduleGraph.getModule(dep);
                if (module) {
                    // const cgm = this._getChunkGraphModule(module);
                    // const cgc = this._getChunkGraphChunk(chunk);
                    // if (cgm.entryInChunks === undefined) {
                    //     cgm.entryInChunks = new Set();
                    // }
                    // cgm.entryInChunks.add(chunk);
                    // cgc.entryModules.set(module, entrypoint);
                    chunkGraph.connectChunkAndEntryModule(chunk, module, entrypoint);
                    entryModules.add(module);

                    // Remember the entry modules of each entrypoint for buildChunkGraph()
                    const modulesList = chunkGraphInit.get(entrypoint);
                    if (modulesList === undefined) {
                        chunkGraphInit.set(entrypoint, [module]);
                    } else {
                        modulesList.push(module);
                    }
                }
            }
            // Assign a depth to each module
            this.assignDepths(entryModules);
        }
    }
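
The idea behind the depth assignment in step 3 can be sketched as a breadth-first walk: entry modules get depth 0, the modules they import get depth 1, and so on. This is a simplified illustration; the adjacency map `graph` and the returned `Map` are assumptions, and webpack's real assignDepths() works on the ModuleGraph and differs in detail.

```javascript
// Simplified breadth-first depth assignment.
// `graph` is a hypothetical adjacency map: module name -> imported module names.
function assignDepths(entryModules, graph) {
  const depths = new Map();
  let current = [...entryModules];
  let depth = 0;
  while (current.length > 0) {
    const next = [];
    for (const module of current) {
      if (depths.has(module)) continue; // already reached on a shorter path
      depths.set(module, depth);
      for (const ref of graph.get(module) || []) next.push(ref);
    }
    current = next;
    depth++;
  }
  return depths;
}

const graph = new Map([
  ["1.js", ["a.js", "b.js"]],
  ["2.js", ["b.js"]],
  ["a.js", ["c.js"]],
  ["b.js", ["c.js"]],
  ["c.js", []]
]);

// entry: { A: ["1.js", "2.js"] }
const depths = assignDepths(["1.js", "2.js"], graph);
```

Because the walk proceeds level by level, a module that is reachable through several paths keeps the smallest depth.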

3. Overview of buildChunkGraph

The code below shows that buildChunkGraph() consists of three main parts:

  • visitModules()
  • connectChunkGroups()
  • cleanupUnconnectedGroups()

The logic of each part is fairly complex, so each one is analyzed in detail below.

    seal(callback) {
        const chunkGraph = new ChunkGraph(
            this.moduleGraph,
            this.outputOptions.hashFunction
        );
        this.chunkGraph = chunkGraph;
        //...
        this.logger.time("create chunks");
        /** @type {Map<Entrypoint, Module[]>} */
        const chunkGraphInit = new Map();
        for (const [name, { dependencies, includeDependencies, options }] of this.entries) {
            const chunk = this.addChunk(name);
            const entrypoint = new Entrypoint(options);
            //...
        }
        //...
        buildChunkGraph(this, chunkGraphInit);
        //...
    }

    const buildChunkGraph = (compilation, inputEntrypointsAndModules) => {
        // PART ONE
        logger.time("visitModules");
        visitModules(...);
        logger.timeEnd("visitModules");

        // PART TWO
        logger.time("connectChunkGroups");
        connectChunkGroups(...);
        logger.timeEnd("connectChunkGroups");

        for (const [chunkGroup, chunkGroupInfo] of chunkGroupInfoMap) {
            for (const chunk of chunkGroup.chunks)
                chunk.runtime = mergeRuntime(chunk.runtime, chunkGroupInfo.runtime);
        }

        // Cleanup work
        logger.time("cleanup");
        cleanupUnconnectedGroups(compilation, allCreatedChunkGroups);
        logger.timeEnd("cleanup");
    };

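4. buildChunkGraph-1-visitModules

The code block below shows that visitModules consists of three main parts:

  • Iterate inputEntrypointsAndModules and initialize chunkGroupInfo
  • Iterate chunkGroupsForCombining: handle chunkGroups that have a parent chunkGroup by linking the two chunkGroupInfo objects to each other
  • Process the queues: two queues that are processed in a loop until both are empty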

    const visitModules = (/* ...arguments omitted */) => {
        for (const [chunkGroup, modules] of inputEntrypointsAndModules) {
            // Iterate inputEntrypointsAndModules and initialize chunkGroupInfo
        }
        for (const chunkGroupInfo of chunkGroupsForCombining) {
            // Handle chunkGroups that have a parent chunkGroup:
            // link the two chunkGroupInfo objects to each other
        }
        while (queue.length || queueConnect.size) {
            processQueue(); // inner loop
            if (chunkGroupsForCombining.size > 0) {
                processChunkGroupsForCombining();
            }
            if (queueConnect.size > 0) {
                processConnectQueue();
                if (chunkGroupsForMerging.size > 0) {
                    processChunkGroupsForMerging();
                }
            }
            if (outdatedChunkGroupInfo.size > 0) {
                processOutdatedChunkGroupInfo();
            }
        }
    };

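4.1 visitModules flow chart

4.2 Iterate inputEntrypointsAndModules and initialize chunkGroupInfo

As analyzed in section 2.2 above and shown in the code below, the chunkGraphInit structure is initialized with the entrypoint as the key, and every Module that belongs to that entry is added to the array.

For example, with entry: {A: ["1.js", "2.js"]}, Chunk A contains 1.js and 2.js, the dependencies of that entry are 1.js and 2.js, and the array that chunkGraphInit stores for the entrypoint contains the modules of 1.js and 2.js.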

    // node_modules/webpack/lib/Compilation.js
    for (const [name, { dependencies, includeDependencies, options }] of this
        .entries) {
        const chunk = this.addChunk(name);
        if (options.filename) {
            chunk.filenameTemplate = options.filename;
        }
        const entrypoint = new Entrypoint(options);
        //...
        for (const dep of [...this.globalEntry.dependencies, ...dependencies]) {
            entrypoint.addOrigin(null, { name }, /** @type {any} */(dep).request);
            const module = this.moduleGraph.getModule(dep);
            if (module) {
                chunkGraph.connectChunkAndEntryModule(chunk, module, entrypoint);
                entryModules.add(module);
                const modulesList = chunkGraphInit.get(entrypoint);
                if (modulesList === undefined) {
                    chunkGraphInit.set(entrypoint, [module]);
                } else {
                    modulesList.push(module);
                }
            }
        }
        //...
    }

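The code below iterates over all of inputEntrypointsAndModules, collects every NormalModule that belongs to an entry, and adds it to queue.

Before a module is added to queue, the code checks whether the entry's chunkGroup has a parent. If it does, its chunkGroupInfo goes directly into chunkGroupsForCombining and nothing is pushed to queue.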

    // Trimmed code, only the part under analysis is kept
    // inputEntrypointsAndModules = { Entrypoint: [NormalModule] }
    // Since Entrypoint extends ChunkGroup, this is also
    // inputEntrypointsAndModules = { ChunkGroup: [NormalModule] }
    for (const [chunkGroup, modules] of inputEntrypointsAndModules) {
        const runtime = getEntryRuntime(
            compilation,
            chunkGroup.name,
            chunkGroup.options
        );
        // Create the chunkGroupInfo for this entry
        const chunkGroupInfo = {
            chunkGroup,
            runtime,
            minAvailableModules: undefined, // minimal set of modules already available at this point
            minAvailableModulesOwned: false,
            availableModulesToBeMerged: [],
            skippedItems: undefined,
            resultingAvailableModules: undefined,
            children: undefined,
            availableSources: undefined,
            availableChildren: undefined
        };
        if (chunkGroup.getNumberOfParents() > 0) {
            // The chunkGroup has a parent chunkGroup, so the parent may already
            // reference it elsewhere; it needs separate handling
            chunkGroupsForCombining.add(chunkGroupInfo);
        } else {
            chunkGroupInfo.minAvailableModules = EMPTY_SET;
            const chunk = chunkGroup.getEntrypointChunk();
            for (const module of modules) {
                queue.push({
                    action: ADD_AND_ENTER_MODULE,
                    block: module,
                    module,
                    chunk,
                    chunkGroup,
                    chunkGroupInfo
                });
            }
        }
        chunkGroupInfoMap.set(chunkGroup, chunkGroupInfo);
        if (chunkGroup.name) {
            namedChunkGroups.set(chunkGroup.name, chunkGroupInfo);
        }
    }

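4.3 Check chunkGroupsForCombining: handle an EntryPoint that has a parent chunkGroup

Iterate chunkGroupsForCombining and link the two chunkGroupInfo objects to each other. In essence, the child's availableSources receives the parent's chunkGroupInfo and the parent's availableChildren receives the child's chunkGroupInfo.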

    // Handle chunkGroups that have a parent chunkGroup:
    // link the two chunkGroupInfo objects to each other
    for (const chunkGroupInfo of chunkGroupsForCombining) {
        const { chunkGroup } = chunkGroupInfo;
        chunkGroupInfo.availableSources = new Set();
        for (const parent of chunkGroup.parentsIterable) {
            const parentChunkGroupInfo = chunkGroupInfoMap.get(parent);
            chunkGroupInfo.availableSources.add(parentChunkGroupInfo);
            if (parentChunkGroupInfo.availableChildren === undefined) {
                parentChunkGroupInfo.availableChildren = new Set();
            }
            parentChunkGroupInfo.availableChildren.add(chunkGroupInfo);
        }
    }

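4.4 processQueue: processing the queue

After all entry modules have been pushed onto queue with the initial action ADD_AND_ENTER_MODULE, the action of each item keeps changing and a different handler runs for each action.
As processQueue() below shows, several of the cases contain no break statement, so execution falls through from one case to the next in this order (an item queued as ADD_AND_ENTER_MODULE joins the chain at the second step):
ADD_AND_ENTER_ENTRY_MODULE->ADD_AND_ENTER_MODULE->ENTER_MODULE->PROCESS_BLOCK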

    for (const [chunkGroup, modules] of inputEntrypointsAndModules) {
        // Create the chunkGroupInfo for this entry
        const chunkGroupInfo = {
            chunkGroup,
            runtime,
            //...
        };
        chunkGroupInfo.minAvailableModules = EMPTY_SET;
        const chunk = chunkGroup.getEntrypointChunk();
        for (const module of modules) {
            queue.push({
                action: ADD_AND_ENTER_MODULE,
                block: module,
                module,
                chunk,
                chunkGroup,
                chunkGroupInfo
            });
        }
    }
    // Items are taken from queue with pop(), so reverse the array
    // to keep the original visiting order
    queue.reverse();

    const processQueue = () => {
        while (queue.length) {
            statProcessedQueueItems++;
            const queueItem = queue.pop();
            module = queueItem.module;
            block = queueItem.block;
            chunk = queueItem.chunk;
            chunkGroup = queueItem.chunkGroup;
            chunkGroupInfo = queueItem.chunkGroupInfo;

            switch (queueItem.action) {
                case ADD_AND_ENTER_ENTRY_MODULE:
                //...
                case ADD_AND_ENTER_MODULE:
                //...
                case ENTER_MODULE:
                //...
                case PROCESS_BLOCK: {
                    processBlock(block);
                    break;
                }
                case PROCESS_ENTRY_BLOCK: {
                    processEntryBlock(block);
                    break;
                }
                case LEAVE_MODULE:
                //...
            }
        }
    };

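
The fall-through behaviour is easy to miss, so here is a small standalone sketch of it. The constant names match the actions in the excerpt; their numeric values, the function `stepsFor`, and the step labels are assumptions made for illustration.

```javascript
// Action constants (the numeric values are arbitrary in this sketch).
const ADD_AND_ENTER_ENTRY_MODULE = 0;
const ADD_AND_ENTER_MODULE = 1;
const ENTER_MODULE = 2;
const PROCESS_BLOCK = 3;

// Returns the names of the steps executed for one queue item.
function stepsFor(action, alreadyInChunk = false) {
  const steps = [];
  switch (action) {
    case ADD_AND_ENTER_ENTRY_MODULE:
      steps.push("connect entry module");
    // fallthrough
    case ADD_AND_ENTER_MODULE:
      if (alreadyInChunk) break; // the only early exit in the chain
      steps.push("connect module");
    // fallthrough
    case ENTER_MODULE:
      steps.push("enter module");
    // fallthrough
    case PROCESS_BLOCK:
      steps.push("process block");
      break;
  }
  return steps;
}

const fromEntry = stepsFor(ADD_AND_ENTER_ENTRY_MODULE);
const fromAdd = stepsFor(ADD_AND_ENTER_MODULE);
const fromBlock = stepsFor(PROCESS_BLOCK);
const duplicate = stepsFor(ADD_AND_ENTER_MODULE, true);
```

An item therefore runs every step from its own action down to PROCESS_BLOCK, unless the module is already in the chunk, in which case nothing further happens.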
The following sections go through the actions in the order ADD_AND_ENTER_ENTRY_MODULE->ADD_AND_ENTER_MODULE->ENTER_MODULE->PROCESS_BLOCK.

4.4.1 ADD_AND_ENTER_ENTRY_MODULE

Take the current entry module and link chunk, module, and chunkGroup.

    switch (queueItem.action) {
        case ADD_AND_ENTER_ENTRY_MODULE:
            chunkGraph.connectChunkAndEntryModule(
                chunk,
                module,
                /** @type {Entrypoint} */(chunkGroup)
            );
    }

    // node_modules/webpack/lib/ChunkGraph.js
    connectChunkAndEntryModule(chunk, module, entrypoint) {
        const cgm = this._getChunkGraphModule(module);
        const cgc = this._getChunkGraphChunk(chunk);
        if (cgm.entryInChunks === undefined) {
            cgm.entryInChunks = new Set();
        }
        cgm.entryInChunks.add(chunk);
        cgc.entryModules.set(module, entrypoint);
    }

4.4.2 ADD_AND_ENTER_MODULE

Link chunk and module to each other. If the module is already in the chunk, the item is skipped.

    switch (queueItem.action) {
        case ADD_AND_ENTER_ENTRY_MODULE:
            chunkGraph.connectChunkAndEntryModule(
                chunk,
                module,
                /** @type {Entrypoint} */(chunkGroup)
            );
        // fallthrough
        case ADD_AND_ENTER_MODULE: {
            if (chunkGraph.isModuleInChunk(module, chunk)) {
                // already connected, skip it
                break;
            }
            // We connect Module and Chunk
            chunkGraph.connectChunkAndModule(chunk, module);
        }
    }

    // node_modules/webpack/lib/ChunkGraph.js
    connectChunkAndModule(chunk, module) {
        const cgm = this._getChunkGraphModule(module);
        const cgc = this._getChunkGraphChunk(chunk);
        cgm.chunks.add(chunk);
        cgc.modules.add(module);
    }
    isModuleInChunk(module, chunk) {
        const cgc = this._getChunkGraphChunk(chunk);
        return cgc.modules.has(module);
    }

4.4.3 ENTER_MODULE

    switch (queueItem.action) {
        case ADD_AND_ENTER_ENTRY_MODULE:
            chunkGraph.connectChunkAndEntryModule(
                chunk,
                module,
                /** @type {Entrypoint} */(chunkGroup)
            );
        // fallthrough
        case ADD_AND_ENTER_MODULE: {
            if (chunkGraph.isModuleInChunk(module, chunk)) {
                // already connected, skip it
                break;
            }
            // We connect Module and Chunk
            chunkGraph.connectChunkAndModule(chunk, module);
        }
        // fallthrough
        case ENTER_MODULE: {
            const index = chunkGroup.getModulePreOrderIndex(module);
            // ...logic that sets the index is omitted
            // Reuse the same queue item: it comes back as LEAVE_MODULE
            // after everything pushed on top of it has been processed
            queueItem.action = LEAVE_MODULE;
            queue.push(queueItem);
        }
        // fallthrough into PROCESS_BLOCK (next section)
    }
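
Re-pushing the same item with a LEAVE action is a common way to obtain both a pre-order and a post-order from one stack-based traversal. The sketch below shows the technique on a toy graph; `traverse`, the `ENTER`/`LEAVE` labels, and the adjacency map are assumptions, and chunk and chunkGroup handling is left out entirely.

```javascript
const ENTER = "enter";
const LEAVE = "leave";

// `graph` is a hypothetical adjacency map: module -> imported modules, in source order.
function traverse(entry, graph) {
  const preOrder = [];
  const postOrder = [];
  const visited = new Set();
  const stack = [{ action: ENTER, module: entry }];
  while (stack.length > 0) {
    const item = stack.pop();
    if (item.action === LEAVE) {
      postOrder.push(item.module); // all dependencies are done
      continue;
    }
    if (visited.has(item.module)) continue;
    visited.add(item.module);
    preOrder.push(item.module);
    // Reuse the same item: it is popped again once everything
    // pushed after it has been processed.
    item.action = LEAVE;
    stack.push(item);
    const refs = graph.get(item.module) || [];
    // Push in reverse so that the first import is popped first.
    for (let i = refs.length - 1; i >= 0; i--) {
      stack.push({ action: ENTER, module: refs[i] });
    }
  }
  return { preOrder, postOrder };
}

const graph = new Map([
  ["index.js", ["a.js", "b.js"]],
  ["a.js", ["c.js"]],
  ["b.js", ["c.js"]],
  ["c.js", []]
]);
const { preOrder, postOrder } = traverse("index.js", graph);
```

The reverse push is the same reason why the excerpts in this article call `queue.reverse()` and iterate their buffers from the last element to the first before pushing onto `queue`.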

4.4.4 PROCESS_BLOCK

At the end of the chain ADD_AND_ENTER_ENTRY_MODULE->ADD_AND_ENTER_MODULE->ENTER_MODULE->PROCESS_BLOCK, processBlock() is executed.

    const processQueue = () => {
        while (queue.length) {
            statProcessedQueueItems++;
            const queueItem = queue.pop();
            module = queueItem.module;
            block = queueItem.block;
            chunk = queueItem.chunk;
            chunkGroup = queueItem.chunkGroup;
            chunkGroupInfo = queueItem.chunkGroupInfo;

            switch (queueItem.action) {
                case ADD_AND_ENTER_ENTRY_MODULE:
                //...
                case ADD_AND_ENTER_MODULE:
                //...
                case ENTER_MODULE:
                //...
                case PROCESS_BLOCK: {
                    processBlock(block);
                    break;
                }
                case PROCESS_ENTRY_BLOCK: {
                    processEntryBlock(block);
                    break;
                }
                case LEAVE_MODULE:
                //...
            }
        }
    };

processBlock() first calls getBlockModules().

For a sync dependency, block is the module itself; for an async dependency, a different block is passed in.

    const processBlock = block => {
        const blockModules = getBlockModules(block, chunkGroupInfo.runtime);
    };

    const getBlockModules = (block, runtime) => {
        //...initialization of blockModules and blockModulesMap is omitted
        extractBlockModules(module, moduleGraph, runtime, blockModulesMap);
        const blockModules = blockModulesMap.get(block);
        return blockModules;
    };

    const extractBlockModules = (module, moduleGraph, runtime, blockModulesMap) => {
        //...many condition checks (and the declarations of modules, index, state) are omitted
        for (const connection of moduleGraph.getOutgoingConnections(module)) {
            const m = connection.module;
            const i = index << 2;
            modules[i] = m;
            modules[i + 1] = state;
        }
        //...logic that handles empty slots in modules is omitted
        // The result is an array of every module imported by `module`,
        // each followed by its state
    };

moduleGraph.getOutgoingConnections() should look very familiar; we already met it in the make stage.

    // node_modules/webpack/lib/ModuleGraph.js
    getOutgoingConnections(module) {
        const connections = this._getModuleGraphModule(module).outgoingConnections;
        return connections === undefined ? EMPTY_SET : connections;
    }

In the make stage, after addModule() has run, moduleGraph.setResolvedModule() is called. It involves the variables originModule, dependency, and module.

    // node_modules/webpack/lib/Compilation.js
    const unsafeCacheableModule =
        /** @type {Module & { restoreFromUnsafeCache: Function }} */ (
            module
        );
    for (let i = 0; i < dependencies.length; i++) {
        const dependency = dependencies[i];
        moduleGraph.setResolvedModule(
            connectOrigin ? originModule : null,
            dependency,
            unsafeCacheableModule
        );
        unsafeCacheDependencies.set(dependency, unsafeCacheableModule);
    }

    // node_modules/webpack/lib/ModuleGraph.js
    setResolvedModule(originModule, dependency, module) {
        const connection = new ModuleGraphConnection(
            originModule,
            dependency,
            module,
            undefined,
            dependency.weak,
            dependency.getCondition(this)
        );
        const connections = this._getModuleGraphModule(module).incomingConnections;
        connections.add(connection);
        if (originModule) {
            const mgm = this._getModuleGraphModule(originModule);
            if (mgm._unassignedConnections === undefined) {
                mgm._unassignedConnections = [];
            }
            mgm._unassignedConnections.push(connection);
            if (mgm.outgoingConnections === undefined) {
                mgm.outgoingConnections = new SortableSet();
            }
            mgm.outgoingConnections.add(connection);
        } else {
            this._dependencyMap.set(dependency, connection);
        }
    }

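  • originModule: the parent Module, for example index.js in the example below
  • dependency: a dependency of the parent Module, for example "./item/index_item-parent1.js" in the example below, which produces 4 dependency objects in originModule (one for the import statement and one for each use of getC1)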

    // index.js
    import {getC1} from "./item/index_item-parent1.js";
    var test = _.add(6, 4) + getC1(1, 3);
    var test1 = _.add(6, 4) + getC1(1, 3);
    var test2 = getC1(4, 5);

    sortedDependencies[0] = {
        dependencies: [
            { // HarmonyImportSideEffectDependency
                request: "./item/index_item-parent1.js",
                userRequest: "./item/index_item-parent1.js"
            },
            { // HarmonyImportSpecifierDependency
                name: "getC1",
                request: "./item/index_item-parent1.js",
                userRequest: "./item/index_item-parent1.js"
            }
            //...
        ],
        originModule: {
            userRequest: "/Users/wcbbcc/blog/Frontend-Articles/webpack-debugger/js/src/index.js",
            dependencies: [
                //...10 dependencies, including the two Dependency objects above
            ]
        }
    }

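  • module: in the make stage, each group of dependency objects goes through handleModuleCreation(), which triggers NormalModuleFactory.create(). create() takes the first item, dependencies[0], which is the HarmonyImportSideEffectDependency in the example above, that is, import {getC1} from "./item/index_item-parent1.js", and turns it into a module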

    // node_modules/webpack/lib/NormalModuleFactory.js
    create(data, callback) {
        const dependencies = /** @type {ModuleDependency[]} */ (data.dependencies);
        const dependency = dependencies[0];
        const request = dependency.request;
        const dependencyType =
            (dependencies.length > 0 && dependencies[0].category) || "";
        const resolveData = {
            request,
            dependencies,
            dependencyType
        };
        // resolveData then drives a series of resolve() and buildModule() operations...
    }
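
The connection bookkeeping done by setResolvedModule() can be modelled in a few lines. `MiniModuleGraph` is a hypothetical miniature that keeps only the incoming and outgoing sets; plain objects stand in for ModuleGraphConnection, modules, and dependencies.

```javascript
// Hypothetical miniature of ModuleGraph: every module has a set of
// incoming connections and, once it imports something, outgoing ones.
class MiniModuleGraph {
  constructor() {
    this._map = new Map();
  }
  _getModuleGraphModule(module) {
    let mgm = this._map.get(module);
    if (mgm === undefined) {
      mgm = { incomingConnections: new Set(), outgoingConnections: undefined };
      this._map.set(module, mgm);
    }
    return mgm;
  }
  setResolvedModule(originModule, dependency, module) {
    const connection = { originModule, dependency, module };
    this._getModuleGraphModule(module).incomingConnections.add(connection);
    if (originModule) {
      const mgm = this._getModuleGraphModule(originModule);
      if (mgm.outgoingConnections === undefined) {
        mgm.outgoingConnections = new Set();
      }
      mgm.outgoingConnections.add(connection);
    }
    return connection;
  }
  getOutgoingConnections(module) {
    const connections = this._getModuleGraphModule(module).outgoingConnections;
    return connections === undefined ? new Set() : connections;
  }
}

const moduleGraph = new MiniModuleGraph();
const indexJs = { request: "./index.js" };
const parent1 = { request: "./item/index_item-parent1.js" };

// One import statement plus three uses of getC1: four dependencies,
// all resolved to the same module.
for (const name of ["side effect", "getC1 #1", "getC1 #2", "getC1 #3"]) {
  moduleGraph.setResolvedModule(indexJs, { name }, parent1);
}
const targets = [...moduleGraph.getOutgoingConnections(indexJs)].map(c => c.module);
```

This is why iterating the outgoing connections of a module in the seal stage yields the same target module several times, once per dependency.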

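Back to processBlock(): we now know that the connection.module values are simply all the dependencies of the current module.

The important point is that the sync dependencies of the current module are collected in the arr array of blockModulesMap.set(block, arr), where block is the current module.
Each async dependency of the current module gets its own arr array, that is, a separate blockModulesMap.set(block, arr) call in which block is the async dependency block.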

    const processBlock = block => {
        const blockModules = getBlockModules(block, chunkGroupInfo.runtime);
    };

    const getBlockModules = (block, runtime) => {
        //...initialization of blockModules and blockModulesMap is omitted
        extractBlockModules(module, moduleGraph, runtime, blockModulesMap);
        const blockModules = blockModulesMap.get(block);
        return blockModules;
    };

    const extractBlockModules = (module, moduleGraph, runtime, blockModulesMap) => {
        //...declarations of arrays, modules, index, state are omitted
        // One array per block: the module itself and every nested async block
        const queue = [module];
        while (queue.length > 0) {
            const block = queue.pop();
            const arr = [];
            arrays.push(arr);
            blockModulesMap.set(block, arr);
            for (const b of block.blocks) {
                queue.push(b);
            }
        }

        for (const connection of moduleGraph.getOutgoingConnections(module)) {
            const m = connection.module;
            const i = index << 2;
            modules[i] = m;
            modules[i + 1] = state;
        }
        //...logic that removes duplicates from modules is omitted
        // The result is an array of every module imported by `module`,
        // each followed by its state
    };

In the end, extractBlockModules() produces the dependency data blockModules, and getBlockModules() returns all sync dependencies of the current module, which is the Array(14) in the example below.
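
The shape of the result, one flat array of [module, state, module, state, ...] pairs per block, can be sketched as follows. This is a simplified model: the block objects with `dependencies` and `blocks` fields are hypothetical, and the real code derives the pairs from moduleGraph connections rather than from a `dependencies` field.

```javascript
// Hypothetical block shape: { name, dependencies: [{ module, state }], blocks: [...] }.
function extractBlockModules(rootBlock) {
  const blockModulesMap = new Map();
  const queue = [rootBlock];
  while (queue.length > 0) {
    const block = queue.pop();
    const arr = [];
    for (const dep of block.dependencies) {
      arr.push(dep.module, dep.state); // flat pairs: module, state, module, state, ...
    }
    blockModulesMap.set(block, arr);
    // Nested async blocks get their own array
    for (const b of block.blocks) queue.push(b);
  }
  return blockModulesMap;
}

// index.js: import "./a.js"; import "./b.js"; import("./lazy.js");
const lazyBlock = {
  name: 'import("./lazy.js")',
  dependencies: [{ module: "lazy.js", state: true }],
  blocks: []
};
const indexModule = {
  name: "index.js",
  dependencies: [
    { module: "a.js", state: true },
    { module: "b.js", state: true }
  ],
  blocks: [lazyBlock]
};

const blockModulesMap = extractBlockModules(indexModule);
const syncOfIndex = blockModulesMap.get(indexModule);
const modulesOfLazyBlock = blockModulesMap.get(lazyBlock);
```

Consumers read such an array two entries at a time, which is why the loops in processBlock() use `i += 2`.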

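processBlock(): handling sync dependencies

Following the analysis above, once getBlockModules() has returned all sync dependencies of the current block, we iterate over them.

For a sync dependency, block is the module itself, while an async dependency passes a different block. In the queueBuffer items below, block and module are both the same refModule.

The handling falls into three cases:

  • If activeState is not true, the dependency is added to skipConnectionBuffer (and processing of it stops there only when activeState is false)
  • If activeState is true but minAvailableModules or minAvailableModules.plus already contains the module, that is, the parent chunks already contain it, the module is added to skipBuffer
  • If the dependency passes both checks above, the module is added to queueBuffer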

    const processBlock = (block, isSrc) => {
        const blockModules = getBlockModules(block, chunkGroupInfo.runtime);
        const { minAvailableModules } = chunkGroupInfo;
        // blockModules is a flat list of [module, state] pairs
        for (let i = 0; i < blockModules.length; i += 2) {
            const refModule = /** @type {Module} */ (blockModules[i]);
            if (chunkGraph.isModuleInChunk(refModule, chunk)) {
                // skip early if already connected
                continue;
            }
            const activeState = /** @type {ConnectionState} */ (
                blockModules[i + 1]
            );
            if (activeState !== true) {
                skipConnectionBuffer.push([refModule, activeState]);
                if (activeState === false) continue;
            }
            if (
                activeState === true &&
                (minAvailableModules.has(refModule) ||
                    minAvailableModules.plus.has(refModule))
            ) {
                // already in parent chunks, skip it for now
                skipBuffer.push(refModule);
                continue;
            }
            // enqueue, then add and enter to be in the correct order
            // this is relevant with circular dependencies
            queueBuffer.push({
                action: activeState === true ? ADD_AND_ENTER_MODULE : PROCESS_BLOCK,
                block: refModule,
                module: refModule,
                chunk,
                chunkGroup,
                chunkGroupInfo
            });
        }
        // handle skipConnectionBuffer
        // handle skipBuffer
        // handle queueBuffer
    };

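The three cases are clearly separate but scattered across the function, so below each check is shown together with the code that consumes its buffer.

If activeState is not true, the current sync dependency is added to skipConnectionBuffer and then stored in chunkGroupInfo.skippedModuleConnections of the current module's chunkGroupInfo.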

    for (let i = 0; i < blockModules.length; i += 2) {
        const refModule = /** @type {Module} */ (blockModules[i]);
        const activeState = /** @type {ConnectionState} */ (
            blockModules[i + 1]
        );
        if (activeState !== true) {
            skipConnectionBuffer.push([refModule, activeState]);
            if (activeState === false) continue;
        }
    }
    if (skipConnectionBuffer.length > 0) {
        let { skippedModuleConnections } = chunkGroupInfo;
        if (skippedModuleConnections === undefined) {
            chunkGroupInfo.skippedModuleConnections = skippedModuleConnections =
                new Set();
        }
        for (let i = skipConnectionBuffer.length - 1; i >= 0; i--) {
            skippedModuleConnections.add(skipConnectionBuffer[i]);
        }
        skipConnectionBuffer.length = 0;
    }

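If activeState is true but minAvailableModules or minAvailableModules.plus already contains the module, that is, the parent chunks already contain it, the module is added to skipBuffer and then stored in chunkGroupInfo.skippedItems of the current module's chunkGroupInfo.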

    for (let i = 0; i < blockModules.length; i += 2) {
        const refModule = /** @type {Module} */ (blockModules[i]);
        const activeState = /** @type {ConnectionState} */ (
            blockModules[i + 1]
        );
        if (
            activeState === true &&
            (minAvailableModules.has(refModule) ||
                minAvailableModules.plus.has(refModule))
        ) {
            // already in parent chunks, skip it for now
            skipBuffer.push(refModule);
            continue;
        }
    }
    if (skipBuffer.length > 0) {
        let { skippedItems } = chunkGroupInfo;
        if (skippedItems === undefined) {
            chunkGroupInfo.skippedItems = skippedItems = new Set();
        }
        for (let i = skipBuffer.length - 1; i >= 0; i--) {
            skippedItems.add(skipBuffer[i]);
        }
        skipBuffer.length = 0;
    }

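If the dependency passes both checks above, the sync dependency of the current module is added to queueBuffer and then pushed onto queue, so the inner loop continues with the sync dependencies.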

    for (let i = 0; i < blockModules.length; i += 2) {
        const refModule = /** @type {Module} */ (blockModules[i]);
        const activeState = /** @type {ConnectionState} */ (
            blockModules[i + 1]
        );
        queueBuffer.push({
            action: activeState === true ? ADD_AND_ENTER_MODULE : PROCESS_BLOCK,
            block: refModule,
            module: refModule,
            chunk,
            chunkGroup,
            chunkGroupInfo
        });
    }
    if (queueBuffer.length > 0) {
        // Push in reverse order: queue is consumed with pop()
        for (let i = queueBuffer.length - 1; i >= 0; i--) {
            queue.push(queueBuffer[i]);
        }
        queueBuffer.length = 0;
    }

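
The three outcomes can be condensed into one small function to see how a mixed list of dependencies is sorted into the buffers. This is a sketch under assumptions: `classify` is a hypothetical helper, `TRANSITIVE_ONLY` is a local symbol standing in for a connection state that is neither true nor false, module names are plain strings, and a single Set plays the role of minAvailableModules (the `.plus` part is left out).

```javascript
// Local stand-in for a state that is neither true nor false.
const TRANSITIVE_ONLY = Symbol("transitive only");

// Hypothetical helper: sorts [module, activeState] pairs into the three buffers.
function classify(blockModules, chunkModules, minAvailableModules) {
  const skipConnectionBuffer = [];
  const skipBuffer = [];
  const queueBuffer = [];
  for (let i = 0; i < blockModules.length; i += 2) {
    const refModule = blockModules[i];
    const activeState = blockModules[i + 1];
    if (chunkModules.has(refModule)) continue; // already connected to this chunk
    if (activeState !== true) {
      skipConnectionBuffer.push([refModule, activeState]);
      if (activeState === false) continue; // inactive: nothing more to do
    }
    if (activeState === true && minAvailableModules.has(refModule)) {
      skipBuffer.push(refModule); // already provided by parent chunks
      continue;
    }
    queueBuffer.push({
      action: activeState === true ? "ADD_AND_ENTER_MODULE" : "PROCESS_BLOCK",
      module: refModule
    });
  }
  return { skipConnectionBuffer, skipBuffer, queueBuffer };
}

const result = classify(
  ["a.js", true, "b.js", true, "c.js", false, "d.js", TRANSITIVE_ONLY, "e.js", true],
  new Set(["a.js"]), // modules already in the current chunk
  new Set(["b.js"]) // modules available from parent chunks
);
```

In this run a.js is ignored, b.js lands in skipBuffer, c.js lands only in skipConnectionBuffer, d.js lands in skipConnectionBuffer and is also queued as PROCESS_BLOCK, and e.js is queued as ADD_AND_ENTER_MODULE.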
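processBlock(): handling async dependencies

After the sync dependencies have been handled, iteratorBlock(b) is called for each async dependency of the current module.
The code block below shows three main cases:

  • Case 1: the async dependency's NormalModule has no chunkGroup yet

    • Scenario 1: Entry type. The block is pushed onto queueDelayed with the action PROCESS_ENTRY_BLOCK, and a new Chunk is built
    • Scenario 2: asyncChunks=false or chunkLoading=false in webpack.config.js. The current Chunk is reused, so the async dependency ends up in the same file as the sync dependencies
    • Scenario 3: not an Entry and async chunks are allowed. addChunkInGroup() creates a new ChunkGroup and a new Chunk, so the async dependency is stored in a new file
  • Case 2: the async dependency's NormalModule already has a chunkGroup, and it is of the entry type
  • Case 3: the async dependency's NormalModule already has a chunkGroup, and it is not of the entry type

Finally, the Entry type and the non-Entry type are handled separately.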

    const processBlock = (block, isSrc) => {
        //...handle sync dependencies
        for (const b of block.blocks) {
            iteratorBlock(b);
        }
    };

    const iteratorBlock = b => {
        let cgi = blockChunkGroups.get(b);
        /** @type {ChunkGroup} */
        let c;
        /** @type {Entrypoint} */
        let entrypoint;
        const entryOptions = b.groupOptions && b.groupOptions.entryOptions;
        if (cgi === undefined) {
            // Case 1: this async NormalModule has no chunkGroup yet
            if (entryOptions) {
                // Scenario 1: Entry type
                //...creation of the new entrypoint and its chunkGroupInfo is omitted
                queueDelayed.push({
                    action: PROCESS_ENTRY_BLOCK,
                    block: b,
                    module: module,
                    chunk: entrypoint.chunks[0],
                    chunkGroup: entrypoint,
                    chunkGroupInfo: cgi
                });
            } else if (!chunkGroupInfo.asyncChunks || !chunkGroupInfo.chunkLoading) {
                // Scenario 2: asyncChunks=false or chunkLoading=false in webpack.config.js
                queue.push({
                    action: PROCESS_BLOCK,
                    block: b,
                    module: module,
                    chunk,
                    chunkGroup,
                    chunkGroupInfo
                });
            } else {
                // Scenario 3: not an Entry and async chunks are allowed
                c = compilation.addChunkInGroup(
                    b.groupOptions || b.chunkName,
                    module,
                    b.loc,
                    b.request
                );
                blockConnections.set(b, []);
            }
        } else if (entryOptions) {
            // Case 2: this async NormalModule already has a chunkGroup of the entry type
            entrypoint = cgi.chunkGroup;
        } else {
            // Case 3: this async NormalModule already has a chunkGroup that is not of the entry type
            c = cgi.chunkGroup;
        }

        if (c !== undefined) {
            // Handle the non-Entry type
        } else if (entrypoint !== undefined) {
            // Handle the Entry type
            chunkGroupInfo.chunkGroup.addAsyncEntrypoint(entrypoint);
        }
    };

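
The branch structure of iteratorBlock() can be summarized as a small decision function. `decide` and its return labels are hypothetical and only restate the cases listed above; the default values for `asyncChunks` and `chunkLoading` are assumptions chosen so that async chunks are allowed unless a caller disables them.

```javascript
// Hypothetical helper that mirrors the branches of iteratorBlock().
function decide({ hasChunkGroup, isEntry, asyncChunks = true, chunkLoading = "jsonp" }) {
  if (!hasChunkGroup) {
    // Case 1: no chunkGroup exists for this async block yet
    if (isEntry) return "case 1 / scenario 1: new entrypoint, PROCESS_ENTRY_BLOCK";
    if (!asyncChunks || !chunkLoading) return "case 1 / scenario 2: reuse current chunk, PROCESS_BLOCK";
    return "case 1 / scenario 3: new ChunkGroup and Chunk";
  }
  // Cases 2 and 3: a chunkGroup already exists and is reused
  return isEntry ? "case 2: reuse entrypoint" : "case 3: reuse ChunkGroup";
}

const lazyImport = decide({ hasChunkGroup: false, isEntry: false });
const noAsyncChunks = decide({ hasChunkGroup: false, isEntry: false, asyncChunks: false });
const noChunkLoading = decide({ hasChunkGroup: false, isEntry: false, chunkLoading: false });
const secondVisit = decide({ hasChunkGroup: true, isEntry: false });
```

A plain dynamic import therefore creates a new ChunkGroup the first time its block is seen, and every later visit of the same block reuses that ChunkGroup.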
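Handling the non-Entry type: building queueConnect

When c !== undefined, the async dependency is not an Entry type, and it is recorded in queueConnect.
The async dependency is then also pushed onto queueDelayed to be processed in the next round. Note that chunkGroup has now become c, and c may be the new ChunkGroup created for this async dependency.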

    if (c !== undefined) {
        blockConnections.get(b).push({
            originChunkGroupInfo: chunkGroupInfo,
            chunkGroup: c
        });
        let connectList = queueConnect.get(chunkGroupInfo);
        if (connectList === undefined) {
            connectList = new Set();
            queueConnect.set(chunkGroupInfo, connectList);
        }
        connectList.add(cgi);
        // TODO check if this really need to be done for each traversal
        // or if it is enough when it's queued when created
        // 4. We enqueue the DependenciesBlock for traversal
        queueDelayed.push({
            action: PROCESS_BLOCK,
            block: b,
            module: module,
            chunk: c.chunks[0],
            chunkGroup: c,
            chunkGroupInfo: cgi
        });
    }
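processBlock(): handling the async dependencies of async dependencies

A block that itself contains async blocks is stored in the blocksWithNestedBlocks Set and handled in a later stage.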

    const processBlock = (block, isSrc) => {
        //... handle synchronous dependencies
        // Handle async dependencies
        for (const b of block.blocks) {
            iteratorBlock(b);
        }
        if (block.blocks.length > 0 && module !== block) {
            blocksWithNestedBlocks.add(block);
        }
    };
From the analysis above we know that when an async dependency is an entry type, it is pushed onto queueDelayed with the action PROCESS_ENTRY_BLOCK. What logic does that action run?

4.4.5 PROCESS_ENTRY_BLOCK

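As the code below shows, processEntryBlock() and processBlock() share the same overall logic: iterate over all synchronous dependencies in blockModules and push them into queueBuffer, then handle the async dependencies, then record the async dependencies of those async dependencies.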

    const processEntryBlock = block => {
        const blockModules = getBlockModules(block, chunkGroupInfo.runtime);
        for (let i = 0; i < blockModules.length; i += 2) {
            const refModule = /** @type {Module} */ (blockModules[i]);
            const activeState = /** @type {ConnectionState} */ (
                blockModules[i + 1]
            );
            queueBuffer.push({
                action:
                    activeState === true ? ADD_AND_ENTER_ENTRY_MODULE : PROCESS_BLOCK,
                block: refModule,
                module: refModule,
                chunk,
                chunkGroup,
                chunkGroupInfo
            });
        }
        if (queueBuffer.length > 0) {
            for (let i = queueBuffer.length - 1; i >= 0; i--) {
                queue.push(queueBuffer[i]);
            }
            queueBuffer.length = 0;
        }
        for (const b of block.blocks) {
            iteratorBlock(b);
        }
        if (block.blocks.length > 0 && module !== block) {
            blocksWithNestedBlocks.add(block);
        }
    };

4.4.6 LEAVE_MODULE

The last action. It only sets the post-order index and has no other special logic.

    const processQueue = () => {
        while (queue.length) {
            statProcessedQueueItems++;
            const queueItem = queue.pop();
            module = queueItem.module;
            block = queueItem.block;
            chunk = queueItem.chunk;
            chunkGroup = queueItem.chunkGroup;
            chunkGroupInfo = queueItem.chunkGroupInfo;
            switch (queueItem.action) {
                case ADD_AND_ENTER_ENTRY_MODULE:
                //...
                case ADD_AND_ENTER_MODULE:
                //...
                case ENTER_MODULE:
                //...
                case PROCESS_BLOCK: {
                    processBlock(block);
                    break;
                }
                case PROCESS_ENTRY_BLOCK: {
                    processEntryBlock(block);
                    break;
                }
                case LEAVE_MODULE: {
                    const index = chunkGroup.getModulePostOrderIndex(module);
                    if (index === undefined) {
                        chunkGroup.setModulePostOrderIndex(
                            module,
                            chunkGroupInfo.postOrderIndex++
                        );
                    }
                    if (
                        moduleGraph.setPostOrderIndexIfUnset(
                            module,
                            nextFreeModulePostOrderIndex
                        )
                    ) {
                        nextFreeModulePostOrderIndex++;
                    }
                    break;
                }
            }
        }
    };
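How a LEAVE action yields a post-order index can be seen in a minimal sketch. It assumes a plain adjacency-list graph and a hypothetical postOrder() helper; webpack's real queue items carry more state (chunk, chunkGroup, chunkGroupInfo), so this only illustrates the stack discipline.

```javascript
const ENTER = 0;
const LEAVE = 1;

// Hypothetical helper: iterative depth-first traversal with an explicit stack.
function postOrder(graph, root) {
  const order = new Map(); // module name -> post-order index
  const seen = new Set();
  const stack = [{ action: ENTER, module: root }];
  let nextIndex = 0;
  while (stack.length) {
    const { action, module } = stack.pop();
    if (action === LEAVE) {
      // All dependencies of this module were handled before we get here
      order.set(module, nextIndex++);
      continue;
    }
    if (seen.has(module)) continue;
    seen.add(module);
    // Schedule LEAVE first so it is popped after every dependency
    stack.push({ action: LEAVE, module });
    const deps = graph[module] || [];
    for (let i = deps.length - 1; i >= 0; i--) {
      stack.push({ action: ENTER, module: deps[i] });
    }
  }
  return order;
}

const order = postOrder({ entry: ["a", "b"], a: ["c"], b: [], c: [] }, "entry");
console.log([...order]);
```

Dependencies are pushed in reverse so that they are popped in their original order, the same trick the queueBuffer loop uses.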

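4.4.7 Summary

  1. Handle the synchronous dependencies -> push the async dependencies onto the queue -> put the async dependencies of async dependencies into a Set
  2. queue -> queueBuffer (ADD_AND_ENTER_MODULE) -> queueDelayed (PROCESS_ENTRY_BLOCK or PROCESS_BLOCK)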

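4.5 Handling chunkGroupsForCombining: a chunkGroup that has a parent chunkGroup

Where is chunkGroupsForCombining populated? What is its data structure? How is it finally processed?

In the analysis of visitModules() above, inputEntrypointsAndModules is iterated, and each entry is either pushed onto queue or added to chunkGroupsForCombining. The latter is processed only after one round of queue processing has finished.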

    if (chunkGroup.getNumberOfParents() > 0) {
        // minAvailableModules for child entrypoints are unknown yet, set to undefined.
        // This means no module is added until other sets are merged into
        // this minAvailableModules (by the parent entrypoints)
        const skippedItems = new Set();
        for (const module of modules) {
            skippedItems.add(module);
        }
        chunkGroupInfo.skippedItems = skippedItems;
        chunkGroupsForCombining.add(chunkGroupInfo);
    } else {
        for (const module of modules) {
            queue.push({
                action: ADD_AND_ENTER_MODULE,
                block: module,
                module,
                chunk,
                chunkGroup,
                chunkGroupInfo
            });
        }
    }

    for (const chunkGroupInfo of chunkGroupsForCombining) {
        const { chunkGroup } = chunkGroupInfo;
        chunkGroupInfo.availableSources = new Set();
        for (const parent of chunkGroup.parentsIterable) {
            const parentChunkGroupInfo = chunkGroupInfoMap.get(parent);
            chunkGroupInfo.availableSources.add(parentChunkGroupInfo);
            if (parentChunkGroupInfo.availableChildren === undefined) {
                parentChunkGroupInfo.availableChildren = new Set();
            }
            parentChunkGroupInfo.availableChildren.add(chunkGroupInfo);
        }
    }

When the inner loop in processQueue() finishes, the chunkGroupsForCombining data is processed in one pass.

Each time queue has been drained, chunkGroupsForCombining.size is checked:
    while (queue.length || queueConnect.size) {
        processQueue();
        if (chunkGroupsForCombining.size > 0) {
            processChunkGroupsForCombining();
        }
        //...
        if (queue.length === 0) {
            const tempQueue = queue;
            queue = queueDelayed.reverse();
            queueDelayed = tempQueue;
        }
    }
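The queue and queueDelayed swap at the end of the loop is easy to misread, so here is a minimal runnable sketch of it. The module table and its syncDeps/asyncDeps fields are invented for illustration; only the pop-from-the-end processing and the swap mirror the code above.

```javascript
// Hypothetical module table: synchronous deps go to `queue`, async deps are delayed.
const modules = {
  entry: { syncDeps: ["a"], asyncDeps: ["lazy"] },
  a: { syncDeps: [], asyncDeps: [] },
  lazy: { syncDeps: ["b"], asyncDeps: [] },
  b: { syncDeps: [], asyncDeps: [] }
};

let queue = ["entry"];
let queueDelayed = [];
const log = [];

while (queue.length) {
  // Inner loop: drain everything reachable synchronously
  while (queue.length) {
    const name = queue.pop();
    log.push(name);
    const { syncDeps, asyncDeps } = modules[name];
    for (let i = syncDeps.length - 1; i >= 0; i--) queue.push(syncDeps[i]);
    for (const dep of asyncDeps) queueDelayed.push(dep);
  }
  // Swap: the delayed items become the next round's queue.
  // reverse() keeps insertion order because items are popped from the end.
  const tempQueue = queue;
  queue = queueDelayed.reverse();
  queueDelayed = tempQueue;
}

console.log(log.join(","));
```

Reusing the drained array as the new queueDelayed avoids allocating a fresh array on every round.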

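The logic of processChunkGroupsForCombining() is shown below. It involves a method that is harder to follow, calculateResultingAvailableModules(). For now, treat it as computing the minimal set of modules that the current Chunk can reuse. A simple example illustrates that set:

  • The parent module is entry.js, with synchronous dependencies a.js, b.js and c.js, and an async dependency async_B.js
  • The async dependency async_B.js forms a new Chunk and ChunkGroup, and has synchronous dependencies a.js and b.js
  • Because async_B.js is always loaded after the parent module's synchronous dependencies, async_B.js can reuse the parent's a.js and b.js directly instead of bundling a.js and b.js into its own Chunk

For async_B.js, ChunkGroupInfo.minAvailableModules holds the NormalModules that the parent guarantees are already loaded; the ones it actually reuses here are a.js and b.js.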
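The example can be checked with plain Sets. This is a sketch under the assumption that module identity can be modelled by file name; the variable names are illustrative and are not webpack APIs.

```javascript
// Modules guaranteed to be loaded before async_B.js runs (from the parent chunk)
const minAvailableModules = new Set(["entry.js", "a.js", "b.js", "c.js"]);

// Modules that async_B.js needs: itself plus its synchronous dependencies
const neededByAsyncB = ["async_B.js", "a.js", "b.js"];

// Reused from the parent, so not emitted into the async chunk
const reused = neededByAsyncB.filter(m => minAvailableModules.has(m));
// Still has to be bundled into the async chunk
const bundled = neededByAsyncB.filter(m => !minAvailableModules.has(m));

console.log(reused, bundled);
```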

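With the concept of minAvailableModules clear, the code below can be read as follows:

  • Iterate over all parent ChunkGroupInfo objects of the current ChunkGroupInfo (info.availableSources), compute each parent's reusable modules (resultingAvailableModules), and merge them into the availableModules set built for the current ChunkGroupInfo
  • Assign the result to ChunkGroupInfo.minAvailableModules
  • Add the current ChunkGroupInfo to outdatedChunkGroupInfo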
    const processChunkGroupsForCombining = () => {
        for (const info of chunkGroupsForCombining) {
            for (const source of info.availableSources) {
                if (!source.minAvailableModules) {
                    chunkGroupsForCombining.delete(info);
                    break;
                }
            }
        }
        for (const info of chunkGroupsForCombining) {
            const availableModules = /** @type {ModuleSetPlus} */ (new Set());
            availableModules.plus = EMPTY_SET;
            const mergeSet = set => {
                if (set.size > availableModules.plus.size) {
                    for (const item of availableModules.plus) availableModules.add(item);
                    availableModules.plus = set;
                } else {
                    for (const item of set) availableModules.add(item);
                }
            };
            // combine minAvailableModules from all resultingAvailableModules
            for (const source of info.availableSources) {
                const resultingAvailableModules =
                    calculateResultingAvailableModules(source);
                mergeSet(resultingAvailableModules);
                mergeSet(resultingAvailableModules.plus);
            }
            info.minAvailableModules = availableModules;
            info.minAvailableModulesOwned = false;
            info.resultingAvailableModules = undefined;
            outdatedChunkGroupInfo.add(info);
        }
        chunkGroupsForCombining.clear();
    };

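4.6 Handling queueConnect and chunkGroupsForMerging

Where is queueConnect populated? What is its data structure? How is it finally processed?

4.6.1 Populating queueConnect

As analysed above, handling the async dependencies of a NormalModule triggers iteratorBlock().
In iteratorBlock(), the ChunkGroupInfo (cgi) of the ChunkGroup created for the async dependency is added to queueConnect, and the async dependency is queued with the action PROCESS_BLOCK so that processBlock() runs again for its synchronous and async dependencies.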

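As the code block below shows, c is in fact a non-entry chunkGroup.
queueConnect stores:

  • key: the current ChunkGroupInfo
  • value: a Set of the ChunkGroupInfo objects (cgi) for the non-entry chunkGroups created from its async dependencies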
    // Handle b, an async dependency of a NormalModule
    const iteratorBlock = b => {
        // If c does not exist yet it has to be created; this part is excerpted
        // here only to make the flow easier to follow
        c = compilation.addChunkInGroup(
            b.groupOptions || b.chunkName,
            module,
            b.loc,
            b.request
        );
        c.index = nextChunkGroupIndex++;
        if (c !== undefined) {
            // b is a non-entry async dependency
            blockConnections.get(b).push({
                originChunkGroupInfo: chunkGroupInfo,
                chunkGroup: c
            });
            let connectList = queueConnect.get(chunkGroupInfo);
            if (connectList === undefined) {
                connectList = new Set();
                queueConnect.set(chunkGroupInfo, connectList);
            }
            connectList.add(cgi);
            queueDelayed.push({
                action: PROCESS_BLOCK,
                block: b,
                module: module,
                chunk: c.chunks[0],
                chunkGroup: c,
                chunkGroupInfo: cgi
            });
        } else if (entrypoint !== undefined) {
            chunkGroupInfo.chunkGroup.addAsyncEntrypoint(entrypoint);
        }
    };

4.6.2 Processing queueConnect

After iteratorBlock() has built the queueConnect data,
it is processed in one pass when the inner loop in processQueue() finishes.

Each time queue has been drained, queueConnect.size is checked:
    while (queue.length || queueConnect.size) {
        processQueue();
        if (chunkGroupsForCombining.size > 0) {
            processChunkGroupsForCombining();
        }
        if (queueConnect.size > 0) {
            // calculating available modules
            processConnectQueue();
            if (chunkGroupsForMerging.size > 0) {
                // merging available modules
                processChunkGroupsForMerging();
            }
        }
        //...
        if (queue.length === 0) {
            const tempQueue = queue;
            queue = queueDelayed.reverse();
            queueDelayed = tempQueue;
        }
    }

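processConnectQueue() handles the async dependencies of the current ChunkGroupInfo. Here:

  • chunkGroupInfo: the current ChunkGroupInfo
  • targets: the Set of ChunkGroupInfo objects created for the non-entry async dependencies of the current ChunkGroupInfo

The overall flow of the code below is:

  • Add every target created for a non-entry async dependency to the current ChunkGroupInfo.children
  • Compute the reusable module set (resultingAvailableModules) of the current ChunkGroupInfo and push it onto each target's availableModulesToBeMerged
  • Add every target to the chunkGroupsForMerging set, ready for the next stage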
    const processConnectQueue = () => {
        // Process the <ChunkGroupInfo, Set<ChunkGroupInfo>> links created for async dependencies
        for (const [chunkGroupInfo, targets] of queueConnect) {
            // 1. Add new targets to the list of children
            for (const target of targets) {
                chunkGroupInfo.children.add(target);
            }
            // 2. Calculate resulting available modules
            const resultingAvailableModules =
                calculateResultingAvailableModules(chunkGroupInfo);
            const runtime = chunkGroupInfo.runtime;
            // 3. Update chunk group info
            for (const target of targets) {
                target.availableModulesToBeMerged.push(resultingAvailableModules);
                chunkGroupsForMerging.add(target);
                const oldRuntime = target.runtime;
                const newRuntime = mergeRuntime(oldRuntime, runtime);
                if (oldRuntime !== newRuntime) {
                    target.runtime = newRuntime;
                    outdatedChunkGroupInfo.add(target);
                }
            }
            statConnectedChunkGroups += targets.size;
        }
        queueConnect.clear();
    };

4.6.3 Processing chunkGroupsForMerging

After processConnectQueue() has finished with queueConnect, processChunkGroupsForMerging() is triggered to process chunkGroupsForMerging.

    while (queue.length || queueConnect.size) {
        processQueue();
        if (chunkGroupsForCombining.size > 0) {
            processChunkGroupsForCombining();
        }
        if (queueConnect.size > 0) {
            // calculating available modules
            processConnectQueue();
            if (chunkGroupsForMerging.size > 0) {
                // merging available modules
                processChunkGroupsForMerging();
            }
        }
        //...
        if (queue.length === 0) {
            const tempQueue = queue;
            queue = queueDelayed.reverse();
            queueDelayed = tempQueue;
        }
    }
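Note: processChunkGroupsForMerging() contains a lot of code, so to keep things simple it is explained with one example, and only the branches that the example actually runs are kept.

In this example, two entries both hold the async dependency async_B.js. From the analysis of processConnectQueue() above, calculateResultingAvailableModules() yields the following resultingAvailableModules: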

  • entry1.js: ['./src/entry1.js', './item/entry1_a.js', './item/entry1_b.js', './item/common_____g.js']
  • entry2.js: ['./src/entry2.js', './item/entry1_b.js', './item/entry2_aa', './item/common_____g.js']

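target.availableModulesToBeMerged.push(resultingAvailableModules) is then called for each entry, so both sets above end up in ChunkGroupInfo.availableModulesToBeMerged and are carried into processChunkGroupsForMerging().

As processChunkGroupsForMerging() below shows, cachedMinAvailableModules is empty at first, so the first resultingAvailableModules is assigned to it, and only then does the comparison that computes the intersection begin.
As the comment in the code indicates, the intersection logic is not hard to follow: take each module m from cachedMinAvailableModules and check whether availableModules contains it. If it does not, a new, smaller set has to be built, and outdatedChunkGroupInfo.add(info) is eventually triggered for the next stage.

Why an intersection is needed is easy to see from the analysis above.
entry1.js can offer async_B.js some reusable modules, and so can entry2.js.
The program loads the synchronous dependencies (the reusable modules) first, and then loads async_B.js.
So if async_B.js itself imports some of those reusable modules as synchronous dependencies, they do not have to be bundled into the Chunk formed for async_B.js, because the parent Chunk's synchronous dependencies can be used directly.
But what if entry1.js and entry2.js offer different reusable modules?
Say entry1.js offers a, b, c, entry2.js offers b, c, d, e, and async_B.js needs a and c as synchronous dependencies.
Since it is not known which entry file will be loaded, only the intersection of what entry1.js and entry2.js offer can be relied on, which is b and c.
If async_B.js needed b and c, nothing extra would be bundled and they would simply be reused. It actually needs a and c, so async_B.js still has to bundle a itself.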
    const processChunkGroupsForMerging = () => {
        for (const info of chunkGroupsForMerging) {
            const availableModulesToBeMerged = info.availableModulesToBeMerged;
            let cachedMinAvailableModules = info.minAvailableModules;
            if (availableModulesToBeMerged.length > 1) {
                availableModulesToBeMerged.sort(bySetSize);
            }
            let changed = false;
            merge: for (const availableModules of availableModulesToBeMerged) {
                if (cachedMinAvailableModules === undefined) {
                    cachedMinAvailableModules = availableModules;
                    info.minAvailableModules = cachedMinAvailableModules;
                    info.minAvailableModulesOwned = false;
                    changed = true;
                } else {
                    if (info.minAvailableModulesOwned) {
                        //...
                    } else if (cachedMinAvailableModules.plus === availableModules.plus) {
                        //...
                        // !!! compute the intersection
                        for (const m of cachedMinAvailableModules) {
                            if (!availableModules.has(m)) {
                                const newSet = /** @type {ModuleSetPlus} */ (new Set());
                                newSet.plus = availableModules.plus;
                                const iterator = cachedMinAvailableModules[Symbol.iterator]();
                                let it;
                                // everything before m is known to be in both sets
                                while (!(it = iterator.next()).done) {
                                    const module = it.value;
                                    if (module === m) break;
                                    newSet.add(module);
                                }
                                // everything after m still has to be checked
                                while (!(it = iterator.next()).done) {
                                    const module = it.value;
                                    if (availableModules.has(module)) {
                                        newSet.add(module);
                                    }
                                }
                                info.minAvailableModulesOwned = true;
                                info.minAvailableModules = newSet;
                                changed = true;
                                continue merge;
                            }
                        }
                    } else {
                        //...
                    }
                }
            }
            if (changed) {
                info.resultingAvailableModules = undefined;
                outdatedChunkGroupInfo.add(info);
            }
        }
        chunkGroupsForMerging.clear();
    };
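The a/b/c example can be reproduced with plain Sets. This is a simplified sketch: it ignores the plus part of ModuleSetPlus and the ownership flag, and intersect() is a hypothetical helper rather than a webpack function. It shows that the merge keeps only the modules that every parent provides, which is an intersection.

```javascript
// Hypothetical helper: modules present in both sets
function intersect(a, b) {
  const result = new Set();
  for (const m of a) {
    if (b.has(m)) result.add(m);
  }
  return result;
}

const fromEntry1 = new Set(["a", "b", "c"]);
const fromEntry2 = new Set(["b", "c", "d", "e"]);

// Only modules provided by every parent are safe to rely on
const minAvailable = [fromEntry1, fromEntry2].reduce(intersect);

// async_B.js needs a and c as synchronous dependencies
const needs = ["a", "c"];
const mustBundle = needs.filter(m => !minAvailable.has(m));

console.log([...minAvailable], mustBundle);
```

Here c is reused from the parents, while a has to be bundled into the async chunk because entry2.js does not provide it.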

4.7 Handling outdatedChunkGroupInfo

After processQueue() -> processConnectQueue() -> processChunkGroupsForMerging(), execution finally reaches processOutdatedChunkGroupInfo().

    while (queue.length || queueConnect.size) {
        processQueue();
        if (chunkGroupsForCombining.size > 0) {
            processChunkGroupsForCombining();
        }
        if (queueConnect.size > 0) {
            // calculating available modules
            processConnectQueue();
            if (chunkGroupsForMerging.size > 0) {
                // merging available modules
                processChunkGroupsForMerging();
            }
        }
        if (outdatedChunkGroupInfo.size > 0) {
            // check modules for revisit
            processOutdatedChunkGroupInfo();
        }
        if (queue.length === 0) {
            const tempQueue = queue;
            queue = queueDelayed.reverse();
            queueDelayed = tempQueue;
        }
    }

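processOutdatedChunkGroupInfo() is also long, but its logic is clear. As shown below, it has four parts. Because minAvailableModules of the current async ChunkGroupInfo has changed, some earlier decisions have to be re-checked:

  • skippedItems: a module was previously skipped (not pushed onto queue) because minAvailableModules contained it, meaning the parent Chunks could provide it for reuse. Each skipped module is now re-checked against minAvailableModules; if it is no longer there, it is pushed back onto the queue
  • skippedModuleConnections: a connection was previously put into skippedModuleConnections because its activeState was not true. Its state is now re-checked; if it has changed, the module is pushed back onto the queue
  • children chunk groups: the children are added to queueConnect again so that the minAvailableModules of the async dependencies is recomputed. Their minAvailableModules depends on the parent chunk, so when the parent's minAvailableModules changes, theirs has to be recomputed too
  • availableChildren: the child ChunkGroups of the current ChunkGroup are added to chunkGroupsForCombining again so that their minAvailableModules is recomputed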
    const processOutdatedChunkGroupInfo = () => {
        statChunkGroupInfoUpdated += outdatedChunkGroupInfo.size;
        // Revisit skipped elements
        for (const info of outdatedChunkGroupInfo) {
            // 1. Reconsider skipped items
            if (info.skippedItems !== undefined) {
                const { minAvailableModules } = info;
                for (const module of info.skippedItems) {
                    if (
                        !minAvailableModules.has(module) &&
                        !minAvailableModules.plus.has(module)
                    ) {
                        queue.push({
                            action: ADD_AND_ENTER_MODULE,
                            block: module,
                            module,
                            chunk: info.chunkGroup.chunks[0],
                            chunkGroup: info.chunkGroup,
                            chunkGroupInfo: info
                        });
                        info.skippedItems.delete(module);
                    }
                }
            }
            // 2. Reconsider skipped connections
            if (info.skippedModuleConnections !== undefined) {
                const { minAvailableModules } = info;
                for (const entry of info.skippedModuleConnections) {
                    const [module, activeState] = entry;
                    if (activeState === false) continue;
                    if (activeState === true) {
                        info.skippedModuleConnections.delete(entry);
                    }
                    if (
                        activeState === true &&
                        (minAvailableModules.has(module) ||
                            minAvailableModules.plus.has(module))
                    ) {
                        info.skippedItems.add(module);
                        continue;
                    }
                    queue.push({
                        action: activeState === true ? ADD_AND_ENTER_MODULE : PROCESS_BLOCK,
                        block: module,
                        module,
                        chunk: info.chunkGroup.chunks[0],
                        chunkGroup: info.chunkGroup,
                        chunkGroupInfo: info
                    });
                }
            }
            // 3. Reconsider children chunk groups
            if (info.children !== undefined) {
                statChildChunkGroupsReconnected += info.children.size;
                for (const cgi of info.children) {
                    let connectList = queueConnect.get(info);
                    if (connectList === undefined) {
                        connectList = new Set();
                        queueConnect.set(info, connectList);
                    }
                    connectList.add(cgi);
                }
            }
            // 4. Reconsider chunk groups for combining
            if (info.availableChildren !== undefined) {
                for (const cgi of info.availableChildren) {
                    chunkGroupsForCombining.add(cgi);
                }
            }
        }
        outdatedChunkGroupInfo.clear();
    };
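The first part, reconsidering skipped items, can be isolated in a small sketch. recheckSkipped() and the shape of info are assumptions made for illustration (the plus set is omitted); the point is that a module skipped earlier is re-queued once the shrunken minAvailableModules no longer covers it.

```javascript
// Hypothetical helper: re-queue skipped modules that are no longer available from parents
function recheckSkipped(info, queue) {
  for (const module of info.skippedItems) {
    if (!info.minAvailableModules.has(module)) {
      queue.push(module);
      // Deleting the current element while iterating a Set is safe in JavaScript
      info.skippedItems.delete(module);
    }
  }
}

const info = {
  // a and c were skipped while the parents were believed to provide a, b and c
  skippedItems: new Set(["a", "c"]),
  // after merging with a second parent only b and c are guaranteed
  minAvailableModules: new Set(["b", "c"])
};
const requeued = [];
recheckSkipped(info, requeued);

console.log(requeued, [...info.skippedItems]);
```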

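4.8 calculateResultingAvailableModules in detail

4.8.1 Source analysis

calculateResultingAvailableModules() is used several times in the flow above. It is short and its logic is straightforward: it chooses between two equivalent formulas by comparing the sizes of minAvailableModules and minAvailableModules.plus.
resultingAvailableModules consists of two parts:

  • resultingAvailableModules = new Set(): modules of chunk
  • resultingAvailableModules.plus = new Set(): decided by comparing minAvailableModules and minAvailableModules.plus

When the size of minAvailableModules is <= the size of minAvailableModules.plus, plus is kept unchanged and minAvailableModules is copied into resultingAvailableModules.
When the size of minAvailableModules is > the size of minAvailableModules.plus, plus has to grow: minAvailableModules (together with the old plus) becomes resultingAvailableModules.plus.
The final result is therefore one of: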

  • resultingAvailableModules = (modules of chunk) + (minAvailableModules + minAvailableModules.plus)
  • resultingAvailableModules = (minAvailableModules + modules of chunk) + (minAvailableModules.plus)

The only difference is whether minAvailableModules ends up in resultingAvailableModules or in resultingAvailableModules.plus.

    const calculateResultingAvailableModules = chunkGroupInfo => {
        if (chunkGroupInfo.resultingAvailableModules)
            return chunkGroupInfo.resultingAvailableModules;
        const minAvailableModules = chunkGroupInfo.minAvailableModules;
        // Create a new Set of available modules at this point
        // We want to be as lazy as possible. There are multiple ways doing this:
        // Note that resultingAvailableModules is stored as "(a) + (b)" as it's a ModuleSetPlus
        // - resultingAvailableModules = (modules of chunk) + (minAvailableModules + minAvailableModules.plus)
        // - resultingAvailableModules = (minAvailableModules + modules of chunk) + (minAvailableModules.plus)
        // We choose one depending on the size of minAvailableModules vs minAvailableModules.plus
        let resultingAvailableModules;
        if (minAvailableModules.size > minAvailableModules.plus.size) {
            // resultingAvailableModules = (modules of chunk) + (minAvailableModules + minAvailableModules.plus)
            resultingAvailableModules =
                /** @type {Set<Module> & {plus: Set<Module>}} */ (new Set());
            for (const module of minAvailableModules.plus)
                minAvailableModules.add(module);
            minAvailableModules.plus = EMPTY_SET;
            resultingAvailableModules.plus = minAvailableModules;
            chunkGroupInfo.minAvailableModulesOwned = false;
        } else {
            // resultingAvailableModules = (minAvailableModules + modules of chunk) + (minAvailableModules.plus)
            resultingAvailableModules =
                /** @type {Set<Module> & {plus: Set<Module>}} */ (
                    new Set(minAvailableModules)
                );
            resultingAvailableModules.plus = minAvailableModules.plus;
        }
        // add the modules from the chunk group to the set
        for (const chunk of chunkGroupInfo.chunkGroup.chunks) {
            for (const m of chunkGraph.getChunkModulesIterable(chunk)) {
                resultingAvailableModules.add(m);
            }
        }
        return (chunkGroupInfo.resultingAvailableModules =
            resultingAvailableModules);
    };

4.8.2 Illustrated example
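A small runnable sketch can stand in for the two storage layouts. It models a ModuleSetPlus as a Set with a plus property and uses module names instead of Module objects; makeSetPlus() and resulting() are hypothetical helpers that follow the branch structure of calculateResultingAvailableModules(), without the caching on chunkGroupInfo.

```javascript
const EMPTY_SET = new Set();

// Hypothetical helper: a Set with an extra `plus` Set, like ModuleSetPlus
function makeSetPlus(items, plus = EMPTY_SET) {
  const set = new Set(items);
  set.plus = plus;
  return set;
}

function resulting(minAvailableModules, chunkModules) {
  let result;
  if (minAvailableModules.size > minAvailableModules.plus.size) {
    // (modules of chunk) + (minAvailableModules + minAvailableModules.plus)
    for (const m of minAvailableModules.plus) minAvailableModules.add(m);
    minAvailableModules.plus = EMPTY_SET;
    result = makeSetPlus([], minAvailableModules); // large set is shared, not copied
  } else {
    // (minAvailableModules + modules of chunk) + (minAvailableModules.plus)
    result = makeSetPlus(minAvailableModules, minAvailableModules.plus);
  }
  for (const m of chunkModules) result.add(m);
  return result;
}

// Main part larger than plus: it becomes the shared plus of the result
const big = resulting(makeSetPlus(["a", "b", "c"], new Set(["x"])), ["entry"]);
// Main part smaller than plus: it is copied, plus is shared as is
const small = resulting(makeSetPlus(["a"], new Set(["x", "y", "z"])), ["entry"]);

console.log(big.size, big.plus.size, small.size, small.plus.size);
```

In both cases the union of the set and its plus is the same logical content; the branch only decides which part is copied, so that the larger part is always shared by reference.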

5.buildChunkGraph-2-connectChunkGroups

5.1 Collecting blockConnections

blockConnections is initialised in iteratorBlock() while async dependencies are handled.

Handling the non-Entry type: building queueConnect
    const processBlock = (block, isSrc) => {
        //... handle synchronous dependencies
        for (const b of block.blocks) {
            iteratorBlock(b);
        }
    };

    const iteratorBlock = b => {
        if (c !== undefined) {
            blockConnections.get(b).push({
                originChunkGroupInfo: chunkGroupInfo,
                chunkGroup: c
            });
            let connectList = queueConnect.get(chunkGroupInfo);
            if (connectList === undefined) {
                connectList = new Set();
                queueConnect.set(chunkGroupInfo, connectList);
            }
            connectList.add(cgi);
            // TODO check if this really need to be done for each traversal
            // or if it is enough when it's queued when created
            // 4. We enqueue the DependenciesBlock for traversal
            queueDelayed.push({
                action: PROCESS_BLOCK,
                block: b,
                module: module,
                chunk: c.chunks[0],
                chunkGroup: c,
                chunkGroupInfo: cgi
            });
        }
    };

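5.2 Processing blockConnections and binding ChunkGroups

As the code block below shows, areModulesAvailable() checks whether every module of the async chunkGroup is already in the parent chunkGroup's resultingAvailableModules, that is, whether the parent chunkGroup's synchronous dependencies already cover all modules the async dependency needs.

If so, the async dependency simply uses the parent chunkGroup's synchronous dependencies and no further connection is needed.

connectBlockAndChunkGroup(): binds the async dependency AsyncDependenciesBlock to the newly created ChunkGroup
connectChunkGroupParentAndChild(): binds the async dependency's ChunkGroup to its parent ChunkGroup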

    const connectChunkGroups = (compilation, blocksWithNestedBlocks, blockConnections, chunkGroupInfoMap) => {
        const { chunkGraph } = compilation;
        // Occurs when parent chunkA has an async dependency chunkB and chunkB has a
        // sync dependency chunkC, but chunkC is already a sync dependency of chunkA:
        // the connection for that async block is then skipped
        for (const [block, connections] of blockConnections) {
            if (
                !blocksWithNestedBlocks.has(block) &&
                connections.every(({ chunkGroup, originChunkGroupInfo }) =>
                    // originChunkGroupInfo already contains all modules of this chunkGroup,
                    // i.e. everything the async block's chunk needs is provided by its parent chunk
                    areModulesAvailable(
                        chunkGroup,
                        originChunkGroupInfo.resultingAvailableModules
                    )
                )
            ) {
                continue;
            }
            for (let i = 0; i < connections.length; i++) {
                const { chunkGroup, originChunkGroupInfo } = connections[i];
                // Connect this AsyncDependenciesBlock with the chunkGroup
                chunkGraph.connectBlockAndChunkGroup(block, chunkGroup);
                // Connect this chunkGroup with its parent chunkGroup
                connectChunkGroupParentAndChild(originChunkGroupInfo.chunkGroup, chunkGroup);
            }
        }
    };
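The availability check itself is not shown in the excerpt. The sketch below is an assumption about its shape, written against plain objects (chunks with a modules array, and a Set with a plus property) rather than webpack's ChunkGraph; the helper name areAllModulesAvailable is hypothetical.

```javascript
// Hypothetical stand-in for areModulesAvailable(): true only if every module of
// every chunk in the group is already in the available set or in its `plus` part.
function areAllModulesAvailable(chunkGroup, availableModules) {
  for (const chunk of chunkGroup.chunks) {
    for (const m of chunk.modules) {
      if (!availableModules.has(m) && !availableModules.plus.has(m)) return false;
    }
  }
  return true;
}

// Modules the parent (entry1.js) already provides
const parentAvailable = new Set(["entry1.js", "async_B.js"]);
parentAvailable.plus = new Set(["common.js"]);

// async_B.js is also imported synchronously by the parent: nothing new to load
const covered = { chunks: [{ modules: ["async_B.js", "common.js"] }] };
// This group needs a module the parent does not provide
const notCovered = { chunks: [{ modules: ["async_C.js"] }] };

console.log(areAllModulesAvailable(covered, parentAvailable));
console.log(areAllModulesAvailable(notCovered, parentAvailable));
```

When the check succeeds for every connection of a block, the block is skipped and its ChunkGroup never gets a parent.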

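The analysis above may be hard to follow, but a concrete example makes the logic of connectChunkGroups() clear:

  • If entry1.js did not synchronously depend on async_B.js, then because it asynchronously depends on async_B.js, async_B.js would form its own Chunk and ChunkGroup
  • But entry1.js already has async_B.js as a synchronous dependency, so there is no need for async_B.js to form a separate Chunk and ChunkGroup: entry1.js has already bundled async_B.js into its own Chunk. areModulesAvailable() in the code above is what detects this. If originChunkGroupInfo contains all modules of this chunkGroup, the async ChunkGroup can be deleted
The deletion logic is covered in the next section.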

6.buildChunkGraph-3-cleanupUnconnectedGroups

Remove all chunkGroups that are not connected

6.1 Collecting allCreatedChunkGroups

allCreatedChunkGroups is also initialised in iteratorBlock() while async dependencies are handled.

    const processBlock = (block, isSrc) => {
        //... handle synchronous dependencies
        for (const b of block.blocks) {
            iteratorBlock(b);
        }
    };

    const iteratorBlock = b => {
        let cgi = blockChunkGroups.get(b);
        const entryOptions = b.groupOptions && b.groupOptions.entryOptions;
        if (cgi === undefined) {
            // Case 1: this async dependency has no chunkGroup yet
            if (entryOptions) {
                // Scenario 1: Entry type
            } else if (!chunkGroupInfo.asyncChunks || !chunkGroupInfo.chunkLoading) {
                // Scenario 2: asyncChunks=false or chunkLoading=false in webpack.config.js
            } else {
                // Scenario 3: non-Entry and async chunks allowed
                c = compilation.addChunkInGroup(
                    b.groupOptions || b.chunkName,
                    module,
                    b.loc,
                    b.request
                );
                blockConnections.set(b, []);
                allCreatedChunkGroups.add(c);
            }
        } else if (entryOptions) {
            // Case 2: the async dependency already has a chunkGroup, and it is an entry type
            entrypoint = cgi.chunkGroup;
        } else {
            // Case 3: the async dependency already has a chunkGroup, and it is not an entry type
            c = cgi.chunkGroup;
        }
        if (c !== undefined) {
            // Handle the non-Entry type
        } else if (entrypoint !== undefined) {
            // Handle the Entry type
            chunkGroupInfo.chunkGroup.addAsyncEntrypoint(entrypoint);
        }
    };

6.2 Processing allCreatedChunkGroups

chunkGroup.getNumberOfParents() is used to detect an async ChunkGroup that is not connected to any parent ChunkGroup; if it has none, the ChunkGroup is removed.

    const cleanupUnconnectedGroups = (compilation, allCreatedChunkGroups) => {
        const { chunkGraph } = compilation;
        for (const chunkGroup of allCreatedChunkGroups) {
            // A chunkGroup with zero parents was never connected, so remove it
            if (chunkGroup.getNumberOfParents() === 0) {
                for (const chunk of chunkGroup.chunks) {
                    compilation.chunks.delete(chunk);
                    chunkGraph.disconnectChunk(chunk);
                }
                chunkGraph.disconnectChunkGroup(chunkGroup);
                chunkGroup.remove();
            }
        }
    };
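The effect of the cleanup can be shown on plain data. In this sketch the group objects, their parents Set and the cleanupUnconnected() helper are all invented for illustration; webpack's ChunkGroup and ChunkGraph do more bookkeeping than this.

```javascript
// Hypothetical model: a group is connected if at least one parent was linked to it
function cleanupUnconnected(allCreatedGroups, chunks) {
  const removed = [];
  for (const group of allCreatedGroups) {
    if (group.parents.size === 0) {
      for (const chunk of group.chunks) chunks.delete(chunk);
      removed.push(group.name);
    }
  }
  return removed;
}

const chunks = new Set(["main", "async_B", "async_C"]);
const groups = [
  // Skipped in connectChunkGroups(): its modules were already available
  { name: "async_B", chunks: ["async_B"], parents: new Set() },
  // Connected to its parent entrypoint
  { name: "async_C", chunks: ["async_C"], parents: new Set(["main"]) }
];

const removed = cleanupUnconnected(groups, chunks);
console.log(removed, [...chunks]);
```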

For example, when entry1.js already has async_B.js as a synchronous dependency, async_B.js does not need its own Chunk and ChunkGroup. connectChunkGroups() above therefore never calls connectChunkGroupParentAndChild(originChunkGroupInfo.chunkGroup, chunkGroup) to link the ChunkGroups, so ChunkGroup.getNumberOfParents() === 0 for the ChunkGroup of async_B.js, which finally triggers the removal logic and deletes that ChunkGroup.

7.hooks.optimizeChunks

    while (this.hooks.optimizeChunks.call(this.chunks, this.chunkGroups)) {
        /* empty */
    }

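After visitModules() has run, hooks.optimizeChunks.call() is invoked to optimise the chunks. This triggers several plugins, the most familiar of which is SplitChunksPlugin.

For reasons of length, the detailed analysis is left to the next article, "Webpack5 source code: seal phase analysis (part 2)".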

