Webpack 5 Source Code: Seal Phase Analysis (Part 2), the SplitChunksPlugin Source


This article is based on webpack 5.74.0.

Preface

In the previous article, 「Webpack5 源码」seal 阶段(流程图)剖析 (一), we already analyzed the logic of the seal phase, mainly covering:

  • new ChunkGraph()
  • iterating over this.entries to create each Chunk and ChunkGroup
  • the overall flow of buildChunkGraph()
seal(callback) {
    const chunkGraph = new ChunkGraph(
        this.moduleGraph,
        this.outputOptions.hashFunction
    );
    this.chunkGraph = chunkGraph;
    //...

    this.logger.time("create chunks");
    /** @type {Map<Entrypoint, Module[]>} */
    for (const [name, { dependencies, includeDependencies, options }] of this.entries) {
        const chunk = this.addChunk(name);
        const entrypoint = new Entrypoint(options);
        //...
    }
    //...
    buildChunkGraph(this, chunkGraphInit);
    this.logger.timeEnd("create chunks");

    this.logger.time("optimize");
    //...
    while (this.hooks.optimizeChunks.call(this.chunks, this.chunkGroups)) {/* empty */}
    //...
    this.logger.timeEnd("optimize");

    this.logger.time("code generation");
    this.codeGeneration(err => {
        //...
        this.logger.timeEnd("code generation");
    });
}

buildChunkGraph() was the core of the previous article's analysis; it unfolds in three parts:

const buildChunkGraph = (compilation, inputEntrypointsAndModules) => {
    // PART ONE
    logger.time("visitModules");
    visitModules(...);
    logger.timeEnd("visitModules");

    // PART TWO
    logger.time("connectChunkGroups");
    connectChunkGroups(...);
    logger.timeEnd("connectChunkGroups");

    for (const [chunkGroup, chunkGroupInfo] of chunkGroupInfoMap) {
        for (const chunk of chunkGroup.chunks)
            chunk.runtime = mergeRuntime(chunk.runtime, chunkGroupInfo.runtime);
    }

    // Cleanup work
    logger.time("cleanup");
    cleanupUnconnectedGroups(compilation, allCreatedChunkGroups);
    logger.timeEnd("cleanup");
};

The overall logic of visitModules() was illustrated there with a flowchart.

Having wrapped up the analysis of buildChunkGraph() in the previous article, we now turn to the logic around hooks.optimizeChunks().

What this article covers

Without SplitChunksPlugin, as shown in the figure above, 6 chunks are generated in total (4 chunks from the entry files and 2 from async loading), and several dependency libraries are bundled repeatedly into different chunks. In such cases SplitChunksPlugin can optimize the split: as the next figure shows, two new chunks, test2 and test3, are split off and the duplicated dependencies are bundled into them, avoiding the bundle bloat caused by duplicate packaging.

This article takes the example above as its running case and analyzes the flow of SplitChunksPlugin's chunk splitting.

hooks.optimizeChunks

while (this.hooks.optimizeChunks.call(this.chunks, this.chunkGroups)) {/* empty */}

After visitModules() has done its work, hooks.optimizeChunks.call() is invoked to optimize the chunks. As the figure showed, several plugins are triggered; the most familiar one is SplitChunksPlugin, on which the rest of this article concentrates.

2. SplitChunksPlugin source code analysis

Configuring cacheGroups lets us further optimize the chunks that have already been formed, splitting one big chunk into two or more chunks, reducing duplicate bundling and increasing code reuse.

For example, the entry bundles app1.js and app2.js both contain duplicated code: the third-party library js-cookie.
Can we bundle js-cookie into a new chunk of its own? That would pull the js-cookie code out of app1.js and app2.js, so it only needs to be bundled in one place.

module.exports = {
    //...
    optimization: {
        splitChunks: {
            chunks: 'async',
            cacheGroups: {
                defaultVendors: {
                    test: /[\\/]node_modules[\\/]/,
                    priority: -10,
                    reuseExistingChunk: true,
                },
                default: {
                    minChunks: 2,
                    priority: -20,
                    reuseExistingChunk: true,
                },
            },
        },
    },
}

2.0 Overall flow diagram and code overview

2.0.1 Code

Split by the logger.time calls, the whole process falls into four stages:

  • prepare: initialize data structures and helpers for the stages below
  • modules: iterate over all modules and build the chunksInfoMap data
  • queue: split chunks according to minSize, iterating over the chunksInfoMap data
  • maxSize: split chunks according to maxSize
compilation.hooks.optimizeChunks.tap(
    {
        name: "SplitChunksPlugin",
        stage: STAGE_ADVANCED
    },
    chunks => {
        logger.time("prepare");
        //...
        logger.timeEnd("prepare");

        logger.time("modules");
        for (const module of compilation.modules) { /* ... */ }
        logger.timeEnd("modules");

        logger.time("queue");
        for (const [key, info] of chunksInfoMap) { /* ... */ }
        while (chunksInfoMap.size > 0) { /* ... */ }
        logger.timeEnd("queue");

        logger.time("maxSize");
        for (const chunk of Array.from(compilation.chunks)) { /* ... */ }
        logger.timeEnd("maxSize");
    }
);

The source code below is analyzed around the configuration parameters, such as maxSize, minSize, enforce, maxInitialRequests and so on.

2.0.2 Flow diagram

2.1 The default cacheGroups configuration

The default cacheGroups configuration is set up during initialization; it is not code executed inside SplitChunksPlugin.js.

From the code block below we can see that two default cacheGroups are defined at initialization, one of them targeting node_modules.

// node_modules/webpack/lib/config/defaults.js
const { splitChunks } = optimization;
if (splitChunks) {
    A(splitChunks, "defaultSizeTypes", () =>
        css ? ["javascript", "css", "unknown"] : ["javascript", "unknown"]
    );
    D(splitChunks, "hidePathInfo", production);
    D(splitChunks, "chunks", "async");
    D(splitChunks, "usedExports", optimization.usedExports === true);
    D(splitChunks, "minChunks", 1);
    F(splitChunks, "minSize", () => (production ? 20000 : 10000));
    F(splitChunks, "minRemainingSize", () => (development ? 0 : undefined));
    F(splitChunks, "enforceSizeThreshold", () => (production ? 50000 : 30000));
    F(splitChunks, "maxAsyncRequests", () => (production ? 30 : Infinity));
    F(splitChunks, "maxInitialRequests", () => (production ? 30 : Infinity));
    D(splitChunks, "automaticNameDelimiter", "-");
    const {cacheGroups} = splitChunks;
    F(cacheGroups, "default", () => ({
        idHint: "",
        reuseExistingChunk: true,
        minChunks: 2,
        priority: -20
    }));
    // const NODE_MODULES_REGEXP = /[\\/]node_modules[\\/]/i;
    F(cacheGroups, "defaultVendors", () => ({
        idHint: "vendors",
        reuseExistingChunk: true,
        test: NODE_MODULES_REGEXP,
        priority: -10
    }));
}

Translated into webpack.config.js terms, the defaults correspond to the code block below; there are two default cache groups:

  • modules related to node_modules are bundled into one chunk
  • everything else is bundled into chunks according to the other parameters

splitChunks.chunks selects which chunks are optimized; by default chunks is "async":

  • async: only async chunks may be split into new chunks
  • initial: only entry chunks are split into new chunks
  • all: both async and non-async chunks are considered for splitting

For details see the analysis of cacheGroup.chunksFilter below.

module.exports = {
    //...
    optimization: {
        splitChunks: {
            chunks: 'async',
            minSize: 20000,
            minRemainingSize: 0,
            minChunks: 1,
            maxAsyncRequests: 30,
            maxInitialRequests: 30,
            enforceSizeThreshold: 50000,
            cacheGroups: {
                defaultVendors: {
                    test: /[\\/]node_modules[\\/]/,
                    priority: -10,
                    reuseExistingChunk: true,
                },
                default: {
                    minChunks: 2,
                    priority: -20,
                    reuseExistingChunk: true,
                },
            },
        },
    },
};

2.2 modules stage: iterate over compilation.modules and build the chunksInfoMap data per cacheGroup

for (const module of compilation.modules) {
    let cacheGroups = this.options.getCacheGroups(module, context);
    let cacheGroupIndex = 0;
    for (const cacheGroupSource of cacheGroups) {
        const cacheGroup = this._getCacheGroup(cacheGroupSource);
        // ============ Step 1 ============
        const combs = cacheGroup.usedExports
            ? getCombsByUsedExports()
            : getCombs();
        for (const chunkCombination of combs) {
            const count =
                chunkCombination instanceof Chunk ? 1 : chunkCombination.size;
            if (count < cacheGroup.minChunks) continue;
            // ============ Step 2 ============
            const {chunks: selectedChunks, key: selectedChunksKey} =
                getSelectedChunks(chunkCombination, cacheGroup.chunksFilter);
            // ============ Step 3 ============
            addModuleToChunksInfoMap(
                cacheGroup,
                cacheGroupIndex,
                selectedChunks,
                selectedChunksKey,
                module
            );
        }
        cacheGroupIndex++;
    }
}

2.2.1 Step 1: getCombsByUsedExports()

for (const module of compilation.modules) {
    let cacheGroups = this.options.getCacheGroups(module, context);
    const getCombsByUsedExports = memoize(() => {
        // fill the groupedByExportsMap
        getExportsChunkSetsInGraph();
        /** @type {Set<Set<Chunk> | Chunk>} */
        const set = new Set();
        const groupedByUsedExports = groupedByExportsMap.get(module);
        for (const chunks of groupedByUsedExports) {
            const chunksKey = getKey(chunks);
            for (const comb of getExportsCombinations(chunksKey))
                set.add(comb);
        }
        return set;
    });

    for (const cacheGroupSource of cacheGroups) {
        const cacheGroup = this._getCacheGroup(cacheGroupSource);
        // ============ Step 1 ============
        const combs = cacheGroup.usedExports
            ? getCombsByUsedExports()
            : getCombs();
        //...
    }
}

getCombsByUsedExports() involves several helpers (all initialized during the prepare stage); the overall flow is shown below.

While iterating over compilation.modules, groupedByExportsMap.get(module) is called to fetch the chunk collections of the current module; the resulting data structure looks like this:

// item[0] is a chunk set fetched via the key
// item[1] is a chunk set that satisfies minChunks
// item[2] and item[3] are single chunks that satisfy minChunks
[new Set(3), new Set(2), Chunk, Chunk]

moduleGraph.getExportsInfo

This fetches the exports info object of the module, e.g. for common__g.js.

The data obtained for common__g.js is shown in the figure.

The chunk collections are gathered keyed by exportsInfo.getUsageKey(chunk.runtime).

getUsageKey(chunk.runtime) serves as the key under which chunk sets are collected; in the current example, app1, app2, app3 and app4 all yield the same getUsageKey(chunk.runtime). For an analysis of this method itself, please refer to other articles.

const groupChunksByExports = module => {
    const exportsInfo = moduleGraph.getExportsInfo(module);
    const groupedByUsedExports = new Map();

    for (const chunk of chunkGraph.getModuleChunksIterable(module)) {
        const key = exportsInfo.getUsageKey(chunk.runtime);
        const list = groupedByUsedExports.get(key);
        if (list !== undefined) {
            list.push(chunk);
        } else {
            groupedByUsedExports.set(key, [chunk]);
        }
    }
    return groupedByUsedExports.values();
};

So for an entry file such as entry1.js, groupedByUsedExports.values() is a single chunk list: [app1].

For common__g.js, a dependency used by all 4 entry files, groupedByUsedExports.values() is a single chunk list: [app1, app2, app3, app4].

chunkGraph.getModuleChunksIterable

This fetches the set of chunks a module belongs to; e.g. common__g.js in the figure above belongs to the chunks app1, app2, app3 and app4.

singleChunkSets, chunkSetsInGraph and chunkSets

There are 4 chunk sets in total:

  • [app1,app2,app3,app4]
  • [app1,app2,app3]
  • [app1,app2]
  • [app2,app3,app4]

each corresponding to the chunk set of one of the modules above.

Chunks whose module maps to a single chunk, i.e. those formed from entry files and async files, are kept in singleChunkSets instead.

chunkSetsByCount

chunkSetsByCount is a transformation of chunkSetsInGraph, regrouped by the length of each chunkSetsInGraph item. For the example above, chunkSetsInGraph is:

There are 4 chunk sets in total:

  • [app1,app2,app3,app4]
  • [app1,app2,app3]
  • [app1,app2]
  • [app2,app3,app4]

Converted into chunkSetsByCount:
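As a sketch, the resulting structure can be written out in plain JavaScript (the sets come from the example above; the literal strings stand in for the actual Chunk objects):

// chunkSetsByCount: key = number of chunks in a set,
// value = the chunk sets of exactly that size
const chunkSetsByCount = new Map([
    [4, [new Set(["app1", "app2", "app3", "app4"])]],
    [3, [new Set(["app1", "app2", "app3"]), new Set(["app2", "app3", "app4"])]],
    [2, [new Set(["app1", "app2"])]]
]);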

Summary
  • groupChunksByExports(module) fetches all the chunk-set data of the module and stores it into groupedByExportsMap, a map of key=module to value=[[chunk1,chunk2], chunk1]
  • every chunk set is stored into chunkSetsInGraph via getKey(chunks); chunkSetsInGraph maps key=getKey(chunks) to value=the chunk set

When a module is processed, its chunk sets are fetched from groupedByExportsMap; the result is called groupedByUsedExports.

const groupedByUsedExports = groupedByExportsMap.get(module);

Then every chunk set A is iterated: the chunksKey formed from set A is used to fetch the corresponding chunk set from chunkSetsInGraph (which is set A itself), and chunkSetsByCount is additionally used to fetch smaller sets B that are subsets of A (set B may well be the chunk set of some other module).

const groupedByUsedExports = groupedByExportsMap.get(module);
for (const chunks of groupedByUsedExports) {
  const chunksKey = getKey(chunks);
  for (const comb of getExportsCombinations(chunksKey))
    set.add(comb);
}
return set;

2.2.2 Step 2: getSelectedChunks() and cacheGroup.chunksFilter

for (const module of compilation.modules) {
    let cacheGroups = this.options.getCacheGroups(module, context);
    let cacheGroupIndex = 0;
    for (const cacheGroupSource of cacheGroups) {
        const cacheGroup = this._getCacheGroup(cacheGroupSource);
        // ============ Step 1 ============
        const combs = cacheGroup.usedExports
            ? getCombsByUsedExports()
            : getCombs();
        for (const chunkCombination of combs) {
            const count =
                chunkCombination instanceof Chunk ? 1 : chunkCombination.size;
            if (count < cacheGroup.minChunks) continue;
            // ============ Step 2 ============
            const {chunks: selectedChunks, key: selectedChunksKey} =
                getSelectedChunks(chunkCombination, cacheGroup.chunksFilter);
            //...
        }
        cacheGroupIndex++;
    }
}

cacheGroup.chunksFilter

If webpack.config.js passes "all", cacheGroup.chunksFilter is const ALL_CHUNK_FILTER = chunk => true;

const INITIAL_CHUNK_FILTER = chunk => chunk.canBeInitial();
const ASYNC_CHUNK_FILTER = chunk => !chunk.canBeInitial();
const ALL_CHUNK_FILTER = chunk => true;
const normalizeChunksFilter = chunks => {
    if (chunks === "initial") {
        return INITIAL_CHUNK_FILTER;
    }
    if (chunks === "async") {
        return ASYNC_CHUNK_FILTER;
    }
    if (chunks === "all") {
        return ALL_CHUNK_FILTER;
    }
    if (typeof chunks === "function") {
        return chunks;
    }
};
const createCacheGroupSource = (options, key, defaultSizeTypes) => {
    //...
    return {
        //...
        chunksFilter: normalizeChunksFilter(options.chunks),
        //...
    };
};
const {chunks: selectedChunks, key: selectedChunksKey} =
    getSelectedChunks(chunkCombination, cacheGroup.chunksFilter);

What if webpack.config.js passes "async" or "initial"?

From the code block below we can see:

  • for a plain ChunkGroup, chunk.canBeInitial() = false
  • for a synchronous Entrypoint, chunk.canBeInitial() = true
  • for an async Entrypoint, chunk.canBeInitial() = false
class ChunkGroup {
    isInitial() { return false; }
}
class Entrypoint extends ChunkGroup {
    constructor(entryOptions, initial = true) {
        this._initial = initial;
    }
    isInitial() { return this._initial; }
}
// node_modules/webpack/lib/Compilation.js
addAsyncEntrypoint(options, module, loc, request) {
    const entrypoint = new Entrypoint(options, false);
}

getSelectedChunks()

From the code block below: chunkFilter() filters the chunks array. Since the example uses "all", chunkFilter() returns true in every case, so the filter effectively lets every chunk through.

chunkFilter() essentially uses the splitChunks.chunks option to decide whether to filter by _initial, combined with:

  • a plain ChunkGroup: _initial=false
  • an Entrypoint ChunkGroup: _initial=true
  • an AsyncEntrypoint ChunkGroup: _initial=false

const getSelectedChunks = (chunks, chunkFilter) => {
    let entry = selectedChunksCacheByChunksSet.get(chunks);
    if (entry === undefined) {
        entry = new WeakMap();
        selectedChunksCacheByChunksSet.set(chunks, entry);
    }
    let entry2 = entry.get(chunkFilter);
    if (entry2 === undefined) {
        const selectedChunks = [];
        if (chunks instanceof Chunk) {
            if (chunkFilter(chunks)) selectedChunks.push(chunks);
        } else {
            for (const chunk of chunks) {
                if (chunkFilter(chunk)) selectedChunks.push(chunk);
            }
        }
        entry2 = {
            chunks: selectedChunks,
            key: getKey(selectedChunks)
        };
        entry.set(chunkFilter, entry2);
    }
    return entry2;
}

2.2.3 Step 3: addModuleToChunksInfoMap

for (const module of compilation.modules) {
    let cacheGroups = this.options.getCacheGroups(module, context);
    let cacheGroupIndex = 0;
    for (const cacheGroupSource of cacheGroups) {
        const cacheGroup = this._getCacheGroup(cacheGroupSource);
        // ============ Step 1 ============
        const combs = cacheGroup.usedExports
            ? getCombsByUsedExports()
            : getCombs();
        for (const chunkCombination of combs) {
            const count =
                chunkCombination instanceof Chunk ? 1 : chunkCombination.size;
            if (count < cacheGroup.minChunks) continue;
            // ============ Step 2 ============
            const {chunks: selectedChunks, key: selectedChunksKey} =
                getSelectedChunks(chunkCombination, cacheGroup.chunksFilter);
            // ============ Step 3 ============
            addModuleToChunksInfoMap(
                cacheGroup,
                cacheGroupIndex,
                selectedChunks,
                selectedChunksKey,
                module
            );
        }
        cacheGroupIndex++;
    }
}

This builds the chunksInfoMap data; each key's item (holding modules, chunks, chunksKeys, ...) is one element of chunksInfoMap.

const addModuleToChunksInfoMap = (...) => {
    let info = chunksInfoMap.get(key);
    if (info === undefined) {
        chunksInfoMap.set(
            key,
            (info = {
                modules: new SortableSet(
                    undefined,
                    compareModulesByIdentifier
                ),
                sizes: {}, // per-source-type totals (other fields such as cacheGroup and name are elided here)
                chunks: new Set(),
                chunksKeys: new Set()
            })
        );
    }
    const oldSize = info.modules.size;
    info.modules.add(module);
    if (info.modules.size !== oldSize) {
        for (const type of module.getSourceTypes()) {
            info.sizes[type] = (info.sizes[type] || 0) + module.size(type);
        }
    }
    const oldChunksKeysSize = info.chunksKeys.size;
    info.chunksKeys.add(selectedChunksKey);
    if (oldChunksKeysSize !== info.chunksKeys.size) {
        for (const chunk of selectedChunks) {
            info.chunks.add(chunk);
        }
    }
};
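For intuition, a single chunksInfoMap entry built this way looks roughly like the sketch below. This is hypothetical: the field names follow the snippet above, the values are illustrative, and fields elided there (cacheGroup, cacheGroupIndex, name) appear only as a comment.

// Hypothetical shape of one chunksInfoMap entry for the js-cookie module:
const info = {
    modules: new Set(["js-cookie"]),             // actually a SortableSet of Module objects
    sizes: { javascript: 1234 },                 // accumulated size per source type
    chunks: new Set(["app1", "app2", "app3"]),   // the chunks the modules were selected from
    chunksKeys: new Set(["key(app1,app2,app3)"]) // keys of the chunk sets already merged in
    // plus cacheGroup, cacheGroupIndex, name in the real source
};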

2.2.4 A concrete example

The webpack.config.js configuration is as follows:

cacheGroups:{
    defaultVendors: {
        test: /[\\/]node_modules[\\/]/,
        priority: -10,
        reuseExistingChunk: true,
    },
    default: {
        minChunks: 2,
        priority: -20,
        reuseExistingChunk: true,
    },
    test3: {
        chunks: 'all',
        minChunks: 3,
        name: "test3",
        priority: 3
    },
    test2: {
        chunks: 'all',
        minChunks: 2,
        name: "test2",
        priority: 2
    }
}

The example has 4 entry files:

  • app1.js: uses the third-party libraries js-cookie and loadsh
  • app2.js: uses js-cookie, loadsh and vaca
  • app3.js: uses js-cookie and vaca
  • app4.js: uses vaca

Now recall the overall flow: the outer loop is over modules; for each module we fetch its chunk collections, i.e. combs.

The inner loop is over the cacheGroups (the groups configured in webpack.config.js); each combs[i] is checked against every cacheGroup, which is essentially a minChunks + chunksFilter filter, and the surviving data is pushed into chunksInfoMap via addModuleToChunksInfoMap().

for (const module of compilation.modules) {
    let cacheGroups = this.options.getCacheGroups(module, context);
    let cacheGroupIndex = 0;
    for (const cacheGroupSource of cacheGroups) {
        const cacheGroup = this._getCacheGroup(cacheGroupSource);
        // ============ Step 1 ============
        const combs = cacheGroup.usedExports
            ? getCombsByUsedExports()
            : getCombs();
        for (const chunkCombination of combs) {
            const count =
                chunkCombination instanceof Chunk ? 1 : chunkCombination.size;
            if (count < cacheGroup.minChunks) continue;
            // ============ Step 2 ============
            const {chunks: selectedChunks, key: selectedChunksKey} =
                getSelectedChunks(chunkCombination, cacheGroup.chunksFilter);
            // ============ Step 3 ============
            addModuleToChunksInfoMap(
                cacheGroup,
                cacheGroupIndex,
                selectedChunks,
                selectedChunksKey,
                module
            );
        }
        cacheGroupIndex++;
    }
}

Since every entry file forms a chunk, there are 4 chunks in total. Because addModuleToChunksInfoMap() iterates module by module, we can tabulate which chunks each module belongs to.

From the code above, when processing NormalModule="js-cookie", getCombsByUsedExports() returns 5 chunk collections, namely:

Note: the chunkSetsByCount entry Set(2){app1,app2} does not itself contain js-cookie (per the figure above it holds loadsh), but it satisfies the isSubSet() condition.

[Set(3){app1,app2,app3}, Set(2){app1,app2}, app1, app2, app3]

The detailed execution of getCombsByUsedExports() is shown in the figure below; the matching chunk collections are fetched via chunkSetsByCount.

chunkSetsByCountkey=chunk 数量value= 对应的 chunk 汇合(数量为 key),比方
key=3value=[Set(3){app1, app2, app3}]

And since the code iterates over cacheGroups, we also have to work out which cacheGroups are matched.

NormalModule="js-cookie"

  • cacheGroup=test3: the combs obtained are
combs = [["app1","app2","app3"],["app1","app2"],"app1","app2","app3"]

Iterating combs with cacheGroup.minChunks=3, the data that survives the filter and triggers addModuleToChunksInfoMap() is

["app1","app2","app3"]

  • cacheGroup=test2: the combs obtained are
combs = [["app1","app2","app3"],["app1","app2"],"app1","app2","app3"]

Iterating combs with cacheGroup.minChunks=2, the data that survives the filter and triggers addModuleToChunksInfoMap() is

["app1","app2","app3"]
["app1","app2"]

  • cacheGroup=default: the combs obtained are
combs = [["app1","app2","app3"],["app1","app2"],"app1","app2","app3"]

Iterating combs with cacheGroup.minChunks=2, the data that survives the filter and triggers addModuleToChunksInfoMap() is

["app1","app2","app3"]
["app1","app2"]

  • cacheGroup=defaultVendors: the combs obtained are
combs = [["app1","app2","app3"],["app1","app2"],"app1","app2","app3"]

Iterating combs with cacheGroup.minChunks=1 (defaultVendors sets no minChunks, so the splitChunks default of 1 applies), the data that survives the filter and triggers addModuleToChunksInfoMap() is

["app1","app2","app3"]
["app1","app2"]
["app1"]
["app2"]
["app3"]

The key of chunksInfoMap

A key attribute runs through the whole flow.

For example, the chunksKey in the code below:

const getCombs = memoize(() => {
  const chunks = chunkGraph.getModuleChunksIterable(module);
  const chunksKey = getKey(chunks);
  return getCombinations(chunksKey);
});

and the selectedChunksKey passed into addModuleToChunksInfoMap():

const getSelectedChunks = (chunks, chunkFilter) => {
    let entry = selectedChunksCacheByChunksSet.get(chunks);
    if (entry === undefined) {
        entry = new WeakMap();
        selectedChunksCacheByChunksSet.set(chunks, entry);
    }
    /** @type {SelectedChunksResult} */
    let entry2 = entry.get(chunkFilter);
    if (entry2 === undefined) {
        /** @type {Chunk[]} */
        const selectedChunks = [];
        if (chunks instanceof Chunk) {
            if (chunkFilter(chunks)) selectedChunks.push(chunks);
        } else {
            for (const chunk of chunks) {
                if (chunkFilter(chunk)) selectedChunks.push(chunk);
            }
        }
        entry2 = {
            chunks: selectedChunks,
            key: getKey(selectedChunks)
        };
        entry.set(chunkFilter, entry2);
    }
    return entry2;
};

const {chunks: selectedChunks, key: selectedChunksKey} =
    getSelectedChunks(chunkCombination, cacheGroup.chunksFilter);
addModuleToChunksInfoMap(
    cacheGroup,
    cacheGroupIndex,
    selectedChunks,
    selectedChunksKey,
    module
);

Let's tweak the key under which addModuleToChunksInfoMap() finally stores data in chunksInfoMap, replacing the bare selectedChunksKey with the current module's path, as below:

const key =
    cacheGroup.key +
    (name
        ? ` name:${name}`
        : ` chunks:${keyToString(selectedChunksKey)}`);
// if there is no name, also prepend the corresponding module.rawRequest
const key =
    cacheGroup.key +
    (name
        ? ` name:${name}`
        : ` chunks:${module.rawRequest} ${keyToString(selectedChunksKey)}`);

The resulting chunksInfoMap is shown below. Taking the js-cookie example above: different chunk sets produce different selectedChunksKeys, so each distinct chunk set ends up as part of the value under a different key of chunksInfoMap, rather than all chunk sets being pushed under a single key.

2.3 queue stage: filter the chunksInfoMap data by minSize and minSizeReduction

compilation.hooks.optimizeChunks.tap(
    {
        name: "SplitChunksPlugin",
        stage: STAGE_ADVANCED
    },
    chunks => {
        logger.time("prepare");
        //...
        logger.timeEnd("prepare");

        logger.time("modules");
        for (const module of compilation.modules) { /* ... */ }
        logger.timeEnd("modules");

        logger.time("queue");
        for (const [key, info] of chunksInfoMap) { /* ... */ }
        while (chunksInfoMap.size > 0) { /* ... */ }
        logger.timeEnd("queue");

        logger.time("maxSize");
        for (const chunk of Array.from(compilation.chunks)) { /* ... */ }
        logger.timeEnd("maxSize");
    }
);

maxSize takes priority over maxInitialRequests/maxAsyncRequests; the priority order is maxInitialRequests/maxAsyncRequests < maxSize < minSize.

// Filter items were size < minSize
for (const [key, info] of chunksInfoMap) {
    if (removeMinSizeViolatingModules(info)) {
        chunksInfoMap.delete(key);
    } else if (
        !checkMinSizeReduction(
            info.sizes,
            info.cacheGroup.minSizeReduction,
            info.chunks.size
        )
    ) {
        chunksInfoMap.delete(key);
    }
}

removeMinSizeViolatingModules(): as the code below shows, for each source type of the current info (e.g. javascript) the accumulated size is compared with cacheGroup.minSize; if it is smaller, the modules of those types are removed and do not form a new chunk.

When info.sizes[type] was assembled earlier, sizes of the same type were accumulated.

const removeMinSizeViolatingModules = info => {
    const violatingSizes = getViolatingMinSizes(
        info.sizes,
        info.cacheGroup.minSize
    );
    if (violatingSizes === undefined) return false;
    removeModulesWithSourceType(info, violatingSizes);
    return info.modules.size === 0;
};
const removeModulesWithSourceType = (info, sourceTypes) => {
    for (const module of info.modules) {
        const types = module.getSourceTypes();
        if (sourceTypes.some(type => types.has(type))) {
            info.modules.delete(module);
            for (const type of types) {
                info.sizes[type] -= module.size(type);
            }
        }
    }
};

checkMinSizeReduction() concerns the cacheGroup.minSizeReduction option: the minimum size reduction (in bytes) of the main chunk (bundle) required for a chunk to be generated. If splitting out a chunk would not reduce the main bundle by at least the given number of bytes, it is not split, even if it satisfies splitChunks.minSize.

Both splitChunks.minSizeReduction and splitChunks.minSize must be satisfied for a chunk to be generated; if extracting these chunks would shrink the main chunks by less than cacheGroup.minSizeReduction, no new chunk is extracted.

// minSizeReduction has the same shape as minSize, e.g. { javascript: 200, unknown: 200 }
const checkMinSizeReduction = (sizes, minSizeReduction, chunkCount) => {
    for (const key of Object.keys(minSizeReduction)) {
        const size = sizes[key];
        if (size === undefined || size === 0) continue;
        if (size * chunkCount < minSizeReduction[key]) return false;
    }
    return true;
};
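A quick numeric check of this function with made-up sizes (extracting the group saves roughly size * chunkCount bytes across the source chunks):

checkMinSizeReduction({ javascript: 30000 }, { javascript: 50000 }, 2); // 60000 >= 50000 -> true, keep
checkMinSizeReduction({ javascript: 20000 }, { javascript: 50000 }, 2); // 40000 <  50000 -> false, drop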

2.4 queue stage: iterate over chunksInfoMap and reorganize chunks according to the rules

compilation.hooks.optimizeChunks.tap(
    {
        name: "SplitChunksPlugin",
        stage: STAGE_ADVANCED
    },
    chunks => {
        logger.time("prepare");
        //...
        logger.timeEnd("prepare");

        logger.time("modules");
        for (const module of compilation.modules) { /* ... */ }
        logger.timeEnd("modules");

        logger.time("queue");
        for (const [key, info] of chunksInfoMap) { /* ... */ }
        while (chunksInfoMap.size > 0) { /* ... */ }
        logger.timeEnd("queue");

        logger.time("maxSize");
        for (const chunk of Array.from(compilation.chunks)) { /* ... */ }
        logger.timeEnd("maxSize");
    }
);

Each element info of chunksInfoMap is essentially one cacheGroup occurrence, carrying its chunks and modules.

while (chunksInfoMap.size > 0) {
    // compareEntries compares priorities to build bestEntry
    //
}

// ... handle maxSize, covered in the next section

2.4.1 compareEntries: find the highest-priority item in chunksInfoMap

Determine which cacheGroup has the higher priority: some chunks satisfy several cacheGroups, and the higher-priority one gets split first and produces its output first.

Two infos are compared on the following attributes, in order from top to bottom, to find the best info, i.e. the best chunksInfoMap item:

let bestEntryKey;
let bestEntry;
for (const pair of chunksInfoMap) {
    const key = pair[0];
    const info = pair[1];
    if (
        bestEntry === undefined ||
        compareEntries(bestEntry, info) < 0
    ) {
        bestEntry = info;
        bestEntryKey = key;
    }
}
const item = bestEntry;
chunksInfoMap.delete(bestEntryKey);

The comparison itself lives in compareEntries(); from the code:

  • priority: the larger the value, the higher the priority
  • chunks.size: the more chunks, the higher the priority
  • size reduction: the larger totalSize(a.sizes) * (a.chunks.size - 1), the higher the priority
  • cache group index: the smaller the value, the higher the priority
  • number of modules: the larger, the higher the priority
  • module identifiers: the larger, the higher the priority
const compareEntries = (a, b) => {
  // 1. by priority
  const diffPriority = a.cacheGroup.priority - b.cacheGroup.priority;
  if (diffPriority) return diffPriority;
  // 2. by number of chunks
  const diffCount = a.chunks.size - b.chunks.size;
  if (diffCount) return diffCount;
  // 3. by size reduction
  const aSizeReduce = totalSize(a.sizes) * (a.chunks.size - 1);
  const bSizeReduce = totalSize(b.sizes) * (b.chunks.size - 1);
  const diffSizeReduce = aSizeReduce - bSizeReduce;
  if (diffSizeReduce) return diffSizeReduce;
  // 4. by cache group index
  const indexDiff = b.cacheGroupIndex - a.cacheGroupIndex;
  if (indexDiff) return indexDiff;
  // 5. by number of modules (to be able to compare by identifier)
  const modulesA = a.modules;
  const modulesB = b.modules;
  const diff = modulesA.size - modulesB.size;
  if (diff) return diff;
  // 6. by module identifiers
  modulesA.sort();
  modulesB.sort();
  return compareModuleIterables(modulesA, modulesB);
};

Having found bestEntry, it is deleted from chunksInfoMap and then processed:

let bestEntryKey;
let bestEntry;
for (const pair of chunksInfoMap) {
    const key = pair[0];
    const info = pair[1];
    if (
        bestEntry === undefined ||
        compareEntries(bestEntry, info) < 0
    ) {
        bestEntry = info;
        bestEntryKey = key;
    }
}
const item = bestEntry;
chunksInfoMap.delete(bestEntryKey);

2.4.2 Processing the highest-priority item from chunksInfoMap

The highest-priority chunksInfoMap item is called bestEntry.

isExistingChunk

First comes the isExistingChunk check: if the cacheGroup has a name and an existing chunk is found under that name, that chunk is reused directly.

This involves the webpack.config.js option reuseExistingChunk (see the reuseExistingChunk example below): reuse an existing chunk instead of creating a new one.

Chunk 1 (named one): modules A
Chunk 2 (named two): no modules (removed by optimization)
Chunk 3 (named one~two): modules B, C

The above is with reuseExistingChunk=false;
below is with reuseExistingChunk=true:

Chunk 1 (named one): modules A
Chunk 2 (named two): modules B, C

If the cacheGroup has no name, item.chunks is scanned for a chunk that can be reused.

The end result is that the reusable chunk (if any) is assigned to newChunk, with isExistingChunk=true.

let chunkName = item.name;
let newChunk;
if (chunkName) {
    const chunkByName = compilation.namedChunks.get(chunkName);
    if (chunkByName !== undefined) {
        newChunk = chunkByName;
        const oldSize = item.chunks.size;
        item.chunks.delete(newChunk);
        isExistingChunk = item.chunks.size !== oldSize;
    }
} else if (item.cacheGroup.reuseExistingChunk) {
    outer: for (const chunk of item.chunks) {
        if (
            chunkGraph.getNumberOfChunkModules(chunk) !==
            item.modules.size
        ) {
            continue;
        }
        if (
            item.chunks.size > 1 &&
            chunkGraph.getNumberOfEntryModules(chunk) > 0
        ) {
            continue;
        }
        for (const module of item.modules) {
            if (!chunkGraph.isModuleInChunk(module, chunk)) {
                continue outer;
            }
        }
        if (!newChunk || !newChunk.name) {
            newChunk = chunk;
        } else if (
            chunk.name &&
            chunk.name.length < newChunk.name.length
        ) {
            newChunk = chunk;
        } else if (
            chunk.name &&
            chunk.name.length === newChunk.name.length &&
            chunk.name < newChunk.name
        ) {
            newChunk = chunk;
        }
    }
    if (newChunk) {
        item.chunks.delete(newChunk);
        chunkName = undefined;
        isExistingChunk = true;
        isReusedWithAllModules = true;
    }
}
enforceSizeThreshold

This concerns the webpack.config.js option splitChunks.enforceSizeThreshold:

when a chunk's size exceeds enforceSizeThreshold, the size thresholds and the other limits on splitting (minRemainingSize, maxAsyncRequests, maxInitialRequests) are ignored.

As the code below shows, if item.sizes[i] exceeds enforceSizeThreshold then enforced=true, and the upcoming maxInitialRequests and maxAsyncRequests checks are skipped.

const hasNonZeroSizes = sizes => {
  for (const key of Object.keys(sizes)) {
    if (sizes[key] > 0) return true;
  }
  return false;
};
const checkMinSize = (sizes, minSize) => {
  for (const key of Object.keys(minSize)) {
    const size = sizes[key];
    if (size === undefined || size === 0) continue;
    if (size < minSize[key]) return false;
  }
  return true;
};

// _conditionalEnforce: hasNonZeroSizes(enforceSizeThreshold)
const enforced =
  item.cacheGroup._conditionalEnforce &&
  checkMinSize(item.sizes, item.cacheGroup.enforceSizeThreshold);
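A quick check of the enforced flag with made-up numbers, assuming the production default enforceSizeThreshold of 50000 normalized into a per-type object:

// enforced is true only when every non-zero size type reaches the threshold
checkMinSize({ javascript: 60000 }, { javascript: 50000, unknown: 50000 }); // true  -> skip the request checks below
checkMinSize({ javascript: 40000 }, { javascript: 50000, unknown: 50000 }); // false -> apply the request checks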
maxInitialRequests and maxAsyncRequests

maxInitialRequests: the maximum number of parallel requests at an entry point.
maxAsyncRequests: the maximum number of parallel requests for on-demand (async) loading.
maxSize takes priority over maxInitialRequests/maxAsyncRequests; the actual order is maxInitialRequests/maxAsyncRequests < maxSize < minSize.

For each chunk in usedChunks, check whether the total number of chunkGroups holding the chunk already reaches cacheGroup.maxInitialRequests or cacheGroup.maxAsyncRequests; if the limit is exceeded, the chunk is removed.

const usedChunks = new Set(item.chunks);
if (
    !enforced &&
    (Number.isFinite(item.cacheGroup.maxInitialRequests) ||
        Number.isFinite(item.cacheGroup.maxAsyncRequests))
) {
    for (const chunk of usedChunks) {
        // respect max requests
        const maxRequests = chunk.isOnlyInitial()
            ? item.cacheGroup.maxInitialRequests
            : chunk.canBeInitial()
                ? Math.min(
                    item.cacheGroup.maxInitialRequests,
                    item.cacheGroup.maxAsyncRequests
                )
                : item.cacheGroup.maxAsyncRequests;
        if (
            isFinite(maxRequests) &&
            getRequests(chunk) >= maxRequests
        ) {
            usedChunks.delete(chunk);
        }
    }
}
chunkGraph.isModuleInChunk(module, chunk)

Some chunks are culled here: as splitting iterates, the modules of a bestEntry may already have been claimed by another bestEntry while its chunks have not been cleaned up yet. chunkGraph.isModuleInChunk checks whether a chunk is no longer needed by any module of the bestEntry; if so, the chunk is deleted directly.

outer: for (const chunk of usedChunks) {
    for (const module of item.modules) {
        if (chunkGraph.isModuleInChunk(module, chunk)) continue outer;
    }
    usedChunks.delete(chunk);
}

usedChunks.size < item.chunks.size

If usedChunks lost some chunks, addModuleToChunksInfoMap() is called again to register a new entry in chunksInfoMap, i.e. with the invalid chunks removed, the modules rejoin chunksInfoMap as a new cacheGroup combination.

The bestEntry currently being processed was already deleted from chunksInfoMap at the start of the iteration; here its cleaned-up version re-enters chunksInfoMap, and the loop continues.

// Were some (invalid) chunks removed from usedChunks?
// => readd all modules to the queue, as things could have been changed
if (usedChunks.size < item.chunks.size) {
    if (isExistingChunk) usedChunks.add(newChunk);
    if (usedChunks.size >= item.cacheGroup.minChunks) {
        const chunksArr = Array.from(usedChunks);
        for (const module of item.modules) {
            addModuleToChunksInfoMap(
                item.cacheGroup,
                item.cacheGroupIndex,
                chunksArr,
                getKey(usedChunks),
                module
            );
        }
    }
    continue;
}
minRemainingSize

The webpack.config.js option splitChunks.minRemainingSize only takes effect when a single chunk remains after splitting.

It avoids modules of size zero by ensuring that the minimum size of the chunk remaining after the split exceeds the limit; in "development" mode it defaults to 0.

const getViolatingMinSizes = (sizes, minSize) => {
  let list;
  for (const key of Object.keys(minSize)) {
    const size = sizes[key];
    if (size === undefined || size === 0) continue;
    if (size < minSize[key]) {
      if (list === undefined) list = [key];
      else list.push(key);
    }
  }
  return list;
};

// Validate minRemainingSize constraint when a single chunk is left over
if (
    !enforced &&
    item.cacheGroup._validateRemainingSize &&
    usedChunks.size === 1
) {
    const [chunk] = usedChunks;
    let chunkSizes = Object.create(null);
    for (const module of chunkGraph.getChunkModulesIterable(chunk)) {
        if (!item.modules.has(module)) {
            for (const type of module.getSourceTypes()) {
                chunkSizes[type] =
                    (chunkSizes[type] || 0) + module.size(type);
            }
        }
    }
    const violatingSizes = getViolatingMinSizes(
        chunkSizes,
        item.cacheGroup.minRemainingSize
    );
    if (violatingSizes !== undefined) {
        const oldModulesSize = item.modules.size;
        removeModulesWithSourceType(item, violatingSizes);
        if (
            item.modules.size > 0 &&
            item.modules.size !== oldModulesSize
        ) {
            // queue this item again to be processed again
            // without violating modules
            chunksInfoMap.set(bestEntryKey, item);
        }
        continue;
    }
}

As shown above, getViolatingMinSizes() finds the source types whose size is too small, then removeModulesWithSourceType() deletes the corresponding modules (code below) and updates the sizes, and the updated item is put back into chunksInfoMap.

At that point the chunksInfoMap entry under bestEntryKey no longer contains the too-small modules.

const removeModulesWithSourceType = (info, sourceTypes) => {
    for (const module of info.modules) {
        const types = module.getSourceTypes();
        if (sourceTypes.some(type => types.has(type))) {
            info.modules.delete(module);
            for (const type of types) {
                info.sizes[type] -= module.size(type);
            }
        }
    }
};

Creating newChunk and chunk.split(newChunk)

If no chunk can be reused, a new chunk is created with compilation.addChunk(chunkName).

// Create the new chunk if not reusing one
if (newChunk === undefined) {
    newChunk = compilation.addChunk(chunkName);
}
// Walk through all chunks
for (const chunk of usedChunks) {
    // Add graph connections for splitted chunk
    chunk.split(newChunk);
}

From the analysis above, isReusedWithAllModules=true means the cacheGroup has no name and a reusable chunk was found by scanning item.chunks; in that case connectChunkAndModule() is not needed to establish new connections, and all item.modules only have to be disconnected from item.chunks.

When isReusedWithAllModules=false, all item.modules are disconnected from item.chunks and connected to newChunk instead.

For example, newChunk is inserted into the ChunkGroup of the app3 entry chunk, establishing the dependency so that code generation later produces the right loading relationship: the app3 chunk can load newChunk, which after all was split off from the app1, app2, app3 and app4 chunks.

if (!isReusedWithAllModules) {
    // Add all modules to the new chunk
    for (const module of item.modules) {
        if (!module.chunkCondition(newChunk, compilation)) continue;
        // Add module to new chunk
        chunkGraph.connectChunkAndModule(newChunk, module);
        // Remove module from used chunks
        for (const chunk of usedChunks) {
            chunkGraph.disconnectChunkAndModule(chunk, module);
        }
    }
} else {
    // Remove all modules from used chunks
    for (const module of item.modules) {
        for (const chunk of usedChunks) {
            chunkGraph.disconnectChunkAndModule(chunk, module);
        }
    }
}

The new chunk is then recorded in maxSizeQueueMap, to be handled later in the maxSize stage.

if (
    Object.keys(item.cacheGroup.maxAsyncSize).length > 0 ||
    Object.keys(item.cacheGroup.maxInitialSize).length > 0
) {
    const oldMaxSizeSettings = maxSizeQueueMap.get(newChunk);
    maxSizeQueueMap.set(newChunk, {
        minSize: oldMaxSizeSettings
            ? combineSizes(
                oldMaxSizeSettings.minSize,
                item.cacheGroup._minSizeForMaxSize,
                Math.max
            )
            : item.cacheGroup.minSize,
        maxAsyncSize: oldMaxSizeSettings
            ? combineSizes(
                oldMaxSizeSettings.maxAsyncSize,
                item.cacheGroup.maxAsyncSize,
                Math.min
            )
            : item.cacheGroup.maxAsyncSize,
        maxInitialSize: oldMaxSizeSettings
            ? combineSizes(
                oldMaxSizeSettings.maxInitialSize,
                item.cacheGroup.maxInitialSize,
                Math.min
            )
            : item.cacheGroup.maxInitialSize,
        automaticNameDelimiter: item.cacheGroup.automaticNameDelimiter,
        keys: oldMaxSizeSettings
            ? oldMaxSizeSettings.keys.concat(item.cacheGroup.key)
            : [item.cacheGroup.key]
    });
}
Removing info.modules[i] from the other chunksInfoMap items

The highest-priority chunksInfoMap item has now been processed; next, check whether the info.chunks of the other chunksInfoMap items overlap with the chunks of the item just processed.

If they do, compare each info.modules with item.modules and delete the shared modules from the other items' info.modules.

After deletion, check whether info.modules.size has reached 0 and run checkMinSizeReduction(), then decide whether the whole cacheGroup entry should be dropped.

checkMinSizeReduction() again concerns cacheGroup.minSizeReduction, the minimum reduction (in bytes) of the main chunks (bundles) required to generate a chunk: if splitting off the chunk would not shrink the main bundles by the given number of bytes, the split is not performed, even if splitChunks.minSize is satisfied.

const isOverlap = (a, b) => {
  for (const item of a) {
    if (b.has(item)) return true;
  }
  return false;
};

// remove all modules from other entries and update size
for (const [key, info] of chunksInfoMap) {
    if (isOverlap(info.chunks, usedChunks)) {
        // update modules and total size
        // may remove it from the map when < minSize
        let updated = false;
        for (const module of item.modules) {
            if (info.modules.has(module)) {
                // remove module
                info.modules.delete(module);
                // update size
                for (const key of module.getSourceTypes()) {
                    info.sizes[key] -= module.size(key);
                }
                updated = true;
            }
        }
        if (updated) {
            if (info.modules.size === 0) {
                chunksInfoMap.delete(key);
                continue;
            }
            if (
                removeMinSizeViolatingModules(info) ||
                !checkMinSizeReduction(
                    info.sizes,
                    info.cacheGroup.minSizeReduction,
                    info.chunks.size
                )
            ) {
                chunksInfoMap.delete(key);
                continue;
            }
        }
    }
}

2.5 maxSize stage: split oversized chunks again according to maxSize

compilation.hooks.optimizeChunks.tap(
    {
        name: "SplitChunksPlugin",
        stage: STAGE_ADVANCED
    },
    chunks => {
        logger.time("prepare");
        //...
        logger.timeEnd("prepare");

        logger.time("modules");
        for (const module of compilation.modules) { /* ... */ }
        logger.timeEnd("modules");

        logger.time("queue");
        for (const [key, info] of chunksInfoMap) { /* ... */ }
        while (chunksInfoMap.size > 0) { /* ... */ }
        logger.timeEnd("queue");

        logger.time("maxSize");
        for (const chunk of Array.from(compilation.chunks)) { /* ... */ }
        logger.timeEnd("maxSize");
    }
);

maxSize takes priority over maxInitialRequests/maxAsyncRequests; the actual order is maxInitialRequests/maxAsyncRequests < maxSize < minSize.
Using maxSize tells webpack to try to split chunks bigger than maxSize bytes into smaller parts; those parts are at least minSize in size (next to maxSize).
The maxSize option is intended to be used with HTTP/2 and long-term caching: it increases the request count for better caching, and it can also be used to decrease file size for faster rebuilds.

After the chunks produced by the cacheGroups above have been merged with the entry and async chunks, maxSize is validated; chunks that came out too large have to be split again!

For example, the configuration below declares maxSize: 50:

splitChunks: {
    minSize: 1,
    chunks: 'all',
    maxInitialRequests: 10,
    maxAsyncRequests: 10,
    maxSize: 50,
    cacheGroups: {
        test3: {
            chunks: 'all',
            minChunks: 3,
            name: "test3",
            priority: 3
        },
        test2: {
            chunks: 'all',
            minChunks: 2,
            name: "test2",
            priority: 2
        }
    }
}

In the final output, instead of just app1.js and app2.js, the maxSize limit splits them into three app1-xxx.js files and two app2-xxx.js files.


Open questions

  • How does maxSize split a chunk: per NormalModule?
  • If maxSize is very small, is there a limit on how many pieces are produced?
  • How are the split files merged back at runtime: through runtime code, or through the chunkGroup (can one chunkGroup contain several chunks)?
  • How does maxSize handle uneven splits: when splitting into two parts, how are both kept above minSize yet below maxSize?

The source-code analysis below centers on these questions.


As shown below, deterministicGroupingForModules splits the chunk and returns several results.

If results.length <= 1, no split is needed and nothing happens.

If results.length > 1:

  • the split-off newPart is inserted into the ChunkGroups of the original chunk
  • every new chunk is connected to its modules: chunkGraph.connectChunkAndModule(newPart, module)
  • the original big chunk is disconnected from the modules that moved into newPart: chunkGraph.disconnectChunkAndModule(chunk, module)
//node_modules/webpack/lib/optimize/SplitChunksPlugin.js
for (const chunk of Array.from(compilation.chunks)) {
    //...

    const results = deterministicGroupingForModules({...});
    if (results.length <= 1) {
        continue;
    }
    for (let i = 0; i < results.length; i++) {
        const group = results[i];
        //...
        if (i !== results.length - 1) {
            const newPart = compilation.addChunk(name);
            chunk.split(newPart);
            newPart.chunkReason = chunk.chunkReason;
            // Add all modules to the new chunk
            for (const module of group.items) {
                if (!module.chunkCondition(newPart, compilation)) {
                    continue;
                }
                // Add module to new chunk
                chunkGraph.connectChunkAndModule(newPart, module);
                // Remove module from used chunks
                chunkGraph.disconnectChunkAndModule(chunk, module);
            }
        } else {
            // change the chunk to be a part
            chunk.name = name;
        }
    }
}

2.5.1 deterministicGroupingForModules: the core splitting method

The smallest unit of splitting is the NormalModule; if a single NormalModule is already very large, it directly becomes its own group, i.e. a new chunk.

const nodes = Array.from(
    items,
    item => new Node(item, getKey(item), getSize(item))
);
for (const node of nodes) {
    if (isTooBig(node.size, maxSize) && !isTooSmall(node.size, minSize)) {
        result.push(new Group([node], []));
    } else {
        //...
    }
}
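isTooBig and isTooSmall check every size type independently; below is a simplified sketch of their logic, paraphrased from webpack's deterministicGrouping.js (slightly shortened, so treat it as an approximation rather than the exact source):

const isTooSmall = (size, minSize) => {
    for (const key of Object.keys(size)) {
        const s = size[key];
        if (s === 0) continue;
        // too small if any non-zero type is below its minimum
        if (typeof minSize[key] === "number" && s < minSize[key]) return true;
    }
    return false;
};
const isTooBig = (size, maxSize) => {
    for (const key of Object.keys(size)) {
        const s = size[key];
        if (s === 0) continue;
        // too big if any non-zero type exceeds its maximum
        if (typeof maxSize[key] === "number" && s > maxSize[key]) return true;
    }
    return false;
};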

If a single NormalModule is smaller than maxSize, it is added to initialNodes.

for (const node of nodes) {
    if (isTooBig(node.size, maxSize) && !isTooSmall(node.size, minSize)) {
        result.push(new Group([node], []));
    } else {
        initialNodes.push(node);
    }
}

initialNodes is then processed: each initialNodes[i] is below maxSize, but several of them combined may well exceed maxSize, so the cases have to be distinguished.

if (initialNodes.length > 0) {
    const initialGroup = new Group(initialNodes, getSimilarities(initialNodes));
    if (initialGroup.nodes.length > 0) {
        const queue = [initialGroup];

        while (queue.length) {
            const group = queue.pop();
            // Step 1: check whether the group as a whole is still below maxSize
            // Step 2: removeProblematicNodes(), then reprocess

            // Step 3: grow a left part and a right part such that leftSize > minSize && rightSize > minSize
            // Step 3-1: check whether the parts overlap, i.e. left - 1 > right
            // Step 3-2: check whether elements between left and right belong to neither part, i.e. a gap remains

            // Step 4: create a Group for each of the two parts and push them onto the queue for reprocessing
        }
    }
    // Step 5: assign keys and return the final data structure
}

Step 1: check whether any size type exceeds maxSize

If no type's accumulated size exceeds maxSize, there is nothing to split; the group is pushed into result directly.

group.size here is the accumulated size of all the nodes, kept per source type.

if (initialNodes.length > 0) {
    const initialGroup = new Group(initialNodes, getSimilarities(initialNodes));
    if (initialGroup.nodes.length > 0) {
        const queue = [initialGroup];
        while (queue.length) {
            const group = queue.pop();
            // Step 1: check whether any size type exceeds maxSize
            if (!isTooBig(group.size, maxSize)) {
                result.push(group);
                continue;
            }
            // Step 2: removeProblematicNodes() finds types below minSize

            // Step 3: grow the left and right parts so that leftSize > minSize && rightSize > minSize
            // Step 3-1: check whether the parts overlap, i.e. left - 1 > right
            // Step 3-2: check whether elements between left and right belong to neither part

            // Step 4: create a Group for each part and push them back onto the queue
        }
    }
}
Step 2: removeProblematicNodes() finds types below minSize

If some type is below minSize, try to pull the nodes carrying that type out of the group and merge them into another Group, then reprocess the group.

This helper shows up repeatedly in the later steps, so it deserves a close look; see Step 2 below.

if (initialNodes.length > 0) {
    const initialGroup = new Group(initialNodes, getSimilarities(initialNodes));
    if (initialGroup.nodes.length > 0) {
        const queue = [initialGroup];
        while (queue.length) {
            const group = queue.pop();
            // Step 1: check whether any size type exceeds maxSize
            if (!isTooBig(group.size, maxSize)) {
                result.push(group);
                continue;
            }
            // Step 2: removeProblematicNodes() finds types below minSize
            if (removeProblematicNodes(group)) {
                // This changed something, so we try this group again
                queue.push(group);
                continue;
            }

            // Step 3: grow the left and right parts so that leftSize > minSize && rightSize > minSize
            // Step 3-1: check whether the parts overlap, i.e. left - 1 > right
            // Step 3-2: check whether elements between left and right belong to neither part

            // Step 4: create a Group for each part and push them back onto the queue
        }
    }
}

getTooSmallTypes(): given size = {javascript: 125} and minSize = {javascript: 61, unknown: 61}, it returns the types that currently fail the minimum, e.g. types = ["javascript"].

const removeProblematicNodes = (group, consideredSize = group.size) => {
    const problemTypes = getTooSmallTypes(consideredSize, minSize);
    if (problemTypes.size > 0) {
        //...
        return true;
    } else return false;
};
const getTooSmallTypes = (size, minSize) => {
    const types = new Set();
    for (const key of Object.keys(size)) {
        const s = size[key];
        if (s === 0) continue;
        const minSizeValue = minSize[key];
        if (typeof minSizeValue === "number") {
            if (s < minSizeValue) types.add(key);
        }
    }
    return types;
};

getTooSmallTypes() gives us problemTypes, the array of types in the group that fail minSize.

getNumberOfMatchingSizeTypes(): given node.size and problemTypes, decides whether the node is problematic, i.e. whether it carries any type that fails minSize.

Via group.popNodes + getNumberOfMatchingSizeTypes() we pop out the problematic nodes problemNodes, and via result + getNumberOfMatchingSizeTypes() we collect the groups that already satisfy minSize/maxSize, possibleResultGroups.

const removeProblematicNodes = (group, consideredSize = group.size) => {
    const problemTypes = getTooSmallTypes(consideredSize, minSize);
    if (problemTypes.size > 0) {
        const problemNodes = group.popNodes(
            n => getNumberOfMatchingSizeTypes(n.size, problemTypes) > 0
        );
        if (problemNodes === undefined) return false;
        // Only merge it with result nodes that have the problematic size type
        const possibleResultGroups = result.filter(
            n => getNumberOfMatchingSizeTypes(n.size, problemTypes) > 0
        );
    } else return false;
};
const getNumberOfMatchingSizeTypes = (size, types) => {
    let i = 0;
    for (const key of Object.keys(size)) {
        if (size[key] !== 0 && types.has(key)) i++;
    }
    return i;
};
// for (const node of nodes) {
//     if (isTooBig(node.size, maxSize) && !isTooSmall(node.size, minSize)) {
//         result.push(new Group([node], []));
//     } else {
//         initialNodes.push(node);
//     }
// }

Why collect possibleResultGroups, the groups that already satisfy minSize/maxSize? From the code below: after collecting them we filter further and pick the group that best matches the problemTypes, called bestGroup.

const removeProblematicNodes = (group, consideredSize = group.size) => {
    const problemTypes = getTooSmallTypes(consideredSize, minSize);
    if (problemTypes.size > 0) {
        const problemNodes = group.popNodes(
            n => getNumberOfMatchingSizeTypes(n.size, problemTypes) > 0
        );
        if (problemNodes === undefined) return false;
        const possibleResultGroups = result.filter(
            n => getNumberOfMatchingSizeTypes(n.size, problemTypes) > 0
        );
        if (possibleResultGroups.length > 0) {
            const bestGroup = possibleResultGroups.reduce((min, group) => {
                const minMatches = getNumberOfMatchingSizeTypes(min, problemTypes);
                const groupMatches = getNumberOfMatchingSizeTypes(
                    group,
                    problemTypes
                );
                if (minMatches !== groupMatches)
                    return minMatches < groupMatches ? group : min;
                if (
                    selectiveSizeSum(min.size, problemTypes) >
                    selectiveSizeSum(group.size, problemTypes)
                )
                    return group;
                return min;
            });
            for (const node of problemNodes) bestGroup.nodes.push(node);
            bestGroup.nodes.sort((a, b) => {
                if (a.key < b.key) return -1;
                if (a.key > b.key) return 1;
                return 0;
            });
        } else {
            //...
        }
        return true;
    } else return false;
};
// a method of Group
popNodes(filter) {
    debugger;
    const newNodes = [];
    const newSimilarities = [];
    const resultNodes = [];
    let lastNode;
    for (let i = 0; i < this.nodes.length; i++) {
        const node = this.nodes[i];
        if (filter(node)) {
            resultNodes.push(node);
        } else {
            if (newNodes.length > 0) {
                newSimilarities.push(
                    lastNode === this.nodes[i - 1]
                        ? this.similarities[i - 1]
                        : similarity(lastNode.key, node.key)
                );
            }
            newNodes.push(node);
            lastNode = node;
        }
    }
    if (resultNodes.length === this.nodes.length) return undefined;
    this.nodes = newNodes;
    this.similarities = newSimilarities;
    this.size = sumSize(newNodes);
    return resultNodes;
}

The nodes failing minSize are then merged into a group that itself satisfies minSize/maxSize; the core is the single line below. Why do that?

for (const node of problemNodes) bestGroup.nodes.push(node);

Because these nodes cannot satisfy minSize on their own (they are simply too small overall), merging them into the group currently best placed to absorb them gives them the minSize they need.

Of course, there may be no group able to absorb them; then a new Group() has to be created to take them.

if (possibleResultGroups.length > 0) {
    //...
} else {
    // There are no other nodes with the same size types
    // We create a new group and have to accept that it's smaller than minSize
    result.push(new Group(problemNodes, null));
}
return true;

Summary

removeProblematicNodes() receives group and consideredSize, an object compared against the minSize object to obtain the problem-type array problemTypes; it then tries to pop the nodes of those types out of the given group and merge them into an already-final result group, or into a new Group().

If the popped nodes would be the whole group, it returns false and performs no merge or creation.
If no type in the group is below minSize, it returns false and performs no merge or creation.
If no existing result group matches the problem types, the popped nodes go into a new Group().


Open question

  • after bestGroup.nodes.push(node), can the group end up exceeding maxSize?

Step 3: grow the left and right parts so that leftSize > minSize && rightSize > minSize

Step 1 decides whether the overall size even calls for a maxSize split, and Step 2 checks whether some types fail minSize; if so they are merged into another group or a new Group(), and in either case the old/new groups are pushed back onto the queue for reprocessing.

After Steps 1 and 2 have dealt with minSize, Step 3 builds the left and right regions, both of which must be at least minSize.

//      left v   v right
// [O O O] O O O [O O O]
// ^^^^^^^^^ leftSize
//       rightSize ^^^^^^^^^
// leftSize > minSize
// rightSize > minSize
//                      r     l
// Perfect split: [O O O] [O O O]
//                right === left - 1
let left = 1;
let leftSize = Object.create(null);
addSizeTo(leftSize, group.nodes[0].size);
while (left < group.nodes.length && isTooSmall(leftSize, minSize)) {
  addSizeTo(leftSize, group.nodes[left].size);
  left++;
}
let right = group.nodes.length - 2;
let rightSize = Object.create(null);
addSizeTo(rightSize, group.nodes[group.nodes.length - 1].size);
while (right >= 0 && isTooSmall(rightSize, minSize)) {
  addSizeTo(rightSize, group.nodes[right].size);
  right--;
}
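A minimal trace of this two-pointer pass, assuming a single javascript size type, four nodes of 30 bytes each, and minSize = 50 (made-up numbers, sizes flattened to plain integers):

const nodeSizes = [30, 30, 30, 30];
const minSize = 50;

let left = 1;
let leftSize = nodeSizes[0];
while (left < nodeSizes.length && leftSize < minSize) {
    leftSize += nodeSizes[left];
    left++;
}

let right = nodeSizes.length - 2;
let rightSize = nodeSizes[nodeSizes.length - 1];
while (right >= 0 && rightSize < minSize) {
    rightSize += nodeSizes[right];
    right--;
}

console.log(left, right); // 2 1 -> right === left - 1: a perfect split [n0, n1] | [n2, n3]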

After this pass, three cases arise:

  • right === left - 1: a perfect split, nothing more to do
  • right < left - 1: the two regions overlap
  • right > left - 1: there is an unassigned region between them

right < left - 1

Compare the positions of left and right, take the side that kept more nodes, and subtract the size of the node it absorbed last; the resulting prevSize necessarily fails minSize, because as shown above each side called addSizeTo exactly until leftArea and rightArea satisfied minSize.

removeProblematicNodes() is then called with the current group and prevSize; comparing prevSize with minSize yields the problem-type array problemTypes, from which a sub-collection of the group is derived (part of the group, or undefined).

If the nodes matching problemTypes turn out to be the whole group, the subset cannot be merged into other chunks, and removeProblematicNodes() returns false.

If the subset is a proper part of the group, it is merged into the most suitable of the already-formed result groups (the one matching the most problem types), and the remaining part of the group that still satisfies minSize is pushed back onto the queue for reprocessing.

if (left - 1 > right) {
    let prevSize;
    // the right region kept at least as many nodes as the left region,
    // so shrink the right region, e.g.
    // a b c d e f g
    //   r   l
    if (right < group.nodes.length - left) {
        subtractSizeFrom(rightSize, group.nodes[right + 1].size);
        prevSize = rightSize;
    } else {
        subtractSizeFrom(leftSize, group.nodes[left - 1].size);
        prevSize = leftSize;
    }
    if (removeProblematicNodes(group, prevSize)) {
        queue.push(group);
        continue;
    }
    // can't split group while holding minSize
    // because minSize is preferred of maxSize we return
    // the problematic nodes as result here even while it's too big
    // To avoid this make sure maxSize > minSize * 3
    result.push(group);
    continue;
}

From Step 1 we know the group definitely has some type exceeding maxSize, and after Step 2's removeProblematicNodes() we either could not remove the sub-minSize types, or there were none to remove.

left - 1 > right means Step 2 could not get rid of the sub-minSize data, so Step 3 tries to remove it once more. If that also fails, then because minSize has priority over maxSize, forcing a leftArea/rightArea split would violate minSize even though some type exceeds maxSize; maxSize is therefore ignored and the current group becomes a single chunk.

A concrete example

In the example below, the app1 chunk exceeds maxSize=124 but does satisfy minSize. Force-splitting it into two chunks might satisfy maxSize, but minSize would then be violated; since minSize has priority over maxSize, maxSize is sacrificed in favor of minSize.

This is only one fairly common case; there are certainly others. Due to limited time the author did not dig deeper into this branch; please refer to other articles for a fuller treatment of the left - 1 > right handling in Step 3.


right > left - 1

There is an unused region between the two parts, so similarity is used to find the best split point: the position with the smallest similarity splits the nodes into a left half and a right half.

pos starts at left and increases within [left, right]; rightSize is the sum over [pos, nodes.length - 1]. As pos increases, leftSize keeps growing and rightSize keeps shrinking, each checked against minSize; among the valid positions, the smallest value in group.similarities, i.e. the two adjacent keys that differ the most (the file paths furthest apart), marks the cut.

if (left <= right) {
    let best = -1;
    let bestSimilarity = Infinity;
    let pos = left;
    let rightSize = sumSize(group.nodes.slice(pos));

    while (pos <= right + 1) {
        const similarity = group.similarities[pos - 1];
        if (
            similarity < bestSimilarity &&
            !isTooSmall(leftSize, minSize) &&
            !isTooSmall(rightSize, minSize)
        ) {
            best = pos;
            bestSimilarity = similarity;
        }
        addSizeTo(leftSize, group.nodes[pos].size);
        subtractSizeFrom(rightSize, group.nodes[pos].size);
        pos++;
    }
    if (best < 0) {
        result.push(group);
        continue;
    }
    left = best;
    right = best - 1;
}

What does group.similarities[pos - 1] mean?

similarity() compares two adjacent node.keys character by character, e.g.:

  • two keys that are almost identical: ca - cb = 5, so 10 - Math.abs(ca - cb) = 5
  • two more different keys: ca - cb = 6, so 10 - Math.abs(ca - cb) = 4
  • two wildly different keys: 10 - Math.abs(ca - cb) < 0, so Math.max(0, 10 - Math.abs(ca - cb)) = 0

So the closer the keys of two adjacent nodes are, the larger similarities[x] is.

const initialGroup = new Group(initialNodes, getSimilarities(initialNodes))
const getSimilarities = nodes => {
    // calculate similarities between lexically adjacent nodes
    /** @type {number[]} */
    const similarities = [];
    let last = undefined;
    for (const node of nodes) {
        if (last !== undefined) {
            similarities.push(similarity(last.key, node.key));
        }
        last = node;
    }
    return similarities;
};
const similarity = (a, b) => {
    const l = Math.min(a.length, b.length);
    let dist = 0;
    for (let i = 0; i < l; i++) {
        const ca = a.charCodeAt(i);
        const cb = b.charCodeAt(i);
        dist += Math.max(0, 10 - Math.abs(ca - cb));
    }
    return dist;
};

And at the very beginning, the nodes were already sorted by node.key:

const initialNodes = [];
// lexically ordering of keys
nodes.sort((a, b) => {
  if (a.key < b.key) return -1;
  if (a.key > b.key) return 1;
  return 0;
});

So using similarity to find the best split point, i.e. the smallest similarity, amounts to finding the index where the keys of two adjacent nodes differ the most.

A concrete example

How is node.key generated?

From getKey() below: first the relative path is obtained, name="./src/entry1.js", then hashFilename() produces a hash, which together form

fullKey="./src/entry1.js-6a89fa05", which requestToId() then converts into key="src_entry1_js-6a89fa05"

// node_modules/webpack/lib/optimize/SplitChunksPlugin.js
const results = deterministicGroupingForModules({
    //...
    getKey(module) {
        const cache = getKeyCache.get(module);
        if (cache !== undefined) return cache;
        const ident = cachedMakePathsRelative(module.identifier());
        const nameForCondition =
            module.nameForCondition && module.nameForCondition();
        const name = nameForCondition
            ? cachedMakePathsRelative(nameForCondition)
            : ident.replace(/^.*!|\?[^?!]*$/g, "");
        const fullKey =
            name +
            automaticNameDelimiter +
            hashFilename(ident, outputOptions);
        const key = requestToId(fullKey);
        getKeyCache.set(module, key);
        return key;
    }
});
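hashFilename() is essentially a truncated hash of the module identifier, while requestToId() turns the path into an id-safe string. A minimal sketch of requestToId, paraphrased from memory of webpack's lib/ids/IdHelpers.js (treat it as an assumption and verify the exact regex against your webpack version):

// Paraphrased from webpack lib/ids/IdHelpers.js -- a sketch, not the authoritative source
const requestToId = request =>
    request
        .replace(/^(\.\.?\/)+/, "") // strip leading ./ and ../
        .replace(/(^[.-]|[^a-zA-Z0-9_-])+/g, "_"); // other non-id characters -> "_"

console.log(requestToId("./src/entry1.js-6a89fa05")); // "src_entry1_js-6a89fa05"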

In essence this seems to be finding the point where the file paths differ the most: for example, if 5 modules live under folder a and 3 modules live under folder b, that boundary becomes the cut point, folder a going to leftArea and folder b going to rightArea.

Strings are compared character by character from the start: the first differing character decides which string is larger (see the quick check after the list), for example

  • "Z"> "A"
  • "ABC"> "ABA"
  • "ABC"< "AC"
  • "ABC"> "AB"

Below we mock up a series of nodes directly, hard-code left and right, and strip out the corresponding leftSize / rightSize handling.

size only matters for checking whether a cut satisfies minSize; the example below is purely about simulating how similarity finds the best split point (cutting at the smallest similarity into left and right halves), so size is ignored for now.

class Group {
    constructor(nodes, similarities, size) {
        // size is ignored in this simulation (see the note above)
        this.nodes = nodes;
        this.similarities = similarities;
        this.key = undefined;
    }
}
const getSimilarities = nodes => {
    const similarities = [];
    let last = undefined;
    for (const node of nodes) {
        if (last !== undefined) {
            similarities.push(similarity(last.key, node.key));
        }
        last = node;
    }
    return similarities;
};
const similarity = (a, b) => {
    const l = Math.min(a.length, b.length);
    let dist = 0;
    for (let i = 0; i < l; i++) {
        const ca = a.charCodeAt(i);
        const cb = b.charCodeAt(i);
        dist += Math.max(0, 10 - Math.abs(ca - cb));
    }
    return dist;
};
function test() {
    const initialNodes = [
        {key: "src2_entry1_js-6a89fa02"},
        {key: "src3_entry2_js-3a33ff02"},
        {key: "src2_entry3_js-6aaafa01"},
        {key: "src1_entry0_js-ea33aa12"},
        {key: "src1_entry1_js-6a89fa02"},
        {key: "src1_entry2_js-ea33aa13"},
        {key: "src1_entry3_js-ea33aa14"}
    ];
    initialNodes.sort((a, b) => {
        if (a.key < b.key) return -1;
        if (a.key > b.key) return 1;
        return 0;
    });
    const initialGroup = new Group(initialNodes, getSimilarities(initialNodes));
    console.info(initialGroup);
    let left = 1;
    let right = 4;
    if (left <= right) {
        let best = -1;
        let bestSimilarity = Infinity;
        let pos = left;
        while (pos <= right + 1) {
            const similarity = initialGroup.similarities[pos - 1];
            if (similarity < bestSimilarity) {
                best = pos;
                bestSimilarity = similarity;
            }
            pos++;
        }
        left = best;
        right = best - 1;
    }
    console.warn("left", left);
    console.warn("right", right);
}

test();

The final result: the similarities between files in different folders are the smallest, so the nodes get split into a left region and a right region along the folder boundary.
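The original article shows a screenshot of the run here; since it cannot be reproduced, this is a hand-computed trace of test() above (my own arithmetic on the same data):

// similarities (hand-computed): [202, 201, 228, 198, 207, 199]
// scanning pos = 1..5 looks at similarities[0..4];
// the minimum is 198, at pos = 4, i.e. the boundary between
// "src1_entry3_js-ea33aa14" and "src2_entry1_js-6a89fa02"
// => best = 4, so the console prints: left 4, right 3
// leftArea  = nodes[0..3]  (everything under src1)
// rightArea = nodes[4..6]  (src2 and src3)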

Although the result here happens to split along the folder boundary, that is not guaranteed in general. The author has not dug into why similarities is used as the splitting criterion, so please consult other articles; this example only reflects the author's own reading of the right > left - 1 flow.


Step 4: build separate Group objects for the left and right intervals, then push them onto queue for reprocessing

Using the left and right determined in the steps above, a new Group() is created for each of leftArea and rightArea and pushed into queue, so both subgroups go through the whole process again to see whether they need splitting further.

const rightNodes = [group.nodes[right + 1]];
/** @type {number[]} */
const rightSimilarities = [];
for (let i = right + 2; i < group.nodes.length; i++) {
    rightSimilarities.push(group.similarities[i - 1]);
    rightNodes.push(group.nodes[i]);
}
queue.push(new Group(rightNodes, rightSimilarities));
const leftNodes = [group.nodes[0]];
/** @type {number[]} */
const leftSimilarities = [];
for (let i = 1; i < left; i++) {
    leftSimilarities.push(group.similarities[i - 1]);
    leftNodes.push(group.nodes[i]);
}
queue.push(new Group(leftNodes, leftSimilarities));
Step 5: assign keys and build the final structure to return
result.sort((a, b) => {
    if (a.nodes[0].key < b.nodes[0].key) return -1;
    if (a.nodes[0].key > b.nodes[0].key) return 1;
    return 0;
});
// give every group a name
const usedNames = new Set();
for (let i = 0; i < result.length; i++) {
    const group = result[i];
    if (group.nodes.length === 1) {
        group.key = group.nodes[0].key;
    } else {
        const first = group.nodes[0];
        const last = group.nodes[group.nodes.length - 1];
        const name = getName(first.key, last.key, usedNames);
        group.key = name;
    }
}
// return the results
return result.map(group => {
    return {
        key: group.key,
        items: group.nodes.map(node => node.item),
        size: group.size
    };
});
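So each returned element pairs a group key with its modules and per-type sizes; illustratively (the field names come from the code above, the values are invented):

// Shape of one result entry (values invented for illustration):
// {
//   key: "src1_entry0_js-ea33aa12", // a single-node group reuses the node key
//   items: [NormalModule],          // node.item is the module itself here
//   size: { javascript: 12345 }     // byte sizes per source type
// }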

2.6 A full worked example

2.6.1 Forming the new chunk test3

In the worked example above, the initial chunksInfoMap is as shown below. compareEntries() picks out the cacheGroup with bestEntry=test3, and after a series of parameter checks we examine whether the info.chunks of the other chunksInfoMap[j] entries contain chunks belonging to the currently highest-priority chunksInfoMap[i].

bestEntry=test3 owns the chunks app1, app2, app3 and app4, which already covers every entry chunk, so every chunksInfoMap[j] must compare its info.modules against item.modules, deleting the matching info.modules[i] from those other chunksInfoMap[j] entries.

After this pass for the highest-priority cacheGroup test3, the minChunks=3 modules common___g, js-cookie and voca are moved into newChunk, and these three NormalModules are deleted from every other cacheGroup.

Then the following code runs, deleting the corresponding keys from chunksInfoMap:

if (info.modules.size === 0) {
    chunksInfoMap.delete(key);
    continue;
}

In the end only 5 keys remain in chunksInfoMap, as shown below

2.6.2 Forming the new chunk test2

With chunk test3 split off, the next loop iteration begins, and compareEntries() picks out the cacheGroup with bestEntry=test2.

After going through

  • isExistingChunk
  • maxInitialRequests and maxAsyncRequests

we reach the chunkGraph.isModuleInChunk step:

outer: for (const chunk of usedChunks) {
    for (const module of item.modules) {
        if (chunkGraph.isModuleInChunk(module, chunk)) continue outer;
    }
    usedChunks.delete(chunk);
}

As the figure below shows, in bestEntry=test2 only the module loadsh remains in modules, yet chunks still contains app1, app2, app3 and app4.

And as the next figure shows, loadsh is only owned by app1 and app2, so the code block above triggers usedChunks.delete(chunk) and removes app3 and app4.

Why did cacheGroup=test2 own app3 and app4 in the first place?

Because during the modules phase (iterating compilation.modules and building chunksInfoMap from the cacheGroups), every module is visited and tried against every cacheGroup; anything meeting cacheGroup.minChunks=2 was added to cacheGroup=test2.

Then why does cacheGroup=test2 no longer match app3 and app4?

Because cacheGroup=test3 has higher priority than cacheGroup=test2 and has already absorbed the modules common_g, js-cookie and voca into chunk=test3. That leaves cacheGroup=test2 with only the module loadsh, which only needs the two chunks app1 and app2, so the now-useless app3 and app4 have to be removed.


Once the chunkGraph.isModuleInChunk step is done, we reach the usedChunks.size < item.chunks.size check. Since the previous step deleted two elements from usedChunks, the condition holds, and this bestEntry is re-added to chunksInfoMap to be processed again.

// Were some (invalid) chunks removed from usedChunks?
// => readd all modules to the queue, as things could have been changed
if (usedChunks.size < item.chunks.size) {
    if (isExistingChunk) usedChunks.add(newChunk);
    if (usedChunks.size >= item.cacheGroup.minChunks) {
        const chunksArr = Array.from(usedChunks);
        for (const module of item.modules) {
            addModuleToChunksInfoMap(
                item.cacheGroup,
                item.cacheGroupIndex,
                chunksArr,
                getKey(usedChunks),
                module
            );
        }
    }
    continue;
}

After the re-add, chunksInfoMap looks like the following: test2 is down to a single module plus its two corresponding chunks.

This triggers the handling for the new chunk test2 once more.

2.6.3 Forming the new chunk test2, second pass

The whole flow now runs again:

  • isExistingChunk
  • maxInitialRequests and maxAsyncRequests
  • chunkGraph.isModuleInChunk
  • usedChunks.size < item.chunks.size no longer holds
  • the minRemainingSize check passes

which finally triggers creating newChunk and the chunk.split(newChunk) logic:

// Create the new chunk if not reusing one
if (newChunk === undefined) {
    newChunk = compilation.addChunk(chunkName);
}
// Walk through all chunks
for (const chunk of usedChunks) {
    // Add graph connections for splitted chunk
    chunk.split(newChunk);
}
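For reference, chunk.split() connects the new chunk to every ChunkGroup of the original chunk; roughly, in webpack 5's lib/Chunk.js it looks like this (paraphrased from memory, check your version):

// webpack lib/Chunk.js (paraphrased -- verify against your version)
split(newChunk) {
    for (const chunkGroup of this._groups) {
        chunkGroup.insertChunk(newChunk, this);
        newChunk.addGroup(chunkGroup);
    }
    for (const idHint of this.idNameHints) {
        newChunk.idNameHints.add(idHint);
    }
    newChunk.runtime = mergeRuntime(newChunk.runtime, this.runtime);
}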

Then the freshly split-out modules are deleted from the info.modules of every other chunksInfoMap item:

const isOverlap = (a, b) => {
    for (const item of a) {
        if (b.has(item)) return true;
    }
    return false;
};

// remove all modules from other entries and update size
for (const [key, info] of chunksInfoMap) {
    if (isOverlap(info.chunks, usedChunks)) {
        // update modules and total size
        // may remove it from the map when < minSize
        let updated = false;
        for (const module of item.modules) {
            if (info.modules.has(module)) {
                // remove module
                info.modules.delete(module);
                // update size
                for (const key of module.getSourceTypes()) {
                    info.sizes[key] -= module.size(key);
                }
                updated = true;
            }
        }
        if (updated) {
            if (info.modules.size === 0) {
                chunksInfoMap.delete(key);
                continue;
            }
            if (removeMinSizeViolatingModules(info) ||
                !checkMinSizeReduction(
                    info.sizes,
                    info.cacheGroup.minSizeReduction,
                    info.chunks.size
                )
            ) {
                chunksInfoMap.delete(key);
                continue;
            }
        }
    }
}

As the figure below shows, app1 and app2 are the chunks to remove, so every remaining chunksInfoMap item ends up deleted. This concludes the whole queue phase (iterating chunksInfoMap and reorganizing chunks according to the rules), having produced the two new chunks test3 and test2.

Once the queue phase ends, the maxSize phase begins.

2.6.4 Checking whether maxSize is configured and whether chunks need splitting

See the detailed example in the maxSize phase section above; it is not repeated here.

3. codeGeneration: module transpilation

For reasons of length, the detailed analysis is covered in the next article,《「Webpack5 源码」seal 阶段剖析(三)》.

References

  1. 精读 Webpack SplitChunksPlugin 插件源码

Other engineering articles

  1. 「Webpack5 源码」热更新 HRM 流程浅析
  2. 「Webpack5 源码」make 阶段(流程图)剖析
  3. 「Webpack5 源码」enhanced-resolve 门路解析库源码剖析
  4. 「vite4 源码」dev 模式整体流程浅析(一)
  5. 「vite4 源码」dev 模式整体流程浅析(二)
