This article is based on webpack version 5.74.0.

Preface

In the previous article, 「Webpack5源码」seal阶段(流程图)剖析(一), we analyzed the logic of the seal stage, mainly covering:

- `new ChunkGraph()`
- iterating over `this.entries` to create each `Chunk` and `ChunkGroup`
- the overall flow of `buildChunkGraph()`
```js
seal(callback) {
  const chunkGraph = new ChunkGraph(
    this.moduleGraph,
    this.outputOptions.hashFunction
  );
  this.chunkGraph = chunkGraph;
  //...
  this.logger.time("create chunks");
  /** @type {Map<Entrypoint, Module[]>} */
  for (const [name, { dependencies, includeDependencies, options }] of this.entries) {
    const chunk = this.addChunk(name);
    const entrypoint = new Entrypoint(options);
    //...
  }
  //...
  buildChunkGraph(this, chunkGraphInit);
  this.logger.timeEnd("create chunks");

  this.logger.time("optimize");
  //...
  while (this.hooks.optimizeChunks.call(this.chunks, this.chunkGroups)) {
    /* empty */
  }
  //...
  this.logger.timeEnd("optimize");

  this.logger.time("code generation");
  this.codeGeneration(err => {
    //...
    this.logger.timeEnd("code generation");
  });
}
```
`buildChunkGraph()` was the core part of the previous article's analysis; it breaks down into three parts:
```js
const buildChunkGraph = (compilation, inputEntrypointsAndModules) => {
  // PART ONE
  logger.time("visitModules");
  visitModules(...);
  logger.timeEnd("visitModules");
  // PART TWO
  logger.time("connectChunkGroups");
  connectChunkGroups(...);
  logger.timeEnd("connectChunkGroups");
  for (const [chunkGroup, chunkGroupInfo] of chunkGroupInfoMap) {
    for (const chunk of chunkGroup.chunks)
      chunk.runtime = mergeRuntime(chunk.runtime, chunkGroupInfo.runtime);
  }
  // Cleanup work
  logger.time("cleanup");
  cleanupUnconnectedGroups(compilation, allCreatedChunkGroups);
  logger.timeEnd("cleanup");
};
```
The overall logic of `visitModules()` is shown below.

Having finished analyzing `buildChunkGraph()` in the previous article, we now move on to the logic around `hooks.optimizeChunks()`.

Main content

Without using SplitChunksPlugin for split-chunk optimization, as the figure above shows, six chunks are generated in total (four chunks from the entry files and two from async loading). The figure also shows that several dependencies are bundled repeatedly into different chunks. In this situation we can use SplitChunksPlugin to optimize the split: as the figure below shows, two new chunks, test2 and test3, are split out, and the duplicated dependencies are bundled into test2 and test3, avoiding the output bloat caused by duplicate bundling.

Using the example above as its core, this article analyzes the split-chunk optimization flow of SplitChunksPlugin.
hooks.optimizeChunks
```js
while (this.hooks.optimizeChunks.call(this.chunks, this.chunkGroups)) {
  /* empty */
}
```
After `visitModules()` has run, `hooks.optimizeChunks.call()` is invoked to optimize the chunks. As the figure below shows, several plugins are triggered; the most familiar of them is SplitChunksPlugin, which the rest of this article focuses on.
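As a minimal sketch of how this hook is consumed (the plugin name and stage below are made up for illustration; only the hook API itself is webpack's): `optimizeChunks` is a bail hook called inside a `while` loop in `seal()`, so any tap that returns `true` causes the whole hook to run again until every tap returns a falsy value.

```js
class MyOptimizeChunksPlugin {
  apply(compiler) {
    compiler.hooks.compilation.tap("MyOptimizeChunksPlugin", compilation => {
      compilation.hooks.optimizeChunks.tap(
        { name: "MyOptimizeChunksPlugin", stage: 10 /* illustrative */ },
        chunks => {
          // inspect / merge / split chunks here...
          // return true to request one more optimization pass;
          // return undefined once nothing changed
          return undefined;
        }
      );
    });
  }
}
```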
2. SplitChunksPlugin source code analysis

By configuring `cacheGroups`, the chunks that have already been created can be optimized further: one large chunk can be split into two or more chunks, reducing duplicate bundling and increasing code reuse.

For example, the entry files are bundled into app1.js and app2.js, and these two files (chunks) contain the same bundled code: the third-party library js-cookie. Can we bundle js-cookie into a new chunk? That would let us extract the js-cookie code out of app1.js and app2.js, so js-cookie only needs to be bundled in one place.
```js
module.exports = {
  //...
  optimization: {
    splitChunks: {
      chunks: 'async',
      cacheGroups: {
        defaultVendors: {
          test: /[\\/]node_modules[\\/]/,
          priority: -10,
          reuseExistingChunk: true,
        },
        default: {
          minChunks: 2,
          priority: -20,
          reuseExistingChunk: true,
        },
      },
    },
  },
}
```
2.0 Overall flowchart and code flow overview

2.0.1 Code

Divided by the logger.time calls, the whole flow consists of the following phases:

- `prepare`: initialize the data structures and helper functions used by the later phases
- `modules`: iterate over all modules and build the `chunksInfoMap` data
- `queue`: split chunks according to `minSize`, iterating over `chunksInfoMap`
- `maxSize`: split chunks according to `maxSize`
```js
compilation.hooks.optimizeChunks.tap(
  { name: "SplitChunksPlugin", stage: STAGE_ADVANCED },
  chunks => {
    logger.time("prepare");
    //...
    logger.timeEnd("prepare");

    logger.time("modules");
    for (const module of compilation.modules) {
      //...
    }
    logger.timeEnd("modules");

    logger.time("queue");
    for (const [key, info] of chunksInfoMap) {
      //...
    }
    while (chunksInfoMap.size > 0) {
      //...
    }
    logger.timeEnd("queue");

    logger.time("maxSize");
    for (const chunk of Array.from(compilation.chunks)) {
      //...
    }
    logger.timeEnd("maxSize");
  }
);
```
The source-code walkthrough below is organized around the configuration options, such as maxSize, minSize, enforce, maxInitialRequests, and so on.

2.0.2 Flowchart

2.1 Default cacheGroups configuration

The default `cacheGroups` configuration is set up during initialization; it is not code executed in the SplitChunksPlugin.js file.

As the code block below shows, two default `cacheGroups` entries are defined at initialization time, one of which targets `node_modules`.
```js
// node_modules/webpack/lib/config/defaults.js
const { splitChunks } = optimization;
if (splitChunks) {
  A(splitChunks, "defaultSizeTypes", () =>
    css ? ["javascript", "css", "unknown"] : ["javascript", "unknown"]
  );
  D(splitChunks, "hidePathInfo", production);
  D(splitChunks, "chunks", "async");
  D(splitChunks, "usedExports", optimization.usedExports === true);
  D(splitChunks, "minChunks", 1);
  F(splitChunks, "minSize", () => (production ? 20000 : 10000));
  F(splitChunks, "minRemainingSize", () => (development ? 0 : undefined));
  F(splitChunks, "enforceSizeThreshold", () => (production ? 50000 : 30000));
  F(splitChunks, "maxAsyncRequests", () => (production ? 30 : Infinity));
  F(splitChunks, "maxInitialRequests", () => (production ? 30 : Infinity));
  D(splitChunks, "automaticNameDelimiter", "-");
  const { cacheGroups } = splitChunks;
  F(cacheGroups, "default", () => ({
    idHint: "",
    reuseExistingChunk: true,
    minChunks: 2,
    priority: -20
  }));
  // const NODE_MODULES_REGEXP = /[\\/]node_modules[\\/]/i;
  F(cacheGroups, "defaultVendors", () => ({
    idHint: "vendors",
    reuseExistingChunk: true,
    test: NODE_MODULES_REGEXP,
    priority: -10
  }));
}
```
Translating the defaults above into webpack.config.js form gives the code block below; there are two default cache groups:

- modules under node_modules are bundled into one chunk
- other modules are bundled into chunks according to the remaining options

`splitChunks.chunks` indicates which chunks are selected for optimization; the default is `async`:

- `async` means new chunks are split only out of async chunks
- `initial` means new chunks are split only out of entry chunks
- `all` means both async and non-async chunks are considered for splitting

For a detailed analysis see the discussion of `cacheGroup.chunksFilter` below.
```js
module.exports = {
  //...
  optimization: {
    splitChunks: {
      chunks: 'async',
      minSize: 20000,
      minRemainingSize: 0,
      minChunks: 1,
      maxAsyncRequests: 30,
      maxInitialRequests: 30,
      enforceSizeThreshold: 50000,
      cacheGroups: {
        defaultVendors: {
          test: /[\\/]node_modules[\\/]/,
          priority: -10,
          reuseExistingChunk: true,
        },
        default: {
          minChunks: 2,
          priority: -20,
          reuseExistingChunk: true,
        },
      },
    },
  },
};
```
2.2 modules phase: iterate over compilation.modules and build the chunksInfoMap data from the cacheGroups
```js
for (const module of compilation.modules) {
  let cacheGroups = this.options.getCacheGroups(module, context);
  let cacheGroupIndex = 0;
  for (const cacheGroupSource of cacheGroups) {
    const cacheGroup = this._getCacheGroup(cacheGroupSource);
    // ============ Step 1 ============
    const combs = cacheGroup.usedExports
      ? getCombsByUsedExports()
      : getCombs();
    for (const chunkCombination of combs) {
      const count =
        chunkCombination instanceof Chunk ? 1 : chunkCombination.size;
      if (count < cacheGroup.minChunks) continue;
      // ============ Step 2 ============
      const { chunks: selectedChunks, key: selectedChunksKey } =
        getSelectedChunks(chunkCombination, cacheGroup.chunksFilter);
      // ============ Step 3 ============
      addModuleToChunksInfoMap(
        cacheGroup,
        cacheGroupIndex,
        selectedChunks,
        selectedChunksKey,
        module
      );
    }
    cacheGroupIndex++;
  }
}
```
2.2.1 Step 1: getCombsByUsedExports()
```js
for (const module of compilation.modules) {
  let cacheGroups = this.options.getCacheGroups(module, context);
  const getCombsByUsedExports = memoize(() => {
    // fill the groupedByExportsMap
    getExportsChunkSetsInGraph();
    /** @type {Set<Set<Chunk> | Chunk>} */
    const set = new Set();
    const groupedByUsedExports = groupedByExportsMap.get(module);
    for (const chunks of groupedByUsedExports) {
      const chunksKey = getKey(chunks);
      for (const comb of getExportsCombinations(chunksKey)) set.add(comb);
    }
    return set;
  });
  for (const cacheGroupSource of cacheGroups) {
    const cacheGroup = this._getCacheGroup(cacheGroupSource);
    // ============ Step 1 ============
    const combs = cacheGroup.usedExports
      ? getCombsByUsedExports()
      : getCombs();
    //...
  }
}
```
The logic of `getCombsByUsedExports()` involves several functions (initialized during the `prepare` phase); the overall flow is shown below.

While iterating over `compilation.modules`, `groupedByExportsMap.get(module)` is called to fetch the chunk-set data of the current module; the resulting data structure is:

```js
// item[0] is the chunk array looked up via the key
// item[1] is a chunk set that satisfies minChunks
// item[2] and item[3] are single chunks
[new Set(3), new Set(2), Chunk, Chunk]
```
`moduleGraph.getExportsInfo` fetches the exports info of the given module, for example common__g.js; the data obtained for common__g.js is shown below.

The chunk sets are then collected according to `exportsInfo.getUsageKey(chunk.runtime)`: with `getUsageKey(chunk.runtime)` as the key, the chunk sets are accumulated. In the current example, app1, app2, app3 and app4 all yield the same `getUsageKey(chunk.runtime)`; for an analysis of this method please refer to other articles.
```js
const groupChunksByExports = module => {
  const exportsInfo = moduleGraph.getExportsInfo(module);
  const groupedByUsedExports = new Map();
  for (const chunk of chunkGraph.getModuleChunksIterable(module)) {
    const key = exportsInfo.getUsageKey(chunk.runtime);
    const list = groupedByUsedExports.get(key);
    if (list !== undefined) {
      list.push(chunk);
    } else {
      groupedByUsedExports.set(key, [chunk]);
    }
  }
  return groupedByUsedExports.values();
};
```
So for an entry file like entry1.js, the resulting `groupedByUsedExports.values()` is a single chunk set: [app1].

For common__g.js, a dependency used by all four entry files, the resulting `groupedByUsedExports.values()` is a single chunk set: [app1,app2,app3,app4].

chunkGraph.getModuleChunksIterable

`chunkGraph.getModuleChunksIterable` fetches the set of chunks the module belongs to; for example, common__g.js in the figure below belongs to the chunk set app1, app2, app3, app4.
singleChunkSets, chunkSetsInGraph and chunkSets

There are four chunk sets in total:

- [app1,app2,app3,app4]
- [app1,app2,app3]
- [app1,app2]
- [app2,app3,app4]

each corresponding to the chunk set of one of the modules above.

The chunks formed from entry files and async files have a per-module count of 1, so they are stored in `singleChunkSets` instead.
chunkSetsByCount
`chunkSetsByCount` is a reshaping of `chunkSetsInGraph`: the entries of chunkSetsInGraph are grouped by their length. In the example above, chunkSetsInGraph contains four chunk sets:

- [app1,app2,app3,app4]
- [app1,app2,app3]
- [app1,app2]
- [app2,app3,app4]

which convert into chunkSetsByCount as sketched below.
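A simplified sketch of this grouping step, adapted from the `groupChunkSetsByCount()` helper initialized during the prepare phase (variable names follow the article; treat details as approximate):

```js
/** @type {Map<number, Array<Set<Chunk>>>} */
const chunkSetsByCount = new Map();
for (const chunksSet of chunkSetsInGraph.values()) {
  const count = chunksSet.size;
  let array = chunkSetsByCount.get(count);
  if (array === undefined) {
    array = [];
    chunkSetsByCount.set(count, array);
  }
  array.push(chunksSet);
}
// e.g. 4 -> [Set{app1,app2,app3,app4}]
//      3 -> [Set{app1,app2,app3}, Set{app2,app3,app4}]
//      2 -> [Set{app1,app2}]
```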
Summary

- `groupChunksByExports(module)` is used to fetch all the chunk-set data of a module, which is stored into `groupedByExportsMap`, a map with key = module and value = [[chunk1, chunk2], chunk1]
- every chunk set is stored into `chunkSetsInGraph` via `getKey(chunks)`, a map with key = getKey(chunks) and value = the chunk set itself
When we process a given module, `groupedByExportsMap` is used to fetch all of that module's chunk sets, called `groupedByUsedExports`:

```js
const groupedByUsedExports = groupedByExportsMap.get(module);
```

Then every chunk set A is iterated: the `chunksKey` built from set A is used to fetch the corresponding chunk set from `chunkSetsInGraph` (which is in fact set A itself), and `chunkSetsByCount` is additionally used to fetch smaller sets B that are subsets of set A (a set B may be the chunk set obtained for some other module), as sketched after the code block below.
```js
const groupedByUsedExports = groupedByExportsMap.get(module);
for (const chunks of groupedByUsedExports) {
  const chunksKey = getKey(chunks);
  for (const comb of getExportsCombinations(chunksKey)) set.add(comb);
}
return set;
```
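How those subset combinations are looked up can be sketched as follows, condensed from the `getCombinations`/`isSubSet` helpers initialized in the prepare phase (an approximation, not a verbatim copy of the source):

```js
// true if every chunk of smallerSet is also contained in bigSet
const isSubSet = (bigSet, smallerSet) => {
  for (const item of smallerSet) {
    if (!bigSet.has(item)) return false;
  }
  return true;
};

const getCombinations = chunksKey => {
  const chunksSet = chunkSetsInGraph.get(chunksKey); // set A from the text
  const array = [chunksSet];
  if (chunksSet.size > 1) {
    for (const [count, setArray] of chunkSetsByCount) {
      // only strictly smaller sets can be proper subsets (the sets B from the text)
      if (count < chunksSet.size) {
        for (const set of setArray) {
          if (isSubSet(chunksSet, set)) array.push(set);
        }
      }
    }
  }
  return array;
};
```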
2.2.2 Step 2: getSelectedChunks() and cacheGroup.chunksFilter
```js
for (const module of compilation.modules) {
  let cacheGroups = this.options.getCacheGroups(module, context);
  let cacheGroupIndex = 0;
  for (const cacheGroupSource of cacheGroups) {
    const cacheGroup = this._getCacheGroup(cacheGroupSource);
    // ============ Step 1 ============
    const combs = cacheGroup.usedExports
      ? getCombsByUsedExports()
      : getCombs();
    for (const chunkCombination of combs) {
      const count =
        chunkCombination instanceof Chunk ? 1 : chunkCombination.size;
      if (count < cacheGroup.minChunks) continue;
      // ============ Step 2 ============
      const { chunks: selectedChunks, key: selectedChunksKey } =
        getSelectedChunks(chunkCombination, cacheGroup.chunksFilter);
      //...
    }
    cacheGroupIndex++;
  }
}
```
cacheGroup.chunksFilter
If webpack.config.js passes "all", then `cacheGroup.chunksFilter` is `const ALL_CHUNK_FILTER = chunk => true;`:
```js
const INITIAL_CHUNK_FILTER = chunk => chunk.canBeInitial();
const ASYNC_CHUNK_FILTER = chunk => !chunk.canBeInitial();
const ALL_CHUNK_FILTER = chunk => true;

const normalizeChunksFilter = chunks => {
  if (chunks === "initial") {
    return INITIAL_CHUNK_FILTER;
  }
  if (chunks === "async") {
    return ASYNC_CHUNK_FILTER;
  }
  if (chunks === "all") {
    return ALL_CHUNK_FILTER;
  }
  if (typeof chunks === "function") {
    return chunks;
  }
};

const createCacheGroupSource = (options, key, defaultSizeTypes) => {
  //...
  return {
    //...
    chunksFilter: normalizeChunksFilter(options.chunks),
    //...
  };
};

const { chunks: selectedChunks, key: selectedChunksKey } =
  getSelectedChunks(chunkCombination, cacheGroup.chunksFilter);
```
What if webpack.config.js passes "async" or "initial"?

From the code block below we can see:

- a plain `ChunkGroup`: chunk.canBeInitial() = false
- a synchronous `Entrypoint`: chunk.canBeInitial() = true
- an asynchronous `Entrypoint`: chunk.canBeInitial() = false
```js
class ChunkGroup {
  isInitial() {
    return false;
  }
}

class Entrypoint extends ChunkGroup {
  constructor(entryOptions, initial = true) {
    this._initial = initial;
  }
  isInitial() {
    return this._initial;
  }
}

// node_modules/webpack/lib/Compilation.js
addAsyncEntrypoint(options, module, loc, request) {
  const entrypoint = new Entrypoint(options, false);
}
```
getSelectedChunks()
As the code block below shows, `chunkFilter()` is used to filter the chunks array. Because this example uses "all", `chunkFilter()` returns true under every condition, so the filter effectively passes everything: all chunks qualify.

`chunkFilter()` essentially decides, based on the `splitChunks.chunks` option, whether to filter by `_initial`, combining:

- plain ChunkGroup: _initial = false
- Entrypoint-type ChunkGroup: _initial = true
- AsyncEntrypoint-type ChunkGroup: _initial = false

to perform the filtering.
```js
const getSelectedChunks = (chunks, chunkFilter) => {
  let entry = selectedChunksCacheByChunksSet.get(chunks);
  if (entry === undefined) {
    entry = new WeakMap();
    selectedChunksCacheByChunksSet.set(chunks, entry);
  }
  let entry2 = entry.get(chunkFilter);
  if (entry2 === undefined) {
    const selectedChunks = [];
    if (chunks instanceof Chunk) {
      if (chunkFilter(chunks)) selectedChunks.push(chunks);
    } else {
      for (const chunk of chunks) {
        if (chunkFilter(chunk)) selectedChunks.push(chunk);
      }
    }
    entry2 = { chunks: selectedChunks, key: getKey(selectedChunks) };
    entry.set(chunkFilter, entry2);
  }
  return entry2;
}
```
2.2.3 Step 3: addModuleToChunksInfoMap
```js
for (const module of compilation.modules) {
  let cacheGroups = this.options.getCacheGroups(module, context);
  let cacheGroupIndex = 0;
  for (const cacheGroupSource of cacheGroups) {
    const cacheGroup = this._getCacheGroup(cacheGroupSource);
    // ============ Step 1 ============
    const combs = cacheGroup.usedExports
      ? getCombsByUsedExports()
      : getCombs();
    for (const chunkCombination of combs) {
      const count =
        chunkCombination instanceof Chunk ? 1 : chunkCombination.size;
      if (count < cacheGroup.minChunks) continue;
      // ============ Step 2 ============
      const { chunks: selectedChunks, key: selectedChunksKey } =
        getSelectedChunks(chunkCombination, cacheGroup.chunksFilter);
      // ============ Step 3 ============
      addModuleToChunksInfoMap(
        cacheGroup,
        cacheGroupIndex,
        selectedChunks,
        selectedChunksKey,
        module
      );
    }
    cacheGroupIndex++;
  }
}
```
This builds the `chunksInfoMap` data: each key's item (containing modules, chunks, chunksKeys, ...) is one element of `chunksInfoMap`.
```js
const addModuleToChunksInfoMap = (...) => {
  let info = chunksInfoMap.get(key);
  if (info === undefined) {
    chunksInfoMap.set(
      key,
      (info = {
        modules: new SortableSet(undefined, compareModulesByIdentifier),
        sizes: Object.create(null), // omitted in the original excerpt, but used below
        chunks: new Set(),
        chunksKeys: new Set()
      })
    );
  }
  const oldSize = info.modules.size;
  info.modules.add(module);
  if (info.modules.size !== oldSize) {
    for (const type of module.getSourceTypes()) {
      info.sizes[type] = (info.sizes[type] || 0) + module.size(type);
    }
  }
  const oldChunksKeysSize = info.chunksKeys.size;
  info.chunksKeys.add(selectedChunksKey);
  if (oldChunksKeysSize !== info.chunksKeys.size) {
    for (const chunk of selectedChunks) {
      info.chunks.add(chunk);
    }
  }
};
```
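For orientation, a single `chunksInfoMap` entry ends up shaped roughly like this (a hand-written illustration using the article's example names, not real webpack output; the real entry carries a few more bookkeeping fields):

```js
// key: 'defaultVendors chunks:app1,app2'  (cacheGroup.key plus a name or chunks key)
const infoEntryExample = {
  modules: new Set([/* js-cookie module, loadsh module, ... */]), // a SortableSet in webpack
  cacheGroup: {/* the normalized cacheGroup this entry belongs to */},
  cacheGroupIndex: 0,
  sizes: { javascript: 12345 },                    // accumulated per source type
  chunks: new Set([/* app1 chunk, app2 chunk */]),
  chunksKeys: new Set([/* selectedChunksKey values merged so far */])
};
```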
2.2.4 A concrete example

The webpack.config.js configuration is shown below.
```js
cacheGroups: {
  defaultVendors: {
    test: /[\\/]node_modules[\\/]/,
    priority: -10,
    reuseExistingChunk: true,
  },
  default: {
    minChunks: 2,
    priority: -20,
    reuseExistingChunk: true,
  },
  test3: {
    chunks: 'all',
    minChunks: 3,
    name: "test3",
    priority: 3
  },
  test2: {
    chunks: 'all',
    minChunks: 2,
    name: "test2",
    priority: 2
  }
}
```
The example has four entry files:

- app1.js: uses the third-party libraries js-cookie and loadsh
- app2.js: uses the third-party libraries js-cookie, loadsh and voca
- app3.js: uses the third-party libraries js-cookie and voca
- app4.js: uses the third-party library voca
Now recall the overall flow: the outer loop iterates over modules, fetching each module's chunk sets, i.e. combs; the inner loop iterates over the cacheGroups (the groups configured in webpack.config.js), testing each combs[i] against each cacheGroup. This essentially amounts to a minChunks + chunksFilter filter, and the data that passes is pushed into `chunksInfoMap` via `addModuleToChunksInfoMap()`.
```js
for (const module of compilation.modules) {
  let cacheGroups = this.options.getCacheGroups(module, context);
  let cacheGroupIndex = 0;
  for (const cacheGroupSource of cacheGroups) {
    const cacheGroup = this._getCacheGroup(cacheGroupSource);
    // ============ Step 1 ============
    const combs = cacheGroup.usedExports
      ? getCombsByUsedExports()
      : getCombs();
    for (const chunkCombination of combs) {
      const count =
        chunkCombination instanceof Chunk ? 1 : chunkCombination.size;
      if (count < cacheGroup.minChunks) continue;
      // ============ Step 2 ============
      const { chunks: selectedChunks, key: selectedChunksKey } =
        getSelectedChunks(chunkCombination, cacheGroup.chunksFilter);
      // ============ Step 3 ============
      addModuleToChunksInfoMap(
        cacheGroup,
        cacheGroupIndex,
        selectedChunks,
        selectedChunksKey,
        module
      );
    }
    cacheGroupIndex++;
  }
}
```
Since each entry file forms one chunk, four chunks are formed in total. Because `addModuleToChunksInfoMap()` is driven module by module, we can lay out each module's chunk membership as follows:
From the code above, when processing NormalModule = "js-cookie", `getCombsByUsedExports()` returns five chunk-set entries, namely:

[Set(3){app1,app2,app3}, Set(2){app1,app2}, app1, app2, app3]

Note: the Set(2){app1,app2} from `chunkSetsByCount` does not itself contain js-cookie — per the figure above it corresponds to loadsh — but it satisfies the `isSubSet()` condition.
The concrete execution of `getCombsByUsedExports()` is shown in the figure below: the chunk sets are obtained through `chunkSetsByCount`.

`chunkSetsByCount` maps key = chunk count to value = the chunk sets of that size, e.g. key = 3, value = [Set(3){app1, app2, app3}].
Since the code iterates over cacheGroups, we also have to consider which cache groups NormalModule = "js-cookie" hits:

- cacheGroup = test3: the combs obtained are
  combs = [["app1","app2","app3"],["app1","app2"],"app1","app2","app3"]
  With cacheGroup.minChunks = 3, after filtering, `addModuleToChunksInfoMap()` is triggered with:
  ["app1","app2","app3"]
- cacheGroup = test2: the same combs; with cacheGroup.minChunks = 2, after filtering, `addModuleToChunksInfoMap()` is triggered with:
  ["app1","app2","app3"] ["app1","app2"]
- cacheGroup = default: the same combs; with cacheGroup.minChunks = 2, after filtering, `addModuleToChunksInfoMap()` is triggered with:
  ["app1","app2","app3"] ["app1","app2"]
- cacheGroup = defaultVendors: the same combs; since defaultVendors inherits the default cacheGroup.minChunks = 1, after filtering, `addModuleToChunksInfoMap()` is triggered with:
  ["app1","app2","app3"] ["app1","app2"] ["app1"] ["app2"] ["app3"]
The chunksInfoMap key

A key property is threaded through the whole flow, for example the `chunksKey` in the code below:
```js
const getCombs = memoize(() => {
  const chunks = chunkGraph.getModuleChunksIterable(module);
  const chunksKey = getKey(chunks);
  return getCombinations(chunksKey);
});
```
and the `selectedChunksKey` passed into `addModuleToChunksInfoMap()`:
```js
const getSelectedChunks = (chunks, chunkFilter) => {
  let entry = selectedChunksCacheByChunksSet.get(chunks);
  if (entry === undefined) {
    entry = new WeakMap();
    selectedChunksCacheByChunksSet.set(chunks, entry);
  }
  /** @type {SelectedChunksResult} */
  let entry2 = entry.get(chunkFilter);
  if (entry2 === undefined) {
    /** @type {Chunk[]} */
    const selectedChunks = [];
    if (chunks instanceof Chunk) {
      if (chunkFilter(chunks)) selectedChunks.push(chunks);
    } else {
      for (const chunk of chunks) {
        if (chunkFilter(chunk)) selectedChunks.push(chunk);
      }
    }
    entry2 = { chunks: selectedChunks, key: getKey(selectedChunks) };
    entry.set(chunkFilter, entry2);
  }
  return entry2;
};

const { chunks: selectedChunks, key: selectedChunksKey } =
  getSelectedChunks(chunkCombination, cacheGroup.chunksFilter);

addModuleToChunksInfoMap(
  cacheGroup,
  cacheGroupIndex,
  selectedChunks,
  selectedChunksKey,
  module
);
```
If we rework the `chunksInfoMap` data produced by `addModuleToChunksInfoMap()` as below, replacing each `selectedChunksKey` with the current module's path:

```js
const key =
  cacheGroup.key +
  (name ? ` name:${name}` : ` chunks:${keyToString(selectedChunksKey)}`);
// if there is no name, append the corresponding module.rawRequest
const key =
  cacheGroup.key +
  (name
    ? ` name:${name}`
    : ` chunks:${module.rawRequest} ${keyToString(selectedChunksKey)}`);
```
the final `chunksInfoMap` looks like the following. Taking the js-cookie example from above: different chunk sets form different `selectedChunksKey`s, so different chunk sets become parts of the values of different keys in `chunksInfoMap`, rather than all being stuffed into the same key.
2.3 queue phase: filtering chunksInfoMap by minSize and minSizeReduction
```js
compilation.hooks.optimizeChunks.tap(
  { name: "SplitChunksPlugin", stage: STAGE_ADVANCED },
  chunks => {
    logger.time("prepare");
    //...
    logger.timeEnd("prepare");

    logger.time("modules");
    for (const module of compilation.modules) {
      //...
    }
    logger.timeEnd("modules");

    logger.time("queue");
    for (const [key, info] of chunksInfoMap) {
      //...
    }
    while (chunksInfoMap.size > 0) {
      //...
    }
    logger.timeEnd("queue");

    logger.time("maxSize");
    for (const chunk of Array.from(compilation.chunks)) {
      //...
    }
    logger.timeEnd("maxSize");
  }
);
```
maxSize has a higher priority than maxInitialRequest/maxAsyncRequests; the priority order is maxInitialRequest/maxAsyncRequests < maxSize < minSize.
```js
// Filter items were size < minSize
for (const [key, info] of chunksInfoMap) {
  if (removeMinSizeViolatingModules(info)) {
    chunksInfoMap.delete(key);
  } else if (
    !checkMinSizeReduction(
      info.sizes,
      info.cacheGroup.minSizeReduction,
      info.chunks.size
    )
  ) {
    chunksInfoMap.delete(key);
  }
}
```
`removeMinSizeViolatingModules()`: as the code block and figure below show, `cacheGroup.minSize` is compared against the total size of each module type in the current info (e.g. javascript); if a type's size is below `cacheGroup.minSize`, the modules of that type are removed and do not form a new chunk.

When info.sizes[type] was assembled earlier, sizes of the same type were accumulated.
```js
const removeMinSizeViolatingModules = info => {
  const violatingSizes = getViolatingMinSizes(
    info.sizes,
    info.cacheGroup.minSize
  );
  if (violatingSizes === undefined) return false;
  removeModulesWithSourceType(info, violatingSizes);
  return info.modules.size === 0;
};

const removeModulesWithSourceType = (info, sourceTypes) => {
  for (const module of info.modules) {
    const types = module.getSourceTypes();
    if (sourceTypes.some(type => types.has(type))) {
      info.modules.delete(module);
      for (const type of types) {
        info.sizes[type] -= module.size(type);
      }
    }
  }
};
```
`checkMinSizeReduction()`: this involves the `cacheGroup.minSizeReduction` option — the minimum size reduction (in bytes) of the main chunk (bundle) required for a new chunk to be generated. It means that if splitting a chunk off would not reduce the main bundle by the given number of bytes, the split does not happen, even if it satisfies splitChunks.minSize.

For a chunk to be generated, both splitChunks.minSizeReduction and splitChunks.minSize must be satisfied: if extracting these chunks would shrink the main chunks by less than cacheGroup.minSizeReduction, they are not extracted into a new chunk. A worked example follows the code block below.
```js
const checkMinSizeReduction = (sizes, minSizeReduction, chunkCount) => {
  // minSizeReduction has the same structure as minSize, e.g. {javascript: 200, unknown: 200}
  for (const key of Object.keys(minSizeReduction)) {
    const size = sizes[key];
    if (size === undefined || size === 0) continue;
    if (size * chunkCount < minSizeReduction[key]) return false;
  }
  return true;
};
```
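A quick sanity check of that rule, assuming the helper above (the byte numbers are made up): a group whose javascript size is 30000 bytes and which spans 2 chunks removes roughly 30000 × 2 = 60000 bytes from the source chunks.

```js
const sizes = { javascript: 30000 };
console.log(checkMinSizeReduction(sizes, { javascript: 50000 }, 2)); // true  (60000 >= 50000)
console.log(checkMinSizeReduction(sizes, { javascript: 70000 }, 2)); // false (60000 < 70000)
```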
2.4 queue phase: iterating over chunksInfoMap and reorganizing chunks by the rules
```js
compilation.hooks.optimizeChunks.tap(
  { name: "SplitChunksPlugin", stage: STAGE_ADVANCED },
  chunks => {
    logger.time("prepare");
    //...
    logger.timeEnd("prepare");

    logger.time("modules");
    for (const module of compilation.modules) {
      //...
    }
    logger.timeEnd("modules");

    logger.time("queue");
    for (const [key, info] of chunksInfoMap) {
      //...
    }
    while (chunksInfoMap.size > 0) {
      //...
    }
    logger.timeEnd("queue");

    logger.time("maxSize");
    for (const chunk of Array.from(compilation.chunks)) {
      //...
    }
    logger.timeEnd("maxSize");
  }
);
```
Each element info of chunksInfoMap is essentially one cacheGroup occurrence, carrying its own chunks and modules.

```js
while (chunksInfoMap.size > 0) {
  // compareEntries() compares priorities to build bestEntry
  //
}
// ... maxSize handling, covered in the next section
```
2.4.1 compareEntries: finding the highest-priority chunksInfoMap item

Determine which cacheGroup has the higher priority: some chunks match several cacheGroups, and the higher-priority one is split first and produces its output first.

The two infos are ranked by the following properties (highest precedence first) to obtain the top info, i.e. the winning chunksInfoMap item.
```js
let bestEntryKey;
let bestEntry;
for (const pair of chunksInfoMap) {
  const key = pair[0];
  const info = pair[1];
  if (bestEntry === undefined || compareEntries(bestEntry, info) < 0) {
    bestEntry = info;
    bestEntryKey = key;
  }
}

const item = bestEntry;
chunksInfoMap.delete(bestEntryKey);
```
The actual comparison lives in `compareEntries()`; from the code we can see:

- priority: the larger value wins
- chunks.size: more chunks wins
- size reduction: the larger totalSize(a.sizes) * (a.chunks.size - 1) wins
- cache group index: the smaller value wins
- number of modules: the larger value wins
- module identifiers: the larger value wins
```js
const compareEntries = (a, b) => {
  // 1. by priority
  const diffPriority = a.cacheGroup.priority - b.cacheGroup.priority;
  if (diffPriority) return diffPriority;
  // 2. by number of chunks
  const diffCount = a.chunks.size - b.chunks.size;
  if (diffCount) return diffCount;
  // 3. by size reduction
  const aSizeReduce = totalSize(a.sizes) * (a.chunks.size - 1);
  const bSizeReduce = totalSize(b.sizes) * (b.chunks.size - 1);
  const diffSizeReduce = aSizeReduce - bSizeReduce;
  if (diffSizeReduce) return diffSizeReduce;
  // 4. by cache group index
  const indexDiff = b.cacheGroupIndex - a.cacheGroupIndex;
  if (indexDiff) return indexDiff;
  // 5. by number of modules (to be able to compare by identifier)
  const modulesA = a.modules;
  const modulesB = b.modules;
  const diff = modulesA.size - modulesB.size;
  if (diff) return diff;
  // 6. by module identifiers
  modulesA.sort();
  modulesB.sort();
  return compareModuleIterables(modulesA, modulesB);
};
```
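For example (hand-built minimal entries, just to show the tie-breaking order; a real info carries more fields):

```js
const a = { cacheGroup: { priority: 2 }, chunks: new Set(["app1", "app2"]) };
const b = { cacheGroup: { priority: 2 }, chunks: new Set(["app1", "app2", "app3"]) };
// 1. priority:         2 - 2 === 0  -> no decision yet
// 2. number of chunks: 2 - 3 === -1 -> compareEntries(a, b) < 0,
//    so b wins and becomes bestEntry in the selection loop above
```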
After obtaining `bestEntry`, it is deleted from `chunksInfoMap` and then processed:
```js
let bestEntryKey;
let bestEntry;
for (const pair of chunksInfoMap) {
  const key = pair[0];
  const info = pair[1];
  if (bestEntry === undefined || compareEntries(bestEntry, info) < 0) {
    bestEntry = info;
    bestEntryKey = key;
  }
}

const item = bestEntry;
chunksInfoMap.delete(bestEntryKey);
```
2.4.2 Processing the highest-priority item of chunksInfoMap

The highest-priority chunksInfoMap item is fetched and called bestEntry.
isExistingChunk
First comes the isExistingChunk check: if the cacheGroup has a name and an existing chunk is found under that name, that chunk is reused directly.

This involves the webpack.config.js option reuseExistingChunk (see the concrete reuseExistingChunk example); the point is to reuse an existing chunk rather than create a new one:
Chunk 1 (named one): modules A
Chunk 2 (named two): no modules (removed by optimization)
Chunk 3 (named one~two): modules B, C

The above is the result with reuseExistingChunk = false; below is the result with reuseExistingChunk = true:

Chunk 1 (named one): modules A
Chunk 2 (named two): modules B, C
If the cacheGroup has no name, item.chunks is traversed to look for a chunk that can be reused.

The end result is that the reusable chunk (if any) is assigned to newChunk, and isExistingChunk is set to true.
```js
let chunkName = item.name;
let newChunk;
if (chunkName) {
  const chunkByName = compilation.namedChunks.get(chunkName);
  if (chunkByName !== undefined) {
    newChunk = chunkByName;
    const oldSize = item.chunks.size;
    item.chunks.delete(newChunk);
    isExistingChunk = item.chunks.size !== oldSize;
  }
} else if (item.cacheGroup.reuseExistingChunk) {
  outer: for (const chunk of item.chunks) {
    if (chunkGraph.getNumberOfChunkModules(chunk) !== item.modules.size) {
      continue;
    }
    if (
      item.chunks.size > 1 &&
      chunkGraph.getNumberOfEntryModules(chunk) > 0
    ) {
      continue;
    }
    for (const module of item.modules) {
      if (!chunkGraph.isModuleInChunk(module, chunk)) {
        continue outer;
      }
    }
    if (!newChunk || !newChunk.name) {
      newChunk = chunk;
    } else if (chunk.name && chunk.name.length < newChunk.name.length) {
      newChunk = chunk;
    } else if (
      chunk.name &&
      chunk.name.length === newChunk.name.length &&
      chunk.name < newChunk.name
    ) {
      newChunk = chunk;
    }
  }
  if (newChunk) {
    item.chunks.delete(newChunk);
    chunkName = undefined;
    isExistingChunk = true;
    isReusedWithAllModules = true;
  }
}
```
enforceSizeThreshold
This is the webpack.config.js option splitChunks.enforceSizeThreshold: once a chunk's size exceeds enforceSizeThreshold, the splitting thresholds and other limits (minRemainingSize, maxAsyncRequests, maxInitialRequests) are ignored.

As the code below shows, if item.sizes[i] exceeds enforceSizeThreshold, then enforced = true, and the subsequent maxInitialRequests and maxAsyncRequests checks are skipped.
```js
const hasNonZeroSizes = sizes => {
  for (const key of Object.keys(sizes)) {
    if (sizes[key] > 0) return true;
  }
  return false;
};

const checkMinSize = (sizes, minSize) => {
  for (const key of Object.keys(minSize)) {
    const size = sizes[key];
    if (size === undefined || size === 0) continue;
    if (size < minSize[key]) return false;
  }
  return true;
};

// _conditionalEnforce: hasNonZeroSizes(enforceSizeThreshold)
const enforced =
  item.cacheGroup._conditionalEnforce &&
  checkMinSize(item.sizes, item.cacheGroup.enforceSizeThreshold);
```
maxInitialRequests and maxAsyncRequests

- maxInitialRequests: the maximum number of parallel requests at an entry point
- maxAsyncRequests: the maximum number of parallel requests for on-demand loading

maxSize has a higher priority than maxInitialRequest/maxAsyncRequests. The actual priority is maxInitialRequest/maxAsyncRequests < maxSize < minSize.
For each chunk currently in usedChunks, check whether the total number of chunks held by the chunk groups it belongs to exceeds cacheGroup.maxInitialRequests or cacheGroup.maxAsyncRequests; a chunk beyond the limit is deleted. (The getRequests() helper used here is sketched after the code block.)
```js
const usedChunks = new Set(item.chunks);
if (
  !enforced &&
  (Number.isFinite(item.cacheGroup.maxInitialRequests) ||
    Number.isFinite(item.cacheGroup.maxAsyncRequests))
) {
  for (const chunk of usedChunks) {
    // respect max requests
    const maxRequests = chunk.isOnlyInitial()
      ? item.cacheGroup.maxInitialRequests
      : chunk.canBeInitial()
      ? Math.min(
          item.cacheGroup.maxInitialRequests,
          item.cacheGroup.maxAsyncRequests
        )
      : item.cacheGroup.maxAsyncRequests;
    if (isFinite(maxRequests) && getRequests(chunk) >= maxRequests) {
      usedChunks.delete(chunk);
    }
  }
}
```
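The `getRequests(chunk)` helper used above is not shown in the excerpt; it takes, over all chunk groups the chunk belongs to, the largest number of chunks in a single group (a sketch based on the helper of the same name in SplitChunksPlugin.js):

```js
const getRequests = chunk => {
  let requests = 0;
  for (const chunkGroup of chunk.groupsIterable) {
    requests = Math.max(requests, chunkGroup.chunks.length);
  }
  return requests;
};
```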
chunkGraph.isModuleInChunk(module, chunk)
Some chunks are pruned here. Across the repeated splitting iterations, a bestEntry's modules may already have been claimed by another bestEntry while its chunks have not yet been cleaned up; `chunkGraph.isModuleInChunk` checks whether a chunk is no longer needed by any module of the current bestEntry, and if so the chunk is deleted directly:
```js
outer: for (const chunk of usedChunks) {
  for (const module of item.modules) {
    if (chunkGraph.isModuleInChunk(module, chunk)) continue outer;
  }
  usedChunks.delete(chunk);
}
```
usedChunks.size<item.chunks.size
If some chunks were removed from usedChunks, `addModuleToChunksInfoMap()` is called again to create a new element in chunksInfoMap: the entry, minus the disqualified chunks, is re-added to chunksInfoMap as a new cacheGroup grouping.

When iterating chunksInfoMap at the start, the bestEntry being processed was deleted; after this step it is re-added to chunksInfoMap and goes through the loop again.
```js
// Were some (invalid) chunks removed from usedChunks?
// => readd all modules to the queue, as things could have been changed
if (usedChunks.size < item.chunks.size) {
  if (isExistingChunk) usedChunks.add(newChunk);
  if (usedChunks.size >= item.cacheGroup.minChunks) {
    const chunksArr = Array.from(usedChunks);
    for (const module of item.modules) {
      addModuleToChunksInfoMap(
        item.cacheGroup,
        item.cacheGroupIndex,
        chunksArr,
        getKey(usedChunks),
        module
      );
    }
  }
  continue;
}
```
minRemainingSize
This is the webpack.config.js option splitChunks.minRemainingSize, which only takes effect when a single chunk remains. It avoids zero-size modules by ensuring that the minimum size of the chunk left over after splitting exceeds the limit; it defaults to 0 in "development" mode.
```js
const getViolatingMinSizes = (sizes, minSize) => {
  let list;
  for (const key of Object.keys(minSize)) {
    const size = sizes[key];
    if (size === undefined || size === 0) continue;
    if (size < minSize[key]) {
      if (list === undefined) list = [key];
      else list.push(key);
    }
  }
  return list;
};

// Validate minRemainingSize constraint when a single chunk is left over
if (
  !enforced &&
  item.cacheGroup._validateRemainingSize &&
  usedChunks.size === 1
) {
  const [chunk] = usedChunks;
  let chunkSizes = Object.create(null);
  for (const module of chunkGraph.getChunkModulesIterable(chunk)) {
    if (!item.modules.has(module)) {
      for (const type of module.getSourceTypes()) {
        chunkSizes[type] = (chunkSizes[type] || 0) + module.size(type);
      }
    }
  }
  const violatingSizes = getViolatingMinSizes(
    chunkSizes,
    item.cacheGroup.minRemainingSize
  );
  if (violatingSizes !== undefined) {
    const oldModulesSize = item.modules.size;
    removeModulesWithSourceType(item, violatingSizes);
    if (item.modules.size > 0 && item.modules.size !== oldModulesSize) {
      // queue this item again to be processed again
      // without violating modules
      chunksInfoMap.set(bestEntryKey, item);
    }
    continue;
  }
}
```
As the code above shows, `getViolatingMinSizes()` first obtains the set of types whose size is too small, then `removeModulesWithSourceType()` deletes the corresponding modules (as in the code below) and updates the sizes, and finally the updated item is put back into chunksInfoMap.

At this point the data under bestEntryKey in chunksInfoMap has had the too-small modules removed.

```js
const removeModulesWithSourceType = (info, sourceTypes) => {
  for (const module of info.modules) {
    const types = module.getSourceTypes();
    if (sourceTypes.some(type => types.has(type))) {
      info.modules.delete(module);
      for (const type of types) {
        info.sizes[type] -= module.size(type);
      }
    }
  }
};
```
Creating newChunk and chunk.split(newChunk)

If there is no chunk to reuse, `compilation.addChunk(chunkName)` creates a new one:
```js
// Create the new chunk if not reusing one
if (newChunk === undefined) {
  newChunk = compilation.addChunk(chunkName);
}
// Walk through all chunks
for (const chunk of usedChunks) {
  // Add graph connections for splitted chunk
  chunk.split(newChunk);
}
```
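What `chunk.split(newChunk)` does is worth spelling out: it registers the new chunk in every ChunkGroup the original chunk belongs to, so those groups later load both parts (condensed from Chunk.prototype.split in lib/Chunk.js; treat the details as approximate):

```js
split(newChunk) {
  // register newChunk in every group this chunk belongs to, right after this chunk,
  // and link the group back to newChunk
  for (const chunkGroup of this._groups) {
    chunkGroup.insertChunk(newChunk, this);
    newChunk.addGroup(chunkGroup);
  }
  for (const idHint of this.idNameHints) {
    newChunk.idNameHints.add(idHint);
  }
}
```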
From the analysis above, isReusedWithAllModules = true means the cacheGroup has no name and a reusable chunk was found by traversing item.chunks, so there is no need to create new connections with `connectChunkAndModule()`: all item.modules merely have to be disconnected from item.chunks.

When isReusedWithAllModules = false, all item.modules are disconnected from item.chunks and connected to newChunk instead.

For example, newChunk is inserted into the ChunkGroup of the app3 entry chunk and the dependency between them is established, so that later code generation produces the correct loading relationships: the app3 chunk can load newChunk. After all, newChunk was split out of the four chunks app1, app2, app3 and app4.
```js
if (!isReusedWithAllModules) {
  // Add all modules to the new chunk
  for (const module of item.modules) {
    if (!module.chunkCondition(newChunk, compilation)) continue;
    // Add module to new chunk
    chunkGraph.connectChunkAndModule(newChunk, module);
    // Remove module from used chunks
    for (const chunk of usedChunks) {
      chunkGraph.disconnectChunkAndModule(chunk, module);
    }
  }
} else {
  // Remove all modules from used chunks
  for (const module of item.modules) {
    for (const chunk of usedChunks) {
      chunkGraph.disconnectChunkAndModule(chunk, module);
    }
  }
}
```
The newChunk is then recorded into maxSizeQueueMap, waiting to be handled in the later maxSize phase:
```js
if (
  Object.keys(item.cacheGroup.maxAsyncSize).length > 0 ||
  Object.keys(item.cacheGroup.maxInitialSize).length > 0
) {
  const oldMaxSizeSettings = maxSizeQueueMap.get(newChunk);
  maxSizeQueueMap.set(newChunk, {
    minSize: oldMaxSizeSettings
      ? combineSizes(
          oldMaxSizeSettings.minSize,
          item.cacheGroup._minSizeForMaxSize,
          Math.max
        )
      : item.cacheGroup.minSize,
    maxAsyncSize: oldMaxSizeSettings
      ? combineSizes(
          oldMaxSizeSettings.maxAsyncSize,
          item.cacheGroup.maxAsyncSize,
          Math.min
        )
      : item.cacheGroup.maxAsyncSize,
    maxInitialSize: oldMaxSizeSettings
      ? combineSizes(
          oldMaxSizeSettings.maxInitialSize,
          item.cacheGroup.maxInitialSize,
          Math.min
        )
      : item.cacheGroup.maxInitialSize,
    automaticNameDelimiter: item.cacheGroup.automaticNameDelimiter,
    keys: oldMaxSizeSettings
      ? oldMaxSizeSettings.keys.concat(item.cacheGroup.key)
      : [item.cacheGroup.key]
  });
}
```
Removing info.modules[i] from the other chunksInfoMap items

What was just processed is the highest-priority chunksInfoMap item. Afterwards, check whether the info.chunks of the other chunksInfoMap items intersect the chunks of this highest-priority item; if they do, compare their info.modules against item.modules and delete the shared modules from the other items' info.modules.

After deletion, check whether info.modules.size equals 0 and run `checkMinSizeReduction()`, then decide whether the cacheGroup entry should be cleaned up.
`checkMinSizeReduction()` involves the cacheGroup.minSizeReduction option: the minimum size reduction (in bytes) of the main chunk (bundle) required to generate a chunk. If splitting off a chunk would not reduce the main bundle by the given number of bytes, the chunk is not split off, even if it satisfies splitChunks.minSize.
```js
const isOverlap = (a, b) => {
  for (const item of a) {
    if (b.has(item)) return true;
  }
  return false;
};

// remove all modules from other entries and update size
for (const [key, info] of chunksInfoMap) {
  if (isOverlap(info.chunks, usedChunks)) {
    // update modules and total size
    // may remove it from the map when < minSize
    let updated = false;
    for (const module of item.modules) {
      if (info.modules.has(module)) {
        // remove module
        info.modules.delete(module);
        // update size
        for (const key of module.getSourceTypes()) {
          info.sizes[key] -= module.size(key);
        }
        updated = true;
      }
    }
    if (updated) {
      if (info.modules.size === 0) {
        chunksInfoMap.delete(key);
        continue;
      }
      if (
        removeMinSizeViolatingModules(info) ||
        !checkMinSizeReduction(
          info.sizes,
          info.cacheGroup.minSizeReduction,
          info.chunks.size
        )
      ) {
        chunksInfoMap.delete(key);
        continue;
      }
    }
  }
}
```
2.5 maxSize phase: re-splitting over-sized chunks according to maxSize
```js
compilation.hooks.optimizeChunks.tap(
  { name: "SplitChunksPlugin", stage: STAGE_ADVANCED },
  chunks => {
    logger.time("prepare");
    //...
    logger.timeEnd("prepare");

    logger.time("modules");
    for (const module of compilation.modules) {
      //...
    }
    logger.timeEnd("modules");

    logger.time("queue");
    for (const [key, info] of chunksInfoMap) {
      //...
    }
    while (chunksInfoMap.size > 0) {
      //...
    }
    logger.timeEnd("queue");

    logger.time("maxSize");
    for (const chunk of Array.from(compilation.chunks)) {
      //...
    }
    logger.timeEnd("maxSize");
  }
);
```
maxSize has a higher priority than maxInitialRequest/maxAsyncRequests. The actual priority is maxInitialRequest/maxAsyncRequests < maxSize < minSize.

Using maxSize tells webpack to try to split chunks bigger than maxSize bytes into smaller parts, each at least minSize in size (second only to maxSize).

The maxSize option is intended to be used with HTTP/2 and long-term caching: it increases the request count for better caching. It can also be used to decrease file sizes for faster rebuilds.

After the chunks produced by the cacheGroups above have been merged into the chunks formed by entries and async imports, the maxSize value is validated: if a generated chunk is too large, it has to be split again! For example, the configuration below declares maxSize: 50.
```js
splitChunks: {
  minSize: 1,
  chunks: 'all',
  maxInitialRequests: 10,
  maxAsyncRequests: 10,
  maxSize: 50,
  cacheGroups: {
    test3: {
      chunks: 'all',
      minChunks: 3,
      name: "test3",
      priority: 3
    },
    test2: {
      chunks: 'all',
      minChunks: 2,
      name: "test2",
      priority: 2
    }
  }
}
```
In the final output, where there would otherwise only be app1.js and app2.js, the maxSize limit splits them into three app1-xxx.js files and two app2-xxx.js files.
Open questions

- How does maxSize split a chunk? Is the split performed per NormalModule?
- If maxSize is very small, is there a limit on the number of pieces?
- How are the split files joined back together at runtime? Through runtime code, or through the chunkGroup, with different chunks inside the same chunkGroup?
- How does maxSize handle uneven splits? For example, when cutting into two parts, how does it guarantee both parts are larger than minSize yet smaller than maxSize?

The source-code analysis below is organized around these questions.
As the code below shows, `deterministicGroupingForModules` is used to split the chunk, yielding multiple results.

If results.length <= 1, no split is needed and nothing is done.

If results.length > 1:

- each split-off newPart must be inserted into the ChunkGroups of the original chunk
- each split chunk must be connected to its modules: chunkGraph.connectChunkAndModule(newPart, module)
- and the original big chunk must be disconnected from the modules now owned by newPart: chunkGraph.disconnectChunkAndModule(chunk, module)
```js
// node_modules/webpack/lib/optimize/SplitChunksPlugin.js
for (const chunk of Array.from(compilation.chunks)) {
  //...
  const results = deterministicGroupingForModules({ ... });
  if (results.length <= 1) {
    continue;
  }
  for (let i = 0; i < results.length; i++) {
    const group = results[i];
    //...
    if (i !== results.length - 1) {
      const newPart = compilation.addChunk(name);
      chunk.split(newPart);
      newPart.chunkReason = chunk.chunkReason;
      // Add all modules to the new chunk
      for (const module of group.items) {
        if (!module.chunkCondition(newPart, compilation)) {
          continue;
        }
        // Add module to new chunk
        chunkGraph.connectChunkAndModule(newPart, module);
        // Remove module from used chunks
        chunkGraph.disconnectChunkAndModule(chunk, module);
      }
    } else {
      // change the chunk to be a part
      chunk.name = name;
    }
  }
}
```
2.5.1 deterministicGroupingForModules: the core splitting function

The smallest unit of splitting is a NormalModule; if a single NormalModule is very large, it directly becomes a group of its own, i.e. a new chunk:
```js
const nodes = Array.from(
  items,
  item => new Node(item, getKey(item), getSize(item))
);

for (const node of nodes) {
  if (isTooBig(node.size, maxSize) && !isTooSmall(node.size, minSize)) {
    result.push(new Group([node], []));
  } else {
    //....
  }
}
```
If a single NormalModule is smaller than maxSize, it is added to initialNodes instead. (The isTooBig/isTooSmall predicates used here are sketched after the code block.)
```js
for (const node of nodes) {
  if (isTooBig(node.size, maxSize) && !isTooSmall(node.size, minSize)) {
    result.push(new Group([node], []));
  } else {
    initialNodes.push(node);
  }
}
```
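The `isTooBig`/`isTooSmall` predicates compare per-type size objects; a sketch consistent with how they are used here (based on the helpers of the same names in lib/util/deterministicGrouping.js; approximate):

```js
// true if any size type exceeds its maxSize entry
const isTooBig = (size, maxSize) => {
  for (const key of Object.keys(size)) {
    const s = size[key];
    if (s === 0) continue;
    const maxSizeValue = maxSize[key];
    if (typeof maxSizeValue === "number" && s > maxSizeValue) return true;
  }
  return false;
};

// true if any size type falls below its minSize entry
const isTooSmall = (size, minSize) => {
  for (const key of Object.keys(size)) {
    const s = size[key];
    if (s === 0) continue;
    const minSizeValue = minSize[key];
    if (typeof minSizeValue === "number" && s < minSizeValue) return true;
  }
  return false;
};
```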
initialNodes is then processed: each initialNodes[i] is smaller than maxSize, but several of them combined may not be, so the cases have to be handled separately:
```js
if (initialNodes.length > 0) {
  const initialGroup = new Group(initialNodes, getSimilarities(initialNodes));
  if (initialGroup.nodes.length > 0) {
    const queue = [initialGroup];
    while (queue.length) {
      const group = queue.pop();
      // Step 1: check whether the overall size still exceeds maxSize
      // Step 2: removeProblematicNodes(), then reprocess
      // Step 3: split off left and right parts so that leftSize > minSize && rightSize > minSize
      //   Step 3-1: check whether the split overlaps, i.e. left - 1 > right
      //   Step 3-2: check whether elements between left and right are not yet assigned to either side
      // Step 4: create separate Groups for the left and right ranges and push them back onto the queue
    }
  }
  // Step 5: assign keys and build the final result structure
}
```
Step 1: check whether any type exceeds maxSize

If no type's total size exceeds maxSize, there is no point splitting, and the group is pushed straight into the result.

Here group.size is the per-type size accumulated over all nodes of the group.
```js
if (initialNodes.length > 0) {
  const initialGroup = new Group(initialNodes, getSimilarities(initialNodes));
  if (initialGroup.nodes.length > 0) {
    const queue = [initialGroup];
    while (queue.length) {
      const group = queue.pop();
      // Step 1: check whether any type exceeds maxSize
      if (!isTooBig(group.size, maxSize)) {
        result.push(group);
        continue;
      }
      // Step 2: removeProblematicNodes() checks whether any type is below minSize
      // Step 3: split off left and right parts so that leftSize > minSize && rightSize > minSize
      //   Step 3-1: check whether the split overlaps, i.e. left - 1 > right
      //   Step 3-2: check whether elements between left and right are not yet assigned to either side
      // Step 4: create separate Groups for the left and right ranges and push them back onto the queue
    }
  }
}
```
Step 2: removeProblematicNodes() finds types below minSize

If some type is below minSize, it tries to extract the nodes of that type from the group, merge them into another Group, and then reprocess the group.

This function is used very frequently and shows up in several later flows, so it deserves a careful look; see Step 2 below.
```js
if (initialNodes.length > 0) {
  const initialGroup = new Group(initialNodes, getSimilarities(initialNodes));
  if (initialGroup.nodes.length > 0) {
    const queue = [initialGroup];
    while (queue.length) {
      const group = queue.pop();
      // Step 1: check whether any type exceeds maxSize
      if (!isTooBig(group.size, maxSize)) {
        result.push(group);
        continue;
      }
      // Step 2: removeProblematicNodes() checks whether any type is below minSize
      if (removeProblematicNodes(group)) {
        // This changed something, so we try this group again
        queue.push(group);
        continue;
      }
      // Step 3: split off left and right parts so that leftSize > minSize && rightSize > minSize
      //   Step 3-1: check whether the split overlaps, i.e. left - 1 > right
      //   Step 3-2: check whether elements between left and right are not yet assigned to either side
      // Step 4: create separate Groups for the left and right ranges and push them back onto the queue
    }
  }
}
```
getTooSmallTypes(): given the incoming size, e.g. size = {javascript: 125}, and minSize, e.g. minSize = {javascript: 61, unknown: 61}, it compares the two and returns the set of types that currently fail the minSize requirement, e.g. types = ["javascript"].
```js
const removeProblematicNodes = (group, consideredSize = group.size) => {
  const problemTypes = getTooSmallTypes(consideredSize, minSize);
  if (problemTypes.size > 0) {
    //...
    return true;
  } else return false;
};

const getTooSmallTypes = (size, minSize) => {
  const types = new Set();
  for (const key of Object.keys(size)) {
    const s = size[key];
    if (s === 0) continue;
    const minSizeValue = minSize[key];
    if (typeof minSizeValue === "number") {
      if (s < minSizeValue) types.add(key);
    }
  }
  return types;
};
```
From `getTooSmallTypes()` we obtain problemTypes, the array of types in the current group that fail minSize.

- getNumberOfMatchingSizeTypes(): given node.size and problemTypes, decides whether the node is a problem node, i.e. whether it contains types that fail minSize.

We use group.popNodes + getNumberOfMatchingSizeTypes() to extract the problem nodes problemNodes, and result + getNumberOfMatchingSizeTypes() to obtain possibleResultGroups, the groups that already satisfy minSize + maxSize on their own:
```js
const getNumberOfMatchingSizeTypes = (size, types) => {
  let i = 0;
  for (const key of Object.keys(size)) {
    if (size[key] !== 0 && types.has(key)) i++;
  }
  return i;
};

// for (const node of nodes) {
//   if (isTooBig(node.size, maxSize) && !isTooSmall(node.size, minSize)) {
//     result.push(new Group([node], []));
//   } else {
//     initialNodes.push(node);
//   }
// }
```
So why collect possibleResultGroups, the groups that already satisfy minSize + maxSize? From the code below we can see that, once we have this collection, we filter it further, picking the group that matches the problem types best, called bestGroup:
```js
const removeProblematicNodes = (group, consideredSize = group.size) => {
  const problemTypes = getTooSmallTypes(consideredSize, minSize);
  if (problemTypes.size > 0) {
    const problemNodes = group.popNodes(
      n => getNumberOfMatchingSizeTypes(n.size, problemTypes) > 0
    );
    if (problemNodes === undefined) return false;
    // Only merge it with result nodes that have the problematic size type
    const possibleResultGroups = result.filter(
      n => getNumberOfMatchingSizeTypes(n.size, problemTypes) > 0
    );
    if (possibleResultGroups.length > 0) {
      const bestGroup = possibleResultGroups.reduce((min, group) => {
        const minMatches = getNumberOfMatchingSizeTypes(min, problemTypes);
        const groupMatches = getNumberOfMatchingSizeTypes(group, problemTypes);
        if (minMatches !== groupMatches)
          return minMatches < groupMatches ? group : min;
        if (
          selectiveSizeSum(min.size, problemTypes) >
          selectiveSizeSum(group.size, problemTypes)
        )
          return group;
        return min;
      });
      for (const node of problemNodes) bestGroup.nodes.push(node);
      bestGroup.nodes.sort((a, b) => {
        if (a.key < b.key) return -1;
        if (a.key > b.key) return 1;
        return 0;
      });
    } else {
      //...
    }
    return true;
  } else return false;
};

// a method of Group
popNodes(filter) {
  const newNodes = [];
  const newSimilarities = [];
  const resultNodes = [];
  let lastNode;
  for (let i = 0; i < this.nodes.length; i++) {
    const node = this.nodes[i];
    if (filter(node)) {
      resultNodes.push(node);
    } else {
      if (newNodes.length > 0) {
        newSimilarities.push(
          lastNode === this.nodes[i - 1]
            ? this.similarities[i - 1]
            : similarity(lastNode.key, node.key)
        );
      }
      newNodes.push(node);
      lastNode = node;
    }
  }
  if (resultNodes.length === this.nodes.length) return undefined;
  this.nodes = newNodes;
  this.similarities = newSimilarities;
  this.size = sumSize(newNodes);
  return resultNodes;
}
```
These minSize-violating nodes are then merged into a group that already satisfies minSize + maxSize; the core is the single line below. Why do this?

```js
for (const node of problemNodes) bestGroup.nodes.push(node);
```
Because these nodes cannot satisfy minSize on their own — they are simply too small — merging them into the best candidate group that can absorb them fulfils the minSize they need.

Of course, there may be no group that can absorb them, in which case a new Group() has to be created for them:

```js
if (possibleResultGroups.length > 0) {
  //...
} else {
  // There are no other nodes with the same size types
  // We create a new group and have to accept that it's smaller than minSize
  result.push(new Group(problemNodes, null));
}
return true;
```
Summary

`removeProblematicNodes()` takes a group and a consideredSize, where consideredSize is an object compared against the minSize object to obtain the problem-type array problemTypes. It then checks whether a set of nodes of those problemTypes can be extracted from the group and merged into one of the finished result groups, or into a newly created Group():

- if the extracted node set equals the whole group, it returns false and performs no merge or creation
- if no type in the group is below minSize, it returns false and performs no merge or creation
- if the group's problem-type nodes cannot be absorbed by any existing result group, they are put into a new Group()
Open questions

- After bestGroup.nodes.push(node), can bestGroup end up exceeding maxSize?
Step 3: split off left and right parts such that leftSize > minSize && rightSize > minSize

Step 1 checks whether the overall size triggers a maxSize split, while Step 2 checks whether some types fail minSize; if so, they are merged into another group or a new Group(), and in either case the old/new group is pushed back onto the queue for reprocessing.

After steps 1 and 2 have dealt with minSize, step 3 starts assembling the left and right regions, requiring both to satisfy minSize:
```js
//      left v       v right
// [ O O O ] O O O [ O O O ]
// ^^^^^^^^^ leftSize
//       rightSize ^^^^^^^^^
// leftSize > minSize
// rightSize > minSize

//                        r   l
// Perfect split: [ O O O ] [ O O O ]
//                right === left - 1
```
```js
let left = 1;
let leftSize = Object.create(null);
addSizeTo(leftSize, group.nodes[0].size);
while (left < group.nodes.length && isTooSmall(leftSize, minSize)) {
  addSizeTo(leftSize, group.nodes[left].size);
  left++;
}
let right = group.nodes.length - 2;
let rightSize = Object.create(null);
addSizeTo(rightSize, group.nodes[group.nodes.length - 1].size);
while (right >= 0 && isTooSmall(rightSize, minSize)) {
  addSizeTo(rightSize, group.nodes[right].size);
  right--;
}
```
After this scan there are three possibilities:

- right === left - 1: a perfect split; nothing more to do
- right < left - 1: the two regions overlap
- right > left - 1: there is an unassigned region between the two
right < left - 1
Compare the current left and right positions, take the side that covers more nodes, and subtract the size of its outermost node; the resulting prevSize necessarily fails minSize, because as analysed above the scans kept calling addSizeTo exactly until leftArea and rightArea each satisfied minSize.
`removeProblematicNodes()` is then called with the current group and prevSize; comparing prevSize with minSize yields the problem-type array problemTypes, from which a sub-collection is extracted (part of the group, or undefined).

If what problemTypes selects is the entire group, the subset cannot be merged into other chunks, and removeProblematicNodes() returns false.

If the subset is a proper part of the group, it is merged into the most suitable of the already-formed result groups (the one matching the most problem types), and the remaining minSize-satisfying part of the group is pushed back onto the queue for reprocessing:
```js
if (left - 1 > right) {
  let prevSize;
  // decide which side currently covers more nodes
  // a b c d e f g
  //     r   l
  if (right < group.nodes.length - left) {
    subtractSizeFrom(rightSize, group.nodes[right + 1].size);
    prevSize = rightSize;
  } else {
    subtractSizeFrom(leftSize, group.nodes[left - 1].size);
    prevSize = leftSize;
  }
  if (removeProblematicNodes(group, prevSize)) {
    queue.push(group);
    continue;
  }
  // can't split group while holding minSize
  // because minSize is preferred of maxSize we return
  // the problematic nodes as result here even while it's too big
  // To avoid this make sure maxSize > minSize * 3
  result.push(group);
  continue;
}
```
From step 1 we know the group definitely has a type exceeding maxSize, and after step 2's removeProblematicNodes() we either could not peel off the below-minSize data, or there was none to peel.

left - 1 > right means step 2 could not remove the below-minSize data, so step 3 tries once more. If that also fails, then because the priority is minSize > maxSize, even though the group has a type above maxSize, a forced leftArea/rightArea split could never satisfy minSize; maxSize is therefore ignored and a single chunk is created for the whole group.
A concrete example

From the example below, the chunk app1 exceeds maxSize = 124 but does satisfy minSize. Forcibly splitting it into two chunks might satisfy maxSize, but minSize could no longer be satisfied; since the priority is minSize > maxSize, maxSize is abandoned in favour of minSize.

The example below is just one fairly common case; there are certainly others. Because of limited time, the author did not dig further into this branch; please refer to other articles for a deeper treatment of the left - 1 > right handling in step 3.
right > left - 1
There is an unassigned region between the two areas; similarity is used to find the best split point, cutting at the smallest similarity and dividing the nodes into two halves.

pos starts at left and increases within [left, right], with rightSize being the sum over [pos, nodes.length - 1]. While pos increases, leftSize keeps growing and rightSize keeps shrinking, checking against minSize at each step; via group.similarities the smallest value is found, i.e. the two adjacent positions that are least similar (whose file paths differ the most), and the cut is made there:
```js
if (left <= right) {
  let best = -1;
  let bestSimilarity = Infinity;
  let pos = left;
  let rightSize = sumSize(group.nodes.slice(pos));
  while (pos <= right + 1) {
    const similarity = group.similarities[pos - 1];
    if (
      similarity < bestSimilarity &&
      !isTooSmall(leftSize, minSize) &&
      !isTooSmall(rightSize, minSize)
    ) {
      best = pos;
      bestSimilarity = similarity;
    }
    addSizeTo(leftSize, group.nodes[pos].size);
    subtractSizeFrom(rightSize, group.nodes[pos].size);
    pos++;
  }
  if (best < 0) {
    result.push(group);
    continue;
  }
  left = best;
  right = best - 1;
}
```
So what does group.similarities[pos - 1] mean?

For two adjacent node.keys, similarity() compares them character by character; per character:

- two characters as close as possible, e.g. |ca - cb| = 5: contributes 10 - Math.abs(ca - cb) = 5
- two more distant characters, e.g. |ca - cb| = 6: contributes 10 - Math.abs(ca - cb) = 4
- two wildly different characters, where 10 - Math.abs(ca - cb) < 0: contributes Math.max(0, 10 - Math.abs(ca - cb)) = 0

Hence the closer two adjacent node.keys are, the larger similarities[x] is:
```js
const initialGroup = new Group(initialNodes, getSimilarities(initialNodes));

const getSimilarities = nodes => {
  // calculate similarities between lexically adjacent nodes
  /** @type {number[]} */
  const similarities = [];
  let last = undefined;
  for (const node of nodes) {
    if (last !== undefined) {
      similarities.push(similarity(last.key, node.key));
    }
    last = node;
  }
  return similarities;
};

const similarity = (a, b) => {
  const l = Math.min(a.length, b.length);
  let dist = 0;
  for (let i = 0; i < l; i++) {
    const ca = a.charCodeAt(i);
    const cb = b.charCodeAt(i);
    dist += Math.max(0, 10 - Math.abs(ca - cb));
  }
  return dist;
};
```
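A tiny hand-computed check of that scoring (short strings for clarity): identical characters contribute 10 each, nearby characters slightly less, distant characters nothing.

```js
similarity("abc", "abc"); // 10 + 10 + 10 = 30 -> identical keys score highest
similarity("abc", "abd"); // 10 + 10 + (10 - 1) = 29 -> one close character
similarity("a1", "az");   // 10 + Math.max(0, 10 - Math.abs(49 - 122)) = 10
```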
And at the very start the nodes were sorted by node.key:

```js
const initialNodes = [];
// lexically ordering of keys
nodes.sort((a, b) => {
  if (a.key < b.key) return -1;
  if (a.key > b.key) return 1;
  return 0;
});
```

So using similarity to find the best split point — cutting at the smallest similarity into two halves — means finding the index where two adjacent node keys differ the most.
A concrete example

How is node.key generated?

From the getKey() code below: first the relative path is obtained, name = "./src/entry1.js"; then hashFilename() produces the corresponding hash value; they are concatenated into the path fullKey = "./src/entry1.js-6a89fa05"; finally requestToId() converts it into key = "src_entry1_js-6a89fa05". (A sketch of hashFilename() follows the code block.)
```js
// node_modules/webpack/lib/optimize/SplitChunksPlugin.js
const results = deterministicGroupingForModules({
  //...
  getKey(module) {
    const cache = getKeyCache.get(module);
    if (cache !== undefined) return cache;
    const ident = cachedMakePathsRelative(module.identifier());
    const nameForCondition =
      module.nameForCondition && module.nameForCondition();
    const name = nameForCondition
      ? cachedMakePathsRelative(nameForCondition)
      : ident.replace(/^.*!|\?[^?!]*$/g, "");
    const fullKey =
      name + automaticNameDelimiter + hashFilename(ident, outputOptions);
    const key = requestToId(fullKey);
    getKeyCache.set(module, key);
    return key;
  }
});
```
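The `hashFilename()` used above is a small helper that hashes the module identifier and keeps the first 8 hex characters (a sketch based on the helper of the same name in SplitChunksPlugin.js; the require path points at an internal webpack util and may change between versions):

```js
const createHash = require("webpack/lib/util/createHash");

const hashFilename = (name, outputOptions) => {
  const digest = createHash(outputOptions.hashFunction)
    .update(name)
    .digest("hex");
  return digest.slice(0, 8);
};
```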
In essence this finds the point where the file paths differ the most? For instance, if 5 modules live in folder a and 3 modules live in folder b, is that the cut point, with folder a becoming leftArea and folder b becoming rightArea??

Strings are compared character by character from the start, one position at a time; whichever character is greater makes the string greater, e.g.:

- "Z" > "A"
- "ABC" > "ABA"
- "ABC" < "AC"
- "ABC" > "AB"
Let's directly simulate a series of nodes, manually fixing left and right and removing the corresponding leftSize and rightSize.

size only matters for checking whether the current split satisfies minSize; the example below focuses on simulating how similarity finds the best split point (cutting at the smallest similarity into two halves), so size is ignored for now:
```js
class Group {
  constructor(nodes, similarities, size) {
    this.nodes = nodes;
    this.similarities = similarities;
    this.key = undefined;
  }
}

const getSimilarities = nodes => {
  const similarities = [];
  let last = undefined;
  for (const node of nodes) {
    if (last !== undefined) {
      similarities.push(similarity(last.key, node.key));
    }
    last = node;
  }
  return similarities;
};

const similarity = (a, b) => {
  const l = Math.min(a.length, b.length);
  let dist = 0;
  for (let i = 0; i < l; i++) {
    const ca = a.charCodeAt(i);
    const cb = b.charCodeAt(i);
    dist += Math.max(0, 10 - Math.abs(ca - cb));
  }
  return dist;
};

function test() {
  const initialNodes = [
    { key: "src2_entry1_js-6a89fa02" },
    { key: "src3_entry2_js-3a33ff02" },
    { key: "src2_entry3_js-6aaafa01" },
    { key: "src1_entry0_js-ea33aa12" },
    { key: "src1_entry1_js-6a89fa02" },
    { key: "src1_entry2_js-ea33aa13" },
    { key: "src1_entry3_js-ea33aa14" }
  ];
  initialNodes.sort((a, b) => {
    if (a.key < b.key) return -1;
    if (a.key > b.key) return 1;
    return 0;
  });
  const initialGroup = new Group(initialNodes, getSimilarities(initialNodes));
  console.info(initialGroup);
  let left = 1;
  let right = 4;
  if (left <= right) {
    let best = -1;
    let bestSimilarity = Infinity;
    let pos = left;
    while (pos <= right + 1) {
      const similarity = initialGroup.similarities[pos - 1];
      if (similarity < bestSimilarity) {
        best = pos;
        bestSimilarity = similarity;
      }
      pos++;
    }
    left = best;
    right = best - 1;
  }
  console.warn("left", left);
  console.warn("right", right);
}
test();
```
The final output is shown below: the similarities between files in different folders are the smallest, so the nodes are split into left and right regions along the folder boundary.

Although the behaviour here looks like splitting by folder, that is not guaranteed in general. The author has not investigated in depth why similarities is used as the splitting criterion; please refer to other articles. This example is only a personal interpretation of the right > left - 1 flow.
Step 4: create separate Groups for the left and right ranges and push them onto the queue

Based on the left and right determined in the steps above, new Group()s are created for leftArea and rightArea and pushed onto the queue; each of the two halves is then processed again to see whether it needs further splitting:
```js
const rightNodes = [group.nodes[right + 1]];
/** @type {number[]} */
const rightSimilarities = [];
for (let i = right + 2; i < group.nodes.length; i++) {
  rightSimilarities.push(group.similarities[i - 1]);
  rightNodes.push(group.nodes[i]);
}
queue.push(new Group(rightNodes, rightSimilarities));

const leftNodes = [group.nodes[0]];
/** @type {number[]} */
const leftSimilarities = [];
for (let i = 1; i < left; i++) {
  leftSimilarities.push(group.similarities[i - 1]);
  leftNodes.push(group.nodes[i]);
}
queue.push(new Group(leftNodes, leftSimilarities));
```
Step 5: assign keys and return the final data structure
```js
result.sort((a, b) => {
  if (a.nodes[0].key < b.nodes[0].key) return -1;
  if (a.nodes[0].key > b.nodes[0].key) return 1;
  return 0;
});

// give every group a name
const usedNames = new Set();
for (let i = 0; i < result.length; i++) {
  const group = result[i];
  if (group.nodes.length === 1) {
    group.key = group.nodes[0].key;
  } else {
    const first = group.nodes[0];
    const last = group.nodes[group.nodes.length - 1];
    const name = getName(first.key, last.key, usedNames);
    group.key = name;
  }
}

// return the results
return result.map(group => {
  return {
    key: group.key,
    items: group.nodes.map(node => node.item),
    size: group.size
  };
});
```
2.6 A full example

2.6.1 Forming the new chunk: test3

In the concrete example above, the initial chunksInfoMap is shown below. `compareEntries()` picks out bestEntry = the test3 cacheGroup, and after the series of parameter checks we start checking whether the info.chunks of the other chunksInfoMap[j] contain chunks of the highest-priority chunksInfoMap[i].

bestEntry = test3 owns the chunks app1, app2, app3, app4, which already covers all entry chunks, so every chunksInfoMap[j] has to compare its info.modules against item.modules and delete the matching info.modules[i].
After the top-priority cacheGroup test3 is processed, the minChunks = 3 modules common__g, js-cookie and voca are put into newChunk and removed from the other cacheGroups.

Then the code below runs, deleting keys from chunksInfoMap:
```js
if (info.modules.size === 0) {
  chunksInfoMap.delete(key);
  continue;
}
```
In the end only five keys remain in chunksInfoMap, as shown below.

2.6.2 Forming the new chunk: test2
After chunk test3 has been split off, the next loop iteration begins, and `compareEntries()` picks out the cacheGroup for bestEntry = test2.

After going through

- isExistingChunk
- maxInitialRequests and maxAsyncRequests

it reaches the chunkGraph.isModuleInChunk step:
```js
outer: for (const chunk of usedChunks) {
  for (const module of item.modules) {
    if (chunkGraph.isModuleInChunk(module, chunk)) continue outer;
  }
  usedChunks.delete(chunk);
}
```
The figure below shows that in the current bestEntry = test2 only loadsh remains in modules, while chunks still contains app1, app2, app3, app4.

The next figure shows that loadsh only owns app1 and app2, so the code block above triggers usedChunks.delete(chunk) and removes app3 and app4.
Why did cacheGroup = test2 own app3 and app4 in the first place?

Because during the modules phase (iterating compilation.modules and building chunksInfoMap per cacheGroup), every module is traversed and tested against every cacheGroup; anything satisfying cacheGroup.minChunks = 2 was added to cacheGroup = test2.
And why does cacheGroup = test2 no longer correspond to app3 and app4?

Because cacheGroup = test3 has higher priority and already absorbed the modules common__g, js-cookie and voca into chunk test3, leaving cacheGroup = test2 with only the module loadsh; loadsh only needs the two chunks app1 and app2, so the now-useless app3 and app4 have to be deleted.
After the chunkGraph.isModuleInChunk step comes the usedChunks.size < item.chunks.size step. Since the previous step removed two elements from usedChunks, the condition usedChunks.size < item.chunks.size holds, and the current bestEntry is re-added to chunksInfoMap to be processed again:
```js
// Were some (invalid) chunks removed from usedChunks?
// => readd all modules to the queue, as things could have been changed
if (usedChunks.size < item.chunks.size) {
  if (isExistingChunk) usedChunks.add(newChunk);
  if (usedChunks.size >= item.cacheGroup.minChunks) {
    const chunksArr = Array.from(usedChunks);
    for (const module of item.modules) {
      addModuleToChunksInfoMap(
        item.cacheGroup,
        item.cacheGroupIndex,
        chunksArr,
        getKey(usedChunks),
        module
      );
    }
  }
  continue;
}
```
After re-adding, chunksInfoMap looks like the following: test2 has only one module left, plus its two corresponding chunks.

The new-chunk logic for test2 is then triggered once more.
2.6.3 Triggering the creation of the new chunk test2 again
The whole flow runs again:

- isExistingChunk
- maxInitialRequests and maxAsyncRequests
- chunkGraph.isModuleInChunk
- usedChunks.size < item.chunks.size no longer holds
- the minRemainingSize check passes

and finally the create-newChunk and chunk.split(newChunk) logic is triggered:
```js
// Create the new chunk if not reusing one
if (newChunk === undefined) {
  newChunk = compilation.addChunk(chunkName);
}
// Walk through all chunks
for (const chunk of usedChunks) {
  // Add graph connections for splitted chunk
  chunk.split(newChunk);
}
```
Then the removal of info.modules[i] from the other chunksInfoMap items runs:
```js
const isOverlap = (a, b) => {
  for (const item of a) {
    if (b.has(item)) return true;
  }
  return false;
};

// remove all modules from other entries and update size
for (const [key, info] of chunksInfoMap) {
  if (isOverlap(info.chunks, usedChunks)) {
    // update modules and total size
    // may remove it from the map when < minSize
    let updated = false;
    for (const module of item.modules) {
      if (info.modules.has(module)) {
        // remove module
        info.modules.delete(module);
        // update size
        for (const key of module.getSourceTypes()) {
          info.sizes[key] -= module.size(key);
        }
        updated = true;
      }
    }
    if (updated) {
      if (info.modules.size === 0) {
        chunksInfoMap.delete(key);
        continue;
      }
      if (
        removeMinSizeViolatingModules(info) ||
        !checkMinSizeReduction(
          info.sizes,
          info.cacheGroup.minSizeReduction,
          info.chunks.size
        )
      ) {
        chunksInfoMap.delete(key);
        continue;
      }
    }
  }
}
```
The figure below shows that app1 and app2 need to be removed, so all the remaining chunksInfoMap items are deleted. This concludes the whole queue phase (iterating chunksInfoMap and reorganizing chunks by the rules), having formed two new chunks: test3 and test2.

After the queue phase comes the maxSize phase.

2.6.4 Checking whether maxSize is configured and chunks need further splitting

See the detailed example in the maxSize phase above; it is not repeated here.
3. codeGeneration: module transpilation

For reasons of length, the detailed analysis is in the next article, 《「Webpack5源码」seal阶段剖析(三)》.
References
- 精读 Webpack SplitChunksPlugin 插件源码
Other engineering articles
- 「Webpack5源码」热更新HRM流程浅析
- 「Webpack5源码」make阶段(流程图)剖析
- 「Webpack5源码」enhanced-resolve门路解析库源码剖析
- 「vite4源码」dev模式整体流程浅析(一)
- 「vite4源码」dev模式整体流程浅析(二)