Deep Learning: Logistic Regression and Vectorized Programming, Full Markdown Notes and Code Shared


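This series of Markdown notes (shared) mainly covers deep learning. It aims to help readers master the fundamentals of machine learning, such as classification and regression (with code), and become proficient with frameworks such as numpy, pandas, and sklearn. On the algorithm side, it covers the mathematical principles of neural networks and implements a simple neural network structure by hand. On the application side, it covers the TensorFlow framework and image-related neural network case studies. Topics include: TensorFlow's dataflow graph structure, neural networks and tf.keras, convolutional neural networks (CNN), an introduction to the product object detection project, YOLO and SSD, and training on the product detection dataset with model export and deployment.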

The full set of notes and code is available from the gitee repository.

Feel free to take a copy if you are interested; likes and shares are welcome.


9 chapters and 60 submodules in total

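Introduction to TensorFlow

Describe TensorFlow's dataflow graph structure
Use TensorFlow to operate on graphs
Explain the role of sessions in a TensorFlow program
Use TensorFlow to create tensors and to modify their shape and type
Use Variable to create variable ops
Use TensorBoard to display the graph structure and tensor values
Use tf.train.Saver to save and load TensorFlow models
Use tf.app.flags to add and use command-line arguments
Implement linear regression with TensorFlow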

1.2 Neural Network Basics

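Learning Objectives

  • Objectives

    • Know how logistic regression computes its output, and know its loss function
    • Know the computation graph for derivatives
    • Know the gradient descent algorithm for logistic regression
    • Know vectorized computation over multiple samples
  • Applications

    • Implement vectorized operations
    • Implement the structure of a single-neuron neural network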

1.2.1 Logistic Regression

1.2.1.1 Logistic Regression

Logistic regression is an algorithm mainly used for binary classification. Given an input $x$, it outputs the predicted probability that the sample belongs to the class labeled 1: $\hat{y}=P(y=1|x)$.

The parameters used in logistic regression are as follows:

The function involving $e^{-z}$ is as follows

For example:
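A minimal sketch of how the prediction is computed, assuming the standard formulation $\hat{y} = \sigma(w^T x + b)$ with the sigmoid $\sigma(z) = \frac{1}{1+e^{-z}}$ (the same $\sigma$ that appears in the derivative table in 1.2.3.1). The helper names `sigmoid` and `predict` and the sample values are illustrative assumptions, not part of the original notes.

```python
import math

def sigmoid(z):
    """Map any real z into (0, 1): sigma(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Return y_hat = sigma(w . x + b) for one sample x (hypothetical helper)."""
    z = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    return sigmoid(z)

# Illustrative values only (assumptions, not taken from the notes).
w = [0.5, -0.25]
b = 0.0
x = [2.0, 4.0]

y_hat = predict(w, b, x)  # z = 0.5*2 - 0.25*4 + 0 = 0, so y_hat = 0.5
print(y_hat)
```

A prediction of 0.5 means the model considers both classes equally likely; larger positive $z$ pushes $\hat{y}$ toward 1 and larger negative $z$ pushes it toward 0.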

1.2.1.2 Logistic Regression Loss Function

A loss function measures the error between the predicted result and the true value. The simplest definition is the squared error loss:

$$L(\hat{y},y) = \frac{1}{2}(\hat{y}-y)^2$$

Logistic regression generally uses $L(\hat{y},y) = -(y\log\hat{y})-(1-y)\log(1-\hat{y})$

How to read this formula:

  • If y = 1, the loss is $-\log\hat{y}$. For the loss to be smaller, $\hat{y}$ must be larger, that is, approach or equal 1
  • If y = 0, the loss is $-\log(1-\hat{y})$. For the loss to be smaller, $\hat{y}$ must be smaller, that is, approach or equal 0
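The two cases above can be checked numerically. This is a minimal sketch; the function name `log_loss` and the sample probabilities are illustrative assumptions.

```python
import math

def log_loss(y_hat, y):
    """Cross-entropy loss L(y_hat, y) = -y*log(y_hat) - (1-y)*log(1-y_hat).

    y_hat must lie strictly between 0 and 1, otherwise log() is undefined.
    """
    return -y * math.log(y_hat) - (1 - y) * math.log(1 - y_hat)

# y = 1: the loss shrinks as y_hat approaches 1.
for y_hat in (0.1, 0.5, 0.9, 0.99):
    print(f"y=1, y_hat={y_hat}: loss={log_loss(y_hat, 1):.4f}")

# y = 0: the loss shrinks as y_hat approaches 0.
for y_hat in (0.9, 0.5, 0.1, 0.01):
    print(f"y=0, y_hat={y_hat}: loss={log_loss(y_hat, 0):.4f}")
```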

The loss function is defined on a single training sample and measures performance on that one sample. The cost function measures performance over the entire training set, that is, how well the parameters w and b do; it is the average of the losses over all training samples:

$$J(w,b) = \frac{1}{m}\sum_{i=1}^m L(\hat{y}^{(i)},y^{(i)})$$
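A minimal sketch of the cost as the mean of the per-sample losses. The function names and the small set of predictions and labels are made up for illustration.

```python
import math

def log_loss(y_hat, y):
    """Per-sample loss: -y*log(y_hat) - (1-y)*log(1-y_hat)."""
    return -y * math.log(y_hat) - (1 - y) * math.log(1 - y_hat)

def cost(y_hats, ys):
    """Cost J = (1/m) * sum of the per-sample losses over all m samples."""
    m = len(ys)
    return sum(log_loss(y_hat, y) for y_hat, y in zip(y_hats, ys)) / m

# Hypothetical predictions and labels for m = 4 samples.
y_hats = [0.9, 0.2, 0.7, 0.4]
ys = [1, 0, 1, 0]

print(cost(y_hats, ys))
```

Because the cost averages over all samples, it depends on w and b only through the predictions $\hat{y}^{(i)}$; changing the parameters changes every term of the sum at once.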

1.2.2 Gradient Descent

Goal: find the minimum value of the loss function

Method: gradient descent

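The gradient of a function points in the direction in which the function increases most steeply. Moving along the gradient, the function grows fastest; moving along the negative gradient, the function value drops fastest. The training objective is to find suitable w and b that minimize the cost function. Assuming w and b are both one-dimensional real numbers, J can be plotted against w and b as follows:

As can be seen, the cost function J is a convex function; unlike a non-convex function, it does not have multiple local minima.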

The update formulas for the parameters w and b are:

$w := w - \alpha\frac{dJ(w, b)}{dw}$, $b := b - \alpha\frac{dJ(w, b)}{db}$

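Note: α is the learning rate, which controls the step size of each update to w. When w is larger than the optimal value w′, the derivative is greater than 0, so w is updated toward smaller values. Conversely, when w is smaller than the optimal value w′, the derivative is less than 0, so w is updated toward larger values. Iterate until convergence.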
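The update rule can be sketched on a simple convex function. The cost $J(w, b) = (w-3)^2 + (b+1)^2$ used here is a stand-in chosen for illustration (its minimum is at $w = 3$, $b = -1$); it is not the logistic regression cost, and the learning rate and step count are assumptions.

```python
def grad_J(w, b):
    """Partial derivatives of the stand-in cost J(w, b) = (w - 3)^2 + (b + 1)^2."""
    dJ_dw = 2.0 * (w - 3.0)
    dJ_db = 2.0 * (b + 1.0)
    return dJ_dw, dJ_db

def gradient_descent(w, b, alpha=0.1, steps=200):
    """Repeat w := w - alpha * dJ/dw and b := b - alpha * dJ/db."""
    for _ in range(steps):
        dJ_dw, dJ_db = grad_J(w, b)
        w = w - alpha * dJ_dw
        b = b - alpha * dJ_db
    return w, b

# w starts above its optimum (positive derivative, so w decreases);
# b starts below its optimum (negative derivative, so b increases).
w, b = gradient_descent(w=10.0, b=-5.0)
print(w, b)  # close to 3 and -1
```

With this cost, each step shrinks the distance to the optimum by a constant factor of $1 - 2\alpha$, which is why a learning rate that is too large (here, above 1) would make the iterates diverge instead.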

Understanding the gradient descent process on a plane:

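1.2.3 Derivatives

Having understood the gradient descent process, we now use examples to illustrate what the derivative computed in gradient descent means.

1.2.3.1 Derivatives

A derivative can also be understood as the slope at a given point. The word slope is more intuitive.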

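  • The derivative is the same at every point

Here we have a straight line with a slope of 4. Let's work through an example

Example: take the point a = 2, where y = 8. Increase a slightly to a = 2.001, and y becomes 8.004. That is, when a increases by 0.001, y increases by 0.004, which is 4 times as much

So the slope can be understood as follows: when a point shifts by an infinitesimally small amount, the output increases by 4 times that amount.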

This can be written as $\frac{df(a)}{da}$ or $\frac{d}{da}f(a)$

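  • The derivative is not the same at every point

Example: take the point a = 2, where y = 4. Increase a slightly to a = 2.001, and y is approximately 4.004 (4.004001). That is, when a increases by 0.001, y increases by about 4 times as much

Take the point a = 5, where y = 25. Increase a slightly to a = 5.001, and y is approximately 25.01 (25.010001). That is, when a increases by 0.001, y increases by about 10 times as much

We can conclude that the derivative of this function, $a^2$, is $2a$.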
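Both examples can be reproduced with a finite-difference approximation. This is a minimal sketch; the helper name `slope` is an assumption, the step 0.001 matches the text, and the two functions $f(a) = 4a$ and $f(a) = a^2$ are inferred from the values in the examples.

```python
def slope(f, a, h=0.001):
    """Approximate the derivative of f at a by (f(a + h) - f(a)) / h."""
    return (f(a + h) - f(a)) / h

def line(a):
    return 4 * a      # straight line: the slope is 4 everywhere

def square(a):
    return a ** 2     # derivative 2a, which depends on a

print(slope(line, 2))    # about 4
print(slope(line, 5))    # about 4
print(slope(square, 2))  # about 4.001, close to 2a = 4
print(slope(square, 5))  # about 10.001, close to 2a = 10
```

The small excess of 0.001 for the square function comes from the finite step; it vanishes as the step shrinks toward zero, which is what the derivative captures.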

  • Derivatives of more functions

| Function | Derivative |
| --- | --- |
| $f(a) = a^2$ | $2a$ |
| $f(a) = a^3$ | $3a^2$ |
| $f(a) = \ln(a)$ | $\frac{1}{a}$ |
| $f(a) = e^a$ | $e^a$ |
| $\sigma(z) = \frac{1}{1+e^{-z}}$ | $\sigma(z)(1-\sigma(z))$ |
| $g(z) = \tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}}$ | $1-(\tanh(z))^2 = 1-(g(z))^2$ |
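
The derivative formulas listed above can be sanity-checked against a numerical estimate. The sketch below is only an illustration: the helper `numeric_derivative`, the test point `x = 0.7` and the use of a central difference are choices made here, not part of the original notes.

```python
import math


def sigmoid(z):
    """Logistic function: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def numeric_derivative(func, x, eps=1e-6):
    """Central-difference estimate of the derivative of `func` at `x`."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)


# Each entry: (label, function, analytic derivative from the list above).
table = [
    ("a^2", lambda a: a ** 2, lambda a: 2 * a),
    ("a^3", lambda a: a ** 3, lambda a: 3 * a ** 2),
    ("ln(a)", math.log, lambda a: 1 / a),
    ("e^a", math.exp, math.exp),
    ("sigmoid(z)", sigmoid, lambda z: sigmoid(z) * (1 - sigmoid(z))),
    ("tanh(z)", math.tanh, lambda z: 1 - math.tanh(z) ** 2),
]

x = 0.7  # arbitrary positive test point (ln requires x > 0)
max_error = 0.0
for label, func, analytic in table:
    error = abs(numeric_derivative(func, x) - analytic(x))
    max_error = max(max_error, error)
    print(f"{label:>10}: absolute error = {error:.2e}")
```

Each absolute error should be tiny, which suggests the analytic formulas agree with the slope measured directly from the function.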
1.2.3.2 Computation graph for derivatives

Next, let's look at the derivative computation graph for a function of several variables. Suppose $J(a,b,c) = 3(a + bc)$.

We represent it with the following computation graph, using the intermediate variables $u = bc$, $v = a + u$ and $J = 3v$:

This amounts to computing the result from left to right, then computing the derivatives from right to left.

  • Computing the derivatives

Question: how do we compute the derivatives of J with respect to the three variables a, b and c?

Suppose b = 4, c = 2, a = 7, so that u = 8, v = 15, J = 45

  • $\frac{dJ}{dv}=3$

Increase v from 15 to 15.001; then $J \approx 45.003$

  • $\frac{dJ}{da}=3$

Increase a from 7 to 7.001; then $v = 15.001$ and $J \approx 45.003$

This is an instance of the chain rule

1.2.3.3 Chain rule
  • $\frac{dJ}{da}=\frac{dJ}{dv}\frac{dv}{da}=3 \times 1=3$

The change in J caused by a change in a can be understood as (the change in J with respect to v) times (the change in v with respect to a)

Next we compute

  • $\frac{dJ}{db}=\frac{dJ}{du}\frac{du}{db}=3 \times 2=6$, since $u = bc$ gives $\frac{du}{db}=c=2$
  • $\frac{dJ}{dc}=\frac{dJ}{du}\frac{du}{dc}=3 \times 4=12$, since $u = bc$ gives $\frac{du}{dc}=b=4$
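
The forward and backward passes for $J(a,b,c) = 3(a + bc)$ can be sketched in a few lines. This is a minimal illustration, assuming the intermediate variables $u = bc$ and $v = a + u$ and the values a = 7, b = 4, c = 2 from the example; the function names `forward` and `backward` are hypothetical and chosen here for readability.

```python
def forward(a, b, c):
    """Left-to-right pass through the graph: u = b*c, v = a + u, J = 3*v."""
    u = b * c
    v = a + u
    J = 3 * v
    return u, v, J


def backward(a, b, c):
    """Right-to-left pass: apply the chain rule one node at a time."""
    dJ_dv = 3.0            # J = 3v
    dJ_du = dJ_dv * 1.0    # v = a + u, so dv/du = 1
    dJ_da = dJ_dv * 1.0    # v = a + u, so dv/da = 1
    dJ_db = dJ_du * c      # u = b*c, so du/db = c
    dJ_dc = dJ_du * b      # u = b*c, so du/dc = b
    return dJ_da, dJ_db, dJ_dc


a, b, c = 7.0, 4.0, 2.0
u, v, J = forward(a, b, c)
dJ_da, dJ_db, dJ_dc = backward(a, b, c)
print("forward:", u, v, J)
print("backward:", dJ_da, dJ_db, dJ_dc)

# Cross-check one gradient by nudging c, in the style of the worked example.
eps = 0.001
nudged_J = forward(a, b, c + eps)[2]
print("nudge estimate of dJ/dc:", (nudged_J - J) / eps)
```

Because $u = bc$, the derivative of $u$ with respect to $c$ is $b$, which is 4 for these values, so the nudge estimate and the chain-rule result for $\frac{dJ}{dc}$ should both be close to 12.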
1.2.3.4 Gradient descent for logistic regression

The computation graph for gradient descent in logistic regression starts with the forward pass, which gives the following

  • $z = w^Tx + b$
  • $\hat{y} = a = \sigma(z)$
  • $L(\hat{y},y) = -(y\log{a})-(1-y)\log(1-a)$
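
The three forward-pass formulas above can be traced for a single sample as follows. This is a sketch under stated assumptions: the weights, bias, features and label are made-up illustrative values, and the function name `forward` is hypothetical rather than taken from the original notes.

```python
import math


def sigmoid(z):
    """Logistic function: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(w, b, x, y):
    """Forward pass for one sample: z = w^T x + b, a = sigmoid(z), then the loss."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    a = sigmoid(z)
    loss = -(y * math.log(a)) - (1 - y) * math.log(1 - a)
    return z, a, loss


# Illustrative values only: one sample with two features and label y = 1.
w = [0.5, -0.25]
b = 0.1
x = [1.0, 2.0]
y = 1

z, a, loss = forward(w, b, x, y)
print(f"z = {z:.4f}, a = {a:.4f}, loss = {loss:.4f}")
```

With label y = 1 the second term of the loss vanishes, so the loss reduces to $-\log{a}$ and shrinks as the predicted probability $a$ approaches 1.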

The forward pass of the computation graph is as follows, assuming each sample has two features

Question: compute the derivative of $J$ with respect to $z$

  • $dz = \frac{dJ}{da}\frac{da}{dz} = a-y$

    • $\frac{dJ}{da} = -\frac{y}{a} + \frac{1-y}{1-a}$
    • $\frac{da}{dz} = a(1-a)$
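
The cancellation that gives $dz = a - y$ can be written out step by step. This is only a worked expansion of the two factors listed above. It assumes, as those factors imply, the cross-entropy loss $J = -(y\log a + (1-y)\log(1-a))$ and the sigmoid activation $a = \sigma(z)$.

```latex
\begin{aligned}
dz &= \frac{dJ}{da}\frac{da}{dz}
    = \left(-\frac{y}{a} + \frac{1-y}{1-a}\right) a(1-a) \\
   &= -y(1-a) + (1-y)\,a \\
   &= -y + ay + a - ay \\
   &= a - y
\end{aligned}
```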

In this way we can compute the derivative of the total loss with respect to the parameters $w_1, w_2, b$ at a given point, and then use it to update the parameters.

  • $\frac{dJ}{dw_1} = \frac{dJ}{dz}\frac{dz}{dw_1} = dz \cdot x_1$
  • $\frac{dJ}{dw_2} = \frac{dJ}{dz}\frac{dz}{dw_2} = dz \cdot x_2$
  • $\frac{dJ}{db} = dz$
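
The three gradient formulas above can be checked numerically for a single sample. The following is a minimal sketch, not code from the original notes: the function names (`sigmoid`, `loss`, `gradients`, `numeric_gradients`) and the sample values are hypothetical, and the loss is assumed to be the cross-entropy implied by the expression for $\frac{dJ}{da}$. It compares the chain-rule gradients with central finite differences.

```python
import math


def sigmoid(z):
    """Logistic function a = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))


def loss(w1, w2, b, x1, x2, y):
    """Cross-entropy loss for one sample (assumed form of J)."""
    a = sigmoid(w1 * x1 + w2 * x2 + b)
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))


def gradients(w1, w2, b, x1, x2, y):
    """Analytic gradients from the chain rule, using dz = a - y."""
    a = sigmoid(w1 * x1 + w2 * x2 + b)
    dz = a - y
    return dz * x1, dz * x2, dz  # dJ/dw1, dJ/dw2, dJ/db


def numeric_gradients(w1, w2, b, x1, x2, y, eps=1e-6):
    """Central finite differences, used only to check the formulas."""
    dw1 = (loss(w1 + eps, w2, b, x1, x2, y)
           - loss(w1 - eps, w2, b, x1, x2, y)) / (2 * eps)
    dw2 = (loss(w1, w2 + eps, b, x1, x2, y)
           - loss(w1, w2 - eps, b, x1, x2, y)) / (2 * eps)
    db = (loss(w1, w2, b + eps, x1, x2, y)
          - loss(w1, w2, b - eps, x1, x2, y)) / (2 * eps)
    return dw1, dw2, db


# Hypothetical parameters (w1, w2, b) and sample (x1, x2, y).
params = (0.5, -0.3, 0.1)
sample = (1.0, 2.0, 1)

analytic = gradients(*params, *sample)
numeric = numeric_gradients(*params, *sample)
print("analytic:", analytic)
print("numeric: ", numeric)
```

If the two printed tuples agree to several decimal places, the derivation of $dz$ and the three partial derivatives is consistent with the assumed loss.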

The derivative calculations above should now be clear. Once we have computed the derivatives of the loss function at a given point with respect to $w_1, w_2, b$, we can update the parameters for this optimization step.

$w_1 := w_1 - \alpha\frac{dJ(w_1, b)}{dw_1}$

$w_2 := w_2 - \alpha\frac{dJ(w_2, b)}{dw_2}$

$$b := b - \alpha\frac{dJ(w, b)}{db}$$
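
As a minimal sketch of the update rule above, the snippet below applies it to a toy cost $J(w,b) = (w-3)^2 + (b+1)^2$. The toy cost, the learning rate, and the names `toy_gradients` and `gradient_descent` are assumptions made for illustration only; they do not come from the original notes.

```python
def toy_gradients(w, b):
    """Gradients of the hypothetical cost J(w, b) = (w - 3)^2 + (b + 1)^2."""
    dw = 2.0 * (w - 3.0)
    db = 2.0 * (b + 1.0)
    return dw, db


def gradient_descent(w, b, alpha=0.1, num_iterations=200):
    """Repeatedly apply w := w - alpha * dJ/dw and b := b - alpha * dJ/db."""
    for _ in range(num_iterations):
        dw, db = toy_gradients(w, b)
        w = w - alpha * dw
        b = b - alpha * db
    return w, b


w_final, b_final = gradient_descent(w=0.0, b=0.0)
print(w_final, b_final)  # both values approach the minimiser (3, -1)
```

Each step moves the parameters against the gradient, so the distance to the minimiser shrinks by a constant factor per iteration for this particular cost.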

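1.2.4 Vectorized Programming

During training we have m samples, and every gradient update has to take all of them into account: each sample contributes its own output, loss, and gradient. The cost is therefore the average loss over all samples: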

$$J(w,b) = \frac{1}{m}\sum_{i=1}^{m}L(a^{(i)},y^{(i)})$$

To obtain the final parameter gradients $dw_1, dw_2, db$, how should we design the algorithm? The straightforward approach loops over the m samples, accumulates each sample's contribution to the cost and to the gradients, and finally divides the sums by m.
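
The loop-based computation can be sketched as follows for two features. This is an illustrative sketch, not the original pseudocode: it assumes the standard per-sample gradients for sigmoid plus log loss ($dz = a - y$, $dw_j = x_j\,dz$), and the three toy samples and the name `gradients_with_loop` are made up.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradients_with_loop(w1, w2, b, samples):
    """samples is a list of (x1, x2, y) tuples; returns J, dw1, dw2, db."""
    m = len(samples)
    J = dw1 = dw2 = db = 0.0
    for x1, x2, y in samples:
        z = w1 * x1 + w2 * x2 + b
        a = sigmoid(z)
        # Accumulate this sample's loss and gradient contributions
        J += -(y * math.log(a) + (1 - y) * math.log(1 - a))
        dz = a - y
        dw1 += x1 * dz
        dw2 += x2 * dz
        db += dz
    # Average over all m samples
    return J / m, dw1 / m, dw2 / m, db / m


toy_samples = [(0.5, 1.0, 1), (-1.0, 0.3, 0), (2.0, -0.5, 1)]
J, dw1, dw2, db = gradients_with_loop(0.0, 0.0, 0.0, toy_samples)
print(J, dw1, dw2, db)
```

With n features this needs a second, inner loop over the features, which is exactly the cost that vectorization removes.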

1.2.4.1 Advantages of Vectorization

What is vectorization?

When computing, it is best to avoid explicit for loops, because NumPy can carry out the same computation much faster in vectorized form.

In the formula $z = w^Tx+b$, both $w$ and $x$ may contain many values, that is, they are vectors. The code below compares two ways of computing the dot product of two such vectors:

import numpy as np
import time

a = np.random.rand(100000)
b = np.random.rand(100000)

  • Method 1: a for loop

# Method 1: explicit for loop
c = 0
start = time.time()
for i in range(100000):
    c += a[i] * b[i]
end = time.time()

print("Elapsed time: %s ms" % (1000 * (end - start)))

  • Method 2: vectorized with np.dot

# Method 2: vectorized operation
start = time.time()
c = np.dot(a, b)
end = time.time()
print("Elapsed time: %s ms" % (1000 * (end - start)))

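NumPy can take full advantage of parallelism, and it provides many functions that operate on whole arrays:

| Function | Purpose |
| --- | --- |
| np.ones or np.zeros | Matrix of all 1s or all 0s |
| np.exp | Exponential |
| np.log | Logarithm |
| np.abs | Absolute value |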
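
A brief sketch of how these functions behave on arrays; the input values are arbitrary and chosen only for illustration.

```python
import numpy as np

z = np.array([-2.0, 0.0, 2.0])

ones = np.ones((1, 3))        # shape (1, 3), all entries 1
zeros = np.zeros((3, 1))      # shape (3, 1), all entries 0
sig = 1 / (1 + np.exp(-z))    # sigmoid applied to every element at once
logs = np.log(sig)            # element-wise natural logarithm
mags = np.abs(z)              # element-wise absolute value

print(sig)
print(mags)
```

Each call processes the whole array without a Python-level loop.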

So vectorizing the gradient update over the m samples described above means removing the for loop. Originally, the computation is carried out one sample at a time.

1.2.4.2 Vectorized Implementation Pseudocode
  • Idea

$z^1 = w^Tx^1+b$, $z^2 = w^Tx^2+b$, $z^3 = w^Tx^3+b$

$a^1 = \sigma(z^1)$, $a^2 = \sigma(z^2)$, $a^3 = \sigma(z^3)$

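These per-sample computations can be combined into a single one.

Note: w has shape (n, 1) and X has shape (n, m), where n is the number of features and m is the number of samples.

We can let $Z = W^TX + b$, which yields a matrix of size (1, m). Note: the uppercase $W$ and $X$ denote the multi-sample (matrix) form.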
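
The shapes stated above can be checked with a small sketch. The sizes n = 3 and m = 4 and the random values are assumptions for illustration; the comparison loop only confirms that the matrix form reproduces the per-sample results.

```python
import numpy as np

n, m = 3, 4                      # hypothetical sizes: 3 features, 4 samples
rng = np.random.default_rng(0)
w = rng.normal(size=(n, 1))
X = rng.normal(size=(n, m))      # one column per sample
b = 0.5

Z = np.dot(w.T, X) + b           # scalar b is broadcast across all m columns
A = 1 / (1 + np.exp(-Z))         # sigmoid applied element-wise

# The same values computed one sample at a time, for comparison.
Z_loop = np.array([[w[:, 0] @ X[:, i] + b for i in range(m)]])

print(Z.shape, A.shape)
```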

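  • Pseudocode for the vectorized computation over multiple samples

The vectorized computation uses all feature values and target values of the m samples in a single pass. If we want to iterate several times, the same computation is simply repeated over these m samples a number of times.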
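
The vectorized computation described above can be sketched as follows. This is a minimal illustration rather than the original pseudocode: the function name `vectorized_step`, the toy data, the learning rate, and the iteration count are all assumptions.

```python
import numpy as np


def vectorized_step(w, b, X, Y, alpha):
    """One gradient descent update using all m samples at once."""
    m = X.shape[1]
    Z = np.dot(w.T, X) + b          # (1, m)
    A = 1 / (1 + np.exp(-Z))        # (1, m)
    dZ = A - Y                      # (1, m)
    dw = np.dot(X, dZ.T) / m        # (n, 1)
    db = np.sum(dZ) / m             # scalar
    return w - alpha * dw, b - alpha * db


# Hypothetical toy data: 2 features, 4 samples, label is 1 when x1 > 0.
X = np.array([[1.0, 2.0, -1.0, -2.0],
              [0.5, -0.5, 0.5, -0.5]])
Y = np.array([[1, 1, 0, 0]])
w, b = np.zeros((2, 1)), 0.0

for _ in range(500):                # repeat the same update several times
    w, b = vectorized_step(w, b, X, Y, alpha=0.1)

A = 1 / (1 + np.exp(-(np.dot(w.T, X) + b)))
print(np.round(A, 2))
```

The only remaining loop is the one over iterations; nothing loops over samples or features.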

1.2.5 Case Study: Implementing Logistic Regression

1.2.5.1 Data: Creating a Binary Classification Dataset
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
import numpy as np

# 500 samples with 5 features each, split 70/30 into training and test sets
X, Y = make_classification(n_samples=500, n_features=5, n_classes=2)
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.3)
1.2.5.2 Step Design

Build the different modules of the algorithm separately.

  • 1. Initialize the parameters
def initialize_with_zeros(shape):
    """
    Create a parameter w of shape (shape, 1) filled with zeros, and b = 0.
    return: w, b
    """

    w = np.zeros((shape, 1))
    b = 0

    return w, b
  • Compute the cost function and its gradients

    • w (n, 1).T * X (n, m)
    • Y: (1, m)
def propagate(w, b, X, Y):
    """
    Parameters: w, b, X, Y: network parameters and data
    Return:
    the cost, the gradient dw of parameter w, the gradient db of parameter b
    """
    m = X.shape[1]

    # w (n, 1), X (n, m)
    A = basic_sigmoid(np.dot(w.T, X) + b)
    # Compute the cost (average cross-entropy loss)
    cost = -1 / m * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    dz = A - Y
    dw = 1 / m * np.dot(X, dz.T)
    db = 1 / m * np.sum(dz)

    cost = np.squeeze(cost)

    grads = {"dw": dw,
             "db": db}

    return grads, cost
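
One way to gain confidence in gradient formulas like these is a finite-difference check. The sketch below is an optional sanity check under stated assumptions, not part of the original notes: `cost_and_grads` re-implements the same formulas so the block is self-contained, and `numerical_dw`, the random data, and the step size `eps` are hypothetical choices.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def cost_and_grads(w, b, X, Y):
    """Same formulas as propagate: cross-entropy cost and its gradients."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    dZ = A - Y
    return cost, np.dot(X, dZ.T) / m, np.sum(dZ) / m


def numerical_dw(w, b, X, Y, eps=1e-6):
    """Central-difference estimate of dJ/dw, one component at a time."""
    approx = np.zeros_like(w)
    for j in range(w.shape[0]):
        step = np.zeros_like(w)
        step[j, 0] = eps
        cost_plus, _, _ = cost_and_grads(w + step, b, X, Y)
        cost_minus, _, _ = cost_and_grads(w - step, b, X, Y)
        approx[j, 0] = (cost_plus - cost_minus) / (2 * eps)
    return approx


rng = np.random.default_rng(0)
X = rng.normal(size=(3, 10))             # hypothetical data: 3 features, 10 samples
Y = rng.integers(0, 2, size=(1, 10))
w = rng.normal(size=(3, 1)) * 0.1
b = 0.2

_, dw, _ = cost_and_grads(w, b, X, Y)
print(np.max(np.abs(dw - numerical_dw(w, b, X, Y))))  # should be very small
```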

This requires a basic helper function, sigmoid:

def basic_sigmoid(x):
    """Compute the sigmoid function."""

    s = 1 / (1 + np.exp(-x))

    return s
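
For very negative inputs, `np.exp(-x)` overflows and NumPy emits a runtime warning, even though the final value still rounds to 0. A possible variant that avoids this is sketched below; the name `stable_sigmoid` is hypothetical and this is not the function used in the rest of the case study.

```python
import numpy as np


def stable_sigmoid(x):
    """Sigmoid that never calls np.exp on a large positive argument."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.empty_like(x)
    pos = x >= 0
    out[pos] = 1 / (1 + np.exp(-x[pos]))      # exp of a non-positive number
    exp_x = np.exp(x[~pos])                   # exp of a negative number
    out[~pos] = exp_x / (1 + exp_x)
    return out


print(stable_sigmoid(np.array([-1000.0, 0.0, 1000.0])))
```
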
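  • Use an optimization algorithm (gradient descent)

    • Implement the optimization function. The parameters w and b are updated to minimize the loss J. Apply the gradient descent update formula to the parameters, with a specified learning rate (step size) and number of iterations.
    • Loop:

      • Compute the current loss
      • Compute the current gradients
      • Update the parameters (gradient descent)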
def optimize(w, b, X, Y, num_iterations, learning_rate):
    """
    Parameters: w: weights, b: bias, X: features, Y: target values,
    num_iterations: total number of iterations, learning_rate: learning rate
    Returns:
    params: dictionary of updated parameters
    grads: gradients
    costs: recorded losses
    """

    costs = []

    for i in range(num_iterations):

        # Compute the gradients and the cost
        grads, cost = propagate(w, b, X, Y)

        # Extract the gradients of the two parameters
        dw = grads['dw']
        db = grads['db']

        # Apply the gradient descent update rule
        w = w - learning_rate * dw
        b = b - learning_rate * db

        # Record and print the loss every 100 iterations
        if i % 100 == 0:
            costs.append(cost)
            print("Loss at iteration %i: %f" % (i, cost))
            print(b)

    params = {"w": w,
              "b": b}

    grads = {"dw": dw,
             "db": db}

    return params, grads, costs
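  • Prediction function (no need to implement)

Use the learned parameters to make predictions on the test data and compute the accuracy.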

def predict(w, b, X):
    '''
    Predict using the trained parameters.
    return: the predictions
    '''

    m = X.shape[1]
    y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)

    # Compute the predicted probabilities
    A = basic_sigmoid(np.dot(w.T, X) + b)

    # Threshold each probability at 0.5
    for i in range(A.shape[1]):

        if A[0, i] <= 0.5:
            y_prediction[0, i] = 0
        else:
            y_prediction[0, i] = 1

    return y_prediction
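
In the spirit of the vectorization section, the thresholding loop can also be written without a for loop. The following is a hypothetical alternative, not the function used in the notes; the name `predict_vectorized` and the toy inputs are assumptions.

```python
import numpy as np


def predict_vectorized(w, b, X):
    """Hypothetical loop-free variant: threshold all probabilities at once."""
    w = w.reshape(X.shape[0], 1)
    A = 1 / (1 + np.exp(-(np.dot(w.T, X) + b)))
    return (A > 0.5).astype(float)       # 1.0 where A > 0.5, else 0.0


w = np.array([[1.0], [-1.0]])
X = np.array([[2.0, 0.0, -1.0],
              [0.0, 0.0, 1.0]])
print(predict_vectorized(w, 0.0, X))
```

A probability of exactly 0.5 maps to class 0 here, matching the `<= 0.5` rule in the loop version.
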
  • Overall logic

    • Model training
def model(x_train, y_train, x_test, y_test, num_iterations=2000, learning_rate=0.0001):
    """Train logistic regression and print the training and test accuracy."""

    # Change the data layout to (n_features, m) by transposing;
    # reshape(-1, m) would give the same shape but scramble the values
    x_train = x_train.T
    x_test = x_test.T
    y_train = y_train.reshape(1, y_train.shape[0])
    y_test = y_test.reshape(1, y_test.shape[0])
    print(x_train.shape)
    print(x_test.shape)
    print(y_train.shape)
    print(y_test.shape)

    # 1. Initialize the parameters
    w, b = initialize_with_zeros(x_train.shape[0])

    # 2. Gradient descent
    # params: updated network parameters
    # grads: gradients from the last iteration
    # costs: list of losses recorded every 100 iterations
    params, grads, costs = optimize(w, b, x_train, y_train, num_iterations, learning_rate)

    # Retrieve the trained parameters
    # and compute the predictions
    w = params['w']
    b = params['b']
    y_prediction_train = predict(w, b, x_train)
    y_prediction_test = predict(w, b, x_test)

    # Print the accuracy
    print("Training accuracy: {}".format(100 - np.mean(np.abs(y_prediction_train - y_train)) * 100))
    print("Test accuracy: {}".format(100 - np.mean(np.abs(y_prediction_test - y_test)) * 100))

    return None
  • Training
model(x_train, y_train, x_test, y_test, num_iterations=2000, learning_rate=0.0001)
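
When moving between the (samples, features) layout returned by scikit-learn and the (features, samples) layout expected by the functions above, it helps to remember that `reshape` and transpose produce the same shape but not the same contents. The small array below is made up purely to show the difference.

```python
import numpy as np

# Hypothetical data: 3 samples (rows) with 2 features (columns).
X = np.array([[1, 10],
              [2, 20],
              [3, 30]])

transposed = X.T                          # column i is sample i
reshaped = X.reshape(-1, X.shape[0])      # same shape, different element order

print(transposed)
print(reshaped)
```

Only the transposed array keeps each sample's features together in one column.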

To be continued. Please look forward to the next installment.
