Fitting sin x with an MLP and an RNN in PyTorch

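In theory, a network with a nonlinear activation function and enough hidden units can approximate any continuous function on a bounded interval, so both an MLP and an RNN should be able to fit sin x.
I'll show the results first, then the code, and finally my notes on the process. If you're short on time, just read the first half; the process notes are for those who are interested.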

First, the result plot

Note: torch initializes the weights differently on every run, so the results vary from run to run.
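If you want repeatable runs, one option is to fix the random seed before building the networks. The following is a minimal sketch: the helper `make_layer` and the seed values are hypothetical, and the single linear layer only stands in for the real networks.

```python
import torch


def make_layer(seed):
    # Seeding right before construction makes the random initial weights repeatable.
    torch.manual_seed(seed)
    return torch.nn.Linear(1, 16)


a = make_layer(0)
b = make_layer(0)
c = make_layer(1)

same_seed_equal = torch.equal(a.weight, b.weight)  # same seed, same weights
diff_seed_equal = torch.equal(a.weight, c.weight)  # different seed, different weights
print(same_seed_equal, diff_seed_equal)
```

Calling `torch.manual_seed` once at the top of the script, before `MLP()` and `RecNN()` are created, should have the same effect for the code in this post.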

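Code

At first glance it looks like a lot, but it is really dead simple: two network definitions, two training loops, and some repetitive plotting code. The code is commented in detail.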

import torch
import math
import matplotlib.pyplot as plt


class MLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = torch.nn.Linear(1, 16)
        self.layer2 = torch.nn.Linear(16, 16)
        self.layer3 = torch.nn.Linear(16, 1)

    def forward(self, x):
        x = self.layer1(x)
        x = torch.nn.functional.relu(x)
        x = self.layer2(x)
        x = torch.nn.functional.relu(x)
        x = self.layer3(x)
        return x


# rnn takes 3d input while mlp only takes 2d input
class RecNN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = torch.nn.LSTM(input_size=1, hidden_size=2, num_layers=1, batch_first=True)
        # The linear layer takes 2 features (the LSTM hidden size) and maps them
        # to 1 so that the network output matches the shape of the labels.
        self.linear = torch.nn.Linear(2, 1)

    def forward(self, x):
        # print("x shape: {}".format(x.shape))
        # x [batch_size, seq_len, input_size]
        output, hn = self.rnn(x)
        # print("output shape: {}".format(output.shape))
        # output [batch_size, seq_len, hidden_size] because batch_first=True
        x = output.reshape(-1, 2)
        # print("after change shape: {}".format(x.shape))
        x = self.linear(x)
        # print("after linear shape: {}".format(x.shape))
        return x
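

def PlotCurve(mlp, rnn, input_x, x):
    # input_x is the x fed into the networks, shape [N, 1].
    # x holds the same values as a 1-D tensor, shape [N], used as the plot axis.
    # Their contents (not their shapes) are identical; print the shapes to check.
    # EPOCH, mlp_loss and rnn_loss are module-level names set by the training code.
    mlp_eval = mlp.eval()
    rnn_eval = rnn.eval()
    # Inference only: disable gradient tracking so the outputs can be plotted directly.
    with torch.no_grad():
        mlp_y = mlp_eval(input_x)
        rnn_y = rnn_eval(input_x.unsqueeze(0))

    plt.figure(figsize=(6, 8))

    # Top panel: training loss per epoch.
    plt.subplot(211)
    plt.plot([i + 1 for i in range(EPOCH)], mlp_loss, label='MLP')
    plt.plot([i + 1 for i in range(EPOCH)], rnn_loss, label='RNN')
    plt.title('loss')
    plt.legend()

    # Bottom panel: the true curve against both fitted curves.
    plt.subplot(212)
    plt.plot(x, torch.sin(x), label="original", linewidth=3)
    plt.plot(x, mlp_y.squeeze(1), label='MLP')
    plt.plot(x, rnn_y.squeeze(1), label='RNN')
    plt.title('evaluation')
    plt.legend()

    plt.tight_layout()
    plt.show()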


# Constants are pulled out here so they are easy to change.
EPOCH = 1000
RNN_LR = 0.01
MLP_LR = 0.001
left, right = -10, 10
PI = math.pi


if __name__ == '__main__':
    mlp = MLP()
    rnn = RecNN()

    # x and y are plain torch tensors holding the sample points and sin x.
    x = torch.tensor([num * PI / 4 for num in range(left, right)])
    y = torch.sin(x)

    # input_x and labels are the inputs and labels used to train the networks.
    input_x = x.reshape(-1, 1)
    labels = y.reshape(-1, 1)

    # Train the MLP.
    mlp_optimizer = torch.optim.Adam(mlp.parameters(), lr=MLP_LR)
    mlp_loss = []
    for epoch in range(EPOCH):
        preds = mlp(input_x)
        loss = torch.nn.functional.mse_loss(preds, labels)
        mlp_optimizer.zero_grad()
        loss.backward()
        mlp_optimizer.step()
        mlp_loss.append(loss.item())

    # Train the RNN. The whole series is fed as one sequence: [1, seq_len, 1].
    rnn_optimizer = torch.optim.Adam(rnn.parameters(), lr=RNN_LR)
    rnn_loss = []
    for epoch in range(EPOCH):
        preds = rnn(input_x.unsqueeze(0))
        # print(x.unsqueeze(0).shape)
        # print(preds.shape)
        # print(labels.shape)
        loss = torch.nn.functional.mse_loss(preds, labels)
        rnn_optimizer.zero_grad()
        loss.backward()
        rnn_optimizer.step()
        rnn_loss.append(loss.item())

    PlotCurve(mlp, rnn, input_x, x)
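The shape handling in RecNN is the part that is easiest to get wrong, so here is a small standalone check of what the LSTM receives and returns with `batch_first=True`. It is only a sketch: the all-zeros tensor stands in for the real `input_x`.

```python
import torch

lstm = torch.nn.LSTM(input_size=1, hidden_size=2, num_layers=1, batch_first=True)

x = torch.zeros(20, 1)   # stand-in for input_x: [20 points, 1 feature]
seq = x.unsqueeze(0)     # [batch_size=1, seq_len=20, input_size=1]

output, (h_n, c_n) = lstm(seq)
print(seq.shape)      # torch.Size([1, 20, 1])
print(output.shape)   # torch.Size([1, 20, 2])
print(h_n.shape)      # torch.Size([1, 1, 2])

# Flatten to one row per time step so Linear(2, 1) yields one value per point.
flat = output.reshape(-1, 2)
print(flat.shape)     # torch.Size([20, 2])
```

Because the batch size is 1, flattening with `reshape(-1, 2)` leaves the 20 rows in time order, which is why the predictions line up with `labels` of shape [20, 1].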


Some things to note (the process)


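Some people add a DataLoader to serve the dataset. I personally don't think that is necessary for something this simple, although using a loader is arguably more conventional.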
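For readers who do prefer the loader style, here is a rough sketch of what it could look like for the MLP. The batch size of 5, the small `Sequential` model, and the 3-epoch loop are arbitrary choices for illustration, not settings from this post.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

# Same 20 sample points as in the post.
x = torch.tensor([num * math.pi / 4 for num in range(-10, 10)])
input_x = x.reshape(-1, 1)
labels = torch.sin(x).reshape(-1, 1)

dataset = TensorDataset(input_x, labels)
loader = DataLoader(dataset, batch_size=5, shuffle=True)  # batch size is an assumption

# A small stand-in model; the MLP class from the post would work the same way.
model = torch.nn.Sequential(
    torch.nn.Linear(1, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1)
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in range(3):  # only a few epochs, just to show the loop structure
    for batch_x, batch_y in loader:
        loss = torch.nn.functional.mse_loss(model(batch_x), batch_y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

num_batches = len(loader)
print(num_batches)  # 4
```

Shuffled mini-batches are fine for the MLP, which treats each point independently. For the RNN the order of the points matters, so the whole series would still be passed as one sequence.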

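Why only 20 data points (from left to right there are just 20 samples of sin x)? I originally sampled from around -128 to around 128, but training went so badly that I started doubting everything. Even with only 20 points it takes 1000 epochs of training, so you can imagine the time cost of a larger dataset.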
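One way to see why the wider range is so much harder is to count how many periods of sin x the samples span. The sketch below assumes the wider experiment used the same π/4 spacing with left, right = -128, 128; the helper `periods_covered` is hypothetical.

```python
STEPS_PER_PERIOD = 8  # one period is 2*pi, and the samples are pi/4 apart


def periods_covered(left, right):
    # range(left, right) yields right - left points, each pi/4 apart,
    # so the samples span roughly (right - left) / 8 periods.
    return (right - left) / STEPS_PER_PERIOD


print(periods_covered(-10, 10))    # 2.5
print(periods_covered(-128, 128))  # 32.0
```

Under that assumption the network has to reproduce about 32 oscillations instead of 2.5, with inputs reaching roughly ±100, which is a much harder target for such small networks.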

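Why is the RNN's lr 0.01 while the MLP's is 0.001? I tuned this from the loss curves: 0.001 did not suit my RNN because training was too slow, and since I wanted to keep the same EPOCH as the MLP, I switched to a learning rate of 0.01. But why does the RNN's loss fall more slowly than the MLP's? That needs further investigation (most likely I just don't know enough yet).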

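As for the loss function, why MSE loss? I picked it arbitrarily. I also tried l1_loss and a few other losses and the results were about the same. For such a simple function fit, the choice of loss function hardly matters.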
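Swapping the loss is a one-line change, since the functional losses share the same call signature. The sketch below compares three of them on made-up predictions; `smooth_l1_loss` is my own pick for "another loss", not necessarily one that was tried in this post.

```python
import torch
import torch.nn.functional as F

# Made-up values, chosen so the errors are 0, -0.5 and 2.
preds = torch.tensor([[0.0], [0.5], [1.0]])
labels = torch.tensor([[0.0], [1.0], [-1.0]])

mse = F.mse_loss(preds, labels)           # mean of squared errors
l1 = F.l1_loss(preds, labels)             # mean of absolute errors
smooth = F.smooth_l1_loss(preds, labels)  # quadratic for small errors, linear for large ones

print(mse.item(), l1.item(), smooth.item())
```

MSE penalizes the large error (2) far more heavily than L1 does, but on a smooth, noise-free target like sin x the errors shrink together, which is consistent with the losses behaving similarly here.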

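The literature suggests that RNN-family networks fit time-series data better than MLPs, so why did the RNN's loss fall more slowly than the MLP's this time? Moreover, when I compare the MLP and RNN fits over many runs, the MLP turns out to be more stable and more accurate. Why is that? This also needs further investigation.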
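One observation that may be relevant, though it is not a full explanation: the two networks in this post differ a lot in size. The sketch below counts their trainable parameters; the helper `count_params` is hypothetical, and the `Sequential` model simply mirrors the layer sizes of the MLP class.

```python
import torch


def count_params(module):
    # Total number of trainable scalars in a module.
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


# Same layer sizes as the MLP class: 1 -> 16 -> 16 -> 1.
mlp = torch.nn.Sequential(
    torch.nn.Linear(1, 16), torch.nn.ReLU(),
    torch.nn.Linear(16, 16), torch.nn.ReLU(),
    torch.nn.Linear(16, 1),
)

# Same pieces as the RecNN class: an LSTM with hidden_size=2 plus Linear(2, 1).
lstm = torch.nn.LSTM(input_size=1, hidden_size=2, num_layers=1, batch_first=True)
head = torch.nn.Linear(2, 1)

mlp_params = count_params(mlp)
rnn_params = count_params(lstm) + count_params(head)
print(mlp_params)  # 321
print(rnn_params)  # 43
```

With a hidden size of 2 the recurrent model has far less capacity than the MLP, so a fairer comparison might use a larger `hidden_size`. Whether that closes the gap would need to be checked experimentally.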
