Image Reconstruction with a Deep Autoencoder in PyTorch

Author | DR. VAIBHAV KUMAR
Translator | VK
Source | Analytics In Diamag

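Artificial neural networks come in many popular variants that can be applied to both supervised and unsupervised learning problems. The autoencoder is one such variant, used mainly for unsupervised learning.

When an autoencoder has multiple hidden layers in its architecture, it is called a deep autoencoder. These models can be applied to a variety of tasks, including image reconstruction.

In image reconstruction, the model learns a representation of the input image patterns and reconstructs new images that match the original inputs. Image reconstruction has many important applications, especially in the medical field, where a decoded, noise-free image has to be recovered from an existing incomplete or noisy one.

In this article, we demonstrate how to implement a deep autoencoder for image reconstruction in PyTorch. The model is trained on the MNIST handwritten digits and reconstructs the digit images after learning a representation of the inputs.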

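Autoencoder

An autoencoder is a variant of the artificial neural network that is typically used to learn efficient data encodings in an unsupervised manner.

Autoencoders are usually trained in a representation-learning scheme, in which they learn an encoding for a set of data. By learning a representation of the input data, the network reconstructs that input as closely as it can. The basic structure of an autoencoder is described below.

The architecture generally consists of an input layer, an output layer, and one or more hidden layers connecting the two. The output layer has the same number of nodes as the input layer, because its job is to reconstruct the input.

In its general form there is only one hidden layer, but a deep autoencoder has several. The added depth can reduce the computational cost of representing some functions and can also reduce the amount of training data needed to learn some functions. Application areas include anomaly detection, image processing, information retrieval, and drug discovery.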
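To make the general single-hidden-layer form concrete, here is a minimal sketch. The class name `TinyAutoencoder` and the hidden size of 32 are assumptions chosen for illustration; they are not part of the original article.

```python
import torch
import torch.nn as nn

class TinyAutoencoder(nn.Module):
    """Hypothetical single-hidden-layer autoencoder, for illustration only."""

    def __init__(self, n_inputs=784, n_hidden=32):
        super().__init__()
        self.encoder = nn.Linear(n_inputs, n_hidden)   # input layer -> hidden code
        self.decoder = nn.Linear(n_hidden, n_inputs)   # hidden code -> output layer

    def forward(self, x):
        code = torch.relu(self.encoder(x))
        return self.decoder(code)

tiny = TinyAutoencoder()
batch = torch.rand(4, 784)          # four fake flattened 28x28 images
reconstruction = tiny(batch)
print(reconstruction.shape)         # torch.Size([4, 784])
```

The output has the same width as the input, which is the property the paragraph above describes. The deep variant built later in the article stacks several such layers on each side of the code.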

Implementing the Deep Autoencoder in PyTorch

First, we import all the required libraries.

import os
import torch
import torchvision
import torch.nn as nn
import torchvision.transforms as transforms
import torch.optim as optim
import matplotlib.pyplot as plt
import torch.nn.functional as F
from torchvision import datasets
from torch.utils.data import DataLoader
from torchvision.utils import save_image
from PIL import Image

Next, we define the values of the hyperparameters.

Epochs = 100
Lr_Rate = 1e-3
Batch_Size = 128

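The following transform performs the image conversions the PyTorch model needs.

transform = transforms.Compose([
    transforms.ToTensor(),                 # PIL image -> float tensor in [0, 1]
    transforms.Normalize((0.5,), (0.5,))   # (x - 0.5) / 0.5
])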
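As a small worked example of what this normalization does, the sketch below applies the same arithmetic, (x - mean) / std with mean = std = 0.5, to a few hand-picked pixel values. The sample values are assumptions for illustration; only plain `torch` operations are used.

```python
import torch

# Pixel intensities as ToTensor() would produce them, in the range [0, 1]
pixels = torch.tensor([0.0, 0.25, 0.5, 1.0])
mean, std = 0.5, 0.5

# Same arithmetic as Normalize((0.5,), (0.5,)) for a single channel
normalized = (pixels - mean) / std      # values become -1.0, -0.5, 0.0, 1.0

# Inverting the step recovers the original [0, 1] intensities
restored = normalized * std + mean
```

So black pixels map to -1 and white pixels to 1, and the step is easy to undo when images need to be viewed in their original range.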

With the snippet below, we download the MNIST handwritten digit dataset and prepare it for further processing.

train_set = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_set = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
# DataLoader's keyword argument is lowercase: batch_size
train_loader = DataLoader(train_set, batch_size=Batch_Size, shuffle=True)
test_loader = DataLoader(test_set, batch_size=Batch_Size, shuffle=True)
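For orientation, the number of batches each loader yields per epoch follows from the dataset size and the batch size. The sketch below assumes the standard MNIST split of 60,000 training and 10,000 test images and the default `drop_last=False` behaviour, where the final partial batch is kept.

```python
import math

BATCH_SIZE = 128
N_TRAIN, N_TEST = 60_000, 10_000   # standard MNIST split (assumed here)

# With drop_last=False, the last, smaller batch still counts
train_batches = math.ceil(N_TRAIN / BATCH_SIZE)
test_batches = math.ceil(N_TEST / BATCH_SIZE)

print(train_batches, test_batches)   # 469 79
```

This is the value `len(train_loader)` would return, which the training loop later uses to average the per-batch loss.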

Let us look at some information about the training data and its classes.

print(train_set)

print(train_set.classes)

In the next step, we define the Autoencoder class that describes the model.

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        # Encoder
        self.enc1 = nn.Linear(in_features=784, out_features=256)  # Input image (28*28 = 784)
        self.enc2 = nn.Linear(in_features=256, out_features=128)
        self.enc3 = nn.Linear(in_features=128, out_features=64)
        self.enc4 = nn.Linear(in_features=64, out_features=32)
        self.enc5 = nn.Linear(in_features=32, out_features=16)
        # Decoder
        self.dec1 = nn.Linear(in_features=16, out_features=32)
        self.dec2 = nn.Linear(in_features=32, out_features=64)
        self.dec3 = nn.Linear(in_features=64, out_features=128)
        self.dec4 = nn.Linear(in_features=128, out_features=256)
        self.dec5 = nn.Linear(in_features=256, out_features=784)  # Output image (28*28 = 784)

    def forward(self, x):
        x = F.relu(self.enc1(x))
        x = F.relu(self.enc2(x))
        x = F.relu(self.enc3(x))
        x = F.relu(self.enc4(x))
        x = F.relu(self.enc5(x))
        x = F.relu(self.dec1(x))
        x = F.relu(self.dec2(x))
        x = F.relu(self.dec3(x))
        x = F.relu(self.dec4(x))
        # Note: ReLU keeps outputs >= 0, while the normalized inputs lie in [-1, 1]
        x = F.relu(self.dec5(x))
        return x
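To get a feel for the size of this network, the trainable parameters can be counted directly from the layer widths used in the class above. The helper name `count_linear_params` is an assumption for this sketch; each fully connected layer contributes `n_in * n_out` weights plus `n_out` biases.

```python
# Layer widths read off the Autoencoder class: encoder down to 16, decoder back to 784
layer_sizes = [784, 256, 128, 64, 32, 16, 32, 64, 128, 256, 784]

def count_linear_params(sizes):
    """Sum weights (n_in * n_out) and biases (n_out) over consecutive layer pairs."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

total_params = count_linear_params(layer_sizes)
print(total_params)   # 490208
```

Most of these parameters sit in the first encoder layer and the last decoder layer, since those are the only ones that touch the 784-dimensional image.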

Now we create the model as an instance of the Autoencoder class defined above.

model = Autoencoder()
print(model)

Now we define the loss function and the optimizer.

criterion = nn.MSELoss()
# Optimize the parameters of the model created above
optimizer = optim.Adam(model.parameters(), lr=Lr_Rate)
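As a quick check of what the mean squared error measures for a reconstruction, the sketch below compares `nn.MSELoss` with the same quantity computed by hand. The two four-pixel tensors are made-up values for illustration.

```python
import torch
import torch.nn as nn

target = torch.tensor([[0.0, 1.0, 0.5, 0.0]])   # made-up "original" pixels
output = torch.tensor([[0.0, 0.5, 0.5, 1.0]])   # made-up "reconstruction"

# Default reduction='mean': average of the squared differences over all elements
loss = nn.MSELoss()(output, target)

# Squared errors are 0, 0.25, 0, 1, so the mean is 1.25 / 4 = 0.3125
manual = ((output - target) ** 2).mean()
```

In the training loop of this article, the target passed to the criterion is the input image itself, so the loss measures how far each reconstruction is from its source.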

The following function selects the CUDA device when one is available and falls back to the CPU otherwise.

def get_device():
    if torch.cuda.is_available():
        device = 'cuda:0'
    else:
        device = 'cpu'
    return device

The function below creates a directory in which to save the results.

def make_dir():
    image_dir = 'MNIST_Out_Images'
    if not os.path.exists(image_dir):
        os.makedirs(image_dir)

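With the function below, we save the reconstructed images produced by the model.

def save_decod_img(img, epoch):
    # Reshape flat 784-vectors back to (batch, channel, height, width)
    img = img.view(img.size(0), 1, 28, 28)
    save_image(img, './MNIST_Out_Images/Autoencoder_image{}.png'.format(epoch))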
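The reshaping used here is the inverse of the flattening applied before the images enter the network. A minimal sketch of that round trip, using a dummy tensor in place of real MNIST images:

```python
import torch

# Dummy batch of two single-channel 28x28 "images" (stand-in for MNIST data)
images = torch.arange(2 * 1 * 28 * 28, dtype=torch.float32).view(2, 1, 28, 28)

# Flatten each image to a 784-vector, as done before the first linear layer
flat = images.view(images.size(0), -1)
print(flat.shape)        # torch.Size([2, 784])

# Restore the image layout, as save_decod_img does before saving
restored = flat.view(flat.size(0), 1, 28, 28)
```

Because `view` only reinterprets the shape, the round trip leaves every pixel value in place.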

The function below is called to train the model.

def training(model, train_loader, Epochs):
    train_loss = []
    for epoch in range(Epochs):
        running_loss = 0.0
        for data in train_loader:
            img, _ = data                       # labels are not needed
            img = img.to(device)
            img = img.view(img.size(0), -1)     # flatten to (batch, 784)
            optimizer.zero_grad()
            outputs = model(img)
            loss = criterion(outputs, img)      # the input is its own target
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        loss = running_loss / len(train_loader)  # mean loss per batch
        train_loss.append(loss)
        print('Epoch {} of {}, Train Loss: {:.3f}'.format(
            epoch+1, Epochs, loss))
        if epoch % 5 == 0:
            # Save the reconstructions of the last batch every 5 epochs
            save_decod_img(outputs.cpu().data, epoch)
    return train_loss

The following function tests image reconstruction with the trained model.

def test_image_reconstruct(model, test_loader):
    for batch in test_loader:
        img, _ = batch
        img = img.to(device)
        img = img.view(img.size(0), -1)
        outputs = model(img)
        outputs = outputs.view(outputs.size(0), 1, 28, 28).cpu().data
        save_image(outputs, 'MNIST_reconstruction.png')
        break   # reconstruct only the first batch
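A common variation at test time, not used in the original function, is to switch the model to evaluation mode and disable gradient tracking with `torch.no_grad()`. The sketch below assumes a hypothetical helper `reconstruct_without_grad` and uses a single linear layer as a stand-in for the trained Autoencoder.

```python
import torch
import torch.nn as nn

def reconstruct_without_grad(model, images):
    """Hypothetical helper: reconstruct a batch without building an autograd graph."""
    model.eval()                      # evaluation mode (matters for dropout/batch norm)
    with torch.no_grad():             # no gradients are recorded inside this block
        flat = images.view(images.size(0), -1)
        return model(flat).view(-1, 1, 28, 28)

stand_in_model = nn.Linear(784, 784)      # stand-in for the trained Autoencoder
fake_batch = torch.rand(8, 1, 28, 28)     # random tensor in place of MNIST images
recon = reconstruct_without_grad(stand_in_model, fake_batch)
print(recon.shape)                        # torch.Size([8, 1, 28, 28])
```

The result carries no gradient history, so it can be passed to `save_image` directly; whether this is worth adopting depends on how much memory the test batch needs.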

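Before training, the model is moved to the selected device (CUDA when available), and the function defined above creates the directory for the result images.

device = get_device()
model.to(device)
make_dir()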

Now the model is trained.

train_loss = training(model, train_loader, Epochs)

Once training has finished, we visualize the loss over the course of training.

plt.figure()
plt.plot(train_loss)
plt.title('Train Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.savefig('deep_ae_mnist_loss.png')

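We then view some of the images saved during training. The /content/ prefix in the paths below matches a Google Colab working directory; adjust it for other environments.

Image.open('/content/MNIST_Out_Images/Autoencoder_image0.png')

Image.open('/content/MNIST_Out_Images/Autoencoder_image50.png')

Image.open('/content/MNIST_Out_Images/Autoencoder_image95.png')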

In the last step, we test the autoencoder model on image reconstruction.

# Use the test_loader defined earlier
test_image_reconstruct(model, test_loader)
Image.open('/content/MNIST_reconstruction.png')

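As we can see, the autoencoder starts reconstructing images from the very beginning of training. After the first epoch the reconstruction quality is poor, and it is only by around epoch 50 that it has clearly improved.

After the full training run, the image saved after epoch 95 and the test reconstruction show that the model can produce images that closely match the original inputs.

Judging from the loss values, the number of epochs can be set to 100 or 200.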
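One way to turn "judging from the loss values" into something mechanical is to look for the epoch at which the loss stops improving. The sketch below is a simple plateau check under assumed thresholds; the function name `epochs_until_plateau`, the `min_delta` and `patience` values, and the loss list are all hypothetical and not taken from the article's run.

```python
def epochs_until_plateau(losses, min_delta=0.01, patience=3):
    """Return the 1-based epoch at which the loss has failed to improve by more
    than min_delta for `patience` consecutive epochs, or len(losses) if it never does."""
    best = float('inf')
    stale = 0
    for epoch, loss in enumerate(losses, start=1):
        if best - loss > min_delta:
            best = loss      # meaningful improvement: reset the counter
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(losses)

# Made-up loss curve: improves for four epochs, then stays flat
example_losses = [0.90, 0.50, 0.30, 0.25, 0.25, 0.25, 0.25, 0.20]
plateau_epoch = epochs_until_plateau(example_losses)
print(plateau_epoch)   # 7
```

Applied to the `train_loss` list returned by `training`, a check like this gives a rough indication of whether additional epochs are still paying off.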

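With longer training, sharper reconstructions can be expected. Even so, this demonstration shows how a deep autoencoder for image reconstruction can be implemented in PyTorch.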


Original article: https://analyticsindiamag.com...
