GoogLeNet is a deep convolutional neural network architecture proposed by a Google team in the 2014 paper "Going Deeper with Convolutions". It achieved excellent results in the ILSVRC competition and can be used for image classification and other computer vision tasks. GoogLeNet is notable because it maintains high accuracy while reducing the number of parameters, and because it introduced a novel building block called the "Inception module". An Inception module applies several convolution kernels of different sizes, together with a pooling operation, in parallel within the same layer to capture image features at multiple scales. This parallel feature extraction helps the network capture both local and global structure in an image, improving classification performance. Another distinctive trait of GoogLeNet is its small parameter count: compared with traditional deep CNNs, it uses a global average pooling layer (instead of large fully connected layers) to significantly reduce parameters, which mitigates overfitting and improves generalization. GoogLeNet's contribution to deep learning is the Inception module itself: it demonstrated an effective way to cut parameters while preserving model performance, and it provided valuable inspiration for later network designs. Since GoogLeNet, of course, deep network architectures have seen many further developments and improvements, such as ResNet and the Transformer. GoogLeNet stands out in the following respects:
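To make the parameter-savings claim concrete, here is a back-of-the-envelope sketch in plain Python. It uses illustrative channel counts matching one branch of the first Inception module in the code below (192 input channels, a 96-channel 1x1 reduction, 128 output channels) and compares a direct 3x3 convolution against the 1x1-reduce-then-3x3 pattern:

```python
# Weight counts (bias terms ignored) for producing 128 output channels
# from a 192-channel feature map, with and without a 1x1 reduction.
# Channel sizes are taken from the inception3a branch defined below.

def conv_weights(c_in, c_out, k):
    """Number of weights in a single k x k convolution layer."""
    return c_in * c_out * k * k

direct = conv_weights(192, 128, 3)                             # plain 3x3 conv
reduced = conv_weights(192, 96, 1) + conv_weights(96, 128, 3)  # 1x1 then 3x3

print(direct)                              # 221184
print(reduced)                             # 129024
print(f"savings: {direct / reduced:.2f}x") # savings: 1.71x
```

The 1x1 reduction roughly halves the weight count of this branch while keeping the same output shape, which is exactly the trade-off the Inception design exploits.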
- Parameter efficiency: the Inception module applies several convolution kernel sizes and a pooling operation within the same layer, capturing multi-scale image features more effectively. This design reduces the number of parameters in the network, making the model lighter while still maintaining high accuracy.
- Mitigating vanishing gradients: inside the Inception module, 1x1 convolutions are used to reduce the channel count of the feature maps, which helps alleviate the vanishing-gradient problem and makes the network easier to train.
- Multi-scale feature extraction: through the parallel structure of the Inception module, GoogLeNet extracts features at several scales simultaneously. This helps the network capture both local and global structure, improving image classification performance.
- Generalization: because it has few parameters and uses a global average pooling layer, GoogLeNet suffers less from overfitting and generalizes better to new data.
- Image classification performance: GoogLeNet achieved excellent results in the ILSVRC 2014 competition, demonstrating its effectiveness for image classification; its design ideas also inspired later deep convolutional networks.

In this experiment we train GoogLeNet on the MNIST dataset on the 炼丹侠 platform, once on an A100 GPU and once on a CPU, to compare the performance gap between the two. The complete GPU version is as follows:

```python
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Inception module: four parallel branches concatenated along the channel dim
class Inception(nn.Module):
    def __init__(self, in_channels, out1x1, reduce3x3, out3x3,
                 reduce5x5, out5x5, out1x1pool):
        super(Inception, self).__init__()
        # Branch 1: 1x1 convolution
        self.branch1 = nn.Conv2d(in_channels, out1x1, kernel_size=1)
        # Branch 2: 1x1 reduction followed by 3x3 convolution
        self.branch2 = nn.Sequential(
            nn.Conv2d(in_channels, reduce3x3, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(reduce3x3, out3x3, kernel_size=3, padding=1)
        )
        # Branch 3: 1x1 reduction followed by 5x5 convolution
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_channels, reduce5x5, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(reduce5x5, out5x5, kernel_size=5, padding=2)
        )
        # Branch 4: 3x3 max pooling followed by 1x1 convolution
        self.branch4 = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_channels, out1x1pool, kernel_size=1)
        )

    def forward(self, x):
        return torch.cat([self.branch1(x), self.branch2(x),
                          self.branch3(x), self.branch4(x)], 1)

class GoogLeNet(nn.Module):
    def __init__(self, num_classes=10):
        super(GoogLeNet, self).__init__()
        # Stem: single input channel, since MNIST images are grayscale
        self.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3)
        self.maxpool1 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(64, 64, kernel_size=1)
        self.conv3 = nn.Conv2d(64, 192, kernel_size=3, padding=1)
        self.maxpool2 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.inception3a = Inception(192, 64, 96, 128, 16, 32, 32)
        self.inception3b = Inception(256, 128, 128, 192, 32, 96, 64)
        self.maxpool3 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.inception4a = Inception(480, 192, 96, 208, 16, 48, 64)
        self.inception4b = Inception(512, 160, 112, 224, 24, 64, 64)
        self.inception4c = Inception(512, 128, 128, 256, 24, 64, 64)
        self.inception4d = Inception(512, 112, 144, 288, 32, 64, 64)
        self.inception4e = Inception(528, 256, 160, 320, 32, 128, 128)
        self.maxpool4 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.inception5a = Inception(832, 256, 160, 320, 32, 128, 128)
        self.inception5b = Inception(832, 384, 192, 384, 48, 128, 128)
        # Global average pooling replaces large fully connected layers
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.dropout = nn.Dropout(0.4)
        self.fc = nn.Linear(1024, num_classes)

    def forward(self, x):
        x = self.maxpool1(self.conv1(x))
        x = self.maxpool2(self.conv3(self.conv2(x)))
        x = self.inception3b(self.inception3a(x))
        x = self.maxpool3(x)
        x = self.inception4e(self.inception4d(self.inception4c(
            self.inception4b(self.inception4a(x)))))
        x = self.maxpool4(x)
        x = self.inception5b(self.inception5a(x))
        x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.dropout(x)
        x = self.fc(x)
        return x

# Instantiate the model and move it to the GPU
model = GoogLeNet().cuda()

# Data preprocessing
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# Load the MNIST training set
train_dataset = torchvision.datasets.MNIST(root='./data', train=True,
                                           transform=transform, download=True)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=64, shuffle=True)

# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
num_epochs = 10
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        images = images.cuda()
        labels = labels.cuda()
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f'Epoch [{epoch+1}/{num_epochs}], '
          f'Loss: {running_loss/len(train_loader):.4f}')
```

The complete CPU version is as follows:
...
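As a quick sanity check on the Inception configurations (shared by both the GPU and CPU versions above), note that each module's output channel count is simply the sum of its four branch outputs, and must equal the `in_channels` of the next module; the max-pooling layers in between change spatial resolution but not channel count. The following plain-Python sketch, with the configuration tuples copied from the model definition, verifies this bookkeeping:

```python
# Each tuple: (name, in_channels, out1x1, reduce3x3, out3x3,
#              reduce5x5, out5x5, out1x1pool), copied from the
# Inception(...) calls in the model definition above.
configs = [
    ("3a", 192,  64,  96, 128, 16,  32,  32),
    ("3b", 256, 128, 128, 192, 32,  96,  64),
    ("4a", 480, 192,  96, 208, 16,  48,  64),
    ("4b", 512, 160, 112, 224, 24,  64,  64),
    ("4c", 512, 128, 128, 256, 24,  64,  64),
    ("4d", 512, 112, 144, 288, 32,  64,  64),
    ("4e", 528, 256, 160, 320, 32, 128, 128),
    ("5a", 832, 256, 160, 320, 32, 128, 128),
    ("5b", 832, 384, 192, 384, 48, 128, 128),
]

prev_out = None
for name, c_in, o1, r3, o3, r5, o5, op in configs:
    c_out = o1 + o3 + o5 + op  # concatenation of the four branches
    if prev_out is not None:
        # pooling preserves channels, so this holds across pool layers too
        assert c_in == prev_out, f"channel mismatch at inception{name}"
    print(f"inception{name}: {c_in} -> {c_out}")
    prev_out = c_out

assert prev_out == 1024  # matches nn.Linear(1024, num_classes)
```

This is why the final classifier is `nn.Linear(1024, num_classes)`: inception5b emits 384 + 384 + 128 + 128 = 1024 channels, which global average pooling collapses to a 1024-dimensional vector.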