关于算法:自编码器-AEAutoEncoder程序

0次阅读

共计 11618 个字符,预计需要花费 30 分钟才能阅读完成。

原文链接

1. 程序解说

(1)香草编码器

在这种自编码器的最简略构造中,只有三个网络层,即只有一个暗藏层的神经网络。它的输出和输入是雷同的,可通过应用 Adam 优化器和均方误差损失函数,来学习如何重构输出。

在这里,如果 隐含层维数(64)小于输出维数(784),则称这个编码器是有损的。通过这个束缚,来迫使神经网络来学习数据的压缩表征。

input_size = 784
hidden_size = 64
output_size = 784

x = Input(shape=(input_size,))

# Encoder
h = Dense(hidden_size, activation='relu')(x)

# Decoder
r = Dense(output_size, activation='sigmoid')(h)

autoencoder = Model(input=x, output=r)
autoencoder.compile(optimizer='adam', loss='mse')

Dense:Keras Dense 层,keras.layers.core.Dense(units, activation=None)

units, #代表该层的输入维度

activation=None, #激活函数. 然而默认 liner

Activation:激活层对一个层的输入施加激活函数

model.compile():Model 模型办法之一:compile

optimizer:优化器,为预约义优化器名或优化器对象,参考优化器

loss:损失函数,为预约义损失函数名或一个指标函数,参考损失函数

adam:adaptive moment estimation,是对 RMSProp 优化器的更新。利用梯度的一阶矩预计和二阶矩预计动静调整每个参数的学习率。长处:每一次迭代学习率都有一个明确的范畴, 使得参数变动很安稳。

mse:mean_squared_error,均方误差

(2)多层自编码器

如果一个隐含层还不够,显然能够将主动编码器的隐含层数目进一步提高。

在这里,实现中应用了 3 个隐含层,而不是只有一个。任意一个隐含层都能够作为特色表征,然而为了使网络对称,咱们应用了最两头的网络层。

input_size = 784
hidden_size = 128
code_size = 64

x = Input(shape=(input_size,))

# Encoder
hidden_1 = Dense(hidden_size, activation='relu')(x)
h = Dense(code_size, activation='relu')(hidden_1)

# Decoder
hidden_2 = Dense(hidden_size, activation='relu')(h)
r = Dense(input_size, activation='sigmoid')(hidden_2)

autoencoder = Model(input=x, output=r)
autoencoder.compile(optimizer='adam', loss='mse')

(3)卷积自编码器

除了全连贯层,自编码器也能利用到卷积层,原理是一样的,然而 要应用 3D 矢量(如图像)而不是展平后的一维矢量 。对输出图像进行 下采样,以提供较小维度的潜在表征,来迫使自编码器从压缩后的数据进行学习。

x = Input(shape=(28, 28,1)) 

# Encoder
conv1_1 = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
pool1 = MaxPooling2D((2, 2), padding='same')(conv1_1)
conv1_2 = Conv2D(8, (3, 3), activation='relu', padding='same')(pool1)
pool2 = MaxPooling2D((2, 2), padding='same')(conv1_2)
conv1_3 = Conv2D(8, (3, 3), activation='relu', padding='same')(pool2)
h = MaxPooling2D((2, 2), padding='same')(conv1_3)

# Decoder
conv2_1 = Conv2D(8, (3, 3), activation='relu', padding='same')(h)
up1 = UpSampling2D((2, 2))(conv2_1)
conv2_2 = Conv2D(8, (3, 3), activation='relu', padding='same')(up1)
up2 = UpSampling2D((2, 2))(conv2_2)
conv2_3 = Conv2D(16, (3, 3), activation='relu')(up2)
up3 = UpSampling2D((2, 2))(conv2_3)
r = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(up3)

autoencoder = Model(input=x, output=r)
autoencoder.compile(optimizer='adam', loss='mse')

conv2d:Conv2D(filters, kernel_size, strides=(1, 1), padding=’valid’)

filters:卷积核的数目(即输入的维度)。

kernel_size:卷积核的宽度和长度,单个整数或由两个整数形成的 list/tuple。如为单个整数,则示意在各个空间维度的雷同长度。

strides:卷积的步长,单个整数或由两个整数形成的 list/tuple。如为单个整数,则示意在各个空间维度的雷同步长。任何不为 1 的 strides 均与任何不为 1 的 dilation_rate 均不兼容。

padding:补 0 策略,有“valid”,“same”两种。“valid”代表只进行无效的卷积,即对边界数据不解决。“same”代表保留边界处的卷积后果,通常会导致输入 shape 与输出 shape 雷同。

MaxPooling2D:2D 输出的最大池化层。MaxPooling2D(pool_size=(2, 2), strides=None, border_mode=’valid’)

pool_size:pool_size:长为 2 的整数 tuple,代表在两个方向(竖直,程度)上的下采样因子,如取(2,2)将使图片在两个维度上均变为原长的一半。
strides:长为 2 的整数 tuple,或者 None,步长值。
padding: 字符串,“valid”或者”same”。

UpSampling2D:上采样。UpSampling2D(size=(2, 2))

size:整数 tuple,别离为行和列上采样因子。

(4)正则自编码器

除了施加一个比输出维度小的隐含层,一些其余办法也可用来束缚自编码器重构,如正则自编码器。

正则自编码器不须要应用浅层的编码器和解码器以及小的编码维数来限度模型容量,而是应用 损失函数 来激励模型学习其余个性(除了将输出复制到输入)。这些个性包含稠密表征、小导数表征、以及对噪声或输出缺失的鲁棒性。

即便模型容量大到足以学习一个无意义的恒等函数,非线性且过齐备的正则自编码器依然可能从数据中学到一些对于数据分布的有用信息。

在理论利用中,罕用到两种正则自编码器,别离是 稠密自编码器 降噪自编码器

(5)稠密自编码器

个别用来学习特色,以便用于像分类这样的工作。稠密正则化的自编码器必须反映训练数据集的独特统计特色,而不是简略地充当恒等函数。以这种形式训练,执行附带稠密惩办的复现工作能够失去能学习有用特色的模型。

还有一种用来束缚主动编码器重构的办法,是对其损失函数施加束缚。比方,可对 损失函数增加一个正则化束缚,这样能使自编码器学习到数据的稠密表征。

要留神,在隐含层中,咱们还退出了L1 正则化,作为优化阶段中损失函数的惩办项。与香草自编码器相比,这样操作后的数据表征更为稠密。

input_size = 784
hidden_size = 64
output_size = 784

x = Input(shape=(input_size,))

# Encoder
h = Dense(hidden_size, activation='relu', activity_regularizer=regularizers.l1(10e-5))(x)
#施加在输入上的 L1 正则项

# Decoder
r = Dense(output_size, activation='sigmoid')(h)

autoencoder = Model(input=x, output=r)
autoencoder.compile(optimizer='adam', loss='mse')

activity_regularizer:施加在输入上的正则项,为 ActivityRegularizer 对象

l1(l=0.01):L1 正则项,正则项通常用于对模型的训练施加某种束缚,L1 正则项即 L1 范数束缚,该束缚会使被约束矩阵 / 向量更稠密。

(6)降噪自编码器

这里不是通过对损失函数施加惩办项,而是 通过扭转损失函数的重构误差项来学习一些有用信息

向训练数据退出噪声,并使自编码器学会去除这种噪声来取得没有被噪声污染过的实在输出。因而,这就迫使编码器学习提取最重要的特色并学习输出数据中更加鲁棒的表征,这也是它的泛化能力 比个别编码器强 的起因。

这种构造能够通过梯度降落算法来训练。

x = Input(shape=(28, 28, 1))

# Encoder
conv1_1 = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
pool1 = MaxPooling2D((2, 2), padding='same')(conv1_1)
conv1_2 = Conv2D(32, (3, 3), activation='relu', padding='same')(pool1)
h = MaxPooling2D((2, 2), padding='same')(conv1_2)

# Decoder
conv2_1 = Conv2D(32, (3, 3), activation='relu', padding='same')(h)
up1 = UpSampling2D((2, 2))(conv2_1)
conv2_2 = Conv2D(32, (3, 3), activation='relu', padding='same')(up1)
up2 = UpSampling2D((2, 2))(conv2_2)
r = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(up2)

autoencoder = Model(input=x, output=r)
autoencoder.compile(optimizer='adam', loss='mse')

2. 程序实例:

(1)单层自编码器

from keras.layers import Input, Dense
from keras.models import Model
from keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt
 
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print(x_train.shape)
print(x_test.shape)
 
 #单层自编码器
encoding_dim = 32
input_img = Input(shape=(784,))
 
encoded = Dense(encoding_dim, activation='relu')(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)
 
autoencoder = Model(inputs=input_img, outputs=decoded)
encoder = Model(inputs=input_img, outputs=encoded)
 
encoded_input = Input(shape=(encoding_dim,))
decoder_layer = autoencoder.layers[-1]
 
decoder = Model(inputs=encoded_input, outputs=decoder_layer(encoded_input))
 
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
 
autoencoder.fit(x_train, x_train, epochs=50, batch_size=256, 
                shuffle=True, validation_data=(x_test, x_test))
 
encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)
 
 #输入图像
n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
 
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

(2)卷积自编码器

from keras.layers import Input, Convolution2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt
from keras.callbacks import TensorBoard

(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))
noise_factor = 0.5
x_train_noisy = x_train + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_train.shape) 
x_test_noisy = x_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_test.shape) 
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
print(x_train.shape)
print(x_test.shape)


#卷积自编码器
input_img = Input(shape=(28, 28, 1))
 
x = Convolution2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
 
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Convolution2D(1, (3, 3), activation='sigmoid', padding='same')(x)
 
autoencoder = Model(inputs=input_img, outputs=decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
 
# 关上一个终端并启动 TensorBoard,终端中输出 tensorboard --logdir=/autoencoder
autoencoder.fit(x_train, x_train, epochs=50, batch_size=256,
                shuffle=True, validation_data=(x_test, x_test),
                callbacks=[TensorBoard(log_dir='autoencoder')])
 
decoded_imgs = autoencoder.predict(x_test)


#输入图像
n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
 
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

(3)深度自编码器

from keras.layers import Input, Dense
from keras.models import Model
from keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt

(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print(x_train.shape)
print(x_test.shape)



#深度自编码器
input_img = Input(shape=(784,))
encoded = Dense(128, activation='relu')(input_img)
encoded = Dense(64, activation='relu')(encoded)
decoded_input = Dense(32, activation='relu')(encoded)
 
decoded = Dense(64, activation='relu')(decoded_input)
decoded = Dense(128, activation='relu')(decoded)
decoded = Dense(784, activation='sigmoid')(encoded)
 
autoencoder = Model(inputs=input_img, outputs=decoded)
encoder = Model(inputs=input_img, outputs=decoded_input)
 
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
 
autoencoder.fit(x_train, x_train, epochs=50, batch_size=256, 
                shuffle=True, validation_data=(x_test, x_test))
 
encoded_imgs = encoder.predict(x_test)
decoded_imgs = autoencoder.predict(x_test)



#输入图像
n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
 
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

(4)降噪自编码器

from keras.layers import Input, Convolution2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt
from keras.callbacks import TensorBoard
 
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))
noise_factor = 0.5
x_train_noisy = x_train + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_train.shape) 
x_test_noisy = x_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_test.shape) 
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
print(x_train.shape)
print(x_test.shape)
 
input_img = Input(shape=(28, 28, 1))
 
x = Convolution2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(32, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
 
x = Convolution2D(32, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Convolution2D(1, (3, 3), activation='sigmoid', padding='same')(x)
 
autoencoder = Model(inputs=input_img, outputs=decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
 
# 关上一个终端并启动 TensorBoard,终端中输出 tensorboard --logdir=/autoencoder
autoencoder.fit(x_train_noisy, x_train, epochs=10, batch_size=256,
                shuffle=True, validation_data=(x_test_noisy, x_test),
                callbacks=[TensorBoard(log_dir='autoencoder', write_graph=False)])
 
decoded_imgs = autoencoder.predict(x_test_noisy)
 
n = 10
plt.figure(figsize=(30, 6))
for i in range(n):
    ax = plt.subplot(3, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    
    ax = plt.subplot(3, n, i + 1 + n)
    plt.imshow(x_test_noisy[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
 
    ax = plt.subplot(3, n, i + 1 + 2*n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

学习更多编程常识,请关注我的公众号:

代码的路

正文完
 0