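Continuing from the previous post, this time we use an entry-level CNN (convolutional neural network) to recognize the prices.
(To echo the previous post, here is one last clickbait title.)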
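1 Analysis
The original images have already been downloaded. The next step is to preprocess them and cut them into pieces that serve as raw material for machine learning.
Because the images are PNG files, they generally have 4 channels (RGB + alpha).
General processing pipeline:
1 Get the original image: 4 channels (RGB + alpha)
2 Convert to grayscale: single channel, pixel values 0-255
  Grayscale formula: L = R * 299/1000 + G * 587/1000 + B * 114/1000
3 Binarize the grayscale image: convert each pixel value to 0 or 1
  (the threshold has to be tuned to the actual image data, e.g. [0 if _ < 200 else 1])
For more complex data, the pipeline may also involve border removal, edge detection, deskewing, segmentation and noise reduction (erosion, dilation).
The data here is simple, so it can be used directly once converted to binary.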
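As a worked example of steps 2 and 3, the sketch below applies the grayscale formula and a threshold to a few hand-picked pixels in plain Python. The helper names `to_gray` and `binarize` and the sample pixels are made up for illustration; an imaging library such as Pillow may round slightly differently.

```python
def to_gray(r: int, g: int, b: int) -> int:
    """Luma per the formula above, using integer arithmetic."""
    return (r * 299 + g * 587 + b * 114) // 1000


def binarize(gray: int, threshold: int = 200) -> int:
    """Map a 0-255 gray value to 0 (dark) or 1 (light)."""
    return 0 if gray < threshold else 1


# Hypothetical RGBA pixels; the alpha channel is simply ignored here.
pixels = [
    (255, 255, 255, 255),  # white
    (0, 0, 0, 255),        # black
    (255, 0, 0, 255),      # pure red
    (200, 200, 200, 255),  # light gray, exactly at the threshold
]
gray_values = [to_gray(r, g, b) for r, g, b, _alpha in pixels]
binary_values = [binarize(v) for v in gray_values]
print(gray_values)
print(binary_values)
```

Pure red ends up well below the threshold because the red channel only contributes 29.9% of the luma, which is why the threshold needs tuning per data set.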
2 Recognition
2.1 Cutting the image
Key cutting code:
    import copy
    import os

    import matplotlib.pyplot as plt
    import numpy as np
    from PIL import Image, ImageDraw

    lines = [-281.16, -249.92, -218.68, -187.44, -156.2, -124.96, -93.72, -62.48, -31.24, -0.0]
    # Width in pixels of one cropped piece
    lines_step = 22
    # Maps each value in `lines` to the x coordinate (pixels) where the crop starts
    lines_map = {
        '-281.16': 336,
        '-249.92': 299,
        '-218.68': 261,
        '-187.44': 223,
        '-156.2': 187,
        '-124.96': 149,
        '-93.72': 112,
        '-62.48': 74,
        '-31.24': 38,
        '-0.0': 1,
    }
    # Running counter used to name the cropped files
    idx = 1


    def process_img(imgpath: str):
        global idx
        # Original image
        img = Image.open(imgpath)
        width, height = img.size
        img2 = copy.deepcopy(img)
        img_arr = np.array(img)
        print(img_arr.shape)

        # Convert to grayscale
        # Formula: L = R * 299/1000 + G * 587/1000 + B * 114/1000
        img_gray = img.convert('L')
        img_gray_arr = np.array(img_gray)
        print(img_gray_arr.shape)
        # Debug: dump the gray values row by row
        # for data in img_gray_arr:
        #     print(''.join(['{:03}'.format(_) if _ != 0 else '...' for _ in data]))

        # Binarize
        img_bin = img_gray.point([0 if _ < 128 else 1 for _ in range(256)], '1')
        img_bin_arr = np.array(img_bin)
        print(img_bin_arr.shape)
        # Debug: dump the binary image row by row
        # for data in img_bin_arr:
        #     print(''.join(['X' if _ else '.' for _ in data]))

        # Mark and crop each piece
        os.makedirs('imgs_crop', exist_ok=True)
        img_draw = ImageDraw.Draw(img2)
        for line in lines:
            new_line = lines_map.get(str(line))
            p1 = (new_line, 1)
            p2 = (new_line + lines_step, height - 1)
            # Outline the piece on the copy of the original image
            img_draw.rectangle((p1, p2), outline='red')
            # Crop the piece out of the binary image and save it
            img_crop = img_bin.crop((new_line, 0, new_line + lines_step, height))
            img_crop.save(os.path.join('imgs_crop', '{:03}.png'.format(idx)))
            idx += 1
        plt.imshow(img2)
        plt.show()
Images after cutting:
Next, sort the images by hand into folders named after the digit they show. That completes the manual labeling.
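After sorting by hand, it can be worth checking that every digit folder actually contains samples. The sketch below is one way to do that with the standard library only; `count_labeled_images` is a hypothetical helper, and the demo builds a throwaway directory rather than reading the real data.

```python
import os
import tempfile


def count_labeled_images(parent_path: str) -> dict:
    """Return {digit: number of .png files} for folders named 0-9."""
    counts = {}
    for digit in range(10):
        folder = os.path.join(parent_path, str(digit))
        if not os.path.isdir(folder):
            counts[digit] = 0  # folder missing: no samples for this digit
            continue
        counts[digit] = sum(1 for name in os.listdir(folder) if name.endswith('.png'))
    return counts


# Demo on a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    for digit, n_files in ((0, 2), (7, 1)):
        os.makedirs(os.path.join(root, str(digit)))
        for i in range(n_files):
            open(os.path.join(root, str(digit), '{:03}.png'.format(i)), 'wb').close()
    demo_counts = count_labeled_images(root)
print(demo_counts)
```

A digit with a count of 0 would never be learned by the model, so it is cheaper to catch that here than after training.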
2.2 Training
The implementation mainly uses Python 3 with Keras + TensorFlow.
Model code example:
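    # Assumes TensorFlow's bundled Keras; adjust the imports for standalone Keras
    from tensorflow.keras.layers import Conv2D, Dense, Dropout, Flatten, MaxPooling2D
    from tensorflow.keras.models import Sequential


    def gen_model():
        """
        Build the model
        :return: model
        """
        _model = Sequential([
            # Convolution layer
            # 36 is the output dimension, i.e. the number of convolution kernels
            # kernel_size is the size of each kernel
            Conv2D(36, kernel_size=3, padding='same', activation='relu', input_shape=(36, 22, 1)),
            # Max pooling layer
            MaxPooling2D(pool_size=(2, 2)),
            # Dropout randomly sets a fraction of the input units to 0 at each
            # update during training, which helps prevent overfitting
            Dropout(0.25),
            # Convolution layer (input_shape is only needed on the first layer)
            Conv2D(64, kernel_size=3, padding='same', activation='relu'),
            # Max pooling layer
            MaxPooling2D(pool_size=(2, 2)),
            # Dropout(0.25),
            # Flatten the output, i.e. turn multi-dimensional data into 1-D
            Flatten(),
            # Fully connected layers
            Dense(512, activation='relu'),
            Dropout(0.5),
            Dense(10, activation='softmax'),
        ])
        return _model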
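To see what this model does to a 36x22 input, the sketch below walks the tensor shape and parameter count through each layer by hand. It assumes stride 1 for the convolutions and the default 'valid' padding for pooling; the helper names are invented for this example and are not Keras APIs.

```python
def conv2d_same(shape, filters, kernel=3):
    """Output shape and parameter count of a 'same'-padded, stride-1 Conv2D."""
    height, width, channels = shape
    params = kernel * kernel * channels * filters + filters  # weights + biases
    return (height, width, filters), params


def max_pool(shape, pool=2):
    """Output shape of 2x2 max pooling with 'valid' padding (floor division)."""
    height, width, channels = shape
    return (height // pool, width // pool, channels)


def dense(n_in, n_out):
    """Parameter count of a fully connected layer."""
    return n_in * n_out + n_out


shape = (36, 22, 1)
shape, p_conv1 = conv2d_same(shape, 36)   # (36, 22, 36)
shape = max_pool(shape)                   # (18, 11, 36)
shape, p_conv2 = conv2d_same(shape, 64)   # (18, 11, 64)
shape = max_pool(shape)                   # (9, 5, 64)
flat = shape[0] * shape[1] * shape[2]     # length after Flatten
p_dense1 = dense(flat, 512)
p_dense2 = dense(512, 10)
total_params = p_conv1 + p_conv2 + p_dense1 + p_dense2
print(shape, flat, total_params)
```

Dropout and Flatten add no parameters. Nearly all of the weights sit in the `Dense(512)` layer, and the total can be compared against what `model.summary()` prints.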
Training code example:
    import matplotlib.pyplot as plt


    def train():
        model = gen_model()
        model.summary()

        # Compile the model
        # optimizer: the optimizer to use
        # loss: name of the loss function (the objective function)
        # metrics: metrics used to evaluate the network during training and testing
        model.compile(optimizer='adam',  # keras.optimizers.Adadelta()
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])

        # load_data() and load_test_data() are defined elsewhere (not shown here)
        x_train, y_train = load_data()
        x_train = x_train.reshape(-1, 36, 22, 1)
        x_test, y_test = load_test_data()
        x_test = x_test.reshape(-1, 36, 22, 1)

        # Fit the model on the training set (callbacks=tensorboard can be used for monitoring)
        # x_train: input data
        # y_train: labels
        # batch_size: number of samples per batch; each batch triggers one
        #             gradient-descent step on the objective function
        # epochs: integer, number of training rounds; each epoch goes through
        #         the whole training set once
        # verbose: logging; 0 = nothing on stdout, 1 = progress bar, 2 = one line per epoch
        # validation_data: the validation data set
        history = model.fit(x_train, y_train,
                            batch_size=32,
                            epochs=20,
                            verbose=1,
                            validation_data=(x_test, y_test))

        score = model.evaluate(x_test, y_test, verbose=0)
        print('Test loss:', score[0])
        print('Test accuracy:', score[1])

        # Plot training and test accuracy over the epochs
        plt.plot(history.history['accuracy'])
        plt.plot(history.history['val_accuracy'])
        plt.title('Model accuracy')
        plt.ylabel('Accuracy')
        plt.xlabel('Epoch')
        plt.legend(['Train', 'Test'], loc='upper left')
        plt.show()

        # Plot training and test loss over the epochs
        plt.plot(history.history['loss'])
        plt.plot(history.history['val_loss'])
        plt.title('Model loss')
        plt.ylabel('Loss')
        plt.xlabel('Epoch')
        plt.legend(['Train', 'Test'], loc='upper left')
        plt.show()

        model.save('model/ziru.h5')
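The `sparse_categorical_crossentropy` loss takes integer labels (0-9) instead of one-hot vectors. For a single sample it reduces to the negative log of the probability the model assigns to the correct class. The sketch below reimplements that idea in plain Python for illustration only; it is not the Keras implementation, and the probability vectors are made up.

```python
import math


def sparse_categorical_crossentropy(label: int, probs: list) -> float:
    """Loss for one sample: -log of the probability given to the true class."""
    return -math.log(probs[label])


def mean_loss(labels: list, batch_probs: list) -> float:
    """Average the per-sample loss over a batch."""
    losses = [sparse_categorical_crossentropy(y, p) for y, p in zip(labels, batch_probs)]
    return sum(losses) / len(losses)


# A fairly confident prediction for digit 3 (probabilities sum to 1).
confident = [0.05] * 10
confident[3] = 0.55
# An untrained model guessing uniformly over the 10 digits.
uniform = [0.1] * 10

loss_confident = sparse_categorical_crossentropy(3, confident)
loss_uniform = sparse_categorical_crossentropy(3, uniform)
print(loss_confident, loss_uniform, mean_loss([3, 3], [confident, uniform]))
```

A uniform guess over 10 classes gives a loss of ln(10), about 2.30, so a loss near that value at the start of training is what one would expect from a model that has not learned anything yet.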
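Training data generation code example:
The data has two parts: train_label and train_data. The label is the tag attached to each sample, i.e. the value it should be recognized as; the data is the sample's actual pixel values.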
    import os

    import numpy as np
    from PIL import Image


    def gen_train_data(parent_path: str):
        train_data = []
        train_label = []
        for idx in range(10):
            cur_path = os.path.join(parent_path, str(idx))
            for dirpath, dirnames, filenames in os.walk(cur_path):
                for filename in filenames:
                    if filename.endswith('png'):
                        # Join with dirpath so files in nested folders still resolve
                        imgpath = os.path.join(dirpath, filename)
                        data = np.array(Image.open(imgpath))
                        # The folder name (0-9) is the label, independent of path layout
                        train_label.append(idx)
                        train_data.append(data)
        return np.array(train_data), np.array(train_label)
The training run is shown below.
Because the images are simple, a short training run reaches essentially 100% recognition accuracy.
    Epoch 1/20
    7/7 [==============================] - 1s 68ms/step - loss: 2.0173 - accuracy: 0.3350 - val_loss: 1.3893 - val_accuracy: 0.7950
    Epoch 2/20
    7/7 [==============================] - 0s 43ms/step - loss: 1.1314 - accuracy: 0.6900 - val_loss: 0.5309 - val_accuracy: 1.0000
    Epoch 3/20
    7/7 [==============================] - 0s 36ms/step - loss: 0.5474 - accuracy: 0.8100 - val_loss: 0.1853 - val_accuracy: 1.0000
    Epoch 4/20
    7/7 [==============================] - 0s 36ms/step - loss: 0.2606 - accuracy: 0.9250 - val_loss: 0.0842 - val_accuracy: 1.0000
    Epoch 5/20
    7/7 [==============================] - 0s 34ms/step - loss: 0.2730 - accuracy: 0.9250 - val_loss: 0.1025 - val_accuracy: 0.9700
    Epoch 6/20
    7/7 [==============================] - 0s 37ms/step - loss: 0.1857 - accuracy: 0.9300 - val_loss: 0.0365 - val_accuracy: 1.0000
    Epoch 7/20
    7/7 [==============================] - 0s 35ms/step - loss: 0.0952 - accuracy: 0.9800 - val_loss: 0.0165 - val_accuracy: 1.0000
    Epoch 8/20
    7/7 [==============================] - 0s 35ms/step - loss: 0.0560 - accuracy: 0.9900 - val_loss: 0.0076 - val_accuracy: 1.0000
    Epoch 9/20
    7/7 [==============================] - 0s 35ms/step - loss: 0.0125 - accuracy: 1.0000 - val_loss: 0.0066 - val_accuracy: 1.0000
    Epoch 10/20
    7/7 [==============================] - 0s 36ms/step - loss: 0.0173 - accuracy: 1.0000 - val_loss: 0.0024 - val_accuracy: 1.0000
    Epoch 11/20
    7/7 [==============================] - 0s 34ms/step - loss: 0.0086 - accuracy: 1.0000 - val_loss: 0.0014 - val_accuracy: 1.0000
    Epoch 12/20
    7/7 [==============================] - 0s 37ms/step - loss: 0.0061 - accuracy: 1.0000 - val_loss: 8.3420e-04 - val_accuracy: 1.0000
    Epoch 13/20
    7/7 [==============================] - 0s 33ms/step - loss: 0.0051 - accuracy: 1.0000 - val_loss: 4.9917e-04 - val_accuracy: 1.0000
    Epoch 14/20
    7/7 [==============================] - 0s 35ms/step - loss: 0.0020 - accuracy: 1.0000 - val_loss: 3.4299e-04 - val_accuracy: 1.0000
    Epoch 15/20
    7/7 [==============================] - 0s 35ms/step - loss: 0.0037 - accuracy: 1.0000 - val_loss: 2.3839e-04 - val_accuracy: 1.0000
    Epoch 16/20
    7/7 [==============================] - 0s 34ms/step - loss: 0.0028 - accuracy: 1.0000 - val_loss: 2.0110e-04 - val_accuracy: 1.0000
    Epoch 17/20
    7/7 [==============================] - 0s 36ms/step - loss: 0.0012 - accuracy: 1.0000 - val_loss: 1.8016e-04 - val_accuracy: 1.0000
    Epoch 18/20
    7/7 [==============================] - 0s 35ms/step - loss: 0.0015 - accuracy: 1.0000 - val_loss: 1.5284e-04 - val_accuracy: 1.0000
    Epoch 19/20
    7/7 [==============================] - 0s 38ms/step - loss: 8.4545e-04 - accuracy: 1.0000 - val_loss: 1.3383e-04 - val_accuracy: 1.0000
    Epoch 20/20
    7/7 [==============================] - 0s 36ms/step - loss: 7.2767e-04 - accuracy: 1.0000 - val_loss: 1.2135e-04 - val_accuracy: 1.0000
    Test loss: 0.00012135423457948491
    Test accuracy: 1.0
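The `7/7` in the log is the number of batches per epoch. With `batch_size=32`, the sketch below shows how that figure follows from the sample count. The count of 200 training samples is an assumption inferred from the accuracy values all being multiples of 0.005; the post does not state it.

```python
import math


def steps_per_epoch(n_samples: int, batch_size: int) -> int:
    """Number of batches needed to cover the data once; the last batch may be smaller."""
    return math.ceil(n_samples / batch_size)


# ASSUMPTION: 200 training samples (not stated in the post).
n_samples = 200
batch_size = 32
epochs = 20

steps = steps_per_epoch(n_samples, batch_size)
last_batch = n_samples - (steps - 1) * batch_size  # size of the final, partial batch
total_updates = steps * epochs                      # weight updates over the whole run
print(steps, last_batch, total_updates)
```

Under that assumption the whole run performs only 140 weight updates, which fits the description of a short training run.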
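Training loss and accuracy charts:
2.3 Verifying recognition
Load the model, pass in the data and read off the recognition result.
Example code: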
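    import numpy as np
    # Assumes TensorFlow's bundled Keras; adjust the import for standalone Keras
    from tensorflow.keras.models import load_model


    def __recognize_img(img_data):
        # The model is reloaded on every call; cache it if this is called often
        model = load_model('model/ziru.h5')
        img_arr = np.array(img_data)
        img_arr = img_arr.reshape((-1, 36, 22, 1))
        result = model.predict(img_arr)
        predict_val = __parse_result(result)
        return predict_val


    def __parse_result(result):
        # result[0] holds the 10 softmax probabilities for the first image;
        # the index of the largest one is the recognized digit
        result = result[0]
        max_val = max(result)
        for i in range(10):
            if max_val == result[i]:
                return i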
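The parsing step only looks at the first row of the prediction. When several cropped images are predicted in one batch, each row can be parsed the same way. The sketch below does this in plain Python and also rejects low-confidence rows; `parse_results` and the `min_confidence` cutoff are hypothetical additions, not part of the original code.

```python
def parse_results(rows, min_confidence=0.5):
    """Turn rows of 10 softmax probabilities into digits.

    Returns one entry per row: the index of the largest probability,
    or None when that probability is below min_confidence.
    """
    digits = []
    for probs in rows:
        best = max(range(len(probs)), key=lambda i: probs[i])
        digits.append(best if probs[best] >= min_confidence else None)
    return digits


# Made-up model outputs: one confident row, one uniform (uncertain) row.
row_confident = [0.0] * 10
row_confident[1] = 0.9
row_confident[7] = 0.1
row_uncertain = [0.1] * 10

parsed = parse_results([row_confident, row_uncertain])
print(parsed)
```

Returning `None` instead of a guess lets the caller decide whether to retry or flag the image for manual review.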
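3 Packaging
With the recognition flow complete, what remains is to wrap it as a service and expose it.
For convenience, it is already available as an API. Test endpoint ==> https://lemon.lpe234.xyz/common/ziru/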
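4 Summary
The use of CNN in this post is strictly entry-level. Digit recognition can in fact also be done by comparing key pixels: the images of 1 and 3, for example, are bound to differ at certain pixels, and finding those differences is usually enough to tell them apart.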
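A minimal sketch of the key-pixel idea, using tiny hand-drawn 3x5 bitmaps rather than the real 36x22 crops: find the pixels where the two templates differ, then classify an image by which template it agrees with on those pixels. The templates and function names are invented for this illustration.

```python
# Hand-drawn 3x5 templates ('1' = ink, '0' = background); illustrative only.
ONE = ["010", "110", "010", "010", "111"]
THREE = ["111", "001", "111", "001", "111"]


def key_pixels(a, b):
    """Coordinates (row, col) where the two templates differ."""
    return [(r, c) for r in range(len(a)) for c in range(len(a[r])) if a[r][c] != b[r][c]]


KEYS = key_pixels(ONE, THREE)


def classify(image):
    """Return '1' or '3' depending on which template matches more key pixels."""
    score_one = sum(image[r][c] == ONE[r][c] for r, c in KEYS)
    score_three = len(KEYS) - score_one  # on a key pixel, a mismatch with ONE is a match with THREE
    return '1' if score_one > score_three else '3'


# A '1' with a single flipped pixel in the top-left corner still classifies correctly.
noisy_one = ["110", "110", "010", "010", "111"]
print(len(KEYS), classify(ONE), classify(THREE), classify(noisy_one))
```

Extending this to all ten digits needs a set of key pixels that separates every pair, at which point the CNN is arguably less work to maintain.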