Preface
The previous post showed how to recognize handwritten digits with LabVIEW and OpenCV dnn. Today we look at how to do image classification with LabVIEW and OpenCV dnn.
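I. What is image classification?
1. The concept of image classification
Image classification is, at its core, the task of assigning an image a label from a given set of categories. In practice, this means analyzing an input image and returning a label that classifies it. The label always comes from a predefined set of possible categories.
Example: suppose the category set is categories = {dog, cat, eagle}, and we give the classification system an image (figure below). The goal is to assign one category from the set based on the input image, here eagle. The system can also assign several labels with probabilities, for example eagle: 95%, cat: 4%, dog: 1%.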
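To make the "label plus probability" idea concrete, here is a minimal sketch in plain Python. The raw scores are made up for illustration, and `softmax` and `top_k` are hypothetical helper names, not part of any library used later in this post.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(labels, probs, k=1):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


categories = ["dog", "cat", "eagle"]  # category set from the example above
raw_scores = [0.5, 1.0, 4.0]          # hypothetical classifier scores
probs = softmax(raw_scores)

best_label, best_prob = top_k(categories, probs, k=1)[0]
print(best_label, round(best_prob, 3))
```

With k=1 this returns a single label; a larger k returns several labels ranked by probability.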
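2. Introduction to MobileNet
MobileNet: the basic unit is the depthwise separable convolution, a structure that had already been used in Inception models. A depthwise separable convolution is a factorized convolution that splits into two smaller operations: a depthwise convolution and a pointwise convolution, as shown in Figure 1. A depthwise convolution differs from a standard convolution. In a standard convolution, each kernel spans all input channels, whereas a depthwise convolution uses a separate kernel for each input channel, that is, one kernel per input channel, so it operates at the depth level. A pointwise convolution is simply an ordinary convolution with a 1×1 kernel. Figure 2 shows the two operations more clearly. A depthwise separable convolution first applies the depthwise convolution to each input channel separately, then uses the pointwise convolution to combine those outputs. The overall effect is close to that of a standard convolution, but with far fewer computations and model parameters.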
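The saving can be checked with a little arithmetic. The sketch below counts weights and multiply-adds for one layer, assuming stride 1, "same" padding and no bias terms. The layer size (3×3 kernel, 512 to 512 channels, 14×14 feature map) is only an illustrative choice, and the function names are hypothetical.

```python
def standard_conv_cost(k, c_in, c_out, feat):
    """Weights and multiply-adds of a k x k standard convolution."""
    params = k * k * c_in * c_out
    mult_adds = params * feat * feat
    return params, mult_adds


def separable_conv_cost(k, c_in, c_out, feat):
    """Depthwise (k x k, one kernel per input channel) plus pointwise (1 x 1)."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    params = depthwise + pointwise
    mult_adds = params * feat * feat
    return params, mult_adds


# Illustrative layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map
std_params, std_ops = standard_conv_cost(3, 512, 512, 14)
sep_params, sep_ops = separable_conv_cost(3, 512, 512, 14)

# The cost ratio simplifies to 1/c_out + 1/k**2
ratio = sep_ops / std_ops
print(std_params, sep_params, round(ratio, 4))
```

For a 3×3 kernel the ratio is dominated by the 1/9 term, so the separable version costs roughly one eighth to one ninth of the standard convolution.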
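The MobileNet network structure is shown in the table. It starts with a 3×3 standard convolution, followed by a stack of depthwise separable convolutions, some of whose depthwise convolutions downsample with strides=2. Average pooling then reduces the feature map to 1×1, a fully connected layer sized to the number of predicted classes follows, and the last layer is a softmax. Counting depthwise and pointwise convolutions separately, the whole network has 28 layers (Avg Pool and Softmax are not counted).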
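The 28-layer figure can be reproduced with a quick count. The block list below is an assumption: it is the commonly cited MobileNet v1 configuration for a 224×224 input (13 separable blocks), written from memory rather than taken from the table in this post, so treat the strides and channel counts as illustrative.

```python
# (stride of the depthwise step, output channels of the pointwise step)
# Assumed MobileNet v1 configuration for a 224x224 input.
SEPARABLE_BLOCKS = [
    (1, 64), (2, 128), (1, 128), (2, 256), (1, 256), (2, 512),
    (1, 512), (1, 512), (1, 512), (1, 512), (1, 512),
    (2, 1024), (1, 1024),
]


def count_layers(blocks):
    """1 standard conv + 2 layers per separable block + 1 fully connected layer."""
    return 1 + 2 * len(blocks) + 1


def final_feature_size(input_size, first_conv_stride, blocks):
    """Spatial size before average pooling, assuming 'same' padding."""
    size = input_size // first_conv_stride
    for stride, _channels in blocks:
        size //= stride
    return size


layers = count_layers(SEPARABLE_BLOCKS)
feature_size = final_feature_size(224, 2, SEPARABLE_BLOCKS)
print(layers, feature_size)
```

Under these assumptions, the average pooling layer receives a 7×7 feature map and reduces it to 1×1.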
II. Image classification in Python (py_to_py_ssd_mobilenet.py)
1. Get the pretrained model
- Get the model from tensorflow.keras.applications (MobileNet as the example):
from tensorflow.keras.applications import MobileNet

original_tf_model = MobileNet(
    include_top=True,
    weights="imagenet"
)
- Freeze original_tf_model and save it as a .pb file:
import os
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2


def get_tf_model_proto(tf_model):
    # define the directory for .pb model
    pb_model_path = "models"
    # define the name of .pb model
    pb_model_name = "mobilenet.pb"
    # create directory for further converted model
    os.makedirs(pb_model_path, exist_ok=True)
    # wrap the Keras model in a tf.function so it can be traced as a graph
    tf_model_graph = tf.function(lambda x: tf_model(x))
    # get concrete function for the model's input shape and dtype
    tf_model_graph = tf_model_graph.get_concrete_function(
        tf.TensorSpec(tf_model.inputs[0].shape, tf_model.inputs[0].dtype))
    # obtain frozen concrete function (variables become constants)
    frozen_tf_func = convert_variables_to_constants_v2(tf_model_graph)
    # get frozen graph
    frozen_tf_func.graph.as_graph_def()
    # save full tf model as a binary .pb file
    tf.io.write_graph(graph_or_graph_def=frozen_tf_func.graph,
                      logdir=pb_model_path,
                      name=pb_model_name,
                      as_text=False)
    return os.path.join(pb_model_path, pb_model_name)
2. Run inference with opencv_dnn
- Image preprocessing (blob):
import cv2
import numpy as np


def get_preprocessed_img(img_path):
    # read the image
    input_img = cv2.imread(img_path, cv2.IMREAD_COLOR)
    input_img = input_img.astype(np.float32)
    # define preprocess parameters
    mean = np.array([1.0, 1.0, 1.0]) * 127.5
    scale = 1 / 127.5
    # prepare input blob to fit the model input:
    # 1. subtract mean
    # 2. scale to set pixel values from -1 to 1
    input_blob = cv2.dnn.blobFromImage(
        image=input_img,
        scalefactor=scale,
        size=(224, 224),  # img target size
        mean=mean,
        swapRB=True,  # BGR -> RGB
        crop=True  # center crop
    )
    print("Input blob shape: {}\n".format(input_blob.shape))
    return input_blob
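To see what the blob parameters above do to the data, here is a small plain-Python sketch of the same arithmetic on single values. It does not call OpenCV. The helper names are hypothetical, and the rounding in `center_crop_box` is only illustrative, so OpenCV's exact resize behaviour may differ by a pixel.

```python
def normalize_pixel(value, mean=127.5, scale=1 / 127.5):
    """Apply the blob arithmetic to one channel value: (value - mean) * scale."""
    return (value - mean) * scale


def bgr_to_rgb(pixel):
    """Reverse the channel order of one 3-channel pixel (the swapRB=True step)."""
    b, g, r = pixel
    return (r, g, b)


def center_crop_box(width, height, target=224):
    """Resize so the shorter side equals target, then crop the centre square.

    Returns the resized size and the top-left corner of the crop.
    """
    factor = target / min(width, height)
    new_w, new_h = round(width * factor), round(height * factor)
    left, top = (new_w - target) // 2, (new_h - target) // 2
    return (new_w, new_h), (left, top)


low, mid, high = normalize_pixel(0), normalize_pixel(127.5), normalize_pixel(255)
print(low, mid, high)                # the ends of the range map to about -1 and 1
print(bgr_to_rgb((10, 20, 30)))      # blue and red channels swap places
print(center_crop_box(640, 480))     # a 640x480 image as an example
```

With mean 127.5 and scale 1/127.5, 8-bit pixel values land in the range -1 to 1, not 0 to 1.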
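- Run inference (the helper below runs the original TensorFlow model as a reference; the OpenCV call on the pb model, get_opencv_dnn_prediction, is in the full listing in the next section):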
def get_tf_dnn_prediction(original_net, preproc_img, imagenet_labels):
    # inference
    preproc_img = preproc_img.transpose(0, 2, 3, 1)
    print("TF input blob shape: {}\n".format(preproc_img.shape))
    out = original_net(preproc_img)
    print("\nTensorFlow model prediction: \n")
    print("* shape:", out.shape)
    # get the predicted class ID
    imagenet_class_id = np.argmax(out)
    print("* class ID: {}, label: {}".format(imagenet_class_id, imagenet_labels[imagenet_class_id]))
    # get confidence
    confidence = out[0][imagenet_class_id]
    print("* confidence: {:.4f}".format(confidence))
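The transpose(0, 2, 3, 1) call is there because the OpenCV blob is laid out as NCHW (batch, channels, height, width), while the Keras model expects NHWC. Here is a rough sketch of that reordering with nested lists, just to show which axis goes where. `permute_shape` and `nchw_to_nhwc` are hypothetical helpers, not NumPy functions.

```python
def permute_shape(shape, axes):
    """Return the shape after a transpose with the given axis order."""
    return tuple(shape[a] for a in axes)


def nchw_to_nhwc(blob):
    """Rearrange a nested-list blob from [N][C][H][W] to [N][H][W][C]."""
    return [
        [[[image[c][y][x] for c in range(len(image))]
          for x in range(len(image[0][0]))]
         for y in range(len(image[0]))]
        for image in blob
    ]


new_shape = permute_shape((1, 3, 224, 224), (0, 2, 3, 1))
print(new_shape)

# Tiny blob: N=1, C=3, H=1, W=2 (one list of pixel values per channel)
tiny = [[[[1, 2]], [[3, 4]], [[5, 6]]]]
converted = nchw_to_nhwc(tiny)
print(converted)
```

After the conversion, each innermost list holds the three channel values of a single pixel.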
3. Image classification (complete code)
import os
import cv2
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import MobileNet
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2


def get_tf_model_proto(tf_model):
    # define the directory for .pb model
    pb_model_path = "models"
    # define the name of .pb model
    pb_model_name = "mobilenet.pb"
    # create directory for further converted model
    os.makedirs(pb_model_path, exist_ok=True)
    # get model TF graph
    tf_model_graph = tf.function(lambda x: tf_model(x))
    # get concrete function
    tf_model_graph = tf_model_graph.get_concrete_function(
        tf.TensorSpec(tf_model.inputs[0].shape, tf_model.inputs[0].dtype))
    # obtain frozen concrete function
    frozen_tf_func = convert_variables_to_constants_v2(tf_model_graph)
    # get frozen graph
    frozen_tf_func.graph.as_graph_def()
    # save full tf model
    tf.io.write_graph(graph_or_graph_def=frozen_tf_func.graph,
                      logdir=pb_model_path,
                      name=pb_model_name,
                      as_text=False)
    return os.path.join(pb_model_path, pb_model_name)


def get_preprocessed_img(img_path):
    # read the image
    input_img = cv2.imread(img_path, cv2.IMREAD_COLOR)
    input_img = input_img.astype(np.float32)
    # define preprocess parameters
    mean = np.array([1.0, 1.0, 1.0]) * 127.5
    scale = 1 / 127.5
    # prepare input blob to fit the model input:
    # 1. subtract mean
    # 2. scale to set pixel values from -1 to 1
    input_blob = cv2.dnn.blobFromImage(
        image=input_img,
        scalefactor=scale,
        size=(224, 224),  # img target size
        mean=mean,
        swapRB=True,  # BGR -> RGB
        crop=True  # center crop
    )
    print("Input blob shape: {}\n".format(input_blob.shape))
    return input_blob


def get_imagenet_labels(labels_path):
    with open(labels_path) as f:
        imagenet_labels = [line.strip() for line in f.readlines()]
    return imagenet_labels


def get_opencv_dnn_prediction(opencv_net, preproc_img, imagenet_labels):
    # set OpenCV DNN input
    opencv_net.setInput(preproc_img)
    # OpenCV DNN inference
    out = opencv_net.forward()
    print("OpenCV DNN prediction: \n")
    print("* shape:", out.shape)
    # get the predicted class ID
    imagenet_class_id = np.argmax(out)
    # get confidence
    confidence = out[0][imagenet_class_id]
    print("* class ID: {}, label: {}".format(imagenet_class_id, imagenet_labels[imagenet_class_id]))
    print("* confidence: {:.4f}\n".format(confidence))


def get_tf_dnn_prediction(original_net, preproc_img, imagenet_labels):
    # inference
    preproc_img = preproc_img.transpose(0, 2, 3, 1)
    print("TF input blob shape: {}\n".format(preproc_img.shape))
    out = original_net(preproc_img)
    print("\nTensorFlow model prediction: \n")
    print("* shape:", out.shape)
    # get the predicted class ID
    imagenet_class_id = np.argmax(out)
    print("* class ID: {}, label: {}".format(imagenet_class_id, imagenet_labels[imagenet_class_id]))
    # get confidence
    confidence = out[0][imagenet_class_id]
    print("* confidence: {:.4f}".format(confidence))


def main():
    # configure TF launching
    #set_tf_env()
    # initialize TF MobileNet model
    original_tf_model = MobileNet(
        include_top=True,
        weights="imagenet"
    )
    # get TF frozen graph path
    full_pb_path = get_tf_model_proto(original_tf_model)
    print(full_pb_path)
    # read frozen graph with OpenCV API
    opencv_net = cv2.dnn.readNetFromTensorflow(full_pb_path)
    print("OpenCV model was successfully read. Model layers: \n", opencv_net.getLayerNames())
    # get preprocessed image
    input_img = get_preprocessed_img("yaopin.png")
    # get ImageNet labels
    imagenet_labels = get_imagenet_labels("classification_classes.txt")
    # obtain OpenCV DNN predictions
    get_opencv_dnn_prediction(opencv_net, input_img, imagenet_labels)
    # obtain TF model predictions
    get_tf_dnn_prediction(original_tf_model, input_img, imagenet_labels)


if __name__ == "__main__":
    main()
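The script prints the OpenCV and TensorFlow predictions one after the other so they can be compared by eye. If you would rather check the agreement programmatically, something like the following sketch could be used. The two probability lists are made-up stand-ins for the real 1000-class outputs, and `compare_predictions` is a hypothetical helper that is not part of the script above.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def compare_predictions(probs_a, probs_b):
    """Return (same_top_class, max_abs_difference) for two probability lists."""
    same_top = argmax(probs_a) == argmax(probs_b)
    max_diff = max(abs(a - b) for a, b in zip(probs_a, probs_b))
    return same_top, max_diff


# Made-up outputs standing in for the OpenCV DNN and TensorFlow results
opencv_probs = [0.02, 0.95, 0.03]
tf_probs = [0.021, 0.948, 0.031]

same_top, max_diff = compare_predictions(opencv_probs, tf_probs)
print(same_top, max_diff)
```

A matching top class and a small maximum difference suggest that the frozen pb model was converted and loaded correctly.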
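III. Image classification with LabVIEW dnn (callpb_photo.vi)
The example in this post is based on LabVIEW 2018 and calls the MobileNet pb model.
1. Read the image to classify and the pb model
2. Preprocess the image to classify
3. Feed the image into the neural network and run inference
4. Perform the image classification
5. Complete program: build the block diagram as shown in the figure below to perform image classification. This example uses single-label (top-1) classification and returns the object with the highest confidence.
The figure below shows the classification result for the medicine bottle image; the front panel displays the image and its label: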
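IV. Source code download link: https://pan.baidu.com/s/10yO7…
Extraction code: 8888
Summary
For more on LabVIEW and artificial intelligence, join the technical exchange group for further discussion. QQ group number: 705637299. Please include the passphrase: LabVIEW 机器学习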